Abstract:
PURPOSE: to analyze the Mean Length of Utterance-words (MLU-w) in children aged 4;00-5;05 years.
METHODS: ninety-two typically developing Portuguese children were observed: 49 girls and 43 boys, divided into six-month age groups. A sample of 100 utterances produced in spontaneous discourse was collected from each child. The utterances were transcribed and analyzed.
RESULTS: MLU-w ranged from 4.5 to 5 words and increased with age. A similar progression has previously been observed in children speaking US English and Brazilian Portuguese, although in European Portuguese the number of words is slightly higher overall. Boys and girls performed similarly. Years of parents' formal education showed some influence, but not in all age groups. MLU-w showed a positive and significant correlation with a formal language assessment test, in both comprehension and production.
CONCLUSION: MLU-w is a good measure of language development up to 5 years. The values found can serve as a normative reference for Portuguese children and in comparative studies on the development of spontaneous language.
Keywords: Language; Development; Children; Evaluation
Resumo:
OBJETIVO: analisar a Extensão Média do Enunciado-palavras (EME-p) em crianças entre os 4;00 e os 5;05.
MÉTODOS: foram observadas 92 crianças portuguesas com desenvolvimento típico: 49 meninas e 43 meninos, divididas em grupos etários com 6 meses de intervalo. Foi recolhida para cada criança uma amostra de 100 enunciados produzidos em discurso espontâneo. Os enunciados foram transcritos e analisados.
RESULTADOS: a EME-p variou de 4,5 a 5 palavras, aumentando com a idade. Esta progressão foi verificada anteriormente em crianças falantes de Inglês dos EUA e de Português do Brasil, embora no Português Europeu o número de palavras seja, no geral, um pouco superior. O desempenho de meninos e meninas foi idêntico. A escolaridade dos pais mostrou ter alguma influência, mas não em todos os grupos etários. Os resultados mostraram uma correlação positiva e significante com um teste formal de linguagem, tanto na compreensão, como na expressão.
CONCLUSÃO: a EME-p é uma boa medida de desenvolvimento da linguagem até aos 5 anos. Os valores encontrados podem servir como referência normativa relativamente às crianças portuguesas, mas também em estudos comparativos sobre o desenvolvimento da linguagem espontânea.
Descritores: Linguagem; Desenvolvimento; Criança; Avaliação
Introduction
The use of language assessment tests in children, although useful for diagnosis, does not remove the need to better understand the child's linguistic performance in a natural context. Collecting samples of spontaneous discourse is becoming more important, as these are closer to the child's daily environment and experience, i.e. to the child's communication habits, routines and partners. Several authors support this kind of assessment, since much linguistic research involves longitudinal qualitative data, in which the process of language acquisition and development is monitored through patterns that unfold and change over time1. Moreover, formal assessment has been widely criticized, with naturalistic observation favored instead2 3. The latter comes closer to the child's linguistic behavior in her daily life and allows the observation of language use in different contexts4.
Despite the advantages of using samples of spontaneous discourse over formal tests, there are limitations, namely the way in which the collection process unfolds and the number of utterances that are collected and analysed5. Given these difficulties, some authors have worked towards optimizing this process by proposing several strategies. The greatest concern is obtaining a reliable sample of the child's speech: the communicative interaction should be conducted in such a way that the resulting sample is representative of the typical production of the child, i.e. that it describes her usual linguistic production, including language that may be somewhat inferior or superior to her usual performance6. As early as the 1970s and 1980s, attention was drawn to the materials to be used in the elicitation of spontaneous discourse, the different contexts in which it can be collected, the way of recording it, and the size of the sample7 8. Regarding the latter aspect, Brown7 claimed that a sample of 100 utterances is sufficient. Other authors have suggested larger or smaller samples; an alternative is to establish a fixed period, 30 minutes for example, regardless of how many productions occur in that window6. In general, these 30 minutes are enough to obtain 100 to 200 utterances from children aged 2 and beyond8.
Mean length of utterance
The mean length of utterance (MLU) is one of the language measurements that can be obtained from spontaneous discourse. Its main goal is to provide data about morphological and syntactic aspects of language in children with typical development and in children with language disorders2 7 9. The MLU derives from the concept of mean length of response, coined by Margaret Nice10, who in 1925 had already considered that the length of utterances should be one of the most important criteria for assessing linguistic progress, thus serving as a marker of language maturation. In recent years the mean length of utterance in words (MLU-w) has been considered very useful, and its calculation formula has been kept the same as the original one: total number of words divided by the total number of utterances produced2.
After the publication of the studies by Brown7, measurements of the length of utterances produced by children became popular among researchers. The mean length of utterance in morphemes (MLU-m) proposed by Brown is less used than MLU-w because the latter is easier to analyse4 11. Although the two measures focus on different linguistic aspects, high correlations have been observed between them12 13, which has led many researchers to prefer MLU-w. Some authors propose the use of MLU-w especially for comparative studies between languages, since counting words minimizes the morphological differences that can interfere with the calculation of MLU-m14. Regardless of the variant used, there seems to be a point around age 5 at which MLU stabilizes15 or even slightly declines16. The existing correlation between age and MLU then disappears, since the differences between utterances concern their complexity rather than their length18.
Even if most studies do not make distinctions based on gender, this is an important matter, since some results indicate that boys start to produce their first words and sentences later than girls, and that girls start earlier even in the use of simple gestures19. There are also studies showing that girls have a more complete vocabulary and use a larger variety of sentences in their earliest communication20. At school age girls also seem to be more successful than boys in all verbal competences21. Regarding MLU, no gender differences are usually observed22, although some results indicate a better performance by girls, though only until the age of 3 or 423 24 25.
Another factor to be considered is the possible influence of the child's sociocultural background. On this point, results have been controversial, largely depending on the methodologies used, including the ages of the children studied, the language measures, and what is taken as an indicator of sociocultural background. Some authors found a strong influence of this variable: children from a higher sociocultural background produced more complex utterances25 26 27; the same was observed with respect to the influence of the mother's education level3. However, in some studies using MLU-w specifically, no significant relations were found with the mother's education level, neither in very young children (2-year-olds)28 nor in older children up to 9 years of age9 29.
Despite evidence of the high quality of MLU for identifying and assessing the language production of English-speaking children, and despite studies conducted in other languages, there are no studies focusing on European Portuguese. In Brazilian Portuguese, studies have described a significant increase in the values of MLU-m and MLU-w with age30 in children with typical development aged between 2 and 4 years. Table 1 shows the average values of MLU in two studies of children with typical development: one on Brazilian Portuguese30 and another on American English29. There is some disparity both for MLU-w and for MLU-m.
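The formula described above can be sketched in a few lines of Python. This is only an illustration, not the procedure used in the study: the function name `mlu_w` and the three sample utterances are invented, and words are simply split on whitespace, which ignores the finer counting criteria given in the Methods.

```python
def mlu_w(utterances):
    """Mean length of utterance in words: total words / total utterances."""
    if not utterances:
        raise ValueError("at least one utterance is required")
    total_words = sum(len(u.split()) for u in utterances)
    return total_words / len(utterances)


# Invented sample, for illustration only: 5 + 3 + 4 words in 3 utterances.
sample = ["o cão está a dormir", "quero mais bolo", "a bola é minha"]
print(mlu_w(sample))  # 4.0
```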
Table 1. MLU-w and MLU-m values for Brazilian Portuguese (BP) and for American English (E) in children with normal development
The main aim of the present work is therefore to ascertain standard MLU-w values for Portuguese children aged between 4;00 and 5;05, divided into age ranges of six months, so that these values may serve as indicators of language development. One further aim is to ascertain whether there are gender differences and whether parent education is related to MLU-w.
Methods
Participants
The sample included 92 children, 43 boys and 49 girls, aged 4;00 to 5;05, divided into three age ranges of six months (Table 2). The data were collected between January and May of 2012 in nursery schools in the Lisbon region. The nursery schools were selected randomly. All institutions that responded positively to the collaboration request were included in the study. Within each institution, children were included if their parents gave written consent and if they met the inclusion criteria: Portuguese as native language; a result in the formal assessment of language within the expected values for their age; and not having been included in a Speech and Language Therapy rehabilitation programme. One hundred and sixteen children were observed and 24 were excluded for not fulfilling all these criteria.
Considering the possible influence of parent education, this variable was controlled. The measurement used was the number of years of formal education of the father and of the mother; the higher of the two was considered. Parent education ranged from 4 years of formal education to a university degree (Table 3).
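Ages in this paper are written in the years;months notation (4;07 means 4 years and 7 months). As a minimal sketch of how a child's age maps onto the three six-month bands, the helper names `age_in_months` and `age_group` below are hypothetical and not part of the study's tooling.

```python
def age_in_months(age):
    """Convert a 'years;months' age such as '4;07' to total months."""
    years, months = age.split(";")
    return int(years) * 12 + int(months)


# The three six-month bands of the study, with inclusive bounds.
AGE_GROUPS = [("4;00", "4;05"), ("4;06", "4;11"), ("5;00", "5;05")]


def age_group(age):
    """Return the label of the band containing `age`, or None if outside."""
    months = age_in_months(age)
    for low, high in AGE_GROUPS:
        if age_in_months(low) <= months <= age_in_months(high):
            return f"{low}-{high}"
    return None


print(age_group("4;07"))  # 4;06-4;11
```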
Procedure
All children were assessed using a Portuguese test of language development (TALC) designed for children aged between 2 and a half and 6 years31. This test encompasses Comprehension tasks [Vocabulary (identifying objects and images); Semantic Relations (relations of two and three content words) and Complex Sentences] and Expression tasks [Vocabulary (naming of objects and images); Absurd Sentences; Morphosyntactic Constituents; Pragmatics (communicative intentions)]. Only children whose values in the assessment fell within the expected average for their age (between -1 and +1 SD) were considered for this study.
The methodology and materials used in this study are identical to those described in a recent study29. Conversation samples were collected by three of the authors of this paper, who had been previously trained in data collection, and who used a range of age-appropriate toys, such as household objects and toy animals, in order to elicit different grammatical forms and sentence types. The examiners interacted with the children in an environment suitable for play and avoided a predominance of verbal interactions, "yes/no" answers and Wh-questions.
All samples were collected in quiet rooms in the schools the children attended and they were audio-recorded with an Olympus WS-650S recorder, chosen for its noise-reduction features and sound reception at 50 cm distance. The examiner went to the room to fetch each child, interacting with her for about 5 minutes, as they walked to the room where assessment was to be performed. This procedure was intended to make the child feel at ease before collecting a sample of her spontaneous discourse. The interaction with the child lasted for about 20 to 30 minutes and aimed at collecting a minimum of 100 valid utterances. Sample collection from each child was undertaken during one single session.
The samples were transcribed and coded by the examiners using ELAN software (EUDICO Linguistic Annotator), an annotation tool created at Max Planck Institute for Psycholinguistics (http://www.lat-mpi.eu/tools/elan).
For the purposes of the analysis, the first 100 utterances of each child were considered, following Brown's criterion7. The following criteria were used for counting utterances and words:
- Segmentation of utterances: an utterance was considered to be delimited by its intonational curve32 33; the expression "e depois" ("and then", a filler frequently used by Portuguese children) was treated as a division between utterances and as a marker of the beginning of the following utterance; all utterances were counted, even if they included morphosyntactic mistakes, as these are common in the age ranges under study; utterances repeated in the same way were counted only once7; songs, enumerations, counting and isolated words were not counted32; also excluded were imitations produced immediately after the examiner's own utterance, utterances that could not be understood due to unintelligible words, partial utterances resulting from a change in the child's focus of attention, false starts and reformulations (in this case, only the last formulation was counted), and discourse markers (oh, ah...) that were not integrated in the utterance meaning32.
- Word counting: "word" was defined as any sequence that is semantically interpretable and delimited by blank spaces or punctuation marks34; a repeated word was counted only once, in the most complete form in which it was produced, except for cases in which the repetition aimed at stressing an idea7; contractions were counted as one word17; clitics, compounds, fixed expressions (e.g. David Beckham; Central Park) and onomatopoeic words were counted as one word; unintelligible words were not counted17; discourse auxiliaries such as exclamations were not counted7.
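A few of the segmentation criteria above lend themselves to a simple filter. The sketch below is only an approximation under stated assumptions: the transcript is represented as a list of dictionaries with hypothetical keys `text`, `imitation` and `intelligible`, and only four rules are covered (imitations, unintelligible utterances, isolated words, identical repetitions). Criteria that need human judgment, such as intonational boundaries, false starts or the "e depois" split, are not modelled.

```python
def select_utterances(transcript, limit=100):
    """Keep the first `limit` utterances that pass a subset of the criteria."""
    seen = set()
    selected = []
    for utt in transcript:
        text = utt["text"].strip().lower()
        # Immediate imitations and unintelligible utterances are excluded.
        if utt.get("imitation") or not utt.get("intelligible", True):
            continue
        # Isolated (single-word) utterances are not counted.
        if len(text.split()) < 2:
            continue
        # Utterances repeated in the same way are counted only once.
        if text in seen:
            continue
        seen.add(text)
        selected.append(text)
        if len(selected) == limit:
            break
    return selected


# Invented mini-transcript, for illustration only.
transcript = [
    {"text": "quero a bola"},
    {"text": "quero a bola"},                       # exact repetition
    {"text": "sim"},                                # isolated word
    {"text": "o cão ladra", "imitation": True},     # imitation of examiner
    {"text": "xxx está ali", "intelligible": False},
    {"text": "a bola é minha"},
]
print(select_utterances(transcript))
```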
The examiners had previous training in transcribing and counting valid utterances and words. Each of them transcribed the utterances and the transcriptions were checked by the other two examiners and by two trained researchers. This procedure was carried out until the end of the work, with any disagreements resolved through consensus.
Results
The MLU-w of each child was calculated, and for each age group 3000 to 3200 utterances were used in the analysis. Between 4 and 5 and a half years of age, the MLU-w varies on average between 4.5 and 5 words. A progressive increase is observed across the three age groups, both in the formal linguistic assessment and in MLU-w (Table 4). Since MLU-w had a normal distribution in the three age groups, a one-way ANOVA was used, which showed significant differences between them (F(2)= 7.72, p= 0.001). The Scheffé post hoc test, however, indicates that the differences occur only between the 4;00-4;05 and 4;06-4;11 groups (p= 0.037) and between the 4;00-4;05 and 5;00-5;05 groups (p= 0.001). There is no difference between the 4;06-4;11 and 5;00-5;05 groups (p= 0.49).
Table 4. Results in a formal language test (Comprehension and Production), MLU-w results, and distribution by Percentile
When compared to the values obtained by Rice and colleagues29, the same progression can be observed for MLU-w in the first three age groups, with a greater difference between the first and the second group of 4-year-olds (Figure 1). In European Portuguese, utterances are slightly longer overall in all age groups. It was not possible to compare results with values from Brazilian Portuguese30, since the age groups are different.
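For readers less familiar with the test, the F statistic of a one-way ANOVA is the ratio of between-group to within-group variance. The sketch below computes it in plain Python; the function name and the MLU-w values are invented for illustration and are not the study's data. With 92 children in three groups, the corresponding degrees of freedom would be 2 and 89.

```python
def one_way_anova(groups):
    """Return (F, df_between, df_within) for a list of numeric samples."""
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]
    # Variation of group means around the grand mean, weighted by group size.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Variation of individual scores around their own group mean.
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    df_between = len(groups) - 1
    df_within = n_total - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Made-up MLU-w values for three age groups, for illustration only.
younger = [4.1, 4.4, 4.6, 4.9]
middle = [4.6, 4.8, 5.0, 5.3]
older = [4.7, 5.0, 5.2, 5.4]
print(one_way_anova([younger, middle, older]))
```

A post hoc procedure such as Scheffé's would then be needed to locate which pairs of groups differ; it is not shown here.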
After the analysis of results obtained by boys and girls through Student's t test for independent samples, no differences were observed between them (average for boys: 4.88 ± 0.69; average for girls: 4.78 ± 0.64; t(90)= 0.755, p= 0.45).
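The reported comparison can be roughly checked from the summary statistics alone. The sketch below assumes that the values after "±" are standard deviations (the text does not say so explicitly) and uses the pooled-variance form of Student's t test; the function name is hypothetical.

```python
import math


def pooled_t(mean1, sd1, n1, mean2, sd2, n2):
    """Student's t for two independent samples, from summary statistics."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    std_err = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / std_err, df


# Boys: 4.88 ± 0.69 (n = 43); girls: 4.78 ± 0.64 (n = 49).
t_stat, df = pooled_t(4.88, 0.69, 43, 4.78, 0.64, 49)
print(round(t_stat, 2), df)  # 0.72 90
```

The result is close to the reported t(90)= 0.755; the small gap is plausibly due to rounding of the published means and deviations.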
In the general sample, parent education showed a significant correlation with MLU-w (Pearson r= 0.289, p= 0.005). The same was found when considering only the education level of the mother (Pearson r= 0.292, p= 0.005). However, the analysis of the data by age group showed that only the intermediate group, aged 4;06 to 4;11, showed a strong, positive and significant correlation (Pearson r= 0.753, p< 0.001). In the group of younger children the relation was non-significant (Pearson r= 0.179, p= 0.34), and the same was true of the group of older children (Pearson r= -0.125, p= 0.50).
A positive and significant correlation could be ascertained between the results obtained in MLU-w and the values in the language development test (TALC), both for total values of this test (Pearson r= 0.33, p= 0.001), and for the areas of language comprehension (Pearson r= 0.287, p= 0.006) and language production (Pearson r= 0.286, p= 0.006).
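As a reminder of what lies behind these figures, the sketch below computes Pearson's r in plain Python and converts an r value into the t statistic normally used to test its significance. The function names are hypothetical, and the use of n = 92 for the whole-sample correlations is an assumption based on the sample size given in the Methods.

```python
import math


def pearson_r(xs, ys):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def r_to_t(r, n):
    """t statistic (with n - 2 degrees of freedom) for testing r against 0."""
    return r * math.sqrt((n - 2) / (1 - r ** 2)), n - 2


# Reported whole-sample correlation with parent education, assuming n = 92.
t_stat, df = r_to_t(0.289, 92)
print(round(t_stat, 2), df)
```

Under that assumption the t statistic is a little under 2.9 on 90 degrees of freedom, which is in line with the reported p of 0.005.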
Discussion
The data presented here constitute the first study of MLU-w for European Portuguese, covering ages 4;00 to 5;05. The average obtained for the three groups, divided into six-month age ranges, differs from the one obtained for American English29; in Portuguese this average is slightly higher, as expected, given the morphosyntactic differences between the two languages. Moreover, these results differ from the values found in Brazilian Portuguese30. However, this latter comparison is difficult to perform, since the age ranges considered in the two studies are not the same. In any case, the values obtained in Brazil are much lower than the ones we obtained in the group of children aged 4 years, which could be due to a difference in methodology. In the Brazilian study the criteria used for segmentation of utterances and for word counting are not accurately described. Therefore the resulting values may not be comparable to the values obtained in our study.
Considering the specific features of the languages and/or their variants it is possible to understand the differences observed. The fact that we obtained higher MLU-w averages can be documented by the examples below, which show how the use of the gerund in English and in Brazilian Portuguese reduces the number of words in the utterance, or that a lower use of determiners in English also reduces that number.
- English: He told me (that) Ana is eating (6 / 7 words)
- Brazilian P.: Ele me disse que a Ana está comendo (8 words)
- European P.: Ele disse-me que a Ana está a comer (9 words)
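The word counts in these examples can be reproduced with a short sketch. It rests on one assumption that should be read with care: the hyphenated enclitic in "disse-me" is treated as a word of its own, since this is the reading under which the European Portuguese example reaches 9 words. The function name and the `split_clitics` flag are hypothetical.

```python
def count_words(utterance, split_clitics=True):
    """Count whitespace-delimited words, optionally splitting at hyphens."""
    words = utterance.split()
    if split_clitics:
        # Assumption: an enclitic attached with a hyphen counts separately.
        words = [part for word in words for part in word.split("-")]
    return len(words)


examples = {
    "English": "He told me Ana is eating",
    "English (with 'that')": "He told me that Ana is eating",
    "Brazilian P.": "Ele me disse que a Ana está comendo",
    "European P.": "Ele disse-me que a Ana está a comer",
}
for label, sentence in examples.items():
    print(label, count_words(sentence))
```

Most of the difference between the two Portuguese variants comes from the periphrastic "está a comer" as opposed to the gerund "está comendo".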
Besides their specific interest for the analysis of the development of each language and for clinical applications, studies available in various countries tend to show a certain universality in the number of words per utterance when a similar methodology is used in the analysis. For example, in the first half of the four-year-old group the average differs by only 0.39 between American English (4.10) and European Portuguese (4.49).
Studies aimed at assessing MLU are scarce due to their intrinsic difficulty, as they require large samples and rigorous criteria of analysis. Any methodological difference will influence the results, and values obtained through different elicitation contexts and methods cannot be compared35. This prevents comparative analyses between languages and even within the same language, for instance when one intends to study a long period of language development.
With regard to the influence of gender on the values of MLU-w, no statistically significant differences were found between boys and girls. This result is identical to that obtained by other researchers who have analysed MLU in Chinese children aged 2;03 to 5;0822, in French children over three years old25 and in older American children aged 6;03 to 15;0236. Gender-dependent differences seem to exist only in younger children, up to 3 or 4 years old, among whom girls present a higher MLU-w than boys and also produce more complex syntactic forms23 25. It is therefore possible to claim that at around age 4 syntactic development has already acquired a certain stability, thus evening out the gender differences that may exist until that age.
Parent education was shown to influence MLU-w in the total sample. These results are in agreement with other authors25 27, who observed a strong influence of this variable, such that children from higher sociocultural environments showed more complex oral productions and a more advanced linguistic development than children from lower sociocultural backgrounds. However, other authors, such as Rice and colleagues5, have not observed a correlation with the background of the children. These differences between studies and their results may depend on several variables, such as the age ranges analysed in each study, the kind of linguistic analysis undertaken and the measures of sociocultural background (parents' level of education, family income and/or father's occupation, among others).
In our analysis we have taken into account parent education and we have considered the parent with the highest level of education, as measured in the number of years of formal education. We have also considered only the education level of the mother, as have other authors29. In both situations, results were similar. However, this relation between parent education and language development as measured by MLU-w is not linear, since in the analysis by age group only the intermediate group, aged 4;06 to 4;11, showed a significant correlation. There is apparently no influence of parent education on utterance length among younger children. In the following phase that influence increases, along with the significant increase in utterance length. Afterwards it ceases to exist, probably due to the development of more complex linguistic structures. MLU-w remains similar, as there is no significant difference between the group aged 4;06 to 4;11 and the group aged 5;00 to 5;05. On the other hand, we have observed that at age 5 a tendency towards a negative correlation emerges between MLU and parent education. Similarly, Rice and colleagues29 observed a significant relation of MLU with parent education only in the group of children aged 5 and 6, that relation being negative, "suggesting higher MLU levels for the children of less educated mothers". Therefore the influence of parent education on MLU should not be excluded; nevertheless, this measure may not be the most adequate for this observation, since an utterance containing more words is not necessarily more complex at the morphological or syntactic level. The progressive development of cognitive and linguistic abilities will allow the child to use language quickly and effectively, employing fewer words to convey the same meaning and making use of more complex linguistic structures.
The type of language used by parents, certainly dependent on their education level, is a model that can influence the child at this stage, preceding the beginning of school. It is therefore not surprising that there may be a tendency, as proposed in the present study, or even a significant correlation29 showing that the use of shorter utterances relates to higher levels of parents' education.
Significant correlations were obtained between MLU-w and a language development test, confirming that it is a valid developmental measure18. Throughout the child's first years vocabulary increases progressively and the same occurs in sentence production: sentences contain an increasing number of words. Therefore a positive and significant correlation is expected between MLU-w and assessment tests for language development until the end of the pre-school years. After this period this correlation might no longer be adequate, since MLU-w no longer increases in an evident manner, as the child is then able to produce more complex sentences, thus using fewer words to convey the same meaning.
Conclusion
The study of the spontaneous speech of 92 Portuguese children allows us to conclude the following: (1) at age 4 children produce, on average, 4 to 5 words per utterance, and at age 5 this number is clearly 5 words; (2) gender differences have not been found, as boys and girls have shown identical performance; (3) a clear influence of parental educational level was not established, so that it is not possible to ascertain the impact of this variable on children's performance; (4) a positive and significant correlation with the results of a formal test of language development was confirmed, both for comprehension and for language production, which stresses the validity of MLU-w as a measuring instrument for language development.
Despite the number of children in each age group being relatively small, we consider that the results obtained may serve as reference for Portuguese children, since the number of utterances collected was high and the criteria for participant inclusion, speech elicitation and analysis of verbal production were rigorous.
References
- 1 Gries ST, Stol S. Finding developmental groups in acquisition data: variability based neighbor clustering. Journal of Quantitative Linguistics. 2009;16:217-42.
- 2 Parker MD, Brorson K. A comparative study between mean length of utterance in morphemes (MLUm) and mean length of utterance in words (MLUw). First Language. 2005;25:365-76.
- 3 Roy D. New horizons in the study of child language acquisition. Proceedings of INTERSPEECH 2009, 10th Annual Conference of the International Speech Communication Association, September 6-10; Brighton, United Kingdom; 2009.
- 4 Eisenbeiss S. Production methods in language acquisition research. In: Blom E, Unsworth S, editors. Experimental methods in language acquisition research. Amsterdam: John Benjamins Publishing Company; 2010. p. 11-34.
- 5 Rice ML, Redmond SM, Hoffman L. Mean Length of Utterance in children with Specific Language Impairment and in younger control children shows current validity and stable and parallel growth trajectories. Journal of Speech, Language, and Hearing Research. 2006;49:793-808.
- 6 Retherford KS. Guide to analysis of language transcripts. 3rd edition. Eau Claire, WI: Thinking Publications; 2000.
- 7 Brown R. A first language: the early stages. Cambridge, MA: Harvard University Press; 1973.
- 8 Miller JF. Assessing language production in children. Baltimore, MD: University Park Press; 1981.
- 9 Hickey T. Mean length of utterance and the acquisition of Irish. Journal of Child Language. 1991;18:553-69.
- 10 Nice MM. Length of sentences as a criterion of a child's progress in speech. Journal of Education Psychology. 1925;16:370-9.
- 11 Eisenberg SL, Fersko TM, Undgreen C. The use of MLU for identifying impairment in preschool children: a review. American Journal of Speech-Language Pathology. 2001;10:323-42.
- 12 Arif H, Bol GW. Counting MLU in morphemes and MLU in words in a normally developing child and child with language disorder: a comparative study. Dhaka University Journal of Linguistics. 2008;1:167-82.
- 13 Oosthuizen H, Southwood F. Methodological issues in the calculation of mean length of utterance. South African Journal of Communication Disorders. 2009;56:76-87.
- 14 Gutiérrez-Clellen VF, Restrepo MA, Peña LB, Anderson R. Language sample analysis in Spanish speaking children: methodological consideration. Journal of Language, Speech, and Hearing Services in Schools. 2000;31:88-98.
- 15 Bol GW. Optimal subjects in Dutch child language. In: Koster C, Wijnen F, editors. Proceeding of the Groningen assembly on language acquisition. Groningen: Center for Language and Cognition; 1996. p. 125-33.
- 16 Klee T. Developmental and diagnostic characteristics of quantitative measures of children's language production. Topics in Language Disorders. 1992;12:28-41.
- 17 Wieczorek R. Using MLU to study early language development in English. Psychology of Language and Communication. 2010;14:59-69.
- 18 Blake J, Quartaro G, Onorati S. Evaluating quantitative measures of grammatical complexity in spontaneous speech samples. Journal of Child Language. 1993;20:139-52.
- 19 Özçaliskan S, Goldin-Meadow S. Sex differences in language first appear in gesture. Developmental Science. 2010;13:752-60.
- 20 Ramer ALH. Syntactic styles in emerging language. Journal of Child Language. 1976;3:49-62.
- 21 Davies A. An introduction to applied Linguistics - from practice to theory. 2nd edition. Edinburgh: Edinburgh textbooks in applied Linguistics; 2007.
- 22 Klee T, Stokes SF, Wong AMY, Fletcher P, Gavin WJ. Utterance length and lexical diversity in Cantonese-speaking children with and without specific language impairment. Journal of Speech, Language and Hearing Research. 2004;47:1396-410.
- 23 Jackson SC, Roberts JE. Complex syntax production of African American preschoolers. Journal of Speech, Language, and Hearing Research. 2001; 4:1083-96.
- 24 Tse SK, Chan C, Kwong SM, Li H. Sex differences in syntactic development: evidence from Cantonese-speaking preschoolers in Hong-Kong. International Journal of Behavioral Development. 2002;26:509-17.
- 25 Le Normand M-T, Parisse C, Cohen H. Lexical diversity and productivity in French preschoolers. Clinical Linguistics and Phonetics. 2008;22:47-58.
- 26 Bornstein MH, Hahn CS, Haynes OM. Specific and general language performance across early childhood: stability and gender consideration. First Language. 2004;24:267-304.
- 27 Walker D, Greenwood C, Hart B, Carta J. Prediction of school outcomes based on early language production and socioeconomic factors. Children and Poverty. 1994;65:606-21.
- 28 Hoff E. The specificity of environmental influence: socioeconomic status affects early vocabulary development via maternal speech. Child Development. 2003;74:1368-78.
- 29 Rice ML, Smolik F, Perpich D, Thompson T, Rytting N, Blossom M. Mean Length of Utterance levels in 6-month intervals for children 3 to 9 years with and without language impairments. Journal of Speech, Language, and Hearing Research. 2010;53:333-49.
- 30 Araujo K, Befi-Lopes DM. Extensão média do enunciado de crianças entre 2 e 4 anos de idade: diferenças no uso de palavras e morfemas. Revista da Sociedade Brasileira de Fonoaudiologia. 2004;9:156-63.
- 31 Sua-Kay E, Tavares MD. Teste de avaliação da linguagem na criança - TALC. Lisboa: Oficina Didáctica; 2008.
- 32 Lund NJ, Duchan JF. Assessing children's language in naturalistic contexts. New Jersey: Prentice Hall; 1993.
- 33 Miller JF, Chapman RS. Systematic analysis of language transcripts [Computer software]. Madison: University of Wisconsin; 1991.
- 34 Villalva A. Morfologia do Português. Lisboa: Universidade Aberta; 2008.
- 35 Heilmann JJ. Myths and realities of language sample analysis. Perspectives on Language Learning and Education. 2010;17:4-8.
- 36 McEwen S. Learning to speak like girls and boys: a developmental study in gender and narrative style. Language, Information and Computation. 1996;11:449-58.
Publication Dates
- Publication in this collection: Aug 2015
History
- Received: 27 Jan 2015
- Accepted: 23 Apr 2015