Initial lexical acquisition and noun bias hypothesis verification



Purpose: to verify how initial lexical acquisition occurs in children with typical development, in terms of the types and tokens of lexical items, and to determine whether the noun bias hypothesis holds and, if so, in which version, strong or weak.


Methods: the sample consisted of 20 children, male and female, with typical language development. The study covered ages from 1:0 to 1:11 (years:months), divided into three age groups (1:0 - 1:3;29, 1:4 - 1:7;29, 1:8 - 1:11;29). Audio recordings of spontaneous speech were collected, and a lexical analysis of the types and tokens produced was then performed. The Mann-Whitney, Kruskal-Wallis and Wilcoxon statistical tests were used, with a significance level of p < 0.05.


Results: no statistically significant difference between the sexes was found for any variable. However, age group 1 differed significantly from age groups 2 and 3 for most variables. Content words predominated in age groups 2 and 3, and nouns predominated over verbs in all age groups.


Conclusion: initial lexical acquisition in children with typical development progresses gradually with increasing age. In this period, sex does not influence linguistic performance. The noun bias hypothesis was confirmed in its weak version, corroborating the thesis that inspired this research.

Keywords: Child; Language Development; Vocabulary




Communication between people requires some form of language and, when no organic or psychological impairment prevents it, oral language is used. The ability to communicate is one of the distinguishing traits of human beings. It shows different levels of complexity and is subject to irregularities and inadequate productions, which may or may not affect speech intelligibility. Language is built on the lexical repertoire acquired in childhood, during these initial stages of acquisition. Communicative ability can be explained by the individual's capacity and performance in receiving, elaborating and transmitting messages that carry informative content [1. Andrade CRF. Fases e Níveis de Prevenção em Fonoaudiologia - Ações Coletivas e Individuais. In: Vieira RM, Vieira MM, Ávila CRB, Pereira LD. Fonoaudiologia e Saúde Pública. Carapicuíba: Pró-Fono Departamento Editorial; 2000. p. 81-104.] [2. Andrade CRF. Fonoaudiologia preventiva. São Paulo: Lovise; 1996.].

Lexical acquisition is one of the first notable manifestations of language development and is related to the capacity to comprehend and to produce several types of meaning [3. Hage SRV, Pereira MB. Desempenho de crianças com desenvolvimento típico de linguagem em prova de vocabulário expressivo. Rev CEFAC. 2006;8(4):419-28.] [4. Limongi SCO. Da ação a comunicação: um processo de aprendizagem. Rev. Psicopedagogia. 1996;15(56):24-8.]. Vocabulary is thus the initial feature that marks the acquisition of a specific language, and it allows the analysis and the development of the other components: phonology, morphology, syntax, pragmatics, semantics and fluency [5. Costa SG. Ampliação de Vocabulário por Centro de Interesse. Universidade Federal de Mato Grosso. 2008.].

In this study the term lexicon was chosen over vocabulary, because the former refers to lexical items embedded in speech, whereas the latter refers to the terms of a language, which can be considered in isolation. Moreover, the intention is not to consider only nouns, verbs and adjectives, as in other studies on the topic [3] [6. Brancalioni AR. Desempenho em prova de vocabulário de crianças com desvio fonológico e com desenvolvimento fonológico normal. Rev CEFAC. 2010;13(3):428-36.] [7. Athayde ML, Mota HB, Mezzomo CL. Vocabulário expressivo de crianças com desenvolvimento fonológico normal e desviante. Rev. Pró-Fono Atual. Cient. 2010;22(2):145-50.], but also the other elements that make up the grammar of the language: pronouns, conjunctions, prepositions, numerals, articles, interjections and adverbs [8. Nomenclatura Gramatical Brasileira. Ministério da Educação do Brasil. Portaria Ministerial. 28 de janeiro de 1959.].

The lexicon grows continuously as knowledge is acquired; it is an open system, in constant refinement and expansion. Contact between people, in groups, in society, at work and in the many settings that involve human communication, also enlarges their lexical repertoire through an individual and heterogeneous process [9. Vidor DCGM. Aquisição lexical inicial por crianças falantes de português brasileiro: discussão do fenômeno da explosão do vocabulário e da atuação da hipótese do viés nominal [tese]. Porto Alegre (RS): Pontifícia Universidade Católica do Rio Grande do Sul; 2008.] [10. Leffa VJ. Aspectos externos e internos da aquisição lexical. In: Leffa VJ. As palavras e sua companhia; o léxico na aprendizagem. Pelotas: EDUCAT. 2000; 1:15-44.].

The lexicon is also defined as a set of units that originates not in grammatical rules but in internal language. This makes clear why lexical analysis is difficult to carry out, whether because of cultural implicatures or because of the dynamic, changing character of language, which seeks to follow communicative needs [11. Orzi V, Zavaglia C. Propostas de elaboração de vocabulário de itens lexicais tabuizados. Rev. Investigações. 2009;22(2):309-30.]. The lexicon is dynamic and precise, and it also results from the settings we frequent. To keep up with the new names, new objects and new situations of everyday life, each person's repertoire must grow gradually [9].

The first items of this repertoire appear when the child is about one year old. This universal phenomenon is explained by the fact that at this age the child reaches a certain neuropsychological maturity [12. Barret M. Desenvolvimento lexical inicial. In: Fletcher P, Macwhinney B. Compêndio da linguagem da criança. Porto Alegre: Artes Médicas, 1997. p. 299-322.] [13. Andersen EML. Representações lexicais subjacentes: verbos e léxico inicial. Revel. 2008;6(11):1-31.]. Typical lexical acquisition begins with the recognition and repetition of phonologically similar words, followed by a rapid increase in the number of words, known as the vocabulary explosion, at around 18 months of age. This would be explained by the initial system of coding and attribution of characteristics [14. Bloom P. Précis of how children learn the meanings of words. Behavioral and Brain Sciences. 2001;24(6):1095-103.]. At around 2 years of age, an acquisition of 50 to 600 words is observed, at a rate of 10 words a day [15. Scherer S, Souza APR. Types e Tokens na aquisição típica de linguagem por sujeitos de 18 a 32 meses falantes do português brasileiro. Rev. CEFAC. 2011;13(5):838-45.].

To assess lexical variety, that is, the range of different words spoken by the child, the type/token ratio is calculated: the number of different lexical items produced divided by the total number of lexical items. It is a measure of linguistic production used to estimate lexical proficiency. A type is each different lexical item spoken by the child, and tokens are the occurrences of each type within the same speech sample [9] [15].
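The type/token ratio described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `type_token_counts` and the sample word list are hypothetical and do not come from the study, and items are compared after simple lowercasing rather than after any linguistic normalization.

```python
from collections import Counter

def type_token_counts(words):
    """Return (types, tokens, ratio) for a list of transcribed lexical items.

    types  = number of different lexical items
    tokens = total number of occurrences
    ratio  = types / tokens (0.0 for an empty sample)
    """
    counts = Counter(w.lower() for w in words)
    n_types = len(counts)
    n_tokens = sum(counts.values())
    ratio = n_types / n_tokens if n_tokens else 0.0
    return n_types, n_tokens, ratio

# Hypothetical transcript, invented for illustration only.
sample = ["bola", "mamãe", "bola", "dá", "au-au", "bola"]
types, tokens, ttr = type_token_counts(sample)
print(types, tokens, round(ttr, 2))  # 4 6 0.67
```

A ratio close to 1 means that few items are repeated, while a low ratio means that the sample consists mostly of repetitions of the same items.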

Among the variables examined in lexical studies, some papers propose the noun bias hypothesis, according to which nouns are the prevalent word category during initial lexical acquisition. The hypothesis has two versions. In the strong version, nouns are acquired first, then verbs, and then the remaining parts of speech. In the weak version, nouns appear at the same time as verbs but still take priority [9] [16. Tonietto L, Parente MAMP, Duvignau K, Gaume B, Bosa CA. Aquisição inicial do léxico verbal e aproximações semânticas em português. Psicol. Reflex. Crit. 2007;20(1):114-23.].

Accordingly, the objective of this article is to verify how initial lexical acquisition takes place in children with typical development, in terms of the types and tokens of lexical items. It also seeks to verify whether the noun bias hypothesis holds and, if so, in which version, strong or weak, in relation to age group.


This is a cross-sectional, quantitative study linked to a project previously approved by the Research Ethics Committee of the Federal University of Santa Maria under registration number 0219.0.243.000-11. The sample comprised 20 children of both genders with typical language development, speakers of Portuguese from the south of Brazil, with no history of bilingualism and from a low economic class. The number of individuals was obtained through a sample size calculation [17. Andrade CRF. Prevalência das desordens idiopáticas da fala e da linguagem em crianças de um a onze anos de idade. Rev. Saúde Pública. 1997;31(5):495-501.] based on the number of children enrolled in early childhood education at public schools in three regions of Santa Maria - RS. The study covered ages from 1:0 to 1:11 (years:months). The sample was divided by gender and into three age groups: age group 1, from 1:0 to 1:3;29; age group 2, from 1:4 to 1:7;29; and age group 3, from 1:8 to 1:11;29. The first two age groups had six subjects each and the last had eight subjects.
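The age-group boundaries above can be expressed as a small helper. This is a sketch under assumptions: the function name `age_group` and its argument layout (years, months, days, following the years:months;days notation) are hypothetical, and ages outside the studied range simply return `None`.

```python
def age_group(years, months, days=0):
    """Map an age written as years:months;days to age group 1, 2 or 3.

    Group 1: 1:0 to 1:3;29, group 2: 1:4 to 1:7;29, group 3: 1:8 to 1:11;29.
    Returns None for ages outside the range covered by the study.
    """
    if years != 1 or not 0 <= months <= 11 or not 0 <= days <= 29:
        return None
    # Each group spans four consecutive months of the second year of life.
    return months // 4 + 1

print(age_group(1, 3, 29))  # 1
print(age_group(1, 4))      # 2
print(age_group(1, 11, 29)) # 3
print(age_group(2, 0))      # None
```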

The subjects' guardians agreed to take part in the research after receiving a complete explanation of its nature, procedures, risks and benefits, and of the confidentiality of their identities. They then signed an Informed Consent Form and completed a questionnaire covering pre- and postnatal exams.

The inclusion criteria were: age between 1:0 and 1:11;29; membership of a Portuguese-speaking family; typical language development; either gender; and low economic class.

The exclusion criteria were: any degree of hearing loss; neurological, emotional and/or cognitive limitations; alterations of motor or organic origin; previous speech therapy or speech therapy ongoing during the research; and speech alterations that impair language and speech development.

The subjects underwent the following evaluations: the Behavioral Observation Protocol [18. Zorzi JL, Hage SRV. PROC - Protocolo de observação comportamental: avaliação de linguagem e aspectos cognitivos infantis. São José dos Campos: Pulso Editorial; 2004.]; an evaluation of the orofacial structures based on the Orofacial Myofunctional Evaluation Protocol with Scores (OMES) [19. Felício CM, Ferreira CL. Protocol of orofacial functional evaluation with scores. Int J Pediatr Otorhinolaryngol. 2008;72:367-75.]; and Visual Reinforcement Audiometry (VRA). Speech samples were then recorded on video with a Samsung camera (SMX-C200). The material used during filming was a box of toys, including cars, miniature animals, dolls and children's books, used by the researchers and the children. All evaluations were carried out at the children's schools. The recordings were stored on computers for phonetic transcription and data analysis by three judges (two undergraduate students and one doctoral student). A word was excluded from the transcriptions if at least two judges did not agree on it. Phonetic transcription would not be necessary for the lexical analysis; however, it will help in the early detection of possible delays or deviations in phonological development, contributing to other research. Each recording lasted 20 minutes, so as to capture a relevant sample of the child's speech.
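The agreement rule between the three judges can be illustrated as follows. This is a minimal sketch, not the procedure actually used by the authors: it assumes, hypothetically, that the three transcriptions have already been aligned word by word, and the function name `words_with_agreement` and the sample data are invented.

```python
from collections import Counter

def words_with_agreement(judged_words, min_agreement=2):
    """Keep only the words on which at least `min_agreement` judges agree.

    `judged_words` is a list of tuples, one tuple per word, holding the
    transcription given by each judge. The majority form is kept.
    """
    kept = []
    for transcriptions in judged_words:
        form, votes = Counter(transcriptions).most_common(1)[0]
        if votes >= min_agreement:
            kept.append(form)
    return kept

# Hypothetical transcriptions by three judges, invented for illustration only.
sample = [
    ("bola", "bola", "bola"),   # full agreement: kept
    ("papá", "papá", "babá"),   # two judges agree: kept
    ("auá", "aguá", "abá"),     # no agreement: excluded
]
print(words_with_agreement(sample))  # ['bola', 'papá']
```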

Two criteria were used to classify the data: "types and tokens" and "parts of speech produced". Types were counted as each different lexical item said by the child, and tokens as each occurrence of a type within the same speech sample [20. Templin MC. Certain language skills in children: their development and interrelations. Westport, CT: Greenwood; 1957.]. Content words are understood as verbs and nouns, and grammatical words as adjectives, adverbs, interjections, pronouns and prepositions.
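The grouping into content and grammatical words defined above can be sketched as a simple count over items that have already been assigned a part of speech. The function name `count_word_groups`, the English class labels and the sample data are hypothetical; classes not listed in the study's definition are counted separately as "other".

```python
# Word classes as grouped in this study's classification criteria.
CONTENT_CLASSES = {"noun", "verb"}
GRAMMATICAL_CLASSES = {"adjective", "adverb", "interjection", "pronoun", "preposition"}

def count_word_groups(tagged_items):
    """Count content, grammatical and unclassified items in (word, class) pairs."""
    counts = {"content": 0, "grammatical": 0, "other": 0}
    for _word, word_class in tagged_items:
        if word_class in CONTENT_CLASSES:
            counts["content"] += 1
        elif word_class in GRAMMATICAL_CLASSES:
            counts["grammatical"] += 1
        else:
            counts["other"] += 1
    return counts

# Hypothetical tagged tokens, invented for illustration only.
sample = [("bola", "noun"), ("dá", "verb"), ("aqui", "adverb"), ("bola", "noun")]
print(count_word_groups(sample))  # {'content': 3, 'grammatical': 1, 'other': 0}
```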

In this way, the production frequency of each part of speech can be verified by age group and gender, in order to analyze whether the noun bias hypothesis holds and whether it appears in its strong or weak version, and to compare the data with other studies on the topic in the national and international literature.

The data were analyzed with the Mann-Whitney, Kruskal-Wallis and Wilcoxon statistical tests. The significance level adopted for the statistical tests was 5% (p < 0.05).
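To make the rank-based tests named above more concrete, the sketch below computes the Mann-Whitney U statistic (two independent groups, such as the two genders) and the Kruskal-Wallis H statistic (three or more groups, such as the age groups) in plain Python. It is an illustration under assumptions: the article does not state which software was used, the function names and the sample values are hypothetical, the H statistic is given without tie correction, and no p-values are computed.

```python
def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def mann_whitney_u(x, y):
    """Return the Mann-Whitney U statistic (the smaller of U_x and U_y)."""
    ranks = average_ranks(list(x) + list(y))
    rank_sum_x = sum(ranks[:len(x)])
    u_x = rank_sum_x - len(x) * (len(x) + 1) / 2
    u_y = len(x) * len(y) - u_x
    return min(u_x, u_y)

def kruskal_wallis_h(*groups):
    """Return the Kruskal-Wallis H statistic, without tie correction."""
    pooled = [v for g in groups for v in g]
    ranks = average_ranks(pooled)
    n = len(pooled)
    total, start = 0.0, 0
    for g in groups:
        rank_sum = sum(ranks[start:start + len(g)])
        total += rank_sum ** 2 / len(g)
        start += len(g)
    return 12 / (n * (n + 1)) * total - 3 * (n + 1)

# Hypothetical numbers of types per child, invented for illustration only.
group1, group2, group3 = [3, 5, 8], [10, 12, 15], [11, 18, 20]
print(mann_whitney_u(group1, group2))                       # 0.0
print(round(kruskal_wallis_h(group1, group2, group3), 3))   # 5.956
```

In practice a statistics package would normally be used to obtain the statistic together with its p-value; for example, SciPy provides `scipy.stats.mannwhitneyu`, `scipy.stats.kruskal` and `scipy.stats.wilcoxon`.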


Table 1 presents the analysis of the linguistic variables in each subject's productions by gender, through a comparison of means. There was no significant difference for any of the variables.

Table 1:
Numerical variable average between genders

Table 2 shows the comparisons between the age groups using the Kruskal-Wallis statistical test. There is a statistically significant difference between age group 1 and age groups 2 and 3 for most of the variables.

Table 2:
Numerical variable average between age groups

Table 3 presents the analysis of the grammatical and content words produced in each age group, through a comparison of means. Content words predominated in age groups 2 and 3.

Table 3:
Comparative analysis of grammatical and content words in each age group

Table 4 presents the comparison between the noun and verb classes in each age group. Nouns predominated over verbs in all age groups.

Table 4:
Comparative analysis of nouns and verbs in each age group


Table 1 shows no statistically significant difference between boys and girls for the variables studied. These data agree with a study [15] that aimed to compare types, tokens and the type/token ratio in children of both genders, speakers of Brazilian Portuguese, by part of speech and in total and segmented measures. Its authors found no difference between genders, which indicates a balance in initial lexical acquisition between the two groups.

However, another study [21. Le Normand MT, Parisse C, Cohen H. Lexical diversity and productivity in French preschoolers: Developmental, gender, and sociocultural factors. Clin Linguist Phon. 2008;22:47-58.], which measured grammatical and lexical development, including mean length of utterance in relation to type/token, found that gender does influence language acquisition. Its statistical results revealed a general gender effect, with a small advantage in language production for girls over boys up to 36 months of age. The difference between the studies may be related to language, since the cited study [21] was conducted with French children of the same ages, although such an explanation is still premature.

Research on the acquisition of lexical and morphological coda found that female gender favors correct production. This reinforces findings that highlight female superiority in tasks related to language and speech abilities [7] [22. Mezzomo CL, Mota HB, Dias RF, Giacchini V. Fatores relevantes para aquisição da coda lexical e morfológica no português brasileiro. Rev. CEFAC. 2010;12(3):412-20.]. The same did not occur in the current study, in which there was no variation between the genders, probably because it focused on the initial lexical repertoire and did not include phonetic analyses.

In the comparative analysis of nouns and verbs between genders, the current study agrees with research [23. Befi-Lopes DM, Cáceres AM, Araújo K. Aquisição de verbos em pré-escolares falantes do português brasileiro. Rev CEFAC. 2007;9(4):444-52.] on the relation between the use of nouns and verbs and their classification in spontaneous speech by preschoolers with typical language development. As found here, that study concluded that gender did not influence verb and noun production.

In the comparative analysis between the age groups, statistically significant differences were found for types, tokens, grammatical words, nouns and content words. These data agree with authors [24. Klee T, Stokes SF, Wong AM, Fletcher P, Gavin WJ. Utterance length and lexical diversity in Cantonese-speaking children with and without specific language impairment. J Speech Lang Hear Res. 2004;47(6):1396-410.] who found that the numbers of types and tokens in a language sample of fixed length increase with age. This has been described as a "linguistic facility index", which reflects several factors, such as speech maturation and the ability to produce a minimal syntactic organization, like a noun phrase, a verb phrase or even a clause, which demands greater syntactic and lexical knowledge, in other words, a larger number of conjunctions, pronouns and articles, among others [25. Garcia PCC, Vigário M. Palavras complexas nas primeiras produções infantis (estudo de caso) [Dissertação]. Lisboa: Universidade Católica Portuguesa; 2010.].

As the child grows, his or her lexical repertoire increases. At the beginning, the child uses a small number of words belonging to a few parts of speech; with age, both the number of words and the variety of parts of speech increase [26. Kim M, McGregor K, Thompson C. Early lexical development in English- and Korean-speaking children: language-general and language-specific patterns. J Child Lang. 2000;27:225-54.].

International studies confirm that, after slow vocabulary growth from approximately 12 to 24 months of age, the child goes through a period called the vocabulary explosion, which demonstrates the effect of age on the lexical items produced, as occurred in the current study [27. Kauschke C, Hofmeister C. Early lexical development in German: a study on vocabulary growth and vocabulary composition during the second and third year of life. J Child Lang. 2001;29:735-57.] [28. D'Odorico L, Carubbi S, Salerni N, Calvo V. Vocabulary development in Italian children: a longitudinal evaluation of quantitative and qualitative aspect. J Child Lang. 2001;28:351-2.] [29. Choi S, Gopnik A. Early acquisition of verbs in Korean: A cross-linguistic study. J. Child Lang. 1995;22:497-529.] [30. Nelson K. Structure and strategy in learning to talk. [monografia]. Chicago (IL): University of Chicago; 1973.].

Also regarding age group, the data found here agree with an international study that describes the emergence of grammatical words as slow [30], since content words predominated.

According to another study, children's first words may be context-bound, produced only in limited or specific situations. The context is an event that occurs with some regularity for the child. There are, however, contextually flexible words that are used referentially to indicate classes of objects, proper names, individual objects, people, animals or actions. Initially, words are acquired slowly (around one to three new words per week), and utterances are limited to one word at a time. This explains why, in this study, the significant differences lie between age groups 1 and 2 and between age groups 1 and 3, but not between 2 and 3, since greater stability was found in these last two groups [12].

Analyzing the age groups in relation to the parts of speech and content words, age groups 2 and 3 show statistical significance, which did not occur in age group 1. This may be due to the small number of words produced by the children in age group 1. It also agrees with a finding in the literature: at 18 months, nouns are the most frequent items in the children's lexicon; from 24 to 32 months, verbs tend to match or overtake nouns; and at around 32 months, an age not covered in this study, the parts of speech and the content words should be more balanced [15].

A study [31. Bates J. Child-care history and kindergarten adjustment. Dev Psychol. 1994;30:690-700.] with a sample aged from 8 months to 2:6 also observed that nouns predominate, averaging 55% of the lexicon of children with a vocabulary of 100 to 200 words, while content words accounted for less than 15%.

Table 4 confirms the noun bias hypothesis in its weak version: the results showed that nouns outnumbered verbs throughout the period of lexical acquisition studied, but noun production was not exclusive even in this initial period of language acquisition.
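The reasoning in the previous paragraph can be written as a simple decision rule. This is a deliberately simplified sketch: the function name `noun_bias_version` and the counts are hypothetical, and the strong version, which concerns the order of acquisition, is approximated here by the complete absence of verbs in the sample.

```python
def noun_bias_version(n_nouns, n_verbs):
    """Classify a sample by the noun bias version that its counts suggest.

    "strong": nouns are produced and verbs are absent (simplifying assumption)
    "weak":   nouns outnumber verbs, but verbs are already produced
    "none":   nouns do not outnumber verbs
    """
    if n_nouns <= n_verbs:
        return "none"
    return "strong" if n_verbs == 0 else "weak"

# Hypothetical counts, invented for illustration only.
print(noun_bias_version(12, 5))  # weak
print(noun_bias_version(7, 0))   # strong
print(noun_bias_version(4, 6))   # none
```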

This supports the Natural Partitions Theory [32. Gentner D. Why nouns are learned before verbs: Linguistic relativity versus natural partitioning. In: Kuczaj SA. Language development: Language, thought, and culture. Washington: Erlbaum; 1982. p. 301-34.], according to which the prevalence of nouns over verbs during initial lexical acquisition results from a cognitive tendency: nouns are understood first by the child because they are more concrete than verbs. Verbs are relational terms that refer to more abstract and less cohesive concepts, so the boundaries that distinguish one verb from another are less clear and harder to acquire [33. Rodrigues JC, Tonietto L, Sperb TM, Parente MAMP. Convencionalidade na aquisição de verbos: estudo comparativo das análises dicotômicas e contínuas. Psic. Teor. e Pesq. 2012;28(1):77-85.] [34. Sancassani M. Aquisição de verbos: uma questão de perspectiva sintática? ReVEL. 2012;6:27-61.].

Despite limitations such as the short duration of the speech recordings and the fact that the child interacted with the examiner rather than with someone familiar, this study is believed to contribute to speech-language clinical practice and to the early diagnosis of language alterations in children from a low socioeconomic class, and to support the consideration of an adequate lexical repertoire in therapeutic planning.


The analysis of the data showed that initial lexical acquisition in children with typical development progresses as age increases and that, in this period, gender does not influence linguistic production.

The noun bias hypothesis was confirmed in its weak version, in agreement with the thesis that motivated this research. Based on these results, therapy sessions can be improved by taking into account the words children use in the initial phase of language acquisition, for example in the naming techniques used in therapy. It is also possible to detect early the risk of children developing language alterations and, from there, to implement prevention strategies, guidance for mothers and early stimulation.

Based on these results, further research on the topic is suggested, covering a wider age range, comparing genders, and analyzing children from different social classes and from other cities.


We thank CNPq and CAPES for their support of this research.

