Word frequency of the CHILDES corpus : Another perspective of child language features

Hanhong Li, Alex C Fang

Research output: Contribution to journal, Article, peer-review


Based on the corpus of the Child Language Data Exchange System (CHILDES), the current research explores the grammatical composition of child language in terms of word classes. Unlike past studies, the research reported in this article examines not only nouns and verbs but also adjectives, adverbs, prepositions, pronouns, determiners, and interjections. We investigated the word frequency patterns of these word classes in both child and maternal language in order to explore the correlation between input word frequency on the mother's side and output word frequency on the child's side. Our results show that the frequency patterns for word classes differ between child language and maternal language. Owing to their stage of mental development, young children use many more words with concrete and imageable referents, such as nouns and pronouns, than less concrete or imageable words such as adjectives, adverbs, prepositions, interjections and conjunctions. The data show that nouns are the most frequently used word class in child language. In addition, children acquire monosyllabic words more easily and use them much more frequently than multi-syllabic words, except for multi-syllabic words of extremely high frequency. Moreover, our study reveals a positive correlation between the token volume of maternal language output and that of child language output, suggesting the important role of maternal language in children's language acquisition. Adults' speech to children should therefore follow a principle of comprehensible input in order to help children acquire a larger vocabulary. We hope this finding in first language acquisition will inform further studies of second or foreign language acquisition, learning and teaching.
Original language: English
Pages (from-to): 95-116
Number of pages: 22
Publication status: Published - 2011

