CHILDES English Manchester Corpus

Elena V. M. Lieven

Max Planck Institute for Evolutionary Anthropology

Julian Pine
Department of Psychology
University of Liverpool

Caroline Rowland
Department of Psychology
University of Liverpool


Anna Theakston
Department of Psychology
University of Manchester


Participants: 12
Type of Study: normal play activities with mother
Location: England
Media type: audio archived
DOI: doi:10.21415/T54G6D

Citation information

In accordance with TalkBank rules, any use of data from this corpus must be accompanied by at least one of the above references.

Project Description

This corpus consists of transcripts of audio recordings from a longitudinal study of 12 English-speaking children between the ages of approximately 2 and 3 years. The children were recruited through newspaper advertisements and local nurseries. All the children were firstborn and monolingual, and all were cared for primarily by their mothers. Although socioeconomic status was not taken into account in recruitment, the children were from predominantly middle-class families. There were six boys and six girls, half from Manchester and half from Nottingham. At the beginning of the study, the children ranged in age from 1;8.22 to 2;0.25, with MLUs ranging from 1.06 to 2.27 morphemes. The children’s ages are available in the headers of each transcript. Their birthdates are as follows:

The transcripts for each child are numbered from 1 to 34 corresponding to the tape number and labeled (a) and (b) to correspond to the two 30-minute sessions within each recording. The following recording sessions were missed and therefore have no corresponding transcript: Aran14a/b, Carl14b, Carl24a/b, John15a/b, John16a/b, Ruth4a/b, Warren3b.
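The numbering scheme and the list of missed sessions can be checked mechanically. The following is a minimal sketch, not part of the corpus tooling: the function names and the (child, tape, part) tuple representation are assumptions made for illustration, and no claim is made about the actual transcript file names.

```python
import re

# Missed sessions, copied verbatim from the description above.
MISSING_NOTATION = (
    "Aran14a/b, Carl14b, Carl24a/b, John15a/b, John16a/b, Ruth4a/b, Warren3b"
)


def expand_missing(notation):
    """Expand labels such as 'Aran14a/b' into (child, tape, part) triples."""
    sessions = set()
    for entry in notation.split(","):
        match = re.fullmatch(r"([A-Za-z]+)(\d+)(a|b|a/b)", entry.strip())
        if match is None:
            raise ValueError(f"Unrecognised session label: {entry!r}")
        child, tape, parts = match.group(1), int(match.group(2)), match.group(3)
        for part in parts.split("/"):
            sessions.add((child, tape, part))
    return sessions


def expected_sessions(child, n_tapes=34):
    """All (child, tape, part) triples for one child, minus the missed ones."""
    missing = expand_missing(MISSING_NOTATION)
    every = {
        (child, tape, part)
        for tape in range(1, n_tapes + 1)
        for part in "ab"
    }
    return sorted(every - missing)


missing = expand_missing(MISSING_NOTATION)
print(len(missing))                     # missed 30-minute sessions
print(len(expected_sessions("Aran")))   # sessions expected for one child
```

On this reading, 12 children with 34 tapes of two sessions each give 816 possible sessions, of which 12 are listed as missed, so 804 transcripts would be expected if every other session was transcribed.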


The children were audiotaped in their homes for an hour on two separate occasions in every 3-week period for one year. They engaged in normal play activities with their mothers. For the first 30 minutes of each hour they played with their own toys, whilst for the second 30 minutes toys provided by the experimenter were also available to the child. The experimenter was present throughout the recordings to take contextual notes but attempted, as far as possible, to remain in the background.


All speech was transcribed with the exception of speech not directed to the child (i.e., speech between adults, telephone calls, etc.). However, if the child produced an utterance in response to such speech, the relevant utterances were transcribed. Generally speaking, contextual information was added only when the utterance would otherwise be unclear. Because the children were not videotaped, the experimenter’s notes were the only source of such information. Punctuation was kept to a minimum: double commas indicate tag questions, and single commas indicate vocatives.

Phonological Forms

The data were collected with the intention of looking specifically at early grammatical development. We were not interested in the specific phonological forms the children used. Therefore, unless the child used what appeared to be child-specific forms, the target word was transcribed rather than an approximation of the child’s phonological form. This also helped with coding using the MOR program.

Error Coding

The data were coded for the following errors (where ‘0’ indicates a missing speech component). For all of the errors the marker [*] was added to the main line and a dependent tier was added showing the correct form.
Missing morphemes: two dog-0s, he’s go-0ing
Case errors: her do it, me get it
Missing auxiliaries: it 0is going there, I 0am getting a drink
Word class errors: a that one
Agreement errors: a bricks, does she likes it?, it don’t go there
Pronominal errors: carry you (when the child wants to be carried)
Wrong word: I put it off (where the context indicates take is appropriate)
Overgeneralisations: it broked, I stayed it on there

Although we have attempted to be consistent in coding, errors may have been missed. In particular, missing auxiliaries and copulas have often not been coded. Where it was impossible to identify exactly what the error was, the error was simply marked on the main line with [*]. Anyone wishing to work on particular error types should carry out a detailed analysis of the child’s use of a particular system (e.g., pronoun case marking) rather than relying on pulling out errors by searching for the [*] error marker.
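For a first pass only, and bearing in mind the caveat above that the [*] marker is not an exhaustive record of errors, flagged utterances can be located with a simple search. The sketch below assumes standard CHAT layout, in which main lines begin with an asterisk and a speaker code (for example *CHI:) and continuation lines begin with a tab. The function names and the sample lines are hypothetical and are not taken from the corpus.

```python
import re

ERROR_MARKER = "[*]"


def main_tier_utterances(lines, speaker="CHI"):
    """Yield each main-line utterance for one speaker, joining continuations."""
    prefix = f"*{speaker}:"
    current = None
    for raw in lines:
        line = raw.rstrip("\n")
        # Tab-indented lines continue the previous tier.
        if line.startswith("\t") and current is not None:
            current += " " + line.strip()
            continue
        if current is not None:
            yield current
            current = None
        if line.startswith(prefix):
            current = line[len(prefix):].strip()
    if current is not None:
        yield current


def flagged_utterances(lines, speaker="CHI"):
    """Return the speaker's utterances that carry the [*] error marker."""
    return [u for u in main_tier_utterances(lines, speaker) if ERROR_MARKER in u]


def omitted_elements(utterance):
    """Return elements marked as missing with a leading '0', e.g. '0is' or '-0s'."""
    return re.findall(r"(?<!\d)0([a-z]+)", utterance)


# Invented sample lines in CHAT layout, for illustration only.
sample = [
    "*CHI:\ttwo dog-0s [*] .\n",
    "*MOT:\tyes , two dogs .\n",
    "*CHI:\tit broked [*] .\n",
    "*CHI:\tmore juice .\n",
]

for utterance in flagged_utterances(sample):
    print(utterance, omitted_elements(utterance))
```

A search of this kind finds only what the transcribers marked. Because missing auxiliaries and copulas were often left uncoded, counts obtained this way are best treated as a lower bound and a starting point for the detailed analysis recommended above.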