CHILDES English Edinburgh Corpus


Mitsuhiko Ota
Philosophy, Psychology and Language Sciences
University of Edinburgh

Barbora Skarabela
Philosophy, Psychology and Language Sciences
University of Edinburgh

Participants: 47
Recordings: 355
Type of Study: longitudinal
Location: United Kingdom
Media type: audio
DOI: doi:10.21415/SF64-JK03

Browsable transcripts

Download transcripts

Link to media folder

Citation Information

Ota, M., Davies-Jenkins, N., & Skarabela, B. (2018). Why choo-choo is better than train: The role of register-specific words in early vocabulary growth. Cognitive Science, 42, 1974-1999.

In accordance with TalkBank rules, any use of data from this corpus must be accompanied by the above reference.

Corpus Description

This corpus consists of transcripts and audio recordings of naturalistic interactions involving 47 full-term infants (24 girls and 23 boys) in Edinburgh, UK. Participating families were asked to record 90 minutes' worth of daily interactions with the target infant when the infant was 9, 15 and 21 months old. The number and length of individual sessions varied. The data currently available cover the 9- and 15-month sessions (60 minutes' worth per age point for each child). Of the 47 families, 24 consented to releasing the audio recordings, which are linked to the corresponding transcripts.

Associated Data

As part of the project, CDI data were also collected for all 47 children at 9, 15 and 21 months of age. These are available at the project's OSF site.

File Naming Conventions

Each file name consists of the child's pseudonym + month of recording + session number (e.g., adam0901.cha is the first recording session for Adam when he was 9 months old).
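The naming convention above can be unpacked programmatically. The following is a minimal sketch, not part of the corpus tooling: it assumes, based only on the adam0901.cha example, that the month and the session number are each written as two digits and that pseudonyms contain letters only. The names SessionName and parse_session_name are hypothetical.

```python
import re
from typing import NamedTuple


class SessionName(NamedTuple):
    """Components of a transcript file name (hypothetical helper type)."""
    pseudonym: str
    age_months: int
    session: int


# Assumption: letters-only pseudonym, two-digit month, two-digit session number.
_NAME_PATTERN = re.compile(r"^([A-Za-z]+)(\d{2})(\d{2})\.cha$")


def parse_session_name(filename: str) -> SessionName:
    """Split a file name such as 'adam0901.cha' into its three parts."""
    match = _NAME_PATTERN.match(filename)
    if match is None:
        raise ValueError(f"unexpected file name format: {filename!r}")
    pseudonym, month, session = match.groups()
    return SessionName(pseudonym, int(month), int(session))


if __name__ == "__main__":
    print(parse_session_name("adam0901.cha"))
```

File names that do not fit the assumed pattern raise a ValueError rather than being silently misparsed, so any deviation from the assumed two-digit layout surfaces immediately.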

Warnings

Transcription of unidentifiable materials ('&~') was done using impressionistic orthography and has not been checked for consistency. When the child's forms deviated noticeably from the targets, the %pho line was used to provide phonetic transcriptions. Where there is some overlap between two or more utterances, the sound bullets start at the onset of the first utterance and end at the offset of the last overlapping utterance (marked by '+<'). To facilitate MOR parsing and cross-dialectal comparison, American rather than British spelling has been used throughout, except for the word Mummy.

Codes

The following unique codes are used:

Researchers interested in looking only at infant/child-directed speech, for example, can exclude utterances with [+ NAC] from their analysis.
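Excluding utterances that carry [+ NAC] could be done along the following lines. This is a rough sketch under stated assumptions, not an official TalkBank tool: it assumes the usual CHAT layout in which main tiers start with '*', dependent tiers with '%', headers with '@', and continuation lines with a tab. The function name drop_nac_utterances and the sample lines are invented for illustration.

```python
from typing import Iterable, List

NAC_CODE = "[+ NAC]"


def drop_nac_utterances(lines: Iterable[str]) -> List[str]:
    """Return CHAT lines with every utterance coded [+ NAC] removed.

    An utterance is taken to be a main tier ('*SPK:') plus any
    continuation lines and dependent tiers ('%xxx:') that follow it.
    """
    kept: List[str] = []
    block: List[str] = []      # all lines of the current utterance
    main_tier: List[str] = []  # main-tier lines only (code is searched here)
    in_main = False

    def flush() -> None:
        # Keep the finished utterance unless its main tier carries the code.
        if block and NAC_CODE not in " ".join(main_tier):
            kept.extend(block)
        block.clear()
        main_tier.clear()

    for line in lines:
        if line.startswith("*"):    # a new utterance begins
            flush()
            block.append(line)
            main_tier.append(line)
            in_main = True
        elif line.startswith("@"):  # header lines are always kept
            flush()
            kept.append(line)
            in_main = False
        elif block:                 # dependent tier or continuation line
            block.append(line)
            if line.startswith("%"):
                in_main = False
            elif in_main:
                main_tier.append(line)
        else:                       # e.g. continuation of a header
            kept.append(line)
    flush()
    return kept


if __name__ == "__main__":
    sample = [
        "@Begin",
        "*MOT:\tlook at the doggie .",
        "*MOT:\tI will call you later . [+ NAC]",
        "*CHI:\tda .",
        "@End",
    ]
    for kept_line in drop_nac_utterances(sample):
        print(kept_line)
```

Dependent tiers are dropped together with their main tier, so lines such as %pho do not end up orphaned in the filtered output.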

Acknowledgements

This research was supported by the Economic & Social Research Council (Standard grant ES/J023825/1 "The role of baby-talk words in early language development") and the Royal Society of Edinburgh (Small grant 2082 "The Edinburgh Child Language Corpus"). We thank the families for their participation; Nicola Davies-Jenkins for data collection and transcription; Euan Adamson, Adela Bartlick Salcedo, Annie Doherty, Judit Fazekas, Angel Garmpi, Lisanne Go, Julia Heimann, Annie Holtz, Cliodhna Hughes, Anna Kinsella, Klara Kunst, Ines Lee, Tiffany Li, Brandon Papineau, Griffith Tai, Yuqi Qin, Yao Xiao, Isaac Yip, and Fotiana Zouvani for data transcription and processing.

Usage Restrictions

None.