CHILDES Dutch De Houwer Corpus

Annick De Houwer

Director
Harmonious Bilingualism Network (HaBilNet)
annick.dehouwer@habilnet.org

Participants:	4 (plus siblings and parents)
Type of Study:	naturalistic
Location:	Antwerp, Belgium
Media type:	audio
DOI:	doi:10.21415/T58C8C

Citation information

De Houwer, Annick (2003)

Language Variation and Change, 15

In accordance with TalkBank rules, any use of data from this corpus must be accompanied by at least the above reference.

Project Description

This corpus of Dutch child language and child-directed speech was collected in Antwerp, Belgium.

The corpus consists of 15 recordings transcribed orthographically and phonetically. Some transcripts also contain variety codes, speaker codes, addressee codes and utterance numbers (see further below). Participants are four children between the ages of ca. 4;9 and 5;0 (two boys, Dieter and Michiel, and two girls, Kim and Katrien) and their families, with other persons occasionally present as well. The families are lower-middle to middle-middle class. All children are addressed in some form of Dutch common around the city of Antwerp and go to school full-time (second year of nursery school). They are being raised monolingually. The interactions are mostly free and spontaneous, but also include some structured interactions in which the mother or father had a conversation with the 4-year-old about the past day at school, or prompted the child to describe a picture and tell a picture book story.

File	Utterances	Sex	Age	Birth Order
KIM Saturday	902	female	4;11.03	middle of three
KIM Friday	954	female	4;11.02	middle of three
KIM Tuesday	382	female	4;10.30	middle of three
DIETER Saturday	697	male	4;11.29	older of two
DIETER Tuesday	210	male	4;11.25	older of two
DIETER Wednesday	457	male	4;11.26	older of two
DIETER Thursday	132	male	4;11.27	older of two
KATRIEN Wednesday	2156	female	4;08.25	younger of two
KATRIEN SatAft	2148	female	4;08.28	younger of two
KATRIEN SatMorning	1931	female	4;08.28	younger of two
KATRIEN Tuesday	268	female	4;08.24	younger of two
MICHIEL Saturday	1037	male	4;08.22	younger of two
MICHIEL Wednesday	1062	male	4;08.26	younger of two
MICHIEL Monday	1135	male	4;08.24	younger of two
MICHIEL Tuesday	131	male	4;08.25	younger of two
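
As a quick consistency check on the table above, the following sketch sums the per-file utterance counts and reads the ages, which are written as years;months.days. It is only an illustration: the names ROWS, total_utterances, utterances_per_child and parse_age are made up for this example and are not part of any TalkBank tool, and the age parser also accepts a comma before the day field as an assumption about how such ages may be typed.

```python
import re

# Rows copied from the table above: (child, session, utterances, age).
ROWS = [
    ("KIM", "Saturday", 902, "4;11.03"),
    ("KIM", "Friday", 954, "4;11.02"),
    ("KIM", "Tuesday", 382, "4;10.30"),
    ("DIETER", "Saturday", 697, "4;11.29"),
    ("DIETER", "Tuesday", 210, "4;11.25"),
    ("DIETER", "Wednesday", 457, "4;11.26"),
    ("DIETER", "Thursday", 132, "4;11.27"),
    ("KATRIEN", "Wednesday", 2156, "4;08.25"),
    ("KATRIEN", "SatAft", 2148, "4;08.28"),
    ("KATRIEN", "SatMorning", 1931, "4;08.28"),
    ("KATRIEN", "Tuesday", 268, "4;08.24"),
    ("MICHIEL", "Saturday", 1037, "4;08.22"),
    ("MICHIEL", "Wednesday", 1062, "4;08.26"),
    ("MICHIEL", "Monday", 1135, "4;08.24"),
    ("MICHIEL", "Tuesday", 131, "4;08.25"),
]


def total_utterances(rows):
    """Sum the utterance counts over all files."""
    return sum(row[2] for row in rows)


def utterances_per_child(rows):
    """Sum the utterance counts separately for each child."""
    totals = {}
    for child, _session, count, _age in rows:
        totals[child] = totals.get(child, 0) + count
    return totals


def parse_age(age):
    """Split an age such as '4;11.03' into (years, months, days).

    Assumption: a comma before the day field is treated like a period.
    """
    match = re.fullmatch(r"(\d+);(\d+)[.,](\d+)", age)
    if match is None:
        raise ValueError(f"unrecognised age: {age!r}")
    return tuple(int(part) for part in match.groups())


ages = [parse_age(row[3]) for row in ROWS]
print(total_utterances(ROWS))      # 13602
print(utterances_per_child(ROWS))
print(min(ages), max(ages))        # (4, 8, 22) (4, 11, 29)
```

The total agrees with the 13,602 utterances reported in the project description, and the youngest and oldest recorded ages fall within the stated range of ca. 4;9 to 5;0.
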
The transcripts consist of 13,602 utterances (children and adults combined). Both adult and child utterances were transcribed phonetically and orthographically by three separate coders: the first two each made a transcript from scratch, and the third resolved any differences between the two. For each transcript there was at least one coder from the Antwerp area and one coder from outside the Antwerp region.

Phonetic transcription was originally carried out in Dutch UNIBET as developed by Steven Gillis, and is fairly narrow, especially as regards vowel sounds. However, prosody was not transcribed. As most recently described in Nuyts (1989), Antwerp vowel phonemes differ quite substantially from standard Dutch phonemes both in their type and in their distribution. The Dutch UNIBET system first used for the phonological transcription could not handle all the phonemes. Rather than develop a new system, approximations were used where necessary, with an explanation in a following %exp line of how a particular phoneme symbol was best interpreted. The UNIBET symbols were converted to Unicode, but researchers who prefer to work with the original UNIBET files are welcome to contact the author of the data for more information. Also, there remain 0Xfa symbols in the Unicode for sounds that could not be approximated with the UNIBET symbols. Finally, the files for the child MICHIEL may contain some inaccuracies on the %pho line with regard to the long low open vowel phoneme used in Antwerp renderings of HIJ, MIJN and the like. Researchers wanting to work with these data are welcome to contact the author of the data to resolve these problems.

While Dutch standard spelling was generally used, the orthographic transcript stays as close to the phonetic transcript as possible, and indicates missing initial and final sounds between brackets. Where this is not the case, and there seems to be a mismatch between the phonetic and orthographic transcript lines, the phonetic line should be taken as most closely resembling the original utterance. Utterance lines may be followed by comment lines. These are in Dutch.

For 10 of the 15 data files there is an additional coding line for each utterance (5 of these are complete and double-checked; the other 5 are provisional). This line includes the following:

- an utterance number followed by a slash
- a three-letter code, where the first letter refers to the speaker, the second letter refers to the kind of Dutch that is being used (variety neutral, or 'local', meaning that the utterance contained a form typical of Antwerp dialect), and the third letter refers to the addressee

More information on these codes can be found in De Houwer (2003; see Citation information), or can be obtained directly from the author of these data at annick.dehouwer@ua.ac.be. If the coding line indicates that the utterance contained material coded as 'local', an explanation line follows to identify what exactly in the utterance led to that coding decision (e.g., a particular dialect phoneme, use of a dialect pronoun, use of specific dialect vocabulary; see De Houwer 2003).

The data show the following distinctions in usage. 'Local' utterances containing dialect elements tend to be used when older children and adults in the family address each other. 'Neutral' forms that are common all over Flanders may also be used, while 'distal' features, which are clear 'imports' from a Dutch variety outside Flanders, are avoided. However, when older children and adults address the younger members of the family, they increase their use of neutral forms, substantially reduce their use of local forms, and occasionally use distal forms. The younger children use mainly utterances categorized as neutral, dependent on who they are addressing. Implications of this variation across family members for language change are discussed in De Houwer (2003).

Reference: Nuyts, Jan. (1989). Het Antwerps vokaalsysteem: een synchronische en diachronische schets. Taal en tongval 41(1-2): 22-48.
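
The coding line described above (an utterance number, a slash, then a three-letter code for speaker, variety and addressee) could be read with a small parser along the following lines. This is a minimal sketch under stated assumptions: the exact layout of the line in the files, the tier it appears on, and the inventory of letters are not given in this description, so the sample value "17/MLC" and the names UtteranceCode and parse_code are purely hypothetical. De Houwer (2003) or the transcripts themselves should be consulted for the real conventions.

```python
import re
from typing import NamedTuple, Optional


class UtteranceCode(NamedTuple):
    """One coding line: utterance number plus the three code letters."""
    number: int
    speaker: str     # first letter: who is speaking
    variety: str     # second letter: kind of Dutch (e.g. neutral or local)
    addressee: str   # third letter: who is being addressed


# Assumed layout: digits, a slash, then exactly three letters.
# Optional spaces around the slash are tolerated as a guess.
_CODE_RE = re.compile(r"(\d+)\s*/\s*([A-Za-z])([A-Za-z])([A-Za-z])")


def parse_code(text: str) -> Optional[UtteranceCode]:
    """Return the parsed code, or None if the text does not fit the layout."""
    match = _CODE_RE.fullmatch(text.strip())
    if match is None:
        return None
    number, speaker, variety, addressee = match.groups()
    return UtteranceCode(int(number), speaker, variety, addressee)


# Hypothetical example value; the letters carry no documented meaning here.
print(parse_code("17/MLC"))
print(parse_code("not a code"))
```

Returning None for lines that do not fit, rather than raising an error, is one way to skip utterances in the 5 files that have no coding line; a stricter reader might prefer an exception.
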

Acknowledgements

Transcription and coding of the Antwerp Dutch corpus was made possible through grants to the author from the Belgian Science Foundation and the University of Antwerp.
