CHILDES Dutch De Houwer Corpus
Annick De Houwer
Director
Harmonious Bilingualism Network (HaBilNet)
annick.dehouwer@habilnet.org
Participants: | 4 (plus siblings and parents) |
Type of Study: | naturalistic |
Location: | Antwerp, Belgium |
Media type: | audio |
DOI: | doi:10.21415/T58C8C |
In accordance with TalkBank rules, any use of data from this corpus
must be accompanied by at least the above reference.
Project Description
This corpus of Dutch child language and child-directed speech was
collected in Antwerp, Belgium.
The corpus consists of 15 recordings transcribed orthographically and
phonetically. Some transcripts also contain variety codes, speaker
codes, addressee codes and utterance numbers (see further below).
Participants are four children between the ages of ca. 4;9 and 5;0 (two
boys Dieter and Michiel, and two girls Kim and Katrien) and their
families, with other persons occasionally present as well. The
families are lower-middle to middle-middle class. All children are
addressed in some form of Dutch common around the city of Antwerp and go
to school full-time (second year of nursery school). They are being
raised monolingually. The interactions are mostly free and spontaneous,
but include some structured interactions as well, in which the mother or
father had a conversation with the 4-year-old about the past day at
school, or prompted the child to describe a picture and tell a picture
book story.
Child | File | Utterances | Sex | Age | Birth Order
KIM | Saturday | 902 | female | 4;11.03 | middle of three
KIM | Friday | 954 | female | 4;11.02 | middle of three
KIM | Tuesday | 382 | female | 4;10.30 | middle of three
DIETER | Saturday | 697 | male | 4;11.29 | older of two
DIETER | Tuesday | 210 | male | 4;11.25 | older of two
DIETER | Wednesday | 457 | male | 4;11.26 | older of two
DIETER | Thursday | 132 | male | 4;11.27 | older of two
KATRIEN | Wednesday | 2156 | female | 4;08.25 | younger of two
KATRIEN | SatAft | 2148 | female | 4;08.28 | younger of two
KATRIEN | SatMorning | 1931 | female | 4;08.28 | younger of two
KATRIEN | Tuesday | 268 | female | 4;08.24 | younger of two
MICHIEL | Saturday | 1037 | male | 4;08.22 | younger of two
MICHIEL | Wednesday | 1062 | male | 4;08.26 | younger of two
MICHIEL | Monday | 1135 | male | 4;08.24 | younger of two
MICHIEL | Tuesday | 131 | male | 4;08.25 | younger of two
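As a quick sanity check on the table above, the per-file utterance counts can be totalled per child and the ages split into their components. The sketch below is only illustrative: the tuple layout and the function names `parse_age` and `utterances_per_child` are hypothetical, the ages are assumed to follow the `years;months.days` pattern, and the comma in `4;08,28` is assumed to be a typo for a period.

```python
from collections import defaultdict

# (child, file, utterances, age) copied from the table above.
FILES = [
    ("KIM", "Saturday", 902, "4;11.03"),
    ("KIM", "Friday", 954, "4;11.02"),
    ("KIM", "Tuesday", 382, "4;10.30"),
    ("DIETER", "Saturday", 697, "4;11.29"),
    ("DIETER", "Tuesday", 210, "4;11.25"),
    ("DIETER", "Wednesday", 457, "4;11.26"),
    ("DIETER", "Thursday", 132, "4;11.27"),
    ("KATRIEN", "Wednesday", 2156, "4;08.25"),
    ("KATRIEN", "SatAft", 2148, "4;08,28"),
    ("KATRIEN", "SatMorning", 1931, "4;08,28"),
    ("KATRIEN", "Tuesday", 268, "4;08.24"),
    ("MICHIEL", "Saturday", 1037, "4;08.22"),
    ("MICHIEL", "Wednesday", 1062, "4;08.26"),
    ("MICHIEL", "Monday", 1135, "4;08.24"),
    ("MICHIEL", "Tuesday", 131, "4;08.25"),
]


def parse_age(age):
    """Split a 'years;months.days' string into a (years, months, days) tuple.

    A comma before the days is treated as a period (assumed typo).
    """
    years, rest = age.replace(",", ".").split(";")
    months, days = rest.split(".")
    return int(years), int(months), int(days)


def utterances_per_child(files):
    """Sum the utterance counts of all files belonging to each child."""
    totals = defaultdict(int)
    for child, _file, utterances, _age in files:
        totals[child] += utterances
    return dict(totals)


totals = utterances_per_child(FILES)
grand_total = sum(totals.values())
ages = [parse_age(age) for _child, _file, _count, age in FILES]
```

Summed this way, the fifteen files give the 13,602 utterances reported for the corpus as a whole.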
The transcripts consist of 13,602 utterances (children and adults
combined). Both adult and child utterances were phonetically and
orthographically transcribed by three separate coders: the first two
each made a transcript from scratch, and the third resolved any
differences between the two. For each transcript there was at least one
coder from the Antwerp area and one coder from outside it. Phonetic
transcription was originally carried out in Dutch UNIBET as developed by
Steven Gillis, and is fairly narrow, especially as regards vowel sounds.
However, prosody was not transcribed.
As most recently described in Nuyts (1989), Antwerp vowel phonemes
differ quite substantially from standard Dutch phonemes both in their
type and in their distribution. The Dutch UNIBET system first used for
the phonological transcription could not handle all the phonemes. Rather
than develop a new system, approximations were used where necessary,
with an explanation in a following %exp line of how a particular phoneme
symbol was best interpreted. The UNIBET symbols were converted to
Unicode, but researchers who prefer to work with the original UNIBET
files are welcome to contact the author of the data for more
information. Also, there remain 0Xfa symbols in the Unicode version for
sounds that could not be approximated with the UNIBET symbols. Finally,
the files for the child MICHIEL may contain some inaccuracies on the
%pho line with regard to the long low open vowel phoneme used in Antwerp
renderings of HIJ, MIJN and the like. Researchers wanting to work with
these data are welcome to contact the author of the data to resolve
these problems.
While Dutch standard spelling was generally used, the orthographic
transcript stays as close to the phonetic transcript as possible, and
indicates missing initial and final sounds between brackets. Where this
is not the case, and there seems to be a mismatch between the phonetic
and orthographic transcript lines, it is the phonetic line that should
be taken as most closely resembling the original utterance.
Utterance lines may be
followed by comment lines. These are in Dutch.
For 10 of the 15 data files there is an additional coding line for each
utterance (5 of these are complete and double-checked; the other 5 are
provisional). This line includes the following:
- an utterance number followed by a slash
- a three-letter code, where the first letter refers to the speaker, the
second letter refers to the kind of Dutch that is being used (variety
neutral, or 'local', meaning that the utterance contained a form typical
of Antwerp dialect), and the third letter refers to the addressee.
More information on these codes can be found in De Houwer, 2003
(reference below), or can be obtained directly from the author of these
data at annick.dehouwer@ua.ac.be. If the coding line indicates that the
utterance contained material coded as 'local', an explanation line
follows to identify what exactly it was in the utterance that led to
that coding decision (e.g., a particular dialect phoneme, use of a
dialect pronoun, use of specific dialect vocabulary, etc.; see De
Houwer 2003).
The data show that the following distinctions in usage
emerge: 'local' utterances containing dialect elements tend to be used
when older children and adults in the family address each other.
'Neutral' forms that are common all over Flanders may also be used,
while 'distal' features, which are clear 'imports' from a Dutch variety
outside Flanders, are avoided. However, when older children and
adults address the younger members of the family, they increase their
use of neutral forms, substantially reduce their use of local forms, and
occasionally use distal forms. The younger children use mainly
utterances categorized as neutral, depending on whom they are
addressing. Implications of this variation across family members for
language change are discussed.
Reference: Nuyts, Jan. (1989). Het Antwerps vokaalsysteem: een
synchronische en diachronische schets. Taal en tongval 41(1-2): 22-48.
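For readers who want to process the additional coding line described above (an utterance number, a slash, and a three-letter speaker/variety/addressee code), a minimal parsing sketch follows. It makes several assumptions: the exact surface form of the line is not specified here, so the pattern below (digits, a slash, then exactly three letters) is a guess; the names `UtteranceCode` and `parse_code` and the sample string are hypothetical; and the meaning of the individual letters is deliberately left uninterpreted, since the code values are documented in De Houwer 2003 rather than here.

```python
import re
from typing import NamedTuple, Optional


class UtteranceCode(NamedTuple):
    number: int      # utterance number before the slash
    speaker: str     # first letter: speaker
    variety: str     # second letter: kind of Dutch used
    addressee: str   # third letter: addressee


# Assumed surface form: digits, a slash, then exactly three letters.
CODE_PATTERN = re.compile(r"^\s*(\d+)\s*/\s*([A-Za-z])([A-Za-z])([A-Za-z])\s*$")


def parse_code(text: str) -> Optional[UtteranceCode]:
    """Return the parsed code, or None if the text does not fit the assumed form."""
    match = CODE_PATTERN.match(text)
    if match is None:
        return None
    number, speaker, variety, addressee = match.groups()
    return UtteranceCode(int(number), speaker, variety, addressee)


# "17/ABC" is a made-up sample, not a code taken from the corpus.
sample = parse_code("17/ABC")
```

Returning `None` for lines that do not match keeps the sketch usable on the five files without coding lines and on the five provisionally coded ones, where the format may be less regular.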
Acknowledgements
Transcription and coding of the Antwerp Dutch corpus were made possible
through grants to the author from the Belgian Science Foundation and the
University of Antwerp.