CHILDES Danish Plunkett Corpus

Kim Plunkett
Department of Psychology
University of Oxford
kim.plunkett@psy.oxford.ac.uk

Participants:	2
Type of Study:	naturalistic
Location:	Denmark
Media type:	video
DOI:	doi:10.21415/T53594

Citation information

Plunkett, K. (1985). Preliminary approaches to language development. Århus: Århus University Press.

Plunkett, K. (1986). Learning strategies in two Danish children’s language development. Scandinavian Journal of Psychology, 27, 64–73.

In accordance with TalkBank rules, any use of data from this corpus must be accompanied by at least one of the above references.

Other relevant references include:

Peters, A. (1983). The units of language acquisition. New York: Cambridge University Press.

Piaget, J. (1952). The origins of intelligence in children. New York: International Universities Press.

Snow, C. E. (1972). Mothers’ speech to children learning language. Child Development, 43, 549–565.

Uzgiris, I., & Hunt, J. (1975). Toward ordinal scales of psychological development in infancy. Champaign: University of Illinois Press.

Project Description

This directory contains longitudinal corpora from two Danish children — Anne and Jens — studied by Kim Plunkett of Århus University from 1982 to 1987. The data were contributed to CHILDES in 1989. In addition to the CHAT transcripts, results from the Uzgiris-Hunt Infant Assessment Scales are available for both children during the first year of the study, as are results of various comprehension tests.

Anne was born 20-FEB-1982 and Jens was born 14-NOV-1981. Data collection continued until both children were 6;0. The study began when Jens was 0;11.15 and Anne was 0;8.1. Anne had a sister who was 2 years older. Both parents had completed a university education. Jens was an only child with a divorced unemployed mother. The father was a skilled worker and the mother had just started on a university education. Both children spent a good deal of time in nursery school.

The children were visited in their homes fortnightly. Each visit consisted of an interview, testing procedures, and a free play session. The interview focused on the parents’ observations of their child’s language behavior since the previous visit: whether any new words had emerged, whether the child had begun using old words in new ways, whether the child’s social and communicative skills had developed in any way, and any other noteworthy developments the parents may have observed. To this end, the parents were asked to keep a diary of the various aspects of their child’s development on a week-to-week basis. The contents of the diary formed the basis of much of the discussion in the interview session.

The testing procedures were taken from the Uzgiris-Hunt Infant Assessment Scales (Uzgiris & Hunt, 1975). The rationale for these scales is based on Piaget’s (1952) theory of the sensorimotor period. The object permanence and means–ends subscales were administered on each visit. The remaining subscales were administered less frequently.

In the final free play session, parent and child were encouraged to engage in a variety of social situations. An attempt was made to establish some regularity in the kind of situations observed across visits (feeding time, solving a problem together, story-telling). However, importance was attached to collecting naturalistic data and so coercion was avoided. The entirety of each visit, which lasted approximately 90 minutes, was recorded on videotape. Transmitting microphones were used to collect the vocal data from child and parent.
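The ages quoted above (0;11.15, 0;8.1, 6;0) use the years;months.days notation. For readers who want to place sessions on a numeric age axis, the following is a minimal sketch of a parser for that notation. The names `ChatAge`, `parse_chat_age`, and `age_in_months` are hypothetical, and the 30-day month used for the conversion is an assumption made only for illustration.

```python
import re
from typing import NamedTuple


class ChatAge(NamedTuple):
    """An age written as years;months.days (hypothetical helper type)."""
    years: int
    months: int
    days: int


# years ; months, optionally followed by . days (e.g. "0;11.15" or "6;0")
_AGE_RE = re.compile(r"(\d+);(\d{1,2})(?:\.(\d{1,2}))?")


def parse_chat_age(text: str) -> ChatAge:
    """Parse an age string such as '0;11.15' into its three components."""
    match = _AGE_RE.fullmatch(text.strip())
    if match is None:
        raise ValueError(f"not a years;months.days age: {text!r}")
    years, months, days = match.groups()
    # A missing day component is treated as zero days.
    return ChatAge(int(years), int(months), int(days or 0))


def age_in_months(age: ChatAge) -> float:
    """Convert to months. Assumption: every month counts as 30 days."""
    return age.years * 12 + age.months + age.days / 30.0


if __name__ == "__main__":
    for label, raw in [("Jens at start", "0;11.15"),
                       ("Anne at start", "0;8.1"),
                       ("End of study", "6;0")]:
        print(label, parse_chat_age(raw), round(age_in_months(parse_chat_age(raw)), 2))
```

Under the 30-day assumption the month values are approximate and are suitable for plotting or binning, not for recovering exact calendar dates.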

After the visit, a transcription was made of the videotape. A standard orthographic transcription was made of all the verbal behavior during the session, together with a transcription of any nonverbal activity that might aid in the interpretation of the verbal behavior. The speech of all participants was analyzed into utterances following Snow’s (1972) guidelines. On this view, utterances are not defined in terms of adult grammatical structures like the sentence but according to the pauses and intonational patterns in the dialogue. Utterances were then analyzed into morphemes. For children, this can be a problematic process. For example, “What is that” may be uttered by the child as a single undifferentiated formula. In such cases, utterances are coded as containing only a single morpheme. The morphemic breakdown of an utterance is decided on articulatory and fluency criteria (Peters, 1983). A distinction between idiosyncratic expressions, lexicalized morphemes, and formulaic expressions is made explicit in the coding of the transcription, so that a variety of different analyses can be performed on the same database. For example, it is an easy matter using the CLAN programs to observe the effect of including or excluding a child’s idiosyncratic expressions in an MLU count.
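The effect of including or excluding idiosyncratic expressions can be illustrated with a small sketch. This is not how CLAN computes MLU and it does not read the corpus files. It assumes, purely for illustration, that each utterance has already been reduced to a list of (form, kind) pairs, where kind is one of the hypothetical labels "lexical", "formulaic", or "idiosyncratic". A formula counts as one unit, in line with the single-morpheme coding described above. The sample data are invented.

```python
from typing import List, Tuple

Morpheme = Tuple[str, str]   # (form, kind); the kind labels are hypothetical
Utterance = List[Morpheme]


def mlu(utterances: List[Utterance], include_idiosyncratic: bool = True) -> float:
    """Mean length of utterance in coded units.

    Assumption: an utterance left with no units after exclusion is dropped
    from the denominator rather than counted as length zero.
    """
    lengths = []
    for utterance in utterances:
        units = [
            form
            for form, kind in utterance
            if include_idiosyncratic or kind != "idiosyncratic"
        ]
        if units:
            lengths.append(len(units))
    return sum(lengths) / len(lengths) if lengths else 0.0


# Invented sample data, not taken from the corpus.
sample: List[Utterance] = [
    [("what_is_that", "formulaic")],               # a formula counts as one unit
    [("bil", "lexical"), ("der", "lexical")],
    [("dada", "idiosyncratic"), ("bil", "lexical")],
    [("dada", "idiosyncratic")],
]

if __name__ == "__main__":
    print("MLU with idiosyncratic forms:   ", mlu(sample))
    print("MLU without idiosyncratic forms:", mlu(sample, include_idiosyncratic=False))
```

Dropping utterances that become empty is one possible choice. Counting them as zero-length would lower the second figure, so the policy should be stated whenever such counts are reported.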

Every file comes with a list of warnings regarding certain inherent limitations in the quality or potential use of the data. The list of warnings is as follows:

These data are not useful for the analysis of overlaps, because overlapping was not accurately transcribed.
Retracings and hesitation phenomena have not been accurately transcribed in these data.
Sections of the session that repeat previous episodes were not transcribed, i.e., repetitions of identical utterances in similar situations are excluded.
Productive units within an utterance are identified on the basis of articulation and fluency criteria.
The phonetic tier is used to describe the child’s pronunciation of a given sound. However, it does not provide a precise phonetic analysis.
Immediate imitations are excluded.
Note that a timing irregularity occurs in this session.
Note that blank lines indicate shorter gaps in the transcription.
Note that gaps in the timing indicate untranscribed material.
Modifications of verb and noun stems by regular inflections are marked on the main text line. However, when the stem itself is modified, this change is not marked on the main text line. Instead, the basic stem is used and the correct modified form is noted on the %cor tier.
Present tense inflections are marked by @n; plural inflections by @f; the definite plural by @fd; the infinitive by @i; definite inflections by @d; the past participle by @pp; the past tense by @pt; the comparative by @cp; the superlative by @sp3; indefinite adjective forms by @ki (neuter), @kf (common gender), @kif (neuter plural), and @kff (common gender plural); definite adjective forms, for both genders and both numbers, by @kd; the passive of verbs by @p; and the genitive of nouns by @g.
Irregular forms are marked on the main text line.
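The inflection codes listed in the warnings above can be collected into a lookup table. The sketch below assumes, for illustration only, that a coded word appears on the main line as a stem followed by a single code (for example a hypothetical token `bil@f`). The exact token layout, and whether codes can be combined, is not specified in this description, so the function name `split_inflection` and the token shape are assumptions.

```python
from typing import Optional, Tuple

# Codes and glosses as given in the list of warnings.
INFLECTION_CODES = {
    "n": "present tense",
    "f": "plural",
    "fd": "definite plural",
    "i": "infinitive",
    "d": "definite",
    "pp": "past participle",
    "pt": "past tense",
    "cp": "comparative",
    "sp3": "superlative",
    "ki": "adjective, indefinite, neuter",
    "kf": "adjective, indefinite, common gender",
    "kif": "adjective, indefinite, neuter plural",
    "kff": "adjective, indefinite, common gender plural",
    "kd": "adjective, definite",
    "p": "passive",
    "g": "genitive",
}


def split_inflection(token: str) -> Tuple[str, Optional[str]]:
    """Split an assumed 'stem@code' token into (stem, gloss).

    Returns (token, None) when no '@' is present, and an explicit
    'unknown code' gloss when the code is not in the table.
    """
    stem, separator, code = token.partition("@")
    if not separator:
        return token, None
    return stem, INFLECTION_CODES.get(code, f"unknown code @{code}")


if __name__ == "__main__":
    for tok in ["bil@f", "bil", "stor@cp", "x@zz"]:
        print(tok, "->", split_inflection(tok))
```

Unknown codes are reported instead of raising an error, so one unexpected marker does not interrupt a pass over a whole transcript.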