Clark Corpus

Eve Clark
Department of Linguistics
Stanford University

Participants: 1
Type of Study: naturalistic
Location: USA
Media type: not available
DOI: doi:10.21415/T5HP44

Citation information

In accordance with TalkBank rules, any use of data from this corpus must be accompanied by at least one of the above references.

Project Description

This subdirectory contains files from a short-term longitudinal study of a two-year-old child, Shem, conducted by Eve Clark during 1976. The transcripts pay close attention to repetitions, hesitations, and retracings. Shem was seen on a nearly weekly basis by an observer (Cindy) who became a friend of the family over the course of the year’s recording. The recordings were made at Shem’s home, except on a few occasions when the parents made the recording themselves because either Cindy or the family was away on vacation and Shem would otherwise have missed more than one session.

The child’s name, his home address, his sister’s name, the names of his parents, and the observer’s name have been changed to preserve their confidentiality. The names of nearby places and institutions remain unchanged.

Shem was from a middle- to upper-middle-class professional family in the Palo Alto area. He was an only child until just after the recordings began, when his first sister, Ana, was born. He attended a local day care center (Little Kids’ Place) in the mornings, and occasionally went there for a short time in the afternoon.

Shem’s age and the date (month and day) are noted at the top of each transcript. His date of birth was February 5, 1974. For convenience, the ages for each session are summarized here:
Most sessions lasted an hour. A few sessions that lasted longer than usual are split into two parts.

Shem’s pronunciation at the beginning of the recording period was often unclear, and he frequently made more than one attempt to get himself understood. In the transcripts, all repairs are noted, but Shem’s pronunciation has been largely normalized for representation in English orthography, except where his meaning remained unclear or his pronunciation was critical to the overall form of an interchange. Typical features were voicing of intervocalic voiceless stops (whether or not at word boundaries); omission of voiced final stops; voicing of voiceless initial stops; substitutions among fricatives; great variation in vowel quality; extensive reliance on schwa or syllabic /n/ for function words (the syllabic /n/ was typically, but not always, a locative preposition); simplification of clusters with loss of post-consonantal /l/ and /r/; initial /l/ often /y/; initial /s/ often /d ~ t/; final /s/ often /t/; final voiced stops often /n/ (e.g., /birn/ for “bird,” /wen/ for “red,” /bun/ or /bung/ for “bug”); voiceless final stops often replaced by glottal stops (especially /t/, and often /k/); and occasional homorganic voiceless stops as releases to final nasals (e.g., /lawnt/ for “lawn”).

Shem regularly produced the definite article as “duh.” To improve the readability and analyzability of the transcripts, all cases of “duh” were changed to conventional “the.”

Intonation is indicated by punctuation: a period marks a terminal fall, a question mark marks an interrogative rise, an exclamation mark indicates emphatic tone, and a comma indicates a continuing or listing contour (slight pause, with sustained level tone or a slight falling but nonterminal tone). The bulk of the transcription is in English orthography for ease of reading, but a few persistent forms are left, with glosses, more or less in the form in which Shem produced them.
On a few tapes, background conversations (e.g., on the telephone) are omitted from the final transcription.
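
Because each transcript records the session date and Shem's date of birth is given above as February 5, 1974, the age at any session can be recomputed from those two values. The following is a minimal Python sketch of that arithmetic, not part of the corpus itself. It assumes the years;months.days age notation commonly seen in CHILDES transcripts, and the function names (add_months, age_at) and the session dates in the usage lines are hypothetical, chosen only for illustration. The session year must be supplied by the user, since the transcripts note only month and day.

```python
import calendar
from datetime import date


def add_months(d: date, n: int) -> date:
    """Return d shifted forward by n months, clamping the day to the month's length."""
    index = d.year * 12 + (d.month - 1) + n
    year, month0 = divmod(index, 12)
    month = month0 + 1
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, min(d.day, last_day))


def age_at(birth: date, session: date) -> str:
    """Format the age at `session` as years;months.days (assumed notation)."""
    if session < birth:
        raise ValueError("session date precedes date of birth")
    # Count whole months elapsed since birth.
    months_total = (session.year - birth.year) * 12 + (session.month - birth.month)
    if session.day < birth.day:
        months_total -= 1
    # Remaining days are counted from the most recent monthly anniversary.
    anchor = add_months(birth, months_total)
    days = (session - anchor).days
    years, months = divmod(months_total, 12)
    return f"{years};{months:02d}.{days:02d}"


# Date of birth as stated in the project description.
SHEM_BIRTH = date(1974, 2, 5)

# Hypothetical session dates, for illustration only.
print(age_at(SHEM_BIRTH, date(1976, 4, 20)))  # 2;02.15
print(age_at(SHEM_BIRTH, date(1976, 3, 1)))   # 2;00.25
```

Counting days from the most recent monthly anniversary, rather than dividing a raw day count by an average month length, keeps the result consistent with calendar months of different lengths.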


The data collection was supported by an NSF grant (BNS 75-17126) to E. V. Clark.