CHILDES Eng-AAE DELV Corpus


Barbara Zurer Pearson
Linguistics
University of Massachusetts

Peter A. de Villiers
Psychology
Smith College

Jill G. de Villiers
Psychology and Philosophy
Smith College

Participants: 78
Type of Study: cross-sectional
Location: United States
Media type: no longer available
DOI: doi:10.21415/KKA1-GM10

Browsable transcripts

Download transcripts

Citation information

Pearson, B. Z., Jackson, J. E., & Wu, H. (2014). Seeking a valid gold standard for an innovative, dialect-neutral language test. Journal of Speech, Language, and Hearing Research, 57(2), 495-508.

Seymour, H. N., & Pearson, B. Z. (2004, Feb). Evaluating language variation: Distinguishing development and dialect from disorder. Special Issue. Seminars in Speech and Language, 25(1).

de Villiers, P. A., & de Villiers, J. G. (2010). Assessment of language acquisition. Wiley Interdisciplinary Reviews: Cognitive Science, 1, 230–244. doi:10.1002/wcs.30

Seymour, H. N., Roeper, T. W., de Villiers, J. G., de Villiers, P. A., & Pearson, B. Z. Diagnostic Evaluation of Language Variation (DELV) Screening Test (DELV-ST, 2003) and Norm-Referenced (DELV-NR). Originally published by Harcourt Assessments, San Antonio TX. Since 2018, Ventris Learning, LLC, Sun Prairie WI.

In accordance with TalkBank rules, any use of data from this corpus must be accompanied by at least one of the above references.

Project Description

The corpus consists of 78 language samples from African American children from AAE-speaking communities: 38 five-year-olds and 40 six-year-olds. Twenty-one children were identified by the DELV-NR as at risk for language impairment. Demographic information (date of birth, date of test, gender, age, parent education level, and region of the U.S.) is in the header. This readme file gives the demographic information in spreadsheet format, including the legend for fields, along with each child's DELV Norm-Referenced score and Language Variation Status, given both as a label and as a ratio for use as a continuous variable. There is also a guide for administration of the protocol.

Information and documents from the DELV project that gave rise to the corpus of language samples can be found at the official university repository: http://scholarworks.umass.edu/aae/. The samples followed a protocol provided by the study authors to elicit the following elements: spontaneous conversation with the child, responses to several picture prompts targeting 3rd person singular verb forms, short personal narratives, a story based on a 4-picture sequence, some exposition, and responses to activities suggested in pragmatics worksheets.

The samples were transcribed in Excel and eventually transferred to CHAT. Descriptive statistics were generated (overall length in words, MLU, MLU in words, number of different words in 50 utterances, and an IPSyn score), plus a pragmatics scoring by Peter de Villiers. They are referenced in the NIH Report available (soon) at the UMass Scholarworks site and in the paper above by Jill and Peter de Villiers. A subsequent publication by Pearson, Jackson, and Wu (2014) reanalyzed the diagnostic accuracy of the DELV-NR, especially its sensitivity and specificity, using an improved (but not perfect) "gold standard": language sample analysis supplemented with discrepancy resolution techniques. The data and conclusions in that paper correct the diagnostic accuracy data published in the original DELV-NR manual, now found in an addendum to Table 7.11 (p. 141).
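
For readers unfamiliar with the measures named above, the following Python sketch illustrates how MLU in words, number of different words in the first 50 utterances, and sensitivity and specificity are conventionally defined. It is not the procedure used by the DELV project. The function names, the whitespace tokenization, and the example utterances and counts are all assumptions made up for illustration, and real analyses of CHAT transcripts apply exclusion and segmentation rules that are omitted here.

```python
# Illustrative sketch only: hypothetical helper names, simplified tokenization.

def words(utterance):
    """Split an already-cleaned utterance into lowercase, whitespace-delimited tokens."""
    return utterance.lower().split()


def mlu_in_words(utterances):
    """Mean length of utterance in words: total words / number of utterances."""
    if not utterances:
        raise ValueError("need at least one utterance")
    return sum(len(words(u)) for u in utterances) / len(utterances)


def number_of_different_words(utterances, window=50):
    """Count distinct word forms in the first `window` utterances."""
    return len({w for u in utterances[:window] for w in words(u)})


def sensitivity_specificity(tp, fn, tn, fp):
    """Return (sensitivity, specificity) from the counts of a 2x2 table.

    sensitivity = TP / (TP + FN): share of truly impaired children the test flags.
    specificity = TN / (TN + FP): share of typically developing children it clears.
    """
    return tp / (tp + fn), tn / (tn + fp)


if __name__ == "__main__":
    # Invented utterances, not taken from the corpus.
    sample = ["the dog runs", "he run fast", "look"]
    print("MLU in words:", mlu_in_words(sample))
    print("Different words:", number_of_different_words(sample))
    # Invented counts, not results from any DELV study.
    print("Sensitivity, specificity:", sensitivity_specificity(tp=8, fn=2, tn=45, fp=5))
```

The 50-utterance window is a parameter so that samples of different lengths are compared on the same number of utterances; for a sample shorter than the window, the count simply covers all available utterances.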

DELV is described in Pearson et al., which is downloadable here.

Acknowledgements

NIH Contract: N01 DC8-2104 (Harry Seymour, PI). Andrew Yankes reformatted this corpus to accord with current versions of CHAT.