Spanish MiamiBiling Corpus

Barbara Zuker Pearson
Department of Linguistics
University of Massachusetts


Participants: 43 children age 7, 46 children age 9
Type of Study: narrative
Location: USA
Media type: no longer available
DOI: doi:10.21415/T5MK6S


Citation information

Pearson, Barbara Z. (2002). Narrative competence among monolingual and bilingual school children in Miami. In D. K. Oller and R. E. Eilers (Eds.), Language and literacy in bilingual children (pp. 135-174). Clevedon, UK: Multilingual Matters.

In accordance with TalkBank rules, any use of data from this corpus must be accompanied by the above reference.

Project Description

This directory contains frog story narratives collected in Miami, Florida by Barbara Zuker Pearson with the help of Ana Maria Ferrer, Patricia Ortega, Mayrela Palau, Samantha Pearson, Esperanza Rodriguez, and Yael Wiesner.

There are 447 files: 269 in English and 178 in Spanish. They fill the 20 cells of a nested factorial design whose factors are explained below in the description of the ID numbers. There are 16 cells of 10 Spanish-English bilinguals each, with two stories per child (for the most part), and 4 cells of 20 monolinguals each, with only one story per child. All of the children were born in the United States. They were enrolled in Dade County Public Schools in Miami in one of four settings: 1) English immersion for Hispanic students, 2) so-called “two-way” bilingual programs for Hispanic students, with 50% Spanish and 50% English instruction, 3) regular monolingual English classrooms for non-Hispanic students, and 4) monolingual English children in schools with primarily Hispanic populations. (Groups 1, 3, and 4 have essentially the same instructional program, but the relation between the student’s own language and the language of the peer population is different.) The stories are on audiotape, and 15% of the tapes have been independently transcribed twice for reliability; another 60% have had “second listenings” (where the second transcriber worked from the first listener’s transcription). There are six directories of files:

BLENG2: Bilingual 2nd graders speaking English
BLENG5: Bilingual 5th graders speaking English
BLSPAN2: Bilingual 2nd graders speaking Spanish
BLSPAN5: Bilingual 5th graders speaking Spanish
MLENG2: Monolingual English 2nd graders
MLENG5: Monolingual English 5th graders

Most, but not all, of the bilingual children have both an English and a Spanish story, which can be located by matching the ID numbers and file names in the English and Spanish directories. Users who wish to work with the matched files are advised to rename them to indicate the language of the story, but should be aware that the @ID line matches the current file name and does not distinguish language. The header indicates whether the Spanish or the English story was told first (the two were told on different days).
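The matching described above can be sketched in a few lines of Python. This is only an illustration: the function names and the `_eng` / `_spa` suffix scheme are hypothetical and are not part of the corpus, and the sketch assumes that a child's English and Spanish files share exactly the same file name.

```python
from pathlib import PurePath


def pair_stories(english_files, spanish_files):
    """Return the sorted ID stems that occur in both lists of file names."""
    english_ids = {PurePath(name).stem for name in english_files}
    spanish_ids = {PurePath(name).stem for name in spanish_files}
    return sorted(english_ids & spanish_ids)


def tagged_name(file_name, language):
    """Build a language-tagged file name (hypothetical naming scheme)."""
    path = PurePath(file_name)
    return f"{path.stem}_{language}{path.suffix}"
```

For example, the names from a BLENG5 listing and a BLSPAN5 listing could be passed to `pair_stories`, and `tagged_name` applied to working copies of the matched files. Because the @ID line inside each transcript still reflects the original file name, renaming copies rather than the originals is probably the safer choice.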

ID Numbers: Files are arranged in English and Spanish directories by ID number, which gives information about group status: digit 1 is school type (above), digit 2 is SES (1=mid, 2=low), digit 3 is language of the home (1 = mostly Spanish, 2 = English and Spanish equally, 3 = only English), and digit 4 is grade (2 = 2nd, 3 = 5th), followed by a 4-digit unique identifier. (For example, 21131489.cha are the stories from participant #1489: she is a bilingual in a two-way school, mid-SES, with mostly Spanish in the home, in fifth grade at the time of the story.) Within the header, gender is indicated as M or F; the approximate age is in parentheses alongside the grade, 7 or 8 years old for second grade, 10 or 11 for fifth grade. The project records also have birthdays for each child and information about the country of the parents’ origin.
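The ID scheme above can be decoded mechanically. The following Python sketch is illustrative only: the function name, the dictionary keys, and the short labels are assumptions, and the mapping of digit 1 assumes that the values 1 to 4 follow the order of the four numbered school settings in the project description (the worked example confirms only that 2 means a two-way school).

```python
import re

# Label tables taken from the ID description; the wording of the labels is ours.
SCHOOL_TYPE = {
    "1": "English immersion for Hispanic students",
    "2": "two-way bilingual program",
    "3": "regular monolingual English classroom",
    "4": "monolingual English child in a primarily Hispanic school",
}
SES = {"1": "mid", "2": "low"}
HOME_LANGUAGE = {
    "1": "mostly Spanish",
    "2": "English and Spanish equally",
    "3": "only English",
}
GRADE = {"2": "2nd", "3": "5th"}


def decode_id(file_name):
    """Split an 8-digit ID (with or without .cha) into its coded fields."""
    match = re.fullmatch(r"(\d)(\d)(\d)(\d)(\d{4})(?:\.cha)?", file_name)
    if match is None:
        raise ValueError(f"not an 8-digit corpus ID: {file_name!r}")
    school, ses, home, grade, participant = match.groups()
    return {
        "school_type": SCHOOL_TYPE.get(school, "unknown"),
        "ses": SES.get(ses, "unknown"),
        "home_language": HOME_LANGUAGE.get(home, "unknown"),
        "grade": GRADE.get(grade, "unknown"),
        "participant": participant,
    }
```

Applied to the example above, `decode_id("21131489.cha")` yields a two-way school, mid SES, mostly Spanish in the home, 5th grade, and participant 1489. Digits outside the documented ranges are reported as "unknown" rather than guessed.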

The transcribing conventions were derived loosely from the guidelines found in Berman and Slobin (1994) and then converted to CHAT, with extensions as noted in the 00depadd file. Text marked by %exc indicates nonnarrative comments, and %pro indicates a pronunciation that is not predictable from the standard orthography. Each verbed clause is marked by a [c]. Verbed clauses need not have a finite verb, and in some cases the verb will be absent, as in ellipsis. Modals and aspectual serial verbs are counted as a single verb, as long as the subject does not change. Morphological errors or omissions are marked with %err coding, although users should be aware that this coding has not been found to be reliable and was used only as a guide by the original researchers.
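Because each verbed clause carries a [c] marker, a rough clause count per story can be obtained by counting those markers. The Python sketch below is a minimal illustration, not the researchers' own procedure. It assumes that the markers appear on CHAT main tier lines (those beginning with `*`), that continuation lines begin with a tab, and that lines beginning with `%` or `@` are dependent tiers and headers to be skipped. The function name and the sample transcript are invented for the example.

```python
def count_clauses(chat_text):
    """Count [c] markers on main tier lines of a CHAT transcript."""
    total = 0
    in_main_tier = False
    for line in chat_text.splitlines():
        if line.startswith("*"):
            in_main_tier = True       # speaker line, e.g. *CHI:
        elif line.startswith(("%", "@")):
            in_main_tier = False      # dependent tier or header line
        # Tab-indented continuation lines keep the state of the line above.
        if in_main_tier:
            total += line.count("[c]")
    return total


# Invented sample text, not taken from the corpus.
SAMPLE = (
    "@Begin\n"
    "*CHI:\tthe boy was looking [c] for the frog [c] .\n"
    "%exc:\tan aside that should not be counted [c]\n"
    "*CHI:\tand the dog fell [c] .\n"
    "@End\n"
)
```

On the invented sample, `count_clauses(SAMPLE)` counts three clauses and ignores the marker on the %exc line. The tier assumptions should be checked against the 00depadd file before the counts are relied on.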

Please see this link for a general description of the Frog Story methods.


This is the second set of frog stories collected in conjunction with the Bilingualism Study Group Literacy Grant, supported by NIH Grant #IR01 HD 30762-01 to D. Kimbrough Oller and Rebecca Eilers, with Barbara Pearson and Vivian Umbel.