CHILDES English Van Houten Corpus

Lori Van Houten
Director of Analytic Support

Participants: 54
Type of Study: naturalistic
Location: USA
Media type: audio
DOI: doi:10.21415/T5Z014

Citation information

In accordance with TalkBank rules, any use of data from this corpus must be accompanied by at least one of the above references.

Project Description

These data were obtained from Lori Van Houten’s doctoral dissertation, which studied differences in mother–child interaction between adolescent and older mothers. The dissertation work was part of a larger study conducted by Cynthia Garcia-Coll of the Department of Education at Brown University.

The mothers were followed from the time of the birth of their children. Data were collected at 4 months, 8 months, 2 years, and 3 years. Only the 2-year and 3-year data have been computerized.

Two Year Data

The 2-year data are in the subdirectory “twos.” The children were studied at home in three different situational contexts: lunch, a teaching session in which the mother attempted to teach the child a task, and a free-play situation. The first is a 3-minute segment of interaction while the child eats lunch. The session was videotaped for a half hour, and mothers were instructed to try to ignore the camera and do whatever they would normally do during lunch. If the child finished lunch before the half hour was up, mothers were instructed to do whatever they would normally do after eating lunch. The 3 minutes of tape following the first minute of interaction were transcribed for this study.

The second session was a teaching session in which the mother was instructed to teach the child three tasks from the Bayley Scales of Infant Development that were considered too difficult for the child’s age. Mothers did not know the tasks were too difficult for the child. The tasks were: placing a block in specific locations (on, in, under, and so forth) around a cup and a small chair, stringing beads, and sorting black and white buttons. The mothers were given one task at a time. Each task was explained, and the mother was given 3 minutes to teach it. Only the first 2 minutes of the bead-stringing task and 1 minute of the button-sorting task were transcribed.

The third session consisted of a half hour of play with a box of experimenter-provided toys. Among the toys were: cloth books, a tea set, a truck with different-shaped blocks that fit in holes in the side of the truck, a miniature playground set with small characters, giant Legos, Ernie and Cookie Monster puppets, and a chalk board/magnetic board with chalk and magnetic pieces. Mother and child played in an area in which they usually interacted. Mothers were also requested to play only with the experimenter-provided toys. The 3 minutes of interaction following the first minute of tape were transcribed.

Three Year Data

The files for these data are in the subdirectory “threes.” The children in this part of the study were between 3;2 and 3;7. In this segment of the study, 27 children were recorded during a free-play situation, and 25 children were recorded during a teaching activity in which the child attempted to teach the mother a simple task.

Thirteen of the children were children of adolescent mothers. All of the children, with the exception of Goose, have data at the 2-year level as well. Wilson, Doll, Dean, and Valley have 2-year transcripts but no 3-year data. The participant Park has two free-play tapes. The child was generally uncooperative when using the toys provided by the experimenter. A second file, entitled “Bestpark,” is probably more representative of the child’s true linguistic abilities, as the child plays with his own toys. The reader will have to decide whether to opt for a more controlled sample on the same topic as the other files or for a more representative linguistic sample.

The children were seen in their homes by two experimenters. The mother was taken to another room where she was given a standard IQ test by one of the examiners. The child remained with the other examiner and the McCarthy Scales of Children’s Abilities and the Rhode Island Test of Language Structure were administered. The children’s scores for each of these tests (McCarthy verbal and cognitive scores, RITLS number of errors out of 100) are given in the headers for each file. With the RITLS in particular, it sometimes took more than one visit to complete the test.

Free play

Following the tests, the examiner and the child engaged in at least 5 minutes of free play with an experimenter-provided toy. The toy was a miniature park set including a slide, merry-go-round, park bench, some small figures, and a mother figure with a baby in a stroller. These interactions were audiotaped only. The goal of the interaction was to elicit a reasonable language sample in a fairly controlled setting. The examiner tried to use the same line of conversation with each child. Some of the children, however, responded better to some forms of conversation than others. For example, some children preferred to act out a story with the characters and others preferred to merely talk about the characters. The free-play sessions were transcribed and coded using the same procedures used with the 2-year data.


The second transcript for each child, and the last activity to take place during the home visit, consists of audiotapes of the child trying to teach the mother a given activity. The mother joined the experimenter and child. The mother was instructed to “close her eyes and cover her ears” while the examiner taught the child a simple task. The examiner taught the child the task (manipulating the small characters from the park set and stringing beads) in such a way as to ensure that the child could perform the task, and to offer a verbal model of how to teach the task. The child was then told to teach the mother the task. Throughout the teaching the investigator encouraged the child to teach the mother the task and then have the mother perform the task. The investigator again tried to use similar procedures and utterances with each child. We were interested in looking at whether the child chose to demonstrate the task, to teach it verbally, or used a combination of the two techniques. The final goal was to compare the child’s teaching technique with what the mothers had done at 2 years in a similar situation. These teaching segments were not timed and each transcript may be of a different length. A separate coding system was devised for this segment.

Coding System for Twos

This coding system is appropriate for use with children from approximately Stage 1 to about 4 years. It is based on the premise that there are elements of interaction beyond the sentence level that may affect the course and rate of language acquisition. There are three main components to the coding system: Structural Complexity (MLU and Number of Main Verbs), Discourse Role (Initiate, Respond, Continue Turn, and so forth) and Pragmatic Role (Request Information, Report, Clarification, Control/Restrict, and so forth). These are coded for both mother and child (although some of the pragmatic variables pertain only to the mother or child) in an attempt to characterize the reciprocal nature of the interaction. The codes are basically the same as those used in the INCA coding system and in the New England corpus.
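As a rough illustration of the Structural Complexity component, the sketch below computes a mean length of utterance per speaker. It is a simplification under stated assumptions: it counts whitespace-separated words rather than morphemes, the speaker labels MOT and CHI are assumed, and the sample utterances are invented for the example rather than taken from the corpus.

```python
from collections import defaultdict


def mean_length_of_utterance(utterances):
    """Return {speaker: mean number of word tokens per utterance}.

    `utterances` is an iterable of (speaker, text) pairs. Words are
    split on whitespace, which is a simplification: MLU is usually
    counted in morphemes, not words.
    """
    totals = defaultdict(int)
    counts = defaultdict(int)
    for speaker, text in utterances:
        tokens = text.split()
        if not tokens:
            continue  # skip empty utterances
        totals[speaker] += len(tokens)
        counts[speaker] += 1
    return {speaker: totals[speaker] / counts[speaker] for speaker in counts}


# Invented sample data, not drawn from the corpus.
sample = [
    ("MOT", "put the block in the cup"),
    ("CHI", "block in"),
    ("MOT", "good"),
    ("CHI", "more block"),
]
mlu = mean_length_of_utterance(sample)
```

Computing the measure separately for mother and child mirrors the way the coding system characterizes both sides of the interaction.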

Coding for Threes

The coding system used for the teaching situation at 3 years is different from that used with free play. It was designed specifically for use with these transcripts with several questions in mind. First, how well does the child adhere to the teaching procedure in terms of the type of utterances used and the structure of the teaching situation? Second, what role do the adults play in this interaction? Finally, how do the utterances in teaching differ from those used in free play? Based on these questions, a coding system was developed that included rough measures of grammatical complexity, variables representing the various segments of the teaching situation, and variables coding the pragmatic role of both the adult’s and the child’s utterances. The following is a list of the variable names and the three-letter codes used for each, followed by a description of each variable.

The interaction is divided in terms of who is teaching whom and, in general, what the purpose of the interaction is. To this end, the following segments are used:

  1. Teach Mother (Tmo): By far the largest portion of the interaction, this segment includes all utterances by the examiner and mother exhorting the child to teach the mother, all the child’s utterances surrounding the teaching process, and all utterances evaluating the mother’s performance of the task.
  2. Teach Child (Tch): Some children forget what they are supposed to teach. The examiner interrupts the interaction to teach the child the task again.
  3. Closing (Clo): Includes any closing statements, usually evaluations of the child’s teaching techniques, following the teaching of the task.
These segment markers are the third item entered on the coding line following the two grammatical complexity measures. The final measure considers the pragmatic role of the individual utterance.
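The four-item layout of a coding line described above can be sketched as a small parser. Only the three segment codes and the order of the items come from this description. The colon separator, the field names, and the placeholder pragmatic code "XXX" are assumptions made for illustration, so check them against the actual transcripts before relying on this.

```python
from typing import NamedTuple

# Segment codes as listed in the description above.
SEGMENTS = {"Tmo": "Teach Mother", "Tch": "Teach Child", "Clo": "Closing"}


class TeachingCode(NamedTuple):
    complexity_1: str    # first grammatical complexity measure
    complexity_2: str    # second grammatical complexity measure
    segment: str         # third item: Tmo, Tch, or Clo
    pragmatic_role: str  # final item: pragmatic role of the utterance


def parse_coding_line(line, sep=":"):
    """Split a coding line into its four items.

    The separator is hypothetical; the corpus files may delimit
    the items differently.
    """
    fields = [field.strip() for field in line.split(sep)]
    if len(fields) != 4:
        raise ValueError(f"expected 4 items, got {len(fields)}: {line!r}")
    code = TeachingCode(*fields)
    if code.segment not in SEGMENTS:
        raise ValueError(f"unknown segment code: {code.segment!r}")
    return code


# "XXX" is a placeholder, not a real pragmatic code from the corpus.
example = parse_coding_line("3:1:Tmo:XXX")
```

Validating the segment code at parse time makes it easy to spot lines that do not follow the expected layout.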


In addition to the standard CHAT headers such as Participants, Sex, and Situation, there are some project-specific headers.
  1. Mother’s Age Group: The mother’s status as an adolescent or older mother is provided.
  2. Mother’s SES: Socioeconomic status based on the Hollingshead Four Factor Index is given for each mother. The information necessary for calculating SES was collected when the child was 8 months old.
  3. Mother’s Education: Maternal educational level. 1 = completed junior high, 2 = completed high school, 3 = some post-secondary education. Again, this is based on the mother’s educational status at 8 months. Few of the adolescents had continued with school after the birth of their child, and none of the older mothers were students. Therefore, these figures can be considered reasonably accurate.
  4. McCarthy-Cognitive: (3-year data only) The child’s IQ based on his or her performance on the McCarthy Scales of Children’s Abilities.
  5. McCarthy-Verbal: (3-year data only) The child’s scaled score on the verbal portion of the McCarthy Scales of Children’s Abilities.
  6. RITLS: (3 year data only) The total number of errors out of 100 on the Rhode Island Test of Language Structure, a standardized test of comprehension of various simple and complex syntactic structures. Utilizing a picture identification task, the test requires children to choose from an array of three the one picture that most closely exemplifies the examiner’s stimulus sentence.
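The numeric header values above can be decoded with a small lookup, sketched here. The education labels and the 100-item RITLS total come from the descriptions above. The function names are hypothetical, and the sketch assumes the values have already been read from the file headers as integers.

```python
# Mother's Education codes as described in the header list above.
EDUCATION_LEVELS = {
    1: "completed junior high",
    2: "completed high school",
    3: "some post-secondary education",
}


def describe_education(code):
    """Return the label for a Mother's Education code (1, 2, or 3)."""
    if code not in EDUCATION_LEVELS:
        raise ValueError(f"unknown education code: {code!r}")
    return EDUCATION_LEVELS[code]


def ritls_proportion_correct(errors, total_items=100):
    """Convert an RITLS error count (out of 100) to a proportion correct."""
    if not 0 <= errors <= total_items:
        raise ValueError(f"errors must be between 0 and {total_items}")
    return (total_items - errors) / total_items
```

For example, `describe_education(2)` gives "completed high school", and an RITLS score of 25 errors corresponds to a proportion correct of 0.75.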


The repeated measures ANOVAs at both age levels demonstrate main effects for maternal age but no significant interactions between maternal age and situation. At 2 years, teenage mothers confirmed or acknowledged children’s utterances significantly less and had fewer teaching utterances. These results, combined with other trends in the data, suggest that children of adolescent mothers did not differ significantly from children of older mothers in their general linguistic competencies. Thus, despite differences in the nature of their input, adolescent and older mothers provided at least the minimum amount of the right kind of input to ensure that acquisition proceeded at a “normal” rate. A review of mothers’ instructional strategies revealed that teenage mothers were less likely to use the decontextualized, syntactically complex language of the classroom. Lack of familiarity with this form of discourse may have contributed to the children’s poor performance. Thus, adolescent mothers’ communicative strategies with their language-learning children could be associated with the children’s lack of success in school and school-related tasks.