Hayashi Bilingual Corpus

Mariko Hayashi
Institute of Psychology
University of Århus


Participants: 1
Type of Study: naturalistic
Location: Denmark
Media type: no longer available
DOI: doi:10.21415/T5SG7P

Citation information

Publications using these data should cite:

In accordance with TalkBank rules, any use of data from this corpus must be accompanied by at least one of the above references.

Project Description

This corpus includes longitudinal data from a child, aged 12 to 29 months, growing up in a Japanese-Danish bilingual family. The data were collected by Mariko Hayashi, University of Aarhus, Denmark, in the context of her doctoral study investigating language development in bilingual children. Pseudonyms have been used to preserve informant anonymity. The child is called “Anders.” Anders was a first-born boy and had no siblings during the period studied. The father had a university education and the mother a college education; the family thus belonged to the educated middle class.

Anders’ mother was Japanese and his father was Danish. The family resided in Denmark, where the community language is Danish. The parents each spoke their native language to the child from the beginning, with occasional code-switching, especially by the father. The parents spoke mainly English, and occasionally Japanese and Danish, to each other. Anders and his mother spent a summer vacation in Japan when the child was 21 to 23 months old. During this period, Anders was exposed exclusively to Japanese.

Anders was taken care of by his mother during the daytime. He had a couple of Danish-speaking playmates whom he saw occasionally. In the evenings and on weekends the father took care of the child as well. The father’s parents, who spoke Danish, lived in the neighborhood and visited the family regularly. People who visited the mother spoke either Japanese or English, as the mother did not understand much Danish. The father and the mother, as mentioned above, spoke mainly English to each other. Otherwise, the child was not exposed to English.

The language Anders was exposed to most was Japanese, as it was the mother who took care of him during the daytime. He also spent a three-month summer vacation in Japan, where he was exposed exclusively to Japanese. In his productive vocabulary, Japanese began to be dominant at 20 months. The dominance of Japanese became especially clear during and after his visit to Japan. Although Anders did not show any clear sign of comprehending English, he did pick up a few English expressions such as “see you” and “two.”

Monthly videotapings of the child, each about an hour long, were made in the age range of 11 to 38 months. All recordings were made in the child’s own home by Hayashi. With a few exceptions, both parents were present at each session. Up to a certain point, each visit also included testing on the Uzgiris-Hunt Infant Assessment Scales (1978). For a certain period, the parents kept a record of lexical items, which was used as a supplement to the videotapings. The mother also made audio recordings during their stay in Japan.

Thirty minutes of each session were transcribed in standard orthography by Hayashi, who is a native speaker of Japanese as well as a fluent speaker of Danish. All transcripts were checked by a native speaker of Danish. Three or four different situations, typically dinner, free play, and book reading, were selected for transcription. Furthermore, care was taken so that the mother and the father were more or less equally represented in the portion of the recording to be transcribed. Utterances are identified according to prosodic criteria such as intonation and pauses, whereas utterances themselves are divided into units based on clarity of articulation and fluency. Limited attention is paid to overlaps, retracings, and hesitations. Deviant phonological forms are described in the phonetic tier; however, this tier does not provide a precise phonetic analysis. Speech errors are not coded.

The corpus contains the following 17 files:
File    Date of recording    Age of Child