BiSLI Bilingual Corpus


Elena Tribushinina
Utrecht Institute of Linguistics
Utrecht University

website

Participants: ~1000
Type of Study: narrative
Location: Netherlands
Media type: audio not open
DOI: 10.21415/T5NG62

Browsable transcripts

Download transcripts

Citation information

Publications using these data should cite:

In accordance with TalkBank rules, any use of data from this corpus must be accompanied by at least one of the above references.

Project Description

This corpus contains 1058 transcriptions of narratives collected within the framework of the European project “Discourse Coherence in Bilingualism and SLI”, coordinated by Elena Tribushinina (Utrecht University, The Netherlands), Natalia Gagarina (ZAS Berlin, Germany) and Ekaterina Abrosova (Herzen State Pedagogical University of Russia, St. Petersburg, Russia). The project aimed to disentangle the discourse profiles of simultaneous bilinguals and their peers with specific language impairment (SLI). It was supported by a Marie Curie International Research Staff Exchange Scheme Fellowship within the 7th European Community Framework Programme (grant number 269173). Data collection by ZAS was also partly supported by the German Federal Ministry for Education and Research (BMBF) and the Deutsche Forschungsgemeinschaft (DFG). This database is related to an earlier corpus of narratives collected from Russian-German bilinguals (see the Gagarina corpus in Bilingual Corpora).

The agencies funding this research do not permit access to the audio recordings for these transcripts.

Dutch monolinguals were recruited from primary schools and daycares in the Netherlands. They all had typical language development (Dutch children with SLI were not included in the study because the bilingual group had age-appropriate language skills in Dutch). Russian monolinguals with typical language development were recruited from kindergartens and primary schools in St. Petersburg, Kemerovo and Sochi (Russia). Russian participants with SLI were recruited through special schools and daycares for language disorders located in the Kemerovo region (Russia). These children were monolingual speakers of Russian and had been independently diagnosed with SLI (in Russian: obščee nedorazvitie reči II-III urovnja) by a multidisciplinary committee. In Russia, SLI is officially diagnosed at age 4; therefore, no language-impaired participants below that age are included in the corpus.

The bilingual participants were recruited from Russian Saturday schools in Amsterdam, Amersfoort, Hilversum, Utrecht and The Hague (The Netherlands). These children were dominant in Dutch; they were born in the Netherlands and raised bilingually from birth (in most cases by a Russian mother and a Dutch father).

Materials and procedure

Two picture stories were used to elicit children’s narratives: the Fox Story (Gülzow & Gagarina, 2007) and the Cat Story (Hickmann, 2003). Each story consisted of six black-and-white drawings, 12 × 12 cm in size for the Fox Story and 10 × 13 cm for the Cat Story. The narratives were elicited by native speakers of each language, following the procedure described in Gülzow & Gagarina (2007); see also the description of the Gagarina corpus (Bilingual Corpora).

Most (but not all) bilingual participants told both stories, one in each language, with a minimum interval of two weeks between the two sessions; either story (Cat or Fox) could be told in either language. The narratives elicited from the same child can be identified by matching the unique ID numbers in the two bilingual directories. The Dutch monolinguals were randomly assigned to one of the narratives (Cat or Fox). The Russian participants with and without SLI told both stories in one session; the order was counterbalanced across participants. Nineteen 8-year-olds with SLI were tested again 16 months after the first session (these children can be recognised by their unique ID numbers).

File codes

Files are arranged in five directories: Bilinguals NL (Dutch narratives of bilingual participants), Bilinguals RU (Russian narratives of bilingual participants), Dutch monolinguals with TLD (typical language development), Russian monolinguals with TLD, and Russian monolinguals with SLI. File names contain the following information:

For example, the file mr_td_3_372_cat contains a Cat Story elicited from a monolingual Russian child with typical language development, 3 years old (unique participant number: 372).
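As an illustration only, the example file name mr_td_3_372_cat can be split into its underscore-separated parts with a short Python sketch. It assumes that every file name has five fields in the same order as that example; the field labels (group, status, age, participant_id, story) and the function names are hypothetical and are not part of the corpus or of any TalkBank tool. The second function shows one possible way to match narratives from the same bilingual child across the two bilingual directories by participant number.

```python
from pathlib import PurePath
from typing import Dict, Iterable, NamedTuple, Tuple


class NarrativeFile(NamedTuple):
    """Fields of a file name; the labels are hypothetical, chosen for this sketch."""
    group: str           # "mr" in the documented example (monolingual Russian)
    status: str          # "td" in the documented example (typical development)
    age: str             # age in years, kept as text ("3" in the example)
    participant_id: str  # unique participant number ("372" in the example)
    story: str           # "cat" in the documented example


def parse_name(name: str) -> NarrativeFile:
    """Split a file name into five underscore-separated fields.

    Assumption: all files follow the five-field pattern of mr_td_3_372_cat.
    Any file extension is ignored.
    """
    stem = PurePath(name).stem
    parts = stem.split("_")
    if len(parts) != 5:
        raise ValueError(f"expected 5 underscore-separated fields, got {len(parts)}: {name!r}")
    return NarrativeFile(*parts)


def match_participants(
    names_nl: Iterable[str], names_ru: Iterable[str]
) -> Dict[str, Tuple[str, str]]:
    """Pair file names from two directories that share a participant number.

    Assumption: each participant has at most one file per directory.
    """
    nl = {parse_name(n).participant_id: n for n in names_nl}
    ru = {parse_name(n).participant_id: n for n in names_ru}
    return {pid: (nl[pid], ru[pid]) for pid in sorted(nl.keys() & ru.keys())}


print(parse_name("mr_td_3_372_cat"))
# NarrativeFile(group='mr', status='td', age='3', participant_id='372', story='cat')
```

Participants who told only one story simply do not appear in the result of match_participants, which fits the note that not all bilingual children have narratives in both languages.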