Rigol Corpus

Heike Behrens
English Department
University of Basel


Participants: 21
Type of Study: naturalistic
Location: Germany
Media type: password to audio
DOI: doi:10.21415/T50S34

Project Description

Between 1990 and 2003, Rosemarie Rigol, a former professor of German linguistics at the University of Osnabrück, made more than 1,900 30-minute recordings of children acquiring German as their first language. The Rigol corpus covers a total of 21 children, including a set of twins and a set of triplets. The 21 children (11 boys and 10 girls) come from 11 families: 2 of the boys are singletons, 13 children have one sibling, the twins have one sister, and the triplets a brother. Of the 21 children, 19 grew up in a rural community in the German state of Hessen, and two in the city of Osnabrück (Lower Saxony). Rosemarie Rigol transcribed and checked these data. Later, the MPI Leipzig supported reformatting to CHAT. Heike Behrens can provide further details regarding the project, if needed.

Two intelligence tests (German versions based on the Cattell tests) were administered to 18 of the children from Hessen: the first before they started school (Version A), the second at the end of the first year at school (Version B). The tests served to determine the range of intelligence measures and to ensure compatibility when analysing the children's language development. (Rudolf Weiß and Jürgen Osterland: Grundintelligenztest CFT 1976, auf der Basis der Intelligenztheorie von Cattell entwickelt. Braunschweig: Westermann, 1977.)

Sociological background

While many families volunteered to be recorded, care was taken to have a representative sample of children from non-academic parents. Seven children come from 4 families with an academic background; the other parents had vocational training (skilled labour, craftsmen, or office clerks/employees). The mothers of the children recorded represent the first generation of women for whom higher education was widely available (before, only daughters of academics attended higher education or vocational training in the nearby larger city). It turned out that the experience of higher education being within reach changed the parents' expectations regarding the educational chances of their children, and this in turn influenced their educational style. Eight of the 11 families would like their children to reach the highest school-leaving qualification (Abitur); 3 families aim at a medium qualification (Realschule). More sociological detail about the families can be made available upon request. In order to ensure anonymity, family names and place names are transcribed as "www [= Name]" or "www [= Ortname]".

The data can be analyzed in terms of linguistic development, but also in terms of communicative processes (adult-child interaction) and in terms of socio-cultural aspects of child and family culture (e.g., manners of playing, toy use, educational values).


For four children, recordings started in the second year, for seven children in the first year of life, and 10 children were recorded from birth. With only one exception, recordings ended when the children had finished the first year at school (typically in the 8th year of life).

Recordings were made with an 8 mm or Hi-8 camera for the first 3 years of life in order to study the interactional patterns between children and adults (including language, gesture, facial expression, non-verbal context etc.). After age 3, film recordings and sound recordings (tape or DAT) were mixed. Towards the end of the recording period, only sound recordings were made. In the near future, all digitized audio and video recordings will be made available to the CHILDES archive, even if not all recordings have been transcribed. When the sessions were filmed, the focus of the camera was on the child.

Up to age 4, children were recorded for 30 minutes every two weeks, and after that every fourth week. Recordings took place in the children's homes or in their vicinity. Adults were recorded along with the children: most frequently the child's mother and siblings, occasionally the father, the grandparents or great-grandparents, or the child's playmates.

During the first 5 years of life, recordings were made of spontaneous interactions only. As the children approached school age, elicitation tasks were recorded as well. Here, the focus was on testing the child's processing skill, for example through requests to segment phrases into "words" or "syllables" by clapping the hands when the child perceived a boundary. In addition, several sessions focus on phoneme-grapheme correspondences.

Immediately following each of the recordings, Rosemarie Rigol wrote a protocol of the main activities, including some comments on remarkable events or developments. Many sessions were also transcribed in Word. Unfortunately, it turned out that these transcripts could not be converted into CHAT format, and therefore it was decided to re-transcribe the data in CHAT, making use of SONIC Chat whenever possible. Between 1999 and 2003, this process was supported by the Max Planck Institute for Evolutionary Anthropology in Leipzig (Germany), which sponsored several research assistants and students to digitize and transcribe the data. Heike Behrens supervised the transcription in order to achieve maximum compatibility with other German corpora (Leo-corpus, Miller-corpus). Rosemarie Rigol checked all transcripts and continues to transcribe further data.

The transcription follows CHAT conventions, and care was taken to facilitate automatic analyses of the data. Hence, the data are transcribed such that words can be recognized without misrepresenting the original. For example, clitics are resolved whenever possible ('s becomes (e)s or (da)s), missing segments are inserted ((Gi)raffe), and non-standard pronunciation is "corrected" by the replacement function (baba [: Papa]). The special form marker "@o" was used rather widely to indicate non-word material as well as interjections etc. With these measures, the resulting lexicon (FREQ) should show recognizable German words only; all other words should be indicated by special form markers (@o, @c, @f, @d).
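As a rough illustration of how these conventions support automatic analysis, the following Python sketch recovers the intended word forms from a single transcribed utterance. It is not part of CLAN or of the corpus tooling: the function name `target_words` and the regular expressions are assumptions made for this example, and they cover only the three notations mentioned above (parenthesized missing segments, `[: ...]` replacements, and `@` form markers). Other CHAT codes are not handled.

```python
import re

# Hypothetical helper, not part of CLAN. It handles only the notations
# described above and ignores all other CHAT codes.
REPLACEMENT = re.compile(r"(\S+)\s+\[:\s*([^\]]+)\]")  # baba [: Papa]
SPECIAL_FORM = re.compile(r"@[a-z]+$")                  # @o, @c, @f, @d, ...


def target_words(utterance: str) -> list[str]:
    """Return the intended word forms of one main-tier utterance."""
    # Keep the replacement target instead of the form the child produced.
    text = REPLACEMENT.sub(lambda m: m.group(2).strip(), utterance)
    words = []
    for token in text.split():
        if SPECIAL_FORM.search(token):
            continue  # non-word material flagged by a special form marker
        if not any(ch.isalpha() for ch in token):
            continue  # utterance terminators and other bare punctuation
        # Restore omitted segments: (Gi)raffe -> Giraffe, (da)s -> das
        words.append(token.replace("(", "").replace(")", ""))
    return words


if __name__ == "__main__":
    print(target_words("hm@o (da)s ist baba [: Papa] ."))
```

A frequency count over the returned words would then contain standard German forms only, which is the effect the transcription choices above aim for.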

In order to represent the elicitation tasks discussed above in CHAT format, some special form markers were used in non-standard fashion:

@l letter (including umlauts a_e@l, o_e@l, u_e@l, e_I@l, and c_k@l or s_c_h@l)
@p part of a word (often but not necessarily a syllable)
@t word or phrase that was supposed to be segmented (e.g. "Kindergarten" or "er_schwimmt_im_Gartenteich@t")
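The markers listed above can be decoded mechanically. The sketch below is one possible way to do so in Python; the function name `decode_elicitation_token` and the label strings are hypothetical. It also assumes that underscores join the characters of a letter unit or word part and the words of a test phrase, which is inferred from the examples given here rather than from a documented rule.

```python
from typing import Optional, Tuple

# Hypothetical labels for the non-standard markers described above.
MARKER_LABELS = {"l": "letter", "p": "word part", "t": "test item"}


def decode_elicitation_token(token: str) -> Optional[Tuple[str, str]]:
    """Split e.g. 's_c_h@l' into ('letter', 'sch'); return None otherwise."""
    form, sep, marker = token.rpartition("@")
    if not sep or marker not in MARKER_LABELS:
        return None  # ordinary word or a marker not covered here (e.g. @o)
    if marker == "t":
        # Assumption: underscores join the words of a phrase to be segmented.
        text = form.replace("_", " ")
    else:
        # Assumption: underscores join the characters of a multi-character unit.
        text = form.replace("_", "")
    return MARKER_LABELS[marker], text


if __name__ == "__main__":
    for tok in ["s_c_h@l", "er_schwimmt_im_Gartenteich@t", "Haus"]:
        print(tok, decode_elicitation_token(tok))
```

Note that the sketch leaves a_e as "ae" and does not convert it to an umlaut character, since the source only shows the underscore notation.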

Sound, Letters, and Word Parts

Beginning at age 5, the children were examined for their learning of the units of written language. The protocols distinguish the learning of letters (@l) and syllables (@p). Test words are marked with @t. The marking of sounds with @s was removed to conform to CHAT guidelines. Probe questions focused on asking about word beginnings, numbers of syllables,


There are 129 recordings (tape or video) of Cosima's development between ages 0;00.13 and 7;02.22. Transcription started at age 1;08.22. Cosima's father had a university education, her mother vocational training (Lehre). Cosima has a brother, Niklas, who is two years older. One grandmother lives in a separate apartment in the parental home, and Cosima also has good contact with various cousins, aunts, and uncles, and with her grandfather, who often takes the children out to the countryside to discover plants and animals. Cosima is a very social child with a lot of humour. Since she was 18 months old she has had one best friend, Ina, with whom she shares almost all activities.

Between ages 3 and 6, Cosima attended a Protestant kindergarten and in addition had some musical education. She then attended primary school and entered high school (Gymnasium) at age 10.

Recordings were made at very regular intervals. The mother is often present, as are Cosima's brother Niklas, her friend Ina, and the cousins Kai and Markus.


There are 130 recordings (tape or video) of Pauline's development between ages 0;00.12 and 7;11.03. Transcription started at age 1;10.12. Pauline's parents have university education. She has a brother, Robert, who is three years older. She has close contact with her aunts and uncles as well as 6 cousins. They frequently visit each other and also travel together. She also has several good girlfriends. Throughout her childhood she had several pets (cats, hamster, WELL). Pauline attended kindergarten between ages 3 and 7, and went to an integrated school (Gesamtschule) afterwards. She was a lively child with a lot of interests, as well as a good observer of things. Because of the many activities and trips of Pauline and her parents, recordings could not take place at regular intervals.


There are 134 recordings (tape or video) of Sebastian's development between ages 0;00.17 and 7;05.11. Transcription started at age 2;01.12. Sebastian's parents have vocational training (Lehre). Sebastian has a younger brother, Christian. The grandparents lived close by and there was close contact. Sebastian grew up with several pets and began breeding rabbits at age 4. He is very interested in practical matters and craftsmanship. Sebastian attended a Protestant kindergarten between ages 4 and 6, then an integrated school (Gesamtschule). Recordings took place at regular intervals.