Palasis Corpus


Katerina Palasis
Department of Linguistics
Université Côte d’Azur, CNRS, BCL, France

website

Participants: 22
Type of Study: classroom
Location: France
Media type: audio
DOI: doi:10.21415/T5SW4P

Browsable transcripts

Download transcripts

Link to media folder

Video is available offline on special request.

Citation information

In accordance with TalkBank rules, any use of data from this corpus must be accompanied by the above reference.

Articles related to the data:

Project Description - Year 1

This is a longitudinal and cross-sectional study of language development in French with recordings made over a 3-year period with 22 children aged 2;5.5 to 4;0.1 from the same kindergarten class. Corpus 1 represents the first year with recordings from November 2006 to June 2007 (13 sessions, approximately 20 hours). The file names begin with the session number and there are usually several transcripts from a given day.
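Ages such as 2;5.5 and 4;0.1 follow the CHAT convention of years;months.days. When comparing children across sessions it can help to convert such strings to a single numeric value. The sketch below is only an illustration and is not part of the corpus tooling: the function names `parse_age` and `age_in_months` are hypothetical, and the 30-day month used for the conversion is an assumption.

```python
import re

# Matches CHAT-style ages: years;months with an optional .days part.
AGE_PATTERN = re.compile(r"^(\d+);(\d{1,2})(?:\.(\d{1,2}))?$")


def parse_age(age):
    """Split an age string such as '2;5.5' into (years, months, days)."""
    match = AGE_PATTERN.match(age.strip())
    if match is None:
        raise ValueError(f"not a years;months.days age: {age!r}")
    years, months, days = match.groups()
    # A missing days part (e.g. '4;11') is treated as zero days.
    return int(years), int(months), int(days or 0)


def age_in_months(age, days_per_month=30.0):
    """Convert an age string to months (days_per_month is an assumption)."""
    years, months, days = parse_age(age)
    return years * 12 + months + days / days_per_month


if __name__ == "__main__":
    for sample in ("2;5.5", "4;0.1", "4;11"):
        print(sample, parse_age(sample), round(age_in_months(sample), 2))
```

Returning the three components separately keeps the original precision, while the month conversion is convenient for sorting or plotting.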

Rita's first language was Portuguese and Eliza's first language was Russian. The interactions are spontaneous child-child and child-investigator interactions, including playing various games, ‘reading’ books, commenting on the class photo album, and talking with the children about their activities in and out of school.

History: this series of recordings (Corpus 1) was undertaken within my thesis project in order to study the development of subject clitics, strong pronouns, and noun phrases in spontaneous child French. This is the reason why very specific codes were devised with regard to clitic and strong pronouns in this corpus (cf. attached Palasis_06-07_Specific.cut file, and details in Palasis, Katerina, 2010, "Introducing New French Child Data: Thoughts on their Gathering and Coding", Corpus, Vol. 9 "La syntaxe de corpus", p. 33-51, http://corpus.revues.org/index1801.html). There was no specific funding for Corpus 1.

I will very shortly start working on Corpus 2 (25 hours of recordings) and Corpus 3 (20 hours) thanks to funding recently granted by the ANR (Agence Nationale de la Recherche) for a joint French-German 3-year project (University of Nice-Sophia Antipolis and University of Konstanz) named DADDIPRO (‘Dialectal, acquisitional, and diachronic data and investigations on subject pronouns in Gallo-Romance’).

Warnings

Proper first names are used in the final transcripts, since each transcript is now linked to either a video or an audio file. Family names and some locations are not transcribed in order to preserve privacy. Since the initial plan was to use pseudonyms (I had not thought about linkage…), discrepancies can be found between these final transcripts and the earlier transcripts cited in previous publications. An informed consent form was signed by the parents of each child.

For a number of videos in this corpus, some of the transcribed material, usually at the beginning or end, has no corresponding video. Most of these files carry a comment saying that the transcription was made from the audio files. The files that fit this description are:

01-1b, 04-16, 07-29b, 08-32a, 08-33, 09-37a, 11-43a, 13-46a, 13-48a.

Also, some of the linking near the end of file 08-31a was not very accurate.
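Since the file names begin with the session number, the files listed above can be grouped by session when checking media linkage. The sketch below is a minimal illustration under stated assumptions: it assumes the session number is the run of digits before the first hyphen, and the helper names `session_number` and `group_by_session` are hypothetical rather than part of any TalkBank tool.

```python
from collections import defaultdict

# Files listed above as containing material transcribed from audio only.
AUDIO_ONLY_SEGMENTS = [
    "01-1b", "04-16", "07-29b", "08-32a", "08-33",
    "09-37a", "11-43a", "13-46a", "13-48a",
]


def session_number(file_name):
    """Return the session number, assumed to precede the first hyphen."""
    prefix, separator, _rest = file_name.partition("-")
    if not separator or not prefix.isdigit():
        raise ValueError(f"unexpected file name: {file_name!r}")
    return int(prefix)


def group_by_session(file_names):
    """Map each session number to the file names recorded in that session."""
    groups = defaultdict(list)
    for name in file_names:
        groups[session_number(name)].append(name)
    return dict(groups)


if __name__ == "__main__":
    for session, names in sorted(group_by_session(AUDIO_ONLY_SEGMENTS).items()):
        print(f"session {session:02d}: {', '.join(names)}")
```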

Project Description - Year 2

Corpus 2 is the second part of a project that collected data from an entire kindergarten class in France over three years:
Corpus                    1               2               3
Dates                     2006/2007       2007/2008       2008/2009
Recordings (hours)        20              20              25
Sessions                  13              10              10
Media                     Audio & video   Audio & video   Audio & video
Participants (children)   20              19              18
Age ranges                2;5-4;0         3;6-4;11        4;5-5;11
Utterances (children)     15,992          14,348          In progress
Participants (adults)     3               1               1
Utterances (adults)       12,891          9,291           In progress

The data are semi-naturalistic throughout, with child-child and child-investigator interactions. The children were encouraged to narrate their activities in and out of school, ‘read’ books and play games. For each session, three to five children were seated around a small table in a quiet room next to their classroom.

Children were L1 French speakers except for two children whose L1 was Portuguese (RIT) and Russian (ELI).

These recordings were initially undertaken in order to study the development of subject clitics, strong pronouns and noun phrases in spontaneous child French. This is the reason why pronouns bear specific codes on the MOR tier (see the Specific.cut file, and details in Palasis, Katerina, 2010, "Introducing New French Child Data: Thoughts on their Gathering and Coding", Corpus, Vol. 9 "La syntaxe de corpus", p. 33-51).

The codes used on the %err line are described in this table.

Corpus 3 (18 children, 4;5-5;11, 25 hours of recordings) is on its way.

Acknowledgements

Corpus 2 work was supported by the French-German ANR-DFG grant awarded to the project ‘Dialectal, acquisitional, and diachronic data and investigations on subject pronouns in Gallo-Romance’ (DADDIPRO, 2012-2015, no. ANR 11 FRAL 007 01): https://anr.fr/Projet-ANR-11-FRAL-0007

My deepest thanks also go to Brian MacWhinney, who made this contribution come true.

Usage Restrictions

Last names and locations are sometimes audible in the recordings. They are not transcribed and should never be.