CHILDES Italian Calambrone Corpus

Paola Cipriani
Department of Clinical and Experimental Medicine
INPE-Università di Pisa


Giuseppe Cappelli



Participants: 3 target + 6 control
Type of Study: longitudinal
Location: Italy
Media type: cannot reach people
DOI: doi:10.21415/T56C8R

Citation information

Publications using these data should cite:

Cipriani, P., Pfanner, P., Chilosi, A., Cittadoni, L., Ciuti, A., Maccari, A., Pantano, N., Pfanner, L., Poli, P., Sarno, S., Bottari, P., Cappelli, G., Colombo, C., & Veneziano, E. (1989). Protocolli diagnostici e terapeutici nello sviluppo e nella patologia del linguaggio (1/84 Italian Ministry of Health): Stella Maris Foundation.

In accordance with TalkBank rules, any use of data from this corpus must be accompanied by the above reference.

Project Description

This corpus includes data on both normal and disordered language development. However, the data from the three children with language disorders were never contributed to CHILDES. The normal data come from six participants (2 boys and 4 girls) whose speech samples were collected at home. Each child was recorded bimonthly, and each session lasted 30 to 45 minutes. The data from children with language disorders were collected at the Stella Maris Institute and include longitudinal as well as cross-sectional data on clinical syndromes (developmental dysphasia, genetic and chromosomal disorders). The longitudinal participants are three dysphasic children, observed from the age of three. Their linguistic production was limited to holophrases and a few word associations. Their speech was videotaped during monthly 30-minute sessions.

All transcripts were derived from videotaped interactions recorded with a video camera (Hitachi VM 200E or Nordmende V150). Each session was also audiotaped with a Sony™ TCM-6 recorder and a Sony™ ECM-150T personal microphone. One researcher entered the transcripts on an IBM personal computer and stored them on floppy disks, and two independent transcribers checked the level of agreement between their evaluations.

The six normal participants were: Rafaello, a first-born boy from a family of high SES, followed from 1;7.08 to 3;3.00 (39 videotapings); Rosa, a second-born girl from a family of middle-low SES, followed from 1;3.00 to 3;3.23 (43 videotapings); Martina, the only daughter from a family of middle SES, followed from 1;7.00 to 3;0.00 (20 videotapings); Guglielmo, a second-born boy from a family of middle-high SES, followed from 2;1.00 to 2;11.00 (13 videotapings); Viola, a second-born girl from a family of middle SES, followed from 1;10.00 to 3;0.14 (23 videotapings); and Diana, a first-born girl from a family of middle SES, followed from 1;6.07 to 3;0.19 (26 videotapings).
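The ages above use the CHAT notation years;months.days (for example, 1;7.08 is 1 year, 7 months, 8 days). As a minimal sketch of how such ages might be handled when working with the transcripts, the following Python code parses an age string and computes the approximate length of an observation period in whole months. The names `Age`, `parse_age`, and `span_in_months` are hypothetical helpers introduced for illustration; they are not part of the corpus or of any TalkBank tool.

```python
import re
from typing import NamedTuple


class Age(NamedTuple):
    """A child's age in CHAT notation: years;months.days."""
    years: int
    months: int
    days: int

    @property
    def total_months(self) -> int:
        # Whole months only; the day component is ignored here.
        return self.years * 12 + self.months


# The day part is optional, so both "1;7.08" and "6;2" are accepted.
_AGE_RE = re.compile(r"^(\d+);(\d{1,2})(?:\.(\d{1,2}))?$")


def parse_age(text: str) -> Age:
    """Parse an age string such as '1;7.08' into an Age tuple."""
    match = _AGE_RE.match(text.strip())
    if match is None:
        raise ValueError(f"not a CHAT-style age: {text!r}")
    years, months, days = match.groups()
    return Age(int(years), int(months), int(days or 0))


def span_in_months(start: str, end: str) -> int:
    """Approximate observation span in whole months (days are ignored)."""
    return parse_age(end).total_months - parse_age(start).total_months


# Age ranges copied from the participant description above.
AGE_RANGES = {
    "Rafaello": ("1;7.08", "3;3.00"),
    "Rosa": ("1;3.00", "3;3.23"),
    "Martina": ("1;7.00", "3;0.00"),
    "Guglielmo": ("2;1.00", "2;11.00"),
    "Viola": ("1;10.00", "3;0.14"),
    "Diana": ("1;6.07", "3;0.19"),
}

if __name__ == "__main__":
    for child, (start, end) in AGE_RANGES.items():
        print(child, span_in_months(start, end), "months")
```

Because the day component is dropped, the spans are approximations; a finer calculation would need an assumption about month length.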

Table 1: Longitudinal SLI Participants
Marco 6;2 – 9;413296305878027
Sara 4;11 – 6;512257327358216
Davide 5;8 – 6;1147391914490

Table 2: Cross-Sectional SLI Participants

The language samples from children and interacting adults were transcribed in CHAT format, with minimal context given in dependent tiers (%act, %gpx, %exp). The children's main lines contain the speech actually produced, with some coding for special lexical forms, punctuation, and pauses. In a second stage, new lines were added to code errors, omissions, and presyntactic devices. The focus of our first analysis was on lexical and morphological acquisition by normal and language-impaired children, looking in depth for transitional phenomena and stages of global language development.
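In CHAT files, header lines begin with `@`, main lines with `*` followed by a speaker code, and dependent tiers with `%` followed by a tier name; a line beginning with a tab continues the previous line. The following Python code is a minimal sketch, under those assumptions, of how main lines and their dependent tiers might be grouped. The `Utterance` class, the `parse_chat` function, and the sample transcript are invented for illustration and are not taken from the corpus or from the CLAN programs.

```python
from dataclasses import dataclass, field


@dataclass
class Utterance:
    speaker: str                      # speaker code, e.g. "CHI"
    text: str                         # content of the main line
    tiers: dict[str, str] = field(default_factory=dict)  # e.g. {"act": "..."}


def parse_chat(lines):
    """Group each main line (*SPK:) with the dependent tiers (%xxx:) below it."""
    utterances: list[Utterance] = []
    mode = "skip"      # what a tab-indented continuation line belongs to
    tier_name = ""
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line.startswith("@"):
            mode = "skip"                       # header lines are ignored
        elif line.startswith("*"):
            code, _, text = line[1:].partition(":")
            utterances.append(Utterance(code.strip(), text.strip()))
            mode = "main"
        elif line.startswith("%") and utterances:
            name, _, text = line[1:].partition(":")
            tier_name = name.strip()
            utterances[-1].tiers[tier_name] = text.strip()
            mode = "tier"
        elif line.startswith("\t") and utterances:
            extra = line.strip()
            if mode == "main":
                utterances[-1].text += " " + extra
            elif mode == "tier":
                utterances[-1].tiers[tier_name] += " " + extra
    return utterances


# Invented sample, NOT taken from the Calambrone transcripts.
SAMPLE = """@Begin
@Participants:\tCHI Target_Child, MOT Mother
*MOT:\tcosa è questo ?
*CHI:\tpalla .
%act:\tpoints to the ball
%exp:\tthe ball is on the
\tfloor
@End
"""

if __name__ == "__main__":
    for utt in parse_chat(SAMPLE.splitlines()):
        print(utt.speaker, "|", utt.text, "|", utt.tiers)
```

This sketch keeps only the tier text as plain strings; it does not interpret the special-form codes, pauses, or error coding mentioned above.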


This database is the result of research conducted from 1985 to 1990 in the laboratory of “Fisiopatologia del linguaggio in età evolutiva”, in which many people took part: Piero Bottari, Anna Maria Chilosi, Lorena Cittadoni, Alessandro Ciuti, Anna Maccari, Natalia Pantano, Lucia Pfanner, Paola Poli, Stefania Sarno, Luca Surian, and Paola Cipriani as co-ordinator. Pietro Pfanner is the Scientific Director of the Institute. Giuseppe Cappelli of the Institute for Computational Linguistics (directed by Antonio Zampolli) was responsible for the computational aspects of the project. Data collection was supported by the grant 6 500.4/ICS/62.1/1135 (13/08/85) assigned to the Stella Maris Scientific Institute by the Italian Ministry of Health.