CHILDES Croatian Kovacevic Corpus

Melita Kovacevic
Department of Language & Speech Pathology
University of Zagreb

Participants: 3
Type of Study: naturalistic
Location: Croatia
Media type: audio
DOI: doi:10.21415/T5FS5X

Project Description

Data collection started in 1993, when the first project on the acquisition of Croatian was initiated. The project has been renewed since then, and a new grant was awarded in 2002 for a three-year period. The project is headed by Prof. Melita Kovacevic, Department of Speech & Language Pathology, University of Zagreb.

The project follows the language development of ten children in total. Among them are two pairs of twins, one pair of which are late talkers. While some corpora are already completely coded, others are still being recorded. The parents of all children included in our study expressed interest in having their children's language development followed. Most of the parents are connected in some way to language-related fields, e.g. language pathology, language teaching, and linguistics.

Antonija's data are transcripts of audio recordings collected during 1994-1996 in Zagreb. Antonija was recorded in her home in spontaneous interactions with her parents and grandparents. The period of about one month, between 1;7 and 1;9, is missing for family reasons. Each recording session lasted about 45 minutes, and there were usually three sessions a month (see the table). The recordings and the transcriptions were made by Antonija's mother, Drazenka Blazi. Additional transcription and encoding were done by Maja Andjel, Antigona Katicic and Marijan Palmovic. All the transcribers are linguists trained in transcription. M. Palmovic and M. Andjel are research assistants (University of Zagreb) and A. Katicic is a Ph.D. student (University of Vienna).

Marina's data are transcripts of audio recordings collected during 1994 and 1995. They cover the period from Marina's age of 1;10 to 2;11, with three months missing (2;4, 2;10 and 2;11) due to the escalation of the war in Croatia at the time. The child was recorded for a longer period, but not all the files are ready for release. Since some of the recordings were made outdoors, in a yard, their sound quality is unsatisfactory and they require additional work. The present files were partially transcribed by Blazenka Brozovic, research fellow, Petra Makanec, student, and Maja Andjel. Final checking of the transcriptions and addition of the error codes were done by Marijan Palmovic in the Laboratory for Psycholinguistic Research, Dept. for Speech & Language Pathology, University of Zagreb.

Vjeran’s data are transcripts of audio recordings collected from 1995 to 1997. All the recordings were made in spontaneous interactions, mostly in his home in Zagreb, either with his parents or with the person who was recording. Some of the recordings were based primarily on picture-book reading, and this should be taken into account when performing analyses. Some recordings took place outside the home, at the playground or a neighboring store. The recordings cover the period from Vjeran’s age of 0;10 to 3;02. Each recording session lasted about 45 minutes, with three sessions per month on average (see the table). The recordings were made mainly by Blazenka Brozovic, research fellow, and some by his mother. Transcriptions were done by Blazenka Brozovic and Jelena Kuvac, research assistant. Final checking of the transcripts and some additional coding were done by Marijan Palmovic and Gordana Hrzica in the Laboratory for Psycholinguistic Research, Dept. for Speech & Language Pathology, University of Zagreb. All the transcribers have been trained to use the CHILDES transcription and coding system.
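The ages given above use the CHILDES years;months notation (for example, 1;10 is one year and ten months). As a minimal sketch, a helper like the following could convert such ages to months when comparing the periods covered by the three children. The function name age_in_months is hypothetical and not part of any CHILDES tool, and an optional day component (as in 2;03.15) is simply ignored.

```python
def age_in_months(age):
    """Convert a CHILDES-style age such as "1;10" or "3;02" to whole months.

    Assumption: the string has the form years;months with an optional
    ".days" suffix, which this sketch discards.
    """
    years, _, rest = age.partition(";")
    months = rest.split(".")[0]
    return int(years) * 12 + int(months or 0)


# Span of Vjeran's recordings, using the ages stated in the description.
vjeran_span = age_in_months("3;02") - age_in_months("0;10")
```

With the ages stated for Vjeran, vjeran_span comes out as 28 months.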

Informed consent for the use of the children's data has been obtained. No pseudonyms are used, but only first names appear.

Codes & transcription

For the purposes of transcription, a new code, @x:syl, was added to the depfile. During the sessions, Marina's mother often elicited a word by uttering its first syllable (and further syllables if necessary). Marina often responded by adding just the next syllable or the rest of the word. In Antonija's files, Antonija often produced a whole syllable rather than a single letter. The new code was needed because these utterances of Antonija's were neither babbling nor incomplete words.
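Users who want to locate the syllable-elicitation exchanges could search the main tiers for the @x:syl code. The following is a minimal sketch that assumes the code is attached as a suffix to the syllable (as in ma@x:syl) and that main tiers begin with an asterisk and a speaker code. The function name and the sample lines are invented for illustration and are not taken from the corpus.

```python
import re

# Assumption: the marker is written as a suffix on the syllable, e.g. "ma@x:syl".
SYL_TOKEN = re.compile(r"(\S+?)@x:syl\b")


def syllable_tokens(chat_text):
    """Return (speaker, syllable) pairs for each @x:syl token on a main tier."""
    hits = []
    for line in chat_text.splitlines():
        # Main tiers start with "*"; dependent tiers ("%...") and headers ("@...") are skipped.
        if not line.startswith("*"):
            continue
        speaker, _, utterance = line[1:].partition(":")
        for match in SYL_TOKEN.finditer(utterance):
            hits.append((speaker, match.group(1)))
    return hits


# Invented sample: the mother offers the first syllable, the child supplies the next.
sample = "*MOT:\tma@x:syl .\n*CHI:\tca@x:syl .\n%com:\tnot@x:syl counted\n"
pairs = syllable_tokens(sample)
```

Grouping the resulting pairs by speaker would give a rough count of how often each participant produced an isolated syllable.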

Warning: Because a Croatian keyboard was used, there may be some problems with Croatian characters, e.g. š, đ, ž, č and ć. The files were checked with the CHECK program and passed it successfully. However, some inconsistencies in encoding were inevitable, because several different transcribers worked on each child's transcripts over a long period of time. Please direct all questions to Marijan Palmovic or Gordana Hrzica.
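One way to look for the character problems mentioned in this warning is to check each line of a transcript for bytes that do not decode cleanly. The sketch below assumes the files are expected to be UTF-8, which this description does not itself state; the function name encoding_report is hypothetical, and the sample bytes are invented. It reports the numbers of lines that fail to decode and counts the Croatian diacritic letters found on lines that decode correctly.

```python
CROATIAN_LETTERS = set("šđžčćŠĐŽČĆ")


def encoding_report(raw):
    """Return (bad_line_numbers, diacritic_counts) for a transcript given as bytes.

    Assumption: well-formed lines are valid UTF-8.
    """
    bad_lines = []
    counts = {}
    for number, line in enumerate(raw.split(b"\n"), start=1):
        try:
            text = line.decode("utf-8")
        except UnicodeDecodeError:
            # Likely a legacy single-byte encoding such as Windows-1250.
            bad_lines.append(number)
            continue
        for ch in text:
            if ch in CROATIAN_LETTERS:
                counts[ch] = counts.get(ch, 0) + 1
    return bad_lines, counts


# Invented sample: one UTF-8 line and one line saved in a legacy code page.
good = "*CHI:\tčaša .".encode("utf-8")
legacy = "*MOT:\tžaba .".encode("cp1250")
bad_lines, counts = encoding_report(good + b"\n" + legacy)
```

Lines flagged in bad_lines are candidates for manual inspection; the sketch does not attempt to guess or repair the original encoding.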

Biographical data


Antonija's parents are middle-class urban dwellers: her mother is a university teacher (speech & language pathologist) and her father is an engineer. Both parents were born and raised in Zagreb. They speak the Zagreb Stokavian dialect. Croatian has a number of dialects. The differences among the dialect groups can be more substantial than those that separate standard languages within the Slavic language family. In Zagreb, the Croatian capital, there are two major dialects: Zagreb Stokavian, which is closer to standard Croatian, and Zagreb Kajkavian, which is closer to the Kajkavian dialect. The dialects differ at all levels of language: lexicon, phonology, morphology and syntax. As a university teacher and speech pathologist, Antonija's mother pays close attention to the way she speaks to her child, often repeating what the child has just said. However, in Antonija's interactions with her grandparents, strong elements of the Zagreb Kajkavian dialect can be observed, the most noticeable being the question word kaj ("what"). It should be noted that Antonija lives in a three-generation household in a Zagreb suburb and is with both grandparents all the time as well.

Kajkavian influence can also be heard in the prosody, as a lack of the two rising accents, but this cannot be seen in the transcripts, since prosodic information was not transcribed, not being a focus of the study. At the time of the recordings Antonija was an only child.


Marina's parents are middle-class urban dwellers as well: her mother is an assistant professor at the Department for Speech & Language Pathology, University of Zagreb, and her father is an actor at the Croatian National Theatre in Zagreb. Her mother was born and raised in Zagreb, while her father came from the Dalmatian coast (Zadar) to study in Zagreb. They both speak the Zagreb Stokavian dialect with some elements of Zagreb Kajkavian, although her father comes from the Cakavian dialect region (Cakavian being another distinct dialect of Croatian). However, as he is a trained actor, his everyday speech is close to the standard.

At the time of the recordings Marina had a sibling, an infant girl named Vita. Vita is only mentioned in the recording sessions.


Vjeran’s parents also belong to the middle class: his mother is an English teacher working at a private college and his father is an electrical engineer. His mother also holds a degree in phonetics, which made her quite sensitive to language and speech issues. Both parents were born and raised in Zagreb. They speak the Zagreb Stokavian dialect (see the information on the dialect above). Vjeran’s babysitter, with whom he spent a few hours daily, was a woman who came to Zagreb from another Croatian region, Lika. The speech of that region is a Stokavian dialect, which was preserved in the babysitter’s speech. This dialect is the basis of standard Croatian.

Vjeran does not have any siblings and was mostly surrounded by adults at the time the recordings were made.

Usage Restrictions

When using the Antonija and Marina corpora, please cite the title of the project and the name of the project head: Acquisition of Croatian in Crosslinguistic Perspective, Kovacevic, M. (2003).

By contributing our data to the CHILDES system we do not impose any particular restrictions on their use. However, we would appreciate it if researchers sent us copies of articles that make use of the data, or a reference to them.


This project has been financed by the Croatian Ministry of Science and Technology (Project No. 013002).