CHILDES Hebrew Berman Longitudinal Corpus

Ruth Berman
Department of Linguistics
Tel Aviv University
rberman@post.tau.ac.il

Participants:	4
Type of Study:	naturalistic, longitudinal
Location:	Israel
Media type:	audio
DOI:	doi:10.21415/T5X61W

Citation information

Below are the publications, research studies, and talks using the database.

Armon-Lotem, S. (1997). The Minimalist Child: Parameters and Functional Heads in the Acquisition of Hebrew. Unpublished doctoral dissertation, Tel Aviv University.

Armon-Lotem, S. & Berman, R. A. (2003). The emergence of grammar: Early verbs and beyond. Journal of Child Language, 30(4), 845-878.

Berman, R. A. (1990). Acquiring an (S)VO language: Subjectless sentences in children's Hebrew. Linguistics, 28, 1135-1166.

Berman, R. A. (1996). Form and function in developing linguistic and narrative abilities: The case of ‘and’. In Slobin, D., Gerhardt, J., Kyratzis, A. & Guo, J. (eds.), Social Interaction, Context, and Language: Essays in Honor of Susan Ervin-Tripp. Mahwah: Erlbaum, 243-68.

Berman, R. A. (1997). Theory and research in the acquisition of Hebrew as a first language. In Shimron, Y. (ed.), Studies in the psychology of language. Jerusalem: Magnes, 37-69. [in Hebrew].

Berman, R. A. & Armon-Lotem, S. (1996). How grammatical are early verbs? In Martinot, C. (ed.), Annales Littéraires de l’Université de Franche-Comté: Actes du Colloque International sur l’Acquisition de la syntaxe, 17-60.

Berman, R. A. & Ravid, D. (2000). Research in acquisition of Israeli Hebrew and Palestinian Arabic. Hebrew Studies, 41, 83-98.

Berman, R. A. & Weissenborn, J. (1991). Acquisition of word order: A crosslinguistic study. Final Report. German-Israel Foundation for Research and Development (GIF).

Uziel-Karl, S. (1999). The early make-up of children’s verb lexicon. In Proceedings of the 30th Child Language Research Forum, 41-50.

Uziel-Karl, S. (2001). A Multidimensional Perspective on the Acquisition of Verb Argument Structure. Unpublished doctoral dissertation, Tel Aviv University.

Uziel-Karl, S. (2001). Acquisition of verb argument structure: Canonical mapping or verb-by-verb? In Proceedings of the 26th Boston University Conference on Language Development, 701-717.

Uziel-Karl, S. (2001). Early morphological development: Evidence from the acquisition of verb morphology in child Hebrew. Proceedings of the Conference on Early Lexicon Acquisition, Lyon, France.

Uziel-Karl, S. (in press-a). Acquisition of verb argument structure in a developmental perspective. Studies in Theoretical Psycholinguistics, Dordrecht: Kluwer.

Uziel-Karl, S. (in press-b). A developmental approach to the acquisition of verb argument structure. Hebrew Linguistics [in Hebrew].

Uziel-Karl, S. & Berman, R. A. (2000). Where’s ellipsis: Whether and why there are missing arguments in Hebrew child language. Linguistics, 38, 457-482.

Uziel-Karl, S. & Budwig, N. (2002). The development of non-agent subjects in Hebrew child language. In Proceedings of the 27th annual Boston University Conference on Language Development (pp. 798-808).

In accordance with TalkBank rules, any use of data from this corpus must be accompanied by at least one of the above references.

Project Description

This Hebrew longitudinal data-base is contributed by the Tel Aviv University Laboratory headed by Ruth A. Berman, holder of the chair “Language across the Life Span”. Funding for data-collection and transcription of these materials was provided by grants to Ruth Berman, Tel Aviv University, and Jürgen Weissenborn, Max-Planck Institute for Psycholinguistics, Nijmegen, from the German-Israel Binational Science Foundation (GIF) – 1988 to 1991 – and from the Deutsche Forschungsgemeinschaft (DFG) – 1988 to 1990 – for the crosslinguistic study of early language acquisition in French, German, and Hebrew. Additional assistance with funding and equipment was provided by Brian MacWhinney as director of the CHILDES Laboratory at Carnegie Mellon University, and by Wolfgang Klein, director of the Language Acquisition section at the Max-Planck Institute for Psycholinguistics.

Sharon Armon-Lotem supervised data collection by graduate student research assistants of the Department of Linguistics and School of Education, Tel Aviv University. Sigal Uziel-Karl and Bracha Nir-Sagiv standardized the files, following the latest version of the CHILDES transcription system (MacWhinney 2000).

The data-base consists of naturalistic longitudinal data collected on a weekly basis from four Hebrew-speaking children, three girls (Hagar, Smadar, and Lior) and one boy (Leor). All four children are native speakers of Hebrew raised in monolingual, highly educated Hebrew-speaking homes, with both parents professionals, in urban communities of central Israel. Smadar was the youngest of three girls, Hagar and Leor were only children at the time of recording, and Lior had a baby brother.

Each child was audio-recorded at his or her home for a total of around one hour per week, typically two or three times a week in different situations (mealtime, bath time, playing on their own or with siblings or parents and grandparents). Recordings were made over a period of one to three years (see Table 1 below). The contact person and main recorder for three of the children was the mother, and in one case (Leor’s) the aunt; all four were native speakers of Hebrew who had majored in linguistics at the university. The research assistants kept in close touch with the family contact person. Caretaker-recorders were encouraged to maintain a natural and spontaneous atmosphere throughout recording situations, but they were also instructed to repeat or extend what the child had said in cases where an utterance might be unclear or unintelligible to transcribers. Those doing the recording were also instructed to specify the exact situation in which recording took place at the outset and in the course of each session. Information about the situation in specific sessions is provided in each file under the @Situation heading. As former students of the department and/or research assistants in associated projects, all caretaker-recorders and parents agreed to make the materials available for further use by the Berman lab.
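Since each file records its recording context under the @Situation heading, that information can be pulled out programmatically. The following is a minimal Python sketch, not part of the corpus tooling. It assumes the usual CHAT layout in which a header line begins with "@Situation:" and long headers continue on tab-indented lines. The function name extract_situations and the sample transcript are invented for illustration.

```python
def extract_situations(chat_text: str) -> list[str]:
    """Return the text of every @Situation header in a CHAT transcript."""
    situations = []
    current = None  # pieces of the header currently being collected, if any
    for line in chat_text.splitlines():
        if line.startswith("@Situation:"):
            if current is not None:
                situations.append(" ".join(current))
            current = [line[len("@Situation:"):].strip()]
        elif current is not None and line.startswith("\t"):
            # Assumption: a tab-indented line continues the previous header.
            current.append(line.strip())
        else:
            if current is not None:
                situations.append(" ".join(current))
                current = None
    if current is not None:
        situations.append(" ".join(current))
    return situations


# Invented sample for illustration; not taken from the corpus.
sample = (
    "@Begin\n"
    "@Situation:\tmealtime at home,\n"
    "\tmother feeding the child\n"
    "*CHI:\tnanu [: gamarnu] [*] .\n"
    "@End\n"
)
print(extract_situations(sample))
```

A file may contain more than one @Situation line when the context changes during a session, which is why the function returns a list.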

This data-base has several features that make it well-suited to child language research. The interactions are natural since they were recorded in the homes, a setting familiar to the children, in the presence of a primary caregiver and/or other members of the family. The data were collected over several sessions each week and so allowed a variety of contexts for the children to express themselves. Rich contextual information was provided by the caregivers, who were regularly available to the transcriber for consultation and clarification. Finally, both the transcribers and the researchers involved in the project knew the children and their parents, and were familiar with the children’s linguistic development beyond the data provided by the recorded sessions.

Table 1 gives details of the complete data-base recorded and transcribed for the four children.

Table 1 - Size and Range of Database from Four Hebrew-speaking Children

ID	Subject	Sex	Age Range	#Files	#Child Utts	Utts Range	Mean Utts Per File
LONGIT	Hagar	F	1;7 - 3;3	135	15238	31 - 288	113
LONGIT	Lior	F	1;5 - 3;1	141	21920	missing	149
LONGIT	Smadar	F	1;4 - 2;4	34	7144	44 - 374	210
LONGIT	Leor	M	1;9 - 3;0	82	16470	62 - 423	198

All recordings were transcribed in the CHAT format (CHILDES) with adaptations to Hebrew. A special system of broad phonetic transcription of Hebrew devised by Ruth Berman in earlier studies (and used in an earlier, cross-sectional Hebrew data-base of 100 children between the ages of one and five years, entered in the CHILDES archives in the late 1980s) was improved and extended for use with the audio-recordings in the longitudinal study. The transcription applied to the longitudinal database made it possible to represent the children’s target forms in a consistently standardized way; that is, as they would be pronounced in the standard Hebrew speech of these children’s caretakers. This procedure was adopted in order to facilitate lexical searches across the same forms for the same words. It also means, however, that the resulting transcriptions are suited to analysis at the levels of morphology (inflectional and derivational), syntax, and the lexicon, but are not adequate for studying details of phonological development.

The children’s target forms are typical of “standard” Hebrew usage of well-educated Israelis for whom Hebrew is a first and major language (Berman 1987, Ravid 1995, Berman & Ravid 1999). In order to reflect the genuine usage of such speakers (and the primary input to the children in this research), the transcription deliberately departs both from the historical or underlying forms represented by conventional Hebrew orthography and from the normative pronunciation stipulated by the Hebrew language establishment (Hebrew Language Academy, school grammars, official broadcasting and media, etc.).

Children’s actual pronunciation of certain forms and pronunciation errors were marked as such on the main tier. For example, when a child used a form like nanu for gamarnu ‘finished-1pl-pt = alldone’, this was represented on the main tier (text-line) as nanu [: gamarnu] [*]. More general comments regarding child and adult pronunciation were included under the heading @Comment. This procedure was adopted (1) to allow for lexical searches, since Hebrew orthography represents vowels and phonological processes such as spirantization and voicing assimilation only very partially; (2) to facilitate analysis of data based on situational context or on caretaker reaction prior to coding (for example, whether a form such as pes ‘climb’ should be taken to mean letapes ‘to climb’ or metapes ‘climb-ms-sg-pr’; cf. Armon-Lotem & Berman 2003, Uziel-Karl 2001); and (3) to make the contents of the transcripts more readable and so more accessible to outside investigators and students. As a result, as noted, the materials are well suited to analysis at the morphological, lexical, and syntactic levels, but do not allow for detailed phonological analysis. Further, the efforts to ensure rich contextual information by means of cues provided by the adults who did the recording make the material amenable to semantic and pragmatic analyses as well.
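The main-tier notation described above, nanu [: gamarnu] [*], lends itself to simple pattern matching when preparing lexical searches over target forms. The Python sketch below is an illustration under stated assumptions, not the CHILDES tooling. It handles only the single-word pattern shown in the example (a produced form, then "[: target]", optionally followed by the error mark "[*]"). The names REPLACEMENT, replacements, and target_line are hypothetical.

```python
import re

# Produced form, then "[: target]", optionally followed by the error mark "[*]".
# Assumption: only this single-word pattern is handled; other CHAT codes are ignored.
REPLACEMENT = re.compile(r"(\S+) \[: ([^\]]+)\](?: (\[\*\]))?")


def replacements(main_tier: str) -> list[tuple[str, str, bool]]:
    """List (produced form, target form, marked as error) for each replacement."""
    return [
        (m.group(1), m.group(2), m.group(3) is not None)
        for m in REPLACEMENT.finditer(main_tier)
    ]


def target_line(main_tier: str) -> str:
    """Rewrite the line so that each produced form is replaced by its target form."""
    return REPLACEMENT.sub(lambda m: m.group(2), main_tier)


# The example form from the text, placed on an invented main-tier line.
line = "*CHI:\tnanu [: gamarnu] [*] ."
print(replacements(line))
print(target_line(line))
```

Searching the output of target_line rather than the raw tier is one way to find every occurrence of a word regardless of how the child pronounced it, which is the purpose the text gives for standardizing target forms.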

Transcription conventions

Letter	Symbol	Example	Gloss	Letter	Symbol	Example	Gloss
Aleph	∅	∅aba∅	Daddy	Kaf	k x	kdur rax	ball soft
Bet	b v	bayit tov	house good	Lamed	l	layla	night
Gimmel	g	gag	roof	Nun	n	ner	candle
Daled	d	dag	fish	Samech	s	sefer	book
Heh	h	har	mountain	Ayin	' ∅	na’al ∅ale	shoe leaf
Vav	v u	vered sus	rose horse	Pe	p f	pil sof	elephant end
Zayin	z	ze	it, this	Tsade	c	mocec	pacifier
Chet	x	xam	hot	Qof	k	kar	cold
Tet	t	taim	tasty	Resh	r	rosh	head
Yod	y	yad	hand	Shin	sh	shana	year
				Sin	s	se’ar	hair
				Tav	t	calaxat	plate

Vowel	Example	Gloss
a	aba	Daddy
e	sefer	book
i	sipur	story
o	or	light
u	sus	horse
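One practical consequence of the consonant table is that several symbols stand for more than one Hebrew letter, so a transcribed form does not determine its conventional spelling. The Python sketch below mirrors the rows of the consonant table exactly as listed (letters that have no row in the table are not included) and builds a reverse index from symbol to letters. The names LETTER_SYMBOLS and letters_by_symbol are invented for illustration.

```python
from collections import defaultdict

# Letter name -> transcription symbols, copied from the consonant table above.
LETTER_SYMBOLS = {
    "Aleph": ["∅"], "Bet": ["b", "v"], "Gimmel": ["g"], "Daled": ["d"],
    "Heh": ["h"], "Vav": ["v", "u"], "Zayin": ["z"], "Chet": ["x"],
    "Tet": ["t"], "Yod": ["y"], "Kaf": ["k", "x"], "Lamed": ["l"],
    "Nun": ["n"], "Samech": ["s"], "Ayin": ["'", "∅"], "Pe": ["p", "f"],
    "Tsade": ["c"], "Qof": ["k"], "Resh": ["r"], "Shin": ["sh"],
    "Sin": ["s"], "Tav": ["t"],
}


def letters_by_symbol(table: dict[str, list[str]]) -> dict[str, list[str]]:
    """Invert the table: for each symbol, list the letters it can represent."""
    index = defaultdict(list)
    for letter, symbols in table.items():
        for symbol in symbols:
            index[symbol].append(letter)
    return dict(index)


index = letters_by_symbol(LETTER_SYMBOLS)
# Symbols that correspond to more than one letter in the table.
ambiguous = {symbol: letters for symbol, letters in index.items() if len(letters) > 1}
for symbol, letters in ambiguous.items():
    print(symbol, "<-", ", ".join(letters))
```

This is consistent with the description of the system as a broad phonetic transcription of standard spoken Hebrew rather than a transliteration of the orthography.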

Usage restrictions

Note: Copies of publications using this data-base should be sent by e-mail to rberman@post.tau.ac.il and/or by air mail to Dr. Ruth Berman, Department of Linguistics, Tel Aviv University, Ramat Aviv, Israel 69978.

For more information regarding the data-base, contact Bracha Nir-Sagiv at brachan@post.tau.ac.il or Sigal Uziel-Karl at sigal@alum.mit.edu.