Velka Popova, Laboratory of Applied Linguistics, University of Shumen, v.popova@shu.bg |
Dmitar Popov, Laboratory of Applied Linguistics, University of Shumen, labling@shu.bg |
Participants: | 5, 50, 71 |
Type of Study: | naturalistic, narrative |
Location: | Bulgaria |
Media type: | audio |
DOI: | doi:10.21415/PHWH-J834 |
The main focus of the LabLing research program is the creation of a Bulgarian child language corpus as part of the CHILDES database. LabLing is a member of the consortium of the Bulgarian national research infrastructure for resources and technologies for linguistic, cultural and historical heritage, integrated within CLARIN EU and DARIAH EU (CLaDA-BG – https://clada-bg.eu/en). In particular, the data will be important for building a national interdisciplinary electronic infrastructure as electronic resources for Bulgarian are integrated and developed. The construction of the LabLing corpus is therefore a priority task of the CLaDA-BG consortium.
In the transcripts, the Cyrillic letters Я, Ю, Ъ, Ч, Ш, Щ, Ж, Ц, Й, Х are assigned the following Latin correspondences: Я – ja, Ю – ju, Ъ – y, Ч – ch, Ш – sh, Щ – sht, Ж – zh, Ц – c, Й – j, Х – x.
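The letter correspondences listed above can be written down as a small lookup table. The following is only a minimal sketch, not the tool used to prepare the corpus: the names `SPECIAL_LETTERS` and `transliterate_special` are hypothetical, the "X" in the list is assumed to stand for Cyrillic Х, and upper-case letters are assumed to receive the same lower-case Latin output. Letters not named in the list are left unchanged, because their correspondences are not stated here.

```python
# Latin correspondences for the Cyrillic letters named in the list above.
# Assumption: the "X" in that list is Cyrillic Х (kha).
SPECIAL_LETTERS = {
    "я": "ja", "ю": "ju", "ъ": "y", "ч": "ch", "ш": "sh",
    "щ": "sht", "ж": "zh", "ц": "c", "й": "j", "х": "x",
}

# Assumption: upper-case Cyrillic letters map to the same lower-case Latin text.
_TABLE = str.maketrans(
    {**SPECIAL_LETTERS, **{k.upper(): v for k, v in SPECIAL_LETTERS.items()}}
)


def transliterate_special(text: str) -> str:
    """Replace only the listed Cyrillic letters; keep everything else as is."""
    return text.translate(_TABLE)
```

Because the table covers only the ten listed letters, a full transliteration would need the remaining letters of the Bulgarian alphabet added to `SPECIAL_LETTERS`.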
The children were born and live in the northeastern part of Bulgaria (Shumen and Varna). They were recorded in everyday situations (playing games, dressing, eating, going to sleep, looking through children's picture books, free play with the mother, free play with the father, free play with other children, reading a book, and others) during their daily interaction with their relatives. All individuals entered in the database as participants in the dialogues are monolingual native speakers of Bulgarian. The adults in the children's surroundings have completed either secondary or higher (university) education. The audio recordings of two of the children (ALE and TEF) were made by the LabLing research team, and those of BOG, SIM, and ELI were made by their mothers. The digitization and transcription of the material were done by members of the research team.
The narrative corpus consists of two segments. The first uses the fox and cat stories and the second uses the birds and dogs stories.
The fox-cat collection contains 91 transcripts of children's narratives elicited from 50 monolingual children (native speakers of Bulgarian). The narratives were audio-recorded in several kindergartens in Shumen and Varna (northeastern Bulgaria) and, in a few cases, at home or in the street. The children are divided into 3 age groups: