CHILDES Japanese Ogawa Corpus


Yoshiki Ogawa
Graduate School of Information Sciences
Tohoku University

website

Susanne Miyata
Department of Medical Sciences
Aichi Shukutoku University

website

Participants: 1
Type of Study: naturalistic
Location: Japan
Media type: diary
DOI: doi:10.21415/T5H314

Browsable transcripts

Download transcripts

Citation information

Miyata, S. (2012b). CHILDES nihongoban: Nihongoyoo CHILDES manyuaru 2012. [Japanese CHILDES: The 2012 CHILDES manual for Japanese]. http://www2.aasa.ac.jp/people/smiyata/CHILDESmanual/chapter01.html

Ogawa, Yoshiki (2016). Ogawa Corpus. Pittsburgh, PA: TalkBank. doi:10.21415/T5H314

Sugisaki, Koji and Yoshiki Ogawa (2022) “Nihongo-niokeru Uhou Shuuenbu-no Kakutoku: Sizen-hatuwa Bunseki-nimotoduku Yobiteki Kenkyuu” [Acquisition of Right Peripheries in Japanese: A Preliminary Research based on an Analysis of Natural Utterances], Coopasu-kara Wakaru Gengo Henka / Henni to Gengo Riron 3 [Facts of Language Changes and Variations seen from Corpora and Linguistic Theories], ed. by Yoshiki Ogawa and Hidetoshi Nakayama, Kaitakusha, Tokyo.

In accordance with TalkBank rules, any use of data from this corpus must be accompanied by at least one of the above references.

Project Description

The Ogawa Corpus contains diary data collected by the generative linguist Yoshiki Ogawa. Ogawa has taught at the Graduate School of Information Sciences, Tohoku University, since 2004. He has published numerous books and articles, mainly on syntax, morphology, lexical semantics, and diachronic change in morphosyntax. The release of this data set is one of the research activities of the Research Unit on Language Change and Language Variation, led by Yoshiki Ogawa.

He observed his first-born daughter Ayumi from birth (04-JUL-2011) until 6;2, while she was growing up in Miyagi. The data is based on handwritten records collected virtually daily (2250 days over 6 years and 2 months), although her first utterance was recorded at 0;9 and there are few utterances from her first two years. The data contains 22,671 utterances by Ayumi and almost no utterances by other speakers (mainly her mother and father). Comments are provided for some utterances, establishing the context and interpreting the child's utterance.

He has also been observing his second-born daughter Mari from birth (06-OCT-2015) until 4;2, and plans to continue observing her until age 7, while she is growing up in Miyagi. The data is based on handwritten records collected virtually daily (1524 days over 4 years and 2 months, as of 06-DEC-2019), although her first utterance was recorded at 0;4 and there are few utterances from her first two years. The data contains 12,995 utterances by Mari (up to 4;2) and almost no utterances by other speakers (mainly her mother and father). Comments are provided for some utterances, establishing the context and interpreting the child's utterance.

An electronic version of this data was converted to CHAT format and provided with morpheme coding (JMOR07) by Susanne Miyata (Aichi Shukutoku University).

Acknowledgements

Susanne Miyata reformatted this corpus to conform to the current (2020) version of CHAT.

Usage Restrictions

If you use this data or parts of it, please send one printed copy of your article or publication to Yoshiki Ogawa. Additional data, including handwritten records of Mari's utterances from 4;3 onward, is available upon request as an Excel file (not yet in CHILDES format). You can contact Yoshiki Ogawa via email.