CHILDES Japanese MiiPro Corpus

Susanne Miyata
Department of Medical Sciences
Aichi Shukutoku University


Hiro Yuki Nisisawa
College of Human Science
Tokiwa University


Participants: 4
Type of Study: naturalistic
Location: Japan
Media type: audio
DOI: doi:10.21415/T55C72

Browsable transcripts

Download transcripts

Link to media folder

Citation information

In accordance with TalkBank rules, any use of data from this corpus must be accompanied by at least one of the above references.

Project Description

The "MiiPro" study observes four first-born children of the same age who knew each other and lived close to each other in a neighborhood in the Tokyo area. Each child was observed separately at home, weekly from 1;2 to 3;0 and monthly or bi-monthly from 3;0 to 5;0. Each mother-child session lasted approx. 70 minutes. The present release covers the time span from 1;2 to 5;0 for Nanami, and 3;0 to 5;0 for Asato, Tomito, and Arika. For Arika, the ArikaF folder includes conversations with the father and the ArikaM folder includes conversations with the mother.

The beginning and the end of each session include adult conversation (mother and investigator) that occurred while the equipment (video camera and MD recorder) was being set up. The maternal utterances directed to INV are marked by [+ bch]. The mother was instructed to play with the child as she normally would, encouraging the child to talk, but to stay within range of the video camera and to avoid eating and drinking as well as noisy toys, TV, TV games, and the like during the sessions. The investigator left during the observation. The most productive 60 minutes of each session are marked as "gem". Personal names, including the children's first names, are replaced by pseudonyms, and last names are replaced by “N”. All utterances are transcribed on the basis of the MD recordings in Japanese and Latin script (Miyata, 2012b), sound-linked with Sonic Mode, and provided with mor tiers (JMOR04.1 or later; please check the information in the Comment header tier; Miyata & Naka, 2014). MLU and DSSJ were computed with CLAN following the procedure explained in Miyata (2012a).

Nanami (Nisisawa & Miyata, 2009)

Nisisawa, Hiro Yuki
Miyata, Susanne

The girl Nanami (NJD; 1;1.29 – 5;0.17) was observed monthly (total 65h 21min). The corpus includes 58 sessions with a total of 27,416 utterances of NJD and 58,324 utterances of MOT. Nanami's younger sister Juri (JUI; born when Nanami was 2;2) and her baby brother Kanta (KAN; born when Nanami was 3;6) were mostly present and sometimes took part in the conversation. Their transcribed utterances are not in JCHAT format and have not been checked for reliability.

Nanami's data
Age of NJD  File Name    Length  NJD # Utt  MLUm*  Participants**
2;11.28     NJD19990613  73'00"  733        3.356  INV, FAT, JUI
3;0.14      NJD19990629  67'00"  347        3.468  INV, JUI
3;1.16      NJD19990731  99'00"  543        4.168  INV, JUI, BOY, LDY
3;2.20      NJD19990904  75'00"  579        3.309  INV, JUI
3;3.19      NJD19991004  71'50"  691        3.503  INV, FAT, JUI
3;4.15      NJD19991030  74'20"  586        4.296  INV, JUI
3;5.23      NJD19991128  43'00"  264        3.525  INV, JUI, CIE
3;6.9       NJD19991224  65'30"  284        3.845  INV, JUI
3;9.23      NJD20000407  75'00"  756        4.057  INV, JUI, KAN, APR
3;10.15     NJD20000430  74'55"  743        3.895  INV, JUI, MIM, APR
3;11.12     NJD20000527  73'30"  473        4.342  IN2, JUI
4;0.9       NJD20000624  75'00"  627        4.291  INV, IN2, JUI, KAN, APR, FUY
4;1.31      NJD20000814  75'00"  468        3.659  INV, FAT, JUI
4;3.08      NJD20000923  74'50"  632        4.415  IN2, JUI, APR, FUY
4;4.7       NJD20001022  74'40"  472        4.047  INV, JUI
4;5.04      NJD20001119  73'00"  752        4.274  INV, JUI, KAN
4;6.10      NJD20001225  74'50"  716        4.771  INV, JUI
4;7.12      NJD20010127  76'15"  696        4.247  INV, JUI, KAN
4;8.10      NJD20010225  70'15"  463        3.976  IN2, JUI, APR
4;9.16      NJD20010331  81'00"  700        4.44   INV, JUI, APR
4;11.07     NJD20010522  12'20"  68         4.797  INV, JUI, KAN, CHI
5;0.17      NJD20010702  74'15"  428        4.844  INV, JUI
Total       57 Sessions          27,416
*MLUm computed with mlu +d1 +t%mor +b- +b# +b+ -t* +t*NJD -sonoma* -sco:* -sn:let* @
**Participants: Participants other than NJD and MOT
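
Ages in the tables are given in the CHILDES notation years;months.days (for example 2;11.28). For plotting MLUm against age or sorting sessions, it can help to convert this notation into a single number. The following Python sketch is one way to do this; the function name is illustrative, and counting each day as 1/30 of a month is a simplifying assumption, not a rule taken from this corpus.

```python
def age_to_months(age: str) -> float:
    """Convert a CHILDES age such as '3;0.14' or '1;2' to months.

    Assumption: a day counts as 1/30 of a month (an approximation).
    """
    years, rest = age.split(";")
    # The day part is optional: '1;2' has none, '3;6.9' has days = '9'.
    months, _, days = rest.partition(".")
    total = int(years) * 12 + int(months)
    if days:
        total += int(days) / 30
    return float(total)


if __name__ == "__main__":
    for age in ["2;11.28", "3;0.14", "4;11.07", "5;0.17"]:
        print(age, round(age_to_months(age), 2))
```

Ages written with or without leading zeros (3;6.9, 4;3.08) are handled the same way, since each part is read as an integer.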

Arika (Nisisawa & Miyata, 2010)

Nisisawa, Hiro Yuki
Miyata, Susanne

The girl Arika (APR; 2;11.28 – 5;0.17) was observed monthly. Arika was born 13-MAY-1996. The ArikaM corpus includes 55 sessions with a total of 47,173 utterances of APR and 40,412 utterances of MOT (total 67h 45min). Arika's younger sister Fuyumi (FUY; born when Arika was 3;2) was often present and started to take part in the conversation in the later sessions. Fuyumi’s transcribed utterances are not in JCHAT format and have not been checked for reliability. A second corpus of Arika's conversations with her father (ArikaF corpus) is in preparation.

Arika's data
Age of APR  File Name    Length  # APR Utt  MLUm*  Participants**
3;00.02     APRM990515   74'50"  1167       3.334
3;00.21     APRM990603   64'30"  695        2.76
3;01.04     APRM990617   72'40"  867        3.31
3;01.16     APRM990629   75'00"  921        3.624
3;01.24     APRM990707   74'30"  806        3.847
3;02.01     APRM990714   74'30"  1003       3.142
3;02.09     APRM990722   75'00"  632        3.313
3;02.22     APRM990804   75'00"  966        4.406
3;03.11     APRM990824   75'00"  620        3.708  FUY, BAA
3;03.25     APRM990907a  75'00"  920        4.221  BAA
3;03.25     APRM990907b  75'00"  695        5.071  FUY, BAA
3;03.27     APRM990909   68'15"  584        4.497  BAA
3;04.02     APRM990915   68'15"  520        4.052
3;04.21     APRM991004   75'00"  805        4.316  FUY
3;05.10     APRM991023   74'45"  506        3.822  FUY
3;05.21     APRM991103   75'00"  866        4.246
3;06.15     APRM991128   74'50"  844        3.94   FUY
3;07.05     APRM991218   75'00"  854        3.871  FUY
3;07.17     APRM991230   75'00"  908        4.511  FUY
3;07.28     APRM000110   75'00"  857        4.317
3;08.03     APRM000116   75'00"  869        4.452
3;08.10     APRM000123   75'00"  740        4.481  FUY
3;08.18     APRM000131   75'00"  1167       3.625  FUY
3;09.00     APRM000213   75'00"  884        4.515  FUY
3;09.07     APRM000220   75'00"  942        4.063
3;09.16     APRM000301   74'20"  937        3.82   FUY
3;09.24     APRM000309   75'00"  804        3.631  FUY
3;10.03     APRM000316   75'00"  659        3.882  FUY
3;10.25     APRM000407   72'10"  678        3.941  FUY
3;11.04     APRM000417   66'15"  784        4.094  FUY
3;11.13     APRM000426   75'00"  998        4.358  FUY
4;00.02     APRM000515   75'00"  921        3.781  FUY
4;00.09     APRM000522   73'10"  613        4.447  FUY
4;00.16     APRM000529   71'00"  707        3.98   FUY
4;00.29     APRM000611   72'20"  792        3.987
4;01.06     APRM000619   75'00"  735        4.418  FUY
4;01.20     APRM000703   70'35"  852        4.143  FUY
4;01.26     APRM000709   75'00"  815        5.117
4;02.03     APRM000716   72'20"  883        4.507  FUY
4;02.28     APRM000810   69'00"  667        3.588  FUY
4;03.07     APRM000820   70'30"  936        4.524  FUY
4;04.19     APRM009904   69'15"  1073       5.303  FUY
4;04.19     APRM000911   72'40"  783        5.95   FUY
4;04.19     APRM001002   73'40"  715        4.592
4;05.17     APRM001030   61'00"  528        4.032  FUY
4;06.10     APRM001123   62'00"  329        3.474  FUY
4;06.27     APRM001210   66'30"  573        3.801
4;08.01     APRM010114   68'10"  454        3.958
4;08.15     APRM010128   74'50"  583        3.64   FUY
4;08.29     APRM010211   74'20"  729        3.604  FUY
4;09.10     APRM010223   64'30"  823        3.809
4;09.26     APRM010311   79'30"  763        4.377
5;00.04     APRM010517   75'55"  807        3.656  FUY
5;00.25     APRM010607   64'40"  588        3.949  FUY
5;01.19     APRM010702   75'00"  897        3.706  FUY
Total       55 Sessions          43,064
*MLUm computed with mlu +d1 +t%mor +b- +b# +b+ -t* +t*APR -sonoma* -sco:* -sn:let* @
**Participants: Participants other than APR and MOT
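
The ages in Arika's table can be cross-checked against her birth date (13-MAY-1996) and the session dates. The following Python sketch assumes that the digits in a file name such as APRM990515 encode the recording date as YYMMDD; this is inferred from the table and is not stated in the documentation, and the function names are illustrative. When the day difference is negative, the sketch borrows the length of the calendar month preceding the session, so a result may differ by a day or so from the table wherever a different month-length convention was used.

```python
from datetime import date, timedelta


def session_date(file_name: str) -> date:
    """Read the first six digits of a name such as 'APRM990515' as YYMMDD.

    Assumption: two-digit years 90-99 mean 19xx, all others 20xx.
    """
    digits = "".join(ch for ch in file_name if ch.isdigit())[:6]
    yy, mm, dd = int(digits[:2]), int(digits[2:4]), int(digits[4:6])
    year = 1900 + yy if yy >= 90 else 2000 + yy
    return date(year, mm, dd)


def childes_age(birth: date, session: date) -> str:
    """Return the age at the session in years;months.days notation."""
    years = session.year - birth.year
    months = session.month - birth.month
    days = session.day - birth.day
    if days < 0:
        # Borrow the number of days in the month before the session month.
        last_of_prev_month = session.replace(day=1) - timedelta(days=1)
        months -= 1
        days += last_of_prev_month.day
    if months < 0:
        years -= 1
        months += 12
    return f"{years};{months:02d}.{days:02d}"


if __name__ == "__main__":
    arika_birth = date(1996, 5, 13)  # birth date given in the text
    for name in ["APRM990515", "APRM990603", "APRM000110"]:
        print(name, childes_age(arika_birth, session_date(name)))
```

For the three file names in the usage example, the computed ages agree with the corresponding rows of the table (3;00.02, 3;00.21, and 3;07.28).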

Asato (Miyata & Nisisawa, 2009)

Miyata, Susanne
Nisisawa, Hiro Yuki

The first-born boy Asato (ALS; 3;0.1 – 5;0.27) was observed weekly up to 3;0. Between 3;0 and 3;9 he was observed monthly and, after a 3-month break, every second month (each session approx. 70 minutes; total time 21h 47min). The corpus (3;0 – 5;1) includes 18 sessions with a total of 11,183 utterances from ALS and 14,937 utterances from his mother.

Asato's data
Age      File Name    Time    # ALS Utt  # MLU Utt  MLUm
3;7.15   ALS20000201  72'00"  673        390        4.336
4;04.19  ALS20001106  61'00"  357        279        4.620
Total    18 Sessions          11,183

Tomito (Miyata & Nisisawa, 2010)

Miyata, Susanne
Nisisawa, Hiro Yuki

The boy Tomito (TOM; 2;11.27 – 5;1.23) was observed monthly for approx. 70 minutes (total: 23h 21min). The table below gives the details of Tomito’s data. The corpus includes 19 sessions with a total of 11,065 utterances of TOM and 18,657 utterances of MOT. Tomito’s younger sister Honoka (ONO) was present and sometimes took part in the conversation. Note that only those utterances of Honoka that were related to TOM’s conversation with their mother have been transcribed. Honoka’s transcribed utterances are not in JCHAT format and have not been checked for reliability.

Tomito's data
Age of TOM  File Name  Length  # TOM Utt  MLUm*  Participants**
3;0.13      TOM990614  73'00"  601        4.454  ONO, IN2
3;0.28      TOM990629  75'00"  735        4.061  ONO, INV
3;2.3       TOM990804  72'50"  608        4.960  ONO, INV, IN2
3;3.2       TOM990903  75'00"  747        4.150  ONO, INV
3;4.3       TOM991004  74'50"  526        3.898  ONO, INV
3;5.1       TOM991102  74'55"  884        3.668  ONO, IN2
3;05.29     TOM991130  70'50"  417        3.259  ONO, INV, APR
3;07.4      TOM000105  75'00"  480        3.969  ONO, INV
3;08.01     TOM000202  75'00"  405        3.522  ONO, INV
3;09.05     TOM000306  75'00"  654        3.713  ONO, INV
3;10.07     TOM000407  75'00"  670        5.442  ONO, INV
4;00.04     TOM000605  71'00"  634        4.763  ONO, INV, APR
4;02.07     TOM000808  75'00"  426        3.654  ONO, IN2, APR
4;04.01     TOM001002  71'15"  507        3.609  ONO, INV
4;06.14     TOM001215  75'00"  420        3.586  ONO, IN2, APR
4;09.06     TOM010307  75'00"  580        3.584  ONO, INV
4;11.17     TOM010518  79'40"  583        3.757  ONO, INV
5;01.23     TOM010724  63'00"  626        3.784  ONO, INV, IN2
Total       19 Sessions        11,065
*MLUm computed with mlu +d1 +t%mor +b- +b# +b+ -t* +t*TOM -sonoma* -sco:* -sn:let* @
**Participants: Participants other than TOM and MOT