CHILDES Thai CRSLP-MARCS Corpus


Sudaporn Luksaneeyanawin
Centre for Research in Speech and Language Processing (CRSLP)
Chulalongkorn University

Participants: 18
Type of Study: naturalistic
Location: Thailand
Media type: video
DOI: doi:10.21415/T5S59B

Citation information

In accordance with TalkBank rules, any use of data from this corpus must be accompanied by at least one of the above references.

Project Description

This corpus is part of the collaborative research project on tone development between the Centre for Research in Speech and Language Processing (CRSLP), Chulalongkorn University, Thailand, led by Assistant Professor Dr. Sudaporn Luksaneeyanawin, and MARCS Auditory Laboratories, University of Western Sydney, Australia, led by Professor Dr. Denis Burnham.

The data consist of video-linked transcriptions of 18 Thai adult-child dyads, recorded at three-month intervals from when the child was 6 months old until 24 months. Each session lasted 20 minutes; for CHILDES, each session has been split into two 10-minute files, giving a total of 242 files.

The data formed the major part of a doctoral thesis by Sorabud Rungrojsuwan, ‘First Words: Communicative Development of 9- to 24-Month-Old Thai Children’. Data were collected by Sorabud Rungrojsuwan and Nirattisai Krajaikiat, postgraduate research assistants at CRSLP, between January 2000 and January 2002, using a SONY Digital Handycam DCR-TRV320E video camera. The videotaped data were then digitized and converted into 242 video files (in .mpg format) using the Ulead Video Studio 4.0 SE Basic program. Sorabud transcribed the data in Thai script using the CLAN program (CHAT mode). Phonological representations of the Thai transcriptions were then added automatically by a Thai text-to-phonological-representation program developed at CRSLP by Assistant Professor Dr. Sudaporn Luksaneeyanawin. The Thai phonological representations used in this corpus are shown below.

Details

Details about the corpus are as follows.
Participants: 18 Thai children recorded at three-month intervals from 6 to 24 months of age
Data Collection: the data were collected during the period 2000-2002 from 18 Thai families living in Bangkok or on its outskirts
Videos: 121 20-minute sessions, resulting in 242 10-minute video files (.mpg format)
Transcriptions: 242 transcription files (.cha format) covering the 121 sessions, each including the Thai transcription and its phonological representation
File Duration: 10 minutes
Filenames: each filename identifies the participant (for example L01), the child's age in months (for example 9m), and the 10-minute period within the session (1 or 2), with the extension .mpg for video or .cha for transcription

Overall data
Code  Child      Sex  Sessions  Files  Birth
L01   Sea        F    6         12     July 5
L02   Boss       M    7         14     July 22
L03   Plaa       F    7         14     Oct 1
L04   Chomboon   M    7         14     Oct 2
L05   Pim        F    7         14     Oct 3
L06   Oil        F    7         14     Oct 5
L07   Earth      M    4         8      Oct 10
L08   Mon        M    7         14     Oct 19
L09   Fluk       M    7         14     Oct 22
L10   Mantaa     F    7         14     Oct 24
L11   Phii       M    7         14     Oct 26
L12   Pai        M    7         14     Oct 27
L13   Baiwaan    F    7         14     Oct 30
L14   Ui         F    7         14     Nov 6
L15   Pan        M    7         14     Nov 11
L16   Tun        M    7         14     Nov 14
L17   Mai        F    7         14     Nov 16
L18   Ong        M    6         12     Dec 1
* "Sessions" is the number of completed sessions included in the corpus, out of the seven scheduled at 6, 9, 12, 15, 18, 21 and 24 months. L01 and L18 each lack one session because the recording was too noisy and contained too much overlapping speech. For L07, data collection was not possible after the age of 15 months because the child moved to another province far from Bangkok.
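The file totals quoted in this description can be cross-checked against the per-child session counts. The following is a minimal sketch, not part of the corpus tooling; the variable names are illustrative, and the counts are simply the completed sessions per child from the table above.

```python
# Scheduled recording ages in months: 6, 9, ..., 24 (three-month intervals).
ages = list(range(6, 25, 3))

# Completed sessions per child. Most children completed every scheduled
# session; L01 and L18 each lack one, and L07 stopped after 15 months.
sessions = {f"L{i:02d}": len(ages) for i in range(1, 19)}
sessions.update({"L01": 6, "L07": 4, "L18": 6})

FILES_PER_SESSION = 2  # each 20-minute session is split into two 10-minute files

total_sessions = sum(sessions.values())
total_files = total_sessions * FILES_PER_SESSION
missing_sessions = len(sessions) * len(ages) - total_sessions

print(total_sessions, total_files, missing_sessions)  # 121 242 5
```

The result agrees with the 121 sessions and 242 files stated under Details.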

Thai Phonological Representation

(adapted from Luksaneeyanawin 2000)

Consonants
Manner of Articulation        Place of Articulation
                              Labials   Alveolar   Palatal   Velar   Glottal
Stops
  Voiceless unaspirated       p*        t*         c         k*      ?*
  Voiceless aspirated         ph        th         ch        kh      -
  Voiced                      b         d          -         -       -
Non-stops
  Nasals                      m*        n*         -         N*      -
  Fricatives                  f         s          -         -       h
  Continuants
    Lateral                   -         l          -         -       -
    Trill                     -         r          -         -       -
    Approximants              w*        -          j*        -       -
* These consonants can occur in both syllable initial and syllable final positions.

Clusters
pr, pl; tr; kw, kr, kl
phr, phl; thr; khw, khr, khl

Vowel Monophthongs
Tongue Height    Tongue Advancement
                 Front     Central    Back
High             i, ii     U, UU      u, uu
Mid              e, ee     q, qq      o, oo
Low              x, xx     a, aa      O, OO

Diphthongs
Short diphthongs: /ia/, /Ua/, /ua/
Long diphthongs: /iia/, /UUa/, /uua/

Tones
mid level = 0
low level = 1
falling = 2
high level = 3
rising = 4
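To illustrate how the inventories above fit together, here is a minimal sketch that splits a single syllable into onset, vowel, final consonant and tone. It rests on an assumption that this description does not confirm: that a syllable is written as one ASCII string with the tone digit at the end. The function name `parse_syllable` and the two input strings are hypothetical and are not taken from the corpus files.

```python
import re

# Tone digits as listed above.
TONES = {0: "mid level", 1: "low level", 2: "falling", 3: "high level", 4: "rising"}

# Symbol inventories copied from the tables above.
INITIALS = ["p", "ph", "b", "t", "th", "d", "c", "ch", "k", "kh", "?",
            "m", "n", "N", "f", "s", "h", "l", "r", "w", "j"]
CLUSTERS = ["pr", "pl", "tr", "kw", "kr", "kl",
            "phr", "phl", "thr", "khw", "khr", "khl"]
VOWELS = ["i", "ii", "U", "UU", "u", "uu", "e", "ee", "q", "qq", "o", "oo",
          "x", "xx", "a", "aa", "O", "OO",
          "ia", "Ua", "ua", "iia", "UUa", "uua"]
FINALS = ["p", "t", "k", "?", "m", "n", "N", "w", "j"]  # the starred consonants


def alternation(symbols):
    """Build a regex alternation, longest symbols first ("khw" before "kh")."""
    ordered = sorted(symbols, key=len, reverse=True)
    return "|".join(re.escape(s) for s in ordered)


# ASSUMED layout: onset + vowel + optional final + tone digit, no separators.
SYLLABLE = re.compile(
    rf"(?P<onset>{alternation(INITIALS + CLUSTERS)})"
    rf"(?P<vowel>{alternation(VOWELS)})"
    rf"(?P<final>{alternation(FINALS)})?"
    rf"(?P<tone>[0-4])"
)


def parse_syllable(text):
    """Return the parts of one syllable, or None if it does not fit the layout."""
    m = SYLLABLE.fullmatch(text)
    if m is None:
        return None
    return {
        "onset": m["onset"],
        "vowel": m["vowel"],
        "final": m["final"] or "",
        "tone": TONES[int(m["tone"])],
    }


print(parse_syllable("khwaam0"))
print(parse_syllable("maa4"))
```

Under these assumptions the first call yields onset `khw`, vowel `aa`, final `m` and the mid level tone, and the second yields onset `m`, vowel `aa`, no final and the rising tone. A string that does not fit the assumed layout, such as one ending in a digit outside 0 to 4, returns `None`.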

Acknowledgements

The project was financially supported by the Australian Research Council and by Chulalongkorn University under the university centre of excellence scheme.