About childes-db

The childes-db project aims to make CHILDES transcripts more accessible by reducing the amount of preprocessing required (e.g., with CLAN or other specialized preprocessing libraries) and by making the individual tokens, utterances, transcripts, and corpora available in a tidy, tabular format that is accessible across programming languages. We release new versions of this dataset periodically to facilitate reproducibility. We also provide an R package (childesr) and a Python package (childespy) that allow users to access this database without having to write complex SQL queries.
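To give a rough sense of what a tidy, tabular layout of utterances and tokens makes possible, here is a minimal sketch using an in-memory SQLite database. The table names, column names, and sample rows are hypothetical and chosen only for illustration; they are not the actual childes-db schema, and the sketch does not use childesr or childespy.

```python
import sqlite3

# Hypothetical, simplified schema for illustration only.
# This is NOT the actual childes-db schema.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE utterance (
        id INTEGER PRIMARY KEY,
        speaker_role TEXT
    );
    CREATE TABLE token (
        id INTEGER PRIMARY KEY,
        utterance_id INTEGER REFERENCES utterance(id),
        gloss TEXT
    );
    """
)

# Made-up sample data: one row per utterance, one row per token.
conn.executemany(
    "INSERT INTO utterance (id, speaker_role) VALUES (?, ?)",
    [(1, "Target_Child"), (2, "Mother")],
)
conn.executemany(
    "INSERT INTO token (utterance_id, gloss) VALUES (?, ?)",
    [(1, "more"), (1, "juice"), (2, "you"), (2, "want"), (2, "juice")],
)


def tokens_per_role(connection):
    """Count tokens per speaker role by joining tokens to their utterances."""
    rows = connection.execute(
        """
        SELECT u.speaker_role, COUNT(*)
        FROM token AS t
        JOIN utterance AS u ON u.id = t.utterance_id
        GROUP BY u.speaker_role
        ORDER BY u.speaker_role
        """
    ).fetchall()
    return dict(rows)


counts = tokens_per_role(conn)
print(counts)
```

Because each token is a row linked to its utterance, a question such as "how many tokens did each speaker role produce?" becomes a single join and group-by, with no transcript parsing involved.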

Citation policy

If you use childes-db to access CHILDES in your research, please note the database version you used (e.g., 2018.1) and cite:

The childes-db paper in Behavior Research Methods:
^*Sanchez, A., ^*Meylan, S.C., Braginsky, M., MacDonald, K. E., Yurovsky, D., & Frank, M. C. (2019). "childes-db: a flexible and reproducible interface to the Child Language Data Exchange System." Behavior Research Methods 51 (4), 1928–1941.
^* indicates co-first authorship.
CHILDES itself – both the database and the corpora you use – following the TalkBank policy.