The childes-db project aims to make CHILDES transcripts more accessible by reducing the amount of preprocessing required (e.g., with CLAN or specific preprocessing libraries) and by making the individual tokens, utterances, transcripts, and corpora available in a tidy, tabular format that is accessible across programming languages. We release new versions of this dataset periodically to facilitate reproducibility. We also provide an R package (childesr) and a Python package (childespy), which allow users to access this database without having to write complex SQL queries.
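As a rough illustration of what a tidy, tabular layout buys you, the sketch below builds a toy in-memory database with Python's standard sqlite3 module and runs the kind of join that the packages are meant to spare users from writing. The table names, column names, and rows are hypothetical and chosen only for illustration; they are not the actual childes-db schema, and the real database is not accessed this way.

```python
import sqlite3

# Toy, in-memory stand-in for a tidy token/utterance layout.
# ASSUMPTION: table names, column names, and rows are invented for this example.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE utterance (
        id INTEGER PRIMARY KEY,
        transcript_id INTEGER,
        speaker_role TEXT
    );
    CREATE TABLE token (
        id INTEGER PRIMARY KEY,
        utterance_id INTEGER REFERENCES utterance(id),
        gloss TEXT
    );
    """
)

# One row per utterance, one row per token (the "tidy" part).
conn.executemany(
    "INSERT INTO utterance (id, transcript_id, speaker_role) VALUES (?, ?, ?)",
    [(1, 10, "Target_Child"), (2, 10, "Mother")],
)
conn.executemany(
    "INSERT INTO token (utterance_id, gloss) VALUES (?, ?)",
    [(1, "more"), (1, "juice"), (2, "you"), (2, "want"), (2, "more"), (2, "juice")],
)


def tokens_per_role(connection):
    """Count tokens per speaker role by joining tokens to their utterances."""
    rows = connection.execute(
        """
        SELECT u.speaker_role, COUNT(*)
        FROM token AS t
        JOIN utterance AS u ON t.utterance_id = u.id
        GROUP BY u.speaker_role
        ORDER BY u.speaker_role
        """
    ).fetchall()
    return dict(rows)


counts = tokens_per_role(conn)
print(counts)
```

Because each token and utterance is a row with explicit keys, the same question can be asked from any language with a SQL client or a data-frame library, which is the motivation for the tabular format.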
If you use childes-db to access CHILDES in your research, please note the database version you used (e.g., 2018.1) and cite the following paper in Behavior Research Methods:
*Sanchez, A., *Meylan, S. C., Braginsky, M., MacDonald, K. E., Yurovsky, D., & Frank, M. C. (2019). childes-db: A flexible and reproducible interface to the Child Language Data Exchange System. Behavior Research Methods, 51(4), 1928–1941.
* indicates co-first authorship.
UC Berkeley (now at Amazon)
Stanford University
Duke University
Carnegie Mellon University
Stanford University