Get tokens

get_tokens(
  collection = NULL,
  language = NULL,
  corpus = NULL,
  target_child = NULL,
  role = NULL,
  role_exclude = NULL,
  age = NULL,
  sex = NULL,
  token,
  stem = NULL,
  part_of_speech = NULL,
  replace = TRUE,
  connection = NULL,
  db_version = "current",
  db_args = NULL
)

Arguments

collection

A character vector of one or more names of collections

language

A character vector of one or more languages

corpus

A character vector of one or more names of corpora

target_child

A character vector of one or more names of children

role

A character vector of one or more roles to include

role_exclude

A character vector of one or more roles to exclude

age

A numeric vector of a single age value or a min age value and max age value (inclusive) in months. For a single age value, participants are returned whose age range contains that age; for two ages, participants are returned whose age range overlaps the interval between those two ages.

sex

A character vector of values "male" and/or "female"

token

A character vector of one or more token patterns (`%` is a wildcard matching any number of characters, `_` is a wildcard matching exactly one character)

stem

A character vector of one or more stems

part_of_speech

A character vector of one or more parts of speech

replace

A boolean indicating whether to replace "gloss" with "replacement" (i.e. phonologically assimilated form), when available (defaults to TRUE)

connection

A connection to the CHILDES database

db_version

A string naming the database version to use

db_args

List with host, user, and password defined
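
The matching rule described for `age` above can be sketched locally. This is an illustration of the rule as documented, not the package's internal code; `age_matches`, `min_age`, and `max_age` are hypothetical names, and the small data frame is made up for the example.

```r
# Illustrative only: restates the documented `age` rule on local data.
# `min_age` / `max_age` are hypothetical bounds (in months) of a
# participant's recorded age range.
age_matches <- function(min_age, max_age, age) {
  stopifnot(length(age) %in% c(1, 2))
  if (length(age) == 1) {
    # Single value: the age must fall inside the participant's range.
    min_age <= age & age <= max_age
  } else {
    # Two values: the participant's range must overlap [age[1], age[2]].
    min_age <= age[2] & max_age >= age[1]
  }
}

participants <- data.frame(
  name    = c("A", "B", "C"),
  min_age = c(12, 30, 40),
  max_age = c(24, 36, 48),
  stringsAsFactors = FALSE
)

single <- participants$name[
  age_matches(participants$min_age, participants$max_age, 24)
]
interval <- participants$name[
  age_matches(participants$min_age, participants$max_age, c(24, 36))
]

single
#> [1] "A"
interval
#> [1] "A" "B"
```

Both bounds are inclusive in this sketch, matching the "(inclusive)" wording in the argument description.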

Value

A `tbl` of Token data, filtered down by the supplied arguments. If `connection` is supplied, the result remains a remote query; otherwise it is retrieved into a local tibble.
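
One way to use the remote-query behaviour is to aggregate in the database and retrieve only the summary. The sketch below assumes the childesr, dplyr, and DBI packages are installed, that network access is available, and that the `speaker_role` column is present on the remote table as it is on the local tibble. `count_dog_tokens_by_role` is a hypothetical helper name, and the body is wrapped in a function so that nothing contacts the database until it is called.

```r
# Sketch only: defining this function does not open a connection.
# Calling it requires childesr, dplyr, DBI, and network access.
count_dog_tokens_by_role <- function() {
  con <- childesr::connect_to_childes()
  # Close the connection when the function exits, even on error.
  on.exit(DBI::dbDisconnect(con), add = TRUE)

  # With `connection` supplied, the result stays a remote query.
  remote_tokens <- childesr::get_tokens(token = "dog", connection = con)

  # Count rows per speaker role in the database, then pull the
  # (much smaller) summary into a local tibble.
  counts <- dplyr::count(remote_tokens, speaker_role)
  dplyr::collect(counts)
}
```

Omitting `connection` is simpler for one-off calls; passing one is mainly useful when several queries share a connection or when the full token table is too large to retrieve.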

Examples

# \donttest{
get_tokens(token = "dog")
#> Using current database version: '2020.1'.
#> Getting data from 7085 children in 375 corpora...
#> # A tibble: 26,032 x 28
#>       id gloss language token_order prefix part_of_speech stem  actual_phonology
#>    <int> <chr> <chr>          <int> <chr>  <chr>          <chr> <chr>
#>  1     7 dog   eng                7 ""     n              dog   ""
#>  2    92 dog   eng                8 ""     n              dog   ""
#>  3   219 dog   eng                3 ""     n              dog   ""
#>  4   289 dog   eng                6 ""     n              dog   ""
#>  5   352 dog   eng                7 ""     n              dog   ""
#>  6   381 dog   eng                5 ""     n              dog   ""
#>  7   510 dog   eng                8 ""     n              dog   ""
#>  8   587 dog   eng                3 ""     n              dog   ""
#>  9   687 dog   eng               11 ""     n              dog   ""
#> 10   716 dog   eng                6 ""     n              dog   ""
#> # … with 26,022 more rows, and 20 more variables: model_phonology <chr>,
#> #   suffix <chr>, num_morphemes <int>, english <chr>, clitic <chr>,
#> #   utterance_type <chr>, corpus_name <chr>, speaker_code <chr>,
#> #   speaker_name <chr>, speaker_role <chr>, target_child_name <chr>,
#> #   target_child_age <dbl>, target_child_sex <chr>, collection_name <chr>,
#> #   collection_id <int>, corpus_id <int>, speaker_id <int>,
#> #   target_child_id <int>, transcript_id <int>, utterance_id <int>
# }
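
The `token` argument also accepts the `%` and `_` wildcards. As a rough local illustration of how such patterns select words, the sketch below translates a pattern into an anchored regular expression. `like_to_regex` is a hypothetical helper, not part of the package; it does not handle literal `*` or `?` in the pattern, and it does not model whether the database compares case-sensitively.

```r
# Hypothetical helper: map `%` to `*` and `_` to `?`, then reuse
# utils::glob2rx() to build an anchored regular expression.
like_to_regex <- function(pattern) {
  utils::glob2rx(chartr("%_", "*?", pattern), trim.tail = FALSE)
}

words <- c("dog", "dogs", "doggy", "hotdog")

# `%` stands for any number of characters, including none.
any_suffix <- words[grepl(like_to_regex("dog%"), words)]
# `_` stands for exactly one character.
one_char <- words[grepl(like_to_regex("dog_"), words)]
# No wildcard: only the exact token matches.
exact <- words[grepl(like_to_regex("dog"), words)]

any_suffix
#> [1] "dog"   "dogs"  "doggy"
one_char
#> [1] "dogs"
exact
#> [1] "dog"
```

Against the database itself, the corresponding calls would take the form `get_tokens(token = "dog%")` or `get_tokens(token = "dog_")`.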