Get the utterances surrounding one or more tokens

get_contexts(
  collection = NULL,
  language = NULL,
  corpus = NULL,
  role = NULL,
  role_exclude = NULL,
  age = NULL,
  sex = NULL,
  target_child = NULL,
  token,
  window = c(0, 0),
  remove_duplicates = TRUE,
  connection = NULL,
  db_version = "current",
  db_args = NULL
)

Arguments

collection

A character vector of one or more names of collections

language

A character vector of one or more languages

corpus

A character vector of one or more names of corpora

role

A character vector of one or more roles to include

role_exclude

A character vector of one or more roles to exclude

age

A numeric vector of a single age value, or a min age value (inclusive) and max age value (exclusive), in months. For a single age value, participants are returned for which that age is within their age range; for two ages, participants are returned whose age range overlaps with the interval between those two ages.

sex

A character vector of values "male" and/or "female"

target_child

A character vector of one or more names of children

token

A character vector of one or more token patterns (`%` is a wildcard matching any number of characters, `_` is a wildcard matching exactly one character)

window

A length 2 numeric vector of how many utterances before and after each utterance containing the target token to retrieve

remove_duplicates

A boolean indicating whether to remove duplicate utterances from the results

connection

A connection to the CHILDES database

db_version

String naming the database version to use

db_args

List with host, user, and password defined
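
The wildcard rules for `token` can be tried out locally before querying the database. The sketch below is only an approximation in base R: `like_matches()` is a hypothetical helper, not part of the package, and it assumes simple alphabetic patterns. Matching in the database itself may differ, for example in case sensitivity.

```r
# Hypothetical helper (not part of the package): approximates how the
# `%` and `_` wildcards in `token` match, using base R only.
like_matches <- function(pattern, words) {
  # Translate the SQL-style wildcards to glob wildcards: % -> *, _ -> ?
  glob <- chartr("%_", "*?", pattern)
  # utils::glob2rx() turns a glob into an anchored regular expression
  grepl(utils::glob2rx(glob), words)
}

words <- c("dog", "dogs", "doggy", "hotdog", "cat", "at", "chat")

words[like_matches("dog%", words)]  # "dog" "dogs" "doggy"
words[like_matches("%dog", words)]  # "dog" "hotdog"
words[like_matches("_at", words)]   # "cat"
```

Note that `_` requires exactly one character, so `"_at"` matches neither `"at"` nor `"chat"`.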

Value

A 'tbl' of Utterance data, filtered down by supplied arguments.
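
To make the effect of `window` and `remove_duplicates` on the returned rows concrete, here is a toy simulation on made-up data. It is a sketch of the documented behaviour, not the package's implementation; `context_rows()` and the `transcript` data frame are hypothetical.

```r
# Toy transcript: ten utterances in order (made-up data, not from CHILDES)
transcript <- data.frame(
  utterance_order = 1:10,
  gloss = c("hi", "look", "what is that", "a dog", "the dog barks",
            "yes", "okay", "more juice", "all gone", "bye"),
  stringsAsFactors = FALSE
)

# Hypothetical helper illustrating the documented meaning of `window`
context_rows <- function(hits, window, n, remove_duplicates = TRUE) {
  # For each hit, take window[1] rows before through window[2] rows after,
  # clipped to the bounds of the transcript
  rows <- unlist(lapply(hits, function(h) {
    seq(max(1, h - window[1]), min(n, h + window[2]))
  }))
  # Overlapping windows repeat rows; optionally keep each row once
  if (remove_duplicates) rows <- unique(rows)
  rows
}

# Utterances 4 and 5 contain the target token
hits <- grep("\\bdog\\b", transcript$gloss, perl = TRUE)

rows <- context_rows(hits, window = c(1, 2), n = nrow(transcript))
transcript[rows, ]  # utterances 3 through 7
```

With `remove_duplicates = FALSE`, the two overlapping windows would contribute eight rows instead of five, because utterances 4, 5, and 6 fall inside both.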

Examples

# \donttest{
get_contexts(target_child = "Shem", token = "dog")
#> Using current database version: '2018.1'.
#> Getting data from 1 child in 1 corpus ...
#> Warning: `filter_()` is deprecated as of dplyr 0.7.0.
#> Please use `filter()` instead.
#> See vignette('programming') for more help
#> This warning is displayed once every 8 hours.
#> Call `lifecycle::last_warnings()` to see where this warning was generated.
#> # A tibble: 199 x 25
#>    utterance_id speaker_id utterance_order transcript_id corpus_id gloss
#>           <int>      <int>           <int>         <int>     <int> <chr>
#>  1       776315       2454              13          2765        29 dog
#>  2       776323       2455              14          2765        29 that…
#>  3       776344       2455              16          2765        29 what…
#>  4       776378       2455              20          2765        29 is t…
#>  5       776404       2455              22          2765        29 when…
#>  6       780058       2455             156          2770        29 this…
#>  7       780102       2454             160          2770        29 fire…
#>  8       780590       2455             185          2766        29 now …
#>  9       780627       2454             187          2766        29 what…
#> 10       780644       2455             188          2766        29 he w…
#> # … with 189 more rows, and 19 more variables: num_tokens <int>, stem <chr>,
#> #   part_of_speech <chr>, speaker_code <chr>, speaker_name <chr>,
#> #   speaker_role <chr>, target_child_id <int>, target_child_age <dbl>,
#> #   target_child_name <chr>, target_child_sex <chr>, type <chr>,
#> #   media_end <dbl>, media_start <dbl>, media_unit <chr>, collection_id <int>,
#> #   collection_name <chr>, num_morphemes <int>, language <chr>,
#> #   corpus_name <chr>
# }
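
A further sketch combines several arguments in one call: wildcard patterns in `token`, an `age` range in months, a non-zero `window`, and an explicit `connection`. It assumes the function is provided by the childesr package, that `connect_to_childes()` from that package is available for opening the connection, and that DBI is installed. The argument values are illustrative only, and the query is guarded so it runs only in an interactive session with network access.

```r
# Arguments for a narrower query; the values are illustrative
query_args <- list(
  target_child = "Shem",
  token = c("dog%", "cat%"),  # any token starting with "dog" or "cat"
  age = c(24, 36),            # min (inclusive) and max (exclusive), in months
  window = c(1, 2)            # 1 utterance before, 2 after each match
)

# Only run when childesr is installed and the session is interactive,
# since the call needs network access to the CHILDES database
if (interactive() && requireNamespace("childesr", quietly = TRUE)) {
  # Assumption: connect_to_childes() opens a connection with default settings
  con <- childesr::connect_to_childes()
  contexts <- do.call(
    childesr::get_contexts,
    c(query_args, list(connection = con))
  )
  print(contexts)
  # Close the connection once all queries are done
  DBI::dbDisconnect(con)
}
```

Passing an open `connection` lets several queries share it rather than each call opening its own.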