Initialize a `talkr` dataset

Generates a talkr dataset from a dataframe object. The dataset contains the columns used throughout the talkr infrastructure to visualize conversations and language corpora. Initializing a talkr dataset is the first step in the talkr workflow.

Usage

init(
  data,
  source = "source",
  begin = "begin",
  end = "end",
  participant = "participant",
  utterance = "utterance",
  format_timestamps = "ms"
)

Arguments

data: A dataframe object
source: The column name identifying the conversation source (e.g. a filename; used as the unique conversation ID). If the data do not distinguish between sources, set this parameter to `NULL`.
begin: The column name with the begin time of the utterance (in milliseconds by default; see `format_timestamps`)
end: The column name with the end time of the utterance (in milliseconds by default; see `format_timestamps`)
participant: The column name with the participant who produced the utterance
utterance: The column name with the utterance itself
format_timestamps: The format of the timestamps in the begin and end columns. The default, "ms", expects milliseconds. A format string such as `%H:%M:%OS` converts timestamps like 00:00:00.010 to milliseconds (10). See `?strptime` for more format examples.
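
To illustrate the kind of conversion `format_timestamps` describes, here is a minimal base R sketch. The helper `to_ms()` is hypothetical and not part of talkr; how `init()` converts timestamps internally may differ. It only shows how a `%H:%M:%OS` string maps to milliseconds.

```r
# Hypothetical helper (not part of talkr): convert a clock-style
# timestamp string to milliseconds since 00:00:00.
to_ms <- function(x, format = "%H:%M:%OS") {
  # strptime() returns a POSIXlt object; %OS keeps fractional seconds on input
  t <- strptime(x, format = format, tz = "UTC")
  round((t$hour * 3600 + t$min * 60 + t$sec) * 1000)
}

to_ms("00:00:00.010")                      # 10 ms, as in the argument description
to_ms(c("00:01:02.500", "01:00:00.000"))   # vectorised over several timestamps
```

Timestamps that do not match the format yield `NA` from `strptime()`, so a quick check with `anyNA()` on the result can catch malformed entries.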

Value

A dataframe object with columns needed for the talkr workflow
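
The call below is a sketch of how the arguments map onto a dataframe whose columns do not use the default names. The toy data and its column names (`file`, `start`, `stop`, `speaker`, `text`) are invented for illustration, and the exact columns of the returned dataset are not shown here because they depend on the installed talkr version.

```r
# Toy data with non-default column names (all values are made up)
turns <- data.frame(
  file    = c("conv1", "conv1", "conv2"),
  start   = c(0, 1200, 0),       # begin times in milliseconds
  stop    = c(1000, 2500, 800),  # end times in milliseconds
  speaker = c("A", "B", "A"),
  text    = c("hello", "hi there", "good morning"),
  stringsAsFactors = FALSE
)

# Only runs if the talkr package is installed
if (requireNamespace("talkr", quietly = TRUE)) {
  talkr_data <- talkr::init(
    turns,
    source = "file",
    begin = "start",
    end = "stop",
    participant = "speaker",
    utterance = "text",
    format_timestamps = "ms"
  )
  str(talkr_data)  # inspect the columns that init() produced
}
```

If the data came from a single recording with no source column, the same call would pass `source = NULL`, as described under Arguments.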