From a dataframe object, generate a talkr dataset. This dataset contains columns that are used throughout the talkr infrastructure to visualize conversations and language corpora. Initializing a talkr dataset is the first step in the talkr workflow.

Usage

init(
  data,
  source = "source",
  begin = "begin",
  end = "end",
  participant = "participant",
  utterance = "utterance",
  format_timestamps = "ms"
)

Arguments

data

A dataframe object

source

The name of the column identifying the conversation source (e.g. a filename), which is used as the unique conversation ID. If the data do not distinguish between sources, set this parameter to `NULL`.

begin

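The name of the column with the begin time of the utterance (in milliseconds, unless `format_timestamps` specifies another format)

end

The name of the column with the end time of the utterance (in milliseconds, unless `format_timestamps` specifies another format)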

participant

The column name with the participant who produced the utterance

utterance

The column name with the utterance itself

format_timestamps

The format of the timestamps in the begin and end columns. The default, "ms", expects milliseconds. A format string such as `%H:%M:%OS` converts timestamps like 00:00:00.010 to milliseconds (10). See `?strptime` for more format examples.
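
To illustrate what a `%H:%M:%OS` format implies, the following base R sketch converts clock-style timestamps to milliseconds. The helper `to_ms()` is hypothetical and not part of talkr; how `init()` performs the conversion internally is not documented here, so treat this only as an approximation of the expected result.

```r
# Hypothetical helper (not a talkr function): parse clock-style timestamps
# with strptime() and express them as milliseconds since 00:00:00.
to_ms <- function(x, format = "%H:%M:%OS") {
  lt <- strptime(x, format = format, tz = "UTC")
  # lt$sec keeps the fractional seconds read by %OS
  round((lt$hour * 3600 + lt$min * 60 + lt$sec) * 1000)
}

# "00:00:00.010" corresponds to 10 milliseconds
to_ms(c("00:00:00.010", "00:01:02.500"))
```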

Value

A dataframe object with columns needed for the talkr workflow
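
A minimal usage sketch, assuming the talkr package is installed. The toy data frame and its column names (`file`, `start`, `finish`, `speaker`, `text`) are invented for illustration; each one is mapped onto the corresponding `init()` argument shown under Usage.

```r
# Hypothetical toy data; these column names are illustrative, not talkr defaults.
df <- data.frame(
  file    = c("conv1", "conv1", "conv1"),
  start   = c(0, 1200, 2500),       # begin times in milliseconds
  finish  = c(1000, 2400, 4000),    # end times in milliseconds
  speaker = c("A", "B", "A"),
  text    = c("hello", "hi there", "how are you?")
)

# Call init() only when talkr is available, naming each column explicitly.
if (requireNamespace("talkr", quietly = TRUE)) {
  talkr_data <- talkr::init(
    df,
    source = "file",
    begin = "start",
    end = "finish",
    participant = "speaker",
    utterance = "text"
  )
  str(talkr_data)
}
```

If the data already use the default column names (`source`, `begin`, `end`, `participant`, `utterance`), the call reduces to `init(df)`.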