Initialize a `talkr` dataset
From a dataframe object, generate a talkr dataset. This dataset contains columns that are used throughout the talkr infrastructure to visualize conversations and language corpora. Initializing a talkr dataset is the first step in the talkr workflow.
source = "source",
begin = "begin",
end = "end",
participant = "participant",
utterance = "utterance",
format_timestamps = "ms"
- data
A dataframe object
- source
The name of the column identifying the conversation source (e.g. a filename), which is used as the unique conversation ID. If the data do not distinguish between sources, set this parameter to `NULL`.
- begin
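The column name with the begin time of the utterance (in milliseconds, unless `format_timestamps` specifies another format)
- end
The column name with the end time of the utterance (in milliseconds, unless `format_timestamps` specifies another format)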
- participant
The column name with the participant who produced the utterance
- utterance
The column name with the utterance itself
- format_timestamps
The format of the timestamps in the `begin` and `end` columns. The default, `"ms"`, expects milliseconds. A format such as `%H:%M:%OS` converts a timestamp like 00:00:00.010 to milliseconds (10). See `?strptime` for more format examples.
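
The arguments above can be illustrated with a minimal sketch. The data frame, its column names (`file`, `start`, `stop`, `speaker`, `text`), and the helper `to_ms()` are hypothetical and exist only for this example. The helper uses base R to show which millisecond values a `%H:%M:%OS` format corresponds to; it is not talkr's own conversion code. The final call assumes the function documented here is exported as `init()`, which this page does not state explicitly, so it is guarded and only runs if talkr is installed.

```r
# Hypothetical example data; the column names are made up for illustration.
conv <- data.frame(
  file    = c("call_01.eaf", "call_01.eaf"),
  start   = c("00:00:00.010", "00:01:02.500"),
  stop    = c("00:00:01.200", "00:01:04.000"),
  speaker = c("A", "B"),
  text    = c("hello", "hi there"),
  stringsAsFactors = FALSE
)

# Base R helper (not part of talkr): parse a clock-style timestamp with
# strptime() and return milliseconds since 00:00:00.
to_ms <- function(x, format = "%H:%M:%OS") {
  t <- strptime(x, format = format, tz = "UTC")
  round((t$hour * 3600 + t$min * 60 + t$sec) * 1000)
}

begin_ms <- to_ms(conv$start)
end_ms   <- to_ms(conv$stop)

# Assumption: the function on this page is exported as talkr::init().
# Each argument maps one of the hypothetical columns to its talkr role.
if (requireNamespace("talkr", quietly = TRUE)) {
  talkr_data <- talkr::init(
    conv,
    source = "file",
    begin = "start",
    end = "stop",
    participant = "speaker",
    utterance = "text",
    format_timestamps = "%H:%M:%OS"
  )
}
```

If the timestamps are already numeric milliseconds, `format_timestamps` can be left at its default of `"ms"` and no conversion is needed.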