Simple versions of the tokenizer function.

Usage

tokenize(text, match_option = Match$ALL, stopwords = TRUE)

tokenize_tbl(text, match_option = Match$ALL, stopwords = TRUE)

tokenize_tidytext(text, match_option = Match$ALL, stopwords = TRUE)

tokenize_tidy(text, match_option = Match$ALL, stopwords = TRUE)

Arguments

text

Target text to tokenize.

match_option

Matching option: one of the values in Match. Default is Match$ALL.

stopwords

Stopwords option. Default is TRUE, which uses the embedded stopwords dictionary. If FALSE, the embedded stopwords dictionary is not used. If a character string, it is treated as the path to a dictionary txt file and that file is used. If a Stopwords object, it is used directly. Any other value behaves the same as FALSE. See analyze() for how to use the stopwords parameter.

Value

A list containing the tokenized result.

Examples

if (FALSE) {
  tokenize("Test text.")
  tokenize("Please use Korean.", Match$ALL_WITH_NORMALIZING)
}
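The stopwords argument accepts several kinds of values, as described above. A sketch of each form, following the page's own example style (the dictionary file path is a hypothetical example):

```r
if (FALSE) {
  # Default: filter tokens using the embedded stopwords dictionary.
  tokenize("Test text.", stopwords = TRUE)

  # Disable stopword filtering entirely.
  tokenize("Test text.", stopwords = FALSE)

  # Use a custom dictionary txt file (hypothetical path).
  tokenize("Test text.", stopwords = "my_stopwords.txt")
}
```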