Simple versions of the tokenizer function.

Usage

tokenize(text, match_option = Match$ALL, stopwords = TRUE)

tokenize_tbl(text, match_option = Match$ALL, stopwords = TRUE)

tokenize_tidytext(text, match_option = Match$ALL, stopwords = TRUE)

tokenize_tidy(text, match_option = Match$ALL, stopwords = TRUE)

Arguments

text

Target text to tokenize.

match_option

Matching option: one of the values in Match. Default is Match$ALL.

stopwords

Stopwords option. Default is TRUE, which uses the embedded stopwords dictionary. If FALSE, the embedded stopwords dictionary is not used. If a character string, it is treated as the path to a dictionary txt file and that file is used. If a Stopwords object, it is used directly. Any other value behaves the same as FALSE. See analyze() for how to use the stopwords parameter.

Value

A list containing the tokenized result.

Examples

if (FALSE) {
  tokenize("Test text.")
  tokenize("Please use Korean.", Match$ALL_WITH_NORMALIZING)
}
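The stopwords argument accepts several kinds of values, as described above. A sketch of each form, following the page's own example style (the dictionary file path is a hypothetical example):

```r
if (FALSE) {
  # Default: filter tokens using the embedded stopwords dictionary.
  tokenize("Test text.", stopwords = TRUE)

  # Disable stopword filtering entirely.
  tokenize("Test text.", stopwords = FALSE)

  # Use a custom dictionary txt file (hypothetical path).
  tokenize("Test text.", stopwords = "my_stopwords.txt")
}
```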