The Kiwi class provides methods for Korean morphological analysis.
Methods
Method new()
Create a kiwi instance.
Usage
Kiwi$new(
num_workers = 0,
model_size = "base",
integrate_allomorph = TRUE,
load_default_dict = TRUE
)
Arguments
num_workers: int (optional). Number of worker threads to use. Default is 0, which uses all available cores.
model_size: char (optional). Kiwi model to load. Default is "base"; "small" and "large" are also available.
integrate_allomorph: bool (optional). Default is TRUE.
load_default_dict: bool (optional). Whether to load the default dictionary. Default is TRUE.
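As a minimal sketch of the constructor arguments listed above, the call below sets each one explicitly. It assumes the Kiwi class is exported by the elbird package (the package name is not stated on this page) and that the "small" model is available on the machine.

```r
# Assumption: Kiwi is exported by the elbird package.
library(elbird)

# Single worker thread, smaller model, defaults kept for the other options.
kw <- Kiwi$new(
  num_workers = 1,
  model_size = "small",
  integrate_allomorph = TRUE,
  load_default_dict = TRUE
)
```

Calling Kiwi$new() with no arguments should give the documented defaults: all cores and the "base" model.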
Method analyze()
Analyze text and return the tokens with their tags.
Arguments
text: char (required). Target text.
top_n: int (optional). Number of results to return. Default is 3.
match_option: Match. Match option to use. Default is Match$ALL.
stopwords: stopwords option. Default is FALSE, which applies no stopwords. If TRUE, use the embedded stopwords dictionary. If a char, treat it as the path to a dictionary txt file and use that file. If a Stopwords class instance, use it. Any other value works the same as FALSE.
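The following sketch shows how the analyze() arguments above might be combined. It assumes Kiwi and Match come from the elbird package; the sample sentence is made up for illustration.

```r
# Assumption: Kiwi and Match are exported by the elbird package.
library(elbird)

kw <- Kiwi$new()

# Illustrative input sentence (not from the original documentation).
text <- "안녕하세요. 만나서 반갑습니다."

# Keep only the best candidate and skip stopword filtering.
res <- kw$analyze(text, top_n = 1, match_option = Match$ALL, stopwords = FALSE)
```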
Method tokenize()
Analyze text and return the tokens with their POS tags for the top result only.
Arguments
text: char (required). Target text.
match_option: Match. Match option to use. Default is Match$ALL.
stopwords: stopwords option. Default is FALSE, which applies no stopwords. If TRUE, use the embedded stopwords dictionary. If a char, treat it as the path to a dictionary txt file and use that file. If a Stopwords class instance, use it. Any other value works the same as FALSE.
form: char (optional). Return form. Default is "tibble"; "list" and "tidytext" are also available.
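To illustrate the form argument described above, this sketch requests the same tokenization in each of the three documented return forms. It assumes Kiwi is exported by the elbird package; the sample sentence is made up for illustration.

```r
# Assumption: Kiwi is exported by the elbird package.
library(elbird)

kw <- Kiwi$new()

# Illustrative input sentence (not from the original documentation).
text <- "안녕하세요. 만나서 반갑습니다."

tokens_tbl  <- kw$tokenize(text)                     # default form = "tibble"
tokens_list <- kw$tokenize(text, form = "list")      # list form
tokens_tidy <- kw$tokenize(text, form = "tidytext")  # tidytext form
```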
Method split_into_sents()
Some text is not already separated into sentences. split_into_sents splits such text into individual sentences.
Arguments
text: char (required). Target text.
match_option: Match. Match option to use. Default is Match$ALL.
return_tokens: bool (optional). Whether to add the tokenized result.
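A short sketch of sentence splitting with the arguments above, once with and once without tokens attached. It assumes Kiwi is exported by the elbird package; the input string is made up for illustration.

```r
# Assumption: Kiwi is exported by the elbird package.
library(elbird)

kw <- Kiwi$new()

# Illustrative input: two sentences in one string.
text <- "안녕하세요. 만나서 반갑습니다."

# Sentences only.
sents <- kw$split_into_sents(text, return_tokens = FALSE)

# Sentences with the tokenized result added.
sents_tok <- kw$split_into_sents(text, return_tokens = TRUE)
```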
Method get_tidytext_func()
Return a tokenizer function for use with tidytext's unnest_tokens.
Arguments
match_option: Match. Match option to use. Default is Match$ALL.
stopwords: stopwords option. Default is TRUE, which uses the embedded stopwords dictionary. If FALSE, do not use the embedded stopwords dictionary. If a char, treat it as the path to a dictionary txt file and use that file. If a Stopwords class instance, use it. Any other value works the same as FALSE.
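The function returned here is meant to be passed to unnest_tokens, which accepts a custom function through its token argument. The sketch below shows one way that might look. It assumes Kiwi is exported by the elbird package and that tidytext is installed; the data frame and its column names are made up for illustration.

```r
# Assumption: Kiwi is exported by the elbird package.
library(elbird)
library(tidytext)

kw <- Kiwi$new()

# Tokenizer without stopword filtering.
tidytoken <- kw$get_tidytext_func(stopwords = FALSE)

# Hypothetical input: one document per row.
docs <- data.frame(
  id = 1:2,
  content = c("안녕하세요.", "만나서 반갑습니다."),
  stringsAsFactors = FALSE
)

# One row per token; `word` is the new output column.
words <- unnest_tokens(docs, output = word, input = content, token = tidytoken)
```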
Examples
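if (FALSE) {
  # Create an analyzer with the default settings.
  kw <- Kiwi$new()
  # Full analysis, then top-1 tokenization of the same text.
  kw$analyze("test")
  kw$tokenize("test")
}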
## ------------------------------------------------
## Method `Kiwi$get_tidytext_func`
## ------------------------------------------------
if (FALSE) {
  kw <- Kiwi$new()
  # Get a tokenizer function and call it directly on a string.
  tidytoken <- kw$get_tidytext_func()
  tidytoken("test")
}
