The Kiwi class provides methods for Korean morphological analysis.
Methods
Method new()
Create a kiwi instance.
Usage
Kiwi$new(
num_workers = 0,
model_size = "base",
integrate_allomorph = TRUE,
load_default_dict = TRUE
)
Arguments
num_workers
int(optional)
: number of worker threads to use. Default is 0, which uses all available cores.
model_size
char(optional)
: Kiwi model to load. Default is "base"; "small" and "large" are also available.
integrate_allomorph
bool(optional)
: Default is TRUE.
load_default_dict
bool(optional)
: whether to use the default dictionary. Default is TRUE.
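As a minimal sketch of the constructor arguments above, the call below builds a single-threaded instance with the smaller model. It assumes the class is exported by the elbird package (the package name is not stated on this page) and that the model files are available locally or can be fetched; the guard turns the snippet into a no-op when the package is missing.

```r
# Assumption: Kiwi is exported by the elbird package.
has_elbird <- requireNamespace("elbird", quietly = TRUE)

if (has_elbird) {
  kw <- elbird::Kiwi$new(
    num_workers = 1,            # one worker thread instead of all cores (0)
    model_size = "small",       # "base" is the default; "large" is also listed
    integrate_allomorph = TRUE, # documented default
    load_default_dict = TRUE    # documented default
  )
}
```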
Method analyze()
Analyze text and return token and tag results.
Arguments
text
char(required)
: target text.
top_n
int(optional)
: number of results to return. Default is 3.
match_option
Match
: Match option to use. Default is Match$ALL.
stopwords
: stopwords option. Default is FALSE, which applies no stopwords. If TRUE, the embedded stopwords dictionary is used. If char, it is treated as the path to a dictionary txt file, and that file is used. If a Stopwords class object, that object is used. Any other value works the same as FALSE.
Method tokenize()
Analyze text and return only the top-1 token and POS result.
Arguments
text
char(required)
: target text.
match_option
Match
: Match option to use. Default is Match$ALL.
stopwords
: stopwords option. Default is FALSE, which applies no stopwords. If TRUE, the embedded stopwords dictionary is used. If char, it is treated as the path to a dictionary txt file, and that file is used. If a Stopwords class object, that object is used. Any other value works the same as FALSE.
form
char(optional)
: form of the returned value. Default is "tibble"; "list" and "tidytext" are also available.
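The difference between analyze() and tokenize() described above can be sketched as follows. This is an illustration only: it assumes the class comes from the elbird package and that the model files are available, and the sample sentence is made up. No output is shown because the exact shape of each result depends on the installed version.

```r
# Assumption: Kiwi is exported by the elbird package.
has_elbird <- requireNamespace("elbird", quietly = TRUE)

# Hypothetical sample input.
sample_text <- "안녕하세요. 만나서 반갑습니다."

if (has_elbird) {
  kw <- elbird::Kiwi$new()

  # analyze(): up to top_n candidate analyses of the same text.
  candidates <- kw$analyze(sample_text, top_n = 2)

  # tokenize(): top-1 result only, in each documented return form.
  tok_tbl  <- kw$tokenize(sample_text)                     # form = "tibble"
  tok_list <- kw$tokenize(sample_text, form = "list")
  tok_tidy <- kw$tokenize(sample_text, form = "tidytext")

  # Same call with the embedded stopwords dictionary applied.
  tok_filtered <- kw$tokenize(sample_text, stopwords = TRUE)
}
```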
Method split_into_sents()
Some text is not already divided sentence by sentence. split_into_sents splits such text into individual sentences.
Arguments
text
char(required)
: target text.
match_option
Match
: Match option to use. Default is Match$ALL.
return_tokens
bool(optional)
: whether to add the tokenized result.
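A short sketch of split_into_sents() on text that runs several sentences together. As before, the elbird package name and the sample text are assumptions, and return_tokens is passed explicitly because its default is not stated here.

```r
# Assumption: Kiwi is exported by the elbird package.
has_elbird <- requireNamespace("elbird", quietly = TRUE)

# Hypothetical input: two sentences in a single string.
run_on_text <- "오늘은 날씨가 좋습니다. 내일은 비가 온다고 합니다."

if (has_elbird) {
  kw <- elbird::Kiwi$new()

  # Sentence boundaries only.
  sents <- kw$split_into_sents(run_on_text, return_tokens = FALSE)

  # Sentence boundaries plus the tokenized result for each sentence.
  sents_with_tokens <- kw$split_into_sents(run_on_text, return_tokens = TRUE)
}
```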
Method get_tidytext_func()
Get a tokenizer function to pass to tidytext unnest_tokens.
Arguments
match_option
Match
: Match option to use. Default is Match$ALL.
stopwords
: stopwords option. Default is TRUE, which uses the embedded stopwords dictionary. If FALSE, the embedded stopwords dictionary is not used. If char, it is treated as the path to a dictionary txt file, and that file is used. If a Stopwords class object, that object is used. Any other value works the same as FALSE.
Examples
if (FALSE) {
  kw <- Kiwi$new()
  kw$analyze("test")
  kw$tokenize("test")
}
## ------------------------------------------------
## Method `Kiwi$get_tidytext_func`
## ------------------------------------------------
if (FALSE) {
  kw <- Kiwi$new()
  tidytoken <- kw$get_tidytext_func()
  tidytoken("test")
}
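The function returned by get_tidytext_func() is meant to be handed to tidytext's unnest_tokens() through its token argument. The sketch below shows one way that might look. The elbird package name, the docs data frame, and its column names are assumptions for illustration.

```r
# Assumption: Kiwi is exported by elbird; tidytext provides unnest_tokens().
has_deps <- requireNamespace("elbird", quietly = TRUE) &&
  requireNamespace("tidytext", quietly = TRUE)

# Hypothetical corpus: one document per row.
docs <- data.frame(
  id = 1:2,
  text = c("안녕하세요. 만나서 반갑습니다.", "오늘은 날씨가 좋습니다."),
  stringsAsFactors = FALSE
)

if (has_deps) {
  kw <- elbird::Kiwi$new()
  tidytoken <- kw$get_tidytext_func(stopwords = TRUE)

  # One row per token; to_lower = FALSE keeps tokens exactly as returned.
  tokens <- tidytext::unnest_tokens(
    docs,
    output = word,
    input = text,
    token = tidytoken,
    to_lower = FALSE
  )
}
```

Passing a different match_option or stopwords value to get_tidytext_func() changes the tokenizer without touching the unnest_tokens() call.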