Applies a dictionary to the text column of x to create a 'tokens' variable consisting of the dictionary matches for each word. Words that match no dictionary entry generate no token.
jl_tokenize_categories(x, dictionary, ...)
Argument | Description |
---|---|
x | a tibble |
dictionary | a quanteda content analysis dictionary |
... | extra arguments to tokenizers::tokenizer_* |
a tibble
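
The matching behaviour described above can be sketched with quanteda directly. This is an illustration under stated assumptions, not the implementation of jl_tokenize_categories: `quanteda::tokens()` and `quanteda::tokens_lookup()` stand in for whatever tokenizer and matching the function uses internally, and the dictionary keys, example texts, and list-column layout of `tokens` are all made up for the example.

```r
library(quanteda)
library(tibble)

# Hypothetical input: a tibble with a 'text' column
x <- tibble(text = c("Taxes fund hospitals and the budget",
                     "Nothing relevant here"))

# Hypothetical dictionary with two categories (glob patterns)
dict <- dictionary(list(economy = c("tax*", "budget"),
                        health  = c("hospital*", "doctor*")))

# Tokenize each text, then replace every word with the category it matches.
# With the default exclusive = TRUE, words matching no entry are dropped.
toks <- tokens(x$text)
cats <- tokens_lookup(toks, dictionary = dict)

# Store the category tokens as a list column (an assumed layout)
x$tokens <- unname(as.list(cats))

x$tokens[[1]]  # category matches for the first text, in word order
x$tokens[[2]]  # empty: no word in the second text matches the dictionary
```

In this sketch the first text yields one category token per matching word, and the second text yields none, which mirrors the rule that non-matching words generate no token.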