Generate counts for the most frequent n-grams in text.

Returns a list containing a `viz` object and a `view` object. The `viz` object can be passed to ParseR's `viz_ngram` function to produce a network visualisation.

Usage

count_ngram(
 df,
 text_var = Message,
 n = 2,
 top_n = 50,
 min_freq = 10,
 distinct = FALSE,
 hashtags = FALSE,
 mentions = FALSE,
 clean_text = FALSE,
 remove_stops = TRUE,
 tolower = TRUE,
 ...
)

Arguments

df: A dataframe.
text_var: The variable containing the text.
n: The number of terms in each n-gram. For example, `n = 2` produces bigrams.
top_n: The number of most frequent n-grams to include.
min_freq: The minimum number of times an n-gram must be observed to be included.
distinct: If TRUE, counts the number of unique posts containing each n-gram rather than the total number of occurrences.
hashtags: Should hashtags be included in the n-grams?
mentions: Should mentions be included in the n-grams?
clean_text: Should the text variable be cleaned?
remove_stops: Should stopwords be removed?
tolower: Should all tokens be converted to lower case in calls to `unnest_tokens`?
...: Additional arguments passed to the `ParseR::clean_text()` function.
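
To make the roles of `n`, `min_freq`, `top_n` and `distinct` more concrete, the following is a minimal sketch of the same kind of counting done directly with tidytext and dplyr. It is not ParseR's implementation: the toy data, the `post_id` and `ngram_freq` column names, and the thresholds are assumptions chosen for illustration only.

```r
library(dplyr)
library(tidytext)

# Hypothetical toy data; `Message` mirrors the default `text_var`.
posts <- tibble::tibble(
  Message = c(
    "machine learning is fun",
    "machine learning is hard",
    "deep learning is fun"
  )
)

# Tokenise each post into bigrams (the analogue of n = 2, tolower = TRUE).
bigrams <- posts %>%
  mutate(post_id = row_number()) %>%
  unnest_tokens(ngram, Message, token = "ngrams", n = 2, to_lower = TRUE)

# Analogue of distinct = FALSE: count every occurrence of each bigram.
counts <- bigrams %>%
  count(ngram, sort = TRUE, name = "ngram_freq")

# Analogue of distinct = TRUE: count each bigram at most once per post.
distinct_counts <- bigrams %>%
  distinct(post_id, ngram) %>%
  count(ngram, sort = TRUE, name = "ngram_freq")

# Analogues of min_freq and top_n: drop rare bigrams, then keep the most frequent.
top <- counts %>%
  filter(ngram_freq >= 2) %>%
  slice_max(ngram_freq, n = 50, with_ties = FALSE)
```

In this toy data no bigram repeats within a single post, so `counts` and `distinct_counts` agree; they would differ for posts that repeat the same phrase.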

Value

A list with two elements: `view`, a summary table of n-gram counts, and `viz`, a tidygraph object suitable for a network visualisation with `viz_ngram`.
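
Examples

A hedged usage sketch, assuming ParseR is installed and that the returned list elements are named `view` and `viz` as described above. The `posts` data frame is hypothetical, and `min_freq` and `top_n` are lowered from their defaults only so that such a small input can return something. The call is guarded so the snippet does nothing beyond creating the data when ParseR is unavailable.

```r
# Hypothetical example data; `Message` matches the default `text_var`.
posts <- data.frame(
  Message = c(
    "Great customer service from the support team",
    "The support team resolved my issue quickly",
    "Customer service was slow but the support team helped",
    "Great customer service again today"
  ),
  stringsAsFactors = FALSE
)

if (requireNamespace("ParseR", quietly = TRUE)) {
  result <- ParseR::count_ngram(
    df = posts,
    text_var = Message,
    n = 2,
    top_n = 10,
    min_freq = 1
  )

  # Summary table of n-gram counts.
  print(result$view)

  # Network visualisation built from the `viz` element.
  print(ParseR::viz_ngram(result$viz))
}
```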