Generate pairwise correlations for a vector of terms of interest.

calculate_corr(
 df,
 text_var,
 terms,
 min_freq = 10,
 corr_limits = c(-1, 1),
 n_corr = 75,
 hashtags = FALSE,
 mentions = FALSE
)

Arguments

df

A data frame in which each row is a separate post.

text_var

The variable (column) in df containing the text you want to explore.

terms

A character vector of the terms of interest. Multi-word phrases are allowed.

min_freq

The minimum number of times a term must be observed to be considered.

corr_limits

A numeric vector of length two giving the lower and upper bounds for correlations.

n_corr

The number of correlations to include, starting with the most positive within the range specified in corr_limits.
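
The way corr_limits and n_corr interact can be sketched in base R. This is a minimal illustration, not the package's implementation: select_corr() is a hypothetical helper, the correlation values are made up, and treating the bounds as inclusive is an assumption.

```r
# Hypothetical correlation table; the values are made up for illustration.
corrs <- data.frame(
  from = c("bar", "bar", "bar", "bar", "bar"),
  to = c("doors", "hookah", "menu", "queue", "rain"),
  correlation = c(0.20, 0.21, -0.35, 0.05, 0.60),
  stringsAsFactors = FALSE
)

# Hypothetical helper (not part of the package): keep correlations inside
# corr_limits (bounds assumed inclusive), sort from most positive downwards,
# then keep at most n_corr rows.
select_corr <- function(corrs, corr_limits = c(-1, 1), n_corr = 75) {
  in_range <- corrs$correlation >= corr_limits[1] &
    corrs$correlation <= corr_limits[2]
  kept <- corrs[in_range, , drop = FALSE]
  kept <- kept[order(kept$correlation, decreasing = TRUE), , drop = FALSE]
  head(kept, n_corr)
}

top <- select_corr(corrs, corr_limits = c(0, 0.5), n_corr = 2)
print(top)
```

With these made-up values, narrowing corr_limits to c(0, 0.5) drops both the negative correlation and the one above 0.5 before the two most positive remaining rows are kept.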

hashtags

Logical. Should hashtags be included? Defaults to FALSE.

mentions

Logical. Should mentions be included? Defaults to FALSE.

Value

A list with two elements: view, a summary table of the correlations, and viz, a tidygraph object suitable for a network visualisation.
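
As a rough sketch of working with the returned list, the example below builds a stand-in object in base R rather than calling calculate_corr(), so it runs without the package or data. The element names and the first five rows are copied from the example output on this page; the variable names res, hashtag_rows and strong are hypothetical, and the viz element is replaced by NULL because the real one is a tidygraph object.

```r
# Stand-in for the value returned by calculate_corr(). The real viz element
# is a tidygraph object; it is NULL here to stay within base R.
res <- list(
  viz = NULL,
  view = data.frame(
    from = rep("bar", 5),
    to = c("b", "n", "hookah", "#downtownorlando", "doors"),
    correlation = c(0.264, 0.216, 0.206, 0.206, 0.202),
    stringsAsFactors = FALSE
  )
)

# The summary table is an ordinary data frame, so it can be subset as usual.
# Keep only the rows whose partner term is a hashtag.
hashtag_rows <- res$view[startsWith(res$view$to, "#"), , drop = FALSE]
print(hashtag_rows)

# Keep only the stronger correlations (threshold chosen arbitrarily).
strong <- res$view[res$view$correlation > 0.21, , drop = FALSE]
print(strong)
```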

Examples

calculate_corr(
 df = sprinklr_export,
 text_var = Message,
 terms = c("foo", "bar", "I'm looking for"),
 min_freq = 10,
 corr_limits = c(-1, 1),
 n_corr = 75,
 hashtags = TRUE,
 mentions = FALSE
)
#> Using `to_lower = TRUE` with `token = 'tweets'` may not preserve URLs.
#> $viz
#> # A tbl_graph: 78 nodes and 77 edges
#> #
#> # An unrooted tree
#> #
#> # Node Data: 78 × 2 (active)
#>   word             term_freq
#>   <chr>                <int>
#> 1 bar                     16
#> 2 b                       23
#> 3 n                       27
#> 4 hookah                  13
#> 5 #downtownorlando        23
#> 6 doors                   25
#> # … with 72 more rows
#> #
#> # Edge Data: 77 × 3
#>    from    to correlation
#>   <int> <int>       <dbl>
#> 1     1     2       0.264
#> 2     1     3       0.216
#> 3     1     4       0.206
#> # … with 74 more rows
#> 
#> $view
#> # A tibble: 77 × 3
#>    from  to               correlation
#>    <chr> <chr>                  <dbl>
#>  1 bar   b                      0.264
#>  2 bar   n                      0.216
#>  3 bar   hookah                 0.206
#>  4 bar   #downtownorlando       0.206
#>  5 bar   doors                  0.202
#>  6 bar   o                      0.175
#>  7 bar   r                      0.174
#>  8 bar   #orlandofl             0.161
#>  9 bar   #orlandoflorida        0.161
#> 10 bar   #orlandocity           0.161
#> # … with 67 more rows
#>