ParseR uses Tyler Rinker’s sentimentr package to let users perform dictionary-based sentiment analysis.

We’ll work through an example on a sample of the data set included in the ParseR package to see what this looks like.

# Generate a reproducible sample of 1,000 posts
library(magrittr)  # provides the %>% pipe
set.seed(1)
example <- ParseR::sprinklr_export %>%
  dplyr::sample_n(1000)

Scoring the posts

Each sentence in each post will be assigned a valence score that considers shifters such as:

  • Negators: “I do not like it.”
  • Amplifiers: “I really like it.”
  • De-amplifiers: “I hardly like it.”
  • Adversative conjunctions: “I like it but it’s not worth it.”

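The sign of a valence score gives the direction of the sentiment (negative or positive) and its magnitude gives the degree.

A score for the whole post is then calculated by averaging these sentence-level scores. It appears in the output as ave_sentiment, alongside word_count and sd (the standard deviation of the sentence scores).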
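
Because ParseR builds on sentimentr, the sentence-level and post-level scores described above can be illustrated by calling sentimentr directly. This is a minimal sketch rather than ParseR's internal code: it assumes sentimentr is installed, relies on its default dictionary and valence-shifter settings, and uses a made-up `posts` vector. How `score_valence()` configures these calls is not shown in this section.

```r
# Illustrative sketch using sentimentr directly (not ParseR internals).
# `posts` is a made-up example vector.
posts <- c(
  "I do not like it.",
  "I really like it.",
  "I hardly like it.",
  "I like it but it's not worth it.",
  "I like it. The delivery was terrible."
)

# Split each post into sentences, then score every sentence
sentences <- sentimentr::get_sentences(posts)
sentence_scores <- sentimentr::sentiment(sentences)
print(sentence_scores)

# Aggregate to one row per post: word_count, sd and ave_sentiment
post_scores <- sentimentr::sentiment_by(sentences)
print(post_scores)
```

A post with a single sentence has no standard deviation, so its sd is NA. Note also that `sentiment_by()` defaults to a down-weighted-zero average rather than a plain arithmetic mean, so ave_sentiment can differ from `mean()` of the sentence scores.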

valence <- example %>%
  ParseR::score_valence(text_var = Message,
                        valence_limits = c(-Inf, Inf),
                        remove_terms = NULL,
                        highlight = FALSE)
knitr::kable(valence %>% head(), format = "html")
message_id UniversalMessageId SocialNetwork SenderUserId SenderScreenName SenderListedName SenderProfileImgUrl SenderProfileLink Sender Followers Count SenderInfluencerScore SenderAge SenderGender Title Message MessageType CreatedTime Language LanguageCode CountryCode MediaTypeList Permalink Domain Retweets Tweet Generator Favorites ReceiverId ReceiverScreenName AssignedBy AssignedTo Spam Status Intel Location Intel Product Star Rating Priority Review Source Experience Score - Message level Sentiment ClientQueues PartnerQueues ClientCustomProps PartnerCustomProps Custom Tags Action Time Geo Target Post Id Associated Cases Location Country State City Latitude Longitude Sender Email Message Type word_count sd ave_sentiment
1 TWITTER_4_1055807946544812032 TWITTER 276997556 JavFFlores Javier Flores https://pbs.twimg.com/profile_images/951832695134441473/8jHAOqT6_normal.jpg NA 4338 NA NA NA NA #AmplifyLatinx cafecito celebrating the 30 amplifiers recognized for Hispanic Heritage month including @HNBARegionI President @bensigel. Shout out to amazing organizers @eromanesq and @BettyFrancisco who are exceptional leaders and role models pic.twitter.com/YgBQdzJHex Twitter Mention 2018-10-26 13:06:45 English en US PHOTO https://www.twitter.com/JavFFlores/status/1055807946544812032 twitter.com 2 Twitter for iPhone 7 NA NA NA NA false NA NA NA NA NA NA NA POSITIVE NA NA NA NA NA null NA NA NA NA NA NA NA 42.35843 -71.05977 NA Twitter Mention 28 0.1163224 0.3247880
2 TWITTER_2_1051960505915662338 TWITTER 133122139 senatorduff Bob Duff, Senate Majority Leader, CT https://pbs.twimg.com/profile_images/824055611/Duff-color-med_normal.jpg NA 5557 NA NA NA NA Celebrating Hispanic Heritage Month with the wonderful seniors at the Norwalk Senior Center South. We had a great time singing and dancing. Thanks to Giovana Ramirez, Program Coordinator and everyone involved. — at Norwalk Senior Center South facebook.com/14677134542947… Twitter Update 2018-10-15 22:18:23 English en US LINK https://www.twitter.com/senatorduff/status/1051960505915662338 twitter.com NA Facebook NA NA NA NA NA false NA NA NA NA NA NA NA NEUTRAL NA NA NA NA NA null NA NA NA NA NA NA NA 41.11760 -73.40790 NA Twitter Update 38 0.1617715 0.2347437
3 INSTAGRAM_36_1894509043172895111_14900610 INSTAGRAM johnkocky johnkocky NA NA NA 0 NA NA NA NA

STATUS @tier_nightclub Music  by @djjayrmusic @djegolive Hosted by @jarriknows @mehkidakid Brought to you by @kheeporlando @hhm_ent For VIP contact @tier_girls 407.222.9732

#Tiernightclub #tiergirls #orlandofl #orlandoflorida #orlandocity #orlandonights #Saturday  #hospitality #industrynight #collegenight #college #downtownorlando #citywalk #florida #floridalife #orlandobound #ucf #ucfknights #orlandonightlife #moet #henny #belaire #ciroc #KHEEP #KHEEPUP #kheeporlando #KHEEPNightLife #KhuuHamilton #HHM
Instagram Post 2018-10-20 19:18:18 English en US PHOTO https://www.instagram.com/p/BpKpe_uFemH/ instagram.com NA NA 0 NA NA NA NA false NA NA NA NA NA NA NA NEUTRAL NA NA NA NA NA null NA NA NA NA NA Florida NA 28.54190 -81.37778 NA Instagram Post 14 NA 0.1069045
4 TWITTER_4_1051932019192672259 TWITTER 960612240796803072 CIRCLE_STAMP CIRCLE_STAMP https://pbs.twimg.com/profile_images/1146951363340820480/J-rTB4yV_normal.jpg NA 39 NA NA NA NA Should we switch from Hispanic Heritage Month to Latinx Heritage Month in order to include more people and make everyone feel their culture being celebrated? Via @DUClarion ow.ly/tMzI30m8roi https://t.co/fegoaKRRq7 Twitter Mention 2018-10-15 20:25:11 English en US PHOTO https://www.twitter.com/CIRCLE_STAMP/status/1051932019192672259 twitter.com NA Hootsuite Inc.  NA NA NA NA NA false NA NA NA NA NA NA NA NEUTRAL NA NA NA NA NA null NA NA NA NA NA NA NA 39.73915 -104.98470 NA Twitter Mention 31 0.3790092 0.2924879
5 TWITTER_2_1054831569901404162 TWITTER 783151915 NAVHOSPPCOLA NH Pensacola https://pbs.twimg.com/profile_images/2546474452/NHP_20Seal_png_normal.jpg NA 498 NA NA NA NA Naval Hospital Pensacola held several events to raise awareness for #HispanicHeritageMonth Heritage Month. youtu.be/3ohjVbqkmjk Twitter Update 2018-10-23 20:26:58 English en US LINK https://www.twitter.com/NAVHOSPPCOLA/status/1054831569901404162 twitter.com 1 Twitter for Android NA NA NA NA NA false NA NA NA NA NA NA NA NEUTRAL NA NA NA NA NA null NA NA NA NA NA NA NA 30.42131 -87.21691 NA Twitter Update 15 0.0000000 0.0000000
6 TWITTER_2_1051994518633865216 TWITTER 135608636 BottomGuy21 ℬø☂т☺м Ḡʊ¥ 21 ❤X❤ https://pbs.twimg.com/profile_images/953889956518244352/696wgg1S_normal.jpg NA 200 NA NA NA NA On the last day of Hispanic Heritage month: Please remember which people got paper towels thrown at them after a terrible hurricane, and whose families were torn apart at the border. Now go VOTE! Twitter Update 2018-10-16 00:33:33 English en US NA https://www.twitter.com/BottomGuy21/status/1051994518633865216 twitter.com NA Twitter Web Client NA NA NA NA NA false NA NA NA NA NA NA NA NEUTRAL NA NA NA NA NA null NA NA NA NA NA NA NA 39.76000 -98.50000 NA Twitter Update 34 0.1333501 -0.1029086

Summarising the sentiment

To get an overall sense of the sentiment of the posts in our data set, we can generate empirical confidence intervals (using bootstrapping) for the following statistics:

  • Minimum
  • 1st Quartile
  • Median
  • Mean
  • Standard Deviation
  • 3rd Quartile
  • Maximum

valence_summary <- valence %>%
  ParseR::summarise_valence(valence_var = ave_sentiment,
                            bootstrap = 100,
                            coverage = 0.95)
knitr::kable(valence_summary, format = "html")
Statistic 95%_CI_lower Mean Median 95%_CI_upper
Min -0.8000000 -0.7521435 -0.8000000 -0.5610354
Q1 0.0000000 0.0375933 0.0439828 0.0634186
Mean 0.1561444 0.1687568 0.1681488 0.1807331
Median 0.1669588 0.1797841 0.1790365 0.1917224
SD 0.1942822 0.2072197 0.2073099 0.2189205
Q3 0.2670260 0.2856566 0.2861195 0.3006224
Max 0.9015611 1.1447914 1.2159034 1.2159034
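
To make the bootstrapping idea concrete, here is a minimal base R sketch of a percentile bootstrap interval. It is not the implementation behind `summarise_valence()`: the helper `bootstrap_ci()` and the simulated `scores` vector are hypothetical, and it assumes the table's columns are the lower bound, mean, median and upper bound of the bootstrap replicates.

```r
# Hypothetical helper: percentile bootstrap interval for one statistic.
bootstrap_ci <- function(x, stat = mean, n_boot = 100, coverage = 0.95) {
  x <- x[!is.na(x)]
  alpha <- (1 - coverage) / 2
  # Recompute the statistic on n_boot resamples drawn with replacement
  reps <- replicate(n_boot, stat(sample(x, size = length(x), replace = TRUE)))
  c(lower  = unname(quantile(reps, alpha)),
    mean   = mean(reps),
    median = median(reps),
    upper  = unname(quantile(reps, 1 - alpha)))
}

set.seed(1)
# Simulated stand-in for the ave_sentiment column
scores <- rnorm(1000, mean = 0.17, sd = 0.2)

ci_mean <- bootstrap_ci(scores, stat = mean)
ci_q1   <- bootstrap_ci(scores, stat = function(v) unname(quantile(v, 0.25)))
print(ci_mean)
print(ci_q1)
```

With only 100 resamples the interval bounds are themselves noisy, especially for extremes such as the minimum and maximum. Raising `n_boot` gives more stable bounds at the cost of run time.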

Visualising the sentiment

Often it is helpful to visualise the distribution of sentiment against a measure of post ‘importance’. For example, we could use follower count:

valence %>%
  janitor::clean_names() %>%
  ParseR::hexplot_valence(valence_var = ave_sentiment,
                          x_var = sender_followers_count,
                          log10_trans = TRUE,
                          theme = "viridis")
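
For readers who want to see what a plot of this kind involves, the sketch below builds a comparable hexbin chart with ggplot2. It is an approximation under stated assumptions, not the code behind `hexplot_valence()`: the `toy` data frame is simulated, the bin count and labels are arbitrary, and `geom_hex()` requires the hexbin package to be installed.

```r
library(ggplot2)  # geom_hex() also needs the hexbin package installed

set.seed(1)
# Simulated stand-in data (not the ParseR sample)
n <- 500
toy <- data.frame(
  sender_followers_count = round(10^runif(n, min = 0, max = 6)),
  ave_sentiment = rnorm(n, mean = 0.17, sd = 0.2)
)

# Zero follower counts would become -Inf on a log10 axis, so drop them
toy <- toy[toy$sender_followers_count > 0, ]

p <- ggplot(toy, aes(x = sender_followers_count, y = ave_sentiment)) +
  geom_hex(bins = 30) +       # count posts falling in each hexagon
  scale_x_log10() +           # follower counts span several orders of magnitude
  scale_fill_viridis_c() +    # viridis colour scale for the counts
  labs(x = "Followers (log10 scale)", y = "Average sentiment", fill = "Posts")
print(p)
```

The log10 axis keeps a handful of very large accounts from compressing every other post into the left edge of the plot.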
