class: center, middle, inverse, title-slide

# rtweet for Rfun
## Packages to gather and analyse tweets
### John Little
### 2017-03-27

---
exclude: true

## Setup for the GraphTweet slides

---
class: center, bottom
background-image: url(

#[](

---
class: center, middle

# Presentation Materials

[Slides](//

[Github](//

---
class: middle, center

## Twitter Stream Gathering

[rtweet](

[twitteR](

---
class: middle, center
background-image: url(images/twitter-analysis.png)

.right-column[
<h2>Twitter Analysis</h2>
]

---

## Outline

- Using rtweet (a tidy way)
    - `library(twitteR)` should also work fine
- Analysis demonstrations
    - WordCloud
    - Word Freq
    - Term document matrix
    - Sentiment Analysis
    - Network Graphs / Network Analysis
    - Time Series
- Streaming / Scheduling / Sampling

---
class: middle, center, softblue

## Getting Started

[Intro Vignette](

---

## Authentication

- API
- Keys and Access Tokens
    - Be careful with the secret code
- But my examples don't seem to require keys

---

## Gathering is Easy

```{}
mm_tweets <- search_tweets("marchmadness", n = 1000, lang = "en")
```

```{}
users_data(mm_tweets)
```

Gathering historical data from the Twitter API is significantly limited: the standard search endpoint only reaches back about a week.

---

## API Orchestration

Many tools can gather tweets.

- Easy:
    - R with rtweet
    - Splunk

---
class: softblue

## Analysis

1. Word Cloud
2. Sentiment Analysis
3. Network Graph Analysis
4. Time Series

---
class: orange

## Word Cloud

- WordCloud2 (HTML Widget)
- Requires treatment and transformations
    - Term Document Matrix
    - Text Mining
        - lower case
        - strip whitespace
        - remove punctuation
        - remove numbers
        - remove stop words
        - term stemming

---
class: bottom, center
background-image: url(images/word-cloud.png)

[Example 1](rtweet4rfun.nb.html): Search Tweets | Data Treatment | TDM | WordCloud

---
class: orange

## Sentiment Analysis

This demonstration applies a simpler text treatment. See [Example 1](rtweet4rfun.nb.html) for a more complete treatment of data cleaning.
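For comparison, the fuller treatment listed on the Word Cloud slide (lower case, punctuation, numbers, stop words, whitespace) might look roughly like this in base R. It is a minimal sketch, not the code from Example 1: `clean_tweet_text()` and its short stop-word list are made up for illustration, and term stemming is left out because it needs an add-on stemmer package.

```r
# Hypothetical helper (not part of rtweet or Example 1).
# The default stop-word list is a tiny illustrative sample, not a real lexicon.
clean_tweet_text <- function(x,
                             stop_words = c("the", "a", "an", "and", "is", "to", "of")) {
  x <- iconv(x, from = "UTF-8", to = "ASCII", sub = "")  # drop non-ASCII bytes (e.g. emoji)
  x <- tolower(x)                                        # lower case
  x <- gsub("[[:punct:]]+", " ", x)                      # remove punctuation
  x <- gsub("[[:digit:]]+", " ", x)                      # remove numbers
  # remove stop words as whole words only
  stop_pattern <- paste0("\\b(", paste(stop_words, collapse = "|"), ")\\b")
  x <- gsub(stop_pattern, " ", x, perl = TRUE)
  x <- gsub("[[:space:]]+", " ", x)                      # collapse runs of whitespace
  trimws(x)                                              # strip leading/trailing whitespace
}

clean_tweet_text("The  Final 4 is HERE!!!")
# "final here"
```

The steps run in this order so that stop words are matched after lower-casing and whitespace is collapsed last, once every earlier step has finished leaving gaps.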

```{}
# Convert to ASCII; strings that cannot be converted become NA
UsableText <- iconv(TweetText, 'UTF-8', 'ASCII')
```

Get sentiment

```{}
syuzhet::get_nrc_sentiment(UsableText)
```

---
class: bottom, center
background-image: url(images/sentiment_vis.png)

[Example 2](sentiment_analysis.nb.html): syuzhet::get_nrc_sentiment(), plot sentiment

---
class: orange

## Network Graph

- `library(graphTweets)`
    - Transforms the document into edges and nodes
    - Creates a Gephi Document: `graphTweets.graphml`
- HTML Widget [DiagrammeR]( is worth investigating

---
class: bottom
background-image: url(images/network_graph.png)

1. [Example 3](network_graph.nb.html): `getEdges()` | `getNodes()` | plot
2. Launch Gephi > Open Graph File > graphTweets.graphml

---
class: center, middle

## More graphTweets Examples

For the next few slides, see [my R Notebook](network_graph_more_examples.nb.html) for code details.

---
class: middle

### graphTweets -- Identify Edges

```
dukmbb <- search_tweets("dukembb", n=100, lang = "en")
```

---

### Edges with networkD3
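
One way to picture the edge list behind a plot like this is to treat each @mention as an edge from the tweet's author to the mentioned account. The base R sketch below works under that assumption and is not graphTweets' own method; the `tweets` rows and the `mention_edges()` helper are invented for illustration, and the networkD3 call only runs if that package is installed.

```r
# Made-up rows; the column names mirror what a tweets data frame typically holds.
tweets <- data.frame(
  screen_name = c("fan_one", "fan_two"),
  text = c("Great win @DukeMBB! cc @fan_two", "No mentions here"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: one row per (author -> mentioned account) pair.
mention_edges <- function(df) {
  mentions <- regmatches(df$text, gregexpr("@[A-Za-z0-9_]+", df$text))
  data.frame(
    source = rep(df$screen_name, lengths(mentions)),  # repeat author once per mention
    target = sub("^@", "", unlist(mentions)),         # strip the leading "@"
    stringsAsFactors = FALSE
  )
}

edges <- mention_edges(tweets)

# Optional: build an interactive widget when networkD3 is available.
if (requireNamespace("networkD3", quietly = TRUE)) {
  widget <- networkD3::simpleNetwork(edges, Source = "source", Target = "target")
}
```

Tweets without mentions contribute no rows, so isolated accounts do not appear in the edge list.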

---

### Nodes plot from [Coene's example 2](

---

### Geocode tweeters with ggmap and leaflet
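
The mapping half of this step might look like the sketch below. It starts from coordinates that are already known, because geocoding services generally need credentials; the `tweeters` rows are invented, with approximate hand-entered coordinates, and the map is only built if the leaflet package is installed. See the notebook for the code actually used.

```r
# Made-up accounts with approximate, hand-entered coordinates.
# In practice the lat/lng columns would come from a geocoding step.
tweeters <- data.frame(
  screen_name = c("fan_one", "fan_two", "fan_three"),
  location    = c("Durham, NC", "Chapel Hill, NC", ""),
  lat         = c(35.994, 35.913, NA),
  lng         = c(-78.899, -79.056, NA),
  stringsAsFactors = FALSE
)

# Keep only accounts whose location could be resolved to coordinates.
mappable <- tweeters[complete.cases(tweeters[, c("lat", "lng")]), ]

# Build the map step by step; print `m` to view it.
if (requireNamespace("leaflet", quietly = TRUE)) {
  m <- leaflet::leaflet(data = mappable)
  m <- leaflet::addTiles(m)  # default base map tiles
  m <- leaflet::addMarkers(m, lng = ~lng, lat = ~lat, popup = ~screen_name)
}
```

Filtering before mapping matters because many profile locations are blank or free text that no geocoder can resolve.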

---

### Notes on previous slide

Although rtweet has `lookup_coords()`, it did not work in my test. It may work fine upon further review. I chose to use ggmap and leaflet instead.

See [my R Notebook](network_graph_more_examples.nb.html) for code details.

---
class: orange

## Time Series

---
class: softblue

## Other

### Scheduling

- See the [time-series]( vignette
- [taskscheduleR: schedule R scripts with Windows task manager](

### Analysis & Issues

- Machine Learning
- Implications for Social Science

### Keeping up from a tool perspective

---
class: center, bottom
background-image: url(

#[](

---

## Thank You For Attending

.pull-left[
### I am ...

- John Little

#### Schedule Me

- [](
]

.pull-right[
### We are...

- Data & Visualization Services
- The /Edge, Bostock (1st Floor)

#### Walk-in Hours

- [Schedule](

#### Our Workshops

- [Current Workshops](
- [Past Workshops](

#### Contact Us

]

---
class: center, middle

## Shareable under CC BY-NC license

Data, presentation, and handouts are shareable under [CC BY-NC license](