Harvesting Data from Twitter Workshop: Hands-on Experience
Dr. Nora alTwairesh, Ms. Tarfa alBuhairi, Ms. Mawaheb alTuwaijri, and Ms. Afnan alMoammar
Content
• Introduction to the Twitter API
• Some ready-to-use tools (no programming)
• Comparison between R and Python
• R
• Python
Why Twitter?
• Twitter has become a mass information hub, a "revolutionary machine" that can be used to study how any issue evolves.
• Research that studies Twitter data spans computer science, information science, communications, business, economics, education, medicine, political science, and sociology.
• Recent studies show that 60% of daily Arabic tweets come from Saudi Arabia.
Hamdy Mubarak and Kareem Darwish. 2014. Using Twitter to collect a multi-dialectal corpus of Arabic. ANLP 2014:1.
Twitter API
• Free access is limited to tweets posted in the last 7 days, within a certain rate limit.
• Tweets posted more than 7 days ago are considered historical tweets and must be purchased through third-party providers.
• The Twitter API provides three interfaces for tweet collection: the Streaming API, the REST API, and the Search API.
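The 7-day rule above can be expressed as a small date check. This is only a sketch in Python 3: the function name and the sample dates are made up for illustration, and the 7-day window is taken from the slide rather than from any API call.

```python
from datetime import datetime, timedelta, timezone

# Assumption: the free window is the 7 days stated on the slide.
FREE_WINDOW = timedelta(days=7)

def is_freely_accessible(posted_at, now=None):
    """Return True if a tweet posted at `posted_at` falls inside the free window."""
    now = now or datetime.now(timezone.utc)
    return timedelta(0) <= now - posted_at <= FREE_WINDOW

# Illustrative dates only.
now = datetime(2016, 10, 20, 12, 0, tzinfo=timezone.utc)
recent = datetime(2016, 10, 15, tzinfo=timezone.utc)
old = datetime(2016, 9, 1, tzinfo=timezone.utc)
print(is_freely_accessible(recent, now))  # inside the 7-day window
print(is_freely_accessible(old, now))     # historical tweet
```

A check like this is handy when planning a collection: anything it flags as historical cannot be fetched later for free, so collection has to start while the topic is still active.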
Streaming API
• The Streaming API provides tweets in real time, as a continuous live feed.
• With the Streaming API, the requested tweets flow continuously as they are posted on Twitter. It is offered at three bandwidth levels: "spritzer" (1%), "gardenhose" (10%), and "firehose" (100%) of all tweets posted on Twitter.
• A regular user who wants to collect tweets is granted spritzer access.
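To get a feel for what the three access levels mean in practice, the percentages above can be turned into a rough estimate. This is a minimal sketch; the total volume used below is a hypothetical number chosen only for illustration, not a Twitter statistic.

```python
# Share of all public tweets delivered by each access level, as listed above.
STREAM_LEVELS = {"spritzer": 0.01, "gardenhose": 0.10, "firehose": 1.00}

def expected_sample_size(total_tweets, level="spritzer"):
    """Estimate how many tweets a stream at `level` would deliver."""
    return int(round(total_tweets * STREAM_LEVELS[level]))

# Hypothetical volume, chosen only for illustration.
total = 1000000
for level in ("spritzer", "gardenhose", "firehose"):
    print(level, expected_sample_size(total, level))
```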
REST API
• The REST API was specifically designed for programmatic access to read and write Twitter data.
• Third-party applications that interact with Twitter are given a large set of methods in the REST API to build on.
• Access to the REST API is also rate-limited: the limit is 150 requests per hour.
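One simple way to stay under a limit like the 150 requests per hour mentioned above is to space requests evenly. The sketch below (Python 3) is an illustration only: `RequestPacer` is a hypothetical helper, not part of any Twitter library, and even spacing is just one conservative strategy.

```python
import time

class RequestPacer:
    """Space out calls so that no more than `limit` happen per `period` seconds."""

    def __init__(self, limit=150, period=3600.0,
                 clock=time.monotonic, sleep=time.sleep):
        self.interval = period / limit  # 3600 / 150 = 24 seconds between requests
        self._clock = clock
        self._sleep = sleep
        self._next_allowed = None

    def wait(self):
        """Block until the next request is allowed, then reserve the following slot."""
        now = self._clock()
        if self._next_allowed is not None and now < self._next_allowed:
            self._sleep(self._next_allowed - now)
            now = self._next_allowed
        self._next_allowed = now + self.interval

pacer = RequestPacer()
print(pacer.interval)  # seconds to leave between two requests
```

Calling `pacer.wait()` before each API request is enough; the first call returns immediately and later calls sleep only as long as needed.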
Search API
• Like the REST API, the Search API is pull-based. It replicates the search functionality provided on the Twitter website. However, the tweets retrieved are restricted to the past 7 days.
• The Search API is not appropriate for high-throughput, real-time data acquisition. As such, Twitter Inc. discourages its use and plans to discontinue it in the future.
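"Pull-based" means the client has to keep asking for the next page of results. The sketch below shows that loop in Python, paging backwards by tweet id. It runs against a stand-in function with canned data; `fetch_page` and `fake_fetch_page` are hypothetical names, and a real collector would replace the stand-in with an actual search request.

```python
def collect(fetch_page, query, max_pages=10):
    """Pull pages of results until one comes back empty or max_pages is reached."""
    tweets, max_id = [], None
    for _ in range(max_pages):
        page = fetch_page(query, max_id)
        if not page:
            break
        tweets.extend(page)
        # Ask only for tweets older than the oldest one seen so far.
        max_id = min(t["id"] for t in page) - 1
    return tweets

# Stand-in for a real API call: serves canned tweets, newest first, 2 per page.
FAKE_TWEETS = [{"id": i, "text": "tweet %d" % i} for i in range(5, 0, -1)]

def fake_fetch_page(query, max_id, page_size=2):
    older = [t for t in FAKE_TWEETS if max_id is None or t["id"] <= max_id]
    return older[:page_size]

result = collect(fake_fetch_page, "#example")
print([t["id"] for t in result])
```

Keeping the fetch function as a parameter makes the loop easy to try out offline before pointing it at the live API.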
Create a Twitter App
• To access the Twitter API you need to create a Twitter app. Follow this simple tutorial to do so: https://iag.me/socialmedia/how-to-create-a-twitter-app-in-8-easy-steps/
• You will use the OAuth settings in both R and Python:
• Consumer Key
• Consumer Secret
• OAuth Access Token
• OAuth Access Token Secret
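The four OAuth values are best kept out of the script itself. The sketch below reads them from environment variables and passes them to the `twitter` package used later in this workshop. The environment variable names are an assumption made for this example, and `make_client` only works once `pip install twitter` has been run.

```python
import os

# Environment variable names are an assumption made for this sketch.
CREDENTIAL_VARS = {
    "consumer_key": "TWITTER_CONSUMER_KEY",
    "consumer_secret": "TWITTER_CONSUMER_SECRET",
    "token": "TWITTER_ACCESS_TOKEN",
    "token_secret": "TWITTER_ACCESS_TOKEN_SECRET",
}

def load_credentials(env=os.environ):
    """Read the four OAuth values, failing loudly if any is missing."""
    missing = [var for var in CREDENTIAL_VARS.values() if not env.get(var)]
    if missing:
        raise KeyError("Missing credentials: " + ", ".join(sorted(missing)))
    return {name: env[var] for name, var in CREDENTIAL_VARS.items()}

def make_client(creds):
    """Build an authenticated client with the `twitter` package."""
    from twitter import OAuth, Twitter  # imported here so the rest runs without it
    auth = OAuth(creds["token"], creds["token_secret"],
                 creds["consumer_key"], creds["consumer_secret"])
    return Twitter(auth=auth)

# Placeholder values, not real keys.
demo_env = {var: "placeholder" for var in CREDENTIAL_VARS.values()}
print(sorted(load_credentials(demo_env)))
```

With real keys set in the environment, `make_client(load_credentials())` would return a client object ready for API calls.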
Tools to Collect Tweets
• NodeXL: https://nodexl.codeplex.com/
• Tweet Archivist: https://www.tweetarchivist.com/
• Twitter Archiving Google Spreadsheet (TAGS): https://tags.hawksey.info/
What is R?
• Created by Ross Ihaka and Robert Gentleman.
Why R?
• Statistics
• Machine Learning
• Data Analysis
• Also: a programming language
R allows you to integrate with code written in:
• C++
• Java
• Python
Fastest-growing language
https://www.r-bloggers.com/r-is-the-fastest-growing-language-on-stackoverflow/
Examples
Now, open your laptop, please.
Steps to install R
1: Install R:
• Windows: https://cran.r-project.org/bin/windows/base/
• macOS: http://cran.r-project.org/bin/macosx/
2: Install RStudio (after installing R):
• https://www.rstudio.com/products/rstudio/download3/
3: Install these packages (see the user manual):
• streamR, ROAuth, RJSONIO, RTextTools, e1071, SparseM
User manual:
• http://www.devchakraborty.com/RunningRJafroc.pdf
R packages list:
• https://cran.r-project.org/web/packages/available_packages_by_date.html
Developing packages with RStudio:
• https://support.rstudio.com/hc/en-us/articles/200486488?version=0.99.903&mode=desktop
• https://cran.r-project.org/doc/manuals/R-exts.html
Useful URLs
• https://www.r-bloggers.com
• https://www.r-bloggers.com/how-to-learn-r-2/
• http://www.slideshare.net/ChiuYW/r-language-tutorial
• https://www.rwaq.org/courses/introduction-r-programming
• https://www.researchgate.net/publication/288485806_Hybrid_Sentiment_Analyser_for_Arabic_Tweets_using_R
Python
• Two versions: 2.7 and 3.x
• Twitter packages: twitter, tweepy
• IDE: Anaconda (IPython notebook / Jupyter)
Installing Python
• Install Anaconda from here: https://www.continuum.io/downloads
• Choose the Python 2.7 version (only for this tutorial).
• Install the twitter package: from the command line (terminal), type: pip install twitter
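After the installation, a quick check from Python confirms which interpreter is running and whether the packages can be imported. This is a small sketch; `is_installed` is a helper written for this example, and `tweepy` is listed only because it is the other package named in this workshop.

```python
import importlib
import sys

def is_installed(module_name):
    """Return True if `module_name` can be imported in this environment."""
    try:
        importlib.import_module(module_name)
        return True
    except ImportError:
        return False

print("Python version:", sys.version.split()[0])
for name in ("twitter", "tweepy"):
    print(name, "installed" if is_installed(name) else "missing")
```

If a package shows up as missing, rerun the pip command from the same Anaconda environment that runs the notebook.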
Comparison between R and Python
• https://www.datacamp.com/community/tutorials/r-or-python-for-data-analysis#gs.GuXGfAc
• http://blog.udacity.com/2015/01/python-vs-r-learn-first.html
• http://www.dataschool.io/python-or-r-for-data-science/
Contact Us
ASA Research Group
Twitter: @ASA__IU
Email: [email protected]
Website: http://asa.imamu.edu.sa/
IWAN Research Group
Twitter: @IWAN_RG
Email: [email protected]
Website: http://iwan.ksu.edu.sa
Thank you, see you later.
The End