How Useful Is A Tweet? iHub Research’s 3Vs of Crowdsourcing Framework
Angela Crandall, Nanjira Sambuli, Chris Orwa
This research was funded by Canada’s International Development Research Centre.
Twitter: Some Facts and Figures
• Launched in 2006
• Approximately 550 million active users worldwide
• About 200 million monthly active users
• An average of 400 million tweets are sent every day globally
• 60% of the monthly active users log on using a mobile device at least once every month
Twitter and the Tweeps
• Twitter…can be more of a news media than even a social network (Kwak et al, 2010)
• Breaking news and coverage of real-time events are all shared under the 140-character limit
• Twitter users search for up-to-the-second information and updates on unfolding events
Twitter for Crowdsourcing. That Is…
Collecting information from the “crowd”
• Reaches a wide range of people inexpensively
• Large amounts of data can be obtained quickly, and often in real time
• Does not necessarily require technology, though most efforts today run online or via mobile phone
• Fosters citizen engagement with the information: to dispute, confirm, or acknowledge its existence
Mapping Kenyan Election Events, Thanks to crowdsourcing!
What is there to (Twitter) crowdsourcing?
• Viability: In what situations/events is crowdsourcing a viable venture, likely to offer worthwhile results/outcomes?
• Validity: Does crowd-sourced information offer a true reflection of the reality on the ground?
• Verification: Is there a way in which we can verify that the information provided through crowdsourcing is indeed valid? If so, can the verification process be automated?
Crowdsourcing during an Election
• What, if any, particular conditions should be in place for crowdsourcing of information to be viable during an election period?
• Can crowd-sourced information be validated during an election period? If so, what is the practical implementation of doing so?
• How do different crowdsourcing methods contribute to the quality of information collected?
Why Elections?
o Elections in Kenya have been noted to spark many online conversations, especially with the continued uptake of social media;
o Citizens have an important role to play in contributing information from the ground;
o Election crowdsourcing initiatives exist (such as Uchaguzi), but none use passive crowdsourcing;
o Research exists around crowdsourcing during disasters, but does not yet exist around elections.
Why Crowdsourcing, Kenyan Elections and #KoT
• #KoT have participated in crowdsourcing activities several times, under hashtags such as #CarPoolKE, #findfuel, #SomeoneTellCNN etc.
• Approximately 90,000 tweets generated during the first Kenyan Presidential Debates (as monitored using popular hashtags)
• Election campaigning was also digital
(Online) Passive Crowdsourcing vs. Active Crowdsourcing
• Active – Open call made for participation (e.g. Ushahidi’s Crowdmap).
• Passive – Sifting through content already being generated (e.g. on Twitter/Facebook) to capture relevant information.
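Passive crowdsourcing, as described above, amounts to filtering an existing stream of posts for relevant content. A minimal sketch in Python follows, assuming tweets are already available as plain strings. The keyword list and the function names are hypothetical and for illustration only; the slides do not say which terms or tools were used.

```python
import re

# Hypothetical tracked terms; the slides do not list the actual keywords.
KEYWORDS = {"election", "polling", "ballot", "#kot"}

def tokenize(text):
    """Lowercase a tweet and split it into words and hashtags."""
    return re.findall(r"#?\w+", text.lower())

def is_relevant(text, keywords=KEYWORDS):
    """Return True if the tweet contains at least one tracked term."""
    return any(token in keywords for token in tokenize(text))

def passive_filter(tweets, keywords=KEYWORDS):
    """Keep only the tweets that mention a tracked term (exact token match)."""
    return [tweet for tweet in tweets if is_relevant(tweet, keywords)]

# Invented sample tweets, for illustration only.
sample = [
    "Long queue at the polling station #KoT",
    "Good morning everyone",
]
relevant = passive_filter(sample)
```

Matching here is by exact token, so "elections" would not match "election"; a real pipeline would likely need stemming or a broader term list.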
What we did
Cross-comparison of different media sources:
o Traditional Media
o Data mining from Twitter
o Uchaguzi Crowdsourcing
o Fieldwork
Research Findings
Passive Crowdsourcing is Viable During the Elections in Kenya
Twitter Breaks News
An Example from the Westgate Incident
First tweet about the attack at 12:38PM
First tweet by media about the attack
First tweet by a government institution about the attack
Mining Of Twitter Data without Machine Learning is Not Feasible
Search method | Time taken | Number of newsworthy tweets | Search time for whole data set | Viable for real-time analysis | Viable for post-data analysis
Linear search | 90 hrs | 100 | 270 days | No | No
Keyword search | 4.5 hrs | 400 | 27 days | No | In a very limited way
ML, supervised learning | Less than 6 mins, plus 1.5 hrs labeling | 12,208 | Less than 1 sec | Yes | Yes
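To illustrate the supervised-learning approach, here is a minimal sketch of a text classifier that learns from a small hand-labeled set and then labels unseen tweets. The slides do not name the algorithm used; multinomial Naive Bayes is an assumption chosen for brevity, and the training tweets, labels, and class name are invented for illustration.

```python
import math
import re
from collections import Counter, defaultdict

def tokenize(text):
    """Lowercase a tweet and split it into words and hashtags."""
    return re.findall(r"#?\w+", text.lower())

class NaiveBayesTweetClassifier:
    """Multinomial Naive Bayes with add-one smoothing (an assumed technique)."""

    def fit(self, texts, labels):
        self.word_counts = defaultdict(Counter)  # label -> token frequencies
        self.label_counts = Counter(labels)      # label -> number of tweets
        self.vocab = set()
        for text, label in zip(texts, labels):
            tokens = tokenize(text)
            self.word_counts[label].update(tokens)
            self.vocab.update(tokens)
        return self

    def predict(self, text):
        tokens = tokenize(text)
        total_docs = sum(self.label_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.label_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            # Log prior plus smoothed log likelihood of each token.
            score = math.log(n_docs / total_docs)
            for token in tokens:
                score += math.log((counts[token] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label

# Invented, hand-labeled examples standing in for the manual labeling step.
train_texts = [
    "Gunshots heard near the polling station",
    "Police disperse crowd at tallying centre",
    "Having lunch with friends today",
    "Traffic is fine, heading home now",
]
train_labels = ["newsworthy", "newsworthy", "other", "other"]

model = NaiveBayesTweetClassifier().fit(train_texts, train_labels)
prediction = model.predict("Crowd gathering at the polling station")
```

The labeling effort is a one-off cost: once trained, the model classifies each new tweet with a few dictionary lookups, which is what makes real-time use plausible.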
From the Westgate Incident…
Mining tweets from the Westgate attack manually has been labour-intensive, which limited us to analysing only the first half hour in sufficient depth (12:38 PM – 1:18 PM, GMT+3). Further analysis of Twitter data from the incident will require machine learning techniques.
In Summary:
o Kenyan social media content is rich with real-time updates of happenings that might not be present in mainstream media reports.
o Mining of crowd-sourced data appears to be high value when one is looking for timely, local information.
o There are indeed considerations that are useful for assessing and running an election-based crowdsourcing activity.
The 3Vs Crowdsourcing Framework
AVAILABLE FOR FREE DOWNLOAD HERE: http://www.ihub.co.ke/blog/2013/08/3vs-crowdsourcing-framework-for-elections-launched/
Next Steps
• Test the 3Vs Framework on other election-related crowdsourcing opportunities
• Move to real-time analysis of tweets
• Provide tools for verifying crowdsourced information
• Integrate research into media practices
• Work with local media organizations to build a usable tool for collecting real-time newsworthy incidents from the crowd