PoliWeb project (PEPS'14)
Towards a cloud infrastructure for collecting, storing and analyzing data flows produced by politicians
Geraldine Castel, CEMRA, Université Stendhal, France
Genoveva Vargas-Solar, CNRS, LIG-LAFMIA, France
Javier Espinosa, LAFMIA, UMI CNRS [email protected]
Journée Big Humanities, 8 December 2014, Université Stendhal
Have you been tweeting about the elections?
Doing politics in the era of big data
2008 was called the “social media election”, with 1.8 million tweets sent on election day.
Barack Obama’s appearance at the Democratic National Convention caused 4 million tweets in total during his 39-minute speech (52,000 tweets / minute).
9/10 Senators and Representatives have their own Twitter accounts.
2012 in France
Sources: blogs.salesforce.com, web.archive.org
[Chart: UMP, number of times crawled]
How is this online political activity affecting the elections?
Social networks and IT in politics
Volume of data: big data
Analytics
Juridical issues
Objective
"Compare the impact of the use of communication tools in the campaign strategies of the European elections in France and the UK"
Implications
  Collect data from social networks, websites and politicians' blogs
  Curate and store these data
  Define a continuous comparison process that can evolve over time and as new information is integrated into the database
Challenges
Time and ownership
  Data of interest is determined by the campaign period (EU elections), which is short
  Access to, exploitation of and storage of data can be fully or partially restricted by the legislation of each country
Data curation and storage
  Fill in missing information and handle unbalanced content retrieved about entities that must be compared
  Organization according to political and geographical structure
  Provenance and pertinence
Expected Results
Integrated historical and distributed database of documents, photos, text and social network posts
  Data provenance, freshness
  Structure
  Respecting privacy and data ownership, owner anonymity
Analysis platform for querying the database with respect to different criteria:
  Geographic and temporal parameters
  Statistics
  Political organization and tendencies
Compare strategies and conditions of the elections in the UK and France
Roadmap
Doing politics in the era of Big Data
Analyzing political campaign strategies in Europe
  Data collection and curation
  Comparing for understanding strategies: UK vs. France
Conclusions and perspectives
Data Collection and Curation
Political Parties
Pull data
Scrape Web
Pushed data
[Time interval]
Frequency
Candidates
Constituencies
Juridical issues
Geographic provenance
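The pull side of this collection step (a time interval plus a frequency) could be sketched as follows. This is only an illustration: the PullTask structure, the placeholder URL and the dates are assumptions made for the example, not part of the project.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Iterator


@dataclass(frozen=True)
class PullTask:
    """One source to poll (hypothetical structure, not taken from the project)."""
    source_url: str        # e.g. a party or candidate website
    start: datetime        # beginning of the time interval of interest
    end: datetime          # end of the time interval (inclusive)
    frequency: timedelta   # delay between two consecutive pulls


def pull_times(task: PullTask) -> Iterator[datetime]:
    """Yield every instant at which the source should be pulled."""
    if task.frequency <= timedelta(0):
        raise ValueError("frequency must be positive")
    current = task.start
    while current <= task.end:
        yield current
        current += task.frequency


# Illustrative values only: placeholder URL and arbitrary dates.
task = PullTask(
    source_url="https://example.org/party",
    start=datetime(2014, 5, 1),
    end=datetime(2014, 5, 3),
    frequency=timedelta(hours=12),
)
schedule = list(pull_times(task))
```

Pushed data would not need such a schedule, since the source itself decides when to send; only the time interval would still apply as a filter.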
Comparison and Analysis Requirements (i)
Query criteria
  Date, candidate, party, document type, keywords (frequent words/term clouds)
Data provenance
  Party, webmaster, candidate, campaign staff
Generate a geolocated inventory grouped by parties and activists
  Content types: video, text, image, document
  Links to other content and tools: online donations, other campaign actions, Facebook pages and support committees, agenda
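As a minimal sketch of how the query criteria listed above (date, candidate, party, document type, keywords) might be combined, assuming an in-memory collection: the Document structure, the query function and the sample records are all hypothetical, and keywords are assumed to be stored in lower case.

```python
from dataclasses import dataclass
from datetime import date
from typing import FrozenSet, Iterable, List, Optional


@dataclass(frozen=True)
class Document:
    """A collected item (hypothetical schema for illustration)."""
    published: date
    candidate: str
    party: str
    doc_type: str                          # "video", "text", "image" or "document"
    keywords: FrozenSet[str] = frozenset() # assumed to be stored in lower case


def query(
    docs: Iterable[Document],
    *,
    since: Optional[date] = None,
    until: Optional[date] = None,
    candidate: Optional[str] = None,
    party: Optional[str] = None,
    doc_type: Optional[str] = None,
    keyword: Optional[str] = None,
) -> List[Document]:
    """Return the documents matching every criterion that is not None."""
    def matches(doc: Document) -> bool:
        return (
            (since is None or doc.published >= since)
            and (until is None or doc.published <= until)
            and (candidate is None or doc.candidate == candidate)
            and (party is None or doc.party == party)
            and (doc_type is None or doc.doc_type == doc_type)
            and (keyword is None or keyword.lower() in doc.keywords)
        )
    return [doc for doc in docs if matches(doc)]


# Made-up sample records, not project data.
corpus = [
    Document(date(2014, 5, 2), "Candidate A", "Party X", "video", frozenset({"europe", "jobs"})),
    Document(date(2014, 5, 10), "Candidate B", "Party X", "text", frozenset({"europe"})),
    Document(date(2014, 5, 12), "Candidate C", "Party Y", "text", frozenset({"immigration"})),
]
party_x_texts = query(corpus, party="Party X", doc_type="text")
early_europe = query(corpus, keyword="Europe", until=date(2014, 5, 5))
```

A real platform would more likely push these filters down to the database; the sketch only shows that the criteria are independent and can be combined freely.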
Comparison and Analysis Requirements (ii)
Compare content from sites, personal blogs and pages, and party sites
  Common and different elements: content and structure (communication strategies)
Count and compare Facebook posts, comments, likes and shares
Propose visualizations
  Comparison of tools, candidates, parties, countries
  For example: which candidate is the most visible within the same party, and among parties?
Compare data stemming from different sources (e.g. preferred tools, content type) of users and parties
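The example question above (which candidate is the most visible within a party, and among parties) could be answered along these lines. The sketch assumes that visibility is simply the sum of posts, comments, likes and shares; that scoring rule, the Activity structure and the counts are illustrative assumptions, not a measure defined by the project.

```python
from collections import defaultdict
from typing import Dict, Iterable, NamedTuple


class Activity(NamedTuple):
    """Facebook activity counts for one candidate (hypothetical structure)."""
    party: str
    candidate: str
    posts: int
    comments: int
    likes: int
    shares: int


def visibility(a: Activity) -> int:
    # Assumed score: every interaction counts equally.
    return a.posts + a.comments + a.likes + a.shares


def most_visible_by_party(activities: Iterable[Activity]) -> Dict[str, str]:
    """Map each party to its candidate with the highest total score."""
    totals: Dict[str, Dict[str, int]] = defaultdict(lambda: defaultdict(int))
    for a in activities:
        totals[a.party][a.candidate] += visibility(a)
    return {party: max(scores, key=scores.get) for party, scores in totals.items()}


def most_visible_overall(activities: Iterable[Activity]) -> str:
    """Return the candidate with the highest total score across all parties."""
    totals: Dict[str, int] = defaultdict(int)
    for a in activities:
        totals[a.candidate] += visibility(a)
    return max(totals, key=totals.get)  # raises ValueError on empty input


# Made-up counts for illustration only.
sample = [
    Activity("Party X", "Candidate A", posts=10, comments=40, likes=300, shares=25),
    Activity("Party X", "Candidate B", posts=15, comments=80, likes=500, shares=60),
    Activity("Party Y", "Candidate C", posts=8, comments=20, likes=900, shares=30),
]
by_party = most_visible_by_party(sample)
overall = most_visible_overall(sample)
```

Weighting shares more heavily than likes, or normalizing by audience size, would be equally defensible choices; the equal-weight sum is used here only because it is the simplest.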
Current Work
Propose a juridical profile of content and tools directly and transitively used by candidates and parties
  Ownership, right of use, storage, dissemination
  Consider the different legislation of each country
Implement a data collection process guided by the juridical profile and by temporal issues related to the type of elections analyzed
Propose a data curation process guided by QoS aspects:
  Juridical, temporal, provenance, reputation, geography and characteristics of the official organization of the process
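One possible reading of a collection step "guided by a juridical profile" is sketched below. The JuridicalProfile fields, the curate function and the flag values are hypothetical and chosen only to show the mechanism; they make no statement about actual French or UK law.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class JuridicalProfile:
    """Rights attached to a kind of content in a given country (hypothetical)."""
    country: str
    may_store: bool
    may_disseminate: bool
    owner_must_be_anonymized: bool


@dataclass(frozen=True)
class Item:
    owner: str
    content: str


def curate(item: Item, profile: JuridicalProfile) -> Optional[Item]:
    """Return the item as it may be stored, or None if storage is not allowed."""
    if not profile.may_store:
        return None
    if profile.owner_must_be_anonymized:
        # Keep the content but drop the owner's identity.
        return Item(owner="anonymous", content=item.content)
    return item


# Illustrative flag values only, not legal facts.
open_profile = JuridicalProfile("Country 1", True, True, False)
strict_profile = JuridicalProfile("Country 2", True, False, True)
closed_profile = JuridicalProfile("Country 3", False, False, False)

post = Item(owner="Candidate A", content="Campaign message")
stored_open = curate(post, open_profile)
stored_strict = curate(post, strict_profile)
stored_closed = curate(post, closed_profile)
```

The may_disseminate flag is carried in the profile but not used by curate; it would matter at publication time rather than at storage time.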
Technology is changing the way elections are run. To what extent, and how?
We need to develop analysis tools in a multidisciplinary context to provide a comprehensive picture
?
Geraldine Castel, CEMRA, Université Stendhal, France
Genoveva Vargas-Solar, CNRS, LIG-LAFMIA, France
Javier Espinosa, LAFMIA, UMI 3175 CNRS [email protected]