comparing automated factual claim detection against...
Post on 27-Sep-2020
8 Views
Preview:
TRANSCRIPT
Comparing Automated Factual Claim Detection Against Judgments of Journalism OrganizationsNaeemul Hassan#, Mark Tremayne, Fatma Arslan, Chengkai Li
University of Texas at Arlington 2016 Computation + Journalism Symposium, Sept. 30th, 2016
# The author is now with University of Mississippi
©2015-2016 The University of Texas at Arlington. All Rights Reserved.
Fact-checking at Peak
2
©2015-2016 The University of Texas at Arlington. All Rights Reserved.
Fact-checking the Presidential Debates
3
http://www.politifact.com/
http://www.politifact.com/
©2015-2016 The University of Texas at Arlington. All Rights Reserved.4
©2015-2016 The University of Texas at Arlington. All Rights Reserved.
Closed-caption decoder
Live Coverage of Debates
5
©2015-2016 The University of Texas at Arlington. All Rights Reserved.
Presidential Debate
Transcripts (1960-2012)
Ground Truth
Finding Important Factual Claims: A Supervised Learning Task
Human Annotation Feature
Vectors
Feature Extraction
Learning Algorithm
2016 Presidential Debates
Important factual claims
6
©2015-2016 The University of Texas at Arlington. All Rights Reserved.
Training Dataset o Transcripts of all 30 debates (11 general elections) in history o 20k sentences by candidates: removed very short (< 5 words) sentences o Ground truth collection website
7
©2015-2016 The University of Texas at Arlington. All Rights Reserved.
Experiments 2016 U.S. Presidential Election Primary Debates o Transcripts of 21 debate (12 Republican, 9 Democratic) o 30737 sentences o Collected fact-checks by PolitiFact and CNN o Detected topics of the sentences 12 Most Important Topics (MIP) used by Gallup* For each topic, we manually created a set of keywords representing that
topic. E.g., keywords for topic “Abortion”: {abortion, pregnancy, planned parenthood}
8
* M. E. McCombs and D. L. Shaw. The agenda-setting function of mass media. Public opinion quarterly, 1972 * T. W. Smith. America’s most important problem-a trend analysis. Public opinion quarterly, 1980
©2015-2016 The University of Texas at Arlington. All Rights Reserved.
(1) Sentences selected by Journalism organizations (CNN and PolitiFact) for checking had ClaimBuster scores significantly higher (were more check-worthy) than sentences not selected for checking.
9
©2015-2016 The University of Texas at Arlington. All Rights Reserved.
Distribution of topics over sentences scored low (≤ 0.3) by ClaimBuster
Distribution of topics over sentences scored high (≥ 0.5) by ClaimBuster
10
©2015-2016 The University of Texas at Arlington. All Rights Reserved.
(2) ClaimBuster is similar to Journalism organizations (CNN and PolitiFact) in the distribution of topics among highly scored sentences (≥ 0.5) and fact-checked sentences.
11
©2015-2016 The University of Texas at Arlington. All Rights Reserved.
new factual claims: solicit analyses from professionals; algorithmic fact-checking
o debates o interviews o speeches o political ads o social media o web o news
repository of fact-checks by professionals
matched existing fact-checks: delivered via browser extensions, mobile apps, and smart-TV apps
The Quest to Automate Fact-Checking
claimed detected and ranked by check-worthiness
12
©2015-2016 The University of Texas at Arlington. All Rights Reserved.
One Day Robot Fact-checkers will Dominate
13
o Less subjective
©2015-2016 The University of Texas at Arlington. All Rights Reserved.
One Day Robot Fact-checkers will Dominate
14
o Restless and more efficient
©2015-2016 The University of Texas at Arlington. All Rights Reserved.
One Day Robot Fact-checkers will Dominate
15
o More intelligent
©2015-2016 The University of Texas at Arlington. All Rights Reserved.
Acknowledgment UTA Students o Naeemul Hassan (graduated) o Fatma Arslan o Siddhant Gawsane o Aaditya Kulkarni o Joseph Minumol (graduated) o Vikas Sable o Gensheng Zhang o Many more students who are
currently working on it
Collaborators o Bill Adair (Duke) o Pankaj Agarwal (Duke) o Peter Fray (UTS) o James Hamilton (Stanford) o Mark Tremayne (UTA) o Jun Yang (Duke) o Cong Yu (Google Research)
16
©2015-2016 The University of Texas at Arlington. All Rights Reserved.
Acknowledgment Funding sponsors
Disclaimer: This material is based upon work partially supported by the National Science Foundation Grants 1117369, 1408928 and 1565699 and a Knight Prototype Fund from the Knight Foundation. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the funding agencies.
17
©2015-2016 The University of Texas at Arlington. All Rights Reserved.
Thank You! Questions? o http://ranger.uta.edu/~cli cli@uta.edu o Twitter @ClaimBusterTM o Prototype http://idir.uta.edu/claimbuster o You are invited to contribute http://bit.ly/claimbusters
18
top related