Identifying Single and Stacked News Triangles
in Online News Articles
- an Analysis of 31 Danish Online News Articles Annotated by 68 Journalists
By Miklas [email protected]
Master Thesis Project, 15 ECTS, DA613A, Spring 2015
Supervisor: Daniel Spikol Examiner: Bengt Nilsson
Link to data: http://figshare.com/account/projects/4414 and http://plot.ly (see thesis for direct links)
Presentation Outline
• Introduction
• Research Questions
• Methodology (What was the set-up)
• Results
• Identifying the presence of Stacked News Triangles
• Named Entities' influence on the presence of series of Stacked News Triangles
• Named Entity variance per Category
• Conclusion and Summary
Intro: The Problem
https://www.flickr.com/photos/boston_public_library/6801377949/
https://www.flickr.com/photos/greenwood100/7314522860/
Intro: The News Triangle
“News stories should flow logically from the first paragraph. […] One way of looking at it is through the News Triangle or inverted pyramid. Generations of journalists have been brought up on this.” - Sissons
Intro: Stacked News Triangles
News Triangle for print news
News Triangles for online news
Headline + Sub-headline Intro Body element 1…
Research Questions
RQ 1: To what extent do online news articles follow the idiom of many News Triangles, instead of only one News Triangle, where information is concentrated at the beginning of the text? I.e. do the keyword candidates appear less frequently the further we move away from the start of each element block?
RQ 2: Given that much news concerns something that happened to someone somewhere, what influence do Named Entity keywords have on the presence of News Triangles?
RQ 3: Is there a distinct variance of Named Entity Type keywords (Persons, Places or Organisations) within the categories Culture, Domestic, Economy, and Sports?
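The test behind RQ 1 can be sketched in a few lines: split an element block into partitions, count keyword hits per partition, and look at the sign of a linear fit through those counts. This is a minimal sketch under assumptions, not the thesis implementation: the function names, the default of four partitions, and the single-token matching are all hypothetical (multi-word keywords such as "dansk svømmeunion" would need phrase matching).

```python
def partition_counts(tokens, keywords, n_parts=4):
    """Count keyword hits in roughly n_parts equal slices of a token list."""
    size = max(1, -(-len(tokens) // n_parts))  # ceiling division
    counts = []
    for start in range(0, len(tokens), size):
        chunk = tokens[start:start + size]
        counts.append(sum(1 for t in chunk if t in keywords))
    return counts


def slope(ys):
    """Least-squares slope of ys against the positions 0, 1, 2, ..."""
    n = len(ys)
    if n < 2:
        return 0.0
    mean_x = (n - 1) / 2
    mean_y = sum(ys) / n
    num = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(ys))
    den = sum((x - mean_x) ** 2 for x in range(n))
    return num / den


def looks_like_triangle(tokens, keywords, n_parts=4):
    """True when keyword frequency descends from the start of the block."""
    return slope(partition_counts(tokens, keywords, n_parts)) < 0


# Toy block: "k" stands for any keyword hit, "x" for any other token.
block = ["k", "k", "k", "k", "k", "x", "k", "x", "x", "x", "x", "x"]
print(partition_counts(block, {"k"}))     # [3, 2, 1, 0]
print(looks_like_triangle(block, {"k"}))  # True
```

A descending fit (negative slope) is read here as a News Triangle for that block; applying the same check to every element block is what would reveal a stack of triangles.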
Methodology - Collecting Annotations
68 journalists
8 articles each
Set of 31 articles
Categories: Culture (7), Domestic (14), Economy (6), Sports (4)
Methodology - Processing Annotations
Keyword Annotations per Category: Culture (7 articles), Domestic (14 articles), Economy (6 articles), Sports (4 articles).
Category and Annotation Type   Keywords per category   Avg. keywords per article collectively   Unique keywords per category   Avg. unique keywords per article collectively
Culture Keywords               993                     141.86                                   365                            52.14
Domestic Keywords              2510                    179.29                                   703                            50.21
Economy Keywords               755                     125.83                                   268                            44.67
Sports Keywords                691                     172.75                                   176                            44
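The per-article averages in this table follow directly from the category totals divided by the number of articles in each category. A short check, with the totals copied from the table and the function name chosen only for illustration:

```python
# (articles, keywords, unique keywords) per category, copied from the table.
totals = {
    "Culture": (7, 993, 365),
    "Domestic": (14, 2510, 703),
    "Economy": (6, 755, 268),
    "Sports": (4, 691, 176),
}


def per_article_averages(table):
    """Return (avg keywords, avg unique keywords) per article for each category."""
    return {
        category: (round(keywords / n, 2), round(unique / n, 2))
        for category, (n, keywords, unique) in table.items()
    }


averages = per_article_averages(totals)
for category, (avg_kw, avg_unique) in averages.items():
    print(category, avg_kw, avg_unique)
```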
Averages and Five Number Summary of Article Annotations
Category   Average   Min   Max   Median   3rd quartile   1st quartile
Culture    15.57     13    19    16       16             16
Domestic   18.71     12    39    15.5     16             15
Economy    14.83     13    18    14.5     15             14
Sports     18.75     12    37    13       13             13
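A five-number summary of this kind can be computed with the Python standard library. The sketch below uses hypothetical annotation counts, not the thesis data: they were chosen only so that the average, minimum, maximum and median agree with the Sports row. Quartile values depend on the interpolation convention, so the quartiles printed here need not match the slide.

```python
import statistics


def five_number_summary(values):
    """Min, first quartile, median, third quartile and max of a list of numbers."""
    # "inclusive" treats the data as the whole population; other conventions
    # give slightly different quartiles on small samples.
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return {"min": min(values), "q1": q1, "median": median, "q3": q3, "max": max(values)}


# Hypothetical number of annotations for each of four articles.
counts = [12, 13, 13, 37]
summary = five_number_summary(counts)
average = statistics.mean(counts)
print(average, summary)
```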
Methodology - Processing Annotations
Article 22 - Keywords - 3rd quartile
KEYWORD COUNT
dansk svømmeunion 13
svømning 9
trygfonden 9
drukneulykker 6
svømmeundervisning 6
svømme 4
børn 4
tobias marling 4
rené højer 3
druknestatistik 2
statistik 2
folkeskolen 2
drukneulykke 2
skoler 2
yougov 2
skole 2
yougov-undersøgelse 2
Article 4 - Keywords - 3rd quartile
KEYWORD COUNT
odsherred 25
tutoring 21
nordskolen 17
cooperative learning 15
manu sareen 14
undervisningsmetoder 12
folkeskole 11
undervisningsministeriet 11
folkeskolen 11
specialundervisning 10
undervisning 10
ppr 10
peter holm 9
forældre 9
birgitte henriksen 9
niveaudeling 9
elever 8
kommuner 8
lektier 8
makkerlæsning 7
trelærerordning 6
matematik 6
nordvestsjælland 6
undervisningsdifferentiering 6
lektiehjælp 6
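Keyword tables like the two above can be produced by merging the keyword lists of all annotators of an article into one frequency count. The sketch below is an assumption about that step, not the thesis pipeline: the lowercasing and whitespace trimming, the function name, and the three example annotations are all hypothetical.

```python
from collections import Counter


def tally_keywords(annotations):
    """Merge keyword lists from several annotators into one frequency table."""
    counts = Counter()
    for keywords in annotations:
        # Normalise case and surrounding whitespace so variants count together.
        counts.update(k.strip().lower() for k in keywords if k.strip())
    return counts


# Hypothetical annotations from three journalists for one article.
annotations = [
    ["Svømning", "TrygFonden", "børn"],
    ["svømning", "trygfonden "],
    ["svømning", "skole"],
]
table = tally_keywords(annotations)
for keyword, count in table.most_common():
    print(keyword, count)
```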
Results - Keyword Distribution - 1
News Triangles for online news
Headline + Sub-headline Intro Body element 1…
Results - Keyword Distribution - 2
Category   Article ID   Overall News Triangle   Intro   Section 1   Section 2   Section 3   Section 4   Section 5   How Many Have Stacked News Triangles?
Culture    Article 16   1                       0                                                                   3 of 7
           Article 17   1                       1       0           1
           Article 18   1                       1
           Article 22   1                       0
           Article 23   1                       1
           Article 27   1                       0       1           0           1           1
           Article 28   1                       1       1
Domestic   Article 03   1                       1       1                                                           7 of 14
           Article 04   1                       1       1           1           1
           Article 05   1                       1       1
           Article 07   1                       0       1           1
           Article 08   1                       1
           Article 13   1                       1       1           1           1           1
           Article 19   1                       1       1
           Article 20   1                       1       1           0
           Article 21   1                       1       0           0           1           0           0
           Article 24   1                       0       1           1           1           0
           Article 25   1                       0       1           0           1
           Article 26   1                       1
           Article 29   1                       1       1           0
           Article 30   1                       0
Economy    Article 01   1                       1       0           1           1                                   2 of 5
           Article 02   1                       0       0           1           1
           Article 06   1                       1       0           1           0
           Article 09   1                       1       1
           Article 10   1                       1       1           1           1
           Article 11   1                       1       1           0           1
Sports     Article 12   1                       1                                                                   2 of 4
           Article 14   1                       0
           Article 15   1                       1       1           1
           Article 31   0                       1       0           1           1
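One way to read the table, and this is an assumption rather than a rule stated on the slide, is that an article counts as having Stacked News Triangles when every flag listed for it is 1. The sketch below applies that reading to the flags copied from the table. It counts the rows as listed, so Economy is reported out of the six articles in the table, whereas the slide gives "2 of 5".

```python
# Flags per article, copied from the table: overall triangle, intro, sections.
flags = {
    "Culture": {
        "16": [1, 0], "17": [1, 1, 0, 1], "18": [1, 1], "22": [1, 0],
        "23": [1, 1], "27": [1, 0, 1, 0, 1, 1], "28": [1, 1, 1],
    },
    "Domestic": {
        "03": [1, 1, 1], "04": [1, 1, 1, 1, 1], "05": [1, 1, 1],
        "07": [1, 0, 1, 1], "08": [1, 1], "13": [1, 1, 1, 1, 1, 1],
        "19": [1, 1, 1], "20": [1, 1, 1, 0], "21": [1, 1, 0, 0, 1, 0, 0],
        "24": [1, 0, 1, 1, 1, 0], "25": [1, 0, 1, 0, 1], "26": [1, 1],
        "29": [1, 1, 1, 0], "30": [1, 0],
    },
    "Economy": {
        "01": [1, 1, 0, 1, 1], "02": [1, 0, 0, 1, 1], "06": [1, 1, 0, 1, 0],
        "09": [1, 1, 1], "10": [1, 1, 1, 1, 1], "11": [1, 1, 1, 0, 1],
    },
    "Sports": {
        "12": [1, 1], "14": [1, 0], "15": [1, 1, 1, 1], "31": [0, 1, 0, 1, 1],
    },
}


def stacked_counts(table):
    """Per category: (articles whose flags are all 1, articles listed)."""
    return {
        category: (sum(1 for f in articles.values() if all(f)), len(articles))
        for category, articles in table.items()
    }


result = stacked_counts(flags)
for category, (stacked, total) in result.items():
    print(f"{category}: {stacked} of {total}")
```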
Summary/Discussion
RQ 1: To what extent do online news articles follow the idiom of many News Triangles, instead of only one News Triangle, where information is concentrated at the beginning of the text? I.e. do the keyword candidates appear less frequently the further we move away from the start of each element block?
RQ 2: Given that much news concerns something that happened to someone somewhere, what influence do Named Entity keywords have on the presence of News Triangles?
RQ 3: Is there a distinct variance of Named Entity Type keywords (Persons, Places or Organisations) within the categories Culture, Domestic, Economy, and Sports?
Category How Many Articles Have Stacked News Triangles?
Culture 3 of 7 have Stacked News Triangles
Domestic 7 of 14 have Stacked News Triangles
Economy 2 of 5 have Stacked News Triangles
Sports 2 of 4 have Stacked News Triangles
FUTURE WORK
• Looking much closer at what causes the ascent or descent of the linear fit
• Whether a smaller or larger set of keywords is better
• A larger set of articles
• Named Entities in taxonomies
Summary/Discussion
• Including more keywords?
• Removing stop words?
• Decreasing or increasing the number of partitions of the text?