The Data Journalism Handbook v0.1

Upload: nicola-hughes

Post on 27-Apr-2015

DESCRIPTION

A collaboration between all those interested in the future of news

TRANSCRIPT

Page 1: The Data Journalism Handbook v0.1

The Data Journalism Handbook

Version 0.1

Contributors

Contributors to this book include:

David Banisar, Article 19
Caelainn Barr, EU Data Journalist
Mariano Blejman, Hacks/Hackers
Marianne Bouchart, Data Journalism Blog
Liliana Bounegru, European Journalism Centre
Brian Boyer, Chicago Tribune
Jane Park, Creative Commons
Paul Bradshaw, City University London
Lucy Chambers, Open Knowledge Foundation
Helen Darbishire, Access Info Europe
Steve Doig, Cronkite School of Journalism
David Erwin, New York Times
Lisa Evans, Guardian Datablog
Tom Fries, Bertelsmann Stiftung
Duncan Geere, Wired.co.uk
Rich Gordon, Northwestern University
Jonathan Gray, Open Knowledge Foundation
Ted Han, DocumentCloud
Kate Hudson, Open Journalism
Francis Irving, ScraperWiki
Lizzie Jackson, Ravensbourne College
Nicolas Kayser-Bril, Data Journalist
John Keefe, New York Public Radio
Friedrich Lindenberg, Open Knowledge Foundation
Lorenz Matzat, OpenDataCity
Aidan McGuire, ScraperWiki
Cynthia O'Murchu, Financial Times
Aron Pilhofer, New York Times
Anthony Reuben, BBC
Simon Rogers, Guardian Datablog
Amanda Rossi, freelance journalist
Fabrizio Scrollini, London School of Economics
Adam Thomas, Source Fabric
Sascha Venohr, Zeit Online
Jerry Vermanen, De Stentor
César Viana, Estacio de Sa University
Farida Vis, University of Leicester
Lulu Pinney, Infographic design (Telling Information)

This work is licensed under a Creative Commons Attribution Sharealike license.

Table of contents

The Data Journalism Handbook
Contributors
Table of contents
0. Preface
0.1 The purpose of this book
0.2 Add to this book
0.3 Share this book
1. Introduction
1.1 What is data journalism?
1.2 Why is it important?
1.3 How is it done?
1.4 Examples, case studies and interviews
1.4.1 Data powered stories
1.4.2 Data served with stories
1.4.3 Data driven applications
1.5 Making the case for data journalism
1.5.1 Measuring impact
1.5.2 Sustainability and business models
2. Getting data
2.1 Where does data live?
2.1.1 Open data
2.1.2 Social data services
2.1.3 Research data
2.2 Asking for data
2.2.1 Freedom of Information laws
2.2.2 Helpful public servants
2.3 Getting your own data
2.3.1 Scraping data
2.3.2 Crowdsourcing data
3. Understanding data
3.1 Data literacy
3.2 Working with data
3.3 Tools for analysing data
3.4 Annotating data
4. Delivering data
4.1 From datasets to stories
4.2 Publishing data
4.3 Visualising data
4.4 Data driven applications
4.5 Engagement, outreach and community
5. Appendix
5.1 Further resources

Notes:
First draft deadline: Sunday, November 6th, 17.00 GMT (Please inform us if you finish your contribution earlier so we can start editing it)
Project hashtag: #ddjbook
Project URL:

0. Preface

0.1 The purpose of this book

Overview: Explain what this book does and doesn’t aim to do
Authors: Jonathan Gray, Liliana Bounegru
Length: 0.5-1 page

0.2 Add to this book

Overview: Explain how to contribute to future versions of this book
Authors: Jonathan Gray
Length: 0.5 page

0.3 Share this book

Overview: Encourage people to share this book
Authors: Jonathan Gray
Length: 0.5 page

1. Introduction

1.1 What is data journalism?

Overview: Define and describe data journalism and how it is different from other forms of journalism.
Authors: Paul Bradshaw, Jonathan Gray, [Heather Brooke], [Simon Rogers], [Nicolas Kayser-Bril], [Richard Gordon]
Length: 1-2 pages (with quotes from different people)

UPDATE: input from Paul Bradshaw, Jonathan Gray
STILL NEED: Snappy quotes from different people on what data journalism is, and what it isn’t.
EDITOR: Liliana

1.2 Why is it important?

Overview: Put data journalism into context and explain why it matters and what potential it has.

Authors: Tom Fries, [Paul Bradshaw], [Jonathan Gray], [Heather Brooke], [Simon Rogers], [Nicolas Kayser-Bril], [Richard Gordon]

Length: 1 page (with quotes)

UPDATE: input from Tom Fries and Nicolas Kayser-Bril
STILL NEED: Snappy quotes from different people on why data journalism is important.
EDITOR: Liliana

1.3 How is it done?

Overview: Explain different ways of doing data journalism (e.g. journalists who can code vs coders for hire, off the shelf tools vs. custom web applications, in house graphics departments vs hired data visualisation experts, etc). Give examples of how it is being done in different newsrooms.

Authors: Lucy Chambers, [Aron Pilhofer], [Simon Rogers], [Anthony Reuben], [Cynthia O'Murchu], [Sascha Venohr], [Caelainn Barr]

Length: 2-3 pages (with examples and quotes)

UPDATE: input from Zeit Online, notes from the Guardian and Chicago Tribune
STILL NEED: More case studies, quotes and examples. In particular get input from BBC, Chicago Tribune, FT, Guardian and NYT. And talk about how to find developers, designers and issue experts.
EDITOR: Liliana

1.4 Examples, case studies and interviews

1.4.1 Data powered stories

● Overview: Give and describe successful examples of data powered stories you worked on. Describe how you produced these stories. The aim is to give journalists and decision-makers in newsrooms who might be interested in data journalism a sense of what the potential of data powered stories is and how they could go about producing them.

○ What data did you use and how did you obtain it?
○ What prompted you to start this project?
○ What did the project aim to achieve?
○ How long did you work on the project?
○ How many people worked on it?
○ What was the cost of the project?
○ What were the skills necessary for this project? (domain knowledge, coding, research, visualisation, etc.)
○ What is the role of datasets in these stories? (e.g.: give rise to new stories, enrich stories, contextualize stories, help journalists explore topics in new ways, etc.)
○ What was your approach? (exploratory vs. hypothesis approach)
○ What techniques and tools did you use?
○ How did you present the data powered story?
○ What is the potential of data powered stories?
○ Why should journalists/newsrooms be interested in producing such projects?
○ What were the challenges in producing these stories?
○ What tips and advice would you give to journalists who want to work on similar projects?
○ Please include relevant links, videos and images.

● Authors: Caelainn Barr, James Ball, Sascha Venohr, [Anthony Reuben], Cynthia O'Murchu, [Heather Brooke]

● Length: 1.5-3 pages per example

UPDATE: Zeit Online
STILL NEED: More case studies - e.g. from Amanda on Brazilian citizen journalists, from Chicago Tribune, data journalism on the radio, Guardian (Lisa or James).
EDITOR: Lucy/Kat

1.4.2 Data served with stories

● Overview: Give and describe successful examples of data served with stories you worked on. Describe how you produced these projects. The aim is to give journalists and decision-makers in newsrooms who might be interested in data journalism a sense of what the potential of data served with stories is and how they could go about producing them.

○ What data did you use and how did you obtain it?
○ What prompted you to start this project?
○ What did the project aim to achieve?
○ How long did you work on the project?
○ How many people worked on it?
○ What was the cost of the project?
○ What were the skills necessary for this project? (domain knowledge, coding, research, visualisation, etc.)
○ What is the role of datasets in these stories? (e.g.: provide additional context or insight, etc.)
○ What was your approach? (exploratory vs. hypothesis approach)
○ What techniques and tools did you use?
○ How did you present the story and the data served with it?
○ What is the potential of such projects?
○ Why should journalists/newsrooms be interested in producing such projects?
○ What were the challenges in producing these projects?
○ What tips and advice would you give to journalists who want to work on similar projects?
○ Include relevant links, videos and images.

● Authors: Caelainn Barr, James Ball, Sascha Venohr, [Anthony Reuben], [Cynthia O'Murchu], [Heather Brooke]

Length: 1.5-3 pages per example

UPDATE: needs doing!
STILL NEED: Guardian, BBC, … Who else serves data with stories?
EDITOR: Lucy/Kat

1.4.3 Data driven applications

● Overview: Give and describe successful examples of data driven applications you worked on. Describe how you produced these applications. The aim is to give journalists and decision-makers in newsrooms who might be interested in data journalism a sense of what the potential of data driven applications is and how they could go about producing them.

○ What data did you use and how did you obtain it?
○ What prompted you to start this project?
○ What did the project aim to achieve?
○ How long did you work on the project?
○ How many people worked on it?
○ What was the cost of the project?
○ What were the skills necessary for this project? (domain knowledge, coding, research, visualisation, etc.)
○ What was your approach?
○ What techniques and tools did you use?
○ How did you present the outcome?
○ What is the potential of such projects?
○ Why should journalists/newsrooms be interested in producing such projects?
○ What were the challenges in producing these projects?
○ What tips and advice would you give to journalists who want to work on similar projects?
○ Include relevant links, videos and images.

Authors: Aron Pilhofer, Marcus Bösch
Length: 1.5-3 pages per example

UPDATE: needs doing!
STILL NEED: Guardian, NYT, BBC, …
EDITOR: Lucy/Kat

1.5 Making the case for data journalism

1.5.1 Measuring impact

Overview: Give overview of the potential of data journalism (e.g. engaging with new audiences, the future of journalism on the web) and how it could be measured. Include results of EJC survey on training needs for data journalism

Authors: Liliana Bounegru, [Lorenz Matzat]
Length: 1 page

1.5.2 Sustainability and business models

Overview: Discuss costs, sustainability and business models for data journalism. Provide successful and less successful examples and explain what lessons can be learned from them.

Authors: Lorenz Matzat
Length: 1-2 pages

UPDATE: 1.5 still needs doing!
STILL NEED: input from Guardian, Deutsche Welle, Zeit Online, NYT, etc.
EDITOR: Liliana

2. Getting data

2.1 Where does data live?

2.1.1 Open data

Overview: An overview of open data sources, what they contain, how to find them, how to search them, examples of open data being used by journalists
Authors: Jonathan Gray, Brian Boyer
Length: 1-3 pages (with links and examples)

2.1.2 Social data services

Overview: An overview of community driven websites which aim to help you find the data you need - such as GetTheData.org and TheDataHub.org - and their function in enabling collaboration around datasets

Authors: Jonathan Gray
Length: 0.5-1 page (with links and examples)

2.1.3 Research data

Overview: An overview of sites to find research data
Authors:
Length: 0.5-1 page (with links and examples)

UPDATE: Great input and notes from Brian Boyer/Chicago Tribune, Jane Park/Creative Commons, John Keefe/WNYC, Chrys Wu/HacksHackers.
STILL NEED: Needs to be written up and expanded.
EDITOR: Friedrich

2.2 Asking for data

2.2.1 Freedom of Information laws

Overview: An overview of FOI legislation, an example of making an FOI request, information on resources in this area, how to get help from FOI experts

Authors: Helen Darbishire (Access Info), Fabrizio Scrollini (London School of Economics)

Length: 1-3 pages (with links and examples)

2.2.2 Helpful public servants

Overview: How talking directly with public servants or engaging with official open data initiatives might help you to find the data you need
Authors: [Jonathan Gray]
Length: 0.5-1 page (with links and examples)

UPDATE: First draft almost done.
STILL NEED: Editing and peer-review.
EDITOR: Liliana/Friedrich

2.3 Getting your own data

2.3.1 Scraping data

Overview: Explaining basic idea of web scraping, why this can be necessary, examples of how this has been used by journalists and guide for absolute beginners on how it can be done based on an interesting case study

Authors: Francis Irving, Aidan McGuire, [Friedrich Lindenberg]
Length: 2-3 pages (with links, examples, and a basic tutorial)

UPDATE: Input from Friedrich Lindenberg, Federica Cocco, Glenn McMahon and Francis Irving.
STILL NEED: Needs to be written up and expanded.
EDITOR: Friedrich

2.3.2 Crowdsourcing data

Overview: Explaining basic idea of crowdsourcing data, how various projects have used this, and how to do this (e.g. using Google Spreadsheets, forms, maps, Twitter hashtags, etc)

Authors: [Simon Rogers], [Lisa Evans]
Length: 1-3 pages (with links and examples)

UPDATE: Input from Marianne Bouchart and others (not in the Google doc yet), Guardian (notes)
STILL NEED: Nicolas Kayser-Bril (water data) and other examples
EDITOR: Liliana/Friedrich

3. Understanding data

3.1 Data literacy

Overview: Explaining data literacy and its importance (including statistical/numerical literacy, use of mathematics, technical literacy, etc)

Authors: James Ball, Nicolas Kayser-Bril, Richard Gordon

Length: 1-3 pages

UPDATE: input from Lisa Evans, Richard Gordon, Lizzie Jackson, Amanda Rossi, JV Chamary, Fabrizio Scrollini
STILL NEED: Input from Nicolas Kayser-Bril, and quotes from Lisa Evans, Amanda on verifying data, citizen journalism, etc.
EDITOR: Liliana

3.2 Working with data

Overview: What you need to work with datasets: background knowledge, technical ability, etc. (case study approach with lessons learned from each project presented)

Authors: James Ball, Steve Doig
Length: 1-2 pages per case study

UPDATE: Input from Claire Miller and Steve Doig
STILL NEED: Further input and ideas
EDITOR: Liliana

3.3 Tools for analysing data

Overview: Overview of different types of tools for analysing and working with datasets, examples of how they can be used, examples of how they have been used by journalists.

Authors: [Nicola Hughes], [Lisa Evans], [Friedrich Lindenberg], [Nicolas Kayser-Bril]

Length: 1-2 pages per case study

UPDATE: Needs doing!
STILL NEED: Input from Friedrich.
EDITOR: Friedrich

3.4 Harnessing external expertise

Overview: How to enable people to annotate and comment on datasets

Authors: [Aron Pilhofer]
Length: 1 page

UPDATE: Needs doing!
STILL NEED: Input from Guardian, OWNI, NYT?
EDITOR: Liliana

4. Delivering data

4.1 From datasets to stories

Overview: Explaining how to find stories in datasets (various approaches), including examples and case studies. Also looking at the broader role of data journalists in the newsroom, how they work with other journalists, etc.

● Authors: Caelainn Barr, [Cynthia O'Murchu], [Heather Brooke], [Lisa Evans], [Sascha Venohr]

Length: 0.5-1 page per approach/case study

UPDATE: Some material
STILL NEED: Expanding and editing
EDITOR: Jonathan

4.2 Publishing data

Overview: Overview of ways to publish data, including examples. Embedding data, raw data (formats), live data, updating data, APIs. Who is your data for? Also a section on knowing the law, ethics and privacy, and open licensing.
Authors:
Length: 1-2 pages

UPDATE: Needs doing!
STILL NEED:
EDITOR: Jonathan

4.3 Visualising data

Overview: How to visualise data - off the shelf tools and custom visualisations with step by step guides demonstrated on an example

● Authors: [Lulu Pinney], [Alastair Dant]

Length: 1-2 pages per case study

UPDATE: Good start!
STILL NEED: Needs expanding and editing, and more examples.
EDITOR: Jonathan

4.4 Data driven applications

Overview: Step by step guide, tips and tricks for how newsrooms can produce data driven applications

○ What are the resources (skills, costs, etc.) needed?
○ What are the steps to take when you want to build a data driven application?
○ What useful lessons did you learn from your own experience?
○ Why should newsrooms be interested in producing data driven applications?
○ What is the potential of such projects?

Authors: Aron Pilhofer
Length: 2-3 pages (including examples)

UPDATE: Needs doing! Aron?
STILL NEED: Ideas on how to get started, design process, etc.
EDITOR: Jonathan

4.5 Engagement, outreach and community

Overview: Knowing your audience (and pitching appropriately), dissemination and outreach, social media, building community, engaging with existing communities (designers, developers, etc).

Authors:
Length: 1-2 pages

UPDATE: Duncan (Wired) working on it now. Needs more input.
EDITOR: Jonathan

5. Appendix

5.1 Further resources

Overview: Lists of links, resources, examples and other bits and pieces that don’t fit in the handbook

Authors: Everyone!
Length: 5 pages

UPDATE: Needs doing!
STILL NEED: Lots of ideas from everyone.
EDITOR: Jonathan