A crash course in data for information graphics
DESCRIPTION
An overview of using government data for information graphics, from locating the data to visualizing it with Web 2.0 tools and desktop software.
TRANSCRIPT
David Herzog, Missouri School of Journalism and NICAR
Locating the data
Obtaining the data
Evaluating the data
Working with the data
Visualizing the data
“Database state of mind”
Data has to exist. Where? Online or offline.
Government websites: Data.gov, U.S. Census Bureau, FDIC, Missouri Data Portal, Missouri Accountability Portal
U.S. agency FOIA pages: Drug Enforcement Administration
NGO sites: Right-to-Know Network, OpenMissouri.org, NICAR database library, ALA state agency databases wiki
Commercial services: Socrata, Infochimps, Geocommons, Foreclosure Radar, Oil Price Information Service, Search Systems, Junar
Academic data catalogs: ICPSR
Forms: Forms.gov, Web forms (for example, Columbia parade permits)
Records retention schedules
Reports: State auditor, U.S. Government Accountability Office, U.S. Inspectors General
Google advanced search: look for data files, look for key words, look only on government sites
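The three search tips can be combined into one query. The sketch below is only an illustration: it assumes Google's `filetype:` and `site:` search operators, and the function name `build_data_search` and the example key words are hypothetical, not part of the original slides.

```python
from urllib.parse import urlencode


def build_data_search(keywords, filetype="xls", site=".gov"):
    """Combine key words with a file-type and a site restriction.

    Assumes the search engine understands the `filetype:` and `site:`
    operators, as Google's advanced search does.
    """
    query = f"{keywords} filetype:{filetype} site:{site}"
    return "https://www.google.com/search?" + urlencode({"q": query})


# Hypothetical example: spreadsheets about dam inspections on government sites
print(build_data_search("dam inspections"))
```

Swapping `filetype` for `csv` or `site` for a single agency's domain narrows the search further.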
Data entry: in the field, at the office
Printouts/reports
Inspection forms
Download it
Write or request a scraper with ScraperWiki
Convert a PDF with CometDocs or Zamzar
Just ask for it
Make an open-records request
U.S. Freedom of Information Act: passed in 1966, amended in 1996 to include electronic records
State open-records statutes: Missouri Sunshine Law
Get the roadmap! Record layout, file layout, data dictionary, code sheet
Metadata: data about the data
Look at it immediately when you get it. Is it what you asked for/expected? How many rows/records of data? Is the file format OK?
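A first look at a delimited text file can be scripted. This is a minimal sketch, assuming the data arrived as comma-separated text with a header row; the function name `quick_look` and the sample records are made up for illustration.

```python
import csv


def quick_look(lines):
    """Return the header row and the number of data records.

    `lines` is any iterable of text lines, such as an open file.
    """
    reader = csv.reader(lines)
    header = next(reader, [])  # first row, or [] if the file is empty
    record_count = sum(1 for row in reader if row)  # skip blank lines
    return header, record_count


# Hypothetical sample standing in for a delivered file
sample = [
    "county,year,permits",
    "Boone,2010,41",
    "Cole,2010,17",
]
header, record_count = quick_look(sample)
print(header)        # ['county', 'year', 'permits']
print(record_count)  # 2
```

For a real file, pass `open(path, newline="")` instead of the sample list, then compare the header and record count with the record layout and with what the agency said it sent.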
Does it look too good to be true?
Beware of missing information
Who collected the information?
How? What are their methods?
Why? What is their agenda?
Who supports them financially or otherwise?
Notepad++ for PCs
TextMate for Mac
Always keep original file
Never overwrite data columns
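Both rules can be followed in a script as well as in a spreadsheet. The sketch below is one possible approach, not a prescribed workflow: the helper names `make_working_copy` and `add_clean_column`, the column names, and the sample values are all hypothetical.

```python
import shutil


def make_working_copy(original_path, working_path):
    """Copy the original file so that all edits happen on the copy."""
    shutil.copy2(original_path, working_path)
    return working_path


def add_clean_column(rows, source, target):
    """Return new rows with a cleaned value in `target`.

    The `source` column is kept exactly as it arrived.
    """
    if rows and target in rows[0]:
        raise ValueError(f"column {target!r} already exists")
    return [{**row, target: row[source].strip().title()} for row in rows]


# Hypothetical records with messy city names
rows = [{"city": "  COLUMBIA "}, {"city": "jefferson city"}]
cleaned = add_clean_column(rows, "city", "city_clean")
print(cleaned[0])  # {'city': '  COLUMBIA ', 'city_clean': 'Columbia'}
```

Because the cleaned value goes into a new column, any cleaning mistake can be checked against the untouched original value in the same row.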
Tools: spreadsheets, database managers, Google Refine, programming languages
Raw numbers, without context, rarely are interesting.
Ask: Compared to what?
Raw (amount) change: New - Original
Percent change: Change / Original
Per capita rates: per person, per x people
Percent of total: Individual / Total
Ratio: apples / oranges
Averages: mean, median
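The comparisons above can be written as a few small functions. This is a sketch with made-up numbers; the function names are illustrative, and the percentages are multiplied by 100 so they read as percents rather than fractions.

```python
from statistics import mean, median


def raw_change(original, new):
    return new - original


def percent_change(original, new):
    return (new - original) * 100 / original


def per_capita_rate(count, population, per=1):
    """Rate per `per` people, for example per=100000."""
    return count * per / population


def percent_of_total(individual, total):
    return individual * 100 / total


def ratio(apples, oranges):
    return apples / oranges


# Hypothetical figures, for illustration only
print(raw_change(200, 250))                  # 50
print(percent_change(200, 250))              # 25.0
print(per_capita_rate(30, 120000, 100000))   # 25.0
print(percent_of_total(50, 250))             # 20.0
print(ratio(30, 10))                         # 3.0

# One extreme value pulls the mean far from the median
salaries = [10, 20, 30, 40, 400]
print(mean(salaries), median(salaries))      # 100 30
```

The last line shows why it helps to report both averages: a single outlier moves the mean to 100 while the median stays at 30.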
Be curious!
Cut out small slices
Spreadsheets for simple math and comparisons
Spreadsheets for pivot tables
Database managers for more robust analysis
Always ask: Is this correct?
Online software platforms
Desktop software
Contact David Herzog at [email protected]
Twitter: @davidherzog