Best Practices for Publishing Data

DESCRIPTION

A presentation given by Hjalmar Gislason, founder and CEO of DataMarket (http://datamarket.com/) at Strata Conference in London, October 2012

TRANSCRIPT

Page 1: Best Practices for Publishing Data

FIND AND UNDERSTAND DATA

October 2012
Hjalmar Gislason, founder & CEO - [email protected]

Best Practices for Publishing Data

Page 2: Best Practices for Publishing Data

Hjalmar Gislason
Founder and CEO

Twitter: @datamarket
Slides: http://blog.datamarket.com/

Page 4: Best Practices for Publishing Data
Page 5: Best Practices for Publishing Data

Heavy Data Consumers

Providers of Data Delivery Technology

Page 6: Best Practices for Publishing Data

| BEST PRACTICES for PUBLISHING DATA | Hjalmar Gislason, [email protected] | October 2012

Computers

• Structure

Humans

• Search
• Visualization
• Download

Page 8: Best Practices for Publishing Data

Publishing for Computers

1. Simple formats
2. Indexes, unique IDs and meta-data
3. FAQs and feedback channels

Page 9: Best Practices for Publishing Data

"Don't anthropomorphize computers - they hate it."

- Unknown

Page 10: Best Practices for Publishing Data

Simple Formats

Page 11: Best Practices for Publishing Data

Simple Formats: Tim Berners-Lee’s Five Stars

Page 12: Best Practices for Publishing Data

Simple formats: You lost me at “Semantics”

Page 13: Best Practices for Publishing Data

Standards will emerge and there will be more and more of them

• RDF
• OData vs. GData
• DSPL

Page 15: Best Practices for Publishing Data

Indexes, unique IDs and meta-data

• Must: Unique ID, Title, Last updated
• Should: Meta-data

• Why?
  • No need for scraping
  • Less load on your end
  • Ensures full coverage
  • Ensures content removal and updates
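
The "Why?" points above can be illustrated with a minimal sketch. It assumes a publisher exposes an index with one entry per dataset carrying the three "must" fields; the field names, dataset IDs, and the `diff_index` helper are hypothetical and not taken from the talk. A consumer compares its local copy with the published index and learns what was added, updated, or removed without scraping anything.

```python
from datetime import date

# Hypothetical published index: one entry per dataset, with the three
# "must" fields from the slide (unique ID, title, last updated).
published_index = [
    {"id": "ds-001", "title": "Population by region", "last_updated": date(2012, 9, 30)},
    {"id": "ds-002", "title": "Unemployment rate", "last_updated": date(2012, 10, 1)},
]

# What a consumer fetched on an earlier visit (also hypothetical).
local_copy = [
    {"id": "ds-001", "title": "Population by region", "last_updated": date(2012, 6, 30)},
    {"id": "ds-003", "title": "Retired dataset", "last_updated": date(2011, 1, 1)},
]


def diff_index(local, remote):
    """Compare a local copy of the index with the published one, keyed by ID."""
    local_by_id = {entry["id"]: entry for entry in local}
    remote_by_id = {entry["id"]: entry for entry in remote}
    shared_ids = remote_by_id.keys() & local_by_id.keys()
    return {
        # Full coverage: every published ID the consumer has not seen yet.
        "added": sorted(remote_by_id.keys() - local_by_id.keys()),
        # Updates: same ID, newer "last updated" date.
        "updated": sorted(
            dataset_id
            for dataset_id in shared_ids
            if remote_by_id[dataset_id]["last_updated"]
            > local_by_id[dataset_id]["last_updated"]
        ),
        # Content removal: IDs that disappeared from the index.
        "removed": sorted(local_by_id.keys() - remote_by_id.keys()),
    }


changes = diff_index(local_copy, published_index)
print(changes)
# {'added': ['ds-002'], 'updated': ['ds-001'], 'removed': ['ds-003']}
```

Only the datasets listed under "added" and "updated" need to be fetched again, which is where the reduced load on the publisher's end comes from.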

Page 16: Best Practices for Publishing Data

Indexes, unique IDs and meta-data

• Hard to emphasize enough!

• Unique IDs for everything: Datasets, columns, entities, ...

• Why?
  • Continuity: A small change for a man = a giant leap for a computer
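
The continuity point can be made concrete with a small sketch. The two releases, the column IDs, and the `find_column` helper below are all hypothetical. The publisher fixes a typo in a column title between releases: a trivial change for a human reader, but enough to break a program that looks the column up by title. A program that looks it up by ID is unaffected.

```python
# Two releases of the same hypothetical dataset. Between them the publisher
# corrected a typo in a column title; the column ID stayed the same.
release_1 = {
    "id": "ds-001",
    "columns": [
        {"id": "col-pop", "title": "Populaton"},
        {"id": "col-year", "title": "Year"},
    ],
}
release_2 = {
    "id": "ds-001",
    "columns": [
        {"id": "col-pop", "title": "Population"},
        {"id": "col-year", "title": "Year"},
    ],
}


def find_column(release, *, key, value):
    """Return the first column whose `key` field equals `value`, or None."""
    for column in release["columns"]:
        if column[key] == value:
            return column
    return None


# A consumer that remembered the old title loses track of the column...
by_title = find_column(release_2, key="title", value="Populaton")
# ...while one that remembered the ID still finds it.
by_id = find_column(release_2, key="id", value="col-pop")

print(by_title)        # None
print(by_id["title"])  # Population
```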

Page 17: Best Practices for Publishing Data

Indexes, unique IDs and meta-data

• Any relevant contextual information
  • URL(s), descriptions, methodology, next updated, authors, keywords, units, license information, ...
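
As a rough illustration of what such a meta-data record could look like, the sketch below collects the fields listed above into one structure and serializes it as JSON. The class name, field names, and all values are assumptions made for the example; the talk does not prescribe a schema.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import List


@dataclass
class DatasetMetadata:
    """Hypothetical meta-data record; the field names are illustrative only."""

    # The "must" fields.
    id: str
    title: str
    last_updated: str
    # Contextual "should" fields; all optional.
    urls: List[str] = field(default_factory=list)
    description: str = ""
    methodology: str = ""
    next_updated: str = ""
    authors: List[str] = field(default_factory=list)
    keywords: List[str] = field(default_factory=list)
    units: str = ""
    license: str = ""


record = DatasetMetadata(
    id="ds-001",
    title="Population by region",
    last_updated="2012-09-30",
    urls=["http://example.com/datasets/ds-001"],  # placeholder URL
    description="Resident population on 1 January, by region.",
    next_updated="2013-03-31",
    keywords=["population", "regions"],
    units="persons",
)

# One JSON document per dataset is easy for a program to fetch and parse.
metadata_json = json.dumps(asdict(record), indent=2, sort_keys=True)
print(metadata_json)
```

Fields that are not known yet stay empty rather than being left out, so a consumer can rely on every record having the same shape.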

Page 18: Best Practices for Publishing Data

FAQs and feedback channels

#1 reason for not publishing data:

“There are errors in the data and I don't want others to discover them”

Page 19: Best Practices for Publishing Data

FAQs and feedback channels

#1 reason for not publishing data:

“There are errors in the data and I do want others to discover them”

Page 21: Best Practices for Publishing Data

FAQs and feedback channels

Page 22: Best Practices for Publishing Data

Publishing for Computers

1. Simple formats
2. Indexes, unique IDs and meta-data
3. FAQs and feedback channels

Page 23: Best Practices for Publishing Data

Computers

• Structure

Humans

• Search
• Visualization
• Download

Page 24: Best Practices for Publishing Data

FIND AND UNDERSTAND DATA

Twitter: @datamarket · Facebook: DataMarket · E-mail: [email protected]

Hjalmar Gislason, founder & CEO