valgeo 2011

ValGEO 2011, October 18 – 19, Ispra. An Integrated Quality Score for Volunteered Geographic Information on Forest Fires. Laura Spinsanti & Frank Ostermann, European Commission – Joint Research Centre, Institute for Environment and Sustainability, Spatial Data Infrastructures Unit.

Upload: frank-ostermann

Post on 07-May-2015

TRANSCRIPT

Page 1: VALGEO 2011

ValGEO 2011, October 18 – 19, Ispra

An Integrated Quality Score for Volunteered Geographic Information on Forest Fires

Laura Spinsanti & Frank Ostermann

European Commission – Joint Research Centre

Institute for Environment and Sustainability

Spatial Data Infrastructures Unit

Page 2: VALGEO 2011

Outline

The research project

VGI and crisis events

Elements of Credibility and Relevance

Integrated Quality Score

Next steps

Page 3: VALGEO 2011

Research Objectives

1. Deploy a system for using VGI in crisis decision support
   • Integration
   • Quality control
   • Communication of risk/uncertainty

2. Assess its utility

Page 4: VALGEO 2011

Use of VGI during crisis events

• Horizontal communication during response and recovery phases
• VGI as in-situ sensor readings

• Advantages:
  • Fast!!! (can beat earthquake waves…)
  • Familiar (to users, at least)
  • Localized

• Disadvantages:
  • Fast!!! (rumor mill…)
  • Uncertain sources, uncertain accuracy
  • Data deluge

• Scalability and sustainability

Page 5: VALGEO 2011

No external curation:

Page 6: VALGEO 2011

Some external curation: Retrieve, validate, publish

Page 7: VALGEO 2011

Full external curation: retrieve, rate/score, and publish

???

Page 8: VALGEO 2011

System Workflow: Overview

Retrieval:
• Heterogeneous sources
• Validation, formatting, storage

Parametrization:
• Top-down, knowledge-driven
• Bottom-up, machine-driven

Quality Assessment:
• Single data items and clusters
• Credibility and Relevance

Event Detection:
• At a later stage

Integration and visualization:
• EFFIS Fire News

Page 9: VALGEO 2011

Elements of quality assessment

Quality dimensions: Credibility, Relevance

Elements: Source, Context, Content, Location

Page 10: VALGEO 2011

Credibility

• Two main characteristics:
  • Trustworthiness (affecting source credibility) (Source, Context)
  • Expertise (affecting information credibility, or accuracy) (Location, Source)

• Two main heuristics (Metzger et al. 2010):
  • Social confirmation (what do others do or say?) (Source, Context)
  • Expectancies (what do I already know?) (Location, Context)

Page 11: VALGEO 2011

Relevance

• Main criterion: Match with information needs of a user

• Aspects include:
  • Topicality (Content)
  • Location/Origin (Location, Context)
  • Novelty (Content)

• Three perspectives (Saracevic 1975, Hjørland 2010):
  • User (what s/he thinks is relevant)
  • System (results matching the query)
  • Subject knowledge (goal- and task-oriented)

• Geographic relevance vs. geographic aspect of relevance (Raper 2007, de Sabbata 2010)

Page 12: VALGEO 2011

Integration

[Diagram labels: VGI, Topicality, Geocoding, Geo-Spatial Context, Integrated Quality Score]

Page 13: VALGEO 2011

Geocoding VGI

Geocoders used:

• GISCO/LAU2 string matching

• European Media Monitor algorithms

• Yahoo! Placemaker

• Exploit triplets of Location information (Source, Content, Message)
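
The triplet can be read as up to three candidate positions per item. The sketch below assumes that Source means the author's profile location, Content means a toponym found in the text, and Message means coordinates attached to the post itself; the slides do not define these terms further. The `LocationTriplet` class, its field names, and the priority order (message, then content, then source) are hypothetical and only illustrate one way to pick the most specific candidate available.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

Coord = Tuple[float, float]  # (latitude, longitude) in decimal degrees


@dataclass
class LocationTriplet:
    """Hypothetical container for the three location candidates of one VGI item."""
    source: Optional[Coord] = None   # assumed: geocoded author profile location
    content: Optional[Coord] = None  # assumed: geocoded toponym from the text
    message: Optional[Coord] = None  # assumed: coordinates attached to the post


def best_location(triplet: LocationTriplet) -> Optional[Tuple[str, Coord]]:
    """Return (label, coordinate) of the first available candidate.

    The priority order is an assumption: a geotag on the message is taken
    as most specific, a profile location as least specific.
    """
    for label in ("message", "content", "source"):
        coord = getattr(triplet, label)
        if coord is not None:
            return label, coord
    return None  # the item cannot be placed on the map
```

An item with only a profile location and a toponym would resolve to the toponym under this ordering; an item with none of the three stays unlocated.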

                            TWITTER                       FLICKR
                            August 2010    August 2011    August 2010    August 2011
Number of retrieved VGI     2,904,065      7,996,228      7,991          17,850
Percentage with toponym     35%            27%            53%            50%
Percentage with geocode     1.1%           0.92%          20%            21%
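
To put the geocode percentages in perspective, they can be converted back into approximate item counts. The snippet below is a small worked calculation using only the figures from the table; because the published percentages are rounded, the resulting counts are estimates. The variable and function names are illustrative.

```python
# Figures copied from the table above: (platform, month) -> retrieved items.
retrieved = {
    ("Twitter", "August 2010"): 2_904_065,
    ("Twitter", "August 2011"): 7_996_228,
    ("Flickr", "August 2010"): 7_991,
    ("Flickr", "August 2011"): 17_850,
}

# Share of items carrying a geocode, in percent (rounded in the source).
geocode_pct = {
    ("Twitter", "August 2010"): 1.1,
    ("Twitter", "August 2011"): 0.92,
    ("Flickr", "August 2010"): 20,
    ("Flickr", "August 2011"): 21,
}


def approx_count(total: int, pct: float) -> int:
    """Estimate how many items a rounded percentage corresponds to."""
    return round(total * pct / 100)


geocoded = {key: approx_count(retrieved[key], geocode_pct[key]) for key in retrieved}

for (platform, month), count in geocoded.items():
    print(f"{platform:8s} {month}: ~{count:,} geocoded items")
```

Although only about 1% of tweets carry a geocode, that still corresponds to roughly 31,900 and 73,600 tweets in the two months, against roughly 1,600 and 3,700 geocoded Flickr items.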

Page 14: VALGEO 2011

Contextualising VGI

• Assessing quality always needs context

• In the absence of knowledge about “true values”, we default to two basic heuristics: social confirmation (what do others do or say?) and expectancies (what do I already know?)

• What do others do or say: Looking for confirmation in other VGI

• What do I already know: Looking for confirmation by grounding
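
As an illustration of the social confirmation heuristic, the sketch below counts how many other VGI items were reported close to a given item in both space and time. It is a minimal example under stated assumptions, not the project's implementation: items are plain dictionaries with hypothetical keys `lat`, `lon`, and `time`, and the 10 km radius and 6 hour window are arbitrary placeholder thresholds.

```python
import math
from datetime import datetime, timedelta


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two points in decimal degrees."""
    earth_radius_km = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lon2 - lon1)
    a = math.sin(d_phi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2
    return 2 * earth_radius_km * math.asin(math.sqrt(a))


def confirmations(item, others, radius_km=10.0, window=timedelta(hours=6)):
    """Count other items within radius_km and within the time window of item.

    Both thresholds are placeholder values chosen for the example.
    """
    count = 0
    for other in others:
        if other is item:
            continue  # an item does not confirm itself
        near = haversine_km(item["lat"], item["lon"], other["lat"], other["lon"]) <= radius_km
        recent = abs(other["time"] - item["time"]) <= window
        if near and recent:
            count += 1
    return count
```

A count of zero marks a lone signal, while a higher count suggests a cluster of similar reports. Grounding against known data (the second heuristic) would need external layers and is not covered by this sketch.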

Page 15: VALGEO 2011

Scoring VGI

• Sum of weighted scores: $QS(VGI_j) = \sum_{i=1}^{N} w_i \, s_{ji}$

• with $w_i$ being the weight for criterion $i$, and $s_{ji}$ being the score of VGI object $j$ on criterion $i$

• Criteria:
  • Topicality: keyword-based
  • Proximity: nearest concurrently reported hotspot
  • Land cover: Forest, no-Forest, Built-up
  • Population density: risk factor
  • Information clusters: similar messages or lone signal?
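
The weighted sum can be sketched in a few lines. This is a minimal illustration, not the authors' code: the criterion keys, the example weights, and the assumption that each per-criterion score is already normalized to the range 0 to 1 are all hypothetical, since the slides do not state the actual weights or score scales.

```python
def quality_score(scores, weights):
    """Weighted sum of per-criterion scores for one VGI object.

    scores:  mapping criterion name -> score of this object on that criterion
    weights: mapping criterion name -> weight of that criterion
    """
    missing = set(weights) - set(scores)
    if missing:
        raise KeyError(f"no score for criteria: {sorted(missing)}")
    return sum(weights[c] * scores[c] for c in weights)


# Hypothetical weights for the five criteria listed on the slide.
example_weights = {
    "topicality": 0.3,
    "proximity": 0.3,
    "land_cover": 0.2,
    "population_density": 0.1,
    "information_cluster": 0.1,
}

# Hypothetical scores for a single message, each assumed to lie in [0, 1].
example_scores = {
    "topicality": 1.0,           # fire-related keywords found
    "proximity": 0.5,            # moderately close to a reported hotspot
    "land_cover": 1.0,           # located in forest
    "population_density": 0.2,   # sparsely populated area
    "information_cluster": 0.0,  # lone signal, no similar messages
}

qs = quality_score(example_scores, example_weights)
```

Keeping the weights in a separate mapping makes it easy to compare a top-down, expert-chosen weighting with one learned from annotated data.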

Page 16: VALGEO 2011

Page 17: VALGEO 2011

Full external curation: retrieve, rate/score, and publish

Page 18: VALGEO 2011

Next steps

• Improve quality assessment using bottom-up machine-learning approaches (Weka and the annotated set of Tweets)

• Refine spatial analysis

• Export high-potential VGI to PostGIS-DB for spatio-temporal analysis

• Validate method on 2011 data

Page 19: VALGEO 2011

Thanks for your attention!

[email protected]@jrc.ec.europa.eu