Towards Rule-Guided Classification for Volunteered Geographic Information

Ahmed Loai Ali*, Falko Schmid, Zoe Falomir, and Christian Freksa University of Bremen – Cognitive System research group (CoSy) Contact: * [email protected] Geospatial Week @ ISSDQ 2015

Upload: ahmed-ali

Post on 19-Feb-2017

Advanced technologies and the power of crowdsourcing.

VGI:

one form of user-generated content (UGC), more specifically user-generated geographic content (UGGC)

content with implicit/explicit spatial references

has evolved to play a vital role in GIScience.

Implicit/Explicit VGI:

Microblogs, OpenStreetMap, Wikimapia, etc.

A potential data source that supports various applications:

Map provision

Urban planning

Land use mapping

Crisis management

Environmental monitoring

Support humanities

Explicit–VGI:

Users act as observers: they collect, share, and maintain information about geographic features, generating digital maps.

They are mostly untrained non-experts; however, they are interested in contributing geographic information.

The quality needed for reliable services.

The main wheel between data production & consumption.

VGI has heterogeneous quality due to:

Volunteer participants

Various technologies

Loose contribution mechanisms

[Diagram: Data Quality, Data Production, Data Utilization]

Standard measures

Consistency

Completeness

Positional accuracy

Thematic accuracy

Temporal accuracy

Evolved measures:

Conceptual quality

Fitness for use

[Diagram labels: extrinsic, intrinsic; social, geographic, crowdsourcing]

Classification

A facet of thematic accuracy

Problematic classification results in:

Inaccurate or incomplete results

Rich data with limited use

Rule-guided classification approach

To improve the classification quality

Through guidance:

Improve the contributors' experience

Direct the contributors towards consistent classification

Considering the OSM data set: a large amount of data of good quality

Identical entities should be classified similarly, at least at the country level

The appropriate classification of an entity depends strongly on:

Entity characteristics

Geographic context

Grass-related entities classification

Park, Garden, Meadow, Grass, Forest, Wood

Germany data set (December 2013)

10 densest cities

3724 forest

3030 garden

7336 grass

4277 park

4445 meadow

1454 wood

Topological investigation

Predictive association rule extraction

Class(E, C) ← R(E, F) (support, confidence)

E: an entity

R ∈ {'contains', 'meet', 'disjoint', 'coveredBy', 'overlap'}

F: a set of features [f1, f2, …]

C ∈ {'park', 'garden', 'grass', …}
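
The relation R in the rule form above is a topological relation between an entity and a neighbouring feature. As a rough illustration only, the Python sketch below determines such a relation for axis-aligned rectangles. Using rectangles instead of real OSM polygons, the `Rect` type, and the function name `relation` are all assumptions made for this example; the slides do not say how the relations were computed.

```python
from typing import Tuple

# (min_x, min_y, max_x, max_y); rectangles stand in for real polygons here.
Rect = Tuple[float, float, float, float]


def relation(a: Rect, b: Rect) -> str:
    """Return the topological relation of rectangle a with respect to b."""
    ax1, ay1, ax2, ay2 = a
    bx1, by1, bx2, by2 = b

    # No shared point at all.
    if ax2 < bx1 or bx2 < ax1 or ay2 < by1 or by2 < ay1:
        return "disjoint"

    # Only boundary points are shared: the interiors do not intersect.
    interiors_intersect = ax1 < bx2 and bx1 < ax2 and ay1 < by2 and by1 < ay2
    if not interiors_intersect:
        return "meet"

    # Simplification: equal rectangles are reported as 'contains'.
    if ax1 <= bx1 and ay1 <= by1 and ax2 >= bx2 and ay2 >= by2:
        return "contains"
    if bx1 <= ax1 and by1 <= ay1 and bx2 >= ax2 and by2 >= ay2:
        return "coveredBy"
    return "overlap"


# Hypothetical geometries for illustration.
park = (0.0, 0.0, 10.0, 10.0)
playground = (2.0, 2.0, 4.0, 4.0)
print(relation(park, playground))  # contains
```

With real polygon data, a geometry library would normally take the place of this rectangle test.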

Topological investigation

Step 1: find frequent rule items

Step 2: generate rules

Step 3: filter rules

Step 4: develop a classifier
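
The four steps above can be sketched in a few lines of Python. This is a simplified illustration, not the authors' implementation: it only considers rules with a single (relation, feature) item as antecedent, and the training records, the thresholds `min_support` and `min_confidence`, and the function names are hypothetical.

```python
from collections import Counter

# Hypothetical training data: (set of (relation, feature) items, class).
TRAINING = [
    ({("contains", "playground"), ("meet", "road")}, "park"),
    ({("contains", "playground")}, "park"),
    ({("contains", "playground"), ("coveredBy", "residential")}, "garden"),
    ({("coveredBy", "residential")}, "garden"),
    ({("meet", "road")}, "grass"),
]


def mine_rules(training, min_support=0.3, min_confidence=0.5):
    """Steps 1-3: count rule items, derive rules, filter them."""
    n = len(training)
    item_counts = Counter()   # how often an item occurs at all
    rule_counts = Counter()   # how often an item occurs together with a class
    for items, cls in training:
        for item in items:
            item_counts[item] += 1
            rule_counts[(item, cls)] += 1

    rules = []
    for (item, cls), count in rule_counts.items():
        support = count / n
        if support < min_support:           # step 1: keep frequent rule items
            continue
        confidence = count / item_counts[item]  # step 2: generate the rule
        if confidence >= min_confidence:    # step 3: filter by confidence
            rules.append((item, cls, support, confidence))
    return rules


def classify(items, rules):
    """Step 4: rank classes by the best confidence of any matching rule."""
    best = {}
    for item, cls, _support, confidence in rules:
        if item in items:
            best[cls] = max(best.get(cls, 0.0), confidence)
    return sorted(best.items(), key=lambda kv: kv[1], reverse=True)


RULES = mine_rules(TRAINING)
RANKED = classify({("contains", "playground"), ("coveredBy", "residential")}, RULES)
print([cls for cls, _ in RANKED])  # ['garden', 'park']
```

Here support is the share of all training records that match both the item and the class, and confidence is the share of records with the item that also carry the class.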

9193 rules:

4100 describe the forest class

215 describe the garden class

745 describe the grass class

506 describe the meadow class

2938 describe the park class

689 describe the wood class

Classification plausibility and ambiguity

Pruning:

remove redundant rules based on confidence

Filtering rules based on confidence:

confidence ≥ 50%

Ranking 1st & 2nd recommendations:

consider top 2 recommendations

Classification assumption:

maximum confidence per class
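
A minimal sketch of how these post-processing steps might fit together is given below. The slides do not define "redundant", so the sketch assumes that a rule is redundant when a more general rule for the same class has at least the same confidence. The `Rule` type, the example rules, and the function names are likewise hypothetical.

```python
from typing import FrozenSet, NamedTuple, Tuple

Item = Tuple[str, str]  # (relation, feature)


class Rule(NamedTuple):
    antecedent: FrozenSet[Item]
    cls: str
    confidence: float


def prune(rules):
    """Drop rules made redundant by a more general rule (assumed definition)."""
    kept = []
    for r in rules:
        redundant = any(
            g.cls == r.cls
            and g.antecedent < r.antecedent      # proper subset: g is more general
            and g.confidence >= r.confidence
            for g in rules
        )
        if not redundant:
            kept.append(r)
    return kept


def filter_by_confidence(rules, threshold=0.5):
    """Keep rules with confidence >= 50% by default."""
    return [r for r in rules if r.confidence >= threshold]


def recommend(entity_items, rules, top_n=2):
    """Take the maximum confidence per class and return the top recommendations."""
    best = {}
    for r in rules:
        if r.antecedent <= entity_items:         # every item of the rule is present
            best[r.cls] = max(best.get(r.cls, 0.0), r.confidence)
    ranked = sorted(best.items(), key=lambda kv: kv[1], reverse=True)
    return ranked[:top_n]


# Hypothetical rules for illustration.
rules = [
    Rule(frozenset({("contains", "playground")}), "park", 0.9),
    Rule(frozenset({("contains", "playground"), ("meet", "road")}), "park", 0.8),
    Rule(frozenset({("meet", "road")}), "grass", 0.6),
    Rule(frozenset({("meet", "road")}), "meadow", 0.3),
    Rule(frozenset({("coveredBy", "residential")}), "garden", 0.7),
]
usable = filter_by_confidence(prune(rules))
entity = {("contains", "playground"), ("meet", "road"), ("coveredBy", "residential")}
recommendations = recommend(entity, usable)
print(recommendations)  # [('park', 0.9), ('garden', 0.7)]
```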

The entity matches with:

Park 92%

Meadow 46%

Grass 32%

Forest 13%

Garden 12%

[Figure: Classification accuracies by various hypotheses. Rule sets: all rules; rules with confidence ≥ 50%. Series: 1st recommendation, 1st & 2nd recommendations, not matched. Data labels as listed: 60%, 50%, 75%, 55%, 0%, 25%. Vertical axis: 0% to 80%.]
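
The three measures in this chart could be computed along the following lines. This is a sketch under assumptions: "not matched" is taken to mean that no rule produced a recommendation for the entity, and the sample recommendations and the function name `evaluate` are made up for illustration.

```python
def evaluate(recommendations, true_classes):
    """Compare ranked recommendations per entity with the true classes.

    recommendations: one ranked list of class names per entity (may be empty).
    true_classes: the true class name of each entity, in the same order.
    """
    total = len(true_classes)
    if total == 0 or total != len(recommendations):
        raise ValueError("need one recommendation list per entity")

    pairs = list(zip(recommendations, true_classes))
    first = sum(1 for recs, truth in pairs if recs[:1] == [truth])
    first_or_second = sum(1 for recs, truth in pairs if truth in recs[:2])
    # Assumption: an entity is 'not matched' when no rule applies to it.
    not_matched = sum(1 for recs, _ in pairs if not recs)

    return {
        "1st": first / total,
        "1st & 2nd": first_or_second / total,
        "not matched": not_matched / total,
    }


# Hypothetical results for four entities.
recs = [["park", "grass"], ["grass", "park"], ["meadow", "forest"], []]
truth = ["park", "park", "wood", "garden"]
scores = evaluate(recs, truth)
print(scores)  # {'1st': 0.25, '1st & 2nd': 0.5, 'not matched': 0.25}
```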

[Figure: Classification accuracies per class. Classes: forest, garden, grass, meadow, park, wood. Series: 1st recommendation, 1st & 2nd recommendation. Data labels as listed: 38%, 71%, 72%, 42%, 80%, 7%, 67%, 81%, 87%, 58%, 94%, 15%. Vertical axis: 0% to 100%.]

[Figure: Classification accuracies per class. Classes: forest, garden, grass, meadow, park, wood. Series: 1st recommendation, 1st & 2nd recommendation. Data labels shown: 81%, 87%, 94%. Vertical axis: 0% to 100%.]

[Example figure labels: Grass; Park, Park]

[Example figure labels: Park; Grass, Grass]

[Example figure labels: Garden; Meadow, Garden]

Now, it becomes applicable

We launched Grass&Green

A web tool that guides participants towards the most appropriate classification

It is online at www.opensciencemap.org/quality

Simple interface

Definitions and descriptions are provided (textual & visual)

Multiple-classification plausibility

Contribute directly to OSM project

@grass_and_green

Grass&Green

[email protected]

www.opensciencemap.org/quality

[Figure: Contributions to the Grass&Green tool in 24 days, 9/1/2015 to 9/24/2015. Series: active contribution, inactive contribution, users. Axis scales: 0 to 40 and 0 to 350.]

[Figure: Agreement with recommendations. Categories: Agree, Partially agree, Do not agree. Data labels as listed: 23%, 67%, 10%.]

Data mining is a promising methodology for VGI quality analysis

Recommendation systems are required

Towards feature- and location-customized systems

Fuzziness of spatial concepts in data evaluation

Thank you!

Questions?

[email protected]

@grass_and_green

Grass&Green