SemDat: A Web-Based Interactive, Flexible Translation Service for Classification Systems and Taxonomies Center for eResearch & School of Environment University of Auckland William R. Smart Sina Masoud-Ansari Brandon Whitehead Tawan Banchuen Mark Gahegan

Post on 18-Dec-2015

Page 1

SemDat: A Web-Based Interactive, Flexible Translation Service for Classification Systems and Taxonomies

Center for eResearch & School of Environment, University of Auckland

William R. Smart, Sina Masoud-Ansari, Brandon Whitehead, Tawan Banchuen, Mark Gahegan

Page 2

Overview

• Problem and motivation

• A quick tour

• Ontology creation

• Web app architecture

• More snapshots/live demo (perhaps)

Page 3

Motivation

Kyoto Treaty | Kyoto Protocol: carbon credits

Landcare’s desire to support interoperable data

Subset of PhD research

Page 4

background data schemas and…

• Land Cover Data Base (LCDB)

• EcoSat

• Land Use and Carbon Analysis System (LUCAS)

Page 5

Background: LCDB

• Three iterations
  • LCDB1
  • LCDB2
  • LCDB1.1 (or, LCDB1 second edition)

• Primarily for reporting on changes to land cover (1 ha minimum mapping unit)

Source: Ministry for the Environment, 2004

Page 6

Background: EcoSat

• Maps ecosystem attributes from satellite

• Regional scale – minimum mapping unit 15 m

• World leader in methods for removing the effect of topography from satellite imagery

Page 7

Background: LUCAS

• Team housed at MfE (Ministry for the Environment)

• Tasked with developing methods to meet the requirements of the Kyoto Protocol

• Goal is to track and quantify changes in New Zealand land use from 1990 to 2008

Page 8

The specific problem we are solving

• We have legends with no spatial data
  • ... for which we want the full map

• For example, the Kyoto Protocol

• A classified map of NZ with the Kyoto Protocol classes as its legend would be worth a lot

Page 9

Are they compatible?

• Would an understanding of the semantic structure of each concept in each data store surface meaningful concept relationships?

• Would meaningful concept relationships be helpful to decision makers?

• Would meaningful concept relationships enhance our understanding of New Zealand’s carbon footprint?

Page 10

http://semdat.bestgrid.org

Page 11

http://semdat.bestgrid.org

Page 12

http://semdat.bestgrid.org

Page 13

https://wiki.auckland.ac.nz/display/knowcomp/SemDat+Users+Manual

Page 14

How?

• Workshop!

• Invite experts from each respective data source

• Share concept development process (pitfalls, concrete and fuzzy concepts, etc.)

Page 17

An example: LCDB1 and LCDB2 (Land-cover database versions 1, 2 (or 1b))

• LCDB1
  • PRIM_HORTICULTURAL
  • PLANTED_FOREST
  • PRIM_PASTORAL
  • SCRUB
  • URBAN
  • TUSSOCK
  • MINES_DUMPS
  • MANGROVE
  • COASTAL_SANDS
  • URBAN_OPEN_SPACE
  • COASTAL_WETLANDS
  • INDIGENOUS_FOREST
  • INLAND_WETLANDS
  • INLAND_WATER
  • BARE_GROUND

• LCDB2
  • Matagouri
  • Mixed Exotic Shrubland
  • Orchard and Other Perennial Crops
  • Other Exotic Forest
  • Manuka and or Kanuka
  • Mangrove
  • Landslide
  • Low Producing Grassland
  • Major Shelterbelts
  • Pine Forest - Closed Canopy
  • Pine Forest - Open Canopy
  • Surface Mine
  • Tall Tussock Grassland
  • Transport Infrastructure
  • Urban Parkland/ Open Space
  • Sub Alpine Shrubland
  • Short-rotation Cropland
  • Permanent Snow and Ice
  • River
  • River and Lakeshore Gravel and Rock
  • Lake and Pond
  • Indigenous Forest
  • Built-up Area
  • Coastal Sand and Gravel
  • Deciduous Hardwoods
  • Depleted Tussock Grassland
  • Broadleaved Indigenous Hardwoods
  • Alpine Gravel and Rock
  • Vineyard
  • Afforestation (not imaged)
  • Alpine Grass-/Herbfield
  • Dump
  • Estuarine Open Water
  • Herbaceous Freshwater Vegetation
  • Herbaceous Saline Vegetation
  • High Producing Exotic Grassland
  • Grey Scrub
  • Gorse and Broom
  • Fernland
  • Flaxland
  • Forest Harvested
  • Afforestation (imaged, post LCDB 1)

• These databases largely come from the same source

• Yet their legends render them incompatible

• For instance, we couldn’t easily compare a given class between LCDB1 and LCDB2

• We need a mapping

Page 18

Can we fix it? (yes we can)


• Build a mapping from one to the other, or...

• Build an ontology which contains and links them

• The mapping will fall out of the ontology naturally

Page 19

Ontologies

• An ontology is stored as a set of triples
  • Subject predicate object
  • John hasColour Orange

• Some predicates are special
  • John subClassOf People
  • John sameAs John

• Our mapping could be an ontology directly
  • LCDB2:River subClassOf LCDB1:InlandWater

• There are also some very comprehensive ontologies available that relate many concepts together
  • e.g. SWEET

• By making our mapping via an ontology we leverage:
  • Previously identified relationships between general concepts
  • Inference engines and data stores to hold our mapping
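The idea that a legend mapping can be read off a set of triples can be sketched in a few lines of Python. This is a minimal illustration, not the SemDat implementation: only the `LCDB2:River subClassOf LCDB1:InlandWater` triple comes from the slide, while `LCDB2:LakeAndPond`, `General:WaterBody` and the helper names `superclasses` and `mapping` are assumptions made for the example. A real inference engine would do far more than follow `subClassOf` links.

```python
from collections import defaultdict

# Triples are (subject, predicate, object). Only the River triple comes from
# the slide; the other class names and links are hypothetical.
TRIPLES = [
    ("LCDB2:River", "subClassOf", "LCDB1:InlandWater"),
    ("LCDB2:LakeAndPond", "subClassOf", "LCDB1:InlandWater"),  # assumed
    ("LCDB1:InlandWater", "subClassOf", "General:WaterBody"),  # assumed
]


def superclasses(triples, start):
    """Return every class reachable from `start` by following subClassOf links."""
    parents = defaultdict(set)
    for subject, predicate, obj in triples:
        if predicate == "subClassOf":
            parents[subject].add(obj)
    seen, stack = set(), [start]
    while stack:
        for parent in parents[stack.pop()]:
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def mapping(triples, source_prefix, target_prefix):
    """Map each source-legend class to the target-legend classes it falls under."""
    sources = {s for s, _, _ in triples if s.startswith(source_prefix)}
    return {
        s: sorted(c for c in superclasses(triples, s) if c.startswith(target_prefix))
        for s in sorted(sources)
    }


print(mapping(TRIPLES, "LCDB2:", "LCDB1:"))
# {'LCDB2:LakeAndPond': ['LCDB1:InlandWater'], 'LCDB2:River': ['LCDB1:InlandWater']}
```

Because the closure is transitive, linking a legend to a general concept (here the assumed `General:WaterBody`) is enough for it to become reachable from every legend already connected to that concept.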

Page 20

The system

[Diagram. Labels: LCDB2 (spatial, legend); LUCAS (spatial, legend); Kyoto legend (there is no map); Map 1; Map 2; ontology alignment (Brodaric’s engine, GIN); hybrid map (LCDB2 spatial with LUCAS legend or Kyoto legend).]
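A hybrid map keeps the spatial data of one source and swaps in the legend of another. The sketch below shows that relabelling step under stated assumptions: the target class names (`Forest`, `Grassland`) are placeholders rather than the actual Kyoto Protocol classes, the feature records are made up, and `relabel` is a hypothetical helper, not part of SemDat.

```python
# Hypothetical LCDB2 -> target-legend mapping. The target class names are
# placeholders, not the actual Kyoto Protocol classes.
LCDB2_TO_TARGET = {
    "Pine Forest - Closed Canopy": "Forest",
    "Indigenous Forest": "Forest",
    "High Producing Exotic Grassland": "Grassland",
}


def relabel(features, mapping, class_field="class", unmapped="UNMAPPED"):
    """Return copies of the features with class labels translated to the target legend.

    Everything except the class label (ids, geometry, ...) is carried over
    unchanged; classes with no entry in the mapping are flagged, not dropped.
    """
    return [
        {**feature, class_field: mapping.get(feature[class_field], unmapped)}
        for feature in features
    ]


# Made-up records standing in for LCDB2 spatial data.
features = [
    {"id": 1, "class": "Indigenous Forest"},
    {"id": 2, "class": "High Producing Exotic Grassland"},
    {"id": 3, "class": "River"},
]

hybrid = relabel(features, LCDB2_TO_TARGET)
print([f["class"] for f in hybrid])
# ['Forest', 'Grassland', 'UNMAPPED']
```

Flagging unmapped classes rather than discarding them makes gaps in the legend-to-ontology connection visible on the resulting map.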

Page 21

SNAPSHOTS/LIVE DEMO

Page 22

Conclusions

• Spatial data format is highly standardized

• Legends can be too

• The SemDat site uses an ontology to relate a given virtual legend and a spatial legend attached to a map

• Any legend well-connected to the ontology may be rendered as the legend of any other map with a legend that is connected to the ontology

• The site allows multiple types of download
  • WMS
  • WFS
  • Shapefile

• Chinese province – next test case (supports Mandarin)

• Ola – Workshop at GIScience?

Page 23

Technology choices

• Ontology storage/inference
  • Sesame
    • Good choice

• Map server – happy medium
  • MapServer for WMS
    • Fast – mediation via SLD files
  • GeoServer for WFS/Shapefile
    • Flexible – mediation via features
    • Issues with memory yet to be sorted out

• Map storage
  • Both PostGIS/PostgreSQL and as shapefiles
  • Found PostGIS to be about four times slower for WMS

• Site
  • Custom JavaScript
  • OpenLayers (JavaScript) for WMS

• Server interface
  • PHP
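One way to picture "mediation via SLD files" is that a legend translation becomes a set of style rules: each target class gets one rule whose filter matches all the source classes mapped to it, so the map server draws them in the same colour. The Python sketch below generates such an SLD 1.0 document. It is an assumption about how the translation could be expressed, not SemDat's actual code; the layer name `lcdb2`, the attribute name `CLASS_NAME`, the colours and the target class names are all hypothetical.

```python
import xml.etree.ElementTree as ET

SLD_NS = "http://www.opengis.net/sld"
OGC_NS = "http://www.opengis.net/ogc"
ET.register_namespace("sld", SLD_NS)
ET.register_namespace("ogc", OGC_NS)


def sld(tag):
    return f"{{{SLD_NS}}}{tag}"


def ogc(tag):
    return f"{{{OGC_NS}}}{tag}"


def build_sld(layer_name, class_field, groups):
    """Build an SLD string with one rule per target class.

    groups maps target class -> (fill colour, [source classes]).
    """
    root = ET.Element(sld("StyledLayerDescriptor"), {"version": "1.0.0"})
    named = ET.SubElement(root, sld("NamedLayer"))
    ET.SubElement(named, sld("Name")).text = layer_name
    style = ET.SubElement(named, sld("UserStyle"))
    fts = ET.SubElement(style, sld("FeatureTypeStyle"))
    for target, (colour, sources) in groups.items():
        rule = ET.SubElement(fts, sld("Rule"))
        ET.SubElement(rule, sld("Name")).text = target
        flt = ET.SubElement(rule, ogc("Filter"))
        # ogc:Or needs at least two operands, so only wrap when necessary.
        parent = ET.SubElement(flt, ogc("Or")) if len(sources) > 1 else flt
        for source in sources:
            eq = ET.SubElement(parent, ogc("PropertyIsEqualTo"))
            ET.SubElement(eq, ogc("PropertyName")).text = class_field
            ET.SubElement(eq, ogc("Literal")).text = source
        symbolizer = ET.SubElement(rule, sld("PolygonSymbolizer"))
        fill = ET.SubElement(symbolizer, sld("Fill"))
        ET.SubElement(fill, sld("CssParameter"), {"name": "fill"}).text = colour
    return ET.tostring(root, encoding="unicode")


# Hypothetical grouping of LCDB2 classes under placeholder target classes.
GROUPS = {
    "Forest": ("#1b7837", ["Indigenous Forest", "Pine Forest - Closed Canopy"]),
    "Water": ("#2166ac", ["River"]),
}

sld_xml = build_sld("lcdb2", "CLASS_NAME", GROUPS)
print(sld_xml)
```

Since only the styling changes and the stored geometry is untouched, this kind of translation is cheap to serve, which is consistent with the slide's note that the WMS route is the fast one.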

Page 24

Questions

Tawan Banchuen, PhD

[email protected]

http://wiki.auckland.ac.nz (keyword: knowledge comp)

http://jira.auckland.ac.nz (knowledge computing project)

NZ eResearch Symposium http://www.eresearch.org.nz

Eclipse RAP http://www.eclipse.org/rap