SemDat: A Web-Based Interactive, Flexible Translation Service for
Classification Systems and Taxonomies
Center for eResearch & School of Environment, University of Auckland
William R. Smart, Sina Masoud-Ansari, Brandon Whitehead,
Tawan Banchuen, Mark Gahegan
Overview
• Problem and motivation
• A quick tour
• Ontology creation
• Web app architecture
• More snapshots/live demo (perhaps)
Kyoto Treaty | Kyoto Protocol | carbon credits
Motivation
Landcare’s desire to support interoperable data
Subset of PhD research
Background: data schemas and…
• Land Cover Data Base (LCDB)
• EcoSat
• Land Use and Carbon Analysis System (LUCAS)
Background: LCDB
• Three iterations
  • LCDB1
  • LCDB2
  • LCDB1.1 (or, LCDB1 second edition)
• Primarily for reporting on changes to land cover (1 ha. min. mapping unit)
Source: Ministry for the Environment, 2004
Background: EcoSat
• Maps ecosystem attributes from satellite
• Regional scale – min. mapping unit 15m
• World leader in methods for removing the effect of topography from satellite imagery
Background: LUCAS
• Team housed at MfE
• Tasked with developing methods to meet the requirements of the Kyoto Protocol
• Goal is to track and quantify changes in New Zealand land use from 1990 to 2008
The specific problem we are solving
• We have legends with no spatial data
• ... for which we want the full map
• For example, the Kyoto Protocol
• Worth a lot to have a classified map of NZ with the Kyoto Protocol classes as its legend
are they compatible?
• Would an understanding of the semantic structure of each concept in each data store surface meaningful concept relationships?
• Would meaningful concept relationships be helpful to decision makers?
• Would meaningful concept relationships enhance our understanding of New Zealand’s carbon footprint?
http://semdat.bestgrid.org
https://wiki.auckland.ac.nz/display/knowcomp/SemDat+Users+Manual
how?
• Workshop!
• Invite experts from each respective data source
• Share concept development process (pitfalls, concrete and fuzzy concepts, etc.)
An example: LCDB1 and LCDB2 (Land-cover database versions 1, 2 (or 1b))
LCDB1:
• PRIM_HORTICULTURAL
• PLANTED_FOREST
• PRIM_PASTORAL
• SCRUB
• URBAN
• TUSSOCK
• MINES_DUMPS
• MANGROVE
• COASTAL_SANDS
• URBAN_OPEN_SPACE
• COASTAL_WETLANDS
• INDIGENOUS_FOREST
• INLAND_WETLANDS
• INLAND_WATER
• BARE_GROUND
LCDB2:
• Matagouri
• Mixed Exotic Shrubland
• Orchard and Other Perennial Crops
• Other Exotic Forest
• Manuka and or Kanuka
• Mangrove
• Landslide
• Low Producing Grassland
• Major Shelterbelts
• Pine Forest - Closed Canopy
• Pine Forest - Open Canopy
• Surface Mine
• Tall Tussock Grassland
• Transport Infrastructure
• Urban Parkland/ Open Space
• Sub Alpine Shrubland
• Short-rotation Cropland
• Permanent Snow and Ice
• River
• River and Lakeshore Gravel and Rock
• Lake and Pond
• Indigenous Forest
• Built-up Area
• Coastal Sand and Gravel
• Deciduous Hardwoods
• Depleted Tussock Grassland
• Broadleaved Indigenous Hardwoods
• Alpine Gravel and Rock
• Vineyard
• Afforestation (not imaged)
• Alpine Grass-/Herbfield
• Dump
• Estuarine Open Water
• Herbaceous Freshwater Vegetation
• Herbaceous Saline Vegetation
• High Producing Exotic Grassland
• Grey Scrub
• Gorse and Broom
• Fernland
• Flaxland
• Forest Harvested
• Afforestation (imaged, post LCDB 1)
• These databases largely come from the same source
• Yet, their legends render them incompatible
• For instance, we couldn’t easily compare a given class between LCDB1 and LCDB2
• We need a mapping
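To illustrate why a mapping is needed, here is a minimal sketch that tries to match the two legends by name alone. The LCDB1 list is taken from the slide; the LCDB2 list is only a subset of the legend above, and the `normalise` and `shared_labels` helpers are hypothetical names introduced for this example, not part of SemDat.

```python
import re

# LCDB1 legend, as listed on the slide.
LCDB1 = [
    "PRIM_HORTICULTURAL", "PLANTED_FOREST", "PRIM_PASTORAL", "SCRUB", "URBAN",
    "TUSSOCK", "MINES_DUMPS", "MANGROVE", "COASTAL_SANDS", "URBAN_OPEN_SPACE",
    "COASTAL_WETLANDS", "INDIGENOUS_FOREST", "INLAND_WETLANDS", "INLAND_WATER",
    "BARE_GROUND",
]

# A subset of the LCDB2 legend (chosen for illustration only).
LCDB2_SUBSET = [
    "Mangrove", "Indigenous Forest", "River", "Lake and Pond",
    "Pine Forest - Closed Canopy", "Built-up Area",
    "Tall Tussock Grassland", "Urban Parkland/ Open Space",
]


def normalise(label):
    """Lower-case a label and collapse punctuation/underscores to single spaces."""
    return re.sub(r"[^a-z0-9]+", " ", label.lower()).strip()


def shared_labels(legend_a, legend_b):
    """Return the normalised labels that appear in both legends."""
    names_a = {normalise(label) for label in legend_a}
    names_b = {normalise(label) for label in legend_b}
    return sorted(names_a & names_b)


matches = shared_labels(LCDB1, LCDB2_SUBSET)
print(matches)
```

In this subset only two labels line up by name; classes such as River or Tall Tussock Grassland have no LCDB1 counterpart with the same name, even though a related LCDB1 class exists. That gap is what the mapping has to fill.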
Can we fix it? (yes we can)
• Build a mapping from one to the other, or...
• Build an ontology which contains and links them
• The mapping will fall out of the ontology naturally
Ontologies
• An ontology is stored as a set of triples
  • Subject predicate object
  • John hasColour Orange
• Some predicates are special
  • John subClassOf People
  • John sameAs John
• Our mapping could be an ontology directly
  • LCDB2:River subClassOf LCDB1:InlandWater
• There are also some very comprehensive ontologies available that relate many concepts together
  • e.g. SWEET
• By making our mapping via an ontology we leverage:
  • Previously identified relationships between general concepts
  • Inference engines and data stores to hold our mapping
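The triple idea can be sketched in a few lines of plain Python. This is an illustration only, not how SemDat or its triple store does it: the `ex:WaterBody` concept and the `superclasses` / `is_subclass_of` helpers are hypothetical, added to show how following subClassOf links transitively lets a relationship "fall out" without being stated directly.

```python
SUBCLASS_OF = "subClassOf"

# A tiny in-memory triple set: (subject, predicate, object).
triples = {
    ("LCDB2:River", SUBCLASS_OF, "LCDB1:InlandWater"),   # from the slide
    ("LCDB1:InlandWater", SUBCLASS_OF, "ex:WaterBody"),  # hypothetical general concept
    ("John", "hasColour", "Orange"),                     # from the slide
}


def superclasses(concept, triples):
    """Return every class reachable from `concept` by following subClassOf edges."""
    found = set()
    frontier = [concept]
    while frontier:
        current = frontier.pop()
        for subject, predicate, obj in triples:
            if subject == current and predicate == SUBCLASS_OF and obj not in found:
                found.add(obj)
                frontier.append(obj)
    return found


def is_subclass_of(child, parent, triples):
    """True if `parent` is a direct or inferred superclass of `child`."""
    return parent in superclasses(child, triples)


print(superclasses("LCDB2:River", triples))
print(is_subclass_of("LCDB2:River", "ex:WaterBody", triples))
```

Here the link from LCDB2:River to the general concept is never asserted; it is inferred from the two subClassOf triples, which is the kind of work an inference engine takes over at scale.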
The system
[Diagram: Map 1 and Map 2 (LCDB 2 and LUCAS, each with spatial data and a legend) pass through Ontology Alignment (Brodaric’s Engine, GIN) to give a Hybrid Map: LCDB2 spatial data with the LUCAS legend or the Kyoto legend. The Kyoto legend has no map of its own.]
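The hybrid-map idea can be sketched as relabelling each spatial feature with a class from another legend. This is a rough illustration under stated assumptions: only the River entry comes from the slides (LCDB2:River subClassOf LCDB1:InlandWater); the other mapping entries, the feature records, the field names, and the `relabel` helper are hypothetical, and LCDB1 class names stand in for the target legend.

```python
# Class-to-class mapping, of the kind that might be derived from the ontology.
LCDB2_TO_TARGET = {
    "River": "INLAND_WATER",          # from the slide's subClassOf example
    "Lake and Pond": "INLAND_WATER",  # assumption, for illustration
    "Mangrove": "MANGROVE",           # assumption, for illustration
}

# Hypothetical spatial features; geometry is omitted to keep the sketch short.
features = [
    {"id": 1, "lcdb2_class": "River"},
    {"id": 2, "lcdb2_class": "Lake and Pond"},
    {"id": 3, "lcdb2_class": "Vineyard"},
]


def relabel(features, mapping, source_field="lcdb2_class", target_field="target_class"):
    """Return copies of the features with the target-legend class added.

    Features whose source class has no entry in the mapping get None.
    """
    return [{**f, target_field: mapping.get(f[source_field])} for f in features]


hybrid = relabel(features, LCDB2_TO_TARGET)
for feature in hybrid:
    print(feature)
```

The spatial data is untouched; only the legend attached to it changes, and unmapped classes are left explicit (None) rather than guessed.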
SNAPSHOTS/LIVE DEMO
Conclusions
• Spatial data format is highly standardized
• Legends can be too
• The SemDat site uses an ontology to relate a given virtual legend and a spatial legend attached to a map
• Any legend well-connected to the ontology may be rendered as the legend of any other map with a legend that is connected to the ontology
• The site allows multiple types of download
  • WMS
  • WFS
  • Shapefile
• Chinese province – next test case (supports Mandarin)
• Ola – Workshop at GIScience?
Technology choices
• Ontology storage/inference
  • Sesame
  • Good choice
• Map server – happy medium
  • MapServer for WMS
    • Fast – mediation via SLD files
  • GeoServer for WFS/Shapefile
    • Flexible – mediation via features
    • Issues with memory yet to be sorted out
• Map storage
  • Both PostGIS/PostgreSQL and as shapefiles
  • Found PostGIS to be about four times slower for WMS
• Site
  • Custom JavaScript
  • OpenLayers (JavaScript) for WMS
• Server interface
  • PHP
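As a rough sketch of what "mediation via SLD files" could look like from the client side, the snippet below builds a standard WMS 1.1.1 GetMap request that points the server at an external SLD document. The parameter names are standard WMS; the host, layer name, SLD URL, bounding box, and the `getmap_url` helper are all hypothetical and are not SemDat's actual endpoints.

```python
from urllib.parse import urlencode


def getmap_url(base_url, layer, bbox, size, sld_url=None, srs="EPSG:4326"):
    """Build a WMS 1.1.1 GetMap URL, optionally styled by an external SLD file."""
    params = {
        "SERVICE": "WMS",
        "VERSION": "1.1.1",
        "REQUEST": "GetMap",
        "LAYERS": layer,
        "STYLES": "",
        "SRS": srs,
        "BBOX": ",".join(str(v) for v in bbox),  # minx,miny,maxx,maxy
        "WIDTH": size[0],
        "HEIGHT": size[1],
        "FORMAT": "image/png",
    }
    if sld_url:
        # The SLD parameter tells the server to fetch styling rules from this URL.
        params["SLD"] = sld_url
    return base_url + "?" + urlencode(params)


# Hypothetical endpoint, layer, and SLD location; the bbox roughly covers New Zealand.
url = getmap_url(
    "http://example.org/wms",
    "lcdb2",
    (166.0, -48.0, 179.0, -34.0),
    (800, 600),
    sld_url="http://example.org/kyoto.sld",
)
print(url)
```

Swapping the SLD URL changes how the same spatial layer is classified and coloured, without touching the stored map data.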
Questions
Tawan Banchuen, PhD
http://wiki.auckland.ac.nz (keyword: knowledge comp)
http://jira.auckland.ac.nz (knowledge computing project)
NZ eResearch Symposium http://www.eresearch.org.nz
Eclipse RAP http://www.eclipse.org/rap