A standard information transfer model scopes the ontologies required for observations
Simon Cox, Laurent Lefort TDWG, Fremantle, 2008-09-30
CSIRO Cox/TDWG 2008
Science relies on observations
• Provides evidence & validation
• Involves sampling
• A domain-independent terminology and information-model
• Supports data discovery and integration across discipline boundaries
• Scopes ontology developments
What is “an Observation”
• Observation act involves a procedure applied at a specific time
• The result of an observation is an estimate of some property
• The observation domain is a feature of interest at some time
• After Fowler & Odell ca. 1997
Examples
• The 7th banana weighed 270 g on the kitchen scales this morning
• The attitude of the foliation at outcrop 321 of the Leederville Formation was 63/085, measured using a Brunton on 2006-08-08
• Specimen H69 was identified on 1999-01-14 by Amy Bachrach as Eucalyptus caesia
• The image of Camp Iota was obtained by ASTER in 2003
• Sample WMC997t collected at Empire Dam on 1996-03-30 was found to have 5.6 g/t Au as measured by ICPMS at ABC Labs on 1996-05-31
• The X-Z Geobarometer determined that the ore-body was at depth 3.5 km at 1.75 Ga
• The GCM simulation run today using CMIP3 indicated that the pressure field in the atmosphere tomorrow will be as given in pf999_20081020_1
In “pictures”
O7 :Observation (samplingTime="this morning")
• featureOfInterest: 7th Banana :AnyFeature
• observedProperty: weight :PropertyType
• procedure: KitchenScales :Process
• result: 270g :Measure
O-LF :Observation (samplingTime="2006-08-08")
• featureOfInterest: Leederville Formation :AnyFeature
• observedProperty: foliation orientation :PropertyType
• procedure: Brunton :Process
• result: 63/085 :Record
O_SAB :Observation (samplingTime="1999-01-14")
• featureOfInterest: H69 :Specimen
• observedProperty: species :PropertyType
• procedure: Amy Bachrach :Process
• result: Eucalyptus caesia :AnyDefinition
O_I345 :Observation (samplingTime="2003")
• featureOfInterest: Camp Iota :AnyFeature
• observedProperty: IR Radiance :PropertyType
• procedure: Aster :Process
• result: ASgh67c :Image
O_Assay689 :Observation (samplingTime="1996-03-30", resultTime="1996-04-16")
• featureOfInterest: WMC997t :Specimen
• sampledFeature (of WMC997t): Empire Dam :GeologicUnit
• location: ABC Labs :AnyFeature
• observedProperty: gold concentration :PropertyType
• procedure: ICPMS :Process
• result: 5.6 ppm :Measure
O_Depth :Observation (samplingTime="-1.75Ga")
• featureOfInterest: BF Ore Body :GeologicUnit
• observedProperty: depth :PropertyType
• procedure: X-Z geobarometer :Process
• result: 3.5km :Measure
Run999_20081020_1 :Observation (samplingTime="tomorrow", resultTime="today")
• featureOfInterest: Grid999 :SamplingSolid
• sampledFeature (of Grid999): EarthAtmosphere
• observedProperty: pressure :PropertyType
• procedure: CMIP3 :Process
• result: pf999_20081020_1 :CV_DiscreteGridPointCoverage
Generic pattern for observation data
An Observation is an action whose result is an estimate of the value of some property of the feature-of-interest, obtained using a specified procedure
Where’s the “observation location”?
In the feature-of-interest - this reconciles remote, lab, and in-situ observations
Name: Overview: Observation; Package: «Application Schema» OM Observation Core; Version: 1.0; Author: Simon Cox
OM_Observation
+ samplingTime
+ resultTime [0..1]
+ procedureOperator [0..1]
+ parameter [0..*]
+ resultQuality [0..1]
Associations from OM_Observation:
• result: Any
• observedProperty [1]: OF_PropertyType
• featureOfInterest [1]: OF_AnyFeature (inverse role: propertyValueProvider [0..*])
• procedure [1]: OM_Process (inverse role: generatedObservation [0..*])
Conformant with ISO 19100 CSL and meta-model
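The core pattern can be sketched in Python as below. This is a minimal illustration, assuming plain strings stand in for features, property types and processes. The field names mirror the role names in the model; the Measure helper is an assumption introduced for the example, not part of the model.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Optional


@dataclass(frozen=True)
class Measure:
    """Illustrative result type (an assumption): a number with a unit."""
    value: float
    uom: str


@dataclass
class Observation:
    """One observation, using the role names of the core model."""
    feature_of_interest: str            # featureOfInterest [1]
    observed_property: str              # observedProperty [1]
    procedure: str                      # procedure [1]
    result: Any                         # result: any type
    sampling_time: str                  # samplingTime
    result_time: Optional[str] = None   # resultTime [0..1]
    parameter: Dict[str, Any] = field(default_factory=dict)  # parameter [0..*]


# The banana example from the "Examples" slide.
o7 = Observation(
    feature_of_interest="7th Banana",
    observed_property="weight",
    procedure="KitchenScales",
    result=Measure(270.0, "g"),
    sampling_time="this morning",
)
print(o7.result.value, o7.result.uom)
```

Note that there is no separate "location" field: any location is reached through the feature of interest.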
O&M vs. OBOE
An Observation is an action whose result is an estimate of the value of some property of the feature-of-interest, obtained using a specified procedure
NCEAS OBOE: An Observation is the Measurement of the Value of a Characteristic of some Entity in a particular Context
TDWG examples
OOc2008_786 :OrganismOccurence is the featureOfInterest of three observations:
• O2008_786s :Observation, observedProperty: Taxon :PropertyType, procedure: FlipDibner :Process, result: E. caesia :ScopedName
• O2008_786t :Observation, observedProperty: Time :PropertyType, procedure: FlipDibnersWatch :Process, result: 2008-10-18T18:17:00.00+08:00 :DateTime
• O2008_786x :Observation, observedProperty: Location :PropertyType, procedure: FlipDibnersGPS :Process, result: -31.5 115.5 :DirectPosition
Survey_2008_996 :Observation (samplingTime="2008-10-18")
• featureOfInterest: Plot_996 :SamplingSurface
• sampledFeature (of Plot_996): Ecosystem_abt417h
• observedProperty: OrganismCount :PropertyType (taxon: E. caesia :ScopedName)
• procedure: FlipDibner :Process
• result: 26 :Integer
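As a rough sketch of how the three occurrence observations relate to one feature, the Python below groups observation results by feature of interest. The tuple layout and the function name are assumptions made for illustration, and the position is read as latitude -31.5, longitude 115.5.

```python
from collections import defaultdict

# Each tuple: (observation id, feature of interest, observed property,
# procedure, result). Values are taken from the TDWG example.
observations = [
    ("O2008_786s", "OOc2008_786", "Taxon", "FlipDibner", "E. caesia"),
    ("O2008_786t", "OOc2008_786", "Time", "FlipDibnersWatch",
     "2008-10-18T18:17:00.00+08:00"),
    # Position assumed to be (latitude, longitude).
    ("O2008_786x", "OOc2008_786", "Location", "FlipDibnersGPS", (-31.5, 115.5)),
]


def property_values(obs):
    """Group results by feature of interest, keyed by observed property."""
    features = defaultdict(dict)
    for _obs_id, feature, prop, _procedure, result in obs:
        features[feature][prop] = result
    return dict(features)


occurrences = property_values(observations)
print(occurrences["OOc2008_786"]["Taxon"])
```

Each observation supplies one property value of the occurrence, while the procedure that produced it stays recorded on the observation itself.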
Sampling strategies and relationships
SamplingFeature
+ samplingTime [0..1]
+ parameter [0..*]
Associations from SamplingFeature:
• sampledFeature («Intention»): AnyFeature (the domain feature type)
• relatedObservation [0..*]: Observation
• relatedSamplingFeature [0..*]: SamplingFeature (labelled "Complex" in the diagram)
Specializations of SamplingFeature:
• SamplingPoint
• Specimen (+ materialClass, + samplingMethod [0..1], + samplingLocation [0..1], + size [0..1], + currentLocation [0..1])
• SpatiallyExtensiveSamplingFeature, specialized as SamplingCurve (+ length [0..1]), SamplingSurface (+ area [0..1]) and SamplingSolid (+ volume [0..1])
Domain-specific sampling features shown: Station, Section, MapHorizon, Plot, Mine, Traverse, Borehole
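The sampling-feature hierarchy can be sketched as Python dataclasses. This is an illustration only: plain strings stand in for features, only some attributes are included, and the area value is hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class SamplingFeature:
    """A feature that exists in order to sample some domain feature."""
    name: str
    sampled_feature: str                     # sampledFeature: the domain feature
    related_observation: List[str] = field(default_factory=list)  # [0..*]


@dataclass
class SamplingPoint(SamplingFeature):
    pass


@dataclass
class SamplingCurve(SamplingFeature):
    length: Optional[float] = None           # length [0..1]


@dataclass
class SamplingSurface(SamplingFeature):
    area: Optional[float] = None             # area [0..1]


@dataclass
class SamplingSolid(SamplingFeature):
    volume: Optional[float] = None           # volume [0..1]


@dataclass
class Specimen(SamplingFeature):
    # materialClass has no multiplicity marker in the model; the default
    # here exists only because dataclass fields after a default need one.
    material_class: str = "unspecified"
    current_location: Optional[str] = None   # currentLocation [0..1]


# Names from the examples; the area (units unspecified) is hypothetical.
plot = SamplingSurface("Plot_996", sampled_feature="Ecosystem_abt417h", area=400.0)
specimen = Specimen("WMC997t", sampled_feature="Empire Dam")
print(plot.sampled_feature, specimen.sampled_feature)
```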
Ontologies for observations
Obrst 2006 - Ontology Spectrum: One View
From less to more expressive (weak semantics to strong semantics):
• Taxonomy: Is Sub-Classification of
• Thesaurus: Has Narrower Meaning Than
• Conceptual Model: Is Subclass of
• Logical Theory: Is Disjoint Subclass of with transitivity property
Technologies placed along the spectrum: Relational Model, XML; ER; Extended ER; DB Schemas, XML Schema; RDF/S; XTM; UML; Description Logic; DAML+OIL, OWL; First Order Logic; Modal Logic
Interoperability levels: Syntactic, Structural, Semantic
Problem: Very General; Semantic Expressivity: Very High
Problem: Local; Semantic Expressivity: Low
Problem: General; Semantic Expressivity: Medium
Problem: General; Semantic Expressivity: High
What’s this got to do with Ontologies?
• UML is a formal language
• UML vs. OWL … similarly expressive
• Especially if UML profile and «stereotype» used
• ISO 19103 profile models may be transformed into OWL without too much difficulty
• ISO 19150 will define a UML→OWL rule
• … but converting the O&M model just gives you an OWL representation of the schema
• Is this useful for reasoning?
• O&M model scopes the (discipline specific) ontologies required for observational data
• and describes the relationships between them
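To make the "OWL representation of the schema" point concrete, here is a naive Python sketch that emits Turtle for OM_Observation and its associations. It is not the ISO 19150 rule, whose details are not given here. The base IRI is hypothetical, and the mapping (class to owl:Class, association role to owl:ObjectProperty) is an assumption chosen for illustration.

```python
# Hypothetical base IRI; no O&M namespace is given in these slides.
BASE = "http://example.org/om#"

# Associations of OM_Observation as (role name, target class).
ASSOCIATIONS = [
    ("featureOfInterest", "OF_AnyFeature"),
    ("observedProperty", "OF_PropertyType"),
    ("procedure", "OM_Process"),
]


def to_turtle(class_name, associations, base=BASE):
    """Emit one owl:Class per class and one owl:ObjectProperty per role."""
    lines = [
        f"@prefix om: <{base}> .",
        "@prefix owl: <http://www.w3.org/2002/07/owl#> .",
        "@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .",
        "",
        f"om:{class_name} a owl:Class .",
    ]
    for role, target in associations:
        lines.append(f"om:{target} a owl:Class .")
        lines.append(
            f"om:{role} a owl:ObjectProperty ; "
            f"rdfs:domain om:{class_name} ; rdfs:range om:{target} ."
        )
    return "\n".join(lines)


turtle = to_turtle("OM_Observation", ASSOCIATIONS)
print(turtle)
```

The output restates the schema and nothing more; the discipline-specific content still has to come from the vocabularies that fill each role.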
Discipline or community profile
• feature of interest
• Types define a domain-model (e.g. Plot, Ecosystem, OrganismOccurence)
• observed property
• Belongs to the type of the feature-of-interest (e.g. organism count, taxon, time, location)
• procedure
• Standard procedures, suitable for the property-type
• result
• Standard scales suitable for the property-type (e.g. taxonomy)
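A community profile can be pictured as a set of vocabularies bound to the slots of the O&M skeleton. The Python sketch below assumes a hypothetical ecology profile built from terms used in the examples in these slides. The dictionary layout and the check function are illustrative, and the result scale is left out for brevity.

```python
# Hypothetical ecology profile: one vocabulary per slot of the skeleton.
PROFILE = {
    "feature_types": {"Plot", "Ecosystem", "OrganismOccurence"},
    "observed_properties": {"OrganismCount", "Taxon", "Time", "Location"},
    "procedures": {"FlipDibner", "FlipDibnersWatch", "FlipDibnersGPS"},
}


def check(observation, profile=PROFILE):
    """Return the names of the slots whose value is outside the profile."""
    slots = [
        ("feature_type", "feature_types"),
        ("observed_property", "observed_properties"),
        ("procedure", "procedures"),
    ]
    return [slot for slot, vocab in slots
            if observation[slot] not in profile[vocab]]


survey = {
    "feature_type": "Plot",
    "observed_property": "OrganismCount",
    "procedure": "FlipDibner",
}
print(check(survey))
```

An observation of gold concentration on a Plot would be flagged, because that property belongs to a different community's vocabulary.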
Ontology enabled profiles
• Step one: align ontology and O&M skeleton
• Step two: round trip transformation
• Transform UML model into OWL (done)
• Use OWL to develop vocabularies on top of O&M skeleton
• Use extended UML-based MDA process to generate XML schemas
• Motivations
• Better quality vocabularies
• Greater consistency of the conceptual model
Vocabulary dependencies in O&M
[Figure: dependency diagram relating vocabularies to UML definitions, arranged under the questions VALUE, HOW, WHO, WHAT, WHEN, WHERE and IN WHAT]
Labels in the figure: Observation; Sampling Feature; Sampled Feature; Observed property; Metadata; Procedure; Result; Result type; Time*; Generic type; Geometrical types; Temporal types; Units; Quantities; Taxa; Chemistry; Geometry; Coord. Sys; Vertical Coord. Sys; Medium; Fraction; Procedures; Processing chain type; Processing & interpolation; Validation & quality flag; Sensor (Instrument); Station; Platform; Site; Water Feature; Institution and project; System and author; Field & Lab methods; Security classif.; Transaction type; Gauge/weir layout/profile; Missing data; Observations codes; Feature codes; Feature property; Parameters; Feature-dependent parameters; Feature-independent parameters; Survey type; Process; Action; Event; Action codes; Event codes; Features types
Legend: Multi-dependent concepts; Feature-dep. parameters; Feature-indep. parameters; Abstract concepts; Semi-abstract concepts; Semi-primitive concepts; Primitive concepts; O&M and GFM stereotypes; Simple classes; Classes w/ ident. instances; Onto category to be defined
Time*: two O&M stereotypes (sampling time and result time)
Development and validation of “O&M”
• Developed in the context of
• Geochemistry/Assay data
• OGC Sensor Web Enablement – environmental and remote sensing
• Subsequently applied in
• Water resources/water quality
• Oceans & Atmospheres
• Natural resources
• Taxonomic data
• Geology field data
Scopes the ontologies for domain observations
• Feature types (feature of interest, sampling features)
• Observed properties
• Observation procedures, instruments, algorithms
• Scales, taxonomies
O&M Status
• OGC Standard 2007
• ISO 19156 – upcoming
• Key aspect of GeoSciML
• Basis for WaterML v2
• Basis for Climate Science ML
Motivation for developing a common model
• Cross-domain data discovery and fusion
• Re-usable service interfaces
Thank you
Exploration & Mining
Simon Cox
Research Scientist
Phone: +61 8 6436 8639
Email: [email protected]
Web: www.csiro.au/em
Contact Us
Phone: 1300 363 400 or +61 3 9545 2176
Email: [email protected]
Web: www.csiro.au
Generations of “standards” & integration complexity
[Figure: generations of surface water & groundwater “standards” plotted against integration support, for standard users and standard developers]
Generations and integration approaches shown: ASCII-based; DB-based; Registries; XML; Model-driven generation of XML schemas; Custom XSL transfo. & web services; Distributed systems with same db schema; UML & XML schemas; XML schemas: Reusable XML schema stack; Master Data Managt; OWL ontologies: Semantic integration
“Standards” shown: EPA STORET, EPA WQX, GWML, WDTF, WFD schemas, eWater (EU), SANDRE, SANDRE XML, ODM, WaterML (CUAHSI)
Our Science is changing: scale
From small scale siloed studies
To Integration on a global scale
Atom, Molecule, Mineral, Rock, Outcrop, Section, Mountain, Continent, Planet
Source: Office of Integrative Activities NSF
Our Science is changing: interdisciplinary
Source: US Global Change Research Program
Ontological value of the Observations & Measurements standard
• Two user-managed class hierarchies in GFM-based specs:
• Feature and FeaturesCollection: a Feature-type is characterized by a specific set of properties
• Up to five user-managed class hierarchies in O&M-based specs
• Observation, SamplingFeature, PropertyType, Procedure and Result
• An Observation is an Event whose result is an estimate of the value of some Property of the Feature-of-interest, obtained using a specified Procedure
• Stronger ontological value for O&M
• More branches and separation of concerns
• Example: Difference between Feature and SamplingFeature
• Feature for the real world objects e.g. an aquifer
• SamplingFeature to characterise how a measurement is made e.g. along a borehole
Normalised ontology skeleton for water observation vocabularies
Define the right branches at the top
Isolate unambiguous primitives (e.g. units)
Use modules/namespace/URIs to position source-specific definitions against common ones
Observation data interface
• OGC “Sensor Observation Service”
• HTTP interface to sensor observations
• cf. WFS, WCS, WMS
• Request parameters scoped by O&M model
• featureOfInterest
• observedProperty
• procedure
• Response is XML-encoded O&M
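A request scoped by those three parameters might be assembled as in the Python sketch below. The endpoint is hypothetical. Whether a given server accepts key-value GET requests, and which version or offering parameters it also requires, depends on the SOS version and deployment, so treat the parameter set as an assumption.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def get_observation_url(endpoint, feature_of_interest, observed_property, procedure):
    """Build a GetObservation query string scoped by the O&M roles."""
    query = urlencode({
        "service": "SOS",
        "request": "GetObservation",
        "featureOfInterest": feature_of_interest,
        "observedProperty": observed_property,
        "procedure": procedure,
    })
    return f"{endpoint}?{query}"


# Hypothetical endpoint; identifiers reuse names from the examples.
url = get_observation_url(
    "http://example.org/sos", "Station1", "WaterQuality", "Fooglemeter 2000"
)
params = parse_qs(urlsplit(url).query)
print(params["procedure"])
```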
Proximate vs ultimate feature-of-interest
Ultimate (“project”) thing of interest often not directly or fully accessible
1. Proximate feature of interest embodies a sample design
• Rock-specimen samples an ore-body or geologic unit
• Well samples an aquifer
• Profile samples an ocean/atmosphere column
• Cross-section samples a rock-unit
2. Sensed property is a proxy
• e.g. want land-cover, but observe colour
Some sampling designs are common across disciplines
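The step from proximate to ultimate feature of interest amounts to following sampledFeature links. The Python sketch below assumes those links are held in a simple lookup table; the table layout and function name are illustrative, and the names come from examples elsewhere in these slides.

```python
# sampledFeature links: sampling feature -> the feature it samples.
SAMPLED_FEATURE = {
    "WMC997t": "Empire Dam",           # specimen -> geologic unit
    "Grid999": "EarthAtmosphere",      # sampling solid -> atmosphere
    "Plot_996": "Ecosystem_abt417h",   # sampling surface -> ecosystem
}


def ultimate_feature(feature, links=SAMPLED_FEATURE):
    """Follow sampledFeature links until a domain feature is reached."""
    seen = set()
    while feature in links:
        if feature in seen:
            raise ValueError(f"cycle in sampledFeature links at {feature!r}")
        seen.add(feature)
        feature = links[feature]
    return feature


print(ultimate_feature("WMC997t"))
```

A feature with no sampledFeature link is returned unchanged, which covers observations made directly on the domain feature.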
Water quality of aquifers observed in wells
[Figure: object diagram; Name: Well&Intervals; Package: Examples; Version: 1.0; Author: Simon Cox]
Objects shown:
• Features: EvertsWell, RobsWell, LeedervilleFormation, CottesloeWedge, GnangaraMound
• IntervalEW1 :SamplingCurve, IntervalEW2 :SamplingCurve
• EW/EW1 :SamplingFeatureRelation (role="intervalHost"), EW/EW2 :SamplingFeatureRelation (role="intervalHost")
• WQ2/1 :Observation, WQ2/2 :Observation, WQ3/1 :Observation (shown twice)
• NWC2007/WQ :ObservationCollection
• Fooglemeter 2000 :ObservationProcess, Farklemeter XP :ObservationProcess
• WaterQuality :PropertyType
• WQ3/1/r :CV_DiscreteTimeInstantCoverage
Associations shown: sampledFeature, relatedSamplingFeature, target, relatedObservation, featureOfInterest, procedure, observedProperty, result, member
Water quality measured along a ferry track
[Figure: object diagram; Name: FerrySampling-P; Package: Examples; Version: 1.0; Author: Simon Cox]
Objects shown:
• FerryTrack 20071113 :SamplingCurve (shape=Curve987:GM_LineString)
• Albemarle-Pamlico Sound :AnyFeature
• Station1 :SamplingPoint (position=Point1:GM_Point), Station2 :SamplingPoint (position=Point2:GM_Point), Station3 :SamplingPoint (position=Point3:GM_Point)
• FT/S1, FT/S2, FT/S3 :SamplingFeatureRelation (role="host track"); these stations must lie on this track
• WQ200711113-1, WQ200711113-2, WQ200711113-3 :Observation
• WQ20071113/r1, WQ20071113/r2, WQ20071113/r3 :Record
• WQ/C-9 (Water Quality) :PropertyType
• Fooglemeter 2000 :ObservationProcess
Associations shown: sampledFeature, relatedSamplingFeature, target, relatedObservation, featureOfInterest, procedure, observedProperty, result
[Figure: object diagram; Name: FerrySampling-C; Package: Examples; Version: 1.0; Author: Simon Cox]
Objects shown:
• FerryTrack 20071113 :SamplingCurve (shape=Curve987:GM_LineString)
• Albemarle-Pamlico Sound :AnyFeature
• WQ200711113 :Observation
• WQ20071113/r :CV_DiscretePointCoverage
• WQ/C-9 (Water Quality) :PropertyType
• Fooglemeter 2000 :ObservationProcess
Associations shown: sampledFeature, relatedObservation, featureOfInterest, procedure, observedProperty, result
Patterns?
• Much of the interest concerns
• relations between sampling features,
• associations with the domain (sampled) features
• i.e. sampling regimes are core
OGC Sensor Web Enablement
• OGC Web Services testbeds
• OWS-1 2001 – OWS-5 2007
• Core elements of OGC SWE suite
• SensorML – provider-centric information viewpoint
• O&M – consumer-centric information viewpoint
• SOS, SAS – http interface to observations
• SPS – tasking interface
• sweCommon – data-types & encodings, including coverage encoding
• TML – low-level sensor streams
Accessing data using the “Observation” viewpoint
[Figure: service interaction diagram centred on SOS]
• SOS: getObservation, getResult, describeSensor, getFeatureOfInterest
• WFS/Obs: getFeature, type=Observation
• WCS: getCoverage, getCoverage(result)
• Sensor Register: getRecordById
• WFS: getFeature
e.g. SOS::getResult == “convenience” interface for WCS
Accessing data using the “Sampling Feature Service” viewpoint
[Figure: service interaction diagram centred on WFS/SFS, with a common data source]
Services shown: WFS/SFS, WFS, WCS, SOS, Sensor Register
Operations shown: getFeature; getFeature(sampling Feature); getFeature(coverage property value); getFeature(relatedObservation); getFeature(featureOfInterest); getCoverage; getCoverage(property value); getCoverage(result); getObservation; getObservation(relatedObs); getResult(property value); getRecordById (procedure)
Accessing data using the “Domain Feature” viewpoint
[Figure: service interaction diagram centred on WFS]
• WFS: getFeature
• WCS: getCoverage(property value)
• SOS: getResult(property value)
The “George Percivall preferred™” viewpoint #1 – observations are property-value-providers for features
Accessing data using the “just the data” viewpoint
[Figure: service interaction diagram centred on WCS]
• WCS: getCoverage
• WFS: getFeature/geometry (domain extent)
• SOS: getResult (lots of ‘em) (range values)
The “George Percivall preferred™” viewpoint #2 – observations are range-value-providers for coverages
Application to other science disciplines
• need information transfer standards for
• Geochemistry
• Geochronology
• Geophysics
• Geodesy
• Seismology
• Hydrogeology
• Marine
• Ecology
• Biogeology
• But need to coordinate these standards (including ontologies) to avoid uncontrolled growth of YAML (Yet Another Markup Language)
http://www.datastrategyjournal.com/index.php?option=com_content&task=view&id=18&Itemid=1
Procedure vs. observedProperty
• observedProperty supports discovery by observation users
• “show me all the observations of temperature and wind-speed”
• procedure provides strict definition
• “how was that value obtained?”
• …or provider-centric discovery
• “show me all the data collected by instrument X”
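The two discovery styles can be contrasted with a small Python sketch. The records, identifiers and function names are hypothetical and exist only to show the two filters side by side.

```python
# Hypothetical observation metadata records.
observations = [
    {"id": "obs1", "observedProperty": "temperature", "procedure": "instrument X"},
    {"id": "obs2", "observedProperty": "wind-speed", "procedure": "instrument Y"},
    {"id": "obs3", "observedProperty": "pressure", "procedure": "instrument X"},
]


def by_property(obs, wanted):
    """User-centric discovery: select by what was observed."""
    return [o["id"] for o in obs if o["observedProperty"] in wanted]


def by_procedure(obs, procedure):
    """Provider-centric discovery: select by how it was observed."""
    return [o["id"] for o in obs if o["procedure"] == procedure]


print(by_property(observations, {"temperature", "wind-speed"}))
print(by_procedure(observations, "instrument X"))
```

The same record answers both queries, which is why the model keeps the observed property and the procedure as separate roles.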
Some properties vary within a feature
• colour of a Scene or Swath varies with position
• shape of a Glacier varies with time
• flow at a Station varies with time
• rock density varies along a Borehole
• Variable values may be described as a Function on some axis of the feature
• Corresponding Observation/result is a Function
• If domain is spatio-temporal, also known as coverage or map
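A function-valued result can be sketched in Python as a discrete mapping from time instants to values, here for flow at a station. The class name, the sample values and the timestamps are all hypothetical.

```python
class TimeInstantCoverage:
    """Discrete function from time instants to values (illustrative)."""

    def __init__(self, samples):
        # samples: iterable of (instant, value) pairs
        self._values = dict(samples)

    def domain(self):
        """The instants at which the function is defined, in order."""
        return sorted(self._values)

    def evaluate(self, instant):
        """Value at a sampled instant; raises KeyError between instants."""
        return self._values[instant]


# Hypothetical flow values at a station.
flow = TimeInstantCoverage([
    ("2008-09-30T00:00", 1.2),
    ("2008-09-30T06:00", 1.5),
    ("2008-09-30T12:00", 1.1),
])

# One observation whose result is the whole function, not a single value.
observation = {
    "featureOfInterest": "Station1",
    "observedProperty": "flow",
    "result": flow,
}
print(observation["result"].evaluate("2008-09-30T06:00"))
```

One observation carries the whole series, so the feature of interest, observed property and procedure are stated once rather than per value.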
Variable property: coverage-valued result
«FeatureType» Observation is specialized according to the type of its result:
• «FeatureType» DiscreteCoverageObservation, result: CV_DiscreteCoverage
• «FeatureType» PointCoverageObservation, result: CV_DiscretePointCoverage
• «FeatureType» ElementCoverageObservation, result: CV_DiscreteElementCoverage
• «FeatureType» TimeSeriesObservation, result: CV_DiscreteTimeInstantCoverage
CV_DiscreteCoverage is a specialization of CV_Coverage