geoscience knowledge representation using the sweet ontologies

29
Geoscience Knowledge Representation Using the SWEET Ontologies Rob Raskin Jet Propulsion Laboratory

Upload: mimis

Post on 19-Jan-2016

26 views

Category:

Documents


0 download

DESCRIPTION

Geoscience Knowledge Representation Using the SWEET Ontologies. Rob Raskin Jet Propulsion Laboratory. Transforming Data into Knowledge. Data InformationKnowledge. Basic Elements Bytes NumbersModelsFacts - PowerPoint PPT Presentation

TRANSCRIPT

Page 1: Geoscience Knowledge Representation Using the SWEET Ontologies

Geoscience Knowledge Representation Using the SWEET Ontologies

Rob RaskinJet Propulsion Laboratory

Page 2: Geoscience Knowledge Representation Using the SWEET Ontologies

Transforming Data into Knowledge

Data Information Knowledge

Basic Elements Bytes Numbers Models FactsServices Ingest Archive Visualize Infer Understand PredictStorage File Database HDF-EOS GIS MIS Ontology MindInteroperability Syntactic OPeNDAP WMS/WCS SemanticVolume/Density High/Low Low/HighStatistics Checksum Moments Descriptive InferentialAnalysis Fourier Wavelet EOF SSAMethodology Exploratory-analysis Model-based-mining

Syntax Semantics

Page 3: Geoscience Knowledge Representation Using the SWEET Ontologies

What is Knowledge? Facts, relations, meanings, contexts

Organized information Core ingredient in “common sense” Common understanding In a form to apply reasoning/inference Dynamic

Expandable

Page 4: Geoscience Knowledge Representation Using the SWEET Ontologies

Let’s eat, Grandma.Let’s eat Grandma.

Time flies like an arrow.Fruit flies like a pie.

Semantic Understanding is Semantic Understanding is Difficult!Difficult!

Variable t: temperatureVariable t: time

Sea surface temperature: measured 3 m above surface Sea surface temperature: measured at surface

LA Times headline

Data quality= 5

“Mission accomplished. Major combat operations in Iraq have ended”

Page 5: Geoscience Knowledge Representation Using the SWEET Ontologies

Database vs Knowledge Base Database

Entities and Relations Closed world

All facts included

Knowledge base Classes and Properties Collection of facts Captures corporate memory Open world

Facts not stated may be either true or untrue

Page 6: Geoscience Knowledge Representation Using the SWEET Ontologies

PO.DAAC Knowledge Bases

PeopleDocuments

Data Products

MetadataTools/

Services

Roles/Tasks

ScienceConcepts

Missions

Applications

Instruments

Web Pages

DataProcessing

Organiza-tions

Announce-ments

Inquiries Computers

(Docushare)

Public access

Page 7: Geoscience Knowledge Representation Using the SWEET Ontologies

Relations People have roles Instruments measure science

parameters Inquiries relate to data products etc.

Page 8: Geoscience Knowledge Representation Using the SWEET Ontologies

Example of Knowledge-Assisted Service Yellow Page Lookup:

cars vs automobiles Hotels vs motels vs resorts

Page 9: Geoscience Knowledge Representation Using the SWEET Ontologies

Semantic-based Service Example: Google Type into Google: “gymnasiums in Seattle”

Generates map of Seattle with dots locating gyms Google understands that

Seattle is a place Gymnasiums is a place-based service

Google understands semantics so that the search results also could include

locations near Seattle Similar services (e.g., health club)

Page 10: Geoscience Knowledge Representation Using the SWEET Ontologies

Assertion of Facts as TriplesSubject-Verb-Object representation

Flood subClassOf WeatherPhenomena HDF subClassOf FileFormat Pressure subClassOf PhysicalProperty

Ocean hasSubstance Water AIRS measures Temperature

Page 11: Geoscience Knowledge Representation Using the SWEET Ontologies

Applications Software tools can find “meaning” in resources for

Discovery Fusion Lineage …

Requirements Data products associated with objects in “science concept space”

Richer descriptions than DIFs Data services associated with objects in “service concept space”

Richer descriptions than SERFs Search/fusion tools that exploit ontologies

Page 12: Geoscience Knowledge Representation Using the SWEET Ontologies

Semantic Web Vision Web page creators place XML tags around

technical terms on web pages XML tags point to knowledge base where

term is “defined” Search tools use this information to provide

value-added services Common search engines (Google) use these

capabilities only minimally, at present

Page 13: Geoscience Knowledge Representation Using the SWEET Ontologies

Ontologies

Current preferred method to store “facts” General definition: “all that is known” Computer science definition: Machine-readable definition

of terms and how they relate to one another As with a dictionary, terms are defined in terms of other terms

Provide shared understanding of concepts Support knowledge reuse Support machine-to-machine communications with

deeper semantics than controlled vocabulary

Page 14: Geoscience Knowledge Representation Using the SWEET Ontologies

XML-based Ontology Languages

XML satisfies desired properties for language syntax Readable by both humans and machines

However, there are too many possible ways that XML tags can be named and used

No standardization of XML tag meanings as in HTML (<b> </b> pair => renders in bold)

Additional standardized semantics needed to exploit shared understanding of concepts

Page 15: Geoscience Knowledge Representation Using the SWEET Ontologies

RDF and OWL

W3C has adopted languages that specialize XML Resource Description Formulation (RDF)

Ontology Web Language (OWL) Languages predefine specific tags

RDF: Class, subclass, property, subproperty, … RDF and OWL form a nested collection of languages, each roughly

a specialization of the preceding language with further shared understanding

XML RDF RDFS OWL Lite OWL DL OWL Full

Page 16: Geoscience Knowledge Representation Using the SWEET Ontologies

Semantic Web for Earth and Environmental Terminology (SWEET)

SWEET is a concept space Enables scalable classification of Earth system science

concepts Currently being expanded to Space science

Anybody can import, expand, and specialize the work of others

No need to regenerate a physics, chemistry, or math ontology Concept space is translatable into other

languages/cultures using “sameAs” notions

Page 17: Geoscience Knowledge Representation Using the SWEET Ontologies

Non-LivingSubstances

LivingSubstances

PhysicalProcesses

Earth Realm

PhysicalProperties

Time

NaturalPhenomena

Human Activities

Integrative Ontologies

Space

Data

SWEET Ontologies and Their Interrelationships

Faceted Ontologies

Units

Numerics

Page 18: Geoscience Knowledge Representation Using the SWEET Ontologies

SWEET as an Upper Level Earth Science Ontology

Math Physics Chemistry

Space

TimeProperty

EarthRealmProcess, Phenomena

SubstanceData

StratosphericChemistry

Biogeochemistry Specialized domains

import

import

SWEET

Page 19: Geoscience Knowledge Representation Using the SWEET Ontologies

Why an Upper-Level Ontology for Earth System Science?

Many common concepts used across Earth Science disciplines (such as properties of the Earth)

Provides common definitions for terms used in multiple disciplines or communities

Provides common language in support of community and multidisciplinary activities

Provides common “properties” (relations) for tool developers Reduced burden (and barrier to entry) on creators of

specialized domain ontologies Only need to create ontologies for incremental knowledge

Page 20: Geoscience Knowledge Representation Using the SWEET Ontologies

How SWEET was Initially Populated

Initial sources GCMD

Over 10,000 datasets Over 1000 keywords Data providers submit far more than the 1000 terms for “free-text” search

CF Over 500 keywords Very long term names

surface_downwelling_photon_spherical_irradiance_in_sea_water

Decomposed into facets

Page 21: Geoscience Knowledge Representation Using the SWEET Ontologies

Spatial Ontology

Concepts of 0-D, 1-D, 2-D, and 3-D objects Default coordinate system: lat/lon/up Polygons used to store spatial extents

Spatial attributes added (population, area, etc.) Scientific applications include: geology to represent

3-D structure

Page 22: Geoscience Knowledge Representation Using the SWEET Ontologies

Numerical Ontologies

Numerics Extents: interval, point, 0, positiveIntegers, … Relations: lessThan, greaterThan, …

SpatialEntities Extents: country, Antarctica, equator, inlet, … Relations: above, northOf, …

TemporalEntities Extents: duration, century, season, … Relations: after, before, …

Page 23: Geoscience Knowledge Representation Using the SWEET Ontologies

Numerical Ontologies (cont.) Numeric concepts defined in OWL only through

standard XML XSD spec Intervals defined as restrictions on real line

Numerical relations defined in SWEET lessThan, max, …

Cartesian product (multidimensional spaces) added in SWEET

Numeric ontologies used to define spatial and temporal concepts

Page 24: Geoscience Knowledge Representation Using the SWEET Ontologies

Conceptual Ontologies

Phenomena ElNino, Volcano, Thunderstorm, Deforestation) Each has associated, spatial/temporal extent, EarthRealms,

PhysicalProperties etc. Specific instances included

e.g., 1997-98 ElNino

Human Activities Fisheries, IndustrialProcessing, Economics, Public Good

State History or state of planet or component

Page 25: Geoscience Knowledge Representation Using the SWEET Ontologies

SWEET Users

ESML- Earth Science Markup Language ESIP - Earth Science Information Partner Federation GEON- Geosciences Network GENESIS- Global Environmental & Earth Science

Information System IRI- International Research Institute (Columbia) LEAD- Linked Environments for Atmospheric

Discovery MMI- Marine Metadata Initiative NOESIS PEaCE- Pacific Ecoinformatics and Computational Ecology SESDI- Semantically Enabled Science Data Integration VSTO- Virtual Solar-Terrestrial Observatory

Page 26: Geoscience Knowledge Representation Using the SWEET Ontologies

Collaboration Web Site

Discussion tools Blog, wiki, moderated discussion board

Version Control/ Configuration Management Trace dependencies on external ontologies Tools to search for existing concepts in registered

ontologies Ontology Validation Procedure

W3C note is formal submission method Registry/discovery of ontologies Support workflows/services for ontology development

Page 27: Geoscience Knowledge Representation Using the SWEET Ontologies

Community Issues

Content Maintain alignment given expansion of classes and

properties Standards and Conventions

Agreement on standards for use of OWL Fuzzy representation conventions

Review Board Who will oversee and maintain for perpetuity (or at least

through the next funding cycle) ESIP Federation? ESSI?

Global Support Provide tools to visualize and appreciate the big picture

Page 28: Geoscience Knowledge Representation Using the SWEET Ontologies

Update/Matching Issues

No removal of terms except for spelling or factual errors

Subscription service to notify affected ontologies when changes made

Must avoid contradictions Additions can create redundancy if sameAs not used Humans must oversee “matching”

CF has established moderator to carry out analogous additions

OWL “import” imports entire file Associate community with ontology terms

Community tagging

Page 29: Geoscience Knowledge Representation Using the SWEET Ontologies

Best Practices Keep ontologies small, modular

Be careful that “Owl:Import” imports everything

Use higher level ontologies where possible Identify hierarchy of concept spaces

Model schemas Try to keep dependencies unidirectional