Semantic Web: Introduction & Overview

Post on 07-May-2015


DESCRIPTION

A lecture/conversation focusing on the first 12 years of Semantic Web - delivered on February 21, 2012. See http://j.mp/SWIntro for more details. More detailed course material is at http://knoesis.org/courses/web3/

TRANSCRIPT


Semantic Web: intro & overview

A conversation with students – Feb 21, 2012

Amit Sheth http://knoesis.org/amit

Kno.e.sis – Ohio Center of Excellence in Knowledge-enabled Computing, Wright State University, Dayton, OH, USA

What are two of the most important software success stories of 2012?

Apple’s Siri

IBM’s Watson

What are common technologies?

Just stepping back a bit

Semantic technologies in the mainstream

• Microsoft purchased Powerset in 2008
• Apple purchased Siri [Apr 2010]
– “Once Again The Back Story Is About Semantic Web”
• Google buys Metaweb [June 2010]: “Google Snaps Up Metaweb in Semantic Web Play”
– Now see: “Google Knowledge Graph Could Change Search Forever”
• Facebook OpenGraph, Twitter annotation: “another example of semantic web going mainstream”, “Google, Twitter and Facebook build the semantic web”
• RDFa adoption: search engines (esp. Bing) are about to introduce domain models and (all) use of background knowledge/structured databases with large entity bases
• Bing, Yahoo! and Google announced schema.org

A bit of history

• Semantics with metadata and ontologies for heterogeneous documents and multiple repositories of data, including the Web, was discussed in the 1990s (semantic information brokering, faceted search, InfoHarness, SIMS, Ariadne, OBSERVER, SHOE, MREF, InfoQuilt, …). Also DAML and OIL.
• Tim Berners-Lee used “Semantic Web” in his 1999 book
• I had founded a company, Taalee, in 1999, gave a keynote on Semantic Web & commercialization in 2000, and filed for a patent in 2000 (awarded 2001).
• The well-known TBL, Hendler, Lassila paper in Scientific American took an AI-ish approach (agents, …) to the Semantic Web
• The first 5 years saw too much of AI/DL, but more practical/applied work has dominated recently

Different foci

• TBL – focus on data: Data Web (“In a way, the Semantic Web is a bit like having all the databases out there as one big database.”)
• Others focus on reasoning and intelligent processing

1, 2, 3 of Semantic Web

1. Ontology: Agreement on a common vocabulary/nomenclature, conceptual models and domain knowledge
• Schema + Knowledge base
• Agreement is what enables interoperability
• Formal description: machine processability is what leads to automation

2. Semantic Annotation (Metadata Extraction): Associating meaning with data, or labeling data so it is more meaningful to the system and people.
• Can be manual, semi-automatic (automatic with human verification), or automatic.

From Syntax to Semantics

Shallow semantics

Deep semantics

Expressiveness, Reasoning

Changing Focus on Interoperability in Information Systems: From System, Syntax, Structure to Semantics

3. Reasoning/Computation: semantics-enabled search, integration, answering complex queries, connections and analyses (paths, subgraphs), pattern finding, mining, hypothesis validation, discovery, visualization

Semantic Web Stack

• Web of Linked Data
• Introduced by Berners-Lee et al. as the next step for the Web of Documents
• Allow “machine understanding” of data
• Create “common” models of domains using formal language: ontologies

Layer cake image source: http://www.w3.org; see W3C SW publications

Semantic Web Layer Cake

Characteristics of Semantic Web

Self-Describing

Machine & Human Readable

Issued by a Trusted Authority

Easy to Understand

Convertible

Can be Secured

The Semantic Web: XML, RDF & Ontology

Adapted from William Ruh (CISCO)

• Resource Description Framework – Recommended by W3C for metadata modeling [RDF]

• A standard common modeling framework – usable by humans and machine understandable

Resource Description Framework

IBM

Armonk, New York, United States

Zurich, Switzerland

Location

Company

Headquarters located in

Research lab located in

RDF/OWL slides From: Semantic Web in Health Informatics (thanks: Satya)

• RDF Triple
o Subject: The resource that the triple is about
o Predicate: The property of the subject that is described by the triple
o Object: The value of the property
• Web Addressable Resource: Uniform Resource Locator (URL), Uniform Resource Identifier (URI), Internationalized Resource Identifier (IRI)
• Qualified Namespace: http://www.w3.org/2001/XMLSchema# as xsd:
o xsd:string instead of http://www.w3.org/2001/XMLSchema#string

RDF: Triple Structure, IRI, Namespace
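To make the triple and namespace ideas concrete, here is a minimal sketch in plain Python (no RDF library). The `ex:` namespace, the predicate name `ex:headquartersLocatedIn` and the `expand` helper are assumptions made for illustration; only the `xsd:` namespace comes from the slide.

```python
# Illustrative prefix table. Only the xsd: entry comes from the slide;
# the ex: namespace is a hypothetical placeholder.
PREFIXES = {
    "xsd": "http://www.w3.org/2001/XMLSchema#",
    "ex": "http://example.org/company#",
}

def expand(qname: str) -> str:
    """Expand a prefixed name such as 'xsd:string' into a full IRI."""
    prefix, _, local = qname.partition(":")
    if prefix not in PREFIXES:
        raise KeyError(f"unknown prefix: {prefix}")
    return PREFIXES[prefix] + local

# A triple is simply (subject, predicate, object).
triple = ("ex:IBM", "ex:headquartersLocatedIn", "ex:Armonk")

# The same triple with every term written as a full IRI.
expanded = tuple(expand(term) for term in triple)
```

The short form and the expanded form identify the same resources; the prefix is only a writing convenience.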

IBM Armonk, New York, United States

Headquarters located in

• Two types of property values in a triple
o Web resource
o Typed literal

RDF Representation

IBM Armonk, New York, United States

Headquarters located in

IBM

Has total employees

“430000”^^xsd:integer

• The graph model of RDF: node-arc-node is the primary representation model

• Secondary notations: Triple notation
o companyExample:IBM companyExample:has-Total-Employee “430000”^^xsd:integer .

• RDF Schema: Vocabulary for describing groups of resources [RDFS]

RDF Schema

IBM Armonk, New York, United States

Headquarters located in

Oracle

Redwood Shores, California, United States

Headquarters located in

Company

Geographical Location

Headquarters located in

• Property domain (rdfs:domain) and range (rdfs:range)

RDF Schema

Headquarters located in

Company

Domain Range

Geographical Location
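The effect of rdfs:domain and rdfs:range can be sketched as a small inference step: whenever a property is used, the subject is inferred to belong to the domain class and the object to the range class. This is a minimal illustration in plain Python, not a full RDFS reasoner; the `ex:` names are hypothetical stand-ins for the labels in the figure.

```python
RDF_TYPE = "rdf:type"
DOMAIN = "rdfs:domain"
RANGE = "rdfs:range"

# Schema-level triples (names are illustrative stand-ins for the figure labels).
schema = {
    ("ex:headquartersLocatedIn", DOMAIN, "ex:Company"),
    ("ex:headquartersLocatedIn", RANGE, "ex:GeographicalLocation"),
}

# Instance-level triples.
data = {
    ("ex:IBM", "ex:headquartersLocatedIn", "ex:Armonk"),
    ("ex:Oracle", "ex:headquartersLocatedIn", "ex:RedwoodShores"),
}

def infer_types(schema, data):
    """Derive rdf:type triples from domain and range declarations."""
    inferred = set()
    for prop, kind, cls in schema:
        for s, p, o in data:
            if p != prop:
                continue
            if kind == DOMAIN:
                inferred.add((s, RDF_TYPE, cls))   # subject is in the domain class
            elif kind == RANGE:
                inferred.add((o, RDF_TYPE, cls))   # object is in the range class
    return inferred

types = infer_types(schema, data)
```

Note that domain and range are used here to infer new facts, not to reject data, which is how RDFS treats them.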

• Class Hierarchy/Taxonomy: rdfs:subClassOf

rdfs:subClassOf

SubClass: Computer Technology Company

(Parent) Class: Company

Banking Company

Insurance Company
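Because rdfs:subClassOf is transitive, an instance of a subclass is also an instance of every ancestor class. A minimal sketch of that closure in plain Python follows; the `ex:` names mirror the figure, and the extra `ex:Organization` level is a hypothetical addition used only to show transitivity.

```python
# Direct subclass links: child -> set of parents.
SUBCLASS_OF = {
    "ex:ComputerTechnologyCompany": {"ex:Company"},
    "ex:BankingCompany": {"ex:Company"},
    "ex:InsuranceCompany": {"ex:Company"},
    "ex:Company": {"ex:Organization"},  # hypothetical extra level
}

def superclasses(cls, subclass_of):
    """Return every ancestor of cls by following subclass links transitively."""
    seen = set()
    stack = [cls]
    while stack:
        current = stack.pop()
        for parent in subclass_of.get(current, ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen

def all_types(asserted_type, subclass_of):
    """All classes an instance belongs to, given one asserted type."""
    return {asserted_type} | superclasses(asserted_type, subclass_of)

ibm_types = all_types("ex:ComputerTechnologyCompany", SUBCLASS_OF)
```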

Ontology: A Working Definition

• Ontologies are shared conceptualizations of a domain represented in a formal language*
• Ontologies:
o Common representation model: facilitate interoperability, integration across different projects, and enforce consistent use of terminology
o Closely reflect domain-specific details (domain semantics) essential to answer end user
o Support reasoning to discover implicit knowledge

* Paraphrased from Gruber, 1993

Expressiveness Range: Knowledge Representation and Ontologies

[Figure: spectrum from Simple Taxonomies to Expressive Ontologies. Points along the spectrum: Catalog/ID; Terms/glossary; Thesauri (“narrower term” relation); Informal is-a; Formal is-a; Formal instance; Frames (properties); Value Restriction; Disjointness, Inverse, part of…; General Logical constraints]

Ontology Dimensions After McGuinness and Finin

Examples placed along the spectrum: Wordnet, CYC, RDF, DAML, OODB Schema, RDFS, IEEE SUO, OWL, UMLS, GO, KEGG, TAMBIS, EcoCyc, BioPAX, GlycO, SWETO, Pharma

• A language for modeling ontologies [OWL]
• OWL2 is declarative
• An OWL2 ontology (schema) consists of:
o Entities: Company, Person
o Axioms: Company employs Person
o Expressions: A Person Employed by a Company = CompanyEmployee
• Reasoning: Draw a conclusion given certain constraints are satisfied
o RDF(S) Entailment
o OWL2 Entailment

OWL2 Web Ontology Language

• Class Disjointness: An instance of class A cannot be an instance of class B
• Complex Classes: Combining multiple classes with set theory operators:
o Union: Parent = ObjectUnionOf (:Mother :Father)
o Logical negation: UnemployedPerson = ObjectComplementOf (:EmployedPerson)
o Intersection: Mother = ObjectIntersectionOf (:Parent :Woman)

OWL2 Constructs
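The set-theoretic reading of these class constructors can be illustrated with Python sets. This is only a sketch under a closed-world assumption: the individuals and memberships are invented, and the complement is taken relative to an explicit, complete list of people. Real OWL2 reasoning is open-world, so absence of a fact does not imply its negation.

```python
# Hypothetical individuals and class memberships (closed, complete lists).
people = {"alice", "bob", "carol", "dave"}
mother = {"alice"}
father = {"bob"}
woman = {"alice", "carol"}
employed = {"alice", "dave"}

# ObjectUnionOf(:Mother :Father)
parent = mother | father

# ObjectIntersectionOf(:Parent :Woman)
mother_by_definition = parent & woman

# ObjectComplementOf(:EmployedPerson), relative to the known set of people
unemployed = people - employed

# Class disjointness: no individual belongs to both classes
mother_father_disjoint = mother.isdisjoint(father)
```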

• Property restrictions: defined over a property
• Existential Quantification:
o Parent = ObjectSomeValuesFrom (:hasChild :Person)
o To capture incomplete knowledge
• Universal Quantification:
o US President = ObjectAllValuesFrom (:hasBirthPlace :UnitedStates)
• Cardinality Restriction

OWL2 Constructs

SPARQL: Querying Semantic Web Data

• A SPARQL query pattern is composed of triples
• Triples correspond to the RDF triple structure, but have a variable at:
o Subject: ?company ex:hasHeadquaterLocation ex:NewYork.
o Predicate: ex:IBM ?whatislocatedin ex:NewYork.
o Object: ex:IBM ex:hasHeadquaterLocation ?location.
• Result of a SPARQL query is a list of values; values can replace the variable in the query pattern

SPARQL: Query Patterns
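How a triple pattern with a variable gets matched against data can be sketched in a few lines of Python. This is an illustration of the idea only, not how a SPARQL engine is implemented; the `match` and `query` helpers are hypothetical, and the data reuses the identifiers from the slide as written.

```python
TRIPLES = [
    ("ex:IBM", "ex:hasHeadquaterLocation", "ex:NewYork"),
    ("ex:Oracle", "ex:hasHeadquaterLocation", "ex:RedwoodCity"),
]

def is_var(term):
    """Variables are written with a leading question mark, as in SPARQL."""
    return term.startswith("?")

def match(pattern, triple):
    """Return variable bindings if the triple fits the pattern, else None."""
    bindings = {}
    for p_term, t_term in zip(pattern, triple):
        if is_var(p_term):
            # A variable used twice must bind to the same value both times.
            if bindings.get(p_term, t_term) != t_term:
                return None
            bindings[p_term] = t_term
        elif p_term != t_term:
            return None
    return bindings

def query(pattern, triples):
    """Collect the bindings of every triple that matches the pattern."""
    results = []
    for triple in triples:
        bindings = match(pattern, triple)
        if bindings is not None:
            results.append(bindings)
    return results

# Variable in subject position, then in predicate position.
by_subject = query(("?company", "ex:hasHeadquaterLocation", "ex:NewYork"), TRIPLES)
by_predicate = query(("ex:IBM", "?whatislocatedin", "ex:NewYork"), TRIPLES)
```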

• An example query pattern

    PREFIX ex: <http://www.eecs600.case.edu/>
    SELECT ?company ?location
    WHERE { ?company ex:hasHeadquaterLocation ?location . }

• Query Result (Multiple Matches)

    company                 location
    IBM                     NewYork
    Oracle                  RedwoodCity
    MicrosoftCorporation    Bellevue

SPARQL: Query Forms

• SELECT: Returns the values bound to the variables
• CONSTRUCT: Returns an RDF graph
• DESCRIBE: Returns a description (RDF graph) of a resource (e.g. IBM)
o The contents of the RDF graph are determined by the SPARQL query processor
• ASK: Returns a Boolean
o True
o False
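The difference between the query forms is mainly in what is done with the matched bindings. A rough Python sketch for SELECT, ASK and CONSTRUCT over a single triple pattern is given below; DESCRIBE is left out because its result is decided by the query processor. The helper names and the `ex:hosts` predicate are assumptions for illustration.

```python
TRIPLES = {
    ("ex:IBM", "ex:hasHeadquaterLocation", "ex:NewYork"),
    ("ex:Oracle", "ex:hasHeadquaterLocation", "ex:RedwoodCity"),
}
PATTERN = ("?company", "ex:hasHeadquaterLocation", "?location")

def solutions(pattern, triples):
    """Yield one {variable: value} dict per triple matching the pattern."""
    for triple in triples:
        binding = {}
        ok = True
        for p_term, t_term in zip(pattern, triple):
            if p_term.startswith("?"):
                if binding.setdefault(p_term, t_term) != t_term:
                    ok = False
                    break
            elif p_term != t_term:
                ok = False
                break
        if ok:
            yield binding

def select(variables, pattern, triples):
    """SELECT: rows of values bound to the requested variables."""
    return sorted(tuple(b[v] for v in variables) for b in solutions(pattern, triples))

def ask(pattern, triples):
    """ASK: True if at least one solution exists."""
    return any(True for _ in solutions(pattern, triples))

def construct(template, pattern, triples):
    """CONSTRUCT: a new set of triples built by filling in the template."""
    return {
        tuple(b.get(term, term) for term in template)
        for b in solutions(pattern, triples)
    }

rows = select(("?company", "?location"), PATTERN, TRIPLES)
graph = construct(("?location", "ex:hosts", "?company"), PATTERN, TRIPLES)
```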

a little bit about ontologies

Open Biomedical Ontologies

http://bioportal.bioontology.org/ , http://obo.sourceforge.net/

Many Ontologies Available Today

From simple ontologies

Drug Ontology Hierarchy (showing is-a relationships)

[Figure: is-a hierarchy rooted at owl:thing. Classes shown: prescription_drug_brand_name, brandname_undeclared, brandname_composite, brandname_individual, prescription_drug, prescription_drug_generic, generic_individual, generic_composite, monograph_ix_class, cpnum_group, prescription_drug_property, indication_property, formulary_property, interaction_property, property, non_drug_reactant, formulary, indication, interaction, interaction_with_prescription_drug, interaction_with_non_drug_reactant, interaction_with_monograph_ix_class]

to complex ontologies

N-Glycosylation metabolic pathway

GNT-I attaches GlcNAc at position 2

UDP-N-acetyl-D-glucosamine + alpha-D-Mannosyl-1,3-(R1)-beta-D-mannosyl-R2 <=> UDP + N-Acetyl-beta-D-glucosaminyl-1,2-alpha-D-mannosyl-1,3-(R1)-beta-D-mannosyl-R2

GNT-V attaches GlcNAc at position 6

UDP-N-acetyl-D-glucosamine + G00020 <=> UDP + G00021

N-acetyl-glucosaminyl_transferase_V

N-glycan_beta_GlcNAc_9

N-glycan_alpha_man_4

A little bit about semantic metadata extractions and annotations

WWW, Enterprise Repositories

METADATA

EXTRACTORS

Digital Maps

Nexis, UPI, AP

Feeds/Documents

Digital Audios

Data Stores

Digital Videos

Digital Images. . .

. . . . . .

Create/extract as much (semantic) metadata automatically as possible;

Use ontologies to improve and enhance extraction

Extraction for Metadata Creation

Automatic Semantic Metadata Extraction/Annotation

Semantics & Semantic Web in 1999-2002

Sample applications

• Early Semantic Search, use baby steps of today’s engines

• Enterprise applications – healthcare & life sciences, financial, security

• Driving the innovation with new types of data: sensor (Semantic Sensor Web), social (Semantic Social Web), semantic IoT/WoT

BLENDED BROWSING & QUERYING INTERFACE

ATTRIBUTE & KEYWORD QUERYING

uniform view of worldwide distributed assets of similar type

SEMANTIC BROWSING

Targeted e-shopping/e-commerce

assets access

Taalee Semantic/Faceted Search & Browsing (1999-2001)

Search for company ‘Commerce One’

Links to news on companies that compete against Commerce One

Links to news on companies Commerce One competes against

(To view news on Ariba, click on the link for Ariba)

Crucial news on Commerce One’s competitors (Ariba) can

be accessed easily and automatically

Semantic Search/Browsing/Directory (2001-….)

System recognizes ENTITY & CATEGORY

Relevant portion of the Directory is automatically presented.

Semantic Search/Browsing/Directory (2001-….)

Users can explore semantically related information.

Semantic Search/Browsing/Directory (2001-….)

Focused relevant content organized by topic (semantic categorization)

Automatic Content Aggregation from multiple content providers and feeds

Related relevant content not explicitly asked for (semantic associations)

Competitive research inferred automatically

Automatic 3rd party content integration

Equity Research Dashboard with Blended Semantic Querying and Browsing

Semagix Freedom for building ontology-driven information system

Extracting Semantic Metadata from Semistructured and Structured Sources (1999 – 2002)

Managing Semantic Content on the Web

Ontology

Semantic Query Server

1. Ontology Model Creation (Description)
2. Knowledge Agent Creation
3. Automatic aggregation of Knowledge
4. Querying the Ontology

Ontology Creation and Maintenance Steps

© 2004 Semagix, Inc.

Watch list Organization

Company

Hamas

WorldCom

FBI Watchlist

Ahmed Yaseer

appears on Watchlist

member of organization

works for Company

Ahmed Yaseer:
• Appears on Watchlist ‘FBI’
• Works for Company ‘WorldCom’
• Member of a banned organization

Semantic Associations - Connecting the Dots
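"Connecting the dots" can be read as finding a path of relationships between two entities in the knowledge graph. A minimal breadth-first search sketch in Python follows. It is not the Semagix implementation; the entity names are hypothetical placeholders and only the relation labels echo the figure.

```python
from collections import deque

# Hypothetical entities; relation labels follow the figure.
TRIPLES = [
    ("ex:Applicant", "works for", "ex:CompanyA"),
    ("ex:PersonB", "works for", "ex:CompanyA"),
    ("ex:PersonB", "appears on", "ex:WatchlistW"),
    ("ex:Applicant", "member of", "ex:ClubC"),
]

def find_path(triples, start, goal):
    """Return the shortest chain of relationships linking start to goal, or None."""
    # Relationships are followed in both directions when searching for a connection.
    neighbors = {}
    for s, p, o in triples:
        neighbors.setdefault(s, []).append((p, o))
        neighbors.setdefault(o, []).append((f"inverse({p})", s))

    queue = deque([(start, [])])
    visited = {start}
    while queue:
        node, path = queue.popleft()
        if node == goal:
            return path
        for label, nxt in neighbors.get(node, []):
            if nxt not in visited:
                visited.add(nxt)
                queue.append((nxt, path + [(node, label, nxt)]))
    return None

path = find_path(TRIPLES, "ex:Applicant", "ex:WatchlistW")
for step in path:
    print(step)
```

Here the applicant is linked to the watch list only indirectly, through a shared employer, which is the kind of association that keyword matching on the applicant alone would miss.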

Global Investment Bank

Fraud Prevention application used in financial services – Related KYC application is deployed at a majority of global banks

User will be able to navigate the ontology using a number of different interfaces

World Wide Web content

Public Records

BLOGS,RSS

Unstructured text, Semi-structured Data

Watch Lists

Law Enforcement

Regulators

Semi-structured Government Data

Scores the entity based on the content and entity relationships

Establishing New Account

Fast forward to 2005-2006

Semantic Web + Clinical Practice Informatics = Active Semantic Electronic Medical Record (ASEMR)

Operationally deployed in January 2006, in use (as of 2012)

ASEMR: SW application in use

In daily use at Athens Heart Center
– 28 person staff
• Interventional Cardiologists
• Electrophysiology Cardiologists
– Deployed since January 2006
– 40-60 patients seen daily
– 3000+ active patients
– Serves a population of 250,000 people

Information Overload in Clinical Practice

• New drugs added to market
– Adds interactions with current drugs
– Changes possible procedures to treat an illness
• Insurance coverage changes
– Insurance may pay for drug X but not drug Y even though drugs X and Y are equivalent
– Patient may need a certain diagnosis before some expensive tests are run
• Physicians need a system to keep track of the ever-changing landscape

Active Semantic Document (ASD)

A document (typically in XML) with the following features:
• Semantic annotations
– Linking entities found in a document to ontology
– Linking terms to a specialized lexicon [TR]
• Actionable information
– Rules over semantic annotations
– Violated rules can modify the appearance of the document (show an alert)
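The two ASD features, annotations plus rules over them, can be sketched in Python as follows. This is not the ASEMR code: the lexicon, the concept identifiers and the single interaction rule are invented for illustration and are not clinical guidance.

```python
import re

# Hypothetical lexicon mapping surface terms to ontology concepts.
LEXICON = {
    "warfarin": "drug:Warfarin",
    "aspirin": "drug:Aspirin",
    "metformin": "drug:Metformin",
}

# Hypothetical rule base: unordered pairs of concepts that should raise an alert.
INTERACTIONS = {frozenset({"drug:Warfarin", "drug:Aspirin"})}

def annotate(text):
    """Semantic annotation: link terms in the text to lexicon concepts."""
    annotations = []
    for m in re.finditer(r"[A-Za-z]+", text):
        concept = LEXICON.get(m.group().lower())
        if concept:
            annotations.append((m.start(), m.end(), concept))
    return annotations

def alerts(annotations):
    """Rules over annotations: report each violated interaction rule."""
    concepts = sorted({concept for _, _, concept in annotations})
    found = []
    for i, a in enumerate(concepts):
        for b in concepts[i + 1:]:
            if frozenset({a, b}) in INTERACTIONS:
                found.append(f"Possible interaction: {a} + {b}")
    return found

note = "Patient takes Warfarin daily; started aspirin last week."
note_alerts = alerts(annotate(note))
```

In an ASD the alert would change how the document is displayed; here it is simply returned as a list of messages.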

Active Semantic Patient Record

• An application of ASD
• Three Ontologies
– Practice: Information about practice such as patient/physician data
– Drug: Information about drugs, interaction, formularies, etc.
– ICD/CPT: Describes the relationships between CPT and ICD codes
• Medical Records in XML created from database

Active Semantic Electronic Medical Record App

In Use Today at Athens Heart Center For Clinical Decision Support since January 2006

Amit P. Sheth, S. Agrawal, Jonathan Lathem, Nicole Oldham, H. Wingate, P. Yadav, and K. Gallagher, Active Semantic Electronic Medical Record, Proc. of the 5th International Semantic Web Conference, 2006

Demo of ASEMR and other applications

http://knoesis.org/showcase

http://archive.knoesis.org/library/demos/

Benefits of ASEMR

• Error prevention (drug interactions, allergy)
– Patient care
– Insurance
• Decision Support (formulary, billing)
– Patient satisfaction
– Reimbursement
• Efficiency/time
– Real-time chart completion
– “Semantic” and automated linking with billing

Using large data sets for Structured Data on the web: Linked Open Data – samples from 2005 to 2010

Linked Open Data

Publish Open Data Sets in RDF

By 2010: 203 data sets, 25 billion triples

Image: http://richard.cyganiak.de/2007/10/lod/

You publish the raw data…

Ivan Herman, “Semantic Web Adoption and Application”, http://www.w3.org/People/Ivan/CorePresentations/Applications/

… and others can use it


Using the LOD to build Web site: BBC


GoodRelations Ontology - RDFa


Fast forward to 2010-2011

Schema.org

Shared Vocabulary

Amazing things can happen

Will give some on-line examples

Twitris: Semantic Social Web Mash-up

Select topic

Select date

Topic tree

Spatial Marker

N-gram summaries

Wikipedia articles

Reference news

Related tweets

Images & Videos

Tweet traffic

Sentiment Analysis

More: TWITRIS

Web (and associated computing) is evolving

[Figure: stages of the Web on a timeline marked 1997 and 2007; axis label: Semantic Technology Used]

Web of pages - text, manually created links - extensive navigation

Web of databases - dynamically generated pages - web query interfaces

Web of resources - data, service, data, mashups - 4 billion mobile computing

Web of people, Sensor Web - social networks, user-created casual content - 40 billion sensors, 500M+ FB users, 1B tweets/wk

Web as an oracle / assistant / partner - “ask the Web”: using semantics to leverage text + data + services - Powerset

Computing for Human Experience

Keywords

Patterns

Objects

Situations, Events

Enhanced Experience, Tech assimilated in life

Structured text (Scientific publications / white papers)

Experimental Results Clinical Trial Data

Public domain knowledge (PubMed)

Metadata Extraction/Semantic Annotations

Ontologies/Domain Models/

Knowledge

Meta data / Semantic Annotations

Semantic Search/Browsing/Personalization/Analysis, Knowledge Discovery, Visualization, Situational Awareness

Big data

Search and browsing

Patterns / Inference / Reasoning

2D-3D & Immersive Visualization, Human Computer Interfaces

Impacting bottom line

Knowledge discovery

Migraine

Stress

Patient

affects

isa

Magnesium

Calcium Channel Blockers

inhibit

SEMANTICS, MEANING PROCESSING


Semantics as core enabler, enhancer @ Kno.e.sis

Ohio Center of Excellence in Knowledge-enabled Computing

one of the two largest academic groups in Semantic Web; multidisciplinary

Take Home Message (Cont.)

Semantics play a key role in inferring the "meaning" behind the data. This requires progress from keywords -> entities -> relationships -> events, and from raw data to human-centric abstractions.

Take Home Message (Cont.)

A wide variety of semantic models and KBs (vocabularies, social dictionaries, community-created semi-structured knowledge, domain-specific datasets, ontologies) empower semantic solutions. This can lead to Semantic Scalability – scalability that is meaningful to human activities and decision making.

Interested in more?

Kno.e.sis Wiki for the following and more:
• Computing for Human Experience
• Continuous Semantics to Analyze Real-Time Data
• Semantic Modeling for Cloud Computing
• Citizen Sensing, Social Signals, and Enriching Human Experience
• Semantics-Empowered Social Computing
• Semantic Sensor Web
• Traveling the Semantic Web through Space, Theme and Time
• Relationship Web: Blazing Semantic Trails between Web Resources
• SA-REST: Semantically Interoperable and Easier-to-Use Services and Mashups
• Semantically Annotating a Web Service

Tutorials:
• Semantic Web: Technologies and Applications for the Real-World (WWW2007)
• Citizen Sensor Data Mining, Social Media Analytics and Development Centric Web Applications (WWW2011)

Partial Funding: NSF (Semantic Discovery: IIS: 071441, Spatio Temporal Thematic: IIS-0842129), AFRL and DAGSI (Semantic Sensor Web), Microsoft Research (Semantic Search), IBM Research (Analysis of Social Media Content), and HP Research (Knowledge Extraction from Community-Generated Content).


http://knoesis.org

Kno.e.sis – Ohio Center of Excellence in Knowledge-enabled Computing, Wright State University, Dayton, Ohio, USA

Vision Paper: Computing for Human Experience: http://wiki.knoesis.org/index.php/Computing_For_Human_Experience

Future: Computing for Human Experience
