


Columbia University
Department of Computer Science
COMS – E6125 Web-enHanced Information Management
Presentation

A Study to the Semantic Web and Semantic Web Based Applications

Student Name: Niu, Cheng
Student ID: cn2198

Advisor: Prof. Kaiser


Semantic Web in one word

The Semantic Web is envisioned as the next generation of the World Wide Web. Its significance is that information on the web becomes machine-understandable, which makes machine reasoning over web page content possible.


"Tell me what wines I should buy to serve with each course of the following menu. And, by the way, I don't like Sauternes."


Layered architecture of Semantic Web


Resource Description Framework – Core of Semantic Web

RDF Triples:

“Subject – Predicate – Object”
“Resource – Property – Property value”

An example: “http://www.example.org/index.html has a creator who is John Smith.”

[Diagram: http://www.example.org/index.html --creator--> John Smith]
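As a minimal sketch of the triple idea, the statement in the example can be held as a three-field record. The `Triple` type and the `describe` helper are illustrative names, not part of any RDF library, and the short predicate name `creator` stands in for a full property URI.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """One RDF statement: subject (resource), predicate (property), object (value)."""
    subject: str
    predicate: str
    obj: str


# The example statement from the slide.
statement = Triple(
    subject="http://www.example.org/index.html",
    predicate="creator",
    obj="John Smith",
)


def describe(t: Triple) -> str:
    """Render a triple in the sentence pattern used by the example."""
    return f"{t.subject} has a {t.predicate} who is {t.obj}."


print(describe(statement))
```

A set of such records is already a small RDF graph: every subject and object is a node, and every predicate is a labeled arc between them.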


Resource Description Framework – Core of Semantic Web

<?xml version="1.0" encoding="UTF-8"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:ex="http://example.org/ex" xml:lang="en-US">
  <rdf:Description rdf:about="http://www.example.org/index.html">
    <ex:creator>John Smith</ex:creator>
  </rdf:Description>
</rdf:RDF>
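To show how the RDF/XML document maps back to a triple, here is a rough sketch using only Python's standard XML parser. It handles just the simple `rdf:Description` form used in this example, not RDF/XML in general, and `extract_triples` is an illustrative name. The prototype itself reads RDF through Jena, so this is a stand-in rather than the project's code.

```python
import xml.etree.ElementTree as ET
from typing import List, Tuple

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

# The document from the slide, copied verbatim.
RDF_XML = b"""<?xml version="1.0" encoding="UTF-8"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:ex="http://example.org/ex" xml:lang="en-US">
  <rdf:Description rdf:about="http://www.example.org/index.html">
    <ex:creator>John Smith</ex:creator>
  </rdf:Description>
</rdf:RDF>"""


def extract_triples(rdf_xml: bytes) -> List[Tuple[str, str, str]]:
    """Return (subject, predicate, object) for each literal-valued property."""
    root = ET.fromstring(rdf_xml)
    triples = []
    for desc in root.findall(f"{{{RDF_NS}}}Description"):
        subject = desc.get(f"{{{RDF_NS}}}about")
        for prop in desc:
            # ElementTree names elements "{namespace}local"; the RDF property
            # URI is the namespace and the local name concatenated.
            namespace, local = prop.tag[1:].split("}", 1)
            triples.append((subject, namespace + local, (prop.text or "").strip()))
    return triples


triples = extract_triples(RDF_XML)
print(triples)
```

Because the `ex` namespace in the example has no trailing `/` or `#`, the predicate URI comes out as `http://example.org/excreator`. A namespace ending in a separator would give a more readable property URI.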


RDF Schema & Ontology

Analogies among RDBMS, XML, and Semantic Web

Scope                        Data     Schema
Relational Database scope    RDBMS    RDBMS Schema
XML scope                    XML      XML Schema
Semantic Web scope           RDF      RDF Schema, Ontology (OWL)


D2R Mapping Model – Transformation from Relational DB to RDF

Creator: Christian Bizer, Free University of Berlin (Germany), 2003.

[Figure: D2R mapping process]


Prototype ideas

Prototype background

Scenario: “MusicRec.com”

Traditional approaches
– Relational Database
– eXtensible Markup Language

Why Semantic Web?


Basic design strategy

[Diagram: All information on the Internet. Heterogeneous data sources within the World Wide Web (Data source 1, Data source 2, Data source 3) are each transformed into Semantic Web content (Resource Description Framework). This content is integrated over the Web (data integration) and used by the web service provider “MusicRec.com”, which serves the end user's terminal.]


“MusicRec” – System requirement

User: “I want some music…”

User requirements (user input to MusicRec.com)
– Age
– Music Region
– Music Genre

MusicRec.com: Analysis & Evaluations over the Knowledge base of Music (Semantic Web)

Feedback: List of CDs as Recommendation

User: “I may also want to know about… the singer, the tracks, the publisher, how to buy”

Automatic further search for possible additional info
– Singer of the CD
– Tracks of the CD
– Publisher of the CD
– How to buy this CD


Three-layer architecture of this Semantic Web prototype

Layer 3: Semantic Web Application Layer (MusicRec.com)
– Input.jsp, Output.jsp
– Controller Classes
– Entity Model Classes (Java Beans)
– Java APIs (Jena)

Layer 2: Semantic Web Content Layer
– Ontology (Music.owl)
– RDF documents hosted on Website A, Website B, and Website C, linked to one another by URI and reached from a Root URI:
  Index.rdf
  A.rdf, B.rdf, C.rdf
  AA.rdf, AB.rdf, BA.rdf, BB.rdf, CA.rdf, CB.rdf
  AAA.rdf, ABA.rdf, BAA.rdf, BBA.rdf, CAA.rdf, CBA.rdf

Layer 1: Heterogeneous Data Integration Layer (RDBMS bottom layer)
– SQL Server 2000, DB2, Oracle
– D2R Mapping Techniques

Figure: Three-layer logical architecture of this project


Design Flow

BEGIN

Ontology Design (Music.owl)

Data Integration
– Relational Database design
– D2R transformation format design
– Finish

Semantic Web Content
– Generating RDF documents
– Organizing the structure of RDF documents
– Finish

Semantic Web Application (using MVC structure)
– View Tier design (JSP pages)
– Controller Tier design (Servlets)
– Model Tier design (Java Beans)
– Finish

FINISH

Figure: Complete design flow of this project


Layer2: Ontology Design

MusicSiteIndex
-has_MusicSite (LINK) : MusicSite

MusicSite
-has_CD (LINK) : CD
-has_SiteHomePage : string
-has_SiteName : string
-has_SiteDescription : string

CD
-has_CDName : string
-has_CDRegion : string
-has_CDCoverPic : string
-has_CDSaleAmount : int
-has_CDRank : int
-has_CDPublishDate : Date
-has_CDGenre : string
-has_Singer (LINK) : Singer
-has_Track (LINK) : Track
-has_Publisher (LINK) : Publisher
-has_Purchase (LINK) : Purchase

Track
-has_TrackGenre : string
-has_TrackRank : int
-has_TrackName : string
-has_TrackPlayTime : string
-has_TrackSerialNo : int
-has_TrackDownloadCount : int
-belong2CD (LINK) : CD

Singer
-has_SingerNationality : string
-has_SingerName : string
-has_SingerLaterNews : string
-has_SingerGender : string
-has_SingerConcert : string
-has_SingerBirthDate : Date
-has_SingerCDAmount : int
-has_LatestCD (LINK) : CD

Purchase
-has_Purchase_SellerName : string
-has_Purchase_Homepage : string
-has_PurchaseCDName : string
-has_PurchasePrice : int
-has_PurchaseDiscountInfo : string
-has_PurchaseShipmentInfo : string

Publisher
-has_PublisherName : string
-has_Publisher_Homepage : string
-has_Publisher_NewlyPublishedCD : string
-has_Publisher_ToBePublishedCD : string

Properties marked (LINK) are associations between classes; in the diagram they carry multiplicities of 1 or 1..*.
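A rough sketch of how part of this ontology could be mirrored as in-memory classes, using a subset of the CD, Track, and Singer properties. The prototype uses Java Beans for this; the Python dataclasses, the `add_track` helper, and all sample values below are assumptions made for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Singer:
    has_SingerName: str
    # LINK property; repr=False avoids endless recursion through the cycle.
    has_LatestCD: Optional[CD] = field(default=None, repr=False)


@dataclass
class Track:
    has_TrackName: str
    has_TrackSerialNo: int
    # LINK back to the owning CD.
    belong2CD: Optional[CD] = field(default=None, repr=False)


@dataclass
class CD:
    has_CDName: str
    has_CDGenre: str
    has_CDRegion: str
    has_Singer: Optional[Singer] = None
    has_Track: List[Track] = field(default_factory=list)

    def add_track(self, track: Track) -> None:
        """Keep has_Track and belong2CD consistent with each other."""
        track.belong2CD = self
        self.has_Track.append(track)


# Hypothetical sample data, not taken from the project.
singer = Singer(has_SingerName="Example Singer")
cd = CD(has_CDName="Example Album", has_CDGenre="pop",
        has_CDRegion="Western", has_Singer=singer)
singer.has_LatestCD = cd
cd.add_track(Track(has_TrackName="First Song", has_TrackSerialNo=1))
cd.add_track(Track(has_TrackName="Second Song", has_TrackSerialNo=2))

print(cd)
```

The two-way links (`has_Track` and `belong2CD`, `has_Singer` and `has_LatestCD`) form cycles, which is why any traversal of the corresponding RDF documents needs to remember what it has already visited.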


Layer2: Generating RDF documents


Layer2: Organizing the structure of RDF documents

RDF Documents Structure

Root URI: http://localhost:8080/semanticweb/MusicSiteIndex_RDF.rdf

MusicSiteIndex_RDF.rdf
– has_MusicSite: EasternMusicSite_RDF.rdf
– has_MusicSite: WesternMusicSite_RDF.rdf

EasternMusicSite_RDF.rdf
– has_CD: EasternCD_RDF.rdf

WesternMusicSite_RDF.rdf
– has_CD: WesternCD_RDF.rdf

EasternCD_RDF.rdf
– has_Track: EasternTrack_RDF.rdf (back link: belong2CD)
– has_Singer: EasternSinger_RDF.rdf (back link: has_LatestCD)
– has_Publisher: Publisher_EMI_RDF.rdf, Publisher_SONY_RDF.rdf
– has_Purchase: Purchase_RDF.rdf

WesternCD_RDF.rdf
– has_Track: WesternTrack_RDF.rdf (back link: belong2CD)
– has_Singer: WesternSinger_RDF.rdf (back link: has_LatestCD)
– has_Publisher: Publisher_EMI_RDF.rdf, Publisher_SONY_RDF.rdf
– has_Purchase: Purchase_RDF.rdf
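The way an application can reach every document from the root URI can be sketched as a breadth-first walk over the link graph. The `LINKS` table below is a hand-built stand-in for following real URIs over HTTP, its edges are a reading of the diagram rather than data from the project, and `region_links` and `reachable_documents` are illustrative names.

```python
from collections import deque
from typing import Dict, List, Tuple

ROOT = "MusicSiteIndex_RDF.rdf"


def region_links(region: str) -> Dict[str, List[Tuple[str, str]]]:
    """Links (property, target document) for one region's documents."""
    cd = f"{region}CD_RDF.rdf"
    return {
        f"{region}MusicSite_RDF.rdf": [("has_CD", cd)],
        cd: [
            ("has_Track", f"{region}Track_RDF.rdf"),
            ("has_Singer", f"{region}Singer_RDF.rdf"),
            ("has_Publisher", "Publisher_EMI_RDF.rdf"),
            ("has_Publisher", "Publisher_SONY_RDF.rdf"),
            ("has_Purchase", "Purchase_RDF.rdf"),
        ],
        f"{region}Track_RDF.rdf": [("belong2CD", cd)],
        f"{region}Singer_RDF.rdf": [("has_LatestCD", cd)],
    }


LINKS: Dict[str, List[Tuple[str, str]]] = {
    ROOT: [
        ("has_MusicSite", "EasternMusicSite_RDF.rdf"),
        ("has_MusicSite", "WesternMusicSite_RDF.rdf"),
    ],
}
LINKS.update(region_links("Eastern"))
LINKS.update(region_links("Western"))


def reachable_documents(root: str) -> List[str]:
    """Breadth-first walk; the seen set stops the back links from looping."""
    seen = {root}
    order = []
    queue = deque([root])
    while queue:
        doc = queue.popleft()
        order.append(doc)
        for _prop, target in LINKS.get(doc, []):
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return order


documents = reachable_documents(ROOT)
print(len(documents), "documents reachable from", ROOT)
```

Only the root URI has to be known in advance; every other document is discovered through link properties, so sites can add documents without the application being reconfigured.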


Layer3 Design: Basic Structure Model-View-Controller

Semantic Web Application Layer

View tier: Input.jsp, Output.jsp
Controller tier: Servlets
Model tier: Entity Model Classes (Java Beans)
Java APIs (Jena)
Index.rdf (Root URI)


Layer3: Servlets (Controllers)

Packages shown in the diagram: com.semanticweb.lightbeans, com.semanticweb.controller.heavybeans, com.semanticweb.controller.servlet

ControllerServlet
Methods:
+init()
+doGet()

Recommender
Fields:
-rdfModel : Model [static]
-cdListFromModel : ArrayList [static]
-adviseeUser : User [instance]
-top5CdList : ArrayList [instance]
-cdListIncludingUserScores : ArrayList [instance]
Methods:
+Recommender()
+setAdviseeUser() / +getAdviseeUser()
+setTop5CdList() / +getTop5CdList()
+setCdListIncludingUserScores() / +getCdListIncludingUserScores()
+generateCdListFromModel() [static]
+calculateWholeCdList() [instance, returns ArrayList]
+calculateSingleCd() [static, returns double]
+sortCdList() [static]
+fillAllInfoForTop5Cd() [instance]
+selectTop5CdList() [instance]

ModelBuilder
Methods:
+ModelBuilder()
+traverseAndBuild()

CD
Fields:
-cdName : string
-cdRegion : string
-cdCoverPic : string
-cdSaleAmount : int
-cdRank : int
-cdPublishDate : Date
-cdGenre : string
-cdSinger : Singer
-cdPublisher : Publisher
-cdPurchase : Purchase
-cdTracks [ArrayList] : Track
-cdTotalScore : double
-cdNumOfTracksInTop5 : int
-cdSingerTempSearchPath : string
-cdPublisherTempSearchPath : string
-cdPurchaseTempSearchPath : string
-cdTrackTempSearchPathList [ArrayList]
Methods:
+CD()
+setCdName() / +getCdName() : string
+setCdRegion() / +getCdRegion() : string
+setCdCoverPic() / +getCdCoverPic() : string
+setCdSaleAmount() / +getCdSaleAmount() : int
+setCdRank() / +getCdRank() : int
+setCdPublishDate() / +getCdPublishDate() : Date
+setCdGenre() / +getCdGenre() : string
+setCdSinger() / +getCdSinger() : Singer
+setCdPublisher() / +getCdPublisher() : Publisher
+setCdPurchase() / +getCdPurchase() : Purchase
+?setCdTracks() / +?getCdTracks() : Track
+setCdTotalScore() / +getCdTotalScore() : double
+setCdNumOfTracksInTop5() / +getCdNumOfTracksInTop5() : int
+setCdSingerTempSearchPath() / +getCdSingerTempSearchPath() : string
+setCdPublisherTempSearchPath() / +getCdPublisherTempSearchPath() : string
+setCdPurchaseTempSearchPath() / +getCdPurchaseTempSearchPath() : string
+setCdTrackTempSearchPath() / +getCdTrackTempSearchPath()

Singer
Fields:
-singerName : string
-singerGender : string
-singerNationality : string
-singerBirthDate : Date
-singerLaterNews : string
-singerConcert : string
-singerCdAmount : int
-singerLatestCd : CD
Methods:
+Singer()
+setSingerName() / +getSingerName() : string
+setSingerGender() / +getSingerGender() : string
+setSingerNationality() / +getSingerNationality() : string
+setSingerBirthDate() / +getSingerBirthdate() : Date
+setSingerLaterNews() / +getSingerLaterNews() : string
+setSingerConcert() / +getSingerConcert() : string
+setSingerCdAmount() / +getSingerCdAmount() : int
+setSingerLatestCd() / +getSingerLatestCd() : CD

User
Fields:
-userName : string
-userBirthDate : Date
-userCdGenrePrioSeq [Array] : int
-userCdRegionPrioSeq [Array] : int
Methods:
+User()
+setUserName() / +getUserName() : string
+setUserBirthDate() / +getUserBirthDate()
+?setUserCdGenrePrioSeq() / +?getUserCdGenrePrioSeq()
+?setUserCdRegionPrioSeq() / +?getUserCdRegionPrioSeq()

Track
Fields:
-trackName : string
-trackGenre : string
-trackRank : int
-trackPlayTime : string
-trackSerialNo : int
-trackDownLoadCount : int
-trackBelong2Cd : CD
Methods:
+Track()
+setTrackName() / +getTrackName() : string
+setTrackGenre() / +getTrackGenre() : string
+setTrackRank() / +getTrackRank() : int
+setTrackPlayTime() / +getTrackPlayTime() : string
+setTrackSerialNo() / +getTrackSerialNo() : int
+setTrackDownLoadCount() / +getTrackDownLoadCount() : int
+setTrackBelong2Cd() / +getTrackBelong2Cd() : CD

Purchase
Fields:
-purchaseSellerName : string
-purchaseHomepage : string
-purchaseCdName : string
-purchasePrice : int
-purchaseDiscountInfo : string
-purchaseShipmentInfo : string
Methods:
+Purchase()
+setPurchaseSellerName() / +getPurchaseSellerName() : string
+setPurchaseHomepage() / +getPurchaseHomepage() : string
+setPurchaseCdName() / +getPurchaseCdName() : string
+setPurchasePrice() / +getPurchasePrice() : int
+setPurchaseDiscountInfo() / +getPurchaseDiscountInfo() : string
+setPurchaseShipmentInfo() / +getPurchaseShipmentInfo() : string

Publisher
Fields:
-publisherName : string
-publisherHomepage : string
-publisherNewlyPublishedCd : string
-publisherToBePublishedCd : string
Methods:
+Publisher()
+setPublisherName() / +getPublisherName() : string
+setPublisherHomepage() / +getPublisherHomepage() : string
+setPublisherNewlyPublishedCd() / +getPublisherNewlyPublishedCd() : string
+setPublisherToBePublishedCd() / +getPublisherToBePublishedCd() : string


Layer3: Servlets (Controllers)

Sequence Diagram

Participants: User, search.jsp, cd.jsp, ControllerServlet, Recommender, ModelBuilder
Tiers: View Tier, Controller Tier, Model Tier

1. Initialize servlet, init()
2. Recommender()
3. ModelBuilder()
4. traverseAndBuild(RootURI)
5. search Semantic Web content
6. rdfModel
7. generateCdListFromModel(rdfModel), returns a list of CD objects
8. User input (music preference), doGet()
9. User(user input), returns a User object
10. setAdviseeUser(user obj)
11. calculateWholeCdList(cd list)
12. calculateSingleCd(cd), giving the cd list with total scores
13. sortCdList(cd list with total scores), returns the sorted cd list
14. selectTop5CdList(sorted cd list), returns top5CdList
15. fillAllInfoForTop5Cd(top5 cd), returns the top 5 CDs with all info, which are passed back to the view tier
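Steps 11 to 14 of the sequence (score every CD, sort, keep the top five) can be sketched as follows. The slides do not give the scoring formula, so the additive genre-plus-region weighting here is purely an assumption, as are the sample CDs and weights. The function names loosely follow the diagram, but this is Python rather than the project's Java.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class CD:
    name: str
    genre: str
    region: str
    total_score: float = 0.0


@dataclass
class User:
    # Assumed preference model: higher weight means stronger preference.
    genre_weights: Dict[str, float]
    region_weights: Dict[str, float]


def calculate_single_cd(cd: CD, user: User) -> float:
    """Hypothetical score: genre weight plus region weight."""
    return user.genre_weights.get(cd.genre, 0.0) + user.region_weights.get(cd.region, 0.0)


def calculate_whole_cd_list(cds: List[CD], user: User) -> List[CD]:
    for cd in cds:
        cd.total_score = calculate_single_cd(cd, user)
    return cds


def sort_cd_list(cds: List[CD]) -> List[CD]:
    return sorted(cds, key=lambda cd: cd.total_score, reverse=True)


def select_top5_cd_list(sorted_cds: List[CD]) -> List[CD]:
    return sorted_cds[:5]


# Hypothetical catalogue and user preferences.
catalogue = [
    CD("CD F", "jazz", "Western"),
    CD("CD C", "rock", "Eastern"),
    CD("CD A", "pop", "Eastern"),
    CD("CD E", "jazz", "Eastern"),
    CD("CD B", "pop", "Western"),
    CD("CD D", "rock", "Western"),
]
user = User(
    genre_weights={"pop": 3.0, "rock": 2.0, "jazz": 1.0},
    region_weights={"Eastern": 1.0, "Western": 0.5},
)

top5 = select_top5_cd_list(sort_cd_list(calculate_whole_cd_list(catalogue, user)))
for cd in top5:
    print(cd.name, cd.total_score)
```

Keeping scoring, sorting, and selection as separate functions mirrors the separate calls in the sequence diagram, so the scoring rule can be replaced without touching the rest.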


Layer1 Design: D2R Map overview

How D2R works

[Diagram: D2R Mapping Model. The D2R Processor takes a Relational Database and a D2R Mapping File as input and produces RDF Documents.]
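The general idea of mapping relational rows to RDF statements can be illustrated with a small sketch. This is not the D2R mapping language or the D2R processor: the table layout, the `SUBJECT_PATTERN` URI template, and the column-to-property table are all assumptions, with property names borrowed from the CD class of the ontology.

```python
import sqlite3
from typing import List, Tuple

# Assumed mapping: which column becomes which ontology property.
COLUMN_TO_PROPERTY = {"name": "has_CDName", "genre": "has_CDGenre"}
# Assumed URI template for the resource that each row describes.
SUBJECT_PATTERN = "http://example.org/cd/{id}"


def rows_to_triples(conn: sqlite3.Connection) -> List[Tuple[str, str, str]]:
    """Turn every row of the cd table into one triple per mapped column."""
    cursor = conn.execute("SELECT id, name, genre FROM cd ORDER BY id")
    columns = [d[0] for d in cursor.description]
    triples = []
    for row in cursor:
        record = dict(zip(columns, row))
        subject = SUBJECT_PATTERN.format(id=record["id"])
        for column, prop in COLUMN_TO_PROPERTY.items():
            triples.append((subject, prop, str(record[column])))
    return triples


# Hypothetical in-memory database standing in for SQL Server, DB2, or Oracle.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE cd (id INTEGER PRIMARY KEY, name TEXT, genre TEXT)")
conn.executemany(
    "INSERT INTO cd (id, name, genre) VALUES (?, ?, ?)",
    [(1, "Example Album", "pop"), (2, "Another Album", "rock")],
)

triples = rows_to_triples(conn)
for t in triples:
    print(t)
```

Each row becomes a resource with its own URI and each mapped column becomes a property of that resource. A declarative mapping file plays the role that the two constants play here.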


Advantages and effectiveness: free adoption of new resources

Original RDF structure

RDF documents hosted on Website A, Website B, and Website C, linked to one another by URI and reached from a Root URI:
  Index.rdf
  A.rdf, B.rdf, C.rdf
  AA.rdf, AB.rdf, BA.rdf, BB.rdf, CA.rdf, CB.rdf
  AAA.rdf, ABA.rdf, BAA.rdf, BBA.rdf, CAA.rdf, CBA.rdf


Free adoption of new resources

Modified RDF structure

The original documents on Website A, Website B, and Website C are unchanged:
  Index.rdf
  A.rdf, B.rdf, C.rdf
  AA.rdf, AB.rdf, BA.rdf, BB.rdf, CA.rdf, CB.rdf
  AAA.rdf, ABA.rdf, BAA.rdf, BBA.rdf, CAA.rdf, CBA.rdf

New documents added and linked by URI:
  Classification1.rdf, Classification2.rdf
  C1Type1.rdf, C1Type2.rdf, C2Type1.rdf, C2Type2.rdf


Task allocations on the web

Allocations of the tasks to maintain RDF on the web

RDF Documents Structure

Root URI: http://localhost:8080/semanticweb/MusicSiteIndex_RDF.rdf

Documents:
– MusicSiteIndex_RDF.rdf
– EasternMusicSite_RDF.rdf, WesternMusicSite_RDF.rdf
– EasternCD_RDF.rdf, WesternCD_RDF.rdf
– EasternTrack_RDF.rdf, WesternTrack_RDF.rdf
– EasternSinger_RDF.rdf, WesternSinger_RDF.rdf
– Publisher_EMI_RDF.rdf, Publisher_SONY_RDF.rdf
– Purchase_RDF.rdf

Link properties between documents: has_MusicSite, has_CD, has_Track, belong2CD, has_Singer, has_LatestCD, has_Publisher, has_Purchase


Task allocations on the web

Allocations of the tasks to maintain RDF on the web

RDF distribution on the Internet

Hosts: MusicRec.com, EasternMusic.com, WesternMusic.com, EMI.com, SONY.com, eBay.com

Documents distributed across these hosts:
– MusicSiteIndex_RDF.rdf
– EasternMusicSite_RDF.rdf, WesternMusicSite_RDF.rdf
– EasternCD_RDF.rdf, WesternCD_RDF.rdf
– EasternTrack_RDF.rdf, WesternTrack_RDF.rdf
– EasternSinger_RDF.rdf, WesternSinger_RDF.rdf
– Publisher_EMI_RDF.rdf, Publisher_SONY_RDF.rdf
– Purchase_RDF.rdf

Link properties crossing host boundaries: has_MusicSite, has_Publisher, has_Purchase


Knowledge Understanding and Inference

Imagine the following “Wine” scenario:

“Someone is planning a dinner party and at least one of the guests is wine knowledgeable. The host would like to serve wine that is well matched to the courses on the menu. The host would also like to appear knowledgeable about the wines served at the event. The host would also like to have appropriate wines and wine accessories at the dinner. The host may have decided to serve a special tomato based pasta sauce with fresh pasta as the main course.”

“In order to serve wines appropriate to the meal, the host needs information concerning wine and food pairings. In order to appear knowledgeable about wines, the host would benefit from having access to wine information relevant to the event. In order to have appropriate wine accessories, the host would need to have information about what accessories are relevant to the situation.”


Knowledge Understanding and Inference

Triples in the Wine scenario:


Knowledge Understanding and Inference

Reasoning line of Wine scenario:
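As a rough illustration of how a machine could reason over triples in a scenario like this, the sketch below applies one hand-written rule to a handful of facts and then filters out a disliked wine. Every fact, the property names (`isA`, `pairsWith`, `canBeServedWith`), and the rule itself are hypothetical. None of them come from the slides or from an actual wine ontology, and a real system would use an OWL reasoner rather than nested loops.

```python
from typing import Set, Tuple

Triple = Tuple[str, str, str]

# Hypothetical facts, for illustration only.
FACTS: Set[Triple] = {
    ("MainCourse", "isA", "PastaWithTomatoSauce"),
    ("PastaWithTomatoSauce", "pairsWith", "RedWine"),
    ("Chianti", "isA", "RedWine"),
    ("Dessert", "isA", "SweetDessert"),
    ("SweetDessert", "pairsWith", "DessertWine"),
    ("Sauternes", "isA", "DessertWine"),
}


def infer_servable(facts: Set[Triple]) -> Set[Triple]:
    """Rule: course isA dish, dish pairsWith type, wine isA type
    implies course canBeServedWith wine."""
    inferred = set()
    for course, p1, dish in facts:
        if p1 != "isA":
            continue
        for dish2, p2, wine_type in facts:
            if p2 != "pairsWith" or dish2 != dish:
                continue
            for wine, p3, kind in facts:
                if p3 == "isA" and kind == wine_type:
                    inferred.add((course, "canBeServedWith", wine))
    return inferred


def exclude_disliked(triples: Set[Triple], disliked: Set[str]) -> Set[Triple]:
    """Drop suggestions whose wine the user has ruled out."""
    return {t for t in triples if t[2] not in disliked}


suggestions = infer_servable(FACTS)
acceptable = exclude_disliked(suggestions, disliked={"Sauternes"})
print(sorted(acceptable))
```

The new `canBeServedWith` statements are not stored anywhere in the facts; they follow from combining three separate triples, which is the kind of step that machine-understandable data makes possible.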


Thank you so much!

"Tell me what wines I should buy to serve with each course of the following menu. And, by the way, I don't like Sauternes."