Behshid Behkamal
Ferdowsi University of Mashhad
Web Technology Lab
Outline
1. Linked Data: What and Why
2. Linking Open Data Project
3. An Experience of Publishing Linked Data
4. Challenges of Linked Data
The problem: Doing a PhD at a good university
The problem: Planning a trip
The Web of Documents
The Web of Data
Linked Data is …
A way of publishing data on the web to create typed links between data from different sources
Wikipedia:
A term used to describe a recommended best practice for exposing, sharing, and connecting pieces of data, information, and knowledge on the Semantic Web using URIs and RDF
Linked Data employs …
URI to identify resources
HTTP protocol to retrieve resources
RDF data model to represent resources
URI: Uniform Resource Identifier
As names for any entity existing in the world
A simple and extensible means for identifying a resource
For documents and other entities that can be located on the Web
HTTP
A simple mechanism for retrieving resources that can be serialized as a stream of bytes
E.g. a picture of an animal
Descriptions of entities that cannot themselves be sent across the network
E.g. a description of a rabbit
RDF Model
It is machine-understandable metadata
A model is a collection of statements
Statement := (subject, predicate, object)
[Diagram: Statement = Subject (URI) -- Predicate (URI) --> Object (URI / Literal)]
Linking Statements
The subject of one statement can be the object of another
[Diagram: Behkamal -- Student Of --> Computer Engineering -- Department Of --> Ferdowsi University of Mashhad -- Has HomePage --> www.um.ac.ir]
Such collections of statements form the LOD cloud
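
As a minimal sketch, the example statements on this slide can be held as plain (subject, predicate, object) tuples and chained by following each object to the statement where it appears as subject. The `http://example.org/` namespace, the predicate names, and the `follow` helper are assumptions made for illustration; they are not identifiers from FUM-LD.

```python
# Hypothetical namespace, used only for illustration.
EX = "http://example.org/"

# Each statement is a (subject, predicate, object) tuple.
# The home page is treated here as a plain literal string.
statements = [
    (EX + "Behkamal", EX + "studentOf", EX + "ComputerEngineering"),
    (EX + "ComputerEngineering", EX + "departmentOf", EX + "FerdowsiUniversityOfMashhad"),
    (EX + "FerdowsiUniversityOfMashhad", EX + "hasHomePage", "www.um.ac.ir"),
]


def follow(start, triples):
    """Walk the chain: the object of one statement becomes the subject of the next."""
    path = [start]
    seen = {start}
    current = start
    while True:
        # Take the first statement whose subject is the current node, if any.
        nxt = next((o for s, _p, o in triples if s == current), None)
        if nxt is None or nxt in seen:
            return path
        path.append(nxt)
        seen.add(nxt)
        current = nxt


path = follow(EX + "Behkamal", statements)
```

Starting from the student resource, `path` visits the department, the university, and finally the home page literal.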
LOD Cloud : May 2007
Linked Open Data (LOD) Cloud
Each circle represents a dataset
Size of the circle corresponds to the number of triples
Arrows represent the links between datasets
Thickness of arrows indicates the number of links
Datasets are published based on Linked Data Principles
Datasets are interlinked with at least one other dataset in the cloud
Some datasets act as hubs
E.g. DBpedia, GeoNames, …
LOD Cloud : February 2008
LOD Cloud : July 2009
LOD Cloud : September 2010
LOD cloud : 2007 - 2010
Datasets: only 11 in 2007, more than 200 in 2010
Triples: around 2 billion in 2007, over 25 billion in 2010
Links: over 2 million in 2007, around 395 million in 2010
Process of publishing FUM-LD
1. Selecting Target Data
2. Assigning URIs
3. Publishing Data
4. Interlinking Data Resources
1. Selecting Target Data
- An important issue in publishing linked data is deciding which data to select
- After studying the FUM database, five important entities were selected:
Faculty, Department, Professor, Paper, Course
2. Assigning URIs
- We used a simple schema for naming URIs
- http://wtlab.um.ac.ir/LinkedOpenData/TYPE/ID
Faculty: http://wtlab.um.ac.ir/LinkedOpenData/faculties/engineering
Department: http://wtlab.um.ac.ir/LinkedOpenData/departments/3105
Professor: http://wtlab.um.ac.ir/LinkedOpenData/profs/kahani
Paper: http://wtlab.um.ac.ir/LinkedOpenData/papers/13
Course: http://wtlab.um.ac.ir/LinkedOpenData/courses/23152183
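
The TYPE/ID naming schema can be sketched as a small helper. The base URI and the path segments are taken from the example URIs on this slide; the function name `mint_uri` and the percent-encoding of the ID are assumptions, not part of the published system.

```python
from urllib.parse import quote

BASE = "http://wtlab.um.ac.ir/LinkedOpenData"

# Path segment per entity type, as seen in the example URIs.
TYPE_SEGMENTS = {
    "Faculty": "faculties",
    "Department": "departments",
    "Professor": "profs",
    "Paper": "papers",
    "Course": "courses",
}


def mint_uri(entity_type, entity_id):
    """Build BASE/TYPE/ID for one of the five selected entity types (hypothetical helper)."""
    segment = TYPE_SEGMENTS[entity_type]  # raises KeyError for an unknown type
    # Assumption: IDs are percent-encoded so they stay valid inside a URI path.
    return f"{BASE}/{segment}/{quote(str(entity_id), safe='')}"


prof_uri = mint_uri("Professor", "kahani")
dept_uri = mint_uri("Department", 3105)
```

Keeping the type-to-segment mapping in one table means every resource of the same type gets a URI of the same shape.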
3. Publishing Data
Methods for publishing
- Using existing tools
- Developing an application
Modes of publishing
- Online (on the fly)
- Offline
4. Linking Data Resources
Providing links to other resources inside and outside the FUM-LD is a main design target of our web presentation
4.1. Interlinking to FUM-LD Resources
4.2. Linking to External Datasets
DBpedia, GeoNames, YAGO, DBLP, …
4.1. Interlinking to FUM-LD Resources
[Diagram: links among Faculties, Departments, Profs, Courses and Papers]
4.2. Linking to Other Resources
Dataset    Description          Count
DBpedia    Links to resources   4570
DBpedia    owl:sameAs links     1311
DBLP       owl:sameAs links     475
ACM        owl:sameAs links     38
DBpedia    skos:subject links   3708
OpenCyc    skos:subject links   449
GeoNames   Links to resources   936
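
To illustrate what one of the owl:sameAs links in the table looks like as data, the sketch below writes a single link in N-Triples syntax. The local URI follows the naming schema of FUM-LD; the external URI under `http://example.org/` is a placeholder assumption, and `same_as_triple` is a hypothetical helper name.

```python
# Standard URI of the owl:sameAs predicate.
OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"


def same_as_triple(local_uri, external_uri):
    """Serialize one owl:sameAs link as an N-Triples line (hypothetical helper)."""
    return f"<{local_uri}> <{OWL_SAME_AS}> <{external_uri}> ."


line = same_as_triple(
    "http://wtlab.um.ac.ir/LinkedOpenData/profs/kahani",
    "http://example.org/external/kahani",  # placeholder, not a real external resource
)
```

Each such line states that the two URIs identify the same real-world entity, which is what connects a local dataset to the rest of the cloud.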
Classification of Challenges
Data Challenges
Multilingual Challenges
Linking Challenges
Data and Link Maintenance
1. Linking Challenges
Choosing appropriate ontologies and predicates
- Selecting based on popularity: not effective for all cases
- No automatic approach
Creating appropriate links between data
- Finding appropriate links between resources of different datasets
- E.g. ‘Hasan Amiri’, ‘S.H. Amiri’ and ‘Amiri H.’
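
One possible way to surface candidate links between name variants like the three above is to compare a surname plus given-name initials. This is only a heuristic sketch under stated assumptions (the last full word is the surname, abbreviated tokens are given-name initials); it is not the matching method used in FUM-LD, and it can produce false positives that still need review.

```python
def name_key(name):
    """Return (surname, set of given-name initials) for a personal name."""
    initials = set()
    words = []
    for token in name.replace(",", " ").split():
        bare = token.replace(".", "")
        if "." in token or len(bare) == 1:
            # Abbreviated given names such as "S.H." or "H."
            initials.update(ch.lower() for ch in bare)
        else:
            words.append(bare.lower())
    # Assumption: the last full word is the surname; earlier full words are given names.
    surname = words[-1] if words else ""
    initials.update(word[0] for word in words[:-1])
    return surname, initials


def may_be_same_person(a, b):
    """True when surnames are equal and at least one given-name initial is shared."""
    surname_a, initials_a = name_key(a)
    surname_b, initials_b = name_key(b)
    return surname_a == surname_b and bool(initials_a & initials_b)


variants = ["Hasan Amiri", "S.H. Amiri", "Amiri H."]
all_match = all(may_be_same_person(x, y) for x in variants for y in variants)
```

A match here only nominates a pair for linking; confirming that two records describe the same author needs further evidence such as affiliation or coauthors.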
2. Data Challenges
Lack of data or presence of low-quality data
Lack of required data in the original dataset
- No information for publications of some professors
Incorrect or incomplete data
- Abstract or keywords, or list of coauthors of the papers
Inconsistency of data values
- Different formats (e.g. the date of a conference)
- Different languages
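
Inconsistent date formats can be handled by trying a list of known patterns and converting to one canonical form. The sketch below assumes a hypothetical set of input formats, since the slides do not say which formats actually occur in the FUM data; `normalize_date` is an illustrative helper name.

```python
from datetime import datetime

# Hypothetical list of formats; extend it to match the real data.
# Note: %B matches month names in the current locale (English by default).
KNOWN_FORMATS = ["%Y-%m-%d", "%d/%m/%Y", "%d %B %Y", "%B %d, %Y"]


def normalize_date(text):
    """Return the date as ISO 8601 (YYYY-MM-DD), or None if no known format matches."""
    cleaned = text.strip()
    for fmt in KNOWN_FORMATS:
        try:
            return datetime.strptime(cleaned, fmt).date().isoformat()
        except ValueError:
            continue
    return None


iso_a = normalize_date("2010-05-12")
iso_b = normalize_date("12/05/2010")
unknown = normalize_date("sometime in spring")
```

Returning `None` instead of guessing keeps unparseable values visible so they can be corrected at the source.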
3. Multilingual Challenges
Linking a non-English dataset to the related English datasets of the LOD cloud
Multilingual
- Some users choose their mother tongue while others use English for entering their data, because of
  convenience of users
  users’ field of activity
Multiple spellings
- Some Persian names like “سعید” are entered as “saeed”, “saeid” and “saied”.
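
The multiple-spelling problem can be reduced by mapping transliteration variants to a shared key before comparing names. The rule below is a toy assumption that only collapses the vowel digraphs seen in the three spellings above; a real dataset would need a broader, validated rule set.

```python
import re


def spelling_key(latin_name):
    """Collapse the digraphs 'ee', 'ei' and 'ie' to 'i' so variant spellings compare equal."""
    key = latin_name.strip().lower()
    return re.sub(r"ee|ei|ie", "i", key)


keys = {spelling_key(name) for name in ["saeed", "saeid", "saied"]}
```

All three spellings fall into a single key, so records entered under any of them can be grouped as candidates for the same name.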
4. Data and Link Maintenance
An important issue in maintaining the quality of data published as linked data is:
- Updating data as well as links between the data items
Two main situations require updating the dataset:
- The original dataset is changed
  E.g. a new professor joins the Computer Engineering department
- A related external dataset is changed
  E.g. a new resource describing ‘Computer Engineering Department of Ferdowsi University of Mashhad’ is added to DBpedia
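
Both situations amount to noticing that a set of resources has changed since the last publication. A minimal sketch, assuming each dataset can be listed as a collection of URIs at two points in time, is a plain set difference; the helper name `diff_snapshots` and the professor ID `newprof` are hypothetical.

```python
def diff_snapshots(old_uris, new_uris):
    """Compare two URI snapshots and report what was added and what was removed."""
    old, new = set(old_uris), set(new_uris)
    return {"added": sorted(new - old), "removed": sorted(old - new)}


BASE = "http://wtlab.um.ac.ir/LinkedOpenData/profs/"

before = [BASE + "kahani"]
after = [BASE + "kahani", BASE + "newprof"]  # "newprof" is a made-up ID

changes = diff_snapshots(before, after)
```

Resources listed under `added` need to be published and interlinked, while those under `removed` point to links that may now be stale.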
Summary
Linked Data and the LOD cloud are introduced
An experience of publishing linked data is presented
Some problems and challenges are discussed
Behshid Behkamal