Personalised Access to Linked Data
Milan Dojchinovski and Tomas Vitvar
Web Intelligence Research Group Czech Technical University in Prague
The 19th International Conference on Knowledge Engineering and Knowledge Management (EKAW 2014) November 24-28, 2014, Linköping, Sweden
Milan Dojchinovski [email protected] - @m1ci - http://dojchinovski.mk
Except where otherwise noted, the content of this presentation is licensed under Creative Commons Attribution-NonCommercial-ShareAlike 3.0 Unported
Personalised Access to Linked Data - @m1ci - http://dojchinovski.mk
Outline
• Introduction
• Personalised Resource Recommendations
• Experiments and Results
• Conclusion and Future Work
Introduction
• Finding relevant information in LOD is not easy
  - SPARQL, manual dereferencing of URIs, …
• … or ask other people and get personalised recommendations of resources
• Linked Data based recommenders can help
LOD cloud stats [1]:
• 294 datasets in Sep 2011
• 1,091 datasets in Apr 2014
• 271% growth
[1] M. Schmachtenberg et al., Adoption of linked data best practices in different topical domains, ISWC 2014.
Related Work
• dbRec (Passant, 2010)
  - semantic distance measure: a function of direct and indirect links
• Content-based LD recommender (Di Noia et al., 2012)
  - movies domain, max resource distance: 2
• Lookup Explore Discovery (Mirizzi et al., 2010)
  - user input required
  - recommendations related to the entities occurring in the query
• Discovery Hub (Marie et al., 2013)
  - based on spreading activation
  - utilizes a small portion of the information in DBpedia
• Aemoo (Musetti et al., 2012)
  - Encyclopedic Knowledge Patterns over DBpedia
Introduction
• Method for personalised Linked Data recommendations
  - apply a collaborative filtering technique to Linked Data
  - recommendations from users with similar resource interests
• Two novel metrics:
  - resource similarity and resource relevance
• Considered aspects:
  - Resource Commonalities: how much information two resources share
  - Resource Informativeness: how informative the resources are
  - Resource Connectivity: how well the resources are connected
Outline
• Introduction
• Personalised Resource Recommendations
  - Resource Similarity
  - Resource Relevance
• Experiments and Results
• Conclusion and Future Work
Resource Recommendation In a Nutshell
• Input: RDF graph (including user profiles)
• Step 1: evaluate user similarities
  - e.g. similarity between resources representing users (instances of the foaf:Person class)
• Step 2: recommend resources from similar users
  - compute relevance for each resource candidate
  - incorporate the resource (user) similarities
[Figure: example RDF graph. Nodes: #Alfredo, #mlachwani, #FriendLynx, #Hashtagram, #MTV-Billboard-charts, #Mobile-Weather-Search, #Twitter-API, #Facebok-API, #Microsoft-Bing-API, #411Sync-API, #search, #music, #social, #microblogginig. Edge labels: dc:creator, ls:usedAPI, ls:tag, ls:category.]
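The two steps can be sketched in Python as follows. This is a minimal illustration, not the authors' implementation: the toy data, the user names, the Jaccard overlap used as the user similarity and the additive scoring are all assumptions made for the example. The talk's own similarity and relevance metrics are outlined on the slides that follow.

```python
from collections import defaultdict

# Hypothetical toy data: which user is linked to which Web APIs.
# The real input is an RDF graph; this dict is only an illustration.
USED_APIS = {
    "alice": {"Twitter-API", "Facebook-API"},
    "bob": {"Twitter-API", "Microsoft-Bing-API"},
    "carol": {"411Sync-API"},
}


def user_similarity(a, b, used=USED_APIS):
    """Step 1 (assumed metric): Jaccard overlap of the APIs linked to two users."""
    union = used[a] | used[b]
    return len(used[a] & used[b]) / len(union) if union else 0.0


def recommend(user, used=USED_APIS, top_n=5):
    """Step 2: score unseen resources by the similarity of the users linked to them."""
    scores = defaultdict(float)
    for other in used:
        if other == user:
            continue
        sim = user_similarity(user, other, used)
        for api in used[other] - used[user]:
            scores[api] += sim
    # Highest score first; ties broken alphabetically for a stable order.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [api for api, score in ranked[:top_n] if score > 0]
```

With this toy data, `recommend("alice")` returns only `Microsoft-Bing-API`, because `bob` is the only user who shares a resource with `alice`.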
Resource Similarity Computation
• Assumption 1: the more information two resources share, the more similar they are
• 6 resources in the shared context graph
[Figure: the same example RDF graph (nodes such as #Alfredo, #mlachwani, #Twitter-API and #Microsoft-Bing-API; edge labels dc:creator, ls:usedAPI, ls:tag, ls:category).]
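One way to read Assumption 1 is as an overlap between the neighbourhoods of two resources. The sketch below illustrates that reading under stated assumptions: the toy graph, the node names, the `max_dist` parameter and the choice to treat the graph as undirected are all hypothetical and not taken from the talk.

```python
from collections import deque

# Hypothetical undirected toy graph stored as adjacency sets.
GRAPH = {
    "u1": {"m1"},
    "u2": {"m2"},
    "m1": {"u1", "api1", "tag1"},
    "m2": {"u2", "api1", "tag2"},
    "api1": {"m1", "m2"},
    "tag1": {"m1"},
    "tag2": {"m2"},
}


def context(graph, start, max_dist):
    """Resources reachable from `start` within `max_dist` hops (breadth-first search)."""
    seen = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if seen[node] == max_dist:
            continue  # do not expand beyond the distance limit
        for nxt in graph.get(node, ()):
            if nxt not in seen:
                seen[nxt] = seen[node] + 1
                queue.append(nxt)
    return set(seen) - {start}


def shared_context(graph, a, b, max_dist=2):
    """Resources that occur in the contexts of both `a` and `b`."""
    return (context(graph, a, max_dist) & context(graph, b, max_dist)) - {a, b}
```

For the toy graph, `shared_context(GRAPH, "u1", "u2")` contains only `api1`. Raising `max_dist` to 3 also pulls in `m1` and `m2`, which shows how strongly the result depends on the chosen context distance.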
Resource Similarity Computation (cont.)
• Assumption 2: less probable shared resources carry more similarity information than more common ones
• Evaluated by computing the node degree value
  - Microsoft-Bing-API (deg. 40) carries more information than Twitter-API (deg. 799)
[Figure: the same example RDF graph, labelled "Information Content (IC)" and "Resource IC" (nodes such as #Alfredo, #mlachwani, #Twitter-API and #Microsoft-Bing-API; edge labels dc:creator, ls:usedAPI, ls:tag, ls:category).]
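The IC formula itself does not survive in this transcript, so the following is only a sketch of the general idea. It assumes the common form IC = -log2(p), with the probability p estimated as a node's degree divided by a total degree. The two degrees come from the slide; `TOTAL_DEGREE` is a made-up normaliser.

```python
import math


def information_content(degree, total_degree):
    """Assumed form: IC = -log2(p), with p estimated as degree / total_degree."""
    if not 0 < degree <= total_degree:
        raise ValueError("degree must be in (0, total_degree]")
    return -math.log2(degree / total_degree)


# Degrees taken from the slide; TOTAL_DEGREE is a hypothetical value.
TOTAL_DEGREE = 10_000
ic_bing = information_content(40, TOTAL_DEGREE)
ic_twitter = information_content(799, TOTAL_DEGREE)
```

Whatever normaliser is chosen, the lower-degree Microsoft-Bing-API receives a higher IC than Twitter-API, which matches the ordering stated on the slide.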
Resource Similarity Computation (cont.)
• Assumption 3: better connected shared resources carry more similarity information
• Measured by the number of simple paths between the resources
  - e.g. 2 simple paths between #Alfredo and #Twitter-API
[Figure: the same example RDF graph (nodes such as #Alfredo, #mlachwani, #Twitter-API and #Microsoft-Bing-API; edge labels dc:creator, ls:usedAPI, ls:tag, ls:category).]
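Counting simple paths (paths that visit no node twice) can be sketched with a depth-limited depth-first search. The toy graph and the `max_len` cut-off below are assumptions for illustration. The graph is built so that it yields two paths, like the #Alfredo and #Twitter-API example, but it is not the graph from the figure.

```python
# Hypothetical undirected toy graph: a user reaches an API through two mashups.
GRAPH = {
    "user": {"m1", "m2"},
    "m1": {"user", "api"},
    "m2": {"user", "api"},
    "api": {"m1", "m2"},
}


def count_simple_paths(graph, source, target, max_len):
    """Count paths from source to target that repeat no node and use at most max_len edges."""

    def dfs(node, visited, depth):
        if node == target:
            return 1
        if depth == max_len:
            return 0  # length budget exhausted before reaching the target
        total = 0
        for nxt in graph.get(node, ()):
            if nxt not in visited:
                total += dfs(nxt, visited | {nxt}, depth + 1)
        return total

    return dfs(source, {source}, 0)
```

Here `count_simple_paths(GRAPH, "user", "api", 2)` gives 2, while a limit of 1 gives 0 because no direct edge exists. The length limit matters in practice, since the number of simple paths can grow very quickly with path length in a dense graph.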
Resource Relevance Computation
• Recommending resources of type Web API for a user
• Recommendations from similar users
  - connectivity between the similar user and the resource candidate
    - number of simple paths
    - informativeness of each resource in these paths
[Figure: the same example RDF graph with the annotation "similar users" (nodes such as #Alfredo, #mlachwani, #Twitter-API and #Microsoft-Bing-API; edge labels dc:creator, ls:usedAPI, ls:tag, ls:category).]
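The slide names the ingredients of the relevance score but not how they are combined, so the sketch below picks one plausible combination purely for illustration. For every similar user it adds the user's similarity multiplied by the mean IC of the intermediate resources on each simple path to the candidate. The graph, the IC values, the similarity value and the function names are hypothetical.

```python
def simple_paths(graph, source, target, max_len):
    """Yield every simple path (as a list of nodes) with at most max_len edges."""
    stack = [[source]]
    while stack:
        path = stack.pop()
        node = path[-1]
        if node == target:
            yield path
            continue
        if len(path) - 1 == max_len:
            continue
        for nxt in graph.get(node, ()):
            if nxt not in path:
                stack.append(path + [nxt])


def relevance(graph, ic, similar_users, candidate, max_len=2):
    """Assumed scoring: sum over similar users of similarity * path informativeness."""
    score = 0.0
    for user, similarity in similar_users.items():
        for path in simple_paths(graph, user, candidate, max_len):
            inner = path[1:-1]  # resources between the user and the candidate
            path_ic = sum(ic[n] for n in inner) / len(inner) if inner else 1.0
            score += similarity * path_ic
    return score


# Hypothetical toy data: one similar user, two mashups, two candidate APIs.
GRAPH = {
    "bob": {"m1", "m2"},
    "m1": {"bob", "apiA"},
    "m2": {"bob", "apiA", "apiB"},
    "apiA": {"m1", "m2"},
    "apiB": {"m2"},
}
IC = {"m1": 2.0, "m2": 1.0}
SIMILAR_USERS = {"bob": 0.5}
```

In this toy setting `apiA` scores 1.5 and `apiB` scores 0.5: `apiA` is reached over two paths, one of which passes through the more informative `m1`.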
Experiments Setup
• Linked Web APIs dataset
  - RDF representation of ProgrammableWeb.com
  - largest service and mashup repository
• Evaluated accuracy and usefulness of recommendations
• Accuracy:
  - precision/recall, AUC, NDCG, MAP, MRR
• Usefulness:
  - serendipity: how surprising the recommendations are
  - diversity: how diverse the recommendations are
• Evaluated methods:
  - User-KNN, Item-KNN, Most popular, Random
  - LD with RIC, LD without RIC
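For reference, the per-user ranking metrics listed above can be sketched with their textbook definitions for binary relevance. This is not the evaluation code used in the experiments, and the function names are chosen here for illustration. MAP and MRR are the means of `average_precision` and `reciprocal_rank` over all test users.

```python
import math


def precision_at_k(ranked, relevant, k):
    """Fraction of the top-k recommendations that are relevant."""
    return sum(1 for item in ranked[:k] if item in relevant) / k


def recall_at_k(ranked, relevant, k):
    """Fraction of the relevant items that appear in the top-k."""
    if not relevant:
        return 0.0
    return sum(1 for item in ranked[:k] if item in relevant) / len(relevant)


def reciprocal_rank(ranked, relevant):
    """1 / rank of the first relevant item, or 0 if none is recommended."""
    for rank, item in enumerate(ranked, start=1):
        if item in relevant:
            return 1.0 / rank
    return 0.0


def average_precision(ranked, relevant):
    """Mean of the precision values at the ranks where a relevant item occurs."""
    hits, total = 0, 0.0
    for rank, item in enumerate(ranked, start=1):
        if item in relevant:
            hits += 1
            total += hits / rank
    return total / len(relevant) if relevant else 0.0


def ndcg_at_k(ranked, relevant, k):
    """Discounted cumulative gain of the top-k, normalised by the ideal ranking."""
    dcg = sum(1.0 / math.log2(rank + 1)
              for rank, item in enumerate(ranked[:k], start=1) if item in relevant)
    ideal = sum(1.0 / math.log2(rank + 1)
                for rank in range(1, min(k, len(relevant)) + 1))
    return dcg / ideal if ideal else 0.0
```

For a ranking `["a", "b", "c", "d"]` with relevant items `{"a", "c"}`, precision and recall at 2 are both 0.5, the reciprocal rank is 1.0 and the average precision is 5/6.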
Accuracy Evaluation
• Taking resource informativeness into account makes sense
• Item-KNN and User-KNN do not work well
  - … at least in the Web services domain
[Figure: precision (0.00-0.20) plotted against recall (0.0-1.0) for six methods: Linked Data based with RIC, Linked Data based without RIC, User-KNN, Item-KNN, Most popular, Random.]
Serendipity and Diversity Evaluation
• Serendipity score = avg. user-resource distance
• Diversity score = avg. dissimilarity between all resource pairs

Metric       @top-N   Random   Most Popular  User-KNN  Item-KNN  LD without RIC  LD with RIC
Serendipity  @top-5   2.97752  2.66810       2.59197   2.68006   3.18881         3.03271
Serendipity  @top-10  2.98455  2.67465       2.65514   2.70402   3.54821         3.26700
Serendipity  @top-15  2.98364  2.65816       2.68101   2.71267   3.73117         3.36509
Serendipity  @top-20  2.98455  2.65184       2.69780   2.70968   3.84142         3.42444
Diversity    @top-5   0.65339  0.58347       0.62092   0.63349   0.83417         0.81949
Diversity    @top-10  0.65317  0.61354       0.62411   0.64392   0.86044         0.82912
Diversity    @top-15  0.65370  0.60374       0.63159   0.64558   0.87511         0.82884
Diversity    @top-20  0.65347  0.60719       0.63276   0.64287   0.88435         0.83114
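The two usefulness scores can be sketched as below. Two readings are assumed here and are not confirmed by the slides: "distance" is taken to be the shortest-path hop count in the graph, and "dissimilarity" is taken to be 1 minus a pairwise similarity. The toy graph and the constant similarity function are hypothetical.

```python
from collections import deque
from itertools import combinations


def shortest_distance(graph, source, target):
    """Hop count of the shortest path (breadth-first search), or None if unreachable."""
    seen = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            return seen[node]
        for nxt in graph.get(node, ()):
            if nxt not in seen:
                seen[nxt] = seen[node] + 1
                queue.append(nxt)
    return None


def serendipity(graph, user, recommended):
    """Average distance between the user and each reachable recommended resource."""
    dists = [shortest_distance(graph, user, r) for r in recommended]
    dists = [d for d in dists if d is not None]
    return sum(dists) / len(dists) if dists else 0.0


def diversity(recommended, similarity):
    """Average of (1 - similarity) over all pairs of recommended resources."""
    pairs = list(combinations(recommended, 2))
    if not pairs:
        return 0.0
    return sum(1.0 - similarity(a, b) for a, b in pairs) / len(pairs)


# Hypothetical toy graph: api1 is 2 hops from the user, api2 is 4 hops away.
GRAPH = {
    "user": {"m1"},
    "m1": {"user", "api1", "tag"},
    "api1": {"m1"},
    "tag": {"m1", "m2"},
    "m2": {"tag", "api2"},
    "api2": {"m2"},
}
```

With this graph, `serendipity(GRAPH, "user", ["api1", "api2"])` is 3.0, the mean of 2 and 4 hops.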
Trade-off: Serendipity, Diversity and Accuracy
• higher serendipity leads to lower precision and higher recall
• optimal results @top 5-10
[Figure, panel "Precision/Recall vs. Serendipity": precision (0.00-0.30) and serendipity (3.00-3.50) plotted against recall (0.73-0.82) at @5, @10, @15 and @20.]
• higher diversity leads to lower precision and higher recall
• optimal results @top 5-10
[Figure, panel "Precision/Recall vs. Diversity": precision (0.00-0.30) and diversity (0.818-0.834) plotted against recall (0.73-0.82) at @5, @10, @15 and @20.]
Conclusion
• Method for personalised access to Linked Data
  - recommendations based on the collaborative filtering technique
• Considered aspects:
  - resources’ commonalities
  - resources’ informativeness
  - resources’ connectivity
• Validated on a dataset from the Web services domain
  - Linked Web APIs dataset
• Future work:
  - consider other multi-domain datasets
  - automatic determination of optimal resource context distances
  - publish the Linked Web APIs dataset to the LOD cloud
Feedback
Thank you! Questions, comments, ideas?
Milan Dojchinovski [email protected]
@m1ci http://dojchinovski.mk