29-10-2014 pag. 1
A Method for Detecting Behavior-Based User Profiles in Collaborative Ontology Engineering
Sven Van Laere, Ronald Buyl and Marc Nyssen
29-10-2014 @ ODBASE, OTM 2014
22-9-2015 pag. 2
Overview
• Motivation
• User profiling
• … definition
• … in the research field
• Ontology engineering
• Method
• Use Case
• Conclusions and Future work
22-9-2015 pag. 3
Motivation
• Types of users are not known beforehand
• Ontology engineering is far from trivial
• Most methods and tools use a set of predefined roles
• Depend on the ontology project and interests of a user
• Assigning based on previous experiences, confidence and reliability in a user
Roles and Responsibilities vs. Users
22-9-2015 pag. 4
User profile
• Definition
• … is a model of a user’s interest and preferences which an agent can use to assist a user’s activity based on inferring observable information [1,2]
[1] D. Godoy and A. Amandi. User Profiling in Personal Information Agents: a Survey. (2005)
[2] I. Zukerman and D. Albrecht. Predictive Statistical Models for User Modeling. (2001)
22-9-2015 pag. 5
User profile
• In the research field
• Fields
• News
• Internet browsing
• E-commerce
• Computer supported collaborative work (CSCW)
• …
• Approaches
• Knowledge based user profiling
• Behaviour based user profiling
22-9-2015 pag. 6
User profile
• Behaviour based user profiling
• Behavioural dimensions
• Focus dispersion
• Engagement
• Contribution
• Initiation
• Content Quality
• Popularity
“How to determine user role/profile based on the type of input of a user?”
22-9-2015 pag. 7
Ontology Engineering
• GOSPL
• Grounding Ontologies with Social Processes and Natural Language
• Chosen for its explicit social interactions
• Communities promoted to first class citizens
• Use of natural language definitions (called ‘glosses’)
• Concepts are represented
• Formally => lexon
• Informally => gloss
22-9-2015 pag. 8
Ontology Engineering
• GOSPL
22-9-2015 pag. 9
Ontology Engineering
• Interactions in GOSPL tool
• Acting like a forum
• Difference between a forum and ontology engineering (O.E.):
• Closer
• Goal-oriented
• Deadline driven
22-9-2015 pag. 10
Method
22-9-2015 pag. 11
Method – Extraction Phase
• Apply D2RQ mapping of GOSPL ontology
[Figure: hierarchy of interaction types. Social interaction (sioc:Item): Vote, sioc:Post, Reply, …; below these: Gloss interactions, Lexon interactions, …; gloss interactions: ADD gloss, UPDATE gloss, DELETE gloss, …]
22-9-2015 pag. 12
Method – Manipulation Phase
• Standardize dataset
• Principal Component Analysis (PCA)
• Transformation of variables (orthogonal)
• Reduce dimensionality
• Compose new matrix
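The manipulation steps above can be sketched in a few lines of NumPy. This is only an illustrative sketch, not the authors' implementation: the function names `standardize` and `pca_reduce`, the use of an SVD to obtain the components, and the random input data are all assumptions made for the example.

```python
import numpy as np

def standardize(X):
    """Z-score every column: subtract its mean, divide by its standard deviation."""
    X = np.asarray(X, dtype=float)
    mean = X.mean(axis=0)
    std = X.std(axis=0)
    std[std == 0] = 1.0  # a constant column becomes all zeros instead of dividing by 0
    return (X - mean) / std

def pca_reduce(Z, variance_to_keep=0.95):
    """Project Z onto the fewest orthogonal components whose cumulative
    explained variance reaches `variance_to_keep`."""
    Z = Z - Z.mean(axis=0)
    # SVD of the centered data: the rows of Vt are the principal axes.
    U, S, Vt = np.linalg.svd(Z, full_matrices=False)
    explained = S**2 / np.sum(S**2)
    n_components = int(np.searchsorted(np.cumsum(explained), variance_to_keep)) + 1
    n_components = min(n_components, len(S))
    # The new matrix: one row per user, one column per retained component.
    return Z @ Vt[:n_components].T, explained[:n_components]

# Hypothetical input: random counts for 42 users and 10 dimensions (illustration only).
rng = np.random.default_rng(0)
counts = rng.poisson(lam=5.0, size=(42, 10)).astype(float)
Z = standardize(counts)
scores, explained = pca_reduce(Z, variance_to_keep=0.95)
print(scores.shape, explained.sum())
```

The columns of the returned matrix are uncorrelated, which is what makes the reduced matrix a convenient input for a distance-based clustering step.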
22-9-2015 pag. 13
Method – Clustering Phase
• K-means clustering
• ANOVA
• Silhouette coefficients
• Take best result => different profiles
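One way to read the clustering phase is: run k-means for several values of k, score each result with the mean silhouette coefficient, and keep the best one. The sketch below follows that reading under stated assumptions: the helper names, the range of k, and the deterministic farthest-point seeding (used here only to keep the example reproducible) are not taken from the slides.

```python
import numpy as np

def farthest_point_init(X, k):
    """Deterministic seeding (an assumption of this sketch): start at the first
    point, then repeatedly add the point farthest from all chosen centers."""
    centers = [X[0]]
    for _ in range(1, k):
        d = np.linalg.norm(X[:, None, :] - np.array(centers)[None, :, :], axis=2)
        centers.append(X[d.min(axis=1).argmax()])
    return np.array(centers)

def kmeans(X, k, n_iter=100):
    """Plain Lloyd iterations; returns (labels, centers)."""
    centers = farthest_point_init(X, k)
    for _ in range(n_iter):
        dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        new_centers = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels, centers

def mean_silhouette(X, labels):
    """Average of s(i) = (b - a) / max(a, b) over all points."""
    clusters = np.unique(labels)
    if len(clusters) < 2:
        raise ValueError("silhouette needs at least two clusters")
    D = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    scores = np.zeros(len(X))
    for i in range(len(X)):
        same = labels == labels[i]
        if same.sum() == 1:
            continue  # singleton cluster: s(i) = 0 by convention
        a = D[i, same].sum() / (same.sum() - 1)  # mean distance within own cluster
        b = min(D[i, labels == c].mean() for c in clusters if c != labels[i])
        denom = max(a, b)
        scores[i] = (b - a) / denom if denom > 0 else 0.0
    return scores.mean()

def best_clustering(X, k_values):
    """Cluster for every k and keep the labelling with the highest silhouette."""
    results = {k: kmeans(X, k)[0] for k in k_values}
    scores = {k: mean_silhouette(X, lab) for k, lab in results.items()}
    best_k = max(scores, key=scores.get)
    return best_k, results[best_k], scores
```

A call such as `best_clustering(X, range(2, 8))` returns the selected k, the cluster label of every row of `X`, and the silhouette score per k, so the scores can be inspected before a profile count is settled on.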
22-9-2015 pag. 14
Use Case
[Figure: the 42 users (numbered 1 to 42) and their grouping into teams]
Work in teams
22-9-2015 pag. 15
Use Case
…
[A] gloss interactions
[B] lexon interactions
[C] constraint interactions
[D] supertype interactions
[E] gloss equivalence interactions
[F] synonym interactions
[G] general request interactions
[H] reply interactions
[I] closes of topics
[J] vote interactions
1st iteration
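The interaction types [A] to [J] can presumably serve as the dimensions of a per-user count vector. The sketch below shows one way to build such a matrix; the category identifiers, the `(user, category)` record format, and the sample log are hypothetical and not taken from the GOSPL tool.

```python
from collections import Counter

# Hypothetical identifiers for the ten interaction types [A]..[J] listed above.
CATEGORIES = [
    "gloss", "lexon", "constraint", "supertype", "gloss_equivalence",
    "synonym", "general_request", "reply", "close_topic", "vote",
]

def build_count_matrix(interactions, users):
    """Return one row per user holding the number of interactions per category.

    `interactions` is an iterable of (user_id, category) pairs.
    """
    counts = Counter(interactions)  # (user_id, category) -> number of occurrences
    return [[counts[(user, cat)] for cat in CATEGORIES] for user in users]

# Made-up interaction log for three users (illustration only).
log = [(1, "gloss"), (1, "gloss"), (1, "vote"), (2, "reply"), (2, "vote")]
matrix = build_count_matrix(log, users=[1, 2, 3])
for user, row in zip([1, 2, 3], matrix):
    print(user, row)
```

Users without any recorded interaction still get a row of zeros, which keeps the matrix shape fixed at users × dimensions.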
22-9-2015 pag. 16
Use Case
• Standardize data (z-score)
• PCA transformations
• 95% of variance
• Iterative process
• Original: 42 users × 10 dimensions
• After PCA: 42 users × 5 dimensions
22-9-2015 pag. 17
Use Case
• K-means clustering
• Silhouette calculations
• ANOVA testing
• Confidence level 0.95 (α = 0.05)
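One plausible use of ANOVA here, not confirmed by the slides, is to test for each dimension whether its mean differs between the clusters. The sketch below computes the one-way ANOVA F statistic by hand; the function name and the sample values are made up for illustration. To turn F into a p-value, a library routine such as `scipy.stats.f_oneway` could be used instead.

```python
def one_way_anova_f(groups):
    """F statistic of a one-way ANOVA.

    `groups` is a list of lists: the values of one dimension, split by cluster.
    """
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    if k < 2 or n_total <= k:
        raise ValueError("need at least two groups and more values than groups")
    means = [sum(g) / len(g) for g in groups]
    grand_mean = sum(sum(g) for g in groups) / n_total
    # Variation of the cluster means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Variation of the values around their own cluster mean.
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    return (ss_between / (k - 1)) / (ss_within / (n_total - k))

# Made-up scores on one dimension for three clusters (illustration only).
f_value = one_way_anova_f([[1, 2, 3], [2, 3, 4], [5, 6, 7]])
print(f_value)
```

A large F value means the between-cluster variation dominates the within-cluster variation on that dimension.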
22-9-2015 pag. 18
Use Case
22-9-2015 pag. 19
Use Case
[Figure: the teams of users, with the users assigned to Cluster 1, Cluster 2, Cluster 3, Cluster 4 and Cluster 5]
22-9-2015 pag. 20
Conclusions and Future Work
• Conclusions
• Demonstration of method for user profiling (UP):
• Semantic mapping (SIOC)
• Extract data
• Standardize data
• PCA to reduce dimensionality
• K-means clustering
• Silhouette coefficients and ANOVA testing
• 5 clusters based on behaviour
22-9-2015 pag. 21
Conclusions and Future Work
• Discussion & future work
• Sensitive to active and passive users
• Combine with classic behavioural dimensions
• Validation of cluster quality
• Dunn index
• Davies-Bouldin index
• C-index
• Iterate process and re-evaluate
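Of the validation indices named above, the Dunn index is the simplest to sketch: the smallest distance between points of different clusters divided by the largest distance between points of the same cluster. The code below is a minimal sketch under that common definition; the function name and the toy points are assumptions for illustration.

```python
import numpy as np

def dunn_index(X, labels):
    """Minimum inter-cluster distance divided by maximum intra-cluster diameter.

    Higher values indicate compact, well-separated clusters.
    """
    X = np.asarray(X, dtype=float)
    labels = np.asarray(labels)
    D = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    same = labels[:, None] == labels[None, :]
    if same.all():
        raise ValueError("need at least two clusters")
    max_diameter = D[same].max()     # largest distance within any cluster
    min_separation = D[~same].min()  # smallest distance across clusters
    if max_diameter == 0:
        raise ValueError("every cluster is a single point; the index is undefined")
    return min_separation / max_diameter

# Toy points (illustration only): two tight pairs that lie far apart.
points = [[0, 0], [0, 1], [10, 0], [10, 1]]
good = dunn_index(points, [0, 0, 1, 1])
bad = dunn_index(points, [0, 1, 0, 1])
print(good, bad)
```

Comparing the index across candidate clusterings gives a second opinion next to the silhouette coefficients.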
22-9-2015 pag. 22
References
• D. Godoy and A. Amandi. User Profiling in Personal Information Agents: a Survey. Knowledge Engineering Review, 20(4):329–361, 2005.
• C. Debruyne and R. Meersman. GOSPL: A method and tool for fact-oriented hybrid ontology engineering. In: T. Morzy, T. Härder, R. Wrembel (eds.) ADBIS 2012. LNCS, vol. 7503, pp. 153–166. Springer, Heidelberg (2012)
• M. Rowe, M. Fernandez, S. Angeletou, and H. Alani. Community Analysis through Semantic Rules and Role Composition Derivation. Web Semantics: Science, Services and Agents on the World Wide Web, 18(1):31–47, 2013.
• I. Zukerman and D. Albrecht. Predictive Statistical Models for User Modeling. User Modeling and User-Adapted Interaction, 11(1-2):5–18, 2001.
22-9-2015 pag. 23
THANK YOU!