Towards Exploratory Relationship Search: A Clustering-Based Approach
Yanan Zhang, Gong Cheng, Yuzhong Qu
Nanjing University, China
Outline
• Motivation
• Challenges
• Approach
• Evaluation
• Conclusion
Relationship search
Searching graph-structured data
relationship = path
Too many results!
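To make "relationship = path" concrete, here is a minimal sketch that enumerates simple paths between two entities in a small undirected graph. The triples, the function names, and the length bound `max_len` are assumptions made for illustration; they are not taken from the slides or from any particular system.

```python
from collections import defaultdict

# Toy triples (subject, property, object); all names are invented for illustration.
TRIPLES = [
    ("Alice", "knows", "Bob"),
    ("Alice", "worksAt", "AcmeCorp"),
    ("Bob", "worksAt", "AcmeCorp"),
    ("Alice", "authorOf", "Paper1"),
    ("Bob", "authorOf", "Paper1"),
    ("Paper1", "publishedIn", "Conf1"),
    ("Bob", "attended", "Conf1"),
]


def build_adjacency(triples):
    """Index every edge in both directions, since edges are treated as undirected."""
    adj = defaultdict(list)
    for s, p, o in triples:
        adj[s].append((p, o))
        adj[o].append((p, s))
    return adj


def find_relationships(triples, start, end, max_len):
    """Return all simple paths from start to end that use at most max_len edges.

    A path is a list [v0, p1, v1, ..., pk, vk] alternating vertices and properties.
    """
    adj = build_adjacency(triples)
    results = []

    def dfs(node, path, visited):
        if node == end:
            results.append(list(path))
            return
        if (len(path) - 1) // 2 >= max_len:  # number of edges used so far
            return
        for prop, nxt in adj[node]:
            if nxt not in visited:
                visited.add(nxt)
                path.extend([prop, nxt])
                dfs(nxt, path, visited)
                del path[-2:]
                visited.remove(nxt)

    dfs(start, [start], {start})
    return results


for rel in find_relationships(TRIPLES, "Alice", "Bob", max_len=3):
    print(" - ".join(rel))
```

Even on this tiny graph, raising the length bound increases the number of results, which hints at why a flat result list becomes hard to use on a large data set.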
Exploratory relationship search
• Exploring a set of relationships interactively and continuously
clustering (our solution: RelClus)
faceted categories (RelFinder)
Challenges
• How to meaningfully label a cluster?
• How to make sense of a cluster hierarchy?
• How to measure similarity between clusters?
Agglomerative hierarchical clustering
• Initially: each relationship forms a singleton cluster
• Then: progressively merge the most similar pair of clusters
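The two steps above can be sketched generically as follows. This is an illustrative sketch only: the function `agglomerative`, the toy data in `RELS`, and the stand-in similarity `shared_properties` are assumptions and do not reproduce the similarity measure used in the talk.

```python
from itertools import combinations


def agglomerative(items, sim):
    """Merge the most similar pair of clusters until a single cluster remains.

    items: hashable objects to cluster.
    sim:   function taking two clusters (frozensets of items), returning a number.
    Returns the merges as (left, right, merged) tuples in the order performed.
    """
    clusters = [frozenset([x]) for x in items]  # initially: singleton clusters
    merges = []
    while len(clusters) > 1:
        # Pick the pair of current clusters with the highest similarity.
        a, b = max(combinations(clusters, 2), key=lambda pair: sim(*pair))
        clusters = [c for c in clusters if c not in (a, b)]
        merged = a | b
        clusters.append(merged)
        merges.append((a, b, merged))
    return merges


# Hypothetical relationships, each reduced to the set of properties on its path.
RELS = {
    "R1": {"knows"},
    "R2": {"worksAt"},
    "R3": {"worksAt", "locatedIn"},
}


def shared_properties(c1, c2):
    """Stand-in similarity: number of properties common to every member."""
    return len(set.intersection(*(RELS[r] for r in c1 | c2)))


merges = agglomerative(sorted(RELS), shared_properties)
for left, right, merged in merges:
    print(sorted(left), "+", sorted(right), "->", sorted(merged))
```

The recorded merge order is what yields the cluster hierarchy: each merge becomes an internal node whose children are the two merged clusters.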
Relationship pattern
• High-level abstraction of relationships
  – Vertices: entities or classes
  – Edges: properties (undirected)
How to meaningfully label a cluster?
• Using a least common relationship pattern
  – Vertices: least common classes (or entities)
  – Edges: least common properties
[Figure: relationships R4 and R5 and their common pattern P1; class label shown: Person]
label({R4, R5}) = P1
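A minimal sketch of how such a label could be computed, under simplifying assumptions: relationships are equal-length tuples alternating vertices and properties, they are already aligned position by position, and the class and property hierarchies are trees. The schema in `SUPER` and `TYPE` and all entity names are invented; only the class name Person appears on the slide.

```python
# Hypothetical toy schema: child -> parent, for both classes and properties.
SUPER = {
    "Actor": "Person", "Director": "Person", "Person": "Thing",
    "Film": "Work", "Work": "Thing",
    "starring": "participant", "director": "participant",
}
# Hypothetical entity -> class assignments.
TYPE = {"TomHanks": "Actor", "RonHoward": "Director", "Apollo13": "Film"}


def generalizations(term):
    """Return term followed by everything that generalizes it, most specific first."""
    chain = [term]
    current = TYPE.get(term, term)  # an entity generalizes first to its class
    if current != term:
        chain.append(current)
    while current in SUPER:
        current = SUPER[current]
        chain.append(current)
    return chain


def least_common(a, b):
    """Most specific term generalizing both a and b, or None if there is none."""
    b_chain = set(generalizations(b))
    for g in generalizations(a):
        if g in b_chain:
            return g
    return None


def label(relationships):
    """Least common pattern of aligned, equal-length relationships (else None)."""
    pattern = list(relationships[0])
    for rel in relationships[1:]:
        if len(rel) != len(pattern):
            return None
        pattern = [least_common(p, x) for p, x in zip(pattern, rel)]
        if None in pattern:
            return None
    return tuple(pattern)


ra = ("TomHanks", "starring", "Apollo13")
rb = ("RonHoward", "director", "Apollo13")
print(label([ra, rb]))  # ('Person', 'participant', 'Apollo13')
```

Here the shared entity stays an entity, while the differing vertices and edges are lifted to their most specific common class and property. Because edges are undirected, a fuller version would also need to try the reversed alignment of each path.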
How to make sense of a cluster hierarchy?
• subPatternOf (⊑)
  – Vertices: s.t. subClassOf (or instance-type)
  – Edges: s.t. subPropertyOf
[Figure: patterns P1, P2, and P3]
P2 ⊑ P3, P1 ⊑ P3
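The subPatternOf check can be sketched as a position-wise test. This sketch assumes patterns are equal-length tuples alternating vertices and properties, and that the hierarchies are acyclic. The schema and the contents of `p1`, `p2`, and `p3` are invented and only mirror the relation shown on the slide.

```python
# Hypothetical toy schema: child -> parent, for both classes and properties.
SUPER = {
    "Actor": "Person", "Director": "Person", "Person": "Thing",
    "Film": "Work", "Work": "Thing",
    "starring": "participant", "director": "participant",
}
# Hypothetical entity -> class assignments.
TYPE = {"TomHanks": "Actor", "RonHoward": "Director", "Apollo13": "Film"}


def is_specialization(term, general):
    """True if term equals general or reaches it via type, subClassOf or subPropertyOf."""
    if term == general:
        return True
    current = TYPE.get(term, term)  # step from an entity to its class, if any
    while True:
        if current == general:
            return True
        if current not in SUPER:
            return False
        current = SUPER[current]


def sub_pattern_of(p, q):
    """True if every vertex and edge of p specializes the one at the same position in q."""
    return len(p) == len(q) and all(is_specialization(x, y) for x, y in zip(p, q))


p1 = ("Actor", "starring", "Film")
p2 = ("Director", "director", "Film")
p3 = ("Person", "participant", "Film")

print(sub_pattern_of(p1, p3), sub_pattern_of(p2, p3), sub_pattern_of(p3, p1))
```

Reading the hierarchy top-down then corresponds to moving from general patterns such as `p3` to the more specific patterns beneath them.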
How to measure similarity between clusters?
• sim(Ci, Cj) = how many commonalities they share, which are exactly captured by label(Ci ∪ Cj)
  – Measure: -log(probability of seeing label(Ci ∪ Cj)), i.e. the information content associated with label(Ci ∪ Cj)
  – Probability estimation: based on the data set
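A minimal sketch of the information-content measure. The slides say only that the probability is estimated from the data set; the estimator used here (the fraction of sampled relationships that match the pattern) is an assumption, as are the wildcard-based `matches` helper and the toy sample.

```python
import math


def information_content(pattern, sample, matches):
    """Return -log of the fraction of sampled relationships that match pattern."""
    hits = sum(1 for rel in sample if matches(rel, pattern))
    if hits == 0:
        return math.inf  # a pattern never seen carries unbounded information
    return -math.log(hits / len(sample))


def matches(rel, pattern):
    """Toy matcher: '?' in the pattern matches anything at that position."""
    return len(rel) == len(pattern) and all(
        y == "?" or x == y for x, y in zip(rel, pattern)
    )


# Hypothetical sample of relationships drawn from a data set.
SAMPLE = [
    ("a", "p", "b"),
    ("a", "p", "c"),
    ("d", "q", "b"),
    ("d", "q", "c"),
]

general = ("?", "p", "?")    # matches 2 of 4 sampled relationships
specific = ("a", "p", "b")   # matches 1 of 4 sampled relationships

print(information_content(general, SAMPLE, matches))
print(information_content(specific, SAMPLE, matches))
```

Under this estimator a more specific shared label is rarer and therefore scores higher, so two clusters whose union still has a specific label count as more similar than two clusters that share only a very general one.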
A running example
[Figure: relationships R1 to R5 and patterns P1, P2, and P3]
Design
• Data set: DBpedia
• Systems
  – RList: just a list of all results
  – RFacet: with faceted categories (similar to RelFinder)
  – RClus: with hierarchical clustering (our solution)
• Participants and tasks
  – 2 participants provide search tasks
    • 3 (well-defined) lookup tasks
    • 3 (open) exploratory search tasks
  – 15 participants carry out tasks
• Metrics
  – Questionnaire
  – SUS (System Usability Scale)
  – User feedback
Questionnaire results
Some inspiring user feedback
• Dislike deep hierarchies
• Expect more concise visualization
• Need more cognitive support
Performance testing
Conclusion
• Goal: clustering-based exploratory relationship search
• Approach: pattern-centric
• Future work
  – Combining faceted categories and hierarchical clustering
  – Going beyond them