histoGraph presentation, INSA de Lyon
TRANSCRIPT
Lars Wieneke, CVCE, Luxembourg
histoGraph: Building a social graph from image archives
La science et les effets de réseau, Lyon 24.03.2014
www.cubrikproject.eu
About CUbRIK
European Community's Seventh Framework Program FP7-ICT
15 European partners
Multimedia search processing: Putting humans in the loop
Demos: History of Europe and Fashion
Goal: Reconstructing and exploring social ties through historical sources
Towards the social graph
4 pillars
1. Close connection to the requirements of researchers in European Integration studies
2. Structured and referenceable repository of persons, events and places in time
3. Efficient indexation process that enables the association of faces with identities
4. Toolchain for analysis and visualization
Towards the social graph
Sourcing researcher requirements
Selection of target user group
First draft of the app scenario
Feedback on technical scope
Exploratory interviews (daily work practices)
Second draft of the app scenario
Focus group (user needs and app scenarios)
Feedback on technical feasibility
Lessons learned: issues and features
Specification
Implementation of the 1st demonstrator
Workshop: Review of app and features
Revised specification
Implementation of the 2nd demonstrator
Evaluation and test
Stage 1
Stage 2
Stage 3
Stage 4
Stage 5
Users
Requirements
Technology
Towards the social graph
Indexation process
Raw content
High level features (automatic annotations)
Conflict? (e.g., “Image contains ‘Romano Prodi’”, confidence = low)
Conflict store
Conflict manager
Conflict resolution task store
Conflict resolution task: conflict, required skill, priority, ..
CUbRIK app for conflict resolution
Game, Q&A, Crowdtask
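The flow on this slide (automatic annotation, conflict detection, conflict resolution task) can be illustrated with a minimal sketch. It is not the CUbRIK implementation: the class names, the threshold value, the two skill levels and the priority formula are all assumptions made for illustration, as are the image IDs and confidence scores.

```python
from dataclasses import dataclass

# Hypothetical threshold: annotations scoring below it count as conflicts.
CONFIDENCE_THRESHOLD = 0.6


@dataclass
class Annotation:
    image_id: str
    label: str         # e.g. "Romano Prodi"
    confidence: float  # score from the automatic annotator, 0.0 to 1.0


@dataclass
class ConflictTask:
    conflict: Annotation
    required_skill: str  # "clickworker" or "expert" (assumed skill levels)
    priority: int        # higher value means more urgent


def build_conflict_tasks(annotations, threshold=CONFIDENCE_THRESHOLD):
    """Turn low-confidence annotations into conflict resolution tasks."""
    tasks = []
    for ann in annotations:
        if ann.confidence >= threshold:
            continue  # confident enough, no human input requested
        # Assumption: the least certain identifications are sent to experts.
        skill = "expert" if ann.confidence < threshold / 2 else "clickworker"
        # Assumption: priority grows with the distance below the threshold.
        priority = round((threshold - ann.confidence) * 100)
        tasks.append(ConflictTask(ann, skill, priority))
    # Most urgent tasks first.
    return sorted(tasks, key=lambda t: t.priority, reverse=True)


# Illustrative data only.
annotations = [
    Annotation("img-001", "Romano Prodi", 0.25),
    Annotation("img-002", "Romano Prodi", 0.95),
    Annotation("img-003", "Romano Prodi", 0.50),
]
tasks = build_conflict_tasks(annotations)
for task in tasks:
    print(task.conflict.image_id, task.required_skill, task.priority)
```

In this sketch the task list plays the role of the conflict resolution task store; a game, Q&A or crowd-task front end would then pick tasks from it according to the required skill.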
Towards the social graph
Indexation process II
Human-in-the-loop added value: verification of ambiguous and time-dependent identities/places/events is only possible by putting humans in the loop
Integration of multiple perspectives
CUbRIK as an open toolbox allows follow-up and extension through third parties
“Vertical” integration: GUI, components and crowdsourcing integrated in a platform
Challenges & Approach
Main challenges
Detection and identification of identities/places/events in time
Verification of identities/places/events in time
Analysis of relationships (e.g. co-occurrences)
Rights aware crawling and storage
Verification of provenance and license information
Truth and provenance
Approach
Crowd-sourced verification of detected faces (false positives/negatives)
Verification of identities/places/events in time through social networks of experts
Visual knowledge discovery/exploration
Integrated rights aware crawling and storage
Integrated license and provenance management
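The "analysis of relationships (e.g. co-occurrences)" challenge can be made concrete with a small sketch: once faces in an image are linked to verified identities, every pair of people appearing in the same image yields a weighted edge in the social graph. The function name and the sample data below are hypothetical, and the slides do not state how histoGraph actually computes its edges.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_edges(image_identities):
    """Count how often each pair of people appears in the same image.

    image_identities maps an image ID to the list of verified identities
    found in that image. Returns a Counter keyed by (person_a, person_b),
    with the pair in alphabetical order.
    """
    edges = Counter()
    for people in image_identities.values():
        # set() avoids counting a person twice within one image.
        for a, b in combinations(sorted(set(people)), 2):
            edges[(a, b)] += 1
    return edges


# Illustrative data only.
images = {
    "img-001": ["Person A", "Person B"],
    "img-002": ["Person A", "Person B", "Person C"],
    "img-003": ["Person C"],
}
edges = cooccurrence_edges(images)
for (a, b), weight in sorted(edges.items()):
    print(f"{a} -- {b}: {weight}")
```

The edge weight is a plain count here; weighting by date or by event would be a natural extension given the "in time" emphasis of the slides.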
Towards the social graph
Bringing it all together
Image Indexation
Media harvesting and upload
Face detection
Face identification
Clickworkers
Crowd face position validation
Copyright aware crawler
Provenance checker
License checker
Content provider tools
Metadata entity extraction
Identity reconciliation
Entity verification & annotation
Entitypedia integration
CROWD pre-filtering
Text Indexation
Connection to the CVCE collection
Entity annotation and extraction
Expert Crowd
Expert CROWD verification
Entitypedia integration
CROWD Research Inquiries
Expert Crowd
CROWD research inquiry
Social Graph Network Analysis
Graph visualization
Analysis of the social graph
Graph Query (old: Query for Entities)
Graph visualization
Query for entities
Context Expander
Expansion through documents
Expansion through videos
Expansion through images
Expansion through related entities
Social Graph construction
Social graph creation
Content Analysis and Enrichment
Querying
Feedback acquisition and processing
Y3 component
Graph visualization
Query for spatial constraints
Expansion through similar images
WP Event detection
Pipelining the CUbRIK components: Human input from click-workers
Great choice for simple tasks:
Face detection: false positives, false negatives
Monetary motivation, via www.microtask.com
Poor performance on complex tasks:
Low resolution images
Different angles etc.
Actors recurring over time
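For the simple task named above (confirming or rejecting a detected face), answers from several click-workers have to be combined into one decision. A minimal sketch using majority voting follows; the vote labels, the minimum number of votes and the agreement ratio are assumptions, not values taken from the project.

```python
from collections import Counter


def aggregate_votes(votes, min_votes=3, min_agreement=0.66):
    """Combine click-worker votes on one detected face region.

    votes is a list of "face" / "not_face" strings. Returns the winning
    label, or "undecided" when there are too few votes or too little
    agreement (such cases could be escalated to experts).
    """
    if len(votes) < min_votes:
        return "undecided"
    label, count = Counter(votes).most_common(1)[0]
    if count / len(votes) >= min_agreement:
        return label
    return "undecided"


print(aggregate_votes(["face", "face", "not_face"]))
print(aggregate_votes(["face", "not_face"]))
print(aggregate_votes(["face", "not_face", "face", "not_face"]))
```

A "not_face" outcome corresponds to a false positive of the detector; false negatives would need a separate task in which workers mark faces the detector missed.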
Pipelining the CUbRIK components: Human input from experts
Capable of complex tasks:
In-depth knowledge of key actors
Context knowledge allows inferences
But: Different motivational models!
Public goods
Reputation
Usage for historians
No one truth in history but interpretation, context and discussion
Therefore need to represent ambivalence, contradictions and discussion
Close ties between data representation (Social graph) and their original context (primary sources)
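One way to read these requirements as a data structure: a relation in the social graph does not store a single "true" value but a list of claims, each tied to the primary source it rests on and to the person who asserted it. The sketch below is only an illustration of that idea; the class names, fields and sample values are hypothetical and do not describe the histoGraph data model.

```python
from dataclasses import dataclass, field


@dataclass
class Claim:
    statement: str    # what is asserted about the relation
    asserted_by: str  # expert or crowd member making the claim
    source_id: str    # identifier of the primary source backing it
    stance: str       # "supports" or "disputes"


@dataclass
class Relation:
    person_a: str
    person_b: str
    claims: list = field(default_factory=list)

    def is_contested(self):
        """True when at least one claim supports and one disputes."""
        stances = {claim.stance for claim in self.claims}
        return {"supports", "disputes"} <= stances

    def sources(self):
        """Primary sources a reader can go back to, without duplicates."""
        return sorted({claim.source_id for claim in self.claims})


# Illustrative data only.
relation = Relation("Person A", "Person B")
relation.claims.append(
    Claim("Both attended the same meeting", "expert-1", "photo-17", "supports")
)
relation.claims.append(
    Claim("Person B is misidentified", "expert-2", "letter-04", "disputes")
)
print(relation.is_contested())
print(relation.sources())
```

Keeping contradictory claims side by side, rather than resolving them, is the design choice that matches "no one truth in history".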
Conclusion
Challenges
What is truth? Humanities vs. Computer Science
Gathering requirements for tools that haven't been developed yet
Engaging crowds
Image copyrights
Scientific value?
Refinement of the application
Additional data sources
Improvement of the interface
Integration of the new components