Europeana GA 2016: Harnessing Crowds, Niches & Professionals in the Digital Age
TRANSCRIPT
You are doing it
Harnessing Crowds, Niches & Professionals in the Digital Age
http://lora-aroyo.org | http://slideshare.net/laroyo | @laroyo
Lora Aroyo
Take home message
software is dead! long live data!
know your data - know your crowds
adapt your core strategies
1998 2006
1 million dollar prize for best algorithm
Netflix switches to streaming
2007 1998 2006
Team BellKor wins Netflix Prize
2007 1998 2006 2009
from books to data science
2006 1994 2003 2016
data is at the centre of every process
data is essential to evolve with your users
Ceci n'est pas … la mona lisa
Louvre’s Mona Lisa is only #14
the battle of two worlds
9,3 million Louvre visitors (2014)
14 million website visitors
2,3 million social media
in the (very near) future
most visitors will be digital-born
not bound by time or location
native to new forms of co-makership
native to new media
Siebe Weide, Max Meijer and Marieke Krabshuis (2012). Agenda 2026: Study on the Future of the Dutch Museum Sector
in the (very near) future
digital & virtual positioning is critical
traceability & recognisability is key
Siebe Weide, Max Meijer and Marieke Krabshuis (2012). Agenda 2026: Study on the Future of the Dutch Museum Sector
data at the center of ALL your processes
know your data
variety of meanings
multitude of perspectives
abundance of sources
endless applications
crowdsourcing: solution at scale to know your data
know your crowds
volunteers
enthusiasts
visitors on-site
visitors online
paid crowds
in-house experts
know your crowds
understand who the different crowds are and what they can do for your collection
the expertise of Rijksmuseum professionals lies in annotating their collection
with art-historical information, e.g. when a work was created and by whom
detailed domain-specific information about depicted objects, e.g. which species the
animal or plant belongs to, is in most cases not available
use nichesourcing, i.e. niches of people with the right expertise, to add more specific information
use social media sources, e.g. Twitter, to find different types of experts in certain areas, e.g.
bird lovers or ornithologists
Engage the Niche in Cultural Heritage
http://annotate.accurator.nl
Engage with Games in Cultural Heritage
training the general crowd to be a niche: a game in which players can carry out expert
annotation tasks with some assistance
combine the crowds in a workflow: assess one crowd’s annotation quality by letting
another crowd review/rate the annotations
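A minimal sketch of such a review step, assuming each annotation from the first crowd receives numeric ratings (here 1 to 5) from a second crowd. The function name, the rating scale and both thresholds are illustrative assumptions, not part of the talk or of any tool mentioned in it.

```python
from collections import defaultdict
from statistics import mean


def review_annotations(ratings, min_reviews=2, accept_threshold=3.5):
    """Decide per annotation whether the reviewing crowd accepts it.

    ratings: iterable of (annotation_id, score) pairs from reviewers.
    min_reviews and accept_threshold are hypothetical tuning knobs.
    """
    scores_by_annotation = defaultdict(list)
    for annotation_id, score in ratings:
        scores_by_annotation[annotation_id].append(score)

    decisions = {}
    for annotation_id, scores in scores_by_annotation.items():
        if len(scores) < min_reviews:
            # Too few reviewers so far: keep the annotation in the queue.
            decisions[annotation_id] = "needs_review"
        elif mean(scores) >= accept_threshold:
            decisions[annotation_id] = "accepted"
        else:
            decisions[annotation_id] = "rejected"
    return decisions


# Made-up example ratings for three annotations.
ratings = [("tulip", 5), ("tulip", 4), ("rose", 2), ("rose", 1), ("heron", 5)]
decisions = review_annotations(ratings)
```

Averaging is only one possible aggregation; a majority vote or a reviewer-weighted score would fit the same workflow.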
engage volunteer crowds through continuous gaming
@waisda http://waisda.nl
engage volunteer crowds through continuous gaming
http://spotvogel.vroegevogels.vara.nl
Paid Crowds for Video Analysis: CrowdTruth.org
Paid Crowds for Text Analysis: CrowdTruth.org
Paid Crowds for Image Analysis: CrowdTruth.org
88% of the tags useful for specific genres
Riste Gligorov, Michiel Hildebrand, Jacco van Ossenbruggen, Guus Schreiber, Lora Aroyo (2011). On the role of user-generated metadata in audio visual collections. International Conference on Knowledge Capture (K-CAP '11), pp. 145-152
measure & assess: monitor progress
6 months / 2 years
340,551 tags / 36,981 tags
137,421 matches
602 items / 1,782 items
555 registered players / 2,017 users (taggers)
thousands of anonymous players
12,279 visits (3+ min online)
44,362 pageviews
measure & assess: evaluate content, compare crowds
user vocabulary: 8% in professional vocabulary, 23% in Dutch lexicon, 89% found on Google
locations (7%), e.g. "engeland"
persons (31%)
objects (57%)
Tags describe mainly short segments
Tags are often not very specific
Tags do not describe programmes as a whole
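A figure such as "8% in professional vocabulary" can be computed as the share of crowd tags that also occur in a reference vocabulary. The sketch below assumes the share is taken over distinct, case-normalised tags; the slides do not say how the reported percentages were calculated, and the function name and sample data are invented for illustration.

```python
def vocabulary_overlap(tags, vocabulary):
    """Return the fraction of distinct tags that also occur in the vocabulary."""
    # Normalise case and whitespace so "Engeland" and "engeland" count once.
    distinct_tags = {tag.strip().lower() for tag in tags if tag.strip()}
    if not distinct_tags:
        return 0.0
    reference_terms = {term.strip().lower() for term in vocabulary}
    return len(distinct_tags & reference_terms) / len(distinct_tags)


# Made-up sample data: four distinct crowd tags, one of them in the vocabulary.
crowd_tags = ["Engeland", "koningin", "fiets", "engeland", "boot"]
professional_terms = ["koningin", "paleis"]
share = vocabulary_overlap(crowd_tags, professional_terms)
```

Running the same function against several reference lists (professional thesaurus, general lexicon) gives comparable figures for each crowd.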
Crowdsourcing Challenges
Challenge 1: Crowdsourcing initiatives are typically undertaken in isolation
Challenge 2: It is quite difficult to strictly control the time it takes to complete crowdsourcing initiatives (across different types)
Challenge 3: Crowdsourcing initiatives demand your continuous promotional effort
Challenge 4: It is challenging for CH Institutions to incorporate crowdsourcing results into their existing content infrastructure
use any opportunity to collect or verify data through crowdsourcing …
… with a global strategy
… by user-driven augmentations of your collection
… in NextGen Cultural Heritage, driven by data & crowds
Take home message
software is dead! long live data!
know your data - know your crowds
adapt your core strategies