Improving Research Efficiency: User and Content Fingerprinting
Academic Publishing in Europe, 30 January 2013. Speaker: Kevin Cohn
Improving Research Efficiency
User and Content Fingerprinting
Kevin Cohn, Chief Operating Officer
@Atypon
Academic Publishing in Europe, Berlin, 30 January 2013
About Atypon
• Provider of Software as a Service content delivery for publishers
• Literatum platform used to deliver 15 million journal articles and 70,000 eBooks
• 1.5 billion user sessions in 2012
Thesis
• Research efficiency can be greatly improved if publishers tap into their huge volume of data to better connect users to content.
Users don’t want “advanced search...”
...but they do want relevant results.
This is the APE I’m looking for.
Data can drive this behavior.
Observations
• Relevancy is the only sort order that matters
• More than 50% of clicks are on the first result
• More than 90% of clicks are on the first page of results
• Filters and facets aren’t used
Objectives
• Give users what they want: a simple, Google-like search interface
• But use proprietary data to calculate relevancy for each individual user
Automatic Topic Modeling
Automatic Topic Modeling
• Based on a statistical model called latent Dirichlet allocation (LDA)
• Creates “topics”: collections of words that occur together with great frequency
Topic #1: {mammal, primate, hominoidea}
Topic #2: {academic, publishing, europe}
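To make the idea of topics as co-occurring words concrete, here is a minimal sketch of LDA fitted by collapsed Gibbs sampling in plain Python. It is not Atypon's implementation: the toy corpus, the function name `lda_gibbs`, and the hyperparameters `alpha` and `beta` are all assumptions chosen for illustration.

```python
import random

def lda_gibbs(docs, num_topics, iters=200, alpha=0.1, beta=0.01, seed=0):
    """Fit LDA by collapsed Gibbs sampling; returns (theta, topic_word, vocab)."""
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    word_id = {w: i for i, w in enumerate(vocab)}
    V = len(vocab)

    # Count tables maintained by the sampler.
    doc_topic = [[0] * num_topics for _ in docs]        # tokens in doc d assigned to topic k
    topic_word = [[0] * V for _ in range(num_topics)]   # times word w is assigned to topic k
    topic_total = [0] * num_topics                      # tokens assigned to topic k overall

    # Start from a random topic assignment for every token.
    assign = []
    for d, doc in enumerate(docs):
        z_d = []
        for w in doc:
            k = rng.randrange(num_topics)
            z_d.append(k)
            doc_topic[d][k] += 1
            topic_word[k][word_id[w]] += 1
            topic_total[k] += 1
        assign.append(z_d)

    for _ in range(iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                wid, k = word_id[w], assign[d][i]
                # Remove this token's current assignment from the counts.
                doc_topic[d][k] -= 1
                topic_word[k][wid] -= 1
                topic_total[k] -= 1
                # Unnormalised probability of each topic given all other assignments.
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][wid] + beta) / (topic_total[t] + V * beta)
                    for t in range(num_topics)
                ]
                k = rng.choices(range(num_topics), weights=weights)[0]
                assign[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][wid] += 1
                topic_total[k] += 1

    # Smooth and normalise the counts into one topic distribution per document.
    theta = [
        [(n + alpha) / (len(doc) + num_topics * alpha) for n in row]
        for row, doc in zip(doc_topic, docs)
    ]
    return theta, topic_word, vocab

# Toy corpus (made up): two documents about primates, two about the conference.
docs = [
    "ape mammal primate hominoidea ape primate".split(),
    "mammal primate hominoidea mammal ape".split(),
    "ape academic publishing europe ape publishing".split(),
    "academic publishing europe academic ape".split(),
]
theta, topic_word, vocab = lda_gibbs(docs, num_topics=2)
for row in theta:
    print([round(p, 2) for p in row])
```

Each row of `theta` is a probability distribution over topics for one document, which is the kind of content fingerprint the rest of the talk builds on. The word “ape” appears in all four documents, so its topic is decided by the words around it.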
Topic #1
Topic #2
Applications
• My search for “APE” returns results about this conference, not primates
• The same is true for recommendations
• Better related articles (topics 1 and 2 are not related, despite sharing “APE”)
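The “APE” example can be sketched as a re-ranking step. The slides do not say how a user fingerprint is computed, so this sketch assumes it is the average of the topic distributions of articles the user has read, and that candidates are scored by a dot product. The article titles, the numbers, and the names `user_fingerprint` and `rank` are hypothetical.

```python
def user_fingerprint(read_fingerprints):
    """Average the topic distributions of the articles a user has read."""
    n = len(read_fingerprints)
    k = len(read_fingerprints[0])
    return [sum(fp[t] for fp in read_fingerprints) / n for t in range(k)]

def rank(candidates, user_fp):
    """Order text-matched candidates by topical affinity to the user."""
    def score(item):
        _, fp = item
        return sum(u * c for u, c in zip(user_fp, fp))
    return [title for title, _ in sorted(candidates, key=score, reverse=True)]

# Hypothetical fingerprints over two topics: [primates, academic publishing].
publisher_history = [[0.10, 0.90], [0.20, 0.80], [0.05, 0.95]]
primatologist_history = [[0.90, 0.10]]

# Both articles match the query "APE" on text alone.
matches_for_ape = [
    ("Locomotion in great apes", [0.95, 0.05]),
    ("APE 2013 conference programme", [0.10, 0.90]),
]

print(rank(matches_for_ape, user_fingerprint(publisher_history)))
print(rank(matches_for_ape, user_fingerprint(primatologist_history)))
```

With these numbers, the reader with a publishing history sees the conference programme first and the primatologist sees the great-apes article first, although both typed the same query into the same simple search box.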
Not a Taxonomy/Ontology...
• Topics are self-updating = low-cost, low-maintenance
• Flat (not hierarchical) = avoids troublesome questions about classification
• Probabilistic (not binary) = better at expressing relevancy to topics
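The “probabilistic, not binary” point can be illustrated by comparing what a single category label retains with what a full topic distribution retains. This is a sketch under assumptions: the three topic distributions are invented, and the Hellinger distance is only one reasonable way to compare distributions, not necessarily the measure used in the platform.

```python
import math

def hellinger(p, q):
    """Hellinger distance between two discrete distributions (0 = identical, 1 = disjoint)."""
    return math.sqrt(0.5 * sum((math.sqrt(a) - math.sqrt(b)) ** 2 for a, b in zip(p, q)))

def binary_label(p):
    """What a single-category classification keeps: the index of the top topic only."""
    return max(range(len(p)), key=p.__getitem__)

# Hypothetical topic distributions over [primates, academic publishing, genetics].
primate_genomics = [0.55, 0.05, 0.40]
primate_behaviour = [0.90, 0.05, 0.05]
conference_report = [0.05, 0.90, 0.05]

# A binary label cannot tell the two primate articles apart.
print(binary_label(primate_genomics) == binary_label(primate_behaviour))

# The distributions can: close, but not identical, and both far from the conference report.
print(hellinger(primate_genomics, primate_behaviour))
print(hellinger(primate_genomics, conference_report))
```

Under a binary scheme both primate articles land in the same category, and the genetics content of the first one is lost. The distributions keep that secondary topic, so relevancy can be graded rather than all-or-nothing.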
...But Is Helped by Them
• Topics are “collections of words that occur together with great frequency”
• It helps to know that “APE” is an acronym for “Academic Publishing in Europe”
• It helps to know that “CC0” and “CC BY” are Creative Commons license types
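One way such knowledge could feed a topic model is at tokenization time: known terms are merged into single tokens before word co-occurrence is counted. The sketch below assumes a small hand-written vocabulary; the `VOCABULARY` mapping and the `tokenize` function are hypothetical and are not taken from the talk.

```python
import re

# Hypothetical controlled vocabulary: surface form -> canonical token.
VOCABULARY = {
    "academic publishing in europe": "academic_publishing_in_europe",
    "cc by": "cc_by",
    "cc0": "cc0",
}

def tokenize(text, vocabulary=VOCABULARY):
    """Lower-case, merge known multi-word terms into single tokens, then split."""
    text = text.lower()
    # Replace longer phrases first so shorter ones cannot break them up.
    for phrase in sorted(vocabulary, key=len, reverse=True):
        pattern = r"\b" + re.escape(phrase) + r"\b"
        text = re.sub(pattern, vocabulary[phrase], text)
    return re.findall(r"[a-z0-9_]+", text)

print(tokenize("Articles under CC BY at Academic Publishing in Europe"))
# ['articles', 'under', 'cc_by', 'at', 'academic_publishing_in_europe']
```

Without the vocabulary, “CC BY” would be split into “cc” and “by”, and the conference name into four unrelated words. The bare acronym “APE” is deliberately left unmapped in this sketch, since deciding between the conference and the primate is the job of the topic model and the user's fingerprint.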
Worth Mentioning
• We didn’t invent automatic topic modeling (ATM) or LDA
• Our implementation started as a collaboration with academic researchers...
• ...and will require considerable experimentation and testing to get right
Privacy
• Usage is not personally identifiable
• Usage is not shared with third parties
• Users can opt out of personalization
Summary
• ATM uses proprietary data to calculate relevancy for each individual user
• Gives users what they want: a simple, Google-like search interface
• Improves research efficiency by freeing up searching time for reading
Thank You
Kevin Cohn, Chief Operating Officer, Atypon