© 2008 Endeca Technologies, Inc. All rights reserved.
Is Search Broken?!
Daniel Tunkelang, Chief Scientist, Endeca
howdy!
• 1992: Bachelor’s + Master’s from MIT in CS + Math
• 1998: PhD from CMU in CS (ACO program)
• 1999: Co-founded Endeca!
• 2008: ???
overview
• Who is Endeca?
• Is search broken?
• If it is, what can we do about it?
who / what is endeca?
• Software to help people explore, analyze, and understand complex information, guiding them to unexpected insights and better decisions.
• 500+ customers
• $108M revenue in 2007.
some of our customers
Is search broken?
Search has hit a wall.
search hits a wall in ecommerce
search hits a wall in knowledge management
Current Search: it outsourcing
search even hits a wall on the web
Results 1-10 out of about 344,000,000 for ir
But is search broken?
the accountants don’t think so
most users don’t think so
75
or do they?
78% wish search engines could read their minds.
What frustrates users most?
– 25%: deluge of results
– 24%: too many paid listings
– 19%: inability to understand their keywords
– 19%: disorganized / random results
The State of Search, Autobytel & Kelton Research, Oct ’07
web search vs. enterprise search
“Search on the internet is solved. I always find what I need.
But why not in the enterprise?
Seems like a solution waiting to happen.”
- a Fortune 500 CTO
Can theory help?
precision = fraction of retrieved documents that are relevant
recall = fraction of relevant documents that are retrieved
retrieved documents
relevant documents
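The two definitions above can be computed directly from sets of document ids. The following is a minimal sketch, not anything from the talk itself: the function name `precision_recall` and the document ids are assumptions made up for illustration.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for two collections of document ids."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = retrieved & relevant  # documents that are both retrieved and relevant
    # Guard against empty sets so the ratios are always defined.
    precision = len(hits) / len(retrieved) if retrieved else 0.0
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical example: 4 documents retrieved, 6 relevant, 3 in common.
retrieved = ["d1", "d2", "d3", "d4"]
relevant = ["d2", "d3", "d4", "d5", "d6", "d7"]
precision, recall = precision_recall(retrieved, relevant)
print(precision, recall)  # 0.75 0.5
```

Here 3 of the 4 retrieved documents are relevant (precision 0.75), but only 3 of the 6 relevant documents were retrieved (recall 0.5).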
the truth,
nothing but the truth
why improve precision?
the whole truth,
why improve recall?
the truth, the whole truth, nothing but the truth
what we want…
recall
precision
but there is a trade-off…
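One common way to see the trade-off is to take a ranked result list and cut it off at different depths. The sketch below assumes a made-up ranking and relevance judgments; `precision_recall_at_k` is a hypothetical helper, not something named in the talk.

```python
def precision_recall_at_k(ranking, relevant, k):
    """Precision and recall when only the top k ranked documents are returned."""
    top_k = ranking[:k]
    hits = sum(1 for doc in top_k if doc in relevant)
    precision = hits / len(top_k) if top_k else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical ranking of 8 documents, 4 of which are relevant.
ranking = ["d1", "d2", "d3", "d4", "d5", "d6", "d7", "d8"]
relevant = {"d1", "d2", "d4", "d7"}

# One (k, precision, recall) row per cutoff.
curve = [(k, *precision_recall_at_k(ranking, relevant, k)) for k in (1, 2, 4, 8)]
for k, p, r in curve:
    print(f"k={k}: precision={p:.2f} recall={r:.2f}")
```

In this toy data, returning more documents raises recall from 0.25 to 1.00 while precision falls from 1.00 to 0.50.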
Precision…to avoid annoying users with irrelevant results?
which should we favor?
Recall…to make sure we don’t throw away results the user wants / needs?
Enough stalling…what’s the answer?!
depends on what you want
vs.
you get what you pay for
• There are easy use cases…
– 30% of queries are navigational.
– 30% of queries lead to Wikipedia pages.
– Users won’t pay, but advertisers will!
• …and hard use cases.
– Queries where recall matters.
– Exploratory search.
– Enterprises will pay for insight.
Great, bring on the insight!
technology alone can’t provide insight
• The system can’t read your mind.
• Your spouse / best friend can’t read your mind.
• Sometimes you can’t read your own mind.
So should we just give up?
technology is a catalyst
• Computers are good at analysis.
• People are good at using what they know.
• How do we get the best of both worlds?
with apologies to luis von ahn
human-computer information retrieval
• Instead of guessing the user’s intent, optimize communication.
• De-emphasize the top ten documents; response is a set of documents.
• Think beyond single queries; support refinement and exploration.
recall
precision
hcir cheats the trade-off
But how do we implement HCIR?
endeca's approach: guided summarization
• Set retrieval that responds to queries with
– an overview of the user's current context.
– an organized set of options for incremental exploration.
• Contextual summaries of document sets optimize system’s communication with user.
• Query refinement options optimize user’s communication with system.
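The bullets above describe a response that is a set plus a summary of that set, rather than a ranked top ten. A minimal sketch of that idea follows. It is not Endeca's implementation; the catalog records, the facet names, and the `search` function are all assumptions invented for illustration.

```python
from collections import Counter

# Hypothetical catalog; records and facet names are made up for illustration.
CATALOG = [
    {"id": 1, "title": "garment steamer", "category": "Irons & Steamers", "brand": "Acme"},
    {"id": 2, "title": "vegetable steamer", "category": "Kitchen Steamers", "brand": "Acme"},
    {"id": 3, "title": "facial steamer", "category": "Sauna & Spas", "brand": "Bolt"},
    {"id": 4, "title": "steam iron", "category": "Irons & Steamers", "brand": "Bolt"},
]
FACETS = ("category", "brand")


def search(query, selections=None):
    """Return the matching set, its size, and per-facet refinement counts."""
    selections = selections or {}
    # Set retrieval: substring match on the title plus any facet selections.
    matches = [
        rec for rec in CATALOG
        if query.lower() in rec["title"].lower()
        and all(rec[facet] == value for facet, value in selections.items())
    ]
    # Summary of the current set: how it splits along each unselected facet.
    refinements = {
        facet: Counter(rec[facet] for rec in matches)
        for facet in FACETS
        if facet not in selections
    }
    return {
        "count": len(matches),
        "ids": [rec["id"] for rec in matches],
        "refinements": refinements,
    }


first = search("steamer")                      # overview of the whole result set
second = search("steamer", {"brand": "Acme"})  # one incremental refinement
print(first["count"], dict(first["refinements"]["brand"]))
print(second["ids"], dict(second["refinements"]["category"]))
```

The counts in `refinements` play the role of the contextual summary (system to user), and passing one of them back as a selection plays the role of the refinement (user to system).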
guided summarization for ecommerce
Matching Categories include:
Appliances > Small Appliances > Irons & Steamers
Appliances > Small Appliances > Microwaves & Steamers
Bath > Sauna & Spas > Steamers
Kitchen > Bakeware & Cookware > Cookware > Open Stock Pots > Double Boilers & Steamers
Kitchen > Small Appliances > Steamers
guided summarization for KM
Guided summarization starts with faceted search.
facets 101
But faceted search isn’t enough…
showing the right facets: microwaves
vs.
showing the right facets: ceiling fans
traditional topic taxonomy
dynamic topic facet
Subject
Electronic data processing (1002)
Distributed processing (937)
Parallel processing (619)
Computer networks (562)
Fault-tolerant computing (365)
Show more…
Subject
Artificial intelligence (227)
Automatic theorem proving (9)
Client/server computing (185)
Computer algorithms (110)
Computer architecture (162)
Computer networks (552)
Computer programs (139)
Computer security (151)
Computer software (253)
Computers (124)
Database management (277)
Distributed processing (937)
Electronic data processing (1002)
Electronic digital computers (148)
Fault-tolerant computing (365)
High performance computing (244)
History (11)
Information technology (145)
Java (77)
Law and legislation (70)
Logic, Symbolic and mathematical (16)
Mathematics (70)
Mobile communication systems (54)
Operating systems (87)
Parallel processing (619)
Research (83)
Software engineering (197)
Supercomputers (139)
Web databases (54)
Wireless communication systems (97)
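One plausible reading of the two Subject lists is that the short view shows the most frequent values for the current result set and hides the rest behind "Show more…". The sketch below assumes that reading; `top_facet_values` is a hypothetical helper, and the counts are a subset copied from the expanded list on the slide.

```python
# Subset of the Subject counts from the expanded list on the slide.
SUBJECT_COUNTS = {
    "Computer networks": 552,
    "Computer software": 253,
    "Database management": 277,
    "Distributed processing": 937,
    "Electronic data processing": 1002,
    "Fault-tolerant computing": 365,
    "High performance computing": 244,
    "Parallel processing": 619,
}


def top_facet_values(counts, k=5):
    """Return the k most frequent facet values and whether more are hidden."""
    # Sort by descending count, breaking ties alphabetically for stable output.
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return ranked[:k], len(ranked) > k


shown, has_more = top_facet_values(SUBJECT_COUNTS)
for name, count in shown:
    print(f"{name} ({count})")
if has_more:
    print("Show more…")
```

Because the counts come from the current result set, the same facet would surface different values after each query or refinement, which is what makes it dynamic rather than a fixed taxonomy.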
facets populated using entity extraction
apple production
cutting through facets to show the big picture
Search: storage
summarization: more than search and browse
guided summarization – a summary
Guided summarization enables a dialog between the user and the data,
enabling exploration and discovery.
The Moral
think outside the box
• Search works for many use cases.
• But not for some of the most valuable ones.
• Focus on human-computer information retrieval.
One More Thing
maybe we should treat search as a game
thank you
Questions?