Minneapolis Solr Meetup - May 28, 2014: eCommerce Search with Apache Solr
DESCRIPTION
"eCommerce Search with Apache Solr", Grant Ingersoll
TRANSCRIPT
Confidential and Proprietary © Copyright 2014Confidential and Proprietary © Copyright 2013
eCommerce Search with Apache Solr
Grant Ingersoll
CTO, LucidWorks
Twitter: @gsingers
Tales from the trenches
• The case of the missing data
• The power of suggestion
Topics
• Solr powered commerce
  – Companies
  – Features
• Relevance, relevance, relevance
• Demo
Solr Powers Leading eCommerce and Consumer Sites
Basic Features for eCommerce
• High quality OOTB relevance
• Facets
  – Range, Term/Category, Hierarchical, Pivot
• Highlighting
• Did you mean?
• Boosting/Blocking/Landing Pages
• Easy scale
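As an illustration of the facet types listed above, here is a minimal sketch of the request parameters for a faceted product search. The field names (category, brand, price, name) are hypothetical and would have to exist in your schema, and spellcheck only works if the component is configured on the request handler.

```python
from urllib.parse import urlencode, parse_qs


def facet_params(user_query):
    """Build Solr request parameters for a faceted product search.

    Field names (category, brand, price, name) are hypothetical.
    """
    return [
        ("q", user_query),
        ("facet", "true"),
        # Term/Category facet: one count per distinct category value.
        ("facet.field", "category"),
        # Range facet: price buckets of 50 between 0 and 500.
        ("facet.range", "price"),
        ("facet.range.start", "0"),
        ("facet.range.end", "500"),
        ("facet.range.gap", "50"),
        # Pivot facet: brand counts nested under each category.
        ("facet.pivot", "category,brand"),
        # Highlighting on the product name.
        ("hl", "true"),
        ("hl.fl", "name"),
        # "Did you mean?" suggestions via the spellcheck component.
        ("spellcheck", "true"),
    ]


# The encoded string is appended to a select URL such as
# http://localhost:8983/solr/<collection>/select? (host and path are assumptions).
query_string = urlencode(facet_params("women's shoes"))
print(query_string)
```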
Advanced Features
• Spatial
  – Local
  – Route finding
  – Open Hours, etc.
• Function Queries
  – Inventory, Margin
• Stats Component
  – Missing data
  – Bounds, etc.
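The three features above can be combined in one request. The sketch below assumes hypothetical fields (store_location, inventory, margin, price, name), and the sample response is hand-written to show the shape of the stats section, not real output.

```python
def advanced_params(user_query, lat, lon, radius_km):
    """Request parameters combining spatial filtering, a function-query
    boost, and the stats component. All field names are hypothetical."""
    return {
        "q": user_query,
        "defType": "edismax",
        "qf": "name",
        # Spatial: keep only products stocked within radius_km of the shopper.
        "fq": "{!geofilt}",
        "sfield": "store_location",
        "pt": f"{lat},{lon}",
        "d": str(radius_km),
        # Function query: additive boost from inventory times margin.
        "bf": "product(inventory,margin)",
        # Stats component: min/max/count/missing for the price field.
        "stats": "true",
        "stats.field": "price",
    }


def missing_count(solr_response, field):
    """Number of matching documents that have no value for the field."""
    return solr_response["stats"]["stats_fields"][field]["missing"]


# Hand-written example of the stats section of a JSON response.
sample_response = {
    "stats": {
        "stats_fields": {
            "price": {"min": 9.99, "max": 249.0, "count": 980, "missing": 20}
        }
    }
}

params = advanced_params("running shoes", 44.98, -93.27, 25)
print(params["pt"], missing_count(sample_response, "price"))
```

A non-zero missing count is a quick pointer to catalog records without a price, which is one way to chase down missing data before it hurts ranking or faceting.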
Tips and Tricks
Look Before You Leap
• Before undertaking any relevance tuning, you need to define what “better search” means to you
• Once determined, many ways to test/measure
• Once tested, many ways to fix
http://www.betternetworker.com/files/useruploads/16675/leap.jpg
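One of the many ways to measure is precision at k over a set of judged queries. This is a minimal sketch, not a method prescribed in the talk. The document IDs and relevance judgments are made up for illustration.

```python
def precision_at_k(ranked_ids, relevant_ids, k):
    """Fraction of the top-k results that were judged relevant."""
    if k <= 0:
        raise ValueError("k must be positive")
    top_k = ranked_ids[:k]
    hits = sum(1 for doc_id in top_k if doc_id in relevant_ids)
    return hits / k


# Made-up ranking and judgments for a single query.
ranking = ["a", "b", "c", "d"]
judged_relevant = {"a", "c"}
print(precision_at_k(ranking, judged_relevant, 2))
```

Tracking a number like this before and after each tuning change gives "better search" a concrete meaning.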
Understand your…
• Domain
  – Types of documents
  – Languages present
  – Document structures, metadata and other features
  – Lexical resources: jargon, synonyms, abbreviations...
  – Relationships between documents
• Users
  – Sophistication/Expertise
  – Search and Discovery needs
  – Known Item vs. Keyword
• Tolerance for Pain
  – Managers
  – Business Interests
  – Release cycles
  – Obsession in finding the one true relevance model (hint, it doesn’t exist)
  – “explain() blindness”
Known Item vs. Keyword
eCommerce search often has a split between known item and keyword search
You probably have more “wiggle” room for relevancy on keyword search
E.g. What should be the top result for a search on “women’s shoes”?
Known-item search should put the best matches at the top
More in a moment
Debugging
• Check the analysis (more in the next slide)
• Check for data quality issues
• Check your query constructs (slop, boosts, etc.)
• Try alternate query representations
  – (exact match)^100 OR (sloppy phrase match)^50 OR (OR query)
• Use Lucene’s explain() or Solr’s &debugQuery
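The layered query representation above can be sketched as a small query builder. This is one reading of it, with assumptions: the field name is hypothetical, the search terms are plain words with no Lucene special characters, and the "exact match" clause is approximated by a phrase query on the same field.

```python
def layered_query(field, text, slop=3):
    """Build (exact)^100 OR (sloppy phrase)^50 OR (OR query) in Lucene syntax.

    Assumes the terms in text contain no Lucene special characters.
    """
    terms = text.split()
    # Escape backslashes and double quotes so the phrase stays well-formed.
    phrase = '"' + text.replace("\\", "\\\\").replace('"', '\\"') + '"'
    exact = f"({field}:{phrase})^100"
    sloppy = f"({field}:{phrase}~{slop})^50"
    any_term = f"({field}:(" + " OR ".join(terms) + "))"
    return " OR ".join([exact, sloppy, any_term])


def debug_params(field, text):
    """Request parameters that also ask Solr for scoring explanations."""
    return {"q": layered_query(field, text), "debugQuery": "true"}


print(layered_query("name", "red shoes"))
# (name:"red shoes")^100 OR (name:"red shoes"~3)^50 OR (name:(red OR shoes))
```

With debugQuery=true the response includes a per-document explanation of the score, so you can see which of the three clauses put a product where it is.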
Signal Processing for Search and Discovery
• Signals power modern relevance
  – Clicks, conversions, sharing, history, signatures
• LucidWorks 5 makes it easy to capture and leverage signals
  – Recommendations, analytics, discovery
• Simplifies your data workflow
• Simplifies your operational footprint
Solr Powered Signal Processing
• Use Case: eCommerce
• Data:
  – Product catalog (~1.2M items)
  – Click data (~3.9M clicks)
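To make the click signal concrete, here is a minimal sketch that aggregates a click log into per-query product counts, which could then feed boosts or recommendations. It is not the LucidWorks implementation. The log format of (query, product_id) pairs and the sample rows are assumptions for illustration.

```python
from collections import Counter, defaultdict


def click_signals(click_log, top_n=3):
    """Aggregate (query, product_id) click pairs into the most-clicked
    products per normalized query."""
    counts = defaultdict(Counter)
    for query, product_id in click_log:
        # Normalize so "Shoes" and "shoes " count as the same query.
        counts[query.strip().lower()][product_id] += 1
    return {q: c.most_common(top_n) for q, c in counts.items()}


# Made-up click log rows.
clicks = [("Shoes", "p1"), ("shoes", "p1"), ("shoes", "p2"), ("boots", "p9")]
signals = click_signals(clicks)
print(signals["shoes"])
```

The aggregated counts can be indexed into Solr alongside the catalog, so that a query-time boost favors products that shoppers actually clicked for that query.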
Metadata
• http://www.lucidworks.com
• [email protected]
• @gsingers
• Lucene/Solr Revolution
  – Washington DC, Nov 11-14
  – http://www.lucenerevolution.org