Intro to Apache Solr
Apache Solr: Introduction & Demo
Agenda
• What is Apache Solr?
• Start/stop Solr
• Indexing data to Solr
• Searching data
• Running a SolrCloud cluster
• Hacking Solr
What is Apache Solr?
• Lucene-based search server plus additional features
• Access Lucene over HTTP:
  • From Java, Python, Ruby, .NET, PHP, using XML, JSON and other formats
• Faceting (guided navigation), suggestions, highlighting etc.
• Replication and distributed search
• Lucene best practices
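Because Solr answers over HTTP with JSON (among other formats), a client only needs an HTTP library and a JSON parser. The sketch below parses a hypothetical response body shaped like Solr's JSON output; the field names (`title`, `category`) and all values are invented for illustration. It assumes Solr's default JSON layout, in which facet counts arrive as a flat list of alternating values and counts.

```python
import json

# Hypothetical response body, shaped like Solr's JSON output (wt=json).
# Field names and numbers are invented for illustration.
SAMPLE_RESPONSE = """
{
  "response": {"numFound": 2, "start": 0, "docs": [
    {"id": "1", "title": "android phone"},
    {"id": "2", "title": "red shoes"}
  ]},
  "facet_counts": {"facet_fields": {"category": ["phone", 1, "shoes", 1]}}
}
"""


def doc_ids(body):
    """Return the ids of the matching documents in the response."""
    return [doc["id"] for doc in json.loads(body)["response"]["docs"]]


def facet_counts(body, field):
    """Turn the flat [value, count, value, count, ...] list into a dict."""
    flat = json.loads(body)["facet_counts"]["facet_fields"][field]
    return dict(zip(flat[0::2], flat[1::2]))


print(doc_ids(SAMPLE_RESPONSE))
print(facet_counts(SAMPLE_RESPONSE, "category"))
```

The facet dictionary is what a guided-navigation sidebar would typically render: one entry per value, with the number of matching documents next to it.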
Running Solr
• Extract:
  • tar xvf solr-5.1.0.tgz (linux/mac)
  • unzip solr-5.1.0.zip or click+extract (windows)
• Run:
  • ./bin/solr start -e schemaless
  • ./bin/solr start -e schemaless -p 8983 (explicit port; 8983 is the default)
  • ./bin/solr -help
  • ./bin/solr start -help
• Stop:
  • ./bin/solr stop
Indexing data
• ./bin/post script
• Using curl directly
• Using the Admin UI
• SolrJ and other indexing clients
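All of these indexing routes end up sending documents to Solr's update handler over HTTP. As a rough Python equivalent of the curl approach, the sketch below builds a JSON update request. It assumes Solr runs on localhost:8983 with a core or collection named `gettingstarted`; both are assumptions to adjust for your setup, and the documents are made up.

```python
import json
import urllib.request

# Assumptions: host, port and the "gettingstarted" name depend on your setup.
SOLR_URL = "http://localhost:8983/solr/gettingstarted"


def build_update_request(docs, base_url=SOLR_URL):
    """Build (but do not send) a POST that adds docs and commits them."""
    return urllib.request.Request(
        url=base_url + "/update?commit=true",
        data=json.dumps(docs).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request):
    """Send the request; this needs a running Solr instance."""
    with urllib.request.urlopen(request) as response:
        return response.status


# Hypothetical documents, for illustration only.
docs = [
    {"id": "1", "title": "android phone"},
    {"id": "2", "title": "red shoes"},
]
request = build_update_request(docs)
print(request.get_method(), request.full_url)
```

With a Solr instance running, `send(request)` would post the documents. `commit=true` makes them searchable right away, which is convenient for a demo but usually too expensive to do on every request under real load.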
Demo time
Inverted index
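An inverted index maps each term to the documents that contain it, so a search becomes a lookup plus a set operation instead of a scan over every document. The toy Python sketch below shows only that core idea, with made-up documents; Lucene's real index adds text analysis, term positions, scoring and compact on-disk structures.

```python
from collections import defaultdict


def build_inverted_index(docs):
    """Map each lowercased term to the sorted list of doc ids containing it."""
    postings = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.lower().split():
            postings[term].add(doc_id)
    return {term: sorted(ids) for term, ids in postings.items()}


def search_all(index, *terms):
    """Return ids of documents containing every term (an AND query)."""
    sets = [set(index.get(term.lower(), [])) for term in terms]
    return sorted(set.intersection(*sets)) if sets else []


# Made-up documents: id -> text.
docs = {1: "red shoes", 2: "blue shoes", 3: "red android phone"}
index = build_inverted_index(docs)

print(index["red"])                       # [1, 3]
print(search_all(index, "red", "shoes"))  # [1]
```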
Lucene/Solr query syntax
• +red +shoes = red AND shoes
• +shoes -red = shoes NOT red
• "android phone"
• "android phone" -samsung = "android phone" NOT samsung
• "android samsung"~4
• merced*
• createDate:[201301 TO 201401]
• author:shalin
• author:"shalin mangar"
• author:"shalin mangar" AND project:(lucene OR solr)
• title:samsung^5 category:phone
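When such queries are sent over HTTP they must be URL-encoded: a raw `+` in a URL is decoded as a space, so `+red +shoes` would lose its operators. The Python sketch below builds a search URL for the `/select` handler with the standard library. The host, port and `gettingstarted` name are assumptions; `q`, `rows` and `wt` are regular Solr request parameters.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Assumptions: host, port and the "gettingstarted" name depend on your setup.
SOLR_URL = "http://localhost:8983/solr/gettingstarted"


def select_url(query, rows=10, base_url=SOLR_URL):
    """Build a /select URL with the query string safely URL-encoded."""
    params = {"q": query, "rows": rows, "wt": "json"}
    return base_url + "/select?" + urlencode(params)


# '+' is encoded as %2B and spaces become '+'.
print(select_url("+red +shoes"))

# Quotes, colons and parentheses are escaped as well.
url = select_url('author:"shalin mangar" AND project:(lucene OR solr)')
print(url)

# Decoding the URL gives back the original query text.
print(parse_qs(urlsplit(url).query)["q"][0])
```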
Other features of Solr
• DataImportHandler: index databases, email, RSS, XML etc.
• Rich document support: PDF, MS Office, images etc.
• Faceting, stats, analytics
• Replication for high query volume
• Production systems with billions of documents
• Very extensible and customizable
• Embedded in commercial search products from Lucidworks, DataStax, Cloudera, Hortonworks, Pivotal, Amazon CloudSearch, Riak etc.
What is SolrCloud?
• A subset of optional Solr features that enables and simplifies horizontally scaling a search index using sharding and replication
• Goals: scalability, performance, high availability, simplicity, and elasticity
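Sharding splits one logical index across several pieces, and replication keeps copies of each piece on different nodes. The Python sketch below illustrates both ideas in a deliberately simplified way. It is not SolrCloud's actual routing or placement logic: the CRC32 hash, the round-robin placement and the node names are all stand-ins chosen for illustration.

```python
import zlib


def shard_for(doc_id, num_shards):
    """Pick a shard by hashing the document id (stand-in hash: CRC32)."""
    return zlib.crc32(doc_id.encode("utf-8")) % num_shards


def place_replicas(num_shards, replication_factor, nodes):
    """Assign each shard's replicas to nodes round-robin."""
    layout = {}
    slot = 0
    for shard in range(num_shards):
        layout[shard] = []
        for _ in range(replication_factor):
            layout[shard].append(nodes[slot % len(nodes)])
            slot += 1
    return layout


# Hypothetical cluster: 2 shards, 2 copies of each, spread over 3 nodes.
nodes = ["node1", "node2", "node3"]
layout = place_replicas(num_shards=2, replication_factor=2, nodes=nodes)
print(layout)

# The same id always maps to the same shard, so updates and lookups agree.
print(shard_for("doc-42", 2) == shard_for("doc-42", 2))
```

As long as the replication factor does not exceed the number of nodes, each shard's copies land on different nodes, so losing one node still leaves every shard searchable.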
Running SolrCloud
• ./bin/solr -e cloud
• Yeah, it’s that simple!
SolrCloud demo
Hacking Solr
• http://wiki.apache.org/solr/HowToContribute
• Prerequisites:
  • git: git clone http://git-wip-us.apache.org/repos/asf/lucene-solr.git
  • github: fork and clone apache/lucene-solr
  • ant 1.8.x or above
  • Eclipse or IntelliJ IDEA (I recommend IDEA)
  • Put svn/git and ant in your $PATH or %PATH%
Hacking Solr
• ant ivy-bootstrap (required only once)
• ant idea or ant eclipse (generates a complete project that you can open in your favourite IDE)
• Find an existing Jira issue or open a new one at http://issues.apache.org/jira/browse/SOLR
• Make changes and write tests; once finished:
  • run ‘cd solr; ant server’ to build Solr, then start it via the bin/solr script
  • run ‘ant test’ (it can take a while) and ensure all tests pass
  • run ‘ant precommit’ from the checkout root and ensure it passes
• Generate a patch with ‘svn diff’ or ‘git diff’ and attach it to the Jira issue
Resources
• http://lucene.apache.org/solr
• https://cwiki.apache.org/confluence/display/solr/Apache+Solr+Reference+Guide
• https://issues.apache.org/jira/browse/SOLR
• Ask me: solr-help.slack.com
• Ask other users: [email protected]
• Ask developers: [email protected] (use sparingly)