Scaling Solr in the Cloud - OSCON Data 2011

www.nutshell.com Scaling Solr in the Cloud By @ablyler & @LindsaySnider FROM @NutshellCRM

Upload: oscon-byrum

Post on 12-Jan-2015

DESCRIPTION

Solr, an open source enterprise search server, scales very well within a single index (vertical scaling). It is when you have multiple indexes (horizontal scaling) that things start to get hairy, which happens a lot when you are hosting a cloud-based solution for multiple users. In this session we will discuss these issues, as well as the techniques for overcoming them, in depth. By: Andy Blyler & Lindsay Snider

TRANSCRIPT

Page 1: Scaling Solr in the Cloud - OSCON Data 2011

www.nutshell.com

Scaling Solr in the Cloud

By @ablyler & @LindsaySnider

FROM @NutshellCRM

Page 2: Scaling Solr in the Cloud - OSCON Data 2011

Agenda

About the Speakers

About Nutshell

Nutshell and Solr

Solr Resource Usage

Scaling Methods

Questions and Answers

Page 3: Scaling Solr in the Cloud - OSCON Data 2011

About the Speakers

BitLeap / Barracuda Networks

Developers on the Backup Appliance / Cloud

Scaled databases and storage systems

Used SugarCRM & Salesforce.com

Page 4: Scaling Solr in the Cloud - OSCON Data 2011

About Nutshell

Web and Mobile CRM application

Heavy use of open source technologies:

Gentoo

Nginx / PHP / ZendFramework / jQuery

MySQL / Solr / Gearman

Jenkins / Redmine / Cacti / Nagios

Page 5: Scaling Solr in the Cloud - OSCON Data 2011

Nutshell and Solr

Heavy use of Solr for searching, table views, and de-duplication

Used for searching / display: Accounts, Competitors, Contacts, Leads, Products, Sources, Teams, and Users

Used for de-duplication: Accounts, Contacts

Page 6: Scaling Solr in the Cloud - OSCON Data 2011

Reads vs Writes

[Chart: reads and writes over time, from Jul 20, 2011 3:00 AM to Jul 21, 2011 12:00 AM; vertical axis from 0 to 7000]

Average Read Query Time: 2.11ms

Page 8: Scaling Solr in the Cloud - OSCON Data 2011

Computer Resources

CPU / Disk I/O / RAM / Disk Storage

Pages 9-14: Scaling Solr in the Cloud - OSCON Data 2011

SOLR Resources

[Figure, built up over several slides: the four resources are placed on a scale running from Light to Heavy. They are added to the scale one at a time, in the order CPU, Disk I/O, Disk Storage, RAM.]

Page 15: Scaling Solr in the Cloud - OSCON Data 2011

Auto Provisioning

Set up new Solr Jetty app

Create MySQL database

Populate MySQL / Solr with demo data

Send welcome email

Timeline: Dark Ages, Age of Enlightenment, Modern Era, Today

Page 16: Scaling Solr in the Cloud - OSCON Data 2011

Separate Jetty per Customer

Uses a ton of memory

Separate schema / Solr for each customer

Ran into upper limit morning before launch

Pages 17-18: Scaling Solr in the Cloud - OSCON Data 2011

Auto provisioning

Separate Jetty app for each customer

[Chart: axis ticks 0, 25, 50]

Page 19: Scaling Solr in the Cloud - OSCON Data 2011

Solr Core per Customer

Allows for management of Solr on a per customer basis: creating / stopping

Contained within a single Jetty app

Shared schema between all cores

Easily managed via simple HTTP API
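To make the "simple HTTP API" point concrete, here is a minimal sketch that builds Solr CoreAdmin request URLs for creating and unloading a per-customer core. The host and port, the `customer_<id>` naming scheme, and the shared instance directory are assumptions for illustration and do not come from the talk. The URLs are only constructed, not sent.

```python
from urllib.parse import urlencode

# Assumed admin endpoint; the talk does not give host names or ports.
SOLR_ADMIN = "http://localhost:8983/solr/admin/cores"


def core_name(account_id: int) -> str:
    """Hypothetical naming scheme: one core per customer account."""
    return f"customer_{account_id}"


def create_core_url(account_id: int, instance_dir: str = "shared") -> str:
    """URL for the CoreAdmin CREATE action.

    All cores point at one instance directory (shared config and schema)
    but each gets its own data directory.
    """
    name = core_name(account_id)
    params = {
        "action": "CREATE",
        "name": name,
        "instanceDir": instance_dir,
        "dataDir": f"data/{name}",
    }
    return f"{SOLR_ADMIN}?{urlencode(params)}"


def unload_core_url(account_id: int) -> str:
    """URL for the CoreAdmin UNLOAD action, which stops serving the core."""
    params = {"action": "UNLOAD", "core": core_name(account_id)}
    return f"{SOLR_ADMIN}?{urlencode(params)}"


if __name__ == "__main__":
    print(create_core_url(42))
    print(unload_core_url(42))
```

In a real deployment these URLs would be requested by the provisioning code; error handling and authentication are left out here.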

Page 20: Scaling Solr in the Cloud - OSCON Data 2011

Fallback to MySQL

Landing page of application

Allows for graceful handling when Solr is down

Abstracted within the application library
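The fallback idea can be sketched as a small wrapper in the application library. This is an illustration only: the function names and the choice of which errors count as "Solr is down" are assumptions, and the two search callables stand in for the real Solr and MySQL clients.

```python
from typing import Callable, Dict, List

Row = Dict[str, str]


def search_with_fallback(
    query: str,
    solr_search: Callable[[str], List[Row]],
    mysql_search: Callable[[str], List[Row]],
) -> List[Row]:
    """Try Solr first; if it is unreachable, answer from MySQL instead."""
    try:
        return solr_search(query)
    except (ConnectionError, TimeoutError):
        # Solr is down or not answering: degrade gracefully rather than
        # failing the page. The MySQL path may be slower or less capable.
        return mysql_search(query)


if __name__ == "__main__":
    def solr_down(query: str) -> List[Row]:
        raise ConnectionError("Solr is not reachable")

    def from_mysql(query: str) -> List[Row]:
        return [{"name": query, "source": "mysql"}]

    print(search_with_fallback("acme", solr_down, from_mysql))
```

Because callers only see `search_with_fallback`, the rest of the application does not need to know which backend answered.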

Page 21: Scaling Solr in the Cloud - OSCON Data 2011

Sun JVM to IcedTea JVM

IcedTea JVM uses less memory than Sun JVM

Pages 22-23: Scaling Solr in the Cloud - OSCON Data 2011

Separate Solr core per customer

Fallback to MySQL for table data

Migrated from Sun JVM to IcedTea JVM

[Chart: axis ticks 0, 150, 300]

Page 24: Scaling Solr in the Cloud - OSCON Data 2011

Shared Schema Across Cores

Decreases initialization time for each core

Decreases memory usage

Page 25: Scaling Solr in the Cloud - OSCON Data 2011

Solr Index Field Selection

More indexed fields = more used memory

Only index fields that are searched

Store other non-indexed fields for display
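As an illustration of the indexed versus stored distinction, the sketch below describes a few fields and renders them as Solr `schema.xml` field definitions. The field names and types are hypothetical and are not taken from Nutshell's schema.

```python
from dataclasses import dataclass
from xml.sax.saxutils import quoteattr


@dataclass(frozen=True)
class Field:
    name: str
    type: str
    indexed: bool  # searchable: adds to index size and memory use
    stored: bool   # kept verbatim so it can be returned for display


def to_schema_xml(field: Field) -> str:
    """Render one <field> element in schema.xml syntax."""
    return (
        f"<field name={quoteattr(field.name)} type={quoteattr(field.type)} "
        f'indexed="{str(field.indexed).lower()}" '
        f'stored="{str(field.stored).lower()}"/>'
    )


# Hypothetical contact fields: only the name is searched; the phone
# number is shown in result tables but never queried.
FIELDS = [
    Field("name", "text", indexed=True, stored=True),
    Field("phone", "string", indexed=False, stored=True),
]

if __name__ == "__main__":
    for f in FIELDS:
        print(to_schema_xml(f))
```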

Page 26: Scaling Solr in the Cloud - OSCON Data 2011

Splitting of Reader / Writer

Index building is CPU / disk intensive

Writer = Solr with caching disabled

Reader = Solr slave that doesn’t build indexes
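One way the application side of this split could look is sketched below: update requests go to the writer, everything else goes to the reader. The host names are placeholders, and routing on the `/update` handler path is an assumption about how the client is structured, not something stated in the talk.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SolrPair:
    writer_url: str  # master: receives updates and builds the index
    reader_url: str  # slave: serves queries from a replicated index

    def url_for(self, handler: str) -> str:
        """Send /update requests to the writer, all others to the reader."""
        is_write = handler.startswith("/update")
        base = self.writer_url if is_write else self.reader_url
        return base.rstrip("/") + handler


if __name__ == "__main__":
    pair = SolrPair(
        writer_url="http://solr-master:8983/solr",
        reader_url="http://solr-slave:8983/solr",
    )
    print(pair.url_for("/update"))
    print(pair.url_for("/select"))
```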

Pages 27-33: Scaling Solr in the Cloud - OSCON Data 2011

[Architecture diagram, built up over several slides. Page 27 shows MySQL (master-master), Web, and Solr Slave. Pages 28-30 add Solr Master and Gearman workers, labelled as range indexers and document indexers. Pages 31-33 show a Reindex Manager together with range indexers.]

Page 34: Scaling Solr in the Cloud - OSCON Data 2011

Shared schema across cores

Solr index field selection

Splitting of reader / writer roles

Intelligent indexing / reindexing

Page 35: Scaling Solr in the Cloud - OSCON Data 2011

[The list from page 34 is repeated alongside a chart: axis ticks 0, 500, 1000]

Page 36: Scaling Solr in the Cloud - OSCON Data 2011

Intelligent Core Management

Least recently used Solr cores are spun down

Solr cores started on login

Reindexing is a database flag, and happens on the next login
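The spin-down policy can be sketched as a small least-recently-used tracker. This is an in-memory illustration under stated assumptions: the limit on loaded cores and the class and method names are invented here, and the actual starting and stopping of cores is left to the caller.

```python
from collections import OrderedDict
from typing import List


class CoreManager:
    """Tracks loaded cores and reports which to spin down (LRU order)."""

    def __init__(self, max_loaded: int) -> None:
        self.max_loaded = max_loaded
        # Keys are core names, ordered from least to most recently used.
        self._loaded: "OrderedDict[str, bool]" = OrderedDict()

    def on_login(self, core: str) -> List[str]:
        """Mark `core` as in use; return the cores that should be spun down."""
        if core in self._loaded:
            self._loaded.move_to_end(core)  # refresh recency
        else:
            self._loaded[core] = True  # caller would start the core here
        evicted: List[str] = []
        while len(self._loaded) > self.max_loaded:
            oldest, _ = self._loaded.popitem(last=False)
            evicted.append(oldest)  # caller would stop this core
        return evicted


if __name__ == "__main__":
    manager = CoreManager(max_loaded=2)
    for customer in ["a", "b", "a", "c"]:
        print(customer, "->", manager.on_login(customer))
```

In the example run, customer "b" is the one spun down when "c" logs in, because "a" was used more recently.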

Page 37: Scaling Solr in the Cloud - OSCON Data 2011

Partitioning

Pairs of readers and writers

Partitioned based on account id:

Left pad the account id with zeros to a length of two

Reverse the account id

Take last two digits of the account id
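The key derivation can be sketched in a few lines. One caveat: the slide lists "reverse" before "take last two digits", while the deck's own example (account id 123456 giving 65) corresponds to the two least significant digits of the original id in reversed order. The sketch follows the example. The mapping from key to server pair is purely an assumption for illustration; the talk does not say how keys are assigned.

```python
from typing import List


def partition_key(account_id: int) -> str:
    """Two-digit key built from the least significant digits, reversed."""
    padded = str(account_id).zfill(2)  # left pad with zeros to length two
    return padded[::-1][:2]            # reverse, keep the two leading digits


def server_for(key: str, servers: List[str]) -> str:
    """Assumed assignment: split keys 00-99 into equal contiguous ranges."""
    return servers[int(key) * len(servers) // 100]


if __name__ == "__main__":
    pairs = ["pair-a", "pair-b", "pair-c", "pair-d"]  # hypothetical names
    for account in (123456, 7, 40):
        key = partition_key(account)
        print(account, key, server_for(key, pairs))
```

Using the fastest-changing digit as the leading digit of the key means consecutive account ids land in different parts of the key space.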

Pages 38-39: Scaling Solr in the Cloud - OSCON Data 2011

Intelligent core spin down / up

Partitioning of customers to separate Solr servers

[Chart: axis ticks 0, 4000, 8000]

Pages 40-60: Scaling Solr in the Cloud - OSCON Data 2011

[Diagram sequence, built up over pages 41 to 60. Each box is a Solr server labelled "8000 User". Page 41 shows a single server and page 42 shows seven. Pages 45 to 52 walk through an example account id, 123456: it is split into its digits 1 2 3 4 5 6, and the two-digit value 65 then appears among a row of partition keys (01, 02, 03, 04, 65, 66, 96, 97, 98, 99 are shown). Pages 53 to 60 show the partition keys alongside the servers, with further Solr servers added on later slides.]

Page 61: Scaling Solr in the Cloud - OSCON Data 2011

7000 Users

600 Users

300 Users

50 Users

auto provisioning

separate Jetty app for each customer

separate Solr core per customer

fall back to MySQL for table data

migrated from Sun JVM to IcedTea JVM

shared schema across cores

Solr index field selection

splitting reader / writer roles

intelligent core spin down / up

partitioning of customers to separate Solr servers

Scaling Solr in the Cloud

By @ablyler & @LindsaySnider

FROM @NutshellCRM