Queue Based Solr Indexing with Collection Management: Presented by Devansh Dhutia, Gannett Co.

OCTOBER 13-16, 2016 AUSTIN, TX

Uploaded by: lucidworks

Posted on: 07-Jan-2017

TRANSCRIPT

Queue Based Indexing & Collection Management

Devansh Dhutia, Platform Architect

•  National & local newspaper/media company
•  92+ markets in 33 states

Current/Future Architecture

Agenda

•  Solr @ Gannett
•  Current State
•  Collection Management
•  Queuing Solution
•  Future Work
•  Questions

Solr @ Gannett

•  Uses: Site Search, CMS Search, Analytics, Personalization
•  40+ applications: integral pillar of Gannett’s Digital Platform
•  20M+ total documents
•  800,000+ per month: growing rapidly
•  100,000+ requests per minute: highly available
•  ~100 ms average response time: extremely fast
•  8 nodes
•  256 GB memory per availability zone

Current State

•  Synchronous operations
•  Near realtime
•  Time-consuming schema changes
•  Visible outage impact

Collection Management

•  Create Collection
•  Deploy Batch Indexer
•  Index new Collection
•  Update Alias to new Collection
•  Run catch up
•  Deploy Search/Index Apps
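
The collection steps above map naturally onto Solr's Collections API. The following is a minimal sketch, not the presenter's tooling: it only builds the request URLs for the `CREATE` and `CREATEALIAS` actions and lists the slide's steps in order. The host, the collection and alias names, the config set name, and the shard and replica counts are all assumptions chosen for illustration.

```python
from urllib.parse import urlencode

# Assumed Solr admin endpoint; replace with a real node address.
SOLR_ADMIN = "http://localhost:8983/solr/admin/collections"


def create_collection_url(name, config_name, num_shards=1, replication_factor=1):
    """Build the Collections API URL that creates a new collection."""
    params = {
        "action": "CREATE",
        "name": name,
        "numShards": num_shards,
        "replicationFactor": replication_factor,
        "collection.configName": config_name,
    }
    return f"{SOLR_ADMIN}?{urlencode(params)}"


def swap_alias_url(alias, collection):
    """Build the URL that points an alias at a collection.

    CREATEALIAS on an existing alias name repoints it, so search clients
    that query the alias need no change.
    """
    params = {"action": "CREATEALIAS", "name": alias, "collections": collection}
    return f"{SOLR_ADMIN}?{urlencode(params)}"


def rollout_plan(alias, new_collection, config_name):
    """Return the slide's steps in order; URL is None for non-API steps."""
    return [
        ("create collection", create_collection_url(new_collection, config_name)),
        ("deploy batch indexer", None),
        ("index new collection", None),
        ("update alias to new collection", swap_alias_url(alias, new_collection)),
        ("run catch up", None),
        ("deploy search/index apps", None),
    ]


if __name__ == "__main__":
    # Hypothetical names, for illustration only.
    for step, url in rollout_plan("articles", "articles_v2", "articles_conf"):
        print(step, url or "(manual or deployment step)")
```

Querying through an alias is what makes the swap cheap: applications keep using the alias name while the collection behind it changes.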

Realtime Changes / Queries

Prep Alternate Collection

Deploy

Outage Problems

•  Spinning wheel
•  Duplicate content
•  Unable to find new content
•  Frustrated editors
•  UX & other presentation layers

Enter Queues

•  Asynchronous write operations
•  Near realtime
•  Faster schema changes
•  Auto scale indexing workers
•  Low authoring outage impact
•  Eventually consistent

Queue Based Indexing

RabbitMQ

•  Clustered & highly available
•  FIFO
•  Pub/sub model
•  Consistent Hash / Multiple Queues
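
To illustrate the "Consistent Hash / Multiple Queues" bullet, here is a small in-process sketch of a hash ring that maps a document id to one of several queues. It does not use RabbitMQ; the queue names, the `HashRing` class, and the number of ring points are hypothetical. One plausible reason for this kind of routing (an assumption, not stated on the slide) is that all updates for a given document land on the same FIFO queue, so they are processed in order.

```python
import bisect
import hashlib


def _hash(value: str) -> int:
    """Stable integer hash (Python's built-in hash() is salted per process)."""
    return int(hashlib.md5(value.encode("utf-8")).hexdigest(), 16)


class HashRing:
    """Minimal consistent-hash ring mapping keys to queue names."""

    def __init__(self, queues, points_per_queue=100):
        # Each queue gets several points on the ring to even out the spread.
        self._ring = sorted(
            (_hash(f"{queue}#{i}"), queue)
            for queue in queues
            for i in range(points_per_queue)
        )
        self._keys = [point for point, _ in self._ring]

    def queue_for(self, doc_id: str) -> str:
        """Return the queue owning the first ring point after the key's hash."""
        idx = bisect.bisect(self._keys, _hash(doc_id)) % len(self._ring)
        return self._ring[idx][1]


if __name__ == "__main__":
    ring = HashRing(["index.q0", "index.q1", "index.q2"])  # hypothetical names
    for doc_id in ("story-1001", "story-1002", "gallery-77"):
        print(doc_id, "->", ring.queue_for(doc_id))
```

With a ring, adding a queue only moves the keys that the new queue takes over; every other document keeps its existing queue.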

Components

•  Realtime Queue
•  Batch Queue
•  Prep Queue
•  Deadletter Queue
•  Indexing Service
•  Prep mode
•  Batch Push Service
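
How some of these components might fit together can be sketched with in-memory queues. This is an illustration under stated assumptions, not the system described in the talk: the `IndexingService` class here, the retry limit, and the choice to drain the realtime queue before the batch queue are all hypothetical, and the Prep Queue and Batch Push Service are left out. `index_fn` stands in for whatever actually writes a document to Solr.

```python
from collections import deque

MAX_ATTEMPTS = 3  # assumed retry limit before a message is dead-lettered


class IndexingService:
    """Toy worker that consumes realtime and batch queues."""

    def __init__(self, index_fn):
        self.realtime = deque()
        self.batch = deque()
        self.deadletter = deque()
        self._index_fn = index_fn  # callable that writes one document

    def publish(self, doc, batch=False):
        """Enqueue a document; writers return immediately (asynchronous write)."""
        target = self.batch if batch else self.realtime
        target.append({"doc": doc, "attempts": 0})

    def drain(self):
        """Process realtime messages first, then batch messages."""
        for queue in (self.realtime, self.batch):
            while queue:
                msg = queue.popleft()
                try:
                    self._index_fn(msg["doc"])
                except Exception:
                    msg["attempts"] += 1
                    if msg["attempts"] >= MAX_ATTEMPTS:
                        # Park the message for later inspection.
                        self.deadletter.append(msg)
                    else:
                        queue.append(msg)  # retry after the other messages


if __name__ == "__main__":
    indexed = []

    def fake_index(doc):
        if doc.get("bad"):
            raise ValueError("simulated indexing failure")
        indexed.append(doc["id"])

    service = IndexingService(fake_index)
    service.publish({"id": "bulk-1"}, batch=True)
    service.publish({"id": "edit-1"})
    service.publish({"id": "broken", "bad": True})
    service.drain()
    print(indexed, len(service.deadletter))
```

Because writers only enqueue, an indexing outage delays visibility of changes rather than blocking authors, which matches the "eventually consistent" trade-off named earlier.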

Future Work

•  Continuous Delivery of schema
•  Build payload in one zone only
•  Automated Deadletter handling
•  Earlier notification of potential failure

Thank you

Interested in joining our team at Gannett?

http://www.gannett.com/careers

Questions?