A Morning with MongoDB Barcelona: From Oracle to MongoDB

Pablo Enfedaque [email protected] 06.10.2012 A real use case at Telefónica PDI From Oracle to MongoDB


DESCRIPTION

http://www.10gen.com/events/MongoDB-Morning-Barcelona

TRANSCRIPT

Page 1: A Morning with MongoDB Barcelona: From Oracle to MongoDB

Pablo Enfedaque [email protected] 06.10.2012

A real use case at Telefónica PDI

From Oracle to MongoDB

Page 2: A Morning with MongoDB Barcelona: From Oracle to MongoDB

Content

01 Introduction
• Telefónica PDI. Who?
• Personalisation Server. Why? What?

02 The SQL version
• Data model and architecture
• Integrations, problems and improvements

03 The NoSQL version
• Data model and architecture
• Performance boost
• The bad

04 Conclusions
• Conclusions
• Personal thoughts

Page 3: A Morning with MongoDB Barcelona: From Oracle to MongoDB

01 Introduction

Page 4: A Morning with MongoDB Barcelona: From Oracle to MongoDB

4 Telefónica PDI

Telefónica PDI. Who?

• Telefónica
§ Fifth largest telecommunications company in the world
§ Operations in Europe (7 countries), the United States and Latin America (15 countries)

• Telefónica Digital
§ Web and mobile digital contents and services division

• Product Development and Innovation unit
§ Formerly Telefónica R&D
§ Product & service development, platforms development, research, technology strategy, user experience and deployment & operation
§ Around 70 different ongoing projects at any time

01

Page 5: A Morning with MongoDB Barcelona: From Oracle to MongoDB

5 Telefónica PDI

Personalisation Server. What?

• User profiling system

• Machine learning

• Recommendations

• Customer’s profile storage

01

Page 6: A Morning with MongoDB Barcelona: From Oracle to MongoDB

6 Telefónica PDI

Opt-in and profile module. Why?

• Users' data (profile and permissions) was scattered across different storages
§ IPTV service: gender, film and music preferences
§ Mobile service: permission to contact by SMS, gender
§ Music tickets service: address, music preferences
§ Location based offers: address, permission to contact by SMS

So you want to know my address… AGAIN?!

01

Page 8: A Morning with MongoDB Barcelona: From Oracle to MongoDB

8 Telefónica PDI

Opt-in and profile module. Why?

• Provide a module to become the master storage for customers' data
§ Single shared profile: gender, film and music preferences, permission to contact by SMS, address
§ Used by: IPTV service, mobile service, music tickets service, location based offers

01

Page 9: A Morning with MongoDB Barcelona: From Oracle to MongoDB

9 Telefónica PDI

Opt-in and profile module. What?

• Features:

§  Flexible profile definition, classified in services

§  Profile sharing options between different services

§  Real time API

§  Supplementary offline batch interface

§  Authorization system

§  High availability

§  Inexpensive solution & hardware

01

Page 10: A Morning with MongoDB Barcelona: From Oracle to MongoDB

02 The SQL solution

Page 11: A Morning with MongoDB Barcelona: From Oracle to MongoDB

11 Telefónica PDI

Data model

• Services defined a set of attributes (their profile), each with a default value and data type
• Users were registered in services
• Users defined values for some of the services' attributes
• Each attribute value had an update date, so that batch loads could not overwrite newer changes

Services, users and their profile

02
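The update-date rule above can be illustrated with a minimal sketch. The dictionary layout and the names `merge_values`, `stored` and `batch` are assumptions made for the example, not the Personalisation Server's actual code: an incoming batch value is accepted only when it is newer than what is already stored.

```python
from datetime import datetime


def merge_values(stored, incoming):
    """Return stored values updated with the incoming ones that are strictly newer.

    Both arguments map attribute name -> (value, update_date).
    """
    merged = dict(stored)
    for attrib, (value, updated) in incoming.items():
        current = merged.get(attrib)
        # Accept the batch value only if nothing is stored yet or it is newer.
        if current is None or updated > current[1]:
            merged[attrib] = (value, updated)
    return merged


stored = {
    "gender": ("F", datetime(2012, 5, 1)),
    "sms_consent": ("yes", datetime(2012, 9, 20)),
}
batch = {
    "sms_consent": ("no", datetime(2012, 9, 1)),      # older than stored: ignored
    "address": ("Barcelona", datetime(2012, 9, 25)),  # not stored yet: added
}
result = merge_values(stored, batch)
```

Here the stale `sms_consent` value from the batch file is dropped, while the new `address` value is kept.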

Page 12: A Morning with MongoDB Barcelona: From Oracle to MongoDB

12 Telefónica PDI

Data model

• Services could access attributes declared inside other services
• There were sharing rights for read, or for read and write
• The user had to be registered in both services

Services profile sharing matrix

02

Page 13: A Morning with MongoDB Barcelona: From Oracle to MongoDB

13 Telefónica PDI

Data model

• Everything that could be accessed in the PS was a resource
• Roles defined access rights (read, or read and write) to resources
• Auth users had roles
• Roles could include other roles

Authorization system

02
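Since roles could include other roles, the rights of an auth user had to be resolved transitively. A minimal sketch of that resolution follows; the role names, the `includes` / `resources` keys and the helper names are hypothetical, chosen only to illustrate the idea.

```python
def stronger(a, b):
    """Return the stronger of two rights, where 'RW' beats 'R' (None means no right)."""
    return "RW" if "RW" in (a, b) else "R"


def effective_rights(role, roles, seen=None):
    """Collect {resource: right} for a role, following included roles."""
    seen = set() if seen is None else seen
    if role in seen:  # guard against cyclic role inclusion
        return {}
    seen.add(role)
    rights = {}
    for included in roles[role].get("includes", []):
        for res, right in effective_rights(included, roles, seen).items():
            rights[res] = stronger(rights.get(res), right)
    for res, right in roles[role].get("resources", {}).items():
        rights[res] = stronger(rights.get(res), right)
    return rights


# Hypothetical role definitions
roles = {
    "viewer": {"resources": {"/profiles": "R"}},
    "operator": {"includes": ["viewer"], "resources": {"/stats": "RW"}},
    "admin": {"includes": ["operator"], "resources": {"/profiles": "RW"}},
}
admin_rights = effective_rights("admin", roles)
```

In a relational schema each hop of this recursion tends to become a join, which is consistent with the join counts reported later in the deck.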

Page 14: A Morning with MongoDB Barcelona: From Oracle to MongoDB

14 Telefónica PDI

Data model

• Multiple IDs:
§ A user's profile could be accessed with different equivalent IDs depending on the service
§ Each user ID was defined by an ID type (phone number, email, portal ID, hash…) and the ID value

Bonus features!

02

Page 15: A Morning with MongoDB Barcelona: From Oracle to MongoDB

15 Telefónica PDI

High level logical architecture

§  Everything running on Red Hat EL 5.4 64 bits

02

Page 17: A Morning with MongoDB Barcelona: From Oracle to MongoDB

17 Telefónica PDI

Integration

• PS replaces all customer profile and permissions DBs

• All systems access this data through the PS real time API

• In special cases, some PS consumers could use the batch interface

• In the same way, new services could be added quite easily

Planned integration

02

Page 18: A Morning with MongoDB Barcelona: From Oracle to MongoDB

18 Telefónica PDI

Integration

• Budget restrictions: adapting all services to use the API was too expensive

• Keep the independent systems' DBs and synchronize PS through batch

• Use the DBs' built-in massive extraction feature to generate daily batch files

• However… in most cases those DBs were not able to generate Delta (only changes) extractions
§ Provide full daily snapshots!

Problems arise

02
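To make the difference between a full snapshot and a Delta extraction concrete, here is a small sketch of how the changes could be derived from two consecutive snapshots. The slides do not say that the PS did this; the function name and the data layout (rows keyed by user ID) are assumptions for illustration.

```python
def snapshot_delta(previous, current):
    """Compare two full snapshots keyed by user ID.

    Returns the rows that are new or changed, and the IDs that disappeared.
    """
    changed = {uid: row for uid, row in current.items() if previous.get(uid) != row}
    removed = sorted(set(previous) - set(current))
    return changed, removed


yesterday = {
    "u1": {"gender": "F", "sms_consent": "yes"},
    "u2": {"gender": "M", "sms_consent": "no"},
    "u3": {"gender": "F", "sms_consent": "no"},
}
today = {
    "u1": {"gender": "F", "sms_consent": "yes"},  # unchanged
    "u2": {"gender": "M", "sms_consent": "yes"},  # changed
    "u4": {"gender": "M", "sms_consent": "no"},   # new
}
changed, removed = snapshot_delta(yesterday, today)
```

A full snapshot forces the receiver to process every row every day, which is why load time grows with total data size rather than with the number of changes.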

Page 19: A Morning with MongoDB Barcelona: From Oracle to MongoDB

19 Telefónica PDI

First version performance

• 1.8M customers, 180 profile attributes, 6 services

• Sizes
§ Tables + indexes size: 65 GB
§ 30% of the size were indexes

• Batch
§ Full DWH customer's profile import: > 24 hours
§ Delta extractions: 4 - 6 hours
§ Loads and extractions performance proportional to data size

• API
§ Response time with average traffic: 110 ms

02

Ireland

Page 20: A Morning with MongoDB Barcelona: From Oracle to MongoDB

03 The SQL solution Second version

Page 21: A Morning with MongoDB Barcelona: From Oracle to MongoDB

21 Telefónica PDI

Second version

• New approach: batch processes access the DB directly

03

High level logical architecture

Page 22: A Morning with MongoDB Barcelona: From Oracle to MongoDB

22 Telefónica PDI

Second version

• Batch processes had to

§  Validate authentication and authorization

§  Verify user, service and attribute existence

§  Check equivalent IDs

§  Validate sharing matrix rights

§  Validate values data type

§  Check the update date of the existing values

03

Batch processes

Page 23: A Morning with MongoDB Barcelona: From Oracle to MongoDB

23 Telefónica PDI

Second version 03

DB Batch processing

Our DBAs

Page 24: A Morning with MongoDB Barcelona: From Oracle to MongoDB

24 Telefónica PDI

Second version

• Preprocess the incoming batch file in the BE servers
§ Validate format, existence of services and attributes, and data types of values
§ Generate an intermediate file structured like the target DB table

• Load the intermediate file (Oracle's SQL*Loader) into a temporary table

• Switch the DB to "deferred writing", storing all incoming modifications

• Merge the temporary table and the final table, checking the update date of values

• Replace the old users' attribute values table with the merge result

• Apply the deferred writing operations

03

New DB-based batch loading process

Page 25: A Morning with MongoDB Barcelona: From Oracle to MongoDB

25 Telefónica PDI

Second version

• Generate a temporary DB table with a format similar to the final batch file. Two loops over the users' attribute values table are required:
§ Select the format of the table: number and order of columns / attributes
§ Fill the new table

• Loop over the whole temporary table for final formatting (empty fields…)

• From the batch side, loop across the whole table (SELECT * FROM …)

• Write each retrieved row as a line in the resulting file

03

New batch extraction process

Page 26: A Morning with MongoDB Barcelona: From Oracle to MongoDB

26 Telefónica PDI

Second version performance

• Batch time window: 3:30 hours
§ Full DWH load
§ Two Delta loads
§ Three Delta extractions

• API
§ Ireland requirement: < 500 ms

03

Ireland performance requirements

Page 27: A Morning with MongoDB Barcelona: From Oracle to MongoDB

27 Telefónica PDI

Second version performance

• 1.8M customers, 180 profile attributes, 6 services

• Sizes
§ Tables + indexes size: 65 GB
§ 30% of the size were indexes
§ Temporary tables size increases almost exponentially: 15 GB and above
§ Intermediate file size: from 700 MB to 7 GB

• Batch
§ Full DWH customer's profile import: 2:30 hours
§ Delta extractions: 1:00 hour
§ Loads performance worsened quickly (almost exponentially): 6:00 hours
§ Extractions performance proportional to data size
§ Concurrent batch processes may halt the DB

• API
§ Response time with average traffic: 80 ms
§ Response time while loading was unpredictable: > 300 ms

03

Ireland

Page 28: A Morning with MongoDB Barcelona: From Oracle to MongoDB

04 The SQL solution Third version

Page 29: A Morning with MongoDB Barcelona: From Oracle to MongoDB

29 Telefónica PDI

Third version 04

Speed up DB Batch processes

Our DBAs (again)

Page 30: A Morning with MongoDB Barcelona: From Oracle to MongoDB

30 Telefónica PDI

Third version

• Minor preprocessing of the incoming batch file in the BE servers
§ Just validate format
§ No intermediate file needed!

• Load the validated file (Oracle's SQL*Loader) into a temporary table

• Loop over the temporary table merging the values into the final table, checking the update date and data types of values
§ Use several concurrent writing jobs

• Store results in the real table, no need to replace!

• No "deferred writing"!

04

New (second) DB-based batch loading process

Page 31: A Morning with MongoDB Barcelona: From Oracle to MongoDB

31 Telefónica PDI

Third version

• Optimized loops to generate the temporary output table
§ Use several concurrent writing jobs
§ We achieved a speed-up of between 1.5 and 2

• Loop over the whole temporary table for final formatting (empty fields…)

• Download and write lines directly inside Oracle's sqlplus

• No SELECT * FROM … query from the batch side!

04

Enhancements to extraction process

Page 32: A Morning with MongoDB Barcelona: From Oracle to MongoDB

32 Telefónica PDI

Our DBAs

F**K YEAH

Third version performance

• 1.8M customers, 180 profile attributes, 6 services

• Sizes
§ Tables + indexes size: 65 GB
§ 30% of the size were indexes
§ Temporary tables: 15 GB

• Batch
§ Full DWH customer's profile import: 1:10 hours (vs. 2:30 hours)
§ Three Delta extractions: 2:15 hours (vs. 3:00 hours)
§ Loads and extractions performance proportional to data size
§ Concurrent batch processes not so harmful

• API
§ Response time with average traffic: 110 ms
§ Response time while loading: 400 ms

04

Ireland

Page 33: A Morning with MongoDB Barcelona: From Oracle to MongoDB

33 Telefónica PDI

Our DBAs

F**K YEAH

Third version performance

• 25M customers, 150 profile attributes, 15 services

• Sizes
§ Tables + indexes size: 700 GB
§ 40% of the size were indexes

• Batch
§ Two Delta imports: < 2:00 hours
§ Two Delta extractions: < 2:00 hours
§ Loads and extractions performance proportional to data size

• API
§ Response time with average traffic: 90 ms

04

United Kingdom

Page 34: A Morning with MongoDB Barcelona: From Oracle to MongoDB

34 Telefónica PDI

Our DBAs

F**K YEAH

Third version performance 04

Ireland | 3rd version | 2nd version
DB size | 65 GB + 15 GB (temp) | 65 GB + > 15 GB
Full DWH load | 1:10 hours | 2:30 hours
Three Delta exports | 2:15 hours | 3:00 hours
Batch stability | Stable, linear | Unstable, exponential
API response time | 110 ms | 110 ms
API while loading | 400 ms | Unpredictable

United Kingdom | 3rd version
DB size | 700 GB
Two Delta loads | < 2:00 hours
Three Delta exports | < 2:00 hours
API response time | 90 ms

Page 35: A Morning with MongoDB Barcelona: From Oracle to MongoDB

35 Telefónica PDI

Third version performance

• 20 database tables
• API: several queries with up to 35 joins and even some unions
• Authorization: 5 joins to validate auth users' access
• Batch
§ Load: 1700 lines of PL/SQL
§ Extraction: 1200 lines of PL/SQL

04

DB stats

Page 36: A Morning with MongoDB Barcelona: From Oracle to MongoDB

36 Telefónica PDI

Mission completed? 04

Page 37: A Morning with MongoDB Barcelona: From Oracle to MongoDB

37 Telefónica PDI

Third version performance

• 20M customers, 200 profile attributes, 10 services

• Mexico time window: 4:00 hours
§ Full DWH load!
§ Additional Delta feeds loads
§ At least two Delta extractions

04

Mexico

Our DBAs

Page 38: A Morning with MongoDB Barcelona: From Oracle to MongoDB

05 The NoSQL solution

Page 39: A Morning with MongoDB Barcelona: From Oracle to MongoDB

39 Telefónica PDI

MongoDB Data Model Services and their profile + sharing matrix

05

{
  _id : 7,
  service_name : "root",
  id_type : 1,
  default_values : false,
  owned_attribs : [
    { attrib_id : 70005, attrib_name : "marketing.consent", attrib_data_type : 1, attrib_def_value : "no", attrib_status : 1 },
    ...
  ],
  shared_attribs : [
    { attrib_id : 20144, sharing_mode : 0 },
    ...
  ]
}

attrib_id = service_id * 10000 + num attribs + 1
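The formula above packs the owning service into the attribute ID. A minimal sketch of encoding and decoding it follows; the function names are illustrative, and the scheme only stays unambiguous while a service has fewer than 10000 attributes.

```python
ATTRIBS_PER_SERVICE = 10000  # multiplier taken from the slide's formula


def next_attrib_id(service_id, num_attribs):
    """ID for a new attribute of a service that already owns num_attribs attributes."""
    return service_id * ATTRIBS_PER_SERVICE + num_attribs + 1


def split_attrib_id(attrib_id):
    """Recover (service_id, ordinal within the service) from an attribute ID."""
    return divmod(attrib_id, ATTRIBS_PER_SERVICE)


new_id = next_attrib_id(service_id=7, num_attribs=4)
owner, ordinal = split_attrib_id(new_id)
```

With these inputs `new_id` is 70005, the same value as the "marketing.consent" attribute of service 7 in the example document, and decoding it yields service 7, ordinal 5. The owning service can therefore be found without an extra lookup.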

Page 40: A Morning with MongoDB Barcelona: From Oracle to MongoDB

40 Telefónica PDI

MongoDB Data Model Users and their profile + multiple IDs

05

{
  _id : "011234",
  services_list : [
    { service_id : 1, reg_date : {"$date" : 1318040693000} },
    ...
  ],
  user_values : [
    { attrib_id : 10140, attrib_value : "Open", update_date : {"$date" : 1317110161000} },
    ...
  ]
}

Equivalent ID document:

{ _id : "05abcd", ue : "011234" }

_id = “id type” + “user ID”

attrib_id = service_id * 10000 + num attribs + 1
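A rough sketch of how a lookup by any equivalent ID might work with this model. Plain dictionaries stand in for the MongoDB collections. The two-digit type prefix is inferred from the example IDs, and keeping equivalent IDs apart from the user documents is an assumption of this sketch. The function names are hypothetical.

```python
def make_user_key(id_type, user_id):
    """Build the document key: _id = "id type" + "user ID".

    The two-digit, zero-padded type prefix is an assumption based on the
    example IDs "011234" and "05abcd".
    """
    return f"{id_type:02d}{user_id}"


def find_user(users, equivalents, id_type, user_id):
    """Return the user document for a main ID or for any equivalent ID."""
    key = make_user_key(id_type, user_id)
    if key in users:
        return users[key]
    alias = equivalents.get(key)  # equivalent ID document: {_id, ue}
    return users.get(alias["ue"]) if alias else None


users = {"011234": {"_id": "011234", "user_values": []}}
equivalents = {"05abcd": {"_id": "05abcd", "ue": "011234"}}

by_main_id = find_user(users, equivalents, 1, "1234")
by_equivalent_id = find_user(users, equivalents, 5, "abcd")
unknown = find_user(users, equivalents, 2, "nobody")
```

Both known IDs resolve to the same user document with at most two key lookups and no join.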

Page 41: A Morning with MongoDB Barcelona: From Oracle to MongoDB

41 Telefónica PDI

MongoDB Data Model Authorization system

05

AUTH USERS COLLECTION:

{
  _id: "admin",
  auth_pswd: "XXX",
  auth_roles: ['PS_ADMIN_ROLE', ...],
  auth_uris: [
    {uri_path: "/**", method: 'R'},
    {uri_path: "/stats/**", method: 'RW'},
    {uri_path: "/kpis/**", method: 'IMPORT'},
    ...
  ]
}

RESOURCES COLLECTION:

{ _id: "admin.**", role_uri: "/**" }

ROLES COLLECTION:

{
  _id: 'PS_ADMIN_ROLE',
  roles_resources: [
    { resource_id: "admin.**", method: 'R' },
    { resource_id: "stats.**", method: 'IMPORT' },
    ...
  ]
}

Replicate uris (from resources) and methods (from roles)
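Because URIs and methods are replicated into each auth user document, an access check needs only that one document. The sketch below shows the idea. The wildcard semantics of "/**" and the meaning of each method value ("RW" granting both read and write) are assumptions, since the slides do not define them.

```python
# Assumed meaning of the stored method values.
GRANTED_BY = {"R": {"R"}, "RW": {"R", "W"}, "IMPORT": {"IMPORT"}}


def uri_matches(pattern, path):
    """Assumed semantics: a trailing '/**' matches the prefix and everything below it."""
    if pattern.endswith("/**"):
        prefix = pattern[:-3]
        return path == prefix or path.startswith(prefix + "/")
    return pattern == path


def is_allowed(auth_user, path, method):
    """Decide access using only the auth user document (one collection access)."""
    return any(
        uri_matches(entry["uri_path"], path)
        and method in GRANTED_BY.get(entry["method"], set())
        for entry in auth_user["auth_uris"]
    )


admin = {
    "_id": "admin",
    "auth_uris": [
        {"uri_path": "/**", "method": "R"},
        {"uri_path": "/stats/**", "method": "RW"},
        {"uri_path": "/kpis/**", "method": "IMPORT"},
    ],
}
can_read_users = is_allowed(admin, "/users/011234", "R")
can_write_users = is_allowed(admin, "/users/011234", "W")
can_write_stats = is_allowed(admin, "/stats/daily", "W")
```

The price of this denormalisation is that a change to a role or resource has to be propagated to every auth user document that replicates it.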

Page 42: A Morning with MongoDB Barcelona: From Oracle to MongoDB

42 Telefónica PDI

MongoDB Data Model

• Only 5 collections
• API: typically 2 accesses (services and users collections)
• Authorization: access to only 1 collection to grant access
• Batch: all processing done outside the DB

05

DB stats
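The "typically 2 accesses" pattern can be sketched as follows, with dictionaries standing in for the services and users collections. The field names come from the example documents. The function itself, the registration check and the default-value fallback are assumptions, and the meaning of `sharing_mode` is not given in the slides, so it is ignored here.

```python
def read_attribute(services, users, service_id, user_id, attrib_id):
    """Read one profile attribute of a user on behalf of a service."""
    service = services[service_id]  # access 1: services collection
    owned = {a["attrib_id"]: a for a in service["owned_attribs"]}
    shared = {a["attrib_id"] for a in service["shared_attribs"]}
    if attrib_id not in owned and attrib_id not in shared:
        raise PermissionError(f"service {service_id} cannot read attribute {attrib_id}")

    user = users[user_id]  # access 2: users collection
    registered = {s["service_id"] for s in user["services_list"]}
    owner_id = attrib_id // 10000  # from attrib_id = service_id * 10000 + ...
    if not {service_id, owner_id} <= registered:
        raise PermissionError("user is not registered in the required services")

    for value in user["user_values"]:
        if value["attrib_id"] == attrib_id:
            return value["attrib_value"]
    # No stored value: fall back to the default declared for an owned attribute.
    return owned[attrib_id]["attrib_def_value"] if attrib_id in owned else None


services = {
    7: {
        "_id": 7,
        "owned_attribs": [{"attrib_id": 70005, "attrib_def_value": "no"}],
        "shared_attribs": [{"attrib_id": 10140, "sharing_mode": 0}],
    }
}
users = {
    "011234": {
        "_id": "011234",
        "services_list": [{"service_id": 1}, {"service_id": 7}],
        "user_values": [{"attrib_id": 10140, "attrib_value": "Open"}],
    }
}
shared_value = read_attribute(services, users, 7, "011234", 10140)
default_value = read_attribute(services, users, 7, "011234", 70005)
```

The sharing matrix and the user's registrations are embedded in the two documents, so the checks that needed joins in the SQL model become in-memory lookups.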

Page 43: A Morning with MongoDB Barcelona: From Oracle to MongoDB

43 Telefónica PDI

NoSQL version

§  Everything running on Red Hat EL 6.2 64 bits

05

High level logical architecture

Page 44: A Morning with MongoDB Barcelona: From Oracle to MongoDB

44 Telefónica PDI

NoSQL version performance

• 1.8M customers, 180 profile attributes, 6 services

• Sizes
§ Collections + indexes size: 20 GB (vs. 65 GB)
§ < 5% of the size are indexes (vs. 30%)

• Batch
§ Full DWH customer's profile import: 0:12 hours (vs. 1:10 hours)
§ Three Delta extractions: 0:40 hours (vs. 2:15 hours)
§ Loads and extractions performance proportional to data size
§ Concurrent batch processes with no performance impact

• API
§ Response time with average traffic: < 10 ms (vs. 110 ms)
§ Response time while loading: the same
§ High load (600 TPS) response time while loading: 300 ms

05

Ireland (at PDI lab)
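As a quick worked example, the batch figures for Ireland can be turned into speed-up factors. The snippet only does arithmetic on the times quoted on the slide; the helper name is illustrative.

```python
def minutes(hhmm):
    """Convert an 'H:MM' duration, as written on the slides, to minutes."""
    hours, mins = hhmm.split(":")
    return int(hours) * 60 + int(mins)


# (SQL third version, NoSQL version), as reported for Ireland
ireland = {
    "Full DWH load": ("1:10", "0:12"),
    "Three Delta exports": ("2:15", "0:40"),
}
speedups = {
    name: round(minutes(sql) / minutes(nosql), 1)
    for name, (sql, nosql) in ireland.items()
}
```

That is 70 minutes against 12 for the full load (about 5.8 times faster) and 135 minutes against 40 for the three Delta exports (about 3.4 times faster).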

Page 45: A Morning with MongoDB Barcelona: From Oracle to MongoDB

45 Telefónica PDI

NoSQL version performance

• 25M customers, 150 profile attributes, 15 services

• Sizes
§ Collections + indexes size: 210 GB (vs. 700 GB)
§ < 5% of the size were indexes

• Batch
§ Two Delta imports: < 0:40 hours (vs. 2:00 hours)
§ Loads and extractions performance proportional to data size

05

United Kingdom (at PDI lab)

Page 46: A Morning with MongoDB Barcelona: From Oracle to MongoDB

46 Telefónica PDI

NoSQL version performance

• 20M customers, 200 profile attributes, 15 services

• Sizes
§ Collections + indexes size: 320 GB
§ Indexes size: 1.2 GB

• Batch
§ Initial Full import (20M, 40 attributes): 2:00 hours
§ Small Full import (20M, 6 attributes): 0:40 hours

• API
§ Response time with average traffic: < 10 ms (vs. 90 ms)
§ Response time while loading: the same
§ High load (500 TPS) response time while loading: 270 ms

05

Mexico

Page 47: A Morning with MongoDB Barcelona: From Oracle to MongoDB

47 Telefónica PDI

Our DBAs

NoSQL version performance 04

Ireland | NoSQL version | SQL version
DB size | 20 GB | 80 GB
Full DWH load | 0:12 hours | 1:10 hours
Three Delta exports | 0:40 hours | 2:15 hours
API while loading | < 10 ms | 400 ms
API 600 TPS + loading | 300 ms | Timeout / failure

United Kingdom | NoSQL version | SQL version
DB size | 210 GB | 700 GB
Two Delta loads | < 0:40 hours | < 2:00 hours

Mexico | NoSQL version
DB size | 320 GB
Initial Full load (40 attr) | 2:00 hours
Daily Full load (6 attr) | 0:40 hours
API while loading | < 10 ms
API 500 TPS + loading | 270 ms

Page 48: A Morning with MongoDB Barcelona: From Oracle to MongoDB

48 Telefónica PDI

Mission completed? 05

Page 49: A Morning with MongoDB Barcelona: From Oracle to MongoDB

49 Telefónica PDI

The bad

• Batch load process was too fast
§ To keep secondary nodes in sync we needed an oplog of 16 or 24 GB
§ We had to disable journaling for the first migrations

• Field names in documents take up disk space
§ Reduced them to just 2 chars: "attribute_id" -> "ai"

• Respect the unwritten law of keeping at least 70% of the data size in RAM

• Take care with compound indexes: key order matters
§ You can save one index… or you can have problems
§ Put the most important key (never nullable) first

• DBAs whining and complaining about NoSQL
§ "If we had enough RAM for all data, Oracle would outperform MongoDB"

05
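The field-name point can be illustrated with a small sketch. Only the "attribute_id" -> "ai" mapping comes from the slide; the other short names are hypothetical, and JSON text length is used as a rough stand-in for the stored BSON size.

```python
import json

# Only "attribute_id" -> "ai" is from the slide; the rest are made up for the example.
SHORT_NAMES = {"attribute_id": "ai", "attribute_value": "av", "update_date": "ud"}


def shorten(value):
    """Recursively rename known long field names to their short form."""
    if isinstance(value, dict):
        return {SHORT_NAMES.get(k, k): shorten(v) for k, v in value.items()}
    if isinstance(value, list):
        return [shorten(v) for v in value]
    return value


long_doc = {
    "_id": "011234",
    "user_values": [
        {"attribute_id": 10140, "attribute_value": "Open", "update_date": 1317110161000},
        {"attribute_id": 70005, "attribute_value": "no", "update_date": 1318040693000},
    ],
}
short_doc = shorten(long_doc)
saved = len(json.dumps(long_doc)) - len(json.dumps(short_doc))
```

Every document stores its own field names, so the saving repeats for each value of each user; with millions of users and up to a couple of hundred attributes, that adds up quickly.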

Page 50: A Morning with MongoDB Barcelona: From Oracle to MongoDB

50 Telefónica PDI

The ugly

• Second migration once the PS is already running
§ Full import adding 30 new attribute values: 10:00 hours
§ Full import adding 150 new attribute values: 40:00 hours

• Considerably increasing document size (e.g. adding lots of new values to the users) makes MongoDB relocate the documents, performing around 5 times slower
§ That's a problem when you are updating 10k documents per second

• Solutions?
§ Avoid this situation at all costs. Run away!
§ Normalize users' values; move them to a new individual collection
§ Preallocate the size with a faux field (you could waste space!)
§ Load into a new collection, merge and swap, like we did in Oracle

05
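The "faux field" idea can be sketched as follows. This is only an illustration: the field name `_pad`, the 64 bytes per expected value and the helper name are assumptions, and the benefit depends on the storage engine keeping the space reserved after the field is removed, as MongoDB did when this talk was given.

```python
def with_padding(doc, expected_values, bytes_per_value=64, field="_pad"):
    """Return a copy of doc carrying a throwaway field that reserves room to grow.

    Intended use: insert the padded document, then remove the field with an
    update such as {"$unset": {"_pad": ""}} so that later growth can reuse
    the space instead of forcing the document to be moved.
    """
    padded = dict(doc)
    padded[field] = "x" * (expected_values * bytes_per_value)
    return padded


user = {"_id": "011234", "services_list": [], "user_values": []}
padded_user = with_padding(user, expected_values=150)
```

Sizing the padding is a guess about future growth: too little and documents still move, too much and disk and RAM are wasted, which is the trade-off the slide warns about.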

Page 51: A Morning with MongoDB Barcelona: From Oracle to MongoDB

06 Conclusions

Page 52: A Morning with MongoDB Barcelona: From Oracle to MongoDB

52 Telefónica PDI

Conclusions & personal thoughts

• Awesome performance boost
§ But not all use cases fit in a MongoDB / NoSQL solution!

• New technology, different limitations

• Fear of the unknown
§ SSD performance?
§ Long-term performance and stability?

• Python + MongoDB + pymongo = fast development
§ I mean, really fast

• MongoDB Monitoring Service (MMS)

• 10gen people were very helpful

06

Page 53: A Morning with MongoDB Barcelona: From Oracle to MongoDB

53 Telefónica PDI

Questions? 06

Page 54: A Morning with MongoDB Barcelona: From Oracle to MongoDB
Page 55: A Morning with MongoDB Barcelona: From Oracle to MongoDB

55 Telefónica PDI

SQL Physical architecture

§ Scale horizontally by adding more BE or DB servers, or disks in the SAN
§ Virtualized or physical servers depending on the deployment

0X

Page 56: A Morning with MongoDB Barcelona: From Oracle to MongoDB

56 Telefónica PDI

MongoDB Physical architecture

§ MongoDB arbiters running on BE servers
§ Scale horizontally by adding more BE servers or disks in the SAN
§ Sharding may already be configured, to scale by adding more replica sets

0X