Azure + DSE Powers O365 Per-User Store
1 Introduction
2 What We Built
3 How We Built It Using Cassandra + Spark
4 Why We Built It On Azure + DSE
5 How Can You Build It
6 Wrap Up
© 2015. All Rights Reserved.
Introduction
Sean Usher
Office 365
Email: [email protected]
Twitter: @seanushermsft
Silvano Coriani
Azure
Email: [email protected]
Twitter: @scoriani
Introduction – Office 365
Collaboration
Document Authoring
Social Networking
Calendaring
File Storage
Business Intelligence
Etc…
Introduction – Azure
Azure is Microsoft’s cloud computing platform, a growing collection of
integrated services—analytics, computing, database, mobile,
networking, storage, and web—for moving faster, achieving more, and
saving money.
What We Built
A way to understand our users and organizations at a deeper level!
Questions:
• Are users happy with the service they are receiving?
• Are users fully utilizing the services they are paying us for?
• Are users hitting issues that we can proactively help them with?
• How has a user’s experience been over their lifetime?
• Can we discover insights that we aren’t even aware of?
This requires ingesting and storing a lot of data. We need to be able
to perform fast, scalable analytics on that data, or we will discover
issues too late!
What We Built (contd)
Platform Requirements:
• Running in the cloud
• Highly scalable
• Initial ingestion of 50k events/sec, growing rapidly
• Millisecond latency for reads/writes
• High Availability
• Tunable Consistency
• Real-time and batch analytics
• Machine learning
• Storing aggregated data for 1+ years
One month to get it built…
How We Built It Using Cassandra + Spark (contd)
Topology:
1 Physical Datacenter (Eventually will be geo-replicated)
2 Logical Datacenters (Cassandra, Analytics[Spark])
All machines are in a virtual network (VNet) and are assigned internal static IPs.
No inbound access to Cassandra machines from outside the VNet.
[Diagram: one physical datacenter containing a VNet, which holds the Spark logical DC and the Cassandra (C*) logical DC]
Configuration (Azure G4 machines):
• Ubuntu
• 16 cores
• 224 GB RAM
• 3TB local SSD
Snitch: GossipingPropertyFileSnitch
Replication: All keyspaces use NetworkTopologyStrategy with a replication factor of 3 in each DC.
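As a rough illustration of the replication setting above, the following Python sketch builds the matching CREATE KEYSPACE statement and computes the quorum size for a replication factor of 3. The keyspace name `o365store` and the DC names `Cassandra` and `Analytics` are assumptions made for the example; the slides do not give the real names.

```python
def create_keyspace_cql(keyspace, replication_per_dc):
    """Build a CREATE KEYSPACE statement that uses NetworkTopologyStrategy.

    replication_per_dc maps a logical DC name to its replication factor.
    The names passed in are hypothetical; the deck does not list them.
    """
    parts = ["'class': 'NetworkTopologyStrategy'"]
    parts += [f"'{dc}': {rf}" for dc, rf in replication_per_dc.items()]
    body = ", ".join(parts)
    return f"CREATE KEYSPACE IF NOT EXISTS {keyspace} WITH replication = {{{body}}};"


def quorum(replication_factor):
    """Number of replicas that must respond for a QUORUM read or write."""
    return replication_factor // 2 + 1


if __name__ == "__main__":
    # Hypothetical keyspace and DC names, RF=3 in each DC as stated on the slide.
    print(create_keyspace_cql("o365store", {"Cassandra": 3, "Analytics": 3}))
    print("quorum for RF=3:", quorum(3))
```

With a replication factor of 3, a quorum is 2 replicas, so one replica in a DC can be unavailable while quorum reads and writes in that DC still succeed.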
How We Built It Using Cassandra + Spark (contd)
Node Type   Node Count   Heap Size   GC Type
Cassandra   12           12 GB       G1
Spark       12           20 GB       G1
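For a sense of scale, here is a back-of-envelope Python sketch that combines the figures quoted in this deck (50k events/sec, replication factor 3, 12 Cassandra nodes). It assumes one write per event and a perfectly even spread of replicas across nodes, neither of which the slides state.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def events_per_day(events_per_sec):
    """Total events ingested in one day at a constant rate."""
    return events_per_sec * SECONDS_PER_DAY


def replica_writes_per_node(events_per_sec, replication_factor, node_count):
    """Average replica writes per second landing on each node in one DC.

    Assumes one write per event and an even token distribution (assumptions).
    """
    return events_per_sec * replication_factor / node_count


if __name__ == "__main__":
    print(events_per_day(50_000))                  # events per day
    print(replica_writes_per_node(50_000, 3, 12))  # replica writes/sec per node
```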
How We Built It Using Cassandra + Spark (contd)
Ingestion:
[Diagram components: O365, REST API, Queue (event hub), Ingestion Worker (Azure worker role using DataStax C# driver), Cassandra]
All data is ingested into the Cassandra DC
All read APIs read from the Cassandra DC
All ingested data is PII scrubbed
APIs are the only way to get data in or out of Cassandra.
How We Built It Using Cassandra + Spark (contd)
What Data Do We Ingest?
CREATE TABLE userdatasetraw (
    userid text,
    timepk timestamp,
    device text,
    createdtime timestamp,
    errorcode text,
    errordetail text,
    omstid text,
    useragent text,
    PRIMARY KEY ((userid, timepk, device), createdtime)
) WITH CLUSTERING ORDER BY (createdtime DESC) AND
    COMPACTION = {'base_time_seconds': '50', 'class': 'DateTieredCompactionStrategy', 'max_sstable_age_days': '0.25' ….
Sample row:
userid        102033ffa4a7079e7c
timepk        1441152000000
device        8000000A
createdtime   1441215133000
errorcode     Failure
errordetail   InvalidOperation
omstid        15321c64d-0A92-4f4e7-bcc8-2aeb89354ff26
useragent     null
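In the sample row, the timepk value (1441152000000) is exactly the createdtime value (1441215133000) rounded down to midnight UTC, which suggests that timepk is a one-day time bucket that keeps each partition bounded. The Python sketch below derives the partition key under that assumption; the bucketing rule is inferred from the sample values and is not stated on the slide.

```python
MS_PER_DAY = 24 * 60 * 60 * 1000


def day_bucket_ms(created_ms):
    """Round an epoch-millisecond timestamp down to midnight UTC of its day."""
    return created_ms - created_ms % MS_PER_DAY


def partition_key(userid, created_ms, device):
    """Composite partition key (userid, timepk, device) as in userdatasetraw.

    Assumes timepk is the day bucket of createdtime (inferred, not confirmed).
    """
    return (userid, day_bucket_ms(created_ms), device)


if __name__ == "__main__":
    # Values taken from the sample row above.
    print(partition_key("102033ffa4a7079e7c", 1441215133000, "8000000A"))
```

All events for one user and device on the same day share a partition, and createdtime orders the rows within it.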
How We Built It Using Cassandra + Spark (contd)
What Data Do We Ingest?
CREATE TABLE tenantsubscription (
    omstid text,
    skuid text,
    city text static,
    comculture text static,
    exchgtid text static,
    exousercnt int static,
    geoareacd text static,
    isconcierge boolean static,
    istrial boolean,
    liccount int,
    lyousercnt int static,
    numtrailsubs int,
    orgname text static,
    sku text,
    spousercnt int static,
    statechangedt timestamp,
    subautorenew boolean,
    subcreated timestamp,
    subexpiry timestamp,
    substate text,
    tdssynctime timestamp,
    usedliccount int,
    PRIMARY KEY ((omstid), skuid)
)
Sample row (selected columns):
omstid         129001cd1-21dc-4706-96cfe-1f632522d365ae
skuid          6fe22a85e-b296-42f0-b187-1b91e9394b900
city           BANGKOK
exousrcnt      1000
liccnt         1500
subautorenew   false
sku            OFFICE 365 ENTERPRISE E3
How We Built It Using Cassandra + Spark (contd)
Aggregations
Spark batch jobs roll up UserDatasetRaw into 1 hour and 24 hour aggregates.
Other jobs join the 1 hour and 24 hour aggregates with the tenant
subscription and dataset tables to calculate insights.
Result:
1. Great feedback from customers and support agents!
2. Saved customers money!
3. Saved customers from being locked out of their service!
4. Proactively fixed user experience issues (detected customer misconfiguration)!
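The hourly roll-up mentioned on this slide can be pictured with a minimal plain-Python sketch. It is not the Spark job itself: it counts raw events per (userid, hour, errorcode), and the choice of grouping key and of a simple count as the aggregate are assumptions made for illustration.

```python
from collections import Counter

MS_PER_HOUR = 60 * 60 * 1000


def hourly_rollup(events):
    """Count events per (userid, hour start in epoch ms, errorcode).

    Each event is a dict with 'userid', 'createdtime' (epoch ms) and
    'errorcode' keys, mirroring columns of userdatasetraw.
    """
    counts = Counter()
    for event in events:
        hour_start = event["createdtime"] - event["createdtime"] % MS_PER_HOUR
        counts[(event["userid"], hour_start, event["errorcode"])] += 1
    return dict(counts)


if __name__ == "__main__":
    raw = [
        {"userid": "u1", "createdtime": 1441215133000, "errorcode": "Failure"},
        {"userid": "u1", "createdtime": 1441215134000, "errorcode": "Failure"},
        {"userid": "u1", "createdtime": 1441218733000, "errorcode": "Failure"},
    ]
    for key, count in sorted(hourly_rollup(raw).items()):
        print(key, count)
```

A 24 hour aggregate follows the same pattern with a one-day bucket, and can be computed from the hourly rows rather than from the raw events.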
How We Built It Using Cassandra + Spark (contd)
What Problems Did We Run Into?
• Bad Data Modelling:
• Partitions getting too large (1-2 GB) which raised the “compacting large row”
warning and led to OOM errors (Cassandra 2.0).
• Not using DateTieredCompactionStrategy for time-series data.
• Bad Configuration:
• OS limits configured too low (ulimit, nofiles, etc…)
• Number of concurrent compactors and flush writers too low.
• Not using G1 garbage collection with large heaps.
• Not paying enough attention to blocked flush writers and dropped mutations.
• Allowing SSTable count to get too high, causing OOM errors.
Why We Built It On Azure + DSE
Why Azure?
• We didn’t want to manage bare metal and the overhead it brings.
• Easy to add capacity without ordering hardware and rack
space.
• We have used Azure for other services for ~5 years.
• We have built tools for deploying and managing services.
• Great track record with Azure support.
• We love to try out new things on the Azure platform!
Why We Built It On Azure + DSE (contd)
Why Cassandra?
The Good
• Low Latency ✓
• Linear Scale ✓
• Highly Available ✓
• Aggregations (Spark/Spark Streaming) ✓
• Machine Learning (Spark ML) ✓
• No Enforcement of Full Consistency ✓ ✓ ✓
The Not-So-Good
• No Hosted Option in Azure ✗
• Have to Install and Configure it Ourselves ✗
Why We Built It On Azure + DSE (contd)
Why DataStax Enterprise?
Training:
Cassandra can be complex, and its success depends on various design decisions.
Getting training from the experts is invaluable to ensuring our success.
Integration:
DataStax has built integration between Cassandra and Spark (as well as other
products) and provides a tested package that we can depend on. It also includes the OpsCenter UI.
Support:
Cassandra is new to us. There is nothing better than being able to send off a
message when something goes wrong, and getting experts to help solve the
problem.
How Can You Build It
- Marketplace
  - Simplified set of deployment options
  - Bring Your Own License
  - Production Cluster or Dev Sandbox
  - 4, 12, 36 or 90 nodes
  - Pick your VM type and size
  - Single VNET
  - OpsCenter: http://{cluster}.{region}.cloudapp.azure.com:8888
- Define your own deployment
  1. Group cluster resources based on common lifecycle
     1. E.g. separate infrastructure components from compute nodes
  2. Define compute and storage options for nodes in the cluster
     1. Pick your VM type and size
     2. Ephemeral vs persistent disks (Standard/Premium)
     3. Snapshots
  3. Define networking options
     1. VNETs configuration
     2. Cross-DC (VNET to VNET) connectivity
  4. Performance considerations
     1. Compute
     2. Storage
     3. Networking
Azure Resource Manager principles
Azure Resource Manager API
• RBAC-based
• Template-driven
• Declarative and imperative
• Idempotent
• Multi-service
• Multi-region
• Extensible
Resource Group: container for multiple resources
• resources exist in one* resource group
• resource groups can span regions
• resource groups can span services
Deployment: tracks template execution
• created within a resource group
• allows nested deployments
Inside the Box vs. Outside the Box
• Template describes the topology (outside the box)
• Template extensions can initiate state configuration (inside the box)
• Multiple extensions available for Windows and Linux VMs
  – DSC
  – Chef
  – Puppet
  – Custom Scripts
  – AppService + WebDeploy
  – SQLDB + BACPAC
Common Use Cases for ARM Templates
• Enterprises and System Integrators
  – Delivering a capability or cloud capacity (building block templates, e.g. DSE)
  – Delivering an end to end application (solution templates)
• Cloud Service Vendors (CSVs)
  – Support different multi-tenancy approaches
    • Distinct deployments per customer
      – Within the CSV’s subscription
      – “Bring Your Own Subscription” model that uses customer subscriptions
    • Scale units within a central multi-tenant system
• Marketplace integration
• All deploy known configurations/SKUs/t-shirt sizes
  – Lots of variables make free form less desirable
  – T-shirt Sizes / SKUs are the common approach
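One way to picture the t-shirt size approach is the small Python sketch below. It maps a size label to one of the Marketplace node counts listed earlier (4, 12, 36 or 90) and emits a parameters object in the `{"name": {"value": ...}}` shape that ARM deployments accept. The size labels and the parameter names `nodeCount` and `vmSize` are hypothetical; they are not taken from the DataStax template.

```python
import json

# Node counts come from the Marketplace options in this deck;
# the size labels themselves are hypothetical.
TSHIRT_SIZES = {"small": 4, "medium": 12, "large": 36, "xlarge": 90}


def deployment_parameters(size, vm_size):
    """Return an ARM-style parameters object for a known t-shirt size.

    'nodeCount' and 'vmSize' are illustrative parameter names (assumptions).
    """
    if size not in TSHIRT_SIZES:
        raise ValueError(f"unknown size {size!r}; choose from {sorted(TSHIRT_SIZES)}")
    return {
        "nodeCount": {"value": TSHIRT_SIZES[size]},
        "vmSize": {"value": vm_size},
    }


if __name__ == "__main__":
    print(json.dumps(deployment_parameters("medium", "Standard_G4"), indent=2))
```

Restricting callers to a handful of named sizes keeps every deployment within a configuration that has been tested, which is the point the slide makes about free-form variables.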
Design and deploy a building block template
Go to http://github.com/azure/azure-quickstart-templates
to find 100s of quick start deployment templates for finished solutions.
DataStax is evolving ARM deployment templates in this
GitHub repo to include DSE-specific capabilities (e.g.
multi-region topology) for those who want to manage
their own deployment.
Deploying DataStax with the Azure CLI
Deploying DataStax with Azure Marketplace
Compute and storage options for nodes in the cluster
• Compute families for production clusters
  – D-Series, G-Series (Xeon® E5 v3)
    • Local SSD disks
  – DS-Series, GS-Series
    • Premium Storage optimized, host caching for reads
• Storage options for nodes
  – Maintain data and logs on local ephemeral SSD disks
    • ~100k IOPS and 1.5 GB/sec on G5
  – Leverage Premium Storage Disks for persistent data and logs
    • P10, P20, P30 (128GB to 1TB, up to 5000 IOPS and 200MB/sec)
    • Striped volumes to balance storage size, throughput and costs
    • Max 64TB, 80000 IOPS and 1GB/sec per node
  – Use Standard Storage for backup snapshots
    • Low cost, geo-replicated
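To make the striping trade-off concrete, the Python sketch below estimates the capacity, IOPS and throughput of a volume striped across several P30 disks, capped at the per-node maximums quoted above. It assumes that a P30 sits at the top of the quoted ranges (1 TB, 5000 IOPS, 200 MB/sec), that performance adds up linearly across disks, and that 1 GB/sec means 1000 MB/sec.

```python
# Per-disk figures: top of the P10-P30 ranges quoted on the slide (assumed to be P30).
P30 = {"size_tb": 1, "iops": 5000, "mb_per_sec": 200}

# Per-node maximums quoted on the slide; 1 GB/sec is taken as 1000 MB/sec.
NODE_LIMITS = {"size_tb": 64, "iops": 80000, "mb_per_sec": 1000}


def striped_volume(disk_count):
    """Estimate a striped volume of P30 disks, capped at the node limits.

    Assumes linear scaling across disks, which is an idealization.
    """
    return {key: min(disk_count * P30[key], NODE_LIMITS[key]) for key in P30}


if __name__ == "__main__":
    for count in (4, 16, 64):
        print(count, "disks:", striped_volume(count))
```

Under these assumptions the throughput cap is reached with 5 disks and the IOPS cap with 16, so adding disks beyond that point only buys capacity.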
Networking deployment options
• Supporting your replication topology (NetworkTopologyStrategy), including
  geo-replication, for disaster recovery or workload segregation purposes
• Within a VNET, bandwidth is a function of VM type/size
  – Up to 20Gbps for G5
• Cross-region VNET gateways
  – Standard (100Mbps) or High Performance (200Mbps), No-Crypto option
  – Latency impact proportional to distance
Contact Us
Sean Usher
Office 365
Email: [email protected]
Twitter: @seanushermsft
Silvano Coriani
Azure
Email: [email protected]
Twitter: @scoriani
Thank you