A Year with Cinder and Ceph at TWC
Photo by Navin (https://flic.kr/p/7vSNe7)
By Craig DeLatte & Bryan Stillwell - May 20, 2015
What we will cover
• All views are from a systems admin perspective
• Cinder
– Evaluating storage
– Traditional vs. “grid” type storage
• Ceph
• Adding more backends
Craig DeLatte & Bryan Stillwell
Our Criteria for Evaluating Storage
• Must have passed the base Cinder matrix for the release
• Must have open API access to allow for monitoring and gathering statistics
• Must support Nova’s live migration
• Ideally support a rack-anywhere methodology
What Led Us to Using Ceph
• Supports live migration
• Ability to use x86 architecture instead of vendor-specific hardware
Our First Ceph Design
• What led to our design and where it went wrong
– In our environment (and maybe yours), customers plan only for capacity; they assume unlimited performance
First OpenStack Deployment
• Live migration testing
• Ceph to the rescue
Early Life with Ceph
• Out-of-family upgrades require a leap of faith
• Be prepared to scare your co-workers
• How your first production upgrade will feel
Initial Ceph Cluster
OSDs: 60
Journal Ratio: 5:1
Drive Size: 1TB
Raw Capacity: 60TB
Usable Capacity: 20TB
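The usable figure follows from replication (the slide numbers imply the Ceph default of size = 3): usable = raw / replicas. A quick sanity check of the arithmetic (helper name is ours):

```python
def usable_tb(osds, drive_tb, replicas=3):
    """Usable capacity of a replicated Ceph cluster: raw capacity / replica count."""
    raw = osds * drive_tb
    return raw / replicas

# Initial cluster: 60 OSDs x 1TB raw = 60TB, /3 replicas = 20TB usable
print(usable_tb(60, 1.0))  # 20.0
```

The same formula reproduces each expansion slide (e.g. 189 OSDs × 1.2TB / 3 ≈ 75.6TB usable).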
First Expansion
Before:
OSDs: 60
Journal Ratio: 5:1
Drive Size: 1TB
Raw Capacity: 60TB
Usable Capacity: 20TB

After:
OSDs: 75
Journal Ratio: 5:1
Drive Size: 1.2TB
Raw Capacity: 90TB
Usable Capacity: 30TB
What Went Wrong
• Performance issues
– Too high an HDD:SSD ratio for journals
– Not enough placement groups (PGs)
• VMs lost sight of storage (libvirt)
• Legacy tunables
• VMs lost sight of storage again! (version mismatch)
Corrections Made
• Ordered more SSDs to reduce the HDD:SSD journal ratio
• Re-used mon IPs
• Placement groups went from 512 PGs/pool to 4096 PGs/pool
• Tunables switched to ‘firefly’
• Need to make sure ALL systems are upgraded to the new version
ceph osd set nobackfill
ceph osd set noscrub
ceph osd set nodeep-scrub
osd max backfills = 1
osd recovery max active = 1
osd recovery op priority = 1
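The jump from 512 to 4096 PGs per pool tracks the era's rule of thumb from the Ceph docs: roughly 100 PGs per OSD, divided by the pool's replica count, rounded up to a power of two. A sketch of that heuristic (the helper name is ours):

```python
def target_pgs(osds, replicas=3, pgs_per_osd=100):
    """Rule-of-thumb PG count: (OSDs * 100) / replicas, rounded up to a power of two."""
    target = osds * pgs_per_osd / replicas
    power = 1
    while power < target:
        power *= 2
    return power

print(target_pgs(60))  # 60*100/3 = 2000 -> 2048
print(target_pgs(75))  # 75*100/3 = 2500 -> 4096
```

At 75 OSDs the heuristic lands on 4096, matching the correction above. Note that at the time PG counts could be raised but not lowered, which is why sizing them early matters (see Takeaways).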
Second Expansion
Before:
OSDs: 75
Journal Ratio: 5:1
Drive Size: 1.2TB
Raw Capacity: 90TB
Usable Capacity: 30TB

After:
OSDs: 189
Journal Ratio: 3:1
Drive Size: 1.2TB
Raw Capacity: 226.8TB
Usable Capacity: 75.6TB
What Went Wrong
• More performance problems during expansion
• Unintentional upgrades (Giant)
Corrections Made
• Decided we needed dedicated mon nodes
• Added a couple more options to improve performance
• Started work on replacing ceph-deploy with puppet-ceph
osd max backfills = 1
osd recovery max active = 1
osd recovery op priority = 1
osd recovery max single start = 1
osd op threads = 12
Third Expansion
Before:
OSDs: 189
Journal Ratio: 3:1
Drive Size: 1.2TB
Raw Capacity: 226.8TB
Usable Capacity: 75.6TB

After:
OSDs: 297
Journal Ratio: 3:1
Drive Size: 1.2TB
Raw Capacity: 356.4TB
Usable Capacity: 118.8TB
What Went Wrong
• Performance problems when adding OSDs
• Started removing OSDs before the data was off them
Corrections Made
• Work continued on replacing ceph-deploy with puppet-ceph
• Added an option to bring in new OSDs with a weight of 0
osd max backfills = 1
osd recovery max active = 1
osd recovery op priority = 1
osd recovery max single start = 1
osd op threads = 12
osd crush initial weight = 0
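With `osd crush initial weight = 0`, new OSDs join the CRUSH map without immediately receiving data, and can then be brought in gradually. A hypothetical helper that emits the `ceph osd crush reweight` commands for such a ramp (the command is real; the OSD ids, step count, and final weight here are illustrative):

```python
def ramp_commands(osd_ids, final_weight, steps=4):
    """Generate 'ceph osd crush reweight' commands to bring new OSDs in gradually.

    Each step raises every OSD's CRUSH weight by final_weight/steps; run one
    step, wait for backfill to finish, then run the next.
    """
    cmds = []
    for step in range(1, steps + 1):
        weight = final_weight * step / steps
        for osd in osd_ids:
            cmds.append(f"ceph osd crush reweight osd.{osd} {weight:.2f}")
    return cmds

for cmd in ramp_commands([297, 298], final_weight=1.2, steps=2):
    print(cmd)
```

Run one step at a time and wait for backfill to settle (cluster back to HEALTH_OK) before issuing the next.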
Fourth Expansion (most recent)
Before:
OSDs: 297
Journal Ratio: 3:1
Drive Size: 1.2TB
Raw Capacity: 356.4TB
Usable Capacity: 118.8TB

After:
OSDs: 306
Journal Ratio: 3:1
Drive Size: 1.2TB
Raw Capacity: 367.2TB
Usable Capacity: 122.4TB
Multiple Ceph Clusters
• 2 production
• 2 staging
• 2 lab
• Virtual clusters for each member of the team
The Next Cinder Hurdle
• Going from a single backend to multi-backend
• Naming of backends needs to be planned for
• Not all lab testing will reveal issues when going to production
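For reference, multi-backend Cinder is driven by `enabled_backends` in `cinder.conf`; volume types bind to each section's `volume_backend_name`, which is why those names are hard to change later and worth planning up front. A sketch with illustrative section and pool names (not TWC's actual config):

```ini
[DEFAULT]
enabled_backends = ceph-capacity,ceph-performance

# Hypothetical HDD-backed tier
[ceph-capacity]
volume_driver = cinder.volume.drivers.rbd.RBDDriver
volume_backend_name = ceph-capacity
rbd_pool = volumes

# Hypothetical SSD-backed tier
[ceph-performance]
volume_driver = cinder.volume.drivers.rbd.RBDDriver
volume_backend_name = ceph-performance
rbd_pool = volumes-ssd
```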
Looking Forward
• New storage tiers (Performance-SSD, Capacity-HDD)
• Emerging drive technologies
• NewStore
Takeaways
• Don't start small if you're going big
• Order the right number and type of SSDs
• Determine the right number of PGs early
• Dedicated mon nodes (fsync)
• Be careful with mon nodes in OpenStack
• Ceph upgrades (don't forget the compute nodes)
What Made It Worth the Effort
• We are no longer locked into vendor-specific hardware
• Scaling across racks, rows, and rooms
• Nasty data migrations are a thing of the past
• It allows us to future-proof our data against EOL hardware support
• We have a say!
– The Ceph working session is today at 11:50 in room 217
Questions or Comments
• Email: [email protected]
• irc: cdelatte
• Email: [email protected]
• irc: bstillwell
More TWC Talks
Wednesday, May 20th
9:50a – Getting DNSaaS to Production with Designate
11:00a – Growing OpenStack at Time Warner Cable
11:50a – Changing Culture at Time Warner Cable
1:50p – Neutron in the Real World - TWC Implementation and Evolution
Thursday, May 21st
2:20p – Real World Experiences with Upgrading OpenStack at Time Warner Cable