Case Study: The University of Alabama at Birmingham presented by OpenStack, Ceph and Dell

Case Study: The University of Alabama at Birmingham: OpenStack, Ceph, Dell. Kamesh Pemmaraju, Dell; John-Paul Robinson, UAB. OpenStack Summit 2014, Atlanta, GA

Upload: dell-enterprise

Post on 23-Jan-2015


DESCRIPTION

The University of Alabama at Birmingham gives scientists and researchers a massive, on-demand, virtual storage cloud using OpenStack and Ceph for less than $0.41 per gigabyte. OpenStack, Ceph and Dell joined forces at OpenStack Summit 2014 to outline how the university IT staff deployed a private storage cloud infrastructure using the Dell OpenStack cloud solution with Dell servers, storage, networking and OpenStack, and Inktank Ceph. Learn more:

TRANSCRIPT

Page 1: Case Study: The University of Alabama at Birmingham presented by OpenStack , Ceph and Dell

Case Study: The University of Alabama at Birmingham

OpenStack, Ceph, Dell

Kamesh Pemmaraju, Dell

John-Paul Robinson, UAB

OpenStack Summit 2014

Atlanta, GA

Page 2: Case Study: The University of Alabama at Birmingham presented by OpenStack , Ceph and Dell


Dell - Internal Use - Confidential

An overview

• Dell – UAB backgrounder

• What we were doing before

• How the implementation went

• What we’ve been doing since

• Where we’re headed

Page 3: Case Study: The University of Alabama at Birmingham presented by OpenStack , Ceph and Dell


Dell – UAB background

• 900 researchers working on cancer and genomic projects

• Their growing data sets challenged available resources

– Research data distributed across laptops, USB drives, local servers, HPC clusters

– Transferring datasets to HPC clusters took too much time and clogged shared networks

– Distributed data management reduced researcher productivity and put data at risk

• They therefore needed a centralized data repository for researchers to ensure compliance with data-retention requirements.

• They also wanted a scale-out, cost-effective solution and hardware that could be repurposed for compute and storage.

Page 4: Case Study: The University of Alabama at Birmingham presented by OpenStack , Ceph and Dell


Dell – UAB background (cont’d)

• Potential solutions investigated

– Traditional SAN

– Public cloud storage

– Hadoop

UAB chose Dell/Inktank to architect a platform that would be highly scalable, provide a low cost per GB, and offer the best of all worlds by running compute and storage on the same hardware.

Page 5: Case Study: The University of Alabama at Birmingham presented by OpenStack , Ceph and Dell


A little background…

• We didn’t get here overnight

• 2000s-era High Performance Computing

• ROCKS-based compute cluster

• The Grid and proto-clouds

• GridWay Meta-scheduler

• OpenNebula, an early entrant that connected grids with this thing called the cloud

• Virtualization through-and-through

• DevOps is US

Page 6: Case Study: The University of Alabama at Birmingham presented by OpenStack , Ceph and Dell


Challenges and Drivers

• Technology

– Many hypervisors

– Many clouds

– We have the technology…can we rebuild it here?

• Applications

– Researchers started shouting “Data”! (NextGen Sequencing, Research Data Repositories, Hadoop)

– Researchers kept on shouting “Compute”!

Page 7: Case Study: The University of Alabama at Birmingham presented by OpenStack , Ceph and Dell


Data Intensive Scientific Computing

• We knew we needed storage and computing

• We knew we wanted to tie it together with an HPC commodity scale-out philosophy

• So in August 2012 we bought 10 Dell 720xd servers

– 16-core

– 96GB RAM

– 36TB Disk

• A 192-core, ~1TB RAM, 360TB expansion to our HPC fabric

• Now to integrate it…
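The expansion totals quoted above follow from multiplying the per-server figures by the node count. A minimal sketch of that arithmetic, using only the numbers on this slide (the function and variable names are illustrative). The RAM and disk totals match the slide. The listed per-server figures imply 160 cores, so the slide's 192-core total is not reproduced by them; the value computed here is what the listed numbers give, not a correction.

```python
# Per-server figures as listed on the slide (August 2012 purchase).
NODES = 10
CORES_PER_NODE = 16
RAM_GB_PER_NODE = 96
DISK_TB_PER_NODE = 36


def fabric_totals(nodes, cores, ram_gb, disk_tb):
    """Return aggregate cores, RAM (GB), and raw disk (TB) for identical nodes."""
    return {
        "cores": nodes * cores,
        "ram_gb": nodes * ram_gb,
        "disk_tb": nodes * disk_tb,
    }


totals = fabric_totals(NODES, CORES_PER_NODE, RAM_GB_PER_NODE, DISK_TB_PER_NODE)
print(totals)  # {'cores': 160, 'ram_gb': 960, 'disk_tb': 360}
```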

Page 8: Case Study: The University of Alabama at Birmingham presented by OpenStack , Ceph and Dell


December 2012

• Bob said:

Hearing good things about OpenStack and Ceph this week at Dell World. Simon Anderson, CEO of DreamHost, spoke highly of Dell, OpenStack, and Ceph today. He is also chair of company that supports He also spoke highly of the Dell Crowbar deployment tool.

Page 9: Case Study: The University of Alabama at Birmingham presented by OpenStack , Ceph and Dell


• I said:

Good to hear. I've been thinking a lot about Dell in this picture too. We have the building blocks in place. Might be a good way to speed the construction.

Page 10: Case Study: The University of Alabama at Birmingham presented by OpenStack , Ceph and Dell


Lesson 1:

Recognize when a partnership will help you achieve your goals.

Page 11: Case Study: The University of Alabama at Birmingham presented by OpenStack , Ceph and Dell


The 2013 Implementation

• The Timeline

– In January we started our discussions with Dell and Inktank

– By March we had committed to the fabric

– A week in April and we had our own cloud in place

• The Experience

– Vendors committed to their product

– Direct engagement through open communities

– Bright people who share your development ethic

Page 12: Case Study: The University of Alabama at Birmingham presented by OpenStack , Ceph and Dell


Next Step…Build Adoption

• Defined a new storage product based on the commodity scale-out fabric

– Able to focus on strengths of Ceph to aggregate storage across servers

– Provision any sized image to provide Flexible Block Storage

• Promote cloud adoption within IT and across the research community

• Demonstrate utility with applications

Page 13: Case Study: The University of Alabama at Birmingham presented by OpenStack , Ceph and Dell


Applications

• CrashPlan backup in the cloud

– A couple of hours to provision the VM resources

– An easy half-day deploy with the vendor because we controlled our own resources (a.k.a. the firewall)

– Add storage containers on the fly as we grow…10TB in a few clicks

• GitLab hosting

– Start a VM spec’d according to the project site

– Work with the Omnibus install. Hey, it uses Chef!

• Research storage

– 1TB storage containers for cluster users

– Uses Ceph RBD images and NFS

– The storage infrastructure part was easy

– Scaled provisioning: 100+ user containers (100TB) created in about 5 minutes

– Add storage servers as existing ones fill
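The scaled provisioning described above (100+ one-terabyte containers in minutes) amounts to looping over a user list and creating one RBD image per user. A minimal dry-run sketch, assuming a hypothetical pool named `research` and a hypothetical `home-<user>` image naming scheme. It only builds `rbd create` command lines (size given in megabytes) and executes nothing; the mapping, formatting, and NFS export steps are omitted, and this is not UAB's actual tooling.

```python
import shlex

POOL = "research"          # hypothetical pool name
SIZE_MB = 1024 * 1024      # 1 TB expressed in megabytes


def rbd_create_commands(users, pool=POOL, size_mb=SIZE_MB):
    """Build one `rbd create` command line per user (dry run, nothing is executed)."""
    commands = []
    for user in users:
        image = f"home-{user}"  # hypothetical image naming scheme
        argv = ["rbd", "create", "--pool", pool, "--size", str(size_mb), image]
        commands.append(" ".join(shlex.quote(arg) for arg in argv))
    return commands


# 100 placeholder account names standing in for real cluster users.
users = [f"user{i:03d}" for i in range(1, 101)]
commands = rbd_create_commands(users)
print(commands[0])
print(len(commands), "images requested")
```

Because the commands are generated rather than typed, the same list can be reviewed, logged, and re-run, which is what makes creating a hundred containers a matter of minutes.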

Page 14: Case Study: The University of Alabama at Birmingham presented by OpenStack , Ceph and Dell


Ceph Rebalances as Storage Grows :)
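Why rebalancing on growth is cheap can be illustrated with a toy placement function. Ceph itself places data with its CRUSH algorithm; the sketch below does not implement CRUSH and instead uses rendezvous (highest-random-weight) hashing as a simple stand-in with a similar property: when a node joins, only the objects that now belong on the new node move. Node and object names are made up.

```python
import hashlib


def place(obj, nodes):
    """Pick the node with the highest hash score for this object (rendezvous hashing)."""
    def score(node):
        digest = hashlib.sha256(f"{node}:{obj}".encode()).hexdigest()
        return int(digest, 16)
    return max(nodes, key=score)


def moved_on_growth(objects, old_nodes, new_node):
    """Return the objects whose placement changes when new_node joins the cluster."""
    new_nodes = old_nodes + [new_node]
    return [o for o in objects if place(o, old_nodes) != place(o, new_nodes)]


objects = [f"obj-{i}" for i in range(10_000)]
old_nodes = [f"node{i}" for i in range(10)]
moved = moved_on_growth(objects, old_nodes, "node10")
fraction = len(moved) / len(objects)
print(f"{fraction:.1%} of objects moved to the new node")
```

Going from 10 to 11 nodes should move roughly one eleventh of the objects, and every moved object lands on the new node; the rest stay where they were.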

Page 15: Case Study: The University of Alabama at Birmingham presented by OpenStack , Ceph and Dell


Lesson 2: Use it! That’s what it’s for!

The sooner you start using the cloud the sooner you start thinking like the cloud.

Page 16: Case Study: The University of Alabama at Birmingham presented by OpenStack , Ceph and Dell


How PoC Decisions Age Over Time

• Pick the environment you want when you are in operation…you’ll be there before you know it

• Simple networking is good

– But don’t go basic unless you are able to reinstall the fabric

– Class B ranges to match the campus fabric

– We chose a split admin range to coordinate with our HPC admin range

– We chose a collapsed admin/storage network due to a single switch…it probably would have been better to keep them separate and allow growth

– It’s OK to add non-provisioned interfacing nodes…know your net

• Avoid painting yourself into a corner

– Don’t let the Paranoid Folk box in your deployment

– An inaccessible fabric is an unusable fabric

• Fixed IP range mismatch with “fake” reservations
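Address-planning decisions like the ones above can be checked mechanically before a proof of concept hardens into production. A minimal sketch with Python's standard `ipaddress` module, assuming a hypothetical private /16 (class B sized) range split into two halves, one per admin domain. Every address here is a placeholder, not UAB's real addressing.

```python
import ipaddress

# Hypothetical class B sized private range; not UAB's real addressing.
admin_range = ipaddress.ip_network("172.16.0.0/16")

# Split the range in two: one half for the HPC admin network, one for the cloud.
hpc_admin, cloud_admin = admin_range.subnets(prefixlen_diff=1)


def conflicts(candidate, reserved):
    """Return the reserved networks that overlap the candidate network."""
    return [net for net in reserved if candidate.overlaps(net)]


# A proposed fixed-IP range for instances, checked against both admin halves.
fixed_range = ipaddress.ip_network("172.16.200.0/24")
clashes = conflicts(fixed_range, [hpc_admin, cloud_admin])
print(hpc_admin, cloud_admin, clashes)
```

Running a check like this against the campus plan surfaces a range mismatch at design time rather than after instances have been assigned addresses.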

Page 17: Case Study: The University of Alabama at Birmingham presented by OpenStack , Ceph and Dell


Lesson 3: The fabric is flexible. Let it help you solve your problems.

Page 18: Case Study: The University of Alabama at Birmingham presented by OpenStack , Ceph and Dell


Problems will Arise

• The release version of the ixgbe driver in the Ubuntu 12.04.1 kernel didn’t perform well with our 10Gbit cards

– Open source has an upstream

– Use it as part of debug network

– Upgrading the drivers was a simple fix

• Sometimes when you fix something you break something else

• There are still a lot of moving parts but each has a strong open source community

– Work methodically

– You will learn as you go

– Recognize the stack is integrated and respect tool boundaries
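When chasing a driver problem like the ixgbe one above, the first step is usually to record which driver and version each node is actually running so it can be compared against upstream. A minimal sketch that parses the `key: value` lines printed by `ethtool -i <interface>`. The interface name is supplied by the caller, and the sample text below uses made-up placeholder values rather than real output from UAB's systems.

```python
import subprocess


def parse_ethtool_info(text):
    """Parse `key: value` lines as printed by `ethtool -i` into a dict."""
    info = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            info[key.strip()] = value.strip()
    return info


def driver_info(interface):
    """Run `ethtool -i <interface>` and return its parsed fields."""
    result = subprocess.run(
        ["ethtool", "-i", interface], capture_output=True, text=True, check=True
    )
    return parse_ethtool_info(result.stdout)


# Placeholder text in the same key: value layout; the values are made up.
sample = "driver: ixgbe\nversion: 0.0.0-example\nbus-info: 0000:00:00.0\n"
info = parse_ethtool_info(sample)
print(info["driver"], info["version"])
```

Collecting this from every node before and after a driver upgrade gives a simple record of what changed, which helps when a fix in one place breaks something elsewhere.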

Page 19: Case Study: The University of Alabama at Birmingham presented by OpenStack , Ceph and Dell


Sometimes a Problem is just a Problem

• Code ex

Page 20: Case Study: The University of Alabama at Birmingham presented by OpenStack , Ceph and Dell


Lesson 4: The code *is* the documentation

…and that’s a *good* thing

Page 21: Case Study: The University of Alabama at Birmingham presented by OpenStack , Ceph and Dell


Where we are today

• OpenStack plus Ceph are here to stay for our Research Computing System

– They give us the flexibility we need for an ever-expanding research applications portfolio

– Move our UAB Galaxy NextGen Sequencing platform to our Cloud

– Add Object Storage services

– Put the cloud in the hands of researchers

• The big question…

Page 22: Case Study: The University of Alabama at Birmingham presented by OpenStack , Ceph and Dell


…how far can we take it?

• The goal of process automation is scale

• Incompatible, non-repeatable, manual processes are a cost

• Success is in dual-use

– Satisfy your needs and customer demand

– Automating process implies documenting process…great for compliance and repeatability

– Recognize the latent talent in your staff: today’s system admins are tomorrow’s systems developers

• Traditional infrastructure models are ripe for replacement

Page 23: Case Study: The University of Alabama at Birmingham presented by OpenStack , Ceph and Dell

To learn more…

Please visit Dell.com/RedHat