ibm cloud first factory - openstack · ibm cloud first factory ... object storage openstack...

OpenStack Summit May 12-16, 2014 Atlanta, Georgia

IBM Cloud First Factory
Running a Large OpenStack Infrastructure at IBM

Angel E. Tomala-Reyes, Senior Software Engineer

Esteban Arias, Cloud IT Specialist

Pablo Barquero, Software Engineer and Architect

© 2014 IBM Corporation

Agenda

Background

The CloudFirst Factory

Building the CloudFirst Factory

Lessons learned

Upgrading the CloudFirst Factory

Summary

Background

In early 2013, IBM’s public cloud was SCE 2.x

‒ Ecosystem never developed

‒ Missing APIs for core features, i.e. SCAAS

IBM announced move to OpenStack

‒ Strong ecosystem

‒ “Linux for the data center”

IBM GTS was looking to:

‒ Develop OpenStack skills and experience

‒ Provide an environment for IBM and partners to develop solutions ahead of the move to SCE 3.0 (powered by OpenStack)

Open source (Apache 2.0 licensed)

“Linux of the data center” eliminate vendor lock-in, maintain workload portability

Build a great engine, packagers make a great car (think Linux Kernel to RHEL/SUSE)

Design principles

‒ scalability and elasticity are our main goals

‒ share nothing, distribute everything (must be asynchronous and horizontally scalable)

‒ any feature that limits our main goals must be optional

‒ accept eventual consistency and use it where appropriate

OpenStack is a cloud operating system that controls large pools of compute, storage, and networking resources throughout a datacenter

“Our goal is to produce the ubiquitous Open Source cloud computing platform that will meet the needs of public and private cloud providers regardless of size, by being simple to implement and massively scalable.”

SCE solutions moving to a common OpenStack base

Simple 3 tier structure, with increased Client Value at each tier

Using open, common, standards based architecture providing choice, flexibility, interoperability, portability

Clean upgrade paths with progression to fully integrated and factory optimized PureApplication System

Significant customer benefits above and beyond base OpenStack

General OpenStack architecture*

[Figure: General OpenStack architecture. Boxes: Identity, Dashboard, Image, Compute, Object Storage, Block Storage, Network. Arrow labels: "Provides UI for", "Provides Auth for", "Provides volumes for", "Provide network connectivity for", "Stores images in", "Stores disk files in". Source: http://www.solinea.com]

OpenStack Components:

Compute (Nova): KVM based virtualization

Block Storage (Cinder): iSCSI based, JBOD

Network (Quantum): Provision and manage virtual resources

Dashboard (Horizon): Self-service portal

Image (Glance): Catalog and manage server images

Identity (Keystone): Unified authentication and authorization

Object Storage (Swift): petabytes of secure, reliable object storage

* OpenStack Folsom

The CloudFirst Factory

CloudFirst Factory mission and strategy

CloudFirst Mission

‒ Harvest the community (IBM, 3rd party and open source) for offerings, service components and other assets that can be introduced into IBM SmartCloud to increase value to IBM and our clients.

CloudFirst Strategy

‒ Engage “CloudFirst” projects (public cloud is the primary delivery) and foster learning, experimentation and client engagement to move faster and with high quality through the development process

SmartCloud Enterprise requires changes that the CloudFirst Factory was positioned to deliver

The CloudFirst Factory absorbed the changes we needed on SCE and aimed to reduce the risk of delivering to production by vetting the operational and performance impacts. We made extensive changes across several dimensions to ensure the CloudFirst Factory delivered the right changes into SmartCloud Enterprise, with a particular focus on supporting big data workloads.

Dimensions of change in the CloudFirst Factory:

Better performance hardware

Updated networking

OpenStack

The CloudFirst Factory enabled innovation that could not run on SCE

Building the CloudFirst Factory

Getting started

Team

‒ Experience with SCE deployment

‒ No OpenStack experience

‒ Minimal Python

‒ Split between US and Costa Rica

Folsom was the GA version at the time

RHEL 6.3 based install

Nova-network

‒ Needed to start simple, so avoided Quantum/Neutron

‒ More machines than open ports on switches, single network node

Start with single node

‒ Move services as needed

‒ Add compute hosts as needed

Simple script based install

CloudFirst Factory: A compute node

-SB- IBM System x iDataPlex dx360 M3 server

-SB- Mellanox ConnectX-2 EN Dual-port SFP+ 10GbE PCI-E 2.0 Adapter

1x 1GE on-board adapter (management)

128 GB RAM: 16 * 8GB (1x8GB, 2Rx4, 1.5V) PC3-10600 CL9 ECC DDR3 1333 MHz LP RDIMM

Intel Xeon Processor X5670 6C 2.93GHz 12MB Cache 1333MHz 95w

12 * IBM 3TB 7.2K 6Gbps NL SAS 3.5" HS HDD. Each could get us ~130 MB/s on sequential writes

IBM ServeRAID M1050 controller. Capped at ~600 MB/s total, but, crossflashed to LSI BIOS, scales to 1.8 GB/s

Network interfaces (Rack R Node N):

Eth0 1 GE – management

Eth1 1 GE – unused in majority of nodes; cable drops to external or internal networks

Eth2 10 GE – VM to VM network

Eth3 10 GE – storage network (Cinder mounts, Glance image transfers)
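
The disk and controller figures on this slide imply a simple bottleneck calculation. The following is a rough sketch in Python that uses only the numbers quoted above; the names are illustrative, and real throughput would depend on RAID layout and workload.

```python
# Back-of-the-envelope check of the per-node disk throughput figures.
# All names are illustrative; only the numbers come from the slide.

DISKS_PER_NODE = 12
MB_PER_S_PER_DISK = 130       # approximate sequential write rate per disk
STOCK_CONTROLLER_CAP = 600    # MB/s with the stock controller firmware
CROSSFLASHED_CAP = 1800       # MB/s after crossflashing to the LSI BIOS


def effective_throughput(disks: int, per_disk: int, controller_cap: int) -> int:
    """Return the lower of aggregate disk bandwidth and the controller cap."""
    return min(disks * per_disk, controller_cap)


aggregate = DISKS_PER_NODE * MB_PER_S_PER_DISK
stock = effective_throughput(DISKS_PER_NODE, MB_PER_S_PER_DISK, STOCK_CONTROLLER_CAP)
flashed = effective_throughput(DISKS_PER_NODE, MB_PER_S_PER_DISK, CROSSFLASHED_CAP)

print(f"aggregate disk bandwidth: {aggregate} MB/s")
print(f"with stock controller:    {stock} MB/s")
print(f"with crossflashed BIOS:   {flashed} MB/s")
```

Under these assumptions the stock controller, not the disks, limits sequential writes; after crossflashing, the disks themselves become the limit.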

The CloudFirst Factory was a tool to engage the community & redefine how IBM develops and delivers high value public cloud solutions

CloudFirst Factory: consists of 3 zones designed to make high value cloud solutions available quickly

Integration: Creates a pre-release SmartCloud Enterprise (n+1)

Partner: Engages a community of high value cloud service providers

Client: Provides a showroom of cloud solutions delivered to clients

Innovations go in, cloud solutions come out

IBM Community

• Research

• GBS

• SWG

• GTS

• STG

3rd Party Community

• open source

• business partners

[Diagram labels: SmartCloud Enterprise, Community, Customers]

The CloudFirst Factory provided purposed zones and processes to help grow an ecosystem on our public cloud

This is not a static environment. The movement of all code and development projects is towards production.

Integration zone (CloudFirst Factory): Constant inflow of development code that is installed, operationally tested, stabilized and rolled out across the CloudFirst Factory zones in phases

Partner zone (Partner): Explore pre-release SmartCloud features quickly, with automated on boarding for right fit projects to explore technology that can be used in high value solutions

Client zone (Customers): Deliver alpha cloud solutions to clients to apply course correction for feature/functionality, business model, pricing, etc. before pursuing broader availability

IBM production cloud

Projects in the CloudFirst Factory

[Charts: Types of projects; Company distribution of projects; How the project is sold / distributed]

SWG’s key 2013 committed projects in the CloudFirst Factory (not a complete list):

‒ DBaaS

‒ Visualization-aaS

‒ BigInsights 2.1 (Hadoop)

‒ Cloud OE (BlueMix)

Key 3rd party partners are being enabled to help us progress the mission (not a complete list):

‒ Mellanox

‒ Tervela

‒ Fabrix (video grid) + Kaltura

SCE: The CloudFirst Factory IaaS is built on OpenStack (SCE 3.0)

CloudFirst Factory requirements & resources

CloudFirst & open source

‒ Public cloud is the primary delivery

‒ Industry recognized standards and APIs

‒ Reaching an ecosystem is critical

Alignment of priorities to influence revenue

‒ Current IaaS focus is SCE 3.0

Virtualized resources (1 node ~ 100s of VMs)

‒ KVM

Storage

‒ Tape

‒ Local storage

‒ Object Storage: Swift

‒ Block Storage: Cinder

Common management & operational consistency

‒ Leveraging OpenStack to support the management of on boarding partners

‒ CloudFirst is providing the operational insight into SCE 3.0 dev stream (DevOps)

‒ CloudFirst goal is 100% automation

CloudFirst Factory enablement pattern

Deliver new IaaS (e.g. OpenStack) to a partner community, focusing on selective engagement of priority projects

Learn and adapt the IaaS according to the needs of the partners

Use the open source community to grow IaaS capability

[Diagram labels: CloudFirst Factory HW; OpenStack (Grizzly, Folsom, EE); "CloudFirst Factory On Ramp zone enables"; "CloudFirst Factory delivers"; Potential offerings: Cloud Foundry, CloudOE (Worklight), Symphony, BigInsight]

Philosophy

Start simple

‒ Single control node, network, compute

‒ Single Cinder volume server

‒ Script based install

Once working, grow

‒ Use xcat to deploy nodes

‒ Move compute to dedicated host

‒ Disable compute on control node

‒ Validate

‒ Add additional compute nodes

Cloud First Factory: Racks and Zones

The equipment was split into three separate environments, each with a distinct purpose

These were completely separate OpenStack installations

[Diagram: rack labels 1, 2, 3, 5, 6 and 7 divided among three zones with physical separation between them: IBM Internal Zone, External Partner Zone and External Client Zone, managed through xcat and OpenVPN / xcat. Arrow: "Flow of innovations". Stage labels: "Start here", "Test here", "Quasi-production here". Label: "On-boarding process started here".]

URLs shown on the slide: http://cloudfirst.demos.ibm.com, http://cloudalpha.demos.ibm.com, http://cloudfirst.democentral.ibm.com

OpenStack on the IBM Intranet

This zone was the first we set up and therefore the simplest

Almost every component runs on bare metal, non-virtualized

We use xcat to set up the OS (RHEL 6.4) on the nodes

Because this is the IBM internal network, security is not as important

http://cloudfirst.democentral.ibm.com

[Diagram: Rack 2, split into half rack a (nodes 1 to 14) and half rack b (nodes 15 to 28), attached to a 10 GE core switch with an external (blue) drop. Node labels: control plane node 28, cinder 27, Swift 16, Swift 17, Swift 18, Swift proxy; other nodes are Nova compute only.]

Control plane node (Rack 2 node 28):

• mysql – the overall database

• qpidd (message processing)

• glance (images)

• nova network (gateway for VLANs, floating IPs)

• keystone (authentication / entitlement)

• horizon (user interface) – is currently running virtualized.

OpenStack in the Internet Zones

http://cloudalpha.demos.ibm.com

[Diagram: Rack 5 and Rack 6, each split into half rack a (nodes 1 to 14) and half rack b (nodes 15 to 28), with an external (Internet) drop. Labels: Control plane; cinder; cinder 27; Mysql (VM); Qpidd, Horizon (VM); Swift 14, Swift 15, Swift 16.]

The use of VMs is more widespread

Still, not optimal – e.g. Swift needs to be in both racks

Control node (rack 6 node 1) is the single point of failure

Control node is also the major network bottleneck!

Being on the Internet: networking

Each project is assigned a separate VLAN (nova networking)

These VLANs are pre-created manually and are defined in the top-of-rack switches as well as in the core switch

Each VLAN starts with a restrictive policy (ports closed)

Each VM gets assigned a private IP in that VLAN, which is not externally routable. Access to this VLAN from other private VLANs is controlled by the security policy

The floating IPs are externally routable; access also controlled by the security policy

Limit the number of external IPs per project

All external traffic routes through the network node

The OpenVPN / jumpbox is used to get access to the hypervisor managed network
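
The per-project model above (one VLAN, a private subnet, and a capped number of externally routable addresses) can be sketched as follows. This is a toy model, not nova-network code: the class names, VLAN ID, subnets and quota are all hypothetical values chosen for illustration.

```python
import ipaddress
from dataclasses import dataclass, field


@dataclass
class Project:
    """A tenant with its own VLAN and private subnet (all values hypothetical)."""
    name: str
    vlan_id: int
    private_net: ipaddress.IPv4Network
    floating_quota: int
    floating_ips: list = field(default_factory=list)


class FloatingPool:
    """Hands out externally routable addresses, enforcing a per-project limit."""

    def __init__(self, cidr: str):
        self._free = list(ipaddress.ip_network(cidr).hosts())

    def allocate(self, project: Project) -> ipaddress.IPv4Address:
        if len(project.floating_ips) >= project.floating_quota:
            raise RuntimeError(f"{project.name}: floating IP quota reached")
        if not self._free:
            raise RuntimeError("floating IP pool exhausted")
        ip = self._free.pop(0)
        project.floating_ips.append(ip)
        return ip


# Example values only: a documentation range stands in for the public pool.
demo = Project("demo", vlan_id=101,
               private_net=ipaddress.ip_network("10.1.101.0/24"),
               floating_quota=2)
pool = FloatingPool("203.0.113.0/29")
first = pool.allocate(demo)
second = pool.allocate(demo)
```

A third call to `pool.allocate(demo)` raises, which mirrors the "limit the number of external IPs per project" rule.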

Running OpenStack on the Internet

Note that all traffic between VLANs flows through the control node

All traffic between nodes on floating IPs flows through the control node

Being on the Internet - two portals

Make a list of all entry points and reduce their number

The internal Horizon portal is unmodified and for admin only

This portal is available on the internal network (behind OpenVPN)

The external, modified portal has no admin tab at all

The external portal has a custom skin

‒ Note the terms and conditions checkbox

The external portal has the custom Project Admin Tab

Image upload was disabled

VNC removed from the yellow zone

External portal runs over https (out of the box comes with http)

Horizon was scanned for vulnerabilities and largely passed

Being on the Internet - API

We implemented a secure reverse proxy for UI services (that support Horizon)

We moved the admin Horizon UI into a VM

‒ In the future, this could support load balancing and VM updates

Implemented a reverse proxy for web services

‒ By default, they run over a number of different ports and are not secure

‒ Just one port to worry about

‒ In the future: support for load balancing and VM updates
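
The "just one port to worry about" idea can be illustrated with a routing table that maps a path prefix on the single public port to each service's internal port. This is only a sketch of the routing decision, not the proxy the team deployed: the path prefixes and internal host are invented, while the port numbers are the usual OpenStack defaults.

```python
# Hypothetical routing table: public path prefix -> internal service port.
BACKENDS = {
    "/identity": 5000,  # Keystone
    "/compute": 8774,   # Nova API
    "/image": 9292,     # Glance API
    "/volume": 8776,    # Cinder API
}
INTERNAL_HOST = "127.0.0.1"  # assumed: services listen only on a private address


def route(path: str) -> str:
    """Translate a request path on the public port into a backend URL."""
    for prefix, port in BACKENDS.items():
        if path == prefix or path.startswith(prefix + "/"):
            rest = path[len(prefix):] or "/"
            return f"http://{INTERNAL_HOST}:{port}{rest}"
    raise LookupError(f"no backend for {path}")


print(route("/compute/v2/servers"))
```

Anything that does not match a known prefix is rejected, so the only externally reachable surface is the proxy's own port.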

Being on the Internet - Images

IBM has internal standards for images that are exposed to the Internet (ITCS 104)

Userid / password policy; ssh keys preferred as the way to access

Lock down the ports (iptables exposes 22 only by default), SELinux

Semi-automated OS patch management

All floating / public IPs are periodically scanned

Disabled user-initiated image upload
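
A periodic scan of public IPs is only useful if its results are compared against the baseline. The sketch below assumes the scan results are already available as a mapping from IP to open ports, and that the baseline is "port 22 only" as stated above; the function name and sample addresses are hypothetical, and approved exceptions would need to be handled separately.

```python
ALLOWED_PORTS = {22}  # baseline from the image standard described above


def find_violations(open_ports_by_ip: dict) -> dict:
    """Return, per IP, the open ports that are not in the allowed baseline."""
    violations = {}
    for ip, ports in open_ports_by_ip.items():
        extra = sorted(set(ports) - ALLOWED_PORTS)
        if extra:
            violations[ip] = extra
    return violations


# Made-up scan results for two floating IPs.
scan = {"203.0.113.10": [22], "203.0.113.11": [22, 80, 3306]}
report = find_violations(scan)
print(report)
```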

The Custom On-boarding process

IBM Forms Experience Builder used to create a custom workflow

SSO with ibm.com (or guest) allows you to launch it

Once the form is filled out, an email is generated to the admin. The admin goes into the tool to approve requests, which creates the ID for the ProjectAdmin and associates the role with that ID, allowing them to manage users in the project

The same instance of on-boarding application for all three zones

Modification to keystone was made allowing ProjectAdmin to create users

User Authentication and Entitlement modifications

Out of the box, Horizon assumes that the admin will create all IDs.

‒ Not the way to go for a public cloud

At IBM, we have two auth APIs: external (ibm.com) and internal (w3.ibm.com)

We created keystone adapters for both systems

Both adapters require that the userid exists in the OpenStack Zone

We created a special role, Project Admin, who can manage users within their project

‒ Project Admins create userids in OpenStack, then the users can log in and authenticate against w3.ibm.com or ibm.com, depending on the zone

We created a custom tab for Horizon to expose this functionality
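
The two-step login described on this slide (the userid must exist in the zone, then the password is checked against the external directory) can be sketched as below. This is not the Keystone adapter itself: the function signature, the in-memory user set and the fake directory are all stand-ins invented for illustration.

```python
from typing import Callable, Set


def authenticate(userid: str, password: str, zone_users: Set[str],
                 directory_check: Callable[[str, str], bool]) -> bool:
    """Allow a login only for known zone users whose password the directory accepts."""
    # Step 1: the userid must already have been created in this OpenStack zone
    # (by a Project Admin, in the scheme described above).
    if userid not in zone_users:
        return False
    # Step 2: delegate the credential check to the external directory
    # (ibm.com or w3.ibm.com in the talk; a fake is used here).
    return directory_check(userid, password)


def fake_directory(userid: str, password: str) -> bool:
    """Stand-in for the real directory service; accepts one fixed password."""
    return password == "correct-horse"


zone_users = {"alice@example.com"}
print(authenticate("alice@example.com", "correct-horse", zone_users, fake_directory))
```

A user with valid directory credentials but no userid in the zone is still rejected, which is what keeps project membership under the Project Admin's control.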

Lessons learned

Challenges

Floating IPs

‒ Needed to extend to an additional subnet

‒ Getting to other VMs by floating IP

Qpid

‒ Majority of the community uses Ubuntu/RabbitMQ stack; therefore RHEL/Qpid stack was not as well tested

Image and instance compliance

‒ General security rules

‒ Ongoing security compliance of the instances

‒ Auto disk re-size

Modularity tweaks

In general, moving OpenStack services into VMs is not a bad idea

Easier and cleaner to upgrade – no need to change baremetal OS

Easier to move VMs between hosts to back them up

Eventually, resiliency – multiple services running on multiple VMs

But, is it fast enough? Is disk I/O fast enough?

But, what creates these VMs (DevOps aspects)?

OK: MySQL

OK: MQ service

‒ But, found QPID performance issues, so moved back to bare metal. We also found a memory leak.

‒ Eventually migrated to RabbitMQ on a VM

OK: Horizon

Maybe: Glance

Not OK: Storage components (Swift, Cinder)

Resiliency considerations

Ability to recover from the loss of a single HD

‒ OS: we have a pair of mirrored disks in RAID 1

‒ But, what’s so important? The OS can be reloaded easily. It’s a question of how quickly

‒ Compute nodes: image cache could be lost

‒ Compute nodes: VM disks; we are converting to RAID 10 for regular VMs

‒ Swift: a bunch of JBODs

‒ Glance: RAID 10

‒ Cinder: RAID 10

Ability to recover from the loss of a single node

‒ A lot of work to do to get here!

[Diagram: A CloudFirst Factory Node, shown with 12 HDs]
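
The RAID choices above trade capacity for the ability to survive a disk loss. A small calculator makes the trade-off concrete; it is a simplified sketch that ignores filesystem overhead and hot spares, and the disk counts in the example are illustrative rather than the team's actual layout.

```python
def usable_tb(level: str, disks: int, disk_tb: float) -> float:
    """Approximate usable capacity for the layouts mentioned on this slide."""
    if level == "jbod":
        # Independent disks: full capacity, no redundancy at the disk level.
        return disks * disk_tb
    if level == "raid1":
        # A mirrored pair stores one disk's worth of data.
        if disks != 2:
            raise ValueError("this sketch models RAID 1 as a mirrored pair")
        return disk_tb
    if level == "raid10":
        # Striped mirrors: half of the raw capacity is usable.
        if disks < 4 or disks % 2:
            raise ValueError("RAID 10 needs an even number of disks, at least 4")
        return disks * disk_tb / 2
    raise ValueError(f"unknown level: {level}")


# Illustrative figures using the 3 TB disks from the compute node slide.
print(usable_tb("jbod", 12, 3))
print(usable_tb("raid10", 12, 3))
print(usable_tb("raid1", 2, 3))
```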

Upgrading the CloudFirst Factory

Upgrading Folsom

Move to updated version and new features

Moved to Grizzly EE in order to align with SCE 3.0 direction

Opportunity to re-architect

‒ Split control node into a number of VMs

‒ Scale API services

‒ Scale Horizon

Migration from Folsom to Grizzly

This was a major migration of a heavily used environment – three racks

Concerns about uptime, data integrity

Created a test environment consisting of two nodes; rehearsed upgrade/downgrade there multiple times until we got it right

Database schema changed, so it needed to be modified and the data in it preserved

There were shutoff VMs on compute nodes that were in a deleted state with an incorrect hostname in MySQL. This prevented nova compute from starting (widespread)

Some deleted/shutoff VMs were present on more than one node (incomplete previous attempt to migrate them?)

Some VMs were active on different nodes from what MySQL thought

db_sync on the cinder database failed. Updated table layout manually

File ownership of configuration files was critical for OpenStack (chown swift:swift /var/lib/swift); there were also errors related to package signatures.

Xvnc had issues starting when running alongside openstack-console-auth; as a workaround we moved openstack-console-auth to a different node
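
Two of the problems above (VMs present on more than one node, and VMs running on a different node from the one recorded in MySQL) are the kind of drift a pre-upgrade audit can catch. The sketch below assumes the database view and the per-host view have already been collected into plain dictionaries; it does not query MySQL or libvirt, and all names and sample data are hypothetical.

```python
from collections import defaultdict


def find_inconsistencies(db_records: dict, hypervisor_view: dict) -> list:
    """Compare where the database thinks each VM lives with where it was found.

    db_records:      {instance_id: host according to the database}
    hypervisor_view: {host: set of instance ids actually found on that host}
    """
    seen_on = defaultdict(set)
    for host, instances in hypervisor_view.items():
        for inst in instances:
            seen_on[inst].add(host)

    problems = []
    for inst in sorted(seen_on):
        hosts = sorted(seen_on[inst])
        if len(hosts) > 1:
            problems.append((inst, "present on multiple hosts", hosts))
        elif inst in db_records and db_records[inst] not in hosts:
            problems.append((inst, "database host mismatch", hosts))
    return problems


# Made-up example: vm-3 exists on two nodes, vm-4 is not where the database says.
db = {"vm-1": "node01", "vm-2": "node02", "vm-3": "node03", "vm-4": "node02"}
found = {
    "node01": {"vm-1", "vm-3"},
    "node02": {"vm-2"},
    "node03": {"vm-3"},
    "node04": {"vm-4"},
}
for problem in find_inconsistencies(db, found):
    print(problem)
```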

Migration from Folsom to Grizzly - continued

dbsync on the OpenStack database failed when we moved MySQL to a different host. Kept MySQL on the controller node

Any Cinder or Glance migration needs its own maintenance window. We found that upgrading nova volume was troublesome and we had better success by setting up Cinder and Glance on a different path and then migrating the data

We used IBM OpenStack Grizzly EE. It does not include Horizon or Swift, so we had to complement it, luckily with compatible versions

Other than that, the upgrade was straightforward

The VMs remained accessible throughout, no reboots

Post-Grizzly migration notes

Qpid memory leak: must set amqp_rpc_single_reply_queue=True !!

Migrated queue management from QPID to RabbitMQ

Increased the number of API workers

Scaled out Conductor services to 3 service nodes running 4 conductor processes each (12 altogether)

Migrated Glance images to a RAID 10 12 TB node

Full Nagios monitoring: plugins for OpenStack services and supporting services (MySQL, RabbitMQ, TGT, etc.)

The TGT service unexpectedly stops finding targets and attaching new targets is not possible; this caused Cinder volume to stop working unexpectedly
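
Because a missing `amqp_rpc_single_reply_queue=True` setting led to the Qpid memory leak noted above, it is worth checking for it mechanically. The sketch below parses an INI-style configuration with Python's standard `configparser`; placing the option in a `[DEFAULT]` section and the sample text itself are assumptions made for illustration.

```python
import configparser

# Assumed sample; a real deployment would read the service's own config file.
SAMPLE_CONF = """
[DEFAULT]
amqp_rpc_single_reply_queue = True
"""


def single_reply_queue_enabled(conf_text: str) -> bool:
    """Return True if the option is present and set to a true value."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(conf_text)
    return parser.getboolean("DEFAULT", "amqp_rpc_single_reply_queue",
                             fallback=False)


print(single_reply_queue_enabled(SAMPLE_CONF))
```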

On-ramp blue deployment today

On-ramp Monitoring - Nagios

The standard checks are applied to every server in the grid; the custom checks are applied to specific hosts such as Controller, Cinder and Horizon

Standard checks:

Check_ssh: Try to connect to an SSH server at the specified server and port

Check_load: Tests the current system load average, with thresholds defined as warning if it reaches 80% and critical at 95%

Check_users: Test the number of users logged into the local machine, with warning if it is more than 20 and critical if more than 50.

Check_ping: Test ping to the primary NIC of the machine

Check_local_disk: Define a service to check the disk space of the root partition on the local machine. Warning if < 20% free, critical if < 10% free space on partition.

Check_local_swap: Define a service to check the swap usage, critical if less than 10% of swap is free, warning if less than 20% is free.

Check_total_processes: Define a service to check the number of currently running procs on the local machine. Warning if > 250 processes, critical if > 400 processes.
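
The disk and swap checks share one pattern: compare a "percent free" reading against a warning and a critical threshold. A minimal sketch of that logic follows, using the conventional Nagios plugin return codes (0 OK, 1 WARNING, 2 CRITICAL); the function name is hypothetical and this is not the stock plugin's implementation.

```python
OK, WARNING, CRITICAL = 0, 1, 2  # conventional Nagios plugin return codes


def check_free_percent(free_pct: float, warn_below: float = 20.0,
                       crit_below: float = 10.0) -> tuple:
    """Map a percent-free reading to a (status code, message) pair."""
    if free_pct < crit_below:
        return CRITICAL, f"CRITICAL - {free_pct:.0f}% free"
    if free_pct < warn_below:
        return WARNING, f"WARNING - {free_pct:.0f}% free"
    return OK, f"OK - {free_pct:.0f}% free"


for reading in (35, 15, 5):
    print(check_free_percent(reading))
```

The defaults match the thresholds listed for Check_local_disk and Check_local_swap; a real plugin would also exit with the status code so that Nagios can pick it up.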

On-ramp Monitoring – Nagios - continued

Custom checks:

Check_rabbitmq: Define a service to check the connection to a RabbitMQ server; provides a custom detailed error message in case of a failure.

check_Openstack_dashboard: Define a service to check the http connection using an IP and port on a specific host; returns custom error messages based on the output.

Check_mysql: Define a service to check the MySQL connection using a specific user and password on a host.

check_compute: Define a service to check the existence of the nova-compute process on the host; returns an error if no process is found.

Created a custom Cinder monitoring plugin to ensure create, attach, detach and delete operations were possible
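
The Cinder plugin is described only by what it verifies, so the following is a guess at its shape rather than the team's code: run the four volume operations in order and report the first one that fails. The `client` object is a hypothetical stand-in with `create`, `attach`, `detach` and `delete` methods, not the python-cinderclient API.

```python
def check_volume_lifecycle(client) -> tuple:
    """Run create, attach, detach, delete in order; stop at the first failure."""
    for step in ("create", "attach", "detach", "delete"):
        try:
            getattr(client, step)()
        except Exception as exc:
            return 2, f"CRITICAL - volume {step} failed: {exc}"
    return 0, "OK - volume create, attach, detach and delete succeeded"


class FakeClient:
    """Test double that can be told to fail at one step."""

    def __init__(self, failing_step=None):
        self.failing_step = failing_step

    def _run(self, step):
        if step == self.failing_step:
            raise RuntimeError("simulated failure")

    def create(self):
        self._run("create")

    def attach(self):
        self._run("attach")

    def detach(self):
        self._run("detach")

    def delete(self):
        self._run("delete")


print(check_volume_lifecycle(FakeClient()))
print(check_volume_lifecycle(FakeClient(failing_step="attach")))
```

An end-to-end check like this would have surfaced the TGT failure mode, where existing volumes keep working but new attachments stop.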

Nagios Dashboard

Summary

In Summary

OpenStack remains an emerging technology

Error handling not robust

Understanding the flow of calls and messages is needed

Large volume of message based rpc calls

Logging is not optimal (either too much or too little)

You need to be willing to look at code

Community of OpenStack on RHEL is not as robust

More seamless upgrade process is needed

Networking (nova-network) is challenging

‒ Multiple bridges

‒ IPTables configuration not straightforward

‒ Floating IP routing

IBM Sponsored Sessions

Monday, May 12 – Room B314

12:05-12:45: OpenStack is Rockin’ the OpenCloud Movement! Who’s Next to Join the Band? Angel Diaz, VP Open Technology and Cloud Labs; David Lindquist, IBM Fellow, VP, CTO Cloud & Smarter Infrastructure

Wednesday, May 14 – Room B312

9:00-9:40: Getting from enterprise ready to enterprise bliss - why OpenStack and IBM is a match made in Cloud heaven. Todd Moore - Director, Open Technologies and Partnerships

9:50-10:30: Taking OpenStack beyond Infrastructure with IBM SmartCloud Orchestrator. Andrew Trossman - Distinguished Engineer, IBM Common Cloud Stack and SmartCloud Orchestrator

11:00-11:40: IBM, SoftLayer and OpenStack - present and future. Michael Fork - Cloud Architect

11:50-12:30: IBM and OpenStack: Enabling Enterprise Cloud Solutions Now. Tammy Van Hove - Distinguished Engineer, Software Defined Systems

IBM Technical Sessions

Monday, May 12: 2:50 - 3:30, 3:40 - 4:20, 3:40 - 4:20

Tuesday, May 13: 11:15 - 11:55, 2:00 - 2:40, 5:30 - 6:10, 5:30 - 6:10

Wednesday, May 14: 9:50 - 10:30, 2:40 - 3:20

Thursday, May 15: 9:50 - 10:30, 1:30 - 2:10, 2:20 - 3:00

Be sure to stop by the IBM booth to see some demos and get your rockin’ OpenStack t-shirt while they last.

Don’t miss Monday evening’s booth crawl where you can enjoy Atlanta’s own SWEET WATER IPA!

Thank you !