1
OpenStack Summit May 12-16, 2014 Atlanta, Georgia
IBM Cloud First Factory: Running a Large OpenStack Infrastructure at IBM
Angel E. Tomala-Reyes, Senior Software Engineer
Esteban Arias, Cloud IT Specialist
Pablo Barquero, Software Engineer and Architect
2 © 2014 IBM Corporation
Agenda
Background
The CloudFirst Factory
Building the CloudFirst Factory
Lessons learned
Upgrading the CloudFirst Factory
Summary
3
Background
4 © 2014 IBM Corporation
Background
In early 2013, IBM's public cloud was SCE 2.x
‒ The ecosystem never developed
‒ Missing APIs for core features, e.g. SCAAS
IBM announced move to OpenStack
‒ Strong ecosystem
‒ “Linux for the data center”
IBM GTS was looking to:
‒ Develop OpenStack skills and experience
‒ Provide an environment for IBM and partners to develop solutions ahead of the move to SCE 3.0 (powered by OpenStack)
5 © 2014 IBM Corporation
Open source (Apache 2.0 licensed)
“Linux of the data center”: eliminate vendor lock-in, maintain workload portability
Build a great engine, packagers make a great car (think Linux Kernel to RHEL/SUSE)
Design principles:
‒ scalability and elasticity are our main goals
‒ share nothing, distribute everything (must be asynchronous and horizontally scalable)
‒ any feature that limits our main goals must be optional
‒ accept eventual consistency and use it where appropriate
OpenStack is a cloud operating system that controls large pools of compute, storage, and networking resources throughout a datacenter
“Our goal is to produce the ubiquitous Open Source cloud computing platform that will meet the needs of public and private cloud providers regardless of size, by being simple to implement and massively scalable.”
6 © 2014 IBM Corporation
SCE solutions moving to a common OpenStack base
Simple 3 tier structure, with increased Client Value at each tier
Using an open, common, standards-based architecture providing choice, flexibility, interoperability, portability
Clean upgrade paths with progression to the fully integrated and factory-optimized PureApplication System
Significant customer benefits above and beyond base OpenStack
7 © 2014 IBM Corporation
General OpenStack architecture*
[Diagram: the core OpenStack services — Identity, Dashboard, Image, Compute, Object Storage, Block Storage, Network — and their relationships. The Dashboard provides the UI for each of the other services, and Identity provides authentication for them. Block Storage provides volumes for Compute; Network provides connectivity for Compute; the Image service stores images in Object Storage, and Compute stores disk files there as well.]
http://www.solinea.com
OpenStack Components:
‒ Compute (Nova): KVM-based virtualization
‒ Block Storage (Cinder): iSCSI based, JBOD
‒ Network (Quantum): provision and manage virtual resources
‒ Dashboard (Horizon): self-service portal
‒ Image (Glance): catalog and manage server images
‒ Identity (Keystone): unified authentication and authorization
‒ Object Storage (Swift): petabytes of secure, reliable object storage
* OpenStack Folsom
8
The CloudFirst Factory
9 © 2014 IBM Corporation
CloudFirst Factory mission and strategy
CloudFirst Mission
‒ Harvest the community (IBM, 3rd party and open source) for offerings, service components and other assets that can be introduced into IBM SmartCloud to increase value to IBM and our clients.
CloudFirst Strategy
‒ Engage “CloudFirst” projects (public cloud is the primary delivery) and foster learning, experimentation and client engagement to move faster and with higher quality through the development process.
10 © 2014 IBM Corporation
SmartCloud Enterprise requires changes that the CloudFirst Factory was positioned to deliver
The CloudFirst Factory absorbed the change we needed on SCE and aimed to reduce the risk of delivering it in production by vetting the operational and performance impacts. We made extensive changes across dimensions to ensure the CloudFirst Factory delivered the right change into SmartCloud Enterprise, with a particular focus on supporting big data workloads.
Dimensions of change in the CloudFirst Factory:
Better performance hardware
Updated networking
OpenStack
The CloudFirst Factory enabled innovation that could not run on SCE
11
Building the CloudFirst Factory
12 © 2014 IBM Corporation
Getting started
Team
‒ Experience with SCE deployment
‒ No OpenStack experience
‒ Minimal Python
‒ Split between the US and Costa Rica
Folsom was the GA version at the time
RHEL 6.3 based install
Nova-network
‒ Needed to start simple, so avoided Quantum/Neutron
‒ More machines than open ports on switches; single network node
Start with a single node
‒ Move services as needed
‒ Add compute hosts as needed
Simple script based install
13 © 2014 IBM Corporation
CloudFirst Factory: A compute node
‒ IBM System x iDataPlex dx360 M3 server
‒ Mellanox ConnectX-2 EN dual-port SFP+ 10GbE PCI-E 2.0 adapter
‒ 1x 1GbE on-board adapter (management)
‒ 128 GB RAM: 16 × 8GB (1x8GB, 2Rx4, 1.5V) PC3-10600 CL9 ECC DDR3 1333 MHz LP RDIMM
‒ Intel Xeon X5670 processor, 6C, 2.93 GHz, 12 MB cache, 1333 MHz, 95 W
‒ 12 × IBM 3TB 7.2K 6Gbps NL SAS 3.5" HS HDD; each could get us ~130 MB/s on sequential writes
‒ IBM ServeRAID M1050 controller; capped at ~600 MB/s total but, crossflashed to the LSI BIOS, scales to 1.8 GB/s
Per node (rack R, node N):
‒ Eth0 1 GbE – management
‒ Eth1 1 GbE – unused in the majority of nodes; cable drops to external or internal networks
‒ Eth2 10 GbE – VM-to-VM network
‒ Eth3 10 GbE – storage network (Cinder mounts, Glance image transfers)
14 © 2014 IBM Corporation
The CloudFirst Factory was a tool to engage the community & redefine how IBM develops and delivers high value public cloud solutions
The CloudFirst Factory consists of 3 zones designed to make high value cloud solutions available quickly:
‒ Integration zone: creates a pre-release SmartCloud Enterprise (n+1)
‒ Partner zone: engages a community of high value cloud service providers
‒ Client zone: provides a showroom of cloud solutions delivered to clients
Innovations go in from the IBM community (Research, GBS, SWG, GTS, STG) and the 3rd party community (open source, business partners); cloud solutions come out to customers through SmartCloud Enterprise.
15 © 2014 IBM Corporation
The CloudFirst Factory provided purposed zones and processes to help grow an ecosystem on our public cloud
This is not a static environment: the movement of all code and development projects is towards production.
‒ Integration zone: constant inflow of development code that is installed, operationally tested, stabilized and rolled out across the CloudFirst Factory zones in phases
‒ Partner zone: explore pre-release SmartCloud features quickly, with automated on-boarding for right-fit projects to explore technology that can be used in high value solutions
‒ Client zone: deliver alpha cloud solutions to clients to apply course correction for feature/functionality, business model, pricing, etc. before pursuing broader availability
The end of the pipeline is the IBM production cloud.
16 © 2014 IBM Corporation
Projects in the CloudFirst Factory
[Charts: types of projects; company distribution of projects; how the project is sold / distributed]
SWG’s key 2013 committed projects in the
CloudFirst Factory (not a complete list):
‒ DBaaS
‒ Visualization-aaS
‒ BigInsights 2.1 (Hadoop)
‒ Cloud OE (BlueMix)
Key 3rd party partners are being enabled to help us
progress the mission (not a complete list)
‒ Mellanox
‒ Tervela
‒ Fabrix (video grid) + Kaltura
17 © 2014 IBM Corporation
SCE: The CloudFirst Factory IaaS is built on OpenStack (SCE 3.0)
CloudFirst Factory requirements & resources
CloudFirst & open source
‒ Public cloud is the primary delivery
‒ Industry recognized standards and APIs
‒ Reaching an ecosystem is critical
Alignment of priorities to influence revenue
‒ Current IaaS focus is SCE 3.0
Virtualized resources (1 node ~ 100s of VMs)
‒ KVM
Storage
‒ Tape
‒ Local storage
‒ Object Storage: Swift
‒ Block Storage: Cinder
Common management & operational consistency
‒ Leveraging OpenStack to support the management of on-boarding partners
‒ CloudFirst is providing the operational insight into the SCE 3.0 dev stream (DevOps)
‒ CloudFirst goal is 100% automation
CloudFirst Factory enablement pattern
Deliver new IaaS (e.g. OpenStack) to a partner community, focusing on selective engagement of priority projects
Learn and adapt the IaaS according to the needs of the partners
Use the open source community to grow IaaS capability
OpenStack (Folsom, Grizzly EE)
Cloud Foundry
CloudOE
(Worklight)
[Diagram: OpenStack runs on the CloudFirst Factory hardware. The factory delivers potential offerings such as Cloud Foundry and CloudOE (Worklight); the CloudFirst Factory On-Ramp zone enables Symphony and BigInsights.]
18 © 2014 IBM Corporation
Philosophy
Start simple
‒ Single control node, network, compute
‒ Single Cinder volume server
‒ Script based install
Once working, grow
‒ Use xcat to deploy nodes
‒ Move compute to dedicated host
‒ Disable compute on control node
‒ Validate
‒ Add additional compute nodes
19 © 2014 IBM Corporation
Cloud First Factory: Racks and Zones
The equipment was split into three separate environments, each with a distinct purpose
These were completely separate OpenStack installations
[Diagram: seven racks split across three zones, with physical separation between them:
‒ IBM Internal Zone – http://cloudfirst.democentral.ibm.com – innovations start here
‒ External Partner Zone – http://cloudfirst.demos.ibm.com – testing; reached via OpenVPN / xcat
‒ External Client Zone – http://cloudalpha.demos.ibm.com – quasi-production; the on-boarding process started here
The flow of innovations moves from the internal zone, through the partner zone, to the client zone.]
20 © 2014 IBM Corporation
OpenStack on the IBM Intranet
This zone was the first we set up and is therefore the simplest
Almost every component runs on bare metal, non-virtualized
We use xcat to set up the OS (RHEL 6.4) on the nodes
Because this is on the IBM internal network, security is not as important
[Diagram: rack 2 split into half racks a and b with nodes 1–28, a 10 GbE core switch, and an external (blue) drop. Most nodes run nova-compute only. Cinder runs on node 27; Swift runs on nodes 16–18 plus a Swift proxy. The control plane node is rack 2 node 28, running:
• mysql – the overall database
• qpidd – message processing
• glance – images
• nova-network – gateway for VLANs, floating IPs
• keystone – authentication / entitlement
• horizon – user interface; currently running virtualized
Portal: http://cloudfirst.democentral.ibm.com]
21 © 2014 IBM Corporation
OpenStack in the Internet Zones
http://cloudalpha.demos.ibm.com
[Diagram: racks 5 and 6, each split into half racks a and b with nodes 1–28 and an external (Internet) drop. The control plane is rack 6 node 1; Cinder runs on node 27 and Swift on nodes 14–16; MySQL, qpidd and Horizon run in VMs.]
The use of VMs is more widespread here
Still, the layout is not optimal – e.g. Swift needs to be in both racks
The control node (rack 6 node 1) is the single point of failure
The control node is also the major network bottleneck!
22 © 2014 IBM Corporation
Being on the Internet: networking
Each project is assigned a separate VLAN (nova networking)
These VLANs are pre-created manually in the switches and are defined in the rack-top switches as well as in the core switch
Each VLAN starts with a restrictive policy (ports closed)
Each VM gets assigned a private IP in that VLAN, which is not externally routable. Access to this VLAN from other private VLANs is controlled by the security policy
The floating IPs are externally routable; access also controlled by the security policy
Limit the number of external IPs per project
All external traffic routes through the network node
The OpenVPN / jumpbox is used to get access to the hypervisor managed network
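As a rough illustration of the floating IP plumbing described above, here is a sketch of the NAT rules nova-network programs on the network node for one floating IP. This is not the actual nova code; the chain names are simplified approximations and the addresses are examples.

```python
def floating_ip_rules(floating_ip, fixed_ip):
    """Sketch of the iptables NAT rules for one floating IP."""
    return [
        # Inbound: DNAT traffic addressed to the floating IP to the VM's fixed IP
        f"-A nova-network-PREROUTING -d {floating_ip}/32 -j DNAT --to-destination {fixed_ip}",
        # Outbound: SNAT the VM's traffic so it appears to come from the floating IP
        f"-A nova-network-float-snat -s {fixed_ip}/32 -j SNAT --to-source {floating_ip}",
    ]

rules = floating_ip_rules("203.0.113.10", "10.0.1.7")
```

Because every such rule pair lives on the network node, all floating-IP traffic necessarily funnels through it, which is the bottleneck noted on the next slide.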
23 © 2014 IBM Corporation
Running OpenStack on the Internet
Note that all traffic between VLANs flows through the control node
All traffic between nodes on floating IPs flows through the control node
24 © 2014 IBM Corporation
Being on the Internet - two portals
Make a list of all entry points and reduce their number
The internal Horizon portal is unmodified and for admin only
This portal is available on the internal network (behind OpenVPN)
The external, modified portal has no admin tab at all
The external portal has a custom skin
‒ Note the terms and conditions checkbox
The external portal has the custom Project Admin Tab
Image upload was disabled
VNC removed from the yellow zone
The external portal runs over HTTPS (out of the box, Horizon comes with HTTP)
Horizon was scanned for vulnerabilities and largely passed
25 © 2014 IBM Corporation
Being on the Internet - API
We implemented a secure reverse proxy for the UI services (those that support Horizon)
We moved the admin Horizon UI into a VM
‒ In the future, this could support load balancing and VM updates
Implemented a reverse proxy for web services
‒ By default, they run over a number of different ports and are not secure
‒ Just one port to worry about
‒ In the future: support for load balancing and VM updates
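The point of the reverse proxy is collapsing many per-service ports into one. A minimal sketch of the routing idea: the ports below are OpenStack's default API ports, while the host names and path prefixes are invented for this example.

```python
# Hypothetical prefix-to-backend table behind a single HTTPS port.
# Ports are the OpenStack defaults; host names are placeholders.
BACKENDS = {
    "/identity": ("keystone-host", 5000),   # keystone public API
    "/compute": ("nova-host", 8774),        # nova-api
    "/image": ("glance-host", 9292),        # glance-api
    "/volume": ("cinder-host", 8776),       # cinder-api
}

def route(path):
    """Pick the backend whose prefix matches the request path."""
    for prefix, backend in BACKENDS.items():
        if path.startswith(prefix):
            return backend
    return None   # unknown paths are rejected, shrinking the attack surface
```

With this in front, only one port needs firewall rules and TLS certificates, and backends can later be swapped for load-balanced pools.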
26 © 2014 IBM Corporation
Being on the Internet - Images
IBM has internal standards for images that are exposed to the Internet (ITCS 104)
Userid/password policy; ssh keys preferred as the way to access
Lock down the ports (iptables exposes only 22 by default), SELinux
Semi-automated OS patch management
All floating / public IPs are periodically scanned
Disabled user-initiated image upload
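A compliance scan of an image boils down to checks like the following sketch. The two rules shown are illustrative examples in the spirit of the standards above, not the actual ITCS 104 checklist.

```python
def check_sshd_policy(config_text):
    """Flag sshd_config settings an image-hardening scan would reject.
    Illustrative rules only -- not the real ITCS 104 checklist."""
    opts = {}
    for line in config_text.splitlines():
        line = line.split("#", 1)[0].strip()   # drop comments and blanks
        if line:
            key, _, value = line.partition(" ")
            opts[key] = value.strip()
    findings = []
    # sshd defaults are permissive, so a missing option counts as a failure
    if opts.get("PermitRootLogin", "yes") != "no":
        findings.append("PermitRootLogin should be 'no'")
    if opts.get("PasswordAuthentication", "yes") != "no":
        findings.append("prefer ssh keys: PasswordAuthentication should be 'no'")
    return findings
```

Runs of such checks across all floating/public IPs can then feed the periodic scan reports mentioned above.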
27 © 2014 IBM Corporation
The Custom On-boarding process
IBM Forms Experience Builder used to create a custom workflow
SSO with ibm.com (or guest) allows you to launch it
Once the form is filled out, an email is generated to the admin. They go into the tool to approve requests, which creates the ID for the ProjectAdmin and associates the role with their ID, allowing them to manage users in the project
The same instance of on-boarding application for all three zones
Modification to keystone was made allowing ProjectAdmin to create users
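The approval step can be sketched as follows. The data structures and the function name are invented for this illustration; the real flow goes through Forms Experience Builder and the modified keystone.

```python
def approve_request(request, users, roles):
    """On admin approval: create the requester's ID in the zone and
    grant it the ProjectAdmin role on the requested project.
    (Sketch only -- `users` and `roles` stand in for keystone state.)"""
    uid = request["userid"]
    users[uid] = {"project": request["project"]}
    # ProjectAdmin can then create further users inside this project
    roles.setdefault(uid, set()).add("ProjectAdmin")
    return uid
```

Because one on-boarding application instance serves all three zones, the same approval logic applies whichever zone the request targets.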
28 © 2014 IBM Corporation
User Authentication and Entitlement modifications
Out of the box, Horizon assumes that the admin will create all IDs.
‒ Not the way to go for a public cloud
At IBM, we have two auth APIs: external (ibm.com) and internal (w3.ibm.com)
We created keystone adapters for both systems
Both adapters require that the userid exists in the OpenStack Zone
We created a special role, Project Admin, who can manage users within their project
‒ Project Admins create userids in OpenStack, then the users can log in and authenticate against w3.ibm.com or ibm.com, depending on the zone
We created a custom tab for Horizon to expose this functionality
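The two-step check performed by the custom keystone adapters can be sketched like this. `verify_remote` stands in for the real ibm.com / w3.ibm.com SSO call and is an assumption of this sketch.

```python
def authenticate(userid, password, local_users, verify_remote):
    """Sketch of the adapter logic: userid must pre-exist in the zone,
    then the external system validates the credentials."""
    # Step 1: the userid must already have been created by a Project Admin
    if userid not in local_users:
        return False
    # Step 2: delegate the actual password check to the external auth API
    return verify_remote(userid, password)
```

This keeps entitlement (who belongs in the zone) local while credentials stay with the corporate identity systems.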
29
Lessons learned
30 © 2014 IBM Corporation
Challenges
Floating IPs
‒ Needed to extend to an additional subnet
‒ Getting to other VMs by floating IP
Qpid
‒ Majority of the community uses Ubuntu/RabbitMQ stack; therefore RHEL/Qpid stack was not as well tested
Image and instance compliance
‒ General security rules
‒ Ongoing security compliance of the instances
‒ Auto disk re-size
31 © 2014 IBM Corporation
Modularity tweaks
In general, moving OpenStack services into VMs is not a bad idea
Easier and cleaner to upgrade – no need to change baremetal OS
Easier to move VMs between hosts to back them up
Eventually, resiliency – multiple services running on multiple VMs
But, is it fast enough? Is disk I/O fast enough?
But, what creates these VMs (devops aspects)?
OK: MySQL
OK: MQ service
‒ But we found Qpid performance issues, so we moved it back to bare metal. We also found a memory leak.
‒ Eventually migrated to RabbitMQ on a VM
OK: Horizon
Maybe: Glance
Not OK: Storage components (Swift, Cinder)
32 © 2014 IBM Corporation
Resiliency considerations
Ability to recover from the loss of a single HD
‒ OS: we have a pair of mirrored disks in RAID 1
‒ But what's so important? The OS can be reloaded easily; it's a question of how quickly.
‒ Compute nodes: image cache could be lost
‒ Compute nodes: VM disks; we are converting to RAID 10 for regular VMs
‒ Swift: a bunch of JBODs
‒ Glance: RAID 10
‒ Cinder: RAID 10
Ability to recover from the loss of a single node
‒ A lot of work to do to get here!
[Diagram: a CloudFirst Factory node with its 12 hard disks]
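The capacity trade-offs among the layouts above are simple arithmetic; a rough sketch (ignoring filesystem overhead and hot spares, which the real deployment would have to account for):

```python
def usable_tb(disks, disk_tb, layout):
    """Rough usable capacity for the RAID layouts discussed above."""
    if layout == "jbod":
        return disks * disk_tb        # Swift: every spindle exposed individually
    if layout == "raid10":
        return disks * disk_tb // 2   # mirrored pairs, then striped: half capacity
    if layout == "raid1":
        return disk_tb                # a single mirrored pair (the OS disks)
    raise ValueError(layout)
```

So a 12 × 3 TB node yields 36 TB as Swift JBOD but only 18 TB once converted to RAID 10 for VM disks, Glance or Cinder: resiliency costs half the raw space.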
33
Upgrading the CloudFirst Factory
34 © 2014 IBM Corporation
Upgrading Folsom
Move to updated version and new features
Moved to Grizzly EE in order to align with SCE 3.0 direction
Opportunity to re-architect
‒ Split control node into a number of VMs
‒ Scale API services
‒ Scale Horizon
35 © 2014 IBM Corporation
Migration from Folsom to Grizzly
This was a major migration of a heavily used environment – three racks
Concerns about uptime, data integrity
Created a test environment consisting of two nodes; rehearsed the upgrade/downgrade there multiple times until we got it right
The database schema changed, so it needed to be modified with the data in it preserved
There were shut-off VMs on compute nodes that were in deleted state with an incorrect hostname in MySQL; this prevented nova-compute from starting (widespread)
Some deleted/shut-off VMs were present on more than one node (an incomplete previous attempt to migrate them?)
Some VMs were active on different nodes from what MySQL thought
db_sync on the cinder database failed; we updated the table layout manually
File ownership of configuration files is critical in OpenStack (chown swift:swift /var/lib/swift); we hit errors related to package signatures
Xvnc had issues starting when running alongside openstack-console-auth; as a workaround we moved openstack-console-auth to a different node
36 © 2014 IBM Corporation
Migration from Folsom to Grizzly - continued
db_sync on the openstack database failed when we moved mysql to a different host; we kept mysql on the controller node
Any Cinder or Glance migration needs its own maintenance window. We found that upgrading nova-volume was troublesome and had better success setting up Cinder and Glance on a different path and then migrating the data
We used IBM OpenStack Grizzly EE. It does not include Horizon or Swift, so we had to complement it – luckily with compatible versions
Other than that, the upgrade was straightforward
The VMs remained accessible throughout, no reboots
37 © 2014 IBM Corporation
Post-Grizzly migration notes
Qpid memory leak: must set amqp_rpc_single_reply_queue=True!
Migrated queue management from QPID to RabbitMQ
Increased the number of API workers
Scaled out Conductor services to 3 service nodes running 4 conductor processes each (12 altogether)
Migrated glance images to a 12 TB RAID 10 node
Full Nagios monitoring: plugins for OpenStack services and supporting services (MySQL, RabbitMQ, TGT, etc.)
The TGT service unexpectedly stops finding targets, and attaching new targets is then not possible; this caused Cinder volumes to stop working unexpectedly
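Several of the fixes above are nova.conf settings. A sketch with Grizzly-era option names follows: amqp_rpc_single_reply_queue is the workaround flag named on this slide, while the other values are illustrative and should be verified against your release before use.

```ini
[DEFAULT]
# Workaround for the Qpid memory leak described above
amqp_rpc_single_reply_queue = True

# After the move from Qpid to RabbitMQ (Grizzly-era module path)
rpc_backend = nova.openstack.common.rpc.impl_kombu

# More API workers to handle load (value here is illustrative)
osapi_compute_workers = 8
```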
38 © 2014 IBM Corporation
On-ramp blue deployment today
39 © 2014 IBM Corporation
On-ramp Monitoring - Nagios
Each server in the grid has the standard checks applied; the custom checks are applied to specific hosts such as the Controller, Cinder and Horizon
Standard checks:
Check_ssh: try to connect to an SSH server at the specified server and port
Check_load: test the current system load average; warning at 80%, critical at 95%
Check_users: test the number of users logged into the local machine; warning if more than 20, critical if more than 50
Check_ping: test ping to the primary NIC of the machine
Check_local_disk: check the disk space of the root partition on the local machine; warning if < 20% free, critical if < 10% free
Check_local_swap: check swap usage; warning if less than 20% free, critical if less than 10% free
Check_total_processes: check the number of currently running processes on the local machine; warning if > 250 processes, critical if > 400
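The threshold logic shared by these checks is simple; a sketch of check_load's decision using the standard Nagios plugin exit codes (the percentage-of-capacity framing matches the thresholds above):

```python
OK, WARNING, CRITICAL = 0, 1, 2   # standard Nagios plugin exit codes

def check_load(load_pct, warn=80.0, crit=95.0):
    """Classify system load as a percentage of capacity:
    warning at 80%, critical at 95%, as configured above."""
    if load_pct >= crit:
        return CRITICAL
    if load_pct >= warn:
        return WARNING
    return OK
```

The same three-way pattern, with different metrics and thresholds, covers check_users, check_local_disk, check_local_swap and check_total_processes.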
40 © 2014 IBM Corporation
On-ramp Monitoring – Nagios - continued
Each server in the grid has the standard checks applied; the custom checks are applied to specific hosts such as the Controller, Cinder and Horizon
Custom checks:
Check_rabbitmq: check the connection to a RabbitMQ server; provides a custom detailed error message in case of failure
check_Openstack_dashboard: check the HTTP connection using an IP and port on a specific host; returns custom error messages based on the output
Check_mysql: check the MySQL connection using a specific user and password on a host
check_compute: check for the existence of the nova-compute process on the host; returns an error if at least one process is not found
We also created a custom Cinder monitoring plugin to ensure creates, attaches, detaches and deletes were possible
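A skeleton of that Cinder lifecycle plugin might look like this. `client` is any object with these four methods; in our deployment it would wrap python-cinderclient, but that wiring (and the method names) is an assumption of this sketch.

```python
def check_cinder_lifecycle(client, instance_id):
    """Nagios-style check: exercise create/attach/detach/delete on a
    throwaway volume; any failure is CRITICAL (exit code 2)."""
    try:
        vol = client.create(size_gb=1)
        client.attach(vol, instance_id)
        client.detach(vol)
        client.delete(vol)
    except Exception as exc:
        return 2, f"CRITICAL: volume lifecycle failed: {exc}"
    return 0, "OK: create/attach/detach/delete all succeeded"
```

An end-to-end probe like this is what caught the TGT failures mentioned earlier, which a simple process check would have missed.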
41 © 2014 IBM Corporation
Nagios Dashboard
42
Summary
43 © 2014 IBM Corporation
In Summary
OpenStack remains an emerging technology
Error handling not robust
Understanding the flow of calls and messages is needed
Large volume of message based rpc calls
Logging is not optimal (either too much or too little)
You need to be willing to look at code
The OpenStack-on-RHEL community is not as robust
A more seamless upgrade process is needed
Networking (nova-network) is challenging
‒ Multiple bridges
‒ IPTables configuration not straightforward
‒ Floating IP routing
44 © 2014 IBM Corporation
IBM Sponsored Sessions
Monday, May 12 – Room B314: 12:05-12:45
Wednesday, May 14 – Room B312: 9:00-9:40, 9:50-10:30, 11:00-11:40, 11:50-12:30
‒ OpenStack is Rockin' the OpenCloud Movement! Who's Next to Join the Band? — Angel Diaz, VP Open Technology and Cloud Labs; David Lindquist, IBM Fellow, VP, CTO Cloud & Smarter Infrastructure
‒ Getting from enterprise ready to enterprise bliss – why OpenStack and IBM is a match made in Cloud heaven. — Todd Moore, Director, Open Technologies and Partnerships
‒ Taking OpenStack beyond Infrastructure with IBM SmartCloud Orchestrator. — Andrew Trossman, Distinguished Engineer, IBM Common Cloud Stack and SmartCloud Orchestrator
‒ IBM, SoftLayer and OpenStack – present and future — Michael Fork, Cloud Architect
‒ IBM and OpenStack: Enabling Enterprise Cloud Solutions Now. — Tammy Van Hove, Distinguished Engineer, Software Defined Systems
45 © 2014 IBM Corporation
IBM Technical Sessions (session titles not captured in the transcript):
Monday, May 12: 2:50-3:30; 3:40-4:20 (two sessions)
Tuesday, May 13: 11:15-11:55; 2:00-2:40; 5:30-6:10 (two sessions)
Wednesday, May 14: 9:50-10:30; 2:40-3:20
Thursday, May 15: 9:50-10:30; 1:30-2:10; 2:20-3:00
46
Be sure to stop by the IBM booth to see some demos and get your rockin’ OpenStack t-shirt while they last.
Don’t miss Monday evening’s booth crawl where you can enjoy Atlanta’s own SWEET WATER IPA!
Thank you !