Orchestrating an OpenStack DevOps Cloud for
R&D to Achieve Continuous Delivery
Ting Zou & Tanay Nagjee – November 7, 2013
Introductions
Tanay Nagjee
Solutions Engineer @ Electric Cloud
Former ElectricCommander Engineer
Ting Zou
Director, Cloud Computing R&D Data Center
Huawei USA
A privately-owned Global Company
An ICT Industry Leader
An Innovative Industry Contributor
This is Huawei
• A global company providing information and communications technology (ICT) solutions.
• Products and solutions have been deployed in 140+ countries, serving 1/3 of the world’s population.
• A privately owned company founded in 1987 in Shenzhen.
A Privately-owned Global Company
Sustainable Growth
• 2012 revenue amounted to USD 35.4 billion, a YoY increase of 8%; net profit reached USD 2.47 billion.
• 2012 R&D investment: USD 4.8 billion.
• 1H 2013 revenue amounted to USD 18.5 billion, a YoY increase of 10.8%; expected net profit margin of 7-8% in 2013.

2012 revenue by BG (currency: USD)
Carrier Network   25.7 bn   73%
Consumer           7.8 bn   22%
Enterprise         1.9 bn    5%

2012 revenue by region (currency: USD)
EMEA              12.5 bn   35%
China             11.8 bn   33%
Asia Pacific       6.0 bn   17%
Americas           5.1 bn   15%
Continuous Innovation Investment
[Chart: annual R&D investment, 2000-2012; labeled values: 1.5Bn, 2Bn, 2.7Bn, 3.8Bn, 4.8Bn]
• 2012 R&D investment: USD 4.8 billion, 13% of 2012 revenues
• Accumulated R&D investments: USD 19 billion
• 70,000 R&D employees
• 16 R&D centers worldwide
• 28 joint innovation centers
• 41,948 patents in China, 12,453 international PCTs, 14,494 patents outside China
• US$300 million in patent royalties each year
• 150+ standards organizations
• 30,000+ standards proposals
What are we solving?
Large-scale R&D environment with complicated tools requires thousands of CPU cores available on demand.

5 Million LOC build (unit: minute)
  Current build time:        120
  Expected build time:        10
10 Million LOC code coverage testing report (unit: minute)
  Actual time:               300
  Engineering expectation:     9
What are we solving?
Environment/tools provisioning is very time-consuming; lab asset utilization is low.

Compile/build environment provision (unit: minute)
  Current provision time:    480
  Expected provision time:    20
Software testing environment provision (unit: hour)
  Current provision time:     10
  Expected provision time:     2
What are we solving?
R&D data grows rapidly; requires the flexibility to dynamically expand PB-scale storage capacity.

[Chart: Data Generated by the Tools in Auto Testing/CI, 2011 vs. 2012; bar values 1300 and 2500]
• Daily execution data generated by daily build: 500G, 1000G
• Static analysis and build coverage for builds of millions of LoC generate PB of data annually
What are we solving?
Integration between development and operations; DevOps-enabled continuous delivery required.

DevOps / Cloud
• Virtualization and cloud computing converged infrastructure
• Automatic provisioning and configuration management technology
• R&D requires multi-layer and multi-platform infrastructure
• Allow developers more control over the production environment
• Concise definition of R&D processes, automate as much as possible
• Bridging the gap between development and operations
• Globally distributed R&D teams
Virtualization/Cloud sounds like a good solution, but...
To virtualize 3,000 two-CPU servers, commercial software licenses plus 1 year of support are a big investment:

Servers   CPUs   Standard        Enterprise       Enterprise Plus
3000      6000   US$ 8 million   US$ 20 million   US$ 25 million

Shall we leverage a low-cost (free) open source cloud OS to reduce R&D TCO?
Source: http://www.qyjohn.net
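To make the license figures easier to compare, the sketch below breaks each quoted total down per server and per CPU. It uses only the numbers on the slide; the function and variable names are illustrative.

```python
# License + 1-year support totals quoted on the slide, in US$,
# for 3,000 two-CPU servers (6,000 CPUs).
SERVERS = 3000
CPUS = 6000
QUOTES_USD = {
    "Standard": 8_000_000,
    "Enterprise": 20_000_000,
    "Enterprise Plus": 25_000_000,
}


def per_unit_costs(total_usd, servers=SERVERS, cpus=CPUS):
    """Return (cost per server, cost per CPU) for one edition's total."""
    return total_usd / servers, total_usd / cpus


for edition, total in QUOTES_USD.items():
    per_server, per_cpu = per_unit_costs(total)
    print(f"{edition}: ${per_server:,.0f} per server, ${per_cpu:,.0f} per CPU")
```

Even the lowest tier works out to a few thousand dollars per server, which is the cost an open source cloud OS is meant to avoid.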
Choose OpenStack as the Provider of IaaS
[Diagram: PoC deployment topology]
• Controller Node #1: Controller, Quantum(A), vSwitch
• Controller Node #2: Controller, Quantum(A)
• Controller Node #3: Controller, Quantum(A), Swift, Swift-Proxy
• Storage Node: Swift, Swift-Proxy
• iLCM (Lab Configuration Manager): Cobbler, Chef, Puppet etc., In-House Dev Module
• Private Switch (Network), External Switch (Network)

• 8 Huawei RH2285 servers (dual 6-core CPUs, 96GB memory, 8TB storage) used in the PoC.
• OpenStack is provisioned and managed by Huawei iLCM (intelligent Lab Configuration Manager), with backend integration with the open source tools Cobbler, Chef, and Puppet.
• Leverage the OpenStack open source community as much as we can (Devstack, Mirantis, RPC, Dell Crowbar, etc.) plus in-house developed script modules, which will be contributed back to the community once ready.
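For a sense of the PoC's scale, the raw capacity of the eight servers can be totaled from the per-server specification above. This is a minimal sketch; the class and function names are illustrative, and it ignores whatever the controller and storage roles themselves consume.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ServerSpec:
    sockets: int
    cores_per_socket: int
    memory_gb: int
    storage_tb: int


# Huawei RH2285 as described on the slide: dual 6-core CPUs, 96GB memory, 8TB storage.
RH2285 = ServerSpec(sockets=2, cores_per_socket=6, memory_gb=96, storage_tb=8)


def pool_capacity(spec, count):
    """Raw (pre-overhead) capacity of `count` identical servers."""
    return {
        "cores": spec.sockets * spec.cores_per_socket * count,
        "memory_gb": spec.memory_gb * count,
        "storage_tb": spec.storage_tb * count,
    }


poc = pool_capacity(RH2285, count=8)
print(poc)
```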
Single Portal for Cloud Admin and Users
• Cloud Admin Portal for users with different levels of privileges, integrated with LDAP at the backend
• Cloud resource usage monitoring (Roadmap)
• Cloud VM provisioning and software configuration management
• Cloud network, storage, and security configuration management (Roadmap)
Cloud is free now, what else can be free?
Phases: Requirement Analysis, Design, Development, Testing
Categories: Project Mgmt, Collaboration, Design modeling, SCM, Code analysis, CI, Unit Testing, Func Testing, Perf Testing, Automation Framework, Provision, Monitor, Cloud Platform, Others
Tools shown:
• Redmine, Trac, JIRA, Bugzilla
• Github, SourceForge, GIT, Subversion, Mercurial, Gerrit, Reviewboard, OpenGrok
• Jenkins, Travis-CI
• PcLint, valgrind, CppCheck, Cpplint
• gUnit, cppUnit, gMock, JMock
• Selenium, Jmeter, IOMeter, Mobitest, Page Speed
• Chef, Puppet, CFEngine, Nagios
• Hadoop, Hive, HBase, memcached, RabbitMQ, ActiveMQ
• CloudStack, OpenStack
Single Portal for R&D engineering
• SaaS (instant creation of an environment with a complete portfolio of common R&D tools)
• Lab topology creation with compute, network, storage, etc. (Roadmap)
• Project R&D data statistics monitoring
• Single dashboard portal for engineering to access all the needed resources/tools in the R&D process
IaaS and PaaS enable DevOps-Engineering link
R&D tools PaaS integrated with engineering desktop dashboard
IaaS integrated with engineering desktop dashboard
Electric Cloud’s Software Delivery System
Apps
• Continuous Delivery Manager: Release Trains | Feature Boards | Pipelines | Gates | Dashboards
Platform
• Software Delivery Platform: Workflow | Resource management | Tools integration | API | Security | Reporting
Automation & Acceleration Services
• Build: Build Automation (CI), Build Acceleration
• Test: Test Automation (CT), Test Acceleration
• Deploy: Deploy Automation (CD), Infrastructure Provisioning
Before and After Electric Cloud

Issue                                                    Before Electric Cloud   After Electric Cloud   Business Impact
Develop to Deploy                                        90 Days                 10 Minutes             99.93%
Build to Release/Deploy                                  10+ errors/cycle        ~0 errors/cycle        99+%
Audit application changes (who, what, how, why, when)    Days                    Minutes                90+%
Time to troubleshoot problems                            20 Days                 Minutes                99+%
Development Scenario
Roles: Developer Joe, Reviewer Mike, HUDOS
• Eclipse: Modify code, launch preflight
• Subversion: Check out sources, overlay deltas
• Redmine: Mark issue as “build & unit test”
• Jenkins: Launch build + test w/ preflight source
• Build + test success? If not, Eclipse: Build + test failed; notify developer
• Redmine: Mark issue as “code review”
• Review Board: Create review request
• Review Board: Review modified code
• Review success? If not, Eclipse: Code rejected; notify developer
• Eclipse: Success; auto-commit code
• Redmine: Mark issue as “resolved”
Time Savings
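The development scenario amounts to a pipeline with two gates (build + test, then review). The sketch below models it as a plain function. It calls no real Eclipse, Subversion, Redmine, Jenkins, or Review Board API; the function name and callbacks are hypothetical, and the order of the final two steps is an assumption.

```python
from typing import Callable, List


def run_preflight(build_and_test: Callable[[], bool],
                  review: Callable[[], bool]) -> List[str]:
    """Walk one change through the two gates and return the ordered event log."""
    log = [
        "Eclipse: modify code, launch preflight",
        "Subversion: check out sources, overlay deltas",
        "Redmine: mark issue as 'build & unit test'",
        "Jenkins: launch build + test with preflight source",
    ]
    if not build_and_test():  # gate 1: build + test success?
        log.append("Eclipse: build + test failed; notify developer")
        return log
    log += [
        "Redmine: mark issue as 'code review'",
        "Review Board: create review request",
        "Review Board: review modified code",
    ]
    if not review():  # gate 2: review success?
        log.append("Eclipse: code rejected; notify developer")
        return log
    # Assumed ordering: commit first, then mark the issue resolved.
    log += [
        "Eclipse: success; auto-commit code",
        "Redmine: mark issue as 'resolved'",
    ]
    return log


if __name__ == "__main__":
    for line in run_preflight(build_and_test=lambda: True, review=lambda: False):
        print(line)
```

Code reaches the repository only after both gates pass, so a failed build or a rejected review never lands in Subversion.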
Test Scenario
Roles: Test Engineer Jill, HUDOS
• Commander: Pick issue to verify and test cases
• Redmine: Mark issue as “verifying”
• OpenStack: Provision specified virtual machines
• Commander: Launch automated tests
• Testing success? If not, Notification: Tests failed; VMs ready to inspect
• OpenStack: Tear down virtual machines
• Redmine: Mark issue as “closed”
• Notification: Tests passed; issue closed
Time Savings
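The test scenario has one design choice worth spelling out: virtual machines are torn down only when the tests pass, so a failing environment stays up for inspection. A minimal sketch follows, with in-memory stand-ins rather than real OpenStack, Commander, or Redmine calls; every name in it is hypothetical.

```python
from typing import Callable, Dict, List


def verify_issue(issue_id: int,
                 vm_specs: List[str],
                 provision: Callable[[List[str]], List[str]],
                 run_tests: Callable[[List[str]], bool],
                 teardown: Callable[[List[str]], None]) -> Dict[str, object]:
    """Provision VMs, run the tests, and tear down only when they pass."""
    vms = provision(vm_specs)  # the issue is in state 'verifying' from here on
    if run_tests(vms):
        teardown(vms)
        return {"issue": issue_id, "status": "closed", "vms_kept": [],
                "notification": "Tests passed; issue closed"}
    # On failure the VMs are deliberately left running for inspection.
    return {"issue": issue_id, "status": "verifying", "vms_kept": vms,
            "notification": "Tests failed; VMs ready to inspect"}


# In-memory stand-ins so the sketch runs without a cloud.
active_vms: List[str] = []


def fake_provision(specs: List[str]) -> List[str]:
    vms = [f"vm-{i}-{spec}" for i, spec in enumerate(specs)]
    active_vms.extend(vms)
    return vms


def fake_teardown(vms: List[str]) -> None:
    for vm in vms:
        active_vms.remove(vm)


result = verify_issue(101, ["linux-small"], fake_provision,
                      run_tests=lambda vms: False, teardown=fake_teardown)
print(result["notification"], result["vms_kept"])
```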
[Map: ECTC sites: Shenzhen, Beijing, India, Chengdu, Xi'an, Wuhan, Shanghai, Suzhou, Hangzhou]
What a DevOps Cloud brings to us
The R&D Cloud Data Center facility provides the capability for acceleration:
• More than 2,000 releases per year
• More than 50,000 compiles & builds per day
• More than 1 million test cases run per day
• More than 30 million LoC; the product is complicated
• More than 480K code reviews/analyses per year
• More than 170K system integration tests per year

Phases: Design, Develop, Product Validation, Solution Validation
Activities: Architect Evaluation, Code Analysis, Compile & Build, Hardware Emulation, Full Functional Testing, Solution Testing
Speedups shown: Hours -> Minutes, Days -> Hours, Weeks -> Days, Hours -> Minutes, Days -> Hours, Months -> Weeks
• Full Regression Testing (System): Days -> Hours
• Regression Testing (software): Days -> Minutes
Did we address the stated concerns?
• Large-scale R&D environment with complicated tools requires thousands of CPU cores available on demand.
• Environment/tools provisioning is very time-consuming; lab asset utilization is low.
• R&D data grows rapidly; requires the flexibility to dynamically expand PB-scale storage capacity.
• Integration between development and operations; DevOps-enabled continuous delivery required.
Achievements:
• Reduced cost of delivering software
• Increased resource utilization and productivity
• Shorter time to market with higher quality
What’s next?
[Map: R&D data center sites: Wuhan, Shanghai, Nanjing, Beijing, Xi'an, Chengdu, Shenzhen, Hangzhou]
• Shenzhen HQ: 2,000+ servers; 10K+ cores, 50TB memory, 1-2PB storage
• Hangzhou: 400+ servers; 1,600+ cores, 10TB memory, 500TB storage
• Beijing: 200+ servers; 800 cores, 5TB memory, 200TB storage
R&D DATA CENTER: 3,000+ servers, 15K+ cores, 10K+ virtual machines, 100TB memory, 2PB storage, US$ 2M cost reduction