Design and Operation of OpenStack Cloud on 100 Physical Servers - OpenStack Summit 2014 Paris


Copyright©2014 NTT DOCOMO, INC. All rights reserved.

Design and Operation of OpenStack Cloud on 100 Physical Servers

NTT DOCOMO Inc. Ken Igarashi

Virtualtech Japan Inc. Hiromichi Ito

NEC Akihiro Motoki

DOCOMO, INC All Rights Reserved

About Us

Ken Igarashi
- Leading the OpenStack project at NTT DOCOMO
- One of the first members to propose OpenStack Bare Metal Provisioning (currently called "Ironic") - bit.ly/1stuN2E

Hiromichi Ito
- CTO of Virtualtech Japan Inc.

Akihiro Motoki
- Senior Research Engineer, NEC
- Core developer of Neutron and Horizon

Design OpenStack Cloud

○ Information required
- Hardware resources/performance
  - Management resources
  - User resources (Nova, Cinder - depends on individual workloads)
- Hardware/software configuration
  - High availability
  - Network configuration (e.g. Neutron)
- Deployment tool
  - Juju/MAAS, Fuel, Helion, RDO, etc.

○ How we got it
- Ran a simulation using 100 physical hosts
  - 3,200 vCPUs and 12.8 TB of memory in total
  - Collaboration with: National Institute of Information and Communications Technology, VirtualTech Japan Inc., NTT Advanced Technology Corporation, Japan Advanced Institute of Science and Technology, the University of Tokyo and Dell Japan Inc.

Test Environment

○ National Institute of Information and Communications Technology, Ishikawa prefecture
○ About 1,400 servers in a single site
○ Open to any companies and organizations

StarBED - http://bit.ly/10gYttm

- Research and development: new locator protocol development, home network protocol development, virtual node migration algorithms, HEMS management protocol, new tunnel protocols, inter-AS traceback
- Protocol / product evaluation: TCP behavior comparison, proxy server performance evaluation, evaluation of X-ray sharing, video conference protocol switching, FW benchmarking
- Education: security operation competition, cyber-range training, remote hands-on for Asian students, competition of cloud computing ideas
- Simulation: testbed federation algorithms, supporting software for controlling testbeds, wireless link simulation on wired links, IPv6 support on network testbeds
- Realistic and flexible experiments based on a bare-wire environment

100 Physical Servers on StarBED

○ OpenStack Icehouse

[Topology diagram:]
- Spine switches: S6000 x 2
- Leaf switches: S4810 x 4, each uplinked to the spines with 40Gb x 2
- Compute nodes: 36 + 37 + 6, each attached to a leaf switch with 10Gb x 1
- Management servers x 21 and LB (BIG-IP 5200V) x 2, attached with 10Gb x 4 links


Network Configuration

Network Redundancy

○ Multi-Chassis Link Aggregation (MLAG)
- The switch pair runs MLAG with VRRP; the host bonds eth1 and eth2 into bond0 (z.z.z.z)
- Pros: maturity / Cons: needs expensive switches

○ End-host Equal Cost Multi-Path (ECMP)
- The host runs a routing protocol over eth1 (x.x.x.x) and eth2 (y.y.y.y) and announces a loopback address (lo, z.z.z.z)
- Pros: removes network complexity / Cons: less mature
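As a sketch of the host side of the MLAG option, the bonding setup might look like the following (Debian-style file, interface names and the address are placeholders, not taken from the deck):

```
# /etc/network/interfaces sketch: LACP bond across the two MLAG switches.
# eth1/eth2 and 192.0.2.10 are placeholders.
auto bond0
iface bond0 inet static
    address 192.0.2.10
    netmask 255.255.255.0
    bond-slaves eth1 eth2
    bond-mode 802.3ad      # LACP toward the MLAG switch pair
    bond-miimon 100
    bond-lacp-rate fast
```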

Neutron Configuration

○ Virtual network creation is essential to increase network security
- ML2 with a tunnel network configuration
- Type drivers: VXLAN, GRE
- We chose VXLAN:
  - VXLAN uses MAC Address-in-User Datagram Protocol (MAC-in-UDP) encapsulation
  - The load balancing algorithm works effectively by hashing the UDP port number
  - Many network devices support VXLAN
- Mechanism drivers: Open vSwitch (OVS), Linux Bridge
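A minimal sketch of the corresponding ML2/OVS settings (Icehouse-era option names; the VNI range and VTEP address are placeholders):

```
# /etc/neutron/plugins/ml2/ml2_conf.ini
[ml2]
type_drivers = vxlan
tenant_network_types = vxlan
mechanism_drivers = openvswitch

[ml2_type_vxlan]
vni_ranges = 1:1000

# OVS agent side
[ovs]
local_ip = 192.0.2.10     # VTEP address of this host (placeholder)
enable_tunneling = True

[agent]
tunnel_types = vxlan
```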

Throughput for Different Network Configurations

○ Throughput between one VM and one VM on different physical hosts (1 TCP connection)
- Not much difference between OVS and Linux Bridge
- MLAG gets better performance than ECMP

[Chart: throughput in Gbps (3.4-4.6 range) for ovs_mlag, ovs_ecmp and bridge_mlag]

Throughput for Different Network Configurations

○ MLAG with OVS seems the best configuration today
- Performance, potential, stability

[Chart repeated from the previous slide: throughput in Gbps (3.4-4.6 range) for ovs_mlag, ovs_ecmp and bridge_mlag]

We increased the VM's MTU to 8950 to get this performance, but note that the physical network bandwidth is 20 Gbps.

Throughput for Different Number of VMs

○ Each VM communicates with a random VM on a different physical host (1 connection per VM)
○ It consumes only 50% of the total bandwidth even though all physical resources are allocated to VMs

[Chart: per-VM throughput (0-3.5 Gbps) and physical-host throughput (0-12 Gbps) vs. number of servers (100 to 477), for VM and PHY at MTU 1500 and MTU 8950]

* PHY: VMs' total throughput measured at a physical host

Slow Throughput

○ We could get 19 Gbps (MTU 1500) between physical hosts
○ Enabling VXLAN:
- We could get only 10 Gbps (MTU 8950)
- The throughput is sharply reduced by turning on VXLAN: the CPU is overloaded by VTEP software processing (packet encapsulation and de-capsulation)
○ VM-related CPU load during the communication (top output; %CPU, %MEM, TIME+, COMMAND):

  Server:                                  Receiver:
  89.3  0.0  391:31.82  vhost-yyyyy        49.3  0.8  257:06.66  qemu-system-x86
  98.4  0.0  462:41.90  vhost-xxxxx        42.9  0.9  294:34.67  qemu-system-x86

NIC with VXLAN Offload Support

○ A NIC with VXLAN offload would be able to reduce the CPU load
○ Available devices:
- Mellanox ConnectX-3 Pro - the world's first VXLAN offload NIC
- Intel X710 / XL710 - released in September 2014
- Emulex XE102
- QLogic 8300 series - supported since the October 21, 2013 software release
- QLogic NetXtreme II 57800 series - Broadcom is selling its NetXtreme II line of 10GbE controllers and adapters to QLogic

Throughput using VXLAN Offload NIC

○ Throughput between VMs on 4 different physical hosts (2 senders, 2 receivers)
○ It can consume 98% of the total physical bandwidth
- With VXLAN offload and MTU 8950

[Chart: per-VM throughput (0-4 Gbps) and physical-host throughput (0-25 Gbps) vs. number of servers (10 to 38), offload ON/OFF at MTU 1500 and MTU 8950; annotated speed-ups with offload: 3.5-5.6x and 1.3-1.4x]

* PHY: VMs' total throughput measured at a physical host

CPU Load

[Charts: CPU usage per Gbps (CPU [%]) vs. number of servers (1-16), offload ON vs. OFF, for the server (Tx) and receiver (Rx) sides; offload reduces CPU per Gbps by 27.1% on Tx and 28.5% on Rx]

VXLAN Offload NIC

○ We could get 1.3-5.5 times the throughput compared to a NIC without offload capability
○ CPU load on a physical host was reduced by 27-28%
○ MTU 8950 showed 1.5-1.6 times better throughput than MTU 1500
- We decided to set MTU 9000 on the physical hosts but deliver MTU 1500 via the DHCP server
- This lets users extend the MTU themselves
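One common way to implement "MTU 9000 on the host, MTU 1500 to guests" is to let the DHCP agent's dnsmasq advertise the interface MTU via DHCP option 26 (a sketch; the file must be referenced from `dnsmasq_config_file` in dhcp_agent.ini):

```
# /etc/neutron/dnsmasq-neutron.conf
# DHCP option 26 = interface MTU advertised to the guest
dhcp-option-force=26,1500
```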


High Availability

24/7 Support

○ You need 10-12 people for 24/7 operation
- 4 groups + α people are required
○ If fixing a problem can be deferred, we only need to work on weekdays
- High availability is the key to achieving this
○ Our design
- Double redundancy for hardware
- Triple redundancy for software ⇒ tolerates double failures

High Availability

○ Load-balancer based: the OpenStack APIs sit behind a redundant LB pair (load balancing, SSL termination, health checks)
○ Others: MySQL (Galera, DB1-DB4 + arbitrator), RabbitMQ, Neutron agents, Zabbix, MAAS (PXE, DNS, DHCP)

[Diagram: Nova VMs and OpenStack APIs behind the LB pair; the Galera cluster, RabbitMQ, Neutron agents, Zabbix and MAAS alongside]

MySQL HA

○ 4 nodes + 1 arbitrator
- The LB pair reads/writes to a single node, selected by priority (DB1: priority 1 ... DB4: priority 4)
- Quorum-based voting with the arbitrator
- Health check:
  - TCP port 3306
  - Cluster status: SHOW STATUS LIKE 'wsrep_ready' must return 'ON'
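The LB's cluster-status probe boils down to parsing the output of `SHOW STATUS LIKE 'wsrep_ready'`. A minimal sketch (the mysql invocation in the comment and the function name are assumptions, not the deck's script):

```shell
# Decide node health from the output of:
#   mysql -h "$node" -N -e "SHOW STATUS LIKE 'wsrep_ready'"
# which prints a line like: wsrep_ready ON
galera_ready() {
    echo "$1" | awk '$1 == "wsrep_ready" { print $2 }'
}

state=$(galera_ready "wsrep_ready ON")
if [ "$state" = "ON" ]; then
    echo "healthy"      # LB keeps the node in the pool
else
    echo "unhealthy"    # LB fails over to the next-priority node
fi
```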

Galera-cluster State Transition

[Diagram: Open → Primary → Joiner → Joined [3] → Synced [4], with Donor [2] during IST and SST; wsrep_ready = 'ON']

○ Checking WSREP_STATUS = 2 and 4 alone can't cover all the states

MySQL HA

○ Node recovery: the health check detects DB1's failure
- TCP port 3306 check and cluster status (SHOW STATUS LIKE 'wsrep_ready')

MySQL HA

○ Node recovery: the designated DB is changed from DB1 to DB2
- Cluster status: wsrep_ready on DB1 changed from 'ON' to 'OFF'

MySQL HA

○ Node recovery: DB1 is restored from DB4 (the lowest-priority node) using IST or SST
- Synchronization: IST = Incremental State Transfer, SST = State Snapshot Transfer

MySQL HA

○ Node recovery: DB1's priority is changed to the lowest before it rejoins the cluster

MySQL HA

○ Node recovery: the cluster is back in the normal state (priorities are now DB2: 1, DB3: 2, DB4: 3, DB1: 4)

Recovery Time

○ Time for IST

[Chart: time for recovery in seconds (0-350) for the JOINER→JOINED and JOINED→SYNCED phases, under 120 TPS and 240 TPS background traffic, measured on two setups (performance max 340 TPS and max 1356 TPS)]

Recovery Time

○ Time for SST

[Chart: time for recovery in seconds (0-2,000) for the JOINER→JOINED and JOINED→SYNCED phases, under 120 TPS and 240 TPS background traffic, measured on two setups (performance max 340 TPS and max 1356 TPS)]

Disaster Recovery

○ Losing all databases (healthy-state DB size: 3 GB)
- Fix the network, restore DB1 from backup and run MySQL, then bring up DB2-DB4 from stand-by and synchronize them via SST from a donor
- Step timings: 11 seconds, 70 seconds, 70 seconds and 98.2 minutes; bin-log recovery took 97.5 minutes for 12 hours of logs

MAAS-HA

○ MAAS includes DNS, DHCP and tftp
○ DNS: master-slave
○ DHCP (ISC DHCP): replication (delivering fixed IP addresses through DHCP)
○ MAAS and tftp: backed up as a VM image on storage, activated on failure

RabbitMQ-HA

○ Add multiple RabbitMQ addresses to the configuration files
- Easy configuration and application-level health monitoring
- At least 3 RabbitMQ hosts (5 ideally) are required to guard against split-brain
○ Read/write to a single node using the load balancer
- No need to worry about split-brain - 3 RabbitMQ hosts
- Network-level health monitoring
- cluster_partition_handling = 'autoheal'
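The partition-handling setting above goes into the broker configuration (classic Erlang-term syntax; 'autoheal' matches the load-balancer design here, while the comparison table later uses 'pause_minority' for the config-file based designs):

```
%% /etc/rabbitmq/rabbitmq.config sketch
[
  {rabbit, [
    {cluster_partition_handling, autoheal}
  ]}
].
```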


Neutron HA

Network Setup

○ DHCP agent
- Supports active-active: assign a virtual network to multiple agents
- dhcp_agents_per_network = 3 (should be <= 3)
○ L3 agent
- Supports only active-standby
- If it fails, we need to migrate its routers to another agent
○ Metadata agent
- Has no state ⇒ just keep metadata-agent running on all nodes

[Diagram: three NW nodes, each running an L3-agent, dhcp-agent and metadata-agent, connected to the compute nodes over the VXLAN data plane, to the neutron server and message queue over the control plane, and to the external network]
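The active-active DHCP setting lives on the server side (a sketch):

```
# neutron.conf (neutron-server)
[DEFAULT]
dhcp_agents_per_network = 3
```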

Monitoring Points

[Diagram: same topology as the previous slide, with four monitoring points:
[1] PING from the internal net, [2] PING from the external net (via the GW router), [3] agent state check via REST API, [4] PING from the control plane]

Health Checks against Failures

○ Data plane connectivity - if it fails, users cannot communicate through routers
- [1] Internal network for VXLAN (ping)
- [2] External network (ping)
○ Network agent health check - L3 agent, DHCP agent
- [3] Agent alive state from the neutron server (REST API agent-list); each neutron agent reports its state via the message queue
○ [4] Control network connectivity (ping) - if it fails, we are no longer able to control the node
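The decision logic implied by checks [1]-[4] and the recovery steps on the following slides can be sketched as a tiny policy function (function name and action strings are hypothetical; the real probes would be pings and `neutron agent-list`):

```shell
# Return the recovery action for one network node, given the four check
# results ("ok" / "ng"): data-plane pings [1][2], agent state [3], c-plane ping [4].
decide_action() {
    internal="$1"; external="$2"; agent="$3"; cplane="$4"
    if [ "$cplane" = "ng" ]; then
        echo "evacuate-and-shutdown"    # [4] failed: we can no longer control the node
    elif [ "$internal" = "ng" ] || [ "$external" = "ng" ] || [ "$agent" = "ng" ]; then
        echo "migrate-routers"          # data plane or agent failure
    else
        echo "none"
    fi
}

decide_action ok ok ok ok      # prints: none
decide_action ok ng ok ok      # prints: migrate-routers
decide_action ok ok ok ng      # prints: evacuate-and-shutdown
```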

Recovery from Failures

[Diagram sequence over the same topology, for a failed NW node:]
(1) Disable the agents on the host
(2) Migrate its networks (N) and routers (R) to the dhcp-agents and L3-agents on the remaining NW nodes
(3) Shutdown the NICs (or the node)

Tips: Checking External Network Connectivity

○ Use a dedicated network namespace on the network node for external connectivity checking
- The network node is reachable from the external network
- Put the probe IP address in an isolated namespace so that the node host itself is not exposed to the public network

[Diagram: on the network node, the external bridge connects ethN, the router namespaces and a dedicated checking namespace holding the probe IP address; PING checks go to the GW router, with no access to the host]

Traffic During Router Migration

○ Throughput from an external node to a VM
○ We injected a control-plane failure and migrated the router to another L3-agent

[Chart: throughput in Mbps (0-1,000) over about 90 seconds of elapsed time; traffic is interrupted for about 10 seconds during the migration]

Router Migration Progress

○ Migrated 88 routers from one L3-agent to two other L3-agents

[Chart: number of routers processed (0-100) vs. elapsed time (up to about 2 min 18 s), for "REST API requested", "REST API processed", "L3-agent processed" and "L3-agent processed (aggregated)"]
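Each router move is a pair of Neutron CLI calls (Icehouse-era `neutron` client commands; the agent IDs and the dry-run wrapper are assumptions for illustration):

```shell
# Evacuation-loop sketch. In dry-run mode it only prints the CLI calls.
DEAD_AGENT="l3-agent-dead-uuid"    # placeholder: failed agent's ID
TARGET_AGENT="l3-agent-live-uuid"  # placeholder: healthy agent's ID
DRY_RUN=1

migrate_router() {
    r="$1"
    if [ "$DRY_RUN" = 1 ]; then
        echo "neutron l3-agent-router-remove $DEAD_AGENT $r"
        echo "neutron l3-agent-router-add $TARGET_AGENT $r"
    else
        neutron l3-agent-router-remove "$DEAD_AGENT" "$r"
        neutron l3-agent-router-add "$TARGET_AGENT" "$r"
    fi
}

for router in router-1 router-2; do
    migrate_router "$router"
done
```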

Possible Improvements

○ Integration with the L3-agent HA feature
- It improves data-plane availability considerably
- HA is supported for internal network failures, but there is no monitoring of external network connectivity yet; this needs to be improved in L3-HA
- Router migration based on control-plane monitoring is still required

[Diagram: three L3-agents, each hosting HA routers (R), attached to the control plane, the VXLAN data plane and the external network]

Possible Improvements

○ Integration with Juno Neutron features
- Using the L3-agent HA feature (previous page)
- Leveraging L3-agent automatic rescheduling
  - Helps us reduce the number of REST API calls
  - Juno Neutron supports L3-agent rescheduling for routers on inactive agents
  - "admin_state" is not considered for rescheduling ← needs to be improved
○ Possible contributions to Neutron upstream
- DHCP agent automatic rescheduling
- LBaaS agent scheduling - there is no way to reassign the LBaaS agent for the HAProxy driver


Management Resources

Management Resources

○ Controller (API): 3
○ Message queue (RabbitMQ): 3 + 2
○ Database (MySQL for OpenStack): 4 + 0.5
○ Neutron servers: 3
○ Monitoring (Zabbix servers + MySQL): 3
○ Storage (log, backup): xxTB
○ Deployment etc. (MAAS, MongoDB): 2

All the remaining servers are Nova compute nodes. (The counts above add up to about 21 hosts, matching the 21 management servers in the test environment.)

Scalability Test

○ We measured VM boot time while booting 0-5,000 instances

[Chart: average elapsed time to boot an instance (0-50 s) and error rate (0-60%) for each 1,000-VM bucket from 0-1000 up to 4000-5000]

Database Size - Zabbix

Per-record size and retention:
- History: 50 bytes, kept 30 days
- Trend: 128 bytes, kept 90 days
- Event: 130 bytes, kept 90 days

Measured item counts and sizes (86 GB in total):

                  Servers   Switch (per port)   Tempest   Health Check   Usage / System Check
Item number       69        557                 1         24             500
Size (history)    15 GB     40 GB               687 MB    5 GB           108 MB
Size (trend)      2 GB      15 GB               88 MB     2 GB           138 MB
Size (event)*     1 GB      1 GB                1 GB      1 GB           1 GB
Total size        18 GB     57 GB               2 GB      8 GB           1 GB

* Assuming 1 event/second. Health checks run every 30 seconds, usage checks every 180 seconds.
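The table follows from standard Zabbix capacity arithmetic: stored bytes ≈ items × samples per day × retention days × bytes per value. A sketch (the example numbers are illustrative, not the deck's):

```shell
# Generic Zabbix history sizing.
history_bytes() {
    # $1 items, $2 sampling interval [s], $3 retention [days], $4 bytes per value
    awk -v i="$1" -v s="$2" -v d="$3" -v b="$4" \
        'BEGIN { printf "%d\n", i * (86400 / s) * d * b }'
}

# e.g. 100 items sampled every 60 s, kept 30 days, 50 bytes each:
history_bytes 100 60 30 50    # prints 216000000 (~216 MB)
```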

Database Size - OpenStack

                        Sep 14 2014   Sep 25 2014
OpenStack related
  Keystone*             1.4 GB        1.4 GB
  Nova (28k -> 55k)     451 MB        856 MB
  Neutron (7k -> 9k)    78 MB         235 MB
  Glance                64 MB         89 MB
  Heat                  45 MB         55 MB
  Cinder                39 MB         43 MB
  Sub total             2.1 GB        2.7 GB
MySQL related
  Transaction log       4.1 GB        4.1 GB
  ibdata1               268 MB        268 MB
Total size              6.4 GB        7.0 GB

* We ran "keystone-manage token_flush" every hour
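The hourly token flush in the footnote can be wired up with a one-line cron entry (the file path, user and binary path are assumptions):

```
# /etc/cron.d/keystone-token-flush: prune expired Keystone tokens hourly
0 * * * * keystone /usr/bin/keystone-manage token_flush
```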

Deployment Tools

○ We can change the configuration easily (e.g. HA and Neutron)
○ We can use Ansible for both deployment and operation

MySQL HA:
- DOCOMO (Ansible based): LB + Percona
- Mirantis Fuel: haproxy + corosync + pacemaker + Galera
- HP Helion: haproxy + keepalived + Galera
- Canonical Juju/MAAS: haproxy + corosync + pacemaker + Percona

RabbitMQ HA:
- DOCOMO: config-file based (pause_minority)
- Mirantis Fuel: RabbitMQ cluster (autoheal) + LB
- HP Helion: config-file based (pause_minority)
- Canonical Juju/MAAS: config-file based (ignore)

LB HA:
- DOCOMO: commercial products
- Mirantis Fuel: haproxy (nameserver) + corosync + pacemaker
- HP Helion: haproxy + keepalived
- Canonical Juju/MAAS: haproxy + corosync + pacemaker

Network:
- DOCOMO: Neutron + own HA
- Mirantis Fuel: Neutron
- HP Helion: Neutron DVR
- Canonical Juju/MAAS: Neutron

Tips Learned from Scalability Tests

○ Default security group
- An iptables entry is added to/deleted from all VMs whenever you create/delete a VM ⇒ the ovs-agent became busy when we created many VMs
○ Number of Neutron workers
- neutron.conf: api_workers = <number of cores>, rpc_workers = <number of cores>
- metadata_agent.ini: metadata_workers = <number of cores>
○ Number of file descriptors
- Default: 1024
- RabbitMQ: more than 5,000 connections
- metadata-ns-proxy (L3-agent, dhcp-agent): requests x 2
○ VM creation retries
- nova.conf: scheduler_max_attempts = 1 ⇒ no difference between 1 and 3
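Collected into configuration form (the core count of 16 and the limits file are placeholders; check your distribution's service files for where ulimits are actually applied):

```
# neutron.conf
[DEFAULT]
api_workers = 16          # ~ number of cores (placeholder)
rpc_workers = 16

# metadata_agent.ini
[DEFAULT]
metadata_workers = 16

# /etc/security/limits.conf - raise the 1024 default for RabbitMQ
rabbitmq  soft  nofile  65536
rabbitmq  hard  nofile  65536
```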
