openstack on kubernetes · openstack on kubernetes: make openstack and kubernetes fail-safe...

46
OpenStack on Kubernetes: Make OpenStack and Kubernetes Fail-Safe Seungkyu Ahn ([email protected]) Jaesuk Ahn ([email protected]) Open System Lab Network IT Convergence R&D Center SK Telecom Wil Reichert ([email protected]) Solinea

Upload: others

Post on 01-Sep-2019

40 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: OpenStack on Kubernetes · OpenStack on Kubernetes: Make OpenStack and Kubernetes Fail-Safe Seungkyu Ahn (ahnsk@sk.com) Jaesuk Ahn (jay.ahn@sk.com) Open System Lab Network IT Convergence

OpenStack on Kubernetes Make OpenStack and Kubernetes Fail-Safe

Seungkyu Ahn (ahnskskcom)Jaesuk Ahn (jayahnskcom)

Open System LabNetwork IT Convergence RampD Center

SK Telecom

Wil Reichert (wilsolineacom)Solinea

What will happen

bull Introduction

bull KubernetesOpenStack

bull Demo Starts

bull CI

bull Demo Ends

Introducing Our Company SKT bull No 1 Mobile Service Provider in Korea

with 50 market share

bull We has been at the forefront of developing and commercializing new wireless technologies (recently 4G LTE 5band CA with max 700Mbps)

bull We are exploring more than network Especially around AI and Media

bull We actively participate open source project OCP TIP ONOS Ceph OpenStack etc

Solinea bull professional services partner that

accelerates enterprise cloud adoption

bull Technology agnostic always working in the best interest of our clients

bull Our clients are primarily Global Fortune 1000 organizations in multiple industry verticals

This is Totally Community Effort Wil CICD amp K8S

Sungkyu Ahn OpenStack amp K8S

Jaesuk Ahn OpenStack amp K8S

Robert Choi OpenStack amp Automation

Containers

Cloud NativeDan Kim

OpenStack amp K8S

Jawon ChooOpenStack amp Kolla Large Contributing

OpenStack Operator WG

Current (previous) Way SPEC

DEV

TEST

UPSTREAM

Community Code

OpenStack Package

Configuration Management+

Requirements

Deployment Architecture-Network-Storage-Appliance Integration-Configuration Tuning

HardwareAppliancePurchase

Deployment

QA

OpenStack Production

Deployment OperationDevelopment

Triage

Monitoring

Upgrade

Tuning

CapacityMgmt

Scale-out

Analysis

Patch

Deployment Automation

Flexible Configuration

+

Trouble Sh

Previous Product Pain-PointsbullEven Update (patch) is challenging

bullUpgrade - gosh what I can say

bullDeployment issue - snow-flake env vs cattle

bullNot single huge scale OpenStack but many smallmedium OpenStacks

bullLack of flexible configuration management capability in ldquostandardized mannerrdquo

bullVery difficult to integrate with our own stuffs (Ceph SDN Controller Datacenter Operation Platform etc)

More to Do

SPEC

DEV

TEST

UPSTREAM

Community Code

OpenStack Package

Configuration Management+

Requirements

Deployment Architecture-Network-Storage-Appliance Integration-Configuration Tuning

HardwareAppliancePurchase

Deployment

QA

OpenStack Production

Deployment OperationDevelopment

Triage

Monitoring

Upgrade

Tuning

CapacityMgmt

Scale-out

Analysis

Patch

Deployment Automation

Flexible Configuration

+

Trouble Sh

TEST

Continuous Loop SPEC

DEV

TEST

UPSTREAM

Community Code

OpenStack Package

Configuration Management+

Requirements

Deployment Architecture-Network-Storage-Appliance Integration-Configuration Tuning

HardwareAppliancePurchase

Deployment

QA

OpenStack Production

Deployment OperationDevelopment

Triage

Monitoring

Upgrade

Tuning

CapacityMgmt

Scale-out

Analysis

Patch

Deployment Automation

Flexible Configuration

+

Trouble Sh

Why OpenStack on Kubernetes

Better way to deliver OpenStack and manage its Lifecycle

bullReducing Overhead Dependency Management

bull ldquoEasy and Fastrdquo Multiple Deployment in ldquoStandardizedrdquo way

bullUpgradeUpdateRollback

bullEasy Scaling and Healing

Key TechnologiesbullKubernetes (Control Plance Orchestration)

bullHelm (Application Lifecycle Management Automation)

bullCICD Pipeline leveraging Jenkins

bullOpenStack-helm (Managing OpenStack on Kubernetes)

bullKolla (Containerizing OpenStack)

bullONOSSONA (OpenStack Network Management)

bullCeph (Storage)

Our Plan

bullProduction-Ready by the end of 2017

bullFirst Production within 2017 (IT Infrastructure)

bullExpanding to more deployment (Media NFV) in 2018 amp Putting more APPs on this ldquoStreamlinerdquo

Overall Architecture

Todayrsquos Demo System

gitHelm Repo Tiller

Helm Cli

Kubectl

JenkinsMaster

JenkinsSlave

KubernetesMaster

KubernetesNode

What is the HA targetKubernetes Master

bull Etcdbull API server (load balance)bull Scheduler (leader election)bull Controller manager (leader election)

OpenStack Controller (Keystone Glance Nova Cinder Neutron)bull API server (load balance)bull Scheduler (Nova Cinder)bull MariaDBbull RabbitMQbull Neutron network node (SONA)

Kubernetes 3-Masters

apiserver

kube-master01 node01

scheduler

controllermanager

etcd

kubeproxy

kubelet

apiserver

kube-master02

scheduler

controllermanager

etcd

flanneld flanneldflanneld

kubelet kubelet

apiserver

kube-master03

scheduler

controllermanager

etcd

flanneld

kubelet

kubeproxy

kubelet

flanneld

node00

ceph

kubeproxy

kubeproxy

kubeproxy

KubeletKUBELET_OPTS=--kubeconfig=etckuberneteskubeletconf --require-kubeconfig=true --hostname-override=kube-master01 --logtostderr=false --log-dir=varlogkubernetes --pod-manifest-path=etckubernetesmanifests --allow-privileged=true --v=0 --register-schedulable=false --cluster-dns=1096010 --cluster-domain=clusterlocal

Kubelet

flanneld

etcd yaml --name kube-master01 --initial-advertise-peer-urls http19216830132380 --listen-peer-urls http19216830132380 --advertise-client-urls http19216830134001 --listen-client-urls http19216830132379 http1270012379http19216830134001 http1270014001 --data-dir varetcddata --initial-cluster-token 5d3903915c2cda30174970d784075f0a --initial-cluster kube-master01=http19216830132380 kube-master02=http19216830142380 kube-master03=http19216830152380 --initial-cluster-state new

Kubelet

flanneld

etcd

kube-apiserver yaml - usrlocalbinkube-apiserver --etcd-servers=http1270012379 --storage-backend=etcd3 --insecure-bind-address=127001 --insecure-port=8080 --secure-port=6443 --admission-control=NamespaceLifecycleLimitRangerServiceAccount PersistentVolumeLabelDefaultStorageClassResourceQuota --service-cluster-ip-range=10960016 --tls-cert-file=etckubernetespkiapiservercrt --tls-private-key-file=etckubernetespkiapiserver-keypem --token-auth-file=etckubernetespkikube-token --service-account-key-file=etckubernetespkiapiserver-keypem --allow-privileged --anonymous-auth=false

Kubelet

flanneld

etcd

kube-apiserver

kube-controller-manager yaml - kube-controller-manager --master=1270018080 --cluster-cidr=172160016 --cluster-name=kubernetes --allocate-node-cidrs=true --service-account-private-key-file=etckubernetespkiapiserver-keypem --root-ca-file=etckubernetespkicacrt --cluster-signing-cert-file=etckubernetespkicacrt --cluster-signing-key-file=etckubernetespkica-keypem --v=0 --leader-elect=true Kubelet

flanneld

controllermanager

etcd

kube-apiserver

kube-scheduler yaml - usrlocalbinkube-scheduler --master=1270018080 --v=0 --leader-elect=true

Kubelet

flanneld

controllermanager

scheduler

etcd

kube-apiserver

kube-proxy yaml securityContext privileged true command - binsh - -c - usrlocalbinkube-proxy --kubeconfig=runkubeconfig --cluster-cidr=10960016 --v=0

Kubelet

flanneld

controllermanager

scheduler

etcd

kube-apiserver

kube-proxy

OpenStack Controller amp Compute

kubernetes worker-nodes(label controller)

kubernetes worker-nodes (label compute)

MariaDB

RabbitMQ

Keystone

Glance

Cinder

NOVA

Neutron

NOVA

Neutron

VM

VM

OpenStack Controller amp Compute

kubernetes worker-nodes(label controller)

kubernetes worker-nodes (label compute)

OpenStack Process 1 (nova-api)

NOVA

Neutron

VM

VMOpenStack Process 2 (nova-api)

OpenStack Process 3 (nova-api)

OpenStack Controller amp Compute

kubernetes worker-nodes(label controller)

kubernetes worker-nodes (label compute)

NOVA

Neutron

VM

VMOpenStack Process 1 (nova-api)

OpenStack Process 2 (nova-api)

OpenStack Process 3 (nova-api)

OpenStack Process 3 (nova-api)

Database clustering (3 node)jobsmariadb-seedpomariadb-0pomariadb-1pomariadb-2

joiner (3th mariadb)--wsrep_cluster_address=gcomm172165671721675517216815

Neutron network (1 nic)Network

br-ex

br-data

eth0addif

veth1

addif

veth0

link add (veth)

add-port

br-int

1921683033

patch

phy-br-ex

int-br-exqr-xxxqg-xxx

br-tun

patch-int

patch-tun

patch

ovs internal iface

ovs patchqrouter-xxx

vxlan-xxx

ovs vxlan

linux virtual iface

tapxxx

qdhcp-xxx

namespacelocal ip remote ip vni flow

qbrxxxtabxxx

VM

qvbxxx

br-intpatch-intqvoxxx

br-tunvxlan-xxx patch-tun

eth0

br-exphy-br-ex

int-br-ex

br-dataveth1

veth0

1921683034

Network Node Compute Node

OpenStack-Helm Neutron charthellipnetwork interface external veth0 default br-data ml2 agent tunnel_types vxlan type_drivers - flat - vxlan ovs auto_bridge_add null bridge_mappings null neutron default l3_ha_network_type vxlan debug Truersquohellip

Simplified Overlay Network Architecture

Neutron

Sets SwitchingRouting flow rules

Compute-01 Compute-02

nova-compute

br-int (OVS)

Gateway Node Group

nova-compute br-router (OVS)

QuaggaHypervisor Hypervisor

Controls external connectivity

BGP OSPF Multipath

br-int (OVS) br-int (OVS)

Nova OpenStackNetworkingProxies ARP DHCP

vRouter

Sets NAT flow rulesProvisions virtual machine

East-West Traffic North-South Traffic

OpenStack SONA(ONOS)

VXLAN tunnel

Live Demo 1

CI Presentation - Wil

VS

Moving parts - some numbersbull Git repositories 34 local 10 upstream

bull Deployment configurations 4

bull Supported Environments 4

bull Charts per deployment 12

bull Unique docker images in each deployment 23

bull Pods amp jobs in a single deployment 34

bull Pods amp jobs in a 3 node HA deployment 85

Kinds of Workflows

bull Build Kolla Container

bull Build Helm Chart

bull Test

bull Deploy upgrade

Kinds of Jobs

bull Commit PR builds - validate internal changes

bull Live system deploys - validate component upgrades

bull Nightly builds - validate merge CI upstream

bull Nightly redeploy - validate clean slate deploys

bull Upstream builds - identify incoming breaking changes

When yoursquove got twice as many virtualization technologies things fall over twice as fast

Managing the Clutter

bull No upstream forks

bull Isolated CI build amp test environments

bull Incremental upgrades

bull Constant rebuilds

Testing starts from the ground upDeployed Kubernetes

bull Kubernetes e2e

bull Custom

bull Heapster

Testing starts from the ground upKolla Container Builds

bull Bats

bull Clair

Testing starts from the ground upHelm charts

bull Helm test

bull Partial tempest runs

bull Built in (eg horizon)

Testing starts from the ground upDeployed Openstack

bull Full tempest runs

bull Rally

Deployment is only the Beginning

bull Manual validations

bull Launch custom application on OpenStack

bull Additional security tests on running containers

bull Performance baselines and thresholds

Failure testing

bull Risk - Many pods are run mdashprivileged amp with host networking

bull Layered solutions present a lot more interesting failure cases

bull HA rules change

bull Openstack HA behaviors mapped onto Kubernetes HA behavior

Upgrades Become Boring

bull All changes can be simulated multiple times prior to live rollout

bull Minor component patches become trivial

bull OpenStack upgrades become predictable

Live Demo 2

Challenges (pitfalls)

bull Operational Burden

bull Limitations

bull OpenStack as Cloud Native Apps Feasible Needed Gaps and Improvement

bull Kubernetes Stability (fast moving projects)

bull Security

THE END

Page 2: OpenStack on Kubernetes · OpenStack on Kubernetes: Make OpenStack and Kubernetes Fail-Safe Seungkyu Ahn (ahnsk@sk.com) Jaesuk Ahn (jay.ahn@sk.com) Open System Lab Network IT Convergence

What will happen

bull Introduction

bull KubernetesOpenStack

bull Demo Starts

bull CI

bull Demo Ends

Introducing Our Company SKT bull No 1 Mobile Service Provider in Korea

with 50 market share

bull We has been at the forefront of developing and commercializing new wireless technologies (recently 4G LTE 5band CA with max 700Mbps)

bull We are exploring more than network Especially around AI and Media

bull We actively participate open source project OCP TIP ONOS Ceph OpenStack etc

Solinea bull professional services partner that

accelerates enterprise cloud adoption

bull Technology agnostic always working in the best interest of our clients

bull Our clients are primarily Global Fortune 1000 organizations in multiple industry verticals

This is Totally Community Effort Wil CICD amp K8S

Sungkyu Ahn OpenStack amp K8S

Jaesuk Ahn OpenStack amp K8S

Robert Choi OpenStack amp Automation

Containers

Cloud NativeDan Kim

OpenStack amp K8S

Jawon ChooOpenStack amp Kolla Large Contributing

OpenStack Operator WG

Current (previous) Way SPEC

DEV

TEST

UPSTREAM

Community Code

OpenStack Package

Configuration Management+

Requirements

Deployment Architecture-Network-Storage-Appliance Integration-Configuration Tuning

HardwareAppliancePurchase

Deployment

QA

OpenStack Production

Deployment OperationDevelopment

Triage

Monitoring

Upgrade

Tuning

CapacityMgmt

Scale-out

Analysis

Patch

Deployment Automation

Flexible Configuration

+

Trouble Sh

Previous Product Pain-PointsbullEven Update (patch) is challenging

bullUpgrade - gosh what I can say

bullDeployment issue - snow-flake env vs cattle

bullNot single huge scale OpenStack but many smallmedium OpenStacks

bullLack of flexible configuration management capability in ldquostandardized mannerrdquo

bullVery difficult to integrate with our own stuffs (Ceph SDN Controller Datacenter Operation Platform etc)

More to Do

SPEC

DEV

TEST

UPSTREAM

Community Code

OpenStack Package

Configuration Management+

Requirements

Deployment Architecture-Network-Storage-Appliance Integration-Configuration Tuning

HardwareAppliancePurchase

Deployment

QA

OpenStack Production

Deployment OperationDevelopment

Triage

Monitoring

Upgrade

Tuning

CapacityMgmt

Scale-out

Analysis

Patch

Deployment Automation

Flexible Configuration

+

Trouble Sh

TEST

Continuous Loop SPEC

DEV

TEST

UPSTREAM

Community Code

OpenStack Package

Configuration Management+

Requirements

Deployment Architecture-Network-Storage-Appliance Integration-Configuration Tuning

HardwareAppliancePurchase

Deployment

QA

OpenStack Production

Deployment OperationDevelopment

Triage

Monitoring

Upgrade

Tuning

CapacityMgmt

Scale-out

Analysis

Patch

Deployment Automation

Flexible Configuration

+

Trouble Sh

Why OpenStack on Kubernetes

Better way to deliver OpenStack and manage its Lifecycle

bullReducing Overhead Dependency Management

bull ldquoEasy and Fastrdquo Multiple Deployment in ldquoStandardizedrdquo way

bullUpgradeUpdateRollback

bullEasy Scaling and Healing

Key TechnologiesbullKubernetes (Control Plance Orchestration)

bullHelm (Application Lifecycle Management Automation)

bullCICD Pipeline leveraging Jenkins

bullOpenStack-helm (Managing OpenStack on Kubernetes)

bullKolla (Containerizing OpenStack)

bullONOSSONA (OpenStack Network Management)

bullCeph (Storage)

Our Plan

bullProduction-Ready by the end of 2017

bullFirst Production within 2017 (IT Infrastructure)

bullExpanding to more deployment (Media NFV) in 2018 amp Putting more APPs on this ldquoStreamlinerdquo

Overall Architecture

Todayrsquos Demo System

gitHelm Repo Tiller

Helm Cli

Kubectl

JenkinsMaster

JenkinsSlave

KubernetesMaster

KubernetesNode

What is the HA targetKubernetes Master

bull Etcdbull API server (load balance)bull Scheduler (leader election)bull Controller manager (leader election)

OpenStack Controller (Keystone Glance Nova Cinder Neutron)bull API server (load balance)bull Scheduler (Nova Cinder)bull MariaDBbull RabbitMQbull Neutron network node (SONA)

Kubernetes 3-Masters

apiserver

kube-master01 node01

scheduler

controllermanager

etcd

kubeproxy

kubelet

apiserver

kube-master02

scheduler

controllermanager

etcd

flanneld flanneldflanneld

kubelet kubelet

apiserver

kube-master03

scheduler

controllermanager

etcd

flanneld

kubelet

kubeproxy

kubelet

flanneld

node00

ceph

kubeproxy

kubeproxy

kubeproxy

KubeletKUBELET_OPTS=--kubeconfig=etckuberneteskubeletconf --require-kubeconfig=true --hostname-override=kube-master01 --logtostderr=false --log-dir=varlogkubernetes --pod-manifest-path=etckubernetesmanifests --allow-privileged=true --v=0 --register-schedulable=false --cluster-dns=1096010 --cluster-domain=clusterlocal

Kubelet

flanneld

etcd yaml --name kube-master01 --initial-advertise-peer-urls http19216830132380 --listen-peer-urls http19216830132380 --advertise-client-urls http19216830134001 --listen-client-urls http19216830132379 http1270012379http19216830134001 http1270014001 --data-dir varetcddata --initial-cluster-token 5d3903915c2cda30174970d784075f0a --initial-cluster kube-master01=http19216830132380 kube-master02=http19216830142380 kube-master03=http19216830152380 --initial-cluster-state new

Kubelet

flanneld

etcd

kube-apiserver yaml - usrlocalbinkube-apiserver --etcd-servers=http1270012379 --storage-backend=etcd3 --insecure-bind-address=127001 --insecure-port=8080 --secure-port=6443 --admission-control=NamespaceLifecycleLimitRangerServiceAccount PersistentVolumeLabelDefaultStorageClassResourceQuota --service-cluster-ip-range=10960016 --tls-cert-file=etckubernetespkiapiservercrt --tls-private-key-file=etckubernetespkiapiserver-keypem --token-auth-file=etckubernetespkikube-token --service-account-key-file=etckubernetespkiapiserver-keypem --allow-privileged --anonymous-auth=false

Kubelet

flanneld

etcd

kube-apiserver

kube-controller-manager yaml - kube-controller-manager --master=1270018080 --cluster-cidr=172160016 --cluster-name=kubernetes --allocate-node-cidrs=true --service-account-private-key-file=etckubernetespkiapiserver-keypem --root-ca-file=etckubernetespkicacrt --cluster-signing-cert-file=etckubernetespkicacrt --cluster-signing-key-file=etckubernetespkica-keypem --v=0 --leader-elect=true Kubelet

flanneld

controllermanager

etcd

kube-apiserver

kube-scheduler yaml - usrlocalbinkube-scheduler --master=1270018080 --v=0 --leader-elect=true

Kubelet

flanneld

controllermanager

scheduler

etcd

kube-apiserver

kube-proxy yaml securityContext privileged true command - binsh - -c - usrlocalbinkube-proxy --kubeconfig=runkubeconfig --cluster-cidr=10960016 --v=0

Kubelet

flanneld

controllermanager

scheduler

etcd

kube-apiserver

kube-proxy

OpenStack Controller amp Compute

kubernetes worker-nodes(label controller)

kubernetes worker-nodes (label compute)

MariaDB

RabbitMQ

Keystone

Glance

Cinder

NOVA

Neutron

NOVA

Neutron

VM

VM

OpenStack Controller amp Compute

kubernetes worker-nodes(label controller)

kubernetes worker-nodes (label compute)

OpenStack Process 1 (nova-api)

NOVA

Neutron

VM

VMOpenStack Process 2 (nova-api)

OpenStack Process 3 (nova-api)

OpenStack Controller amp Compute

kubernetes worker-nodes(label controller)

kubernetes worker-nodes (label compute)

NOVA

Neutron

VM

VMOpenStack Process 1 (nova-api)

OpenStack Process 2 (nova-api)

OpenStack Process 3 (nova-api)

OpenStack Process 3 (nova-api)

Database clustering (3 node)jobsmariadb-seedpomariadb-0pomariadb-1pomariadb-2

joiner (3th mariadb)--wsrep_cluster_address=gcomm172165671721675517216815

Neutron network (1 nic)Network

br-ex

br-data

eth0addif

veth1

addif

veth0

link add (veth)

add-port

br-int

1921683033

patch

phy-br-ex

int-br-exqr-xxxqg-xxx

br-tun

patch-int

patch-tun

patch

ovs internal iface

ovs patchqrouter-xxx

vxlan-xxx

ovs vxlan

linux virtual iface

tapxxx

qdhcp-xxx

namespacelocal ip remote ip vni flow

qbrxxxtabxxx

VM

qvbxxx

br-intpatch-intqvoxxx

br-tunvxlan-xxx patch-tun

eth0

br-exphy-br-ex

int-br-ex

br-dataveth1

veth0

1921683034

Network Node Compute Node

OpenStack-Helm Neutron charthellipnetwork interface external veth0 default br-data ml2 agent tunnel_types vxlan type_drivers - flat - vxlan ovs auto_bridge_add null bridge_mappings null neutron default l3_ha_network_type vxlan debug Truersquohellip

Simplified Overlay Network Architecture

Neutron

Sets SwitchingRouting flow rules

Compute-01 Compute-02

nova-compute

br-int (OVS)

Gateway Node Group

nova-compute br-router (OVS)

QuaggaHypervisor Hypervisor

Controls external connectivity

BGP OSPF Multipath

br-int (OVS) br-int (OVS)

Nova OpenStackNetworkingProxies ARP DHCP

vRouter

Sets NAT flow rulesProvisions virtual machine

East-West Traffic North-South Traffic

OpenStack SONA(ONOS)

VXLAN tunnel

Live Demo 1

CI Presentation - Wil

VS

Moving parts - some numbersbull Git repositories 34 local 10 upstream

bull Deployment configurations 4

bull Supported Environments 4

bull Charts per deployment 12

bull Unique docker images in each deployment 23

bull Pods amp jobs in a single deployment 34

bull Pods amp jobs in a 3 node HA deployment 85

Kinds of Workflows

bull Build Kolla Container

bull Build Helm Chart

bull Test

bull Deploy upgrade

Kinds of Jobs

bull Commit PR builds - validate internal changes

bull Live system deploys - validate component upgrades

bull Nightly builds - validate merge CI upstream

bull Nightly redeploy - validate clean slate deploys

bull Upstream builds - identify incoming breaking changes

When yoursquove got twice as many virtualization technologies things fall over twice as fast

Managing the Clutter

bull No upstream forks

bull Isolated CI build amp test environments

bull Incremental upgrades

bull Constant rebuilds

Testing starts from the ground upDeployed Kubernetes

bull Kubernetes e2e

bull Custom

bull Heapster

Testing starts from the ground upKolla Container Builds

bull Bats

bull Clair

Testing starts from the ground upHelm charts

bull Helm test

bull Partial tempest runs

bull Built in (eg horizon)

Testing starts from the ground upDeployed Openstack

bull Full tempest runs

bull Rally

Deployment is only the Beginning

bull Manual validations

bull Launch custom application on OpenStack

bull Additional security tests on running containers

bull Performance baselines and thresholds

Failure testing

bull Risk - Many pods are run mdashprivileged amp with host networking

bull Layered solutions present a lot more interesting failure cases

bull HA rules change

bull Openstack HA behaviors mapped onto Kubernetes HA behavior

Upgrades Become Boring

bull All changes can be simulated multiple times prior to live rollout

bull Minor component patches become trivial

bull OpenStack upgrades become predictable

Live Demo 2

Challenges (pitfalls)

bull Operational Burden

bull Limitations

bull OpenStack as Cloud Native Apps Feasible Needed Gaps and Improvement

bull Kubernetes Stability (fast moving projects)

bull Security

THE END

Page 3: OpenStack on Kubernetes · OpenStack on Kubernetes: Make OpenStack and Kubernetes Fail-Safe Seungkyu Ahn (ahnsk@sk.com) Jaesuk Ahn (jay.ahn@sk.com) Open System Lab Network IT Convergence

Introducing Our Company SKT bull No 1 Mobile Service Provider in Korea

with 50 market share

bull We has been at the forefront of developing and commercializing new wireless technologies (recently 4G LTE 5band CA with max 700Mbps)

bull We are exploring more than network Especially around AI and Media

bull We actively participate open source project OCP TIP ONOS Ceph OpenStack etc

Solinea bull professional services partner that

accelerates enterprise cloud adoption

bull Technology agnostic always working in the best interest of our clients

bull Our clients are primarily Global Fortune 1000 organizations in multiple industry verticals

This is Totally Community Effort Wil CICD amp K8S

Sungkyu Ahn OpenStack amp K8S

Jaesuk Ahn OpenStack amp K8S

Robert Choi OpenStack amp Automation

Containers

Cloud NativeDan Kim

OpenStack amp K8S

Jawon ChooOpenStack amp Kolla Large Contributing

OpenStack Operator WG

Current (previous) Way SPEC

DEV

TEST

UPSTREAM

Community Code

OpenStack Package

Configuration Management+

Requirements

Deployment Architecture-Network-Storage-Appliance Integration-Configuration Tuning

HardwareAppliancePurchase

Deployment

QA

OpenStack Production

Deployment OperationDevelopment

Triage

Monitoring

Upgrade

Tuning

CapacityMgmt

Scale-out

Analysis

Patch

Deployment Automation

Flexible Configuration

+

Trouble Sh

Previous Product Pain-PointsbullEven Update (patch) is challenging

bullUpgrade - gosh what I can say

bullDeployment issue - snow-flake env vs cattle

bullNot single huge scale OpenStack but many smallmedium OpenStacks

bullLack of flexible configuration management capability in ldquostandardized mannerrdquo

bullVery difficult to integrate with our own stuffs (Ceph SDN Controller Datacenter Operation Platform etc)

More to Do

SPEC

DEV

TEST

UPSTREAM

Community Code

OpenStack Package

Configuration Management+

Requirements

Deployment Architecture-Network-Storage-Appliance Integration-Configuration Tuning

HardwareAppliancePurchase

Deployment

QA

OpenStack Production

Deployment OperationDevelopment

Triage

Monitoring

Upgrade

Tuning

CapacityMgmt

Scale-out

Analysis

Patch

Deployment Automation

Flexible Configuration

+

Trouble Sh

TEST

Continuous Loop SPEC

DEV

TEST

UPSTREAM

Community Code

OpenStack Package

Configuration Management+

Requirements

Deployment Architecture-Network-Storage-Appliance Integration-Configuration Tuning

HardwareAppliancePurchase

Deployment

QA

OpenStack Production

Deployment OperationDevelopment

Triage

Monitoring

Upgrade

Tuning

CapacityMgmt

Scale-out

Analysis

Patch

Deployment Automation

Flexible Configuration

+

Trouble Sh

Why OpenStack on Kubernetes

Better way to deliver OpenStack and manage its Lifecycle

bullReducing Overhead Dependency Management

bull ldquoEasy and Fastrdquo Multiple Deployment in ldquoStandardizedrdquo way

bullUpgradeUpdateRollback

bullEasy Scaling and Healing

Key TechnologiesbullKubernetes (Control Plance Orchestration)

bullHelm (Application Lifecycle Management Automation)

bullCICD Pipeline leveraging Jenkins

bullOpenStack-helm (Managing OpenStack on Kubernetes)

bullKolla (Containerizing OpenStack)

bullONOSSONA (OpenStack Network Management)

bullCeph (Storage)

Our Plan

bullProduction-Ready by the end of 2017

bullFirst Production within 2017 (IT Infrastructure)

bullExpanding to more deployment (Media NFV) in 2018 amp Putting more APPs on this ldquoStreamlinerdquo

Overall Architecture

Todayrsquos Demo System

gitHelm Repo Tiller

Helm Cli

Kubectl

JenkinsMaster

JenkinsSlave

KubernetesMaster

KubernetesNode

What is the HA targetKubernetes Master

bull Etcdbull API server (load balance)bull Scheduler (leader election)bull Controller manager (leader election)

OpenStack Controller (Keystone Glance Nova Cinder Neutron)bull API server (load balance)bull Scheduler (Nova Cinder)bull MariaDBbull RabbitMQbull Neutron network node (SONA)

Kubernetes 3-Masters

apiserver

kube-master01 node01

scheduler

controllermanager

etcd

kubeproxy

kubelet

apiserver

kube-master02

scheduler

controllermanager

etcd

flanneld flanneldflanneld

kubelet kubelet

apiserver

kube-master03

scheduler

controllermanager

etcd

flanneld

kubelet

kubeproxy

kubelet

flanneld

node00

ceph

kubeproxy

kubeproxy

kubeproxy

KubeletKUBELET_OPTS=--kubeconfig=etckuberneteskubeletconf --require-kubeconfig=true --hostname-override=kube-master01 --logtostderr=false --log-dir=varlogkubernetes --pod-manifest-path=etckubernetesmanifests --allow-privileged=true --v=0 --register-schedulable=false --cluster-dns=1096010 --cluster-domain=clusterlocal

Kubelet

flanneld

etcd yaml --name kube-master01 --initial-advertise-peer-urls http19216830132380 --listen-peer-urls http19216830132380 --advertise-client-urls http19216830134001 --listen-client-urls http19216830132379 http1270012379http19216830134001 http1270014001 --data-dir varetcddata --initial-cluster-token 5d3903915c2cda30174970d784075f0a --initial-cluster kube-master01=http19216830132380 kube-master02=http19216830142380 kube-master03=http19216830152380 --initial-cluster-state new

Kubelet

flanneld

etcd

kube-apiserver yaml - usrlocalbinkube-apiserver --etcd-servers=http1270012379 --storage-backend=etcd3 --insecure-bind-address=127001 --insecure-port=8080 --secure-port=6443 --admission-control=NamespaceLifecycleLimitRangerServiceAccount PersistentVolumeLabelDefaultStorageClassResourceQuota --service-cluster-ip-range=10960016 --tls-cert-file=etckubernetespkiapiservercrt --tls-private-key-file=etckubernetespkiapiserver-keypem --token-auth-file=etckubernetespkikube-token --service-account-key-file=etckubernetespkiapiserver-keypem --allow-privileged --anonymous-auth=false

Kubelet

flanneld

etcd

kube-apiserver

kube-controller-manager yaml - kube-controller-manager --master=1270018080 --cluster-cidr=172160016 --cluster-name=kubernetes --allocate-node-cidrs=true --service-account-private-key-file=etckubernetespkiapiserver-keypem --root-ca-file=etckubernetespkicacrt --cluster-signing-cert-file=etckubernetespkicacrt --cluster-signing-key-file=etckubernetespkica-keypem --v=0 --leader-elect=true Kubelet

flanneld

controllermanager

etcd

kube-apiserver

kube-scheduler yaml - usrlocalbinkube-scheduler --master=1270018080 --v=0 --leader-elect=true

Kubelet

flanneld

controllermanager

scheduler

etcd

kube-apiserver

kube-proxy yaml securityContext privileged true command - binsh - -c - usrlocalbinkube-proxy --kubeconfig=runkubeconfig --cluster-cidr=10960016 --v=0

Kubelet

flanneld

controllermanager

scheduler

etcd

kube-apiserver

kube-proxy

OpenStack Controller amp Compute

kubernetes worker-nodes(label controller)

kubernetes worker-nodes (label compute)

MariaDB

RabbitMQ

Keystone

Glance

Cinder

NOVA

Neutron

NOVA

Neutron

VM

VM

OpenStack Controller amp Compute

kubernetes worker-nodes(label controller)

kubernetes worker-nodes (label compute)

OpenStack Process 1 (nova-api)

NOVA

Neutron

VM

VMOpenStack Process 2 (nova-api)

OpenStack Process 3 (nova-api)

OpenStack Controller amp Compute

kubernetes worker-nodes(label controller)

kubernetes worker-nodes (label compute)

NOVA

Neutron

VM

VMOpenStack Process 1 (nova-api)

OpenStack Process 2 (nova-api)

OpenStack Process 3 (nova-api)

OpenStack Process 3 (nova-api)

Database clustering (3 node)jobsmariadb-seedpomariadb-0pomariadb-1pomariadb-2

joiner (3th mariadb)--wsrep_cluster_address=gcomm172165671721675517216815

Neutron network (1 nic)Network

br-ex

br-data

eth0addif

veth1

addif

veth0

link add (veth)

add-port

br-int

1921683033

patch

phy-br-ex

int-br-exqr-xxxqg-xxx

br-tun

patch-int

patch-tun

patch

ovs internal iface

ovs patchqrouter-xxx

vxlan-xxx

ovs vxlan

linux virtual iface

tapxxx

qdhcp-xxx

namespacelocal ip remote ip vni flow

qbrxxxtabxxx

VM

qvbxxx

br-intpatch-intqvoxxx

br-tunvxlan-xxx patch-tun

eth0

br-exphy-br-ex

int-br-ex

br-dataveth1

veth0

1921683034

Network Node Compute Node

OpenStack-Helm Neutron charthellipnetwork interface external veth0 default br-data ml2 agent tunnel_types vxlan type_drivers - flat - vxlan ovs auto_bridge_add null bridge_mappings null neutron default l3_ha_network_type vxlan debug Truersquohellip

Simplified Overlay Network Architecture

Neutron

Sets SwitchingRouting flow rules

Compute-01 Compute-02

nova-compute

br-int (OVS)

Gateway Node Group

nova-compute br-router (OVS)

QuaggaHypervisor Hypervisor

Controls external connectivity

BGP OSPF Multipath

br-int (OVS) br-int (OVS)

Nova OpenStackNetworkingProxies ARP DHCP

vRouter

Sets NAT flow rulesProvisions virtual machine

East-West Traffic North-South Traffic

OpenStack SONA(ONOS)

VXLAN tunnel

Live Demo 1

CI Presentation - Wil

VS

Moving parts - some numbersbull Git repositories 34 local 10 upstream

bull Deployment configurations 4

bull Supported Environments 4

bull Charts per deployment 12

bull Unique docker images in each deployment 23

bull Pods amp jobs in a single deployment 34

bull Pods amp jobs in a 3 node HA deployment 85

Kinds of Workflows

bull Build Kolla Container

bull Build Helm Chart

bull Test

bull Deploy upgrade

Kinds of Jobs

bull Commit PR builds - validate internal changes

bull Live system deploys - validate component upgrades

bull Nightly builds - validate merge CI upstream

bull Nightly redeploy - validate clean slate deploys

bull Upstream builds - identify incoming breaking changes

When yoursquove got twice as many virtualization technologies things fall over twice as fast

Managing the Clutter

bull No upstream forks

bull Isolated CI build amp test environments

bull Incremental upgrades

bull Constant rebuilds

Testing starts from the ground upDeployed Kubernetes

bull Kubernetes e2e

bull Custom

bull Heapster

Testing starts from the ground upKolla Container Builds

bull Bats

bull Clair

Testing starts from the ground upHelm charts

bull Helm test

bull Partial tempest runs

bull Built in (eg horizon)

Testing starts from the ground upDeployed Openstack

bull Full tempest runs

bull Rally

Deployment is only the Beginning

bull Manual validations

bull Launch custom application on OpenStack

bull Additional security tests on running containers

bull Performance baselines and thresholds

Failure testing

bull Risk - Many pods are run mdashprivileged amp with host networking

bull Layered solutions present a lot more interesting failure cases

bull HA rules change

bull Openstack HA behaviors mapped onto Kubernetes HA behavior

Upgrades Become Boring

bull All changes can be simulated multiple times prior to live rollout

bull Minor component patches become trivial

bull OpenStack upgrades become predictable

Live Demo 2

Challenges (pitfalls)

bull Operational Burden

bull Limitations

bull OpenStack as Cloud Native Apps Feasible Needed Gaps and Improvement

bull Kubernetes Stability (fast moving projects)

bull Security

THE END

Page 4: OpenStack on Kubernetes · OpenStack on Kubernetes: Make OpenStack and Kubernetes Fail-Safe Seungkyu Ahn (ahnsk@sk.com) Jaesuk Ahn (jay.ahn@sk.com) Open System Lab Network IT Convergence

This is Totally Community Effort Wil CICD amp K8S

Sungkyu Ahn OpenStack amp K8S

Jaesuk Ahn OpenStack amp K8S

Robert Choi OpenStack amp Automation

Containers

Cloud NativeDan Kim

OpenStack amp K8S

Jawon ChooOpenStack amp Kolla Large Contributing

OpenStack Operator WG

Current (previous) Way SPEC

DEV

TEST

UPSTREAM

Community Code

OpenStack Package

Configuration Management+

Requirements

Deployment Architecture-Network-Storage-Appliance Integration-Configuration Tuning

HardwareAppliancePurchase

Deployment

QA

OpenStack Production

Deployment OperationDevelopment

Triage

Monitoring

Upgrade

Tuning

CapacityMgmt

Scale-out

Analysis

Patch

Deployment Automation

Flexible Configuration

+

Trouble Sh

Previous Product Pain-PointsbullEven Update (patch) is challenging

bullUpgrade - gosh what I can say

bullDeployment issue - snow-flake env vs cattle

bullNot single huge scale OpenStack but many smallmedium OpenStacks

bullLack of flexible configuration management capability in ldquostandardized mannerrdquo

bullVery difficult to integrate with our own stuffs (Ceph SDN Controller Datacenter Operation Platform etc)

More to Do

SPEC

DEV

TEST

UPSTREAM

Community Code

OpenStack Package

Configuration Management+

Requirements

Deployment Architecture-Network-Storage-Appliance Integration-Configuration Tuning

HardwareAppliancePurchase

Deployment

QA

OpenStack Production

Deployment OperationDevelopment

Triage

Monitoring

Upgrade

Tuning

CapacityMgmt

Scale-out

Analysis

Patch

Deployment Automation

Flexible Configuration

+

Trouble Sh

TEST

Continuous Loop SPEC

DEV

TEST

UPSTREAM

Community Code

OpenStack Package

Configuration Management+

Requirements

Deployment Architecture-Network-Storage-Appliance Integration-Configuration Tuning

HardwareAppliancePurchase

Deployment

QA

OpenStack Production

Deployment OperationDevelopment

Triage

Monitoring

Upgrade

Tuning

CapacityMgmt

Scale-out

Analysis

Patch

Deployment Automation

Flexible Configuration

+

Trouble Sh

Why OpenStack on Kubernetes

Better way to deliver OpenStack and manage its Lifecycle

bullReducing Overhead Dependency Management

bull ldquoEasy and Fastrdquo Multiple Deployment in ldquoStandardizedrdquo way

bullUpgradeUpdateRollback

bullEasy Scaling and Healing

Key TechnologiesbullKubernetes (Control Plance Orchestration)

bullHelm (Application Lifecycle Management Automation)

bullCICD Pipeline leveraging Jenkins

bullOpenStack-helm (Managing OpenStack on Kubernetes)

bullKolla (Containerizing OpenStack)

bullONOSSONA (OpenStack Network Management)

bullCeph (Storage)

Our Plan

bullProduction-Ready by the end of 2017

bullFirst Production within 2017 (IT Infrastructure)

bullExpanding to more deployment (Media NFV) in 2018 amp Putting more APPs on this ldquoStreamlinerdquo

Overall Architecture

Todayrsquos Demo System

gitHelm Repo Tiller

Helm Cli

Kubectl

JenkinsMaster

JenkinsSlave

KubernetesMaster

KubernetesNode

What is the HA targetKubernetes Master

bull Etcdbull API server (load balance)bull Scheduler (leader election)bull Controller manager (leader election)

OpenStack Controller (Keystone Glance Nova Cinder Neutron)bull API server (load balance)bull Scheduler (Nova Cinder)bull MariaDBbull RabbitMQbull Neutron network node (SONA)

Kubernetes 3-Masters

apiserver

kube-master01 node01

scheduler

controllermanager

etcd

kubeproxy

kubelet

apiserver

kube-master02

scheduler

controllermanager

etcd

flanneld flanneldflanneld

kubelet kubelet

apiserver

kube-master03

scheduler

controllermanager

etcd

flanneld

kubelet

kubeproxy

kubelet

flanneld

node00

ceph

kubeproxy

kubeproxy

kubeproxy

KubeletKUBELET_OPTS=--kubeconfig=etckuberneteskubeletconf --require-kubeconfig=true --hostname-override=kube-master01 --logtostderr=false --log-dir=varlogkubernetes --pod-manifest-path=etckubernetesmanifests --allow-privileged=true --v=0 --register-schedulable=false --cluster-dns=1096010 --cluster-domain=clusterlocal

Kubelet

flanneld

etcd yaml --name kube-master01 --initial-advertise-peer-urls http19216830132380 --listen-peer-urls http19216830132380 --advertise-client-urls http19216830134001 --listen-client-urls http19216830132379 http1270012379http19216830134001 http1270014001 --data-dir varetcddata --initial-cluster-token 5d3903915c2cda30174970d784075f0a --initial-cluster kube-master01=http19216830132380 kube-master02=http19216830142380 kube-master03=http19216830152380 --initial-cluster-state new

Kubelet

flanneld

etcd

kube-apiserver yaml - usrlocalbinkube-apiserver --etcd-servers=http1270012379 --storage-backend=etcd3 --insecure-bind-address=127001 --insecure-port=8080 --secure-port=6443 --admission-control=NamespaceLifecycleLimitRangerServiceAccount PersistentVolumeLabelDefaultStorageClassResourceQuota --service-cluster-ip-range=10960016 --tls-cert-file=etckubernetespkiapiservercrt --tls-private-key-file=etckubernetespkiapiserver-keypem --token-auth-file=etckubernetespkikube-token --service-account-key-file=etckubernetespkiapiserver-keypem --allow-privileged --anonymous-auth=false

Kubelet

flanneld

etcd

kube-apiserver

kube-controller-manager yaml - kube-controller-manager --master=1270018080 --cluster-cidr=172160016 --cluster-name=kubernetes --allocate-node-cidrs=true --service-account-private-key-file=etckubernetespkiapiserver-keypem --root-ca-file=etckubernetespkicacrt --cluster-signing-cert-file=etckubernetespkicacrt --cluster-signing-key-file=etckubernetespkica-keypem --v=0 --leader-elect=true Kubelet

flanneld

controllermanager

etcd

kube-apiserver

kube-scheduler yaml - usrlocalbinkube-scheduler --master=1270018080 --v=0 --leader-elect=true

Kubelet

flanneld

controllermanager

scheduler

etcd

kube-apiserver

kube-proxy yaml securityContext privileged true command - binsh - -c - usrlocalbinkube-proxy --kubeconfig=runkubeconfig --cluster-cidr=10960016 --v=0

Kubelet

flanneld

controllermanager

scheduler

etcd

kube-apiserver

kube-proxy

OpenStack Controller amp Compute

kubernetes worker-nodes(label controller)

kubernetes worker-nodes (label compute)

MariaDB

RabbitMQ

Keystone

Glance

Cinder

NOVA

Neutron

NOVA

Neutron

VM

VM

OpenStack Controller amp Compute

kubernetes worker-nodes(label controller)

kubernetes worker-nodes (label compute)

OpenStack Process 1 (nova-api)

NOVA

Neutron

VM

VMOpenStack Process 2 (nova-api)

OpenStack Process 3 (nova-api)

OpenStack Controller amp Compute

kubernetes worker-nodes(label controller)

kubernetes worker-nodes (label compute)

NOVA

Neutron

VM

VMOpenStack Process 1 (nova-api)

OpenStack Process 2 (nova-api)

OpenStack Process 3 (nova-api)

OpenStack Process 3 (nova-api)

Database clustering (3 node)jobsmariadb-seedpomariadb-0pomariadb-1pomariadb-2

joiner (3th mariadb)--wsrep_cluster_address=gcomm172165671721675517216815

Neutron network (1 nic)Network

br-ex

br-data

eth0addif

veth1

addif

veth0

link add (veth)

add-port

br-int

1921683033

patch

phy-br-ex

int-br-exqr-xxxqg-xxx

br-tun

patch-int

patch-tun

patch

ovs internal iface

ovs patchqrouter-xxx

vxlan-xxx

ovs vxlan

linux virtual iface

tapxxx

qdhcp-xxx

namespacelocal ip remote ip vni flow

qbrxxxtabxxx

VM

qvbxxx

br-intpatch-intqvoxxx

br-tunvxlan-xxx patch-tun

eth0

br-exphy-br-ex

int-br-ex

br-dataveth1

veth0

1921683034

Network Node Compute Node

OpenStack-Helm Neutron charthellipnetwork interface external veth0 default br-data ml2 agent tunnel_types vxlan type_drivers - flat - vxlan ovs auto_bridge_add null bridge_mappings null neutron default l3_ha_network_type vxlan debug Truersquohellip

Simplified Overlay Network Architecture

Neutron

Sets SwitchingRouting flow rules

Compute-01 Compute-02

nova-compute

br-int (OVS)

Gateway Node Group

nova-compute br-router (OVS)

QuaggaHypervisor Hypervisor

Controls external connectivity

BGP OSPF Multipath

br-int (OVS) br-int (OVS)

Nova OpenStackNetworkingProxies ARP DHCP

vRouter

Sets NAT flow rulesProvisions virtual machine

East-West Traffic North-South Traffic

OpenStack SONA(ONOS)

VXLAN tunnel

Live Demo 1

CI Presentation - Wil

VS

Moving parts - some numbersbull Git repositories 34 local 10 upstream

bull Deployment configurations 4

bull Supported Environments 4

bull Charts per deployment 12

bull Unique docker images in each deployment 23

bull Pods amp jobs in a single deployment 34

bull Pods amp jobs in a 3 node HA deployment 85

Kinds of Workflows

bull Build Kolla Container

bull Build Helm Chart

bull Test

bull Deploy upgrade

Kinds of Jobs

bull Commit PR builds - validate internal changes

bull Live system deploys - validate component upgrades

bull Nightly builds - validate merge CI upstream

bull Nightly redeploy - validate clean slate deploys

bull Upstream builds - identify incoming breaking changes

When yoursquove got twice as many virtualization technologies things fall over twice as fast

Managing the Clutter

bull No upstream forks

bull Isolated CI build amp test environments

bull Incremental upgrades

bull Constant rebuilds

Testing starts from the ground upDeployed Kubernetes

bull Kubernetes e2e

bull Custom

bull Heapster

Testing starts from the ground upKolla Container Builds

bull Bats

bull Clair

Testing starts from the ground upHelm charts

bull Helm test

bull Partial tempest runs

bull Built in (eg horizon)

Testing starts from the ground upDeployed Openstack

bull Full tempest runs

bull Rally

Deployment is only the Beginning

bull Manual validations

bull Launch custom application on OpenStack

bull Additional security tests on running containers

bull Performance baselines and thresholds

Failure testing

bull Risk - Many pods are run mdashprivileged amp with host networking

bull Layered solutions present a lot more interesting failure cases

bull HA rules change

bull Openstack HA behaviors mapped onto Kubernetes HA behavior

Upgrades Become Boring

bull All changes can be simulated multiple times prior to live rollout

bull Minor component patches become trivial

bull OpenStack upgrades become predictable

Live Demo 2

Challenges (pitfalls)

bull Operational Burden

bull Limitations

bull OpenStack as Cloud Native Apps Feasible Needed Gaps and Improvement

bull Kubernetes Stability (fast moving projects)

bull Security

THE END

Page 5: OpenStack on Kubernetes · OpenStack on Kubernetes: Make OpenStack and Kubernetes Fail-Safe Seungkyu Ahn (ahnsk@sk.com) Jaesuk Ahn (jay.ahn@sk.com) Open System Lab Network IT Convergence

Current (previous) Way SPEC

DEV

TEST

UPSTREAM

Community Code

OpenStack Package

Configuration Management+

Requirements

Deployment Architecture-Network-Storage-Appliance Integration-Configuration Tuning

HardwareAppliancePurchase

Deployment

QA

OpenStack Production

Deployment OperationDevelopment

Triage

Monitoring

Upgrade

Tuning

CapacityMgmt

Scale-out

Analysis

Patch

Deployment Automation

Flexible Configuration

+

Trouble Sh

Previous Product Pain-PointsbullEven Update (patch) is challenging

bullUpgrade - gosh what I can say

bullDeployment issue - snow-flake env vs cattle

bullNot single huge scale OpenStack but many smallmedium OpenStacks

bullLack of flexible configuration management capability in ldquostandardized mannerrdquo

bullVery difficult to integrate with our own stuffs (Ceph SDN Controller Datacenter Operation Platform etc)

More to Do

SPEC

DEV

TEST

UPSTREAM

Community Code

OpenStack Package

Configuration Management+

Requirements

Deployment Architecture-Network-Storage-Appliance Integration-Configuration Tuning

HardwareAppliancePurchase

Deployment

QA

OpenStack Production

Deployment OperationDevelopment

Triage

Monitoring

Upgrade

Tuning

CapacityMgmt

Scale-out

Analysis

Patch

Deployment Automation

Flexible Configuration

+

Trouble Sh

TEST

Continuous Loop SPEC

DEV

TEST

UPSTREAM

Community Code

OpenStack Package

Configuration Management+

Requirements

Deployment Architecture-Network-Storage-Appliance Integration-Configuration Tuning

HardwareAppliancePurchase

Deployment

QA

OpenStack Production

Deployment OperationDevelopment

Triage

Monitoring

Upgrade

Tuning

CapacityMgmt

Scale-out

Analysis

Patch

Deployment Automation

Flexible Configuration

+

Trouble Sh

Why OpenStack on Kubernetes

Better way to deliver OpenStack and manage its Lifecycle

bullReducing Overhead Dependency Management

bull ldquoEasy and Fastrdquo Multiple Deployment in ldquoStandardizedrdquo way

bullUpgradeUpdateRollback

bullEasy Scaling and Healing

Key TechnologiesbullKubernetes (Control Plance Orchestration)

bullHelm (Application Lifecycle Management Automation)

bullCICD Pipeline leveraging Jenkins

bullOpenStack-helm (Managing OpenStack on Kubernetes)

bullKolla (Containerizing OpenStack)

bullONOSSONA (OpenStack Network Management)

bullCeph (Storage)

Our Plan

bullProduction-Ready by the end of 2017

bullFirst Production within 2017 (IT Infrastructure)

bullExpanding to more deployment (Media NFV) in 2018 amp Putting more APPs on this ldquoStreamlinerdquo

Overall Architecture

Todayrsquos Demo System

gitHelm Repo Tiller

Helm Cli

Kubectl

JenkinsMaster

JenkinsSlave

KubernetesMaster

KubernetesNode

What is the HA targetKubernetes Master

bull Etcdbull API server (load balance)bull Scheduler (leader election)bull Controller manager (leader election)

OpenStack Controller (Keystone Glance Nova Cinder Neutron)bull API server (load balance)bull Scheduler (Nova Cinder)bull MariaDBbull RabbitMQbull Neutron network node (SONA)

Kubernetes 3-Masters

apiserver

kube-master01 node01

scheduler

controllermanager

etcd

kubeproxy

kubelet

apiserver

kube-master02

scheduler

controllermanager

etcd

flanneld flanneldflanneld

kubelet kubelet

apiserver

kube-master03

scheduler

controllermanager

etcd

flanneld

kubelet

kubeproxy

kubelet

flanneld

node00

ceph

kubeproxy

kubeproxy

kubeproxy

KubeletKUBELET_OPTS=--kubeconfig=etckuberneteskubeletconf --require-kubeconfig=true --hostname-override=kube-master01 --logtostderr=false --log-dir=varlogkubernetes --pod-manifest-path=etckubernetesmanifests --allow-privileged=true --v=0 --register-schedulable=false --cluster-dns=1096010 --cluster-domain=clusterlocal

Kubelet

flanneld

etcd yaml --name kube-master01 --initial-advertise-peer-urls http19216830132380 --listen-peer-urls http19216830132380 --advertise-client-urls http19216830134001 --listen-client-urls http19216830132379 http1270012379http19216830134001 http1270014001 --data-dir varetcddata --initial-cluster-token 5d3903915c2cda30174970d784075f0a --initial-cluster kube-master01=http19216830132380 kube-master02=http19216830142380 kube-master03=http19216830152380 --initial-cluster-state new

Kubelet

flanneld

etcd

kube-apiserver yaml - usrlocalbinkube-apiserver --etcd-servers=http1270012379 --storage-backend=etcd3 --insecure-bind-address=127001 --insecure-port=8080 --secure-port=6443 --admission-control=NamespaceLifecycleLimitRangerServiceAccount PersistentVolumeLabelDefaultStorageClassResourceQuota --service-cluster-ip-range=10960016 --tls-cert-file=etckubernetespkiapiservercrt --tls-private-key-file=etckubernetespkiapiserver-keypem --token-auth-file=etckubernetespkikube-token --service-account-key-file=etckubernetespkiapiserver-keypem --allow-privileged --anonymous-auth=false

Kubelet

flanneld

etcd

kube-apiserver

kube-controller-manager yaml - kube-controller-manager --master=1270018080 --cluster-cidr=172160016 --cluster-name=kubernetes --allocate-node-cidrs=true --service-account-private-key-file=etckubernetespkiapiserver-keypem --root-ca-file=etckubernetespkicacrt --cluster-signing-cert-file=etckubernetespkicacrt --cluster-signing-key-file=etckubernetespkica-keypem --v=0 --leader-elect=true Kubelet

flanneld

controllermanager

etcd

kube-apiserver

kube-scheduler yaml - usrlocalbinkube-scheduler --master=1270018080 --v=0 --leader-elect=true

Kubelet

flanneld

controllermanager

scheduler

etcd

kube-apiserver

kube-proxy yaml securityContext privileged true command - binsh - -c - usrlocalbinkube-proxy --kubeconfig=runkubeconfig --cluster-cidr=10960016 --v=0

Kubelet

flanneld

controllermanager

scheduler

etcd

kube-apiserver

kube-proxy

OpenStack Controller amp Compute

kubernetes worker-nodes(label controller)

kubernetes worker-nodes (label compute)

MariaDB

RabbitMQ

Keystone

Glance

Cinder

NOVA

Neutron

NOVA

Neutron

VM

VM

OpenStack Controller amp Compute

kubernetes worker-nodes(label controller)

kubernetes worker-nodes (label compute)

OpenStack Process 1 (nova-api)

NOVA

Neutron

VM

VMOpenStack Process 2 (nova-api)

OpenStack Process 3 (nova-api)

OpenStack Controller amp Compute

kubernetes worker-nodes(label controller)

kubernetes worker-nodes (label compute)

NOVA

Neutron

VM

VMOpenStack Process 1 (nova-api)

OpenStack Process 2 (nova-api)

OpenStack Process 3 (nova-api)

OpenStack Process 3 (nova-api)

Database clustering (3 node)jobsmariadb-seedpomariadb-0pomariadb-1pomariadb-2

joiner (3th mariadb)--wsrep_cluster_address=gcomm172165671721675517216815

Neutron network (1 nic)Network

br-ex

br-data

eth0addif

veth1

addif

veth0

link add (veth)

add-port

br-int

1921683033

patch

phy-br-ex

int-br-exqr-xxxqg-xxx

br-tun

patch-int

patch-tun

patch

ovs internal iface

ovs patchqrouter-xxx

vxlan-xxx

ovs vxlan

linux virtual iface

tapxxx

qdhcp-xxx

namespacelocal ip remote ip vni flow

qbrxxxtabxxx

VM

qvbxxx

br-intpatch-intqvoxxx

br-tunvxlan-xxx patch-tun

eth0

br-exphy-br-ex

int-br-ex

br-dataveth1

veth0

1921683034

Network Node Compute Node

OpenStack-Helm Neutron charthellipnetwork interface external veth0 default br-data ml2 agent tunnel_types vxlan type_drivers - flat - vxlan ovs auto_bridge_add null bridge_mappings null neutron default l3_ha_network_type vxlan debug Truersquohellip

Simplified Overlay Network Architecture

Neutron

Sets SwitchingRouting flow rules

Compute-01 Compute-02

nova-compute

br-int (OVS)

Gateway Node Group

nova-compute br-router (OVS)

QuaggaHypervisor Hypervisor

Controls external connectivity

BGP OSPF Multipath

br-int (OVS) br-int (OVS)

Nova OpenStackNetworkingProxies ARP DHCP

vRouter

Sets NAT flow rulesProvisions virtual machine

East-West Traffic North-South Traffic

OpenStack SONA(ONOS)

VXLAN tunnel

Live Demo 1

CI Presentation - Wil

VS

Moving parts - some numbersbull Git repositories 34 local 10 upstream

bull Deployment configurations 4

bull Supported Environments 4

bull Charts per deployment 12

bull Unique docker images in each deployment 23

bull Pods amp jobs in a single deployment 34

bull Pods amp jobs in a 3 node HA deployment 85

Kinds of Workflows

bull Build Kolla Container

bull Build Helm Chart

bull Test

bull Deploy upgrade

Kinds of Jobs

bull Commit PR builds - validate internal changes

bull Live system deploys - validate component upgrades

bull Nightly builds - validate merge CI upstream

bull Nightly redeploy - validate clean slate deploys

bull Upstream builds - identify incoming breaking changes

When yoursquove got twice as many virtualization technologies things fall over twice as fast

Managing the Clutter

bull No upstream forks

bull Isolated CI build amp test environments

bull Incremental upgrades

bull Constant rebuilds

Testing starts from the ground upDeployed Kubernetes

bull Kubernetes e2e

bull Custom

bull Heapster

Testing starts from the ground upKolla Container Builds

bull Bats

bull Clair

Testing starts from the ground upHelm charts

bull Helm test

bull Partial tempest runs

bull Built in (eg horizon)

Testing starts from the ground upDeployed Openstack

bull Full tempest runs

bull Rally

Deployment is only the Beginning

bull Manual validations

bull Launch custom application on OpenStack

bull Additional security tests on running containers

bull Performance baselines and thresholds

Failure testing

bull Risk - Many pods are run mdashprivileged amp with host networking

bull Layered solutions present a lot more interesting failure cases

bull HA rules change

bull Openstack HA behaviors mapped onto Kubernetes HA behavior

Upgrades Become Boring

bull All changes can be simulated multiple times prior to live rollout

bull Minor component patches become trivial

bull OpenStack upgrades become predictable

Live Demo 2

Challenges (pitfalls)

bull Operational Burden

bull Limitations

bull OpenStack as Cloud Native Apps Feasible Needed Gaps and Improvement

bull Kubernetes Stability (fast moving projects)

bull Security

THE END

Page 6: OpenStack on Kubernetes · OpenStack on Kubernetes: Make OpenStack and Kubernetes Fail-Safe Seungkyu Ahn (ahnsk@sk.com) Jaesuk Ahn (jay.ahn@sk.com) Open System Lab Network IT Convergence

Previous Product Pain-PointsbullEven Update (patch) is challenging

bullUpgrade - gosh what I can say

bullDeployment issue - snow-flake env vs cattle

bullNot single huge scale OpenStack but many smallmedium OpenStacks

bullLack of flexible configuration management capability in ldquostandardized mannerrdquo

bullVery difficult to integrate with our own stuffs (Ceph SDN Controller Datacenter Operation Platform etc)

More to Do

SPEC

DEV

TEST

UPSTREAM

Community Code

OpenStack Package

Configuration Management+

Requirements

Deployment Architecture-Network-Storage-Appliance Integration-Configuration Tuning

HardwareAppliancePurchase

Deployment

QA

OpenStack Production

Deployment OperationDevelopment

Triage

Monitoring

Upgrade

Tuning

CapacityMgmt

Scale-out

Analysis

Patch

Deployment Automation

Flexible Configuration

+

Trouble Sh

TEST

Continuous Loop SPEC

DEV

TEST

UPSTREAM

Community Code

OpenStack Package

Configuration Management+

Requirements

Deployment Architecture-Network-Storage-Appliance Integration-Configuration Tuning

HardwareAppliancePurchase

Deployment

QA

OpenStack Production

Deployment OperationDevelopment

Triage

Monitoring

Upgrade

Tuning

CapacityMgmt

Scale-out

Analysis

Patch

Deployment Automation

Flexible Configuration

+

Trouble Sh

Why OpenStack on Kubernetes

Better way to deliver OpenStack and manage its Lifecycle

bullReducing Overhead Dependency Management

bull ldquoEasy and Fastrdquo Multiple Deployment in ldquoStandardizedrdquo way

bullUpgradeUpdateRollback

bullEasy Scaling and Healing

Key TechnologiesbullKubernetes (Control Plance Orchestration)

bullHelm (Application Lifecycle Management Automation)

bullCICD Pipeline leveraging Jenkins

bullOpenStack-helm (Managing OpenStack on Kubernetes)

bullKolla (Containerizing OpenStack)

bullONOSSONA (OpenStack Network Management)

bullCeph (Storage)

Our Plan

bullProduction-Ready by the end of 2017

bullFirst Production within 2017 (IT Infrastructure)

bullExpanding to more deployment (Media NFV) in 2018 amp Putting more APPs on this ldquoStreamlinerdquo

Overall Architecture

Todayrsquos Demo System

gitHelm Repo Tiller

Helm Cli

Kubectl

JenkinsMaster

JenkinsSlave

KubernetesMaster

KubernetesNode

What is the HA targetKubernetes Master

bull Etcdbull API server (load balance)bull Scheduler (leader election)bull Controller manager (leader election)

OpenStack Controller (Keystone Glance Nova Cinder Neutron)bull API server (load balance)bull Scheduler (Nova Cinder)bull MariaDBbull RabbitMQbull Neutron network node (SONA)

Kubernetes 3-Masters

apiserver

kube-master01 node01

scheduler

controllermanager

etcd

kubeproxy

kubelet

apiserver

kube-master02

scheduler

controllermanager

etcd

flanneld flanneldflanneld

kubelet kubelet

apiserver

kube-master03

scheduler

controllermanager

etcd

flanneld

kubelet

kubeproxy

kubelet

flanneld

node00

ceph

kubeproxy

kubeproxy

kubeproxy

KubeletKUBELET_OPTS=--kubeconfig=etckuberneteskubeletconf --require-kubeconfig=true --hostname-override=kube-master01 --logtostderr=false --log-dir=varlogkubernetes --pod-manifest-path=etckubernetesmanifests --allow-privileged=true --v=0 --register-schedulable=false --cluster-dns=1096010 --cluster-domain=clusterlocal

Kubelet

flanneld

etcd yaml --name kube-master01 --initial-advertise-peer-urls http19216830132380 --listen-peer-urls http19216830132380 --advertise-client-urls http19216830134001 --listen-client-urls http19216830132379 http1270012379http19216830134001 http1270014001 --data-dir varetcddata --initial-cluster-token 5d3903915c2cda30174970d784075f0a --initial-cluster kube-master01=http19216830132380 kube-master02=http19216830142380 kube-master03=http19216830152380 --initial-cluster-state new

Kubelet

flanneld

etcd

kube-apiserver yaml - usrlocalbinkube-apiserver --etcd-servers=http1270012379 --storage-backend=etcd3 --insecure-bind-address=127001 --insecure-port=8080 --secure-port=6443 --admission-control=NamespaceLifecycleLimitRangerServiceAccount PersistentVolumeLabelDefaultStorageClassResourceQuota --service-cluster-ip-range=10960016 --tls-cert-file=etckubernetespkiapiservercrt --tls-private-key-file=etckubernetespkiapiserver-keypem --token-auth-file=etckubernetespkikube-token --service-account-key-file=etckubernetespkiapiserver-keypem --allow-privileged --anonymous-auth=false

Kubelet

flanneld

etcd

kube-apiserver

kube-controller-manager yaml - kube-controller-manager --master=1270018080 --cluster-cidr=172160016 --cluster-name=kubernetes --allocate-node-cidrs=true --service-account-private-key-file=etckubernetespkiapiserver-keypem --root-ca-file=etckubernetespkicacrt --cluster-signing-cert-file=etckubernetespkicacrt --cluster-signing-key-file=etckubernetespkica-keypem --v=0 --leader-elect=true Kubelet

flanneld

controllermanager

etcd

kube-apiserver

kube-scheduler yaml - usrlocalbinkube-scheduler --master=1270018080 --v=0 --leader-elect=true

Kubelet

flanneld

controllermanager

scheduler

etcd

kube-apiserver

kube-proxy yaml securityContext privileged true command - binsh - -c - usrlocalbinkube-proxy --kubeconfig=runkubeconfig --cluster-cidr=10960016 --v=0

Kubelet

flanneld

controllermanager

scheduler

etcd

kube-apiserver

kube-proxy

OpenStack Controller amp Compute

kubernetes worker-nodes(label controller)

kubernetes worker-nodes (label compute)

MariaDB

RabbitMQ

Keystone

Glance

Cinder

NOVA

Neutron

NOVA

Neutron

VM

VM

OpenStack Controller amp Compute

kubernetes worker-nodes(label controller)

kubernetes worker-nodes (label compute)

OpenStack Process 1 (nova-api)

NOVA

Neutron

VM

VMOpenStack Process 2 (nova-api)

OpenStack Process 3 (nova-api)

OpenStack Controller amp Compute

kubernetes worker-nodes(label controller)

kubernetes worker-nodes (label compute)

NOVA

Neutron

VM

VMOpenStack Process 1 (nova-api)

OpenStack Process 2 (nova-api)

OpenStack Process 3 (nova-api)

OpenStack Process 3 (nova-api)

Database clustering (3 node)jobsmariadb-seedpomariadb-0pomariadb-1pomariadb-2

joiner (3th mariadb)--wsrep_cluster_address=gcomm172165671721675517216815

Neutron network (1 nic)Network

br-ex

br-data

eth0addif

veth1

addif

veth0

link add (veth)

add-port

br-int

1921683033

patch

phy-br-ex

int-br-exqr-xxxqg-xxx

br-tun

patch-int

patch-tun

patch

ovs internal iface

ovs patchqrouter-xxx

vxlan-xxx

ovs vxlan

linux virtual iface

tapxxx

qdhcp-xxx

namespacelocal ip remote ip vni flow

qbrxxxtabxxx

VM

qvbxxx

br-intpatch-intqvoxxx

br-tunvxlan-xxx patch-tun

eth0

br-exphy-br-ex

int-br-ex

br-dataveth1

veth0

1921683034

Network Node Compute Node

OpenStack-Helm Neutron charthellipnetwork interface external veth0 default br-data ml2 agent tunnel_types vxlan type_drivers - flat - vxlan ovs auto_bridge_add null bridge_mappings null neutron default l3_ha_network_type vxlan debug Truersquohellip

Simplified Overlay Network Architecture

Neutron

Sets SwitchingRouting flow rules

Compute-01 Compute-02

nova-compute

br-int (OVS)

Gateway Node Group

nova-compute br-router (OVS)

QuaggaHypervisor Hypervisor

Controls external connectivity

BGP OSPF Multipath

br-int (OVS) br-int (OVS)

Nova OpenStackNetworkingProxies ARP DHCP

vRouter

Sets NAT flow rulesProvisions virtual machine

East-West Traffic North-South Traffic

OpenStack SONA(ONOS)

VXLAN tunnel

Live Demo 1

CI Presentation - Wil

VS

Moving parts - some numbersbull Git repositories 34 local 10 upstream

bull Deployment configurations 4

bull Supported Environments 4

bull Charts per deployment 12

bull Unique docker images in each deployment 23

bull Pods amp jobs in a single deployment 34

bull Pods amp jobs in a 3 node HA deployment 85

Kinds of Workflows

bull Build Kolla Container

bull Build Helm Chart

bull Test

bull Deploy upgrade

Kinds of Jobs

bull Commit PR builds - validate internal changes

bull Live system deploys - validate component upgrades

bull Nightly builds - validate merge CI upstream

bull Nightly redeploy - validate clean slate deploys

bull Upstream builds - identify incoming breaking changes

When yoursquove got twice as many virtualization technologies things fall over twice as fast

Managing the Clutter

bull No upstream forks

bull Isolated CI build amp test environments

bull Incremental upgrades

bull Constant rebuilds

Testing starts from the ground upDeployed Kubernetes

bull Kubernetes e2e

bull Custom

bull Heapster

Testing starts from the ground upKolla Container Builds

bull Bats

bull Clair

Testing starts from the ground upHelm charts

bull Helm test

bull Partial tempest runs

bull Built in (eg horizon)

Testing starts from the ground upDeployed Openstack

bull Full tempest runs

bull Rally

Deployment is only the Beginning

bull Manual validations

bull Launch custom application on OpenStack

bull Additional security tests on running containers

bull Performance baselines and thresholds

Failure testing

bull Risk - Many pods are run mdashprivileged amp with host networking

bull Layered solutions present a lot more interesting failure cases

bull HA rules change

bull Openstack HA behaviors mapped onto Kubernetes HA behavior

Upgrades Become Boring

bull All changes can be simulated multiple times prior to live rollout

bull Minor component patches become trivial

bull OpenStack upgrades become predictable

Live Demo 2

Challenges (pitfalls)

bull Operational Burden

bull Limitations

bull OpenStack as Cloud Native Apps Feasible Needed Gaps and Improvement

bull Kubernetes Stability (fast moving projects)

bull Security

THE END

Page 7: OpenStack on Kubernetes · OpenStack on Kubernetes: Make OpenStack and Kubernetes Fail-Safe Seungkyu Ahn (ahnsk@sk.com) Jaesuk Ahn (jay.ahn@sk.com) Open System Lab Network IT Convergence

More to Do

SPEC

DEV

TEST

UPSTREAM

Community Code

OpenStack Package

Configuration Management+

Requirements

Deployment Architecture-Network-Storage-Appliance Integration-Configuration Tuning

HardwareAppliancePurchase

Deployment

QA

OpenStack Production

Deployment OperationDevelopment

Triage

Monitoring

Upgrade

Tuning

CapacityMgmt

Scale-out

Analysis

Patch

Deployment Automation

Flexible Configuration

+

Trouble Sh

TEST

Continuous Loop SPEC

DEV

TEST

UPSTREAM

Community Code

OpenStack Package

Configuration Management+

Requirements

Deployment Architecture-Network-Storage-Appliance Integration-Configuration Tuning

HardwareAppliancePurchase

Deployment

QA

OpenStack Production

Deployment OperationDevelopment

Triage

Monitoring

Upgrade

Tuning

CapacityMgmt

Scale-out

Analysis

Patch

Deployment Automation

Flexible Configuration

+

Trouble Sh

Why OpenStack on Kubernetes

Better way to deliver OpenStack and manage its Lifecycle

bullReducing Overhead Dependency Management

bull ldquoEasy and Fastrdquo Multiple Deployment in ldquoStandardizedrdquo way

bullUpgradeUpdateRollback

bullEasy Scaling and Healing

Key TechnologiesbullKubernetes (Control Plance Orchestration)

bullHelm (Application Lifecycle Management Automation)

bullCICD Pipeline leveraging Jenkins

bullOpenStack-helm (Managing OpenStack on Kubernetes)

bullKolla (Containerizing OpenStack)

bullONOSSONA (OpenStack Network Management)

bullCeph (Storage)

Our Plan

bullProduction-Ready by the end of 2017

bullFirst Production within 2017 (IT Infrastructure)

bullExpanding to more deployment (Media NFV) in 2018 amp Putting more APPs on this ldquoStreamlinerdquo

Overall Architecture

Todayrsquos Demo System

gitHelm Repo Tiller

Helm Cli

Kubectl

JenkinsMaster

JenkinsSlave

KubernetesMaster

KubernetesNode

What is the HA targetKubernetes Master

bull Etcdbull API server (load balance)bull Scheduler (leader election)bull Controller manager (leader election)

OpenStack Controller (Keystone Glance Nova Cinder Neutron)bull API server (load balance)bull Scheduler (Nova Cinder)bull MariaDBbull RabbitMQbull Neutron network node (SONA)

Kubernetes 3-Masters

apiserver

kube-master01 node01

scheduler

controllermanager

etcd

kubeproxy

kubelet

apiserver

kube-master02

scheduler

controllermanager

etcd

flanneld flanneldflanneld

kubelet kubelet

apiserver

kube-master03

scheduler

controllermanager

etcd

flanneld

kubelet

kubeproxy

kubelet

flanneld

node00

ceph

kubeproxy

kubeproxy

kubeproxy

KubeletKUBELET_OPTS=--kubeconfig=etckuberneteskubeletconf --require-kubeconfig=true --hostname-override=kube-master01 --logtostderr=false --log-dir=varlogkubernetes --pod-manifest-path=etckubernetesmanifests --allow-privileged=true --v=0 --register-schedulable=false --cluster-dns=1096010 --cluster-domain=clusterlocal

Kubelet

flanneld

etcd yaml --name kube-master01 --initial-advertise-peer-urls http19216830132380 --listen-peer-urls http19216830132380 --advertise-client-urls http19216830134001 --listen-client-urls http19216830132379 http1270012379http19216830134001 http1270014001 --data-dir varetcddata --initial-cluster-token 5d3903915c2cda30174970d784075f0a --initial-cluster kube-master01=http19216830132380 kube-master02=http19216830142380 kube-master03=http19216830152380 --initial-cluster-state new

Kubelet

flanneld

etcd

kube-apiserver yaml - usrlocalbinkube-apiserver --etcd-servers=http1270012379 --storage-backend=etcd3 --insecure-bind-address=127001 --insecure-port=8080 --secure-port=6443 --admission-control=NamespaceLifecycleLimitRangerServiceAccount PersistentVolumeLabelDefaultStorageClassResourceQuota --service-cluster-ip-range=10960016 --tls-cert-file=etckubernetespkiapiservercrt --tls-private-key-file=etckubernetespkiapiserver-keypem --token-auth-file=etckubernetespkikube-token --service-account-key-file=etckubernetespkiapiserver-keypem --allow-privileged --anonymous-auth=false

Kubelet

flanneld

etcd

kube-apiserver

kube-controller-manager yaml - kube-controller-manager --master=1270018080 --cluster-cidr=172160016 --cluster-name=kubernetes --allocate-node-cidrs=true --service-account-private-key-file=etckubernetespkiapiserver-keypem --root-ca-file=etckubernetespkicacrt --cluster-signing-cert-file=etckubernetespkicacrt --cluster-signing-key-file=etckubernetespkica-keypem --v=0 --leader-elect=true Kubelet

flanneld

controllermanager

etcd

kube-apiserver

kube-scheduler yaml - usrlocalbinkube-scheduler --master=1270018080 --v=0 --leader-elect=true

Kubelet

flanneld

controllermanager

scheduler

etcd

kube-apiserver

kube-proxy yaml securityContext privileged true command - binsh - -c - usrlocalbinkube-proxy --kubeconfig=runkubeconfig --cluster-cidr=10960016 --v=0

Kubelet

flanneld

controllermanager

scheduler

etcd

kube-apiserver

kube-proxy

OpenStack Controller amp Compute

kubernetes worker-nodes(label controller)

kubernetes worker-nodes (label compute)

MariaDB

RabbitMQ

Keystone

Glance

Cinder

NOVA

Neutron

NOVA

Neutron

VM

VM

OpenStack Controller amp Compute

kubernetes worker-nodes(label controller)

kubernetes worker-nodes (label compute)

OpenStack Process 1 (nova-api)

NOVA

Neutron

VM

VMOpenStack Process 2 (nova-api)

OpenStack Process 3 (nova-api)

OpenStack Controller amp Compute

kubernetes worker-nodes(label controller)

kubernetes worker-nodes (label compute)

NOVA

Neutron

VM

VMOpenStack Process 1 (nova-api)

OpenStack Process 2 (nova-api)

OpenStack Process 3 (nova-api)

OpenStack Process 3 (nova-api)

Database clustering (3 node)jobsmariadb-seedpomariadb-0pomariadb-1pomariadb-2

joiner (3th mariadb)--wsrep_cluster_address=gcomm172165671721675517216815

Neutron network (1 nic)Network

br-ex

br-data

eth0addif

veth1

addif

veth0

link add (veth)

add-port

br-int

1921683033

patch

phy-br-ex

int-br-exqr-xxxqg-xxx

br-tun

patch-int

patch-tun

patch

ovs internal iface

ovs patchqrouter-xxx

vxlan-xxx

ovs vxlan

linux virtual iface

tapxxx

qdhcp-xxx

namespacelocal ip remote ip vni flow

qbrxxxtabxxx

VM

qvbxxx

br-intpatch-intqvoxxx

br-tunvxlan-xxx patch-tun

eth0

br-exphy-br-ex

int-br-ex

br-dataveth1

veth0

1921683034

Network Node Compute Node

OpenStack-Helm Neutron charthellipnetwork interface external veth0 default br-data ml2 agent tunnel_types vxlan type_drivers - flat - vxlan ovs auto_bridge_add null bridge_mappings null neutron default l3_ha_network_type vxlan debug Truersquohellip

Simplified Overlay Network Architecture

Neutron

Sets SwitchingRouting flow rules

Compute-01 Compute-02

nova-compute

br-int (OVS)

Gateway Node Group

nova-compute br-router (OVS)

QuaggaHypervisor Hypervisor

Controls external connectivity

BGP OSPF Multipath

br-int (OVS) br-int (OVS)

Nova OpenStackNetworkingProxies ARP DHCP

vRouter

Sets NAT flow rulesProvisions virtual machine

East-West Traffic North-South Traffic

OpenStack SONA(ONOS)

VXLAN tunnel

Live Demo 1

CI Presentation - Wil

VS

Moving parts - some numbersbull Git repositories 34 local 10 upstream

bull Deployment configurations 4

bull Supported Environments 4

bull Charts per deployment 12

bull Unique docker images in each deployment 23

bull Pods amp jobs in a single deployment 34

bull Pods amp jobs in a 3 node HA deployment 85

Kinds of Workflows

bull Build Kolla Container

bull Build Helm Chart

bull Test

bull Deploy upgrade

Kinds of Jobs

bull Commit PR builds - validate internal changes

bull Live system deploys - validate component upgrades

bull Nightly builds - validate merge CI upstream

bull Nightly redeploy - validate clean slate deploys

bull Upstream builds - identify incoming breaking changes

When yoursquove got twice as many virtualization technologies things fall over twice as fast

Managing the Clutter

bull No upstream forks

bull Isolated CI build amp test environments

bull Incremental upgrades

bull Constant rebuilds

Testing starts from the ground upDeployed Kubernetes

bull Kubernetes e2e

bull Custom

bull Heapster

Testing starts from the ground upKolla Container Builds

bull Bats

bull Clair

Testing starts from the ground upHelm charts

bull Helm test

bull Partial tempest runs

bull Built in (eg horizon)

Testing starts from the ground upDeployed Openstack

bull Full tempest runs

bull Rally

Deployment is only the Beginning

bull Manual validations

bull Launch custom application on OpenStack

bull Additional security tests on running containers

bull Performance baselines and thresholds

Failure testing

bull Risk - Many pods are run mdashprivileged amp with host networking

bull Layered solutions present a lot more interesting failure cases

bull HA rules change

bull Openstack HA behaviors mapped onto Kubernetes HA behavior

Upgrades Become Boring

bull All changes can be simulated multiple times prior to live rollout

bull Minor component patches become trivial

bull OpenStack upgrades become predictable

Live Demo 2

Challenges (pitfalls)

bull Operational Burden

bull Limitations

bull OpenStack as Cloud Native Apps Feasible Needed Gaps and Improvement

bull Kubernetes Stability (fast moving projects)

bull Security

THE END

Page 8: OpenStack on Kubernetes · OpenStack on Kubernetes: Make OpenStack and Kubernetes Fail-Safe Seungkyu Ahn (ahnsk@sk.com) Jaesuk Ahn (jay.ahn@sk.com) Open System Lab Network IT Convergence

Continuous Loop SPEC

DEV

TEST

UPSTREAM

Community Code

OpenStack Package

Configuration Management+

Requirements

Deployment Architecture-Network-Storage-Appliance Integration-Configuration Tuning

HardwareAppliancePurchase

Deployment

QA

OpenStack Production

Deployment OperationDevelopment

Triage

Monitoring

Upgrade

Tuning

CapacityMgmt

Scale-out

Analysis

Patch

Deployment Automation

Flexible Configuration

+

Trouble Sh

Why OpenStack on Kubernetes

Better way to deliver OpenStack and manage its Lifecycle

bullReducing Overhead Dependency Management

bull ldquoEasy and Fastrdquo Multiple Deployment in ldquoStandardizedrdquo way

bullUpgradeUpdateRollback

bullEasy Scaling and Healing

Key TechnologiesbullKubernetes (Control Plance Orchestration)

bullHelm (Application Lifecycle Management Automation)

bullCICD Pipeline leveraging Jenkins

bullOpenStack-helm (Managing OpenStack on Kubernetes)

bullKolla (Containerizing OpenStack)

bullONOSSONA (OpenStack Network Management)

bullCeph (Storage)

Our Plan

bullProduction-Ready by the end of 2017

bullFirst Production within 2017 (IT Infrastructure)

bullExpanding to more deployment (Media NFV) in 2018 amp Putting more APPs on this ldquoStreamlinerdquo

Overall Architecture

Todayrsquos Demo System

gitHelm Repo Tiller

Helm Cli

Kubectl

JenkinsMaster

JenkinsSlave

KubernetesMaster

KubernetesNode

What is the HA targetKubernetes Master

bull Etcdbull API server (load balance)bull Scheduler (leader election)bull Controller manager (leader election)

OpenStack Controller (Keystone Glance Nova Cinder Neutron)bull API server (load balance)bull Scheduler (Nova Cinder)bull MariaDBbull RabbitMQbull Neutron network node (SONA)

Kubernetes 3-Masters

apiserver

kube-master01 node01

scheduler

controllermanager

etcd

kubeproxy

kubelet

apiserver

kube-master02

scheduler

controllermanager

etcd

flanneld flanneldflanneld

kubelet kubelet

apiserver

kube-master03

scheduler

controllermanager

etcd

flanneld

kubelet

kubeproxy

kubelet

flanneld

node00

ceph

kubeproxy

kubeproxy

kubeproxy

KubeletKUBELET_OPTS=--kubeconfig=etckuberneteskubeletconf --require-kubeconfig=true --hostname-override=kube-master01 --logtostderr=false --log-dir=varlogkubernetes --pod-manifest-path=etckubernetesmanifests --allow-privileged=true --v=0 --register-schedulable=false --cluster-dns=1096010 --cluster-domain=clusterlocal

Kubelet

flanneld

etcd yaml --name kube-master01 --initial-advertise-peer-urls http19216830132380 --listen-peer-urls http19216830132380 --advertise-client-urls http19216830134001 --listen-client-urls http19216830132379 http1270012379http19216830134001 http1270014001 --data-dir varetcddata --initial-cluster-token 5d3903915c2cda30174970d784075f0a --initial-cluster kube-master01=http19216830132380 kube-master02=http19216830142380 kube-master03=http19216830152380 --initial-cluster-state new

Kubelet

flanneld

etcd

kube-apiserver yaml - usrlocalbinkube-apiserver --etcd-servers=http1270012379 --storage-backend=etcd3 --insecure-bind-address=127001 --insecure-port=8080 --secure-port=6443 --admission-control=NamespaceLifecycleLimitRangerServiceAccount PersistentVolumeLabelDefaultStorageClassResourceQuota --service-cluster-ip-range=10960016 --tls-cert-file=etckubernetespkiapiservercrt --tls-private-key-file=etckubernetespkiapiserver-keypem --token-auth-file=etckubernetespkikube-token --service-account-key-file=etckubernetespkiapiserver-keypem --allow-privileged --anonymous-auth=false

Kubelet

flanneld

etcd

kube-apiserver

kube-controller-manager yaml - kube-controller-manager --master=1270018080 --cluster-cidr=172160016 --cluster-name=kubernetes --allocate-node-cidrs=true --service-account-private-key-file=etckubernetespkiapiserver-keypem --root-ca-file=etckubernetespkicacrt --cluster-signing-cert-file=etckubernetespkicacrt --cluster-signing-key-file=etckubernetespkica-keypem --v=0 --leader-elect=true Kubelet

flanneld

controllermanager

etcd

kube-apiserver

kube-scheduler yaml - usrlocalbinkube-scheduler --master=1270018080 --v=0 --leader-elect=true

Kubelet

flanneld

controllermanager

scheduler

etcd

kube-apiserver

kube-proxy yaml securityContext privileged true command - binsh - -c - usrlocalbinkube-proxy --kubeconfig=runkubeconfig --cluster-cidr=10960016 --v=0

Kubelet

flanneld

controllermanager

scheduler

etcd

kube-apiserver

kube-proxy

OpenStack Controller amp Compute

kubernetes worker-nodes(label controller)

kubernetes worker-nodes (label compute)

MariaDB

RabbitMQ

Keystone

Glance

Cinder

NOVA

Neutron

NOVA

Neutron

VM

VM

OpenStack Controller amp Compute

kubernetes worker-nodes(label controller)

kubernetes worker-nodes (label compute)

OpenStack Process 1 (nova-api)

NOVA

Neutron

VM

VMOpenStack Process 2 (nova-api)

OpenStack Process 3 (nova-api)

OpenStack Controller amp Compute

kubernetes worker-nodes(label controller)

kubernetes worker-nodes (label compute)

NOVA

Neutron

VM

VMOpenStack Process 1 (nova-api)

OpenStack Process 2 (nova-api)

OpenStack Process 3 (nova-api)

OpenStack Process 3 (nova-api)

Database clustering (3 node)jobsmariadb-seedpomariadb-0pomariadb-1pomariadb-2

joiner (3th mariadb)--wsrep_cluster_address=gcomm172165671721675517216815

Neutron network (1 nic)Network

br-ex

br-data

eth0addif

veth1

addif

veth0

link add (veth)

add-port

br-int

1921683033

patch

phy-br-ex

int-br-exqr-xxxqg-xxx

br-tun

patch-int

patch-tun

patch

ovs internal iface

ovs patchqrouter-xxx

vxlan-xxx

ovs vxlan

linux virtual iface

tapxxx

qdhcp-xxx

namespacelocal ip remote ip vni flow

qbrxxxtabxxx

VM

qvbxxx

br-intpatch-intqvoxxx

br-tunvxlan-xxx patch-tun

eth0

br-exphy-br-ex

int-br-ex

br-dataveth1

veth0

1921683034

Network Node Compute Node

OpenStack-Helm Neutron charthellipnetwork interface external veth0 default br-data ml2 agent tunnel_types vxlan type_drivers - flat - vxlan ovs auto_bridge_add null bridge_mappings null neutron default l3_ha_network_type vxlan debug Truersquohellip

Simplified Overlay Network Architecture

Neutron

Sets SwitchingRouting flow rules

Compute-01 Compute-02

nova-compute

br-int (OVS)

Gateway Node Group

nova-compute br-router (OVS)

QuaggaHypervisor Hypervisor

Controls external connectivity

BGP OSPF Multipath

br-int (OVS) br-int (OVS)

Nova OpenStackNetworkingProxies ARP DHCP

vRouter

Sets NAT flow rulesProvisions virtual machine

East-West Traffic North-South Traffic

OpenStack SONA(ONOS)

VXLAN tunnel

Live Demo 1

CI Presentation - Wil

VS

Moving parts - some numbersbull Git repositories 34 local 10 upstream

bull Deployment configurations 4

bull Supported Environments 4

bull Charts per deployment 12

bull Unique docker images in each deployment 23

bull Pods amp jobs in a single deployment 34

bull Pods amp jobs in a 3 node HA deployment 85

Kinds of Workflows

bull Build Kolla Container

bull Build Helm Chart

bull Test

bull Deploy upgrade

Kinds of Jobs

bull Commit PR builds - validate internal changes

bull Live system deploys - validate component upgrades

bull Nightly builds - validate merge CI upstream

bull Nightly redeploy - validate clean slate deploys

bull Upstream builds - identify incoming breaking changes

When yoursquove got twice as many virtualization technologies things fall over twice as fast

Managing the Clutter

bull No upstream forks

bull Isolated CI build amp test environments

bull Incremental upgrades

bull Constant rebuilds

Testing starts from the ground upDeployed Kubernetes

bull Kubernetes e2e

bull Custom

bull Heapster

Testing starts from the ground upKolla Container Builds

bull Bats

bull Clair

Testing starts from the ground upHelm charts

bull Helm test

bull Partial tempest runs

bull Built in (eg horizon)

Testing starts from the ground upDeployed Openstack

bull Full tempest runs

bull Rally

Deployment is only the Beginning

bull Manual validations

bull Launch custom application on OpenStack

bull Additional security tests on running containers

bull Performance baselines and thresholds

Failure testing

bull Risk - Many pods are run mdashprivileged amp with host networking

bull Layered solutions present a lot more interesting failure cases

bull HA rules change

bull Openstack HA behaviors mapped onto Kubernetes HA behavior

Upgrades Become Boring

bull All changes can be simulated multiple times prior to live rollout

bull Minor component patches become trivial

bull OpenStack upgrades become predictable

Live Demo 2

Challenges (pitfalls)

bull Operational Burden

bull Limitations

bull OpenStack as Cloud Native Apps Feasible Needed Gaps and Improvement

bull Kubernetes Stability (fast moving projects)

bull Security

THE END

Page 9: OpenStack on Kubernetes · OpenStack on Kubernetes: Make OpenStack and Kubernetes Fail-Safe Seungkyu Ahn (ahnsk@sk.com) Jaesuk Ahn (jay.ahn@sk.com) Open System Lab Network IT Convergence

Why OpenStack on Kubernetes

Better way to deliver OpenStack and manage its Lifecycle

bullReducing Overhead Dependency Management

bull ldquoEasy and Fastrdquo Multiple Deployment in ldquoStandardizedrdquo way

bullUpgradeUpdateRollback

bullEasy Scaling and Healing

Key TechnologiesbullKubernetes (Control Plance Orchestration)

bullHelm (Application Lifecycle Management Automation)

bullCICD Pipeline leveraging Jenkins

bullOpenStack-helm (Managing OpenStack on Kubernetes)

bullKolla (Containerizing OpenStack)

bullONOSSONA (OpenStack Network Management)

bullCeph (Storage)

Our Plan

bullProduction-Ready by the end of 2017

bullFirst Production within 2017 (IT Infrastructure)

bullExpanding to more deployment (Media NFV) in 2018 amp Putting more APPs on this ldquoStreamlinerdquo

Overall Architecture

Todayrsquos Demo System

gitHelm Repo Tiller

Helm Cli

Kubectl

JenkinsMaster

JenkinsSlave

KubernetesMaster

KubernetesNode

What is the HA targetKubernetes Master

bull Etcdbull API server (load balance)bull Scheduler (leader election)bull Controller manager (leader election)

OpenStack Controller (Keystone Glance Nova Cinder Neutron)bull API server (load balance)bull Scheduler (Nova Cinder)bull MariaDBbull RabbitMQbull Neutron network node (SONA)

Kubernetes 3-Masters

apiserver

kube-master01 node01

scheduler

controllermanager

etcd

kubeproxy

kubelet

apiserver

kube-master02

scheduler

controllermanager

etcd

flanneld flanneldflanneld

kubelet kubelet

apiserver

kube-master03

scheduler

controllermanager

etcd

flanneld

kubelet

kubeproxy

kubelet

flanneld

node00

ceph

kubeproxy

kubeproxy

kubeproxy

KubeletKUBELET_OPTS=--kubeconfig=etckuberneteskubeletconf --require-kubeconfig=true --hostname-override=kube-master01 --logtostderr=false --log-dir=varlogkubernetes --pod-manifest-path=etckubernetesmanifests --allow-privileged=true --v=0 --register-schedulable=false --cluster-dns=1096010 --cluster-domain=clusterlocal

Kubelet

flanneld

etcd yaml --name kube-master01 --initial-advertise-peer-urls http19216830132380 --listen-peer-urls http19216830132380 --advertise-client-urls http19216830134001 --listen-client-urls http19216830132379 http1270012379http19216830134001 http1270014001 --data-dir varetcddata --initial-cluster-token 5d3903915c2cda30174970d784075f0a --initial-cluster kube-master01=http19216830132380 kube-master02=http19216830142380 kube-master03=http19216830152380 --initial-cluster-state new

Kubelet

flanneld

etcd

kube-apiserver yaml - usrlocalbinkube-apiserver --etcd-servers=http1270012379 --storage-backend=etcd3 --insecure-bind-address=127001 --insecure-port=8080 --secure-port=6443 --admission-control=NamespaceLifecycleLimitRangerServiceAccount PersistentVolumeLabelDefaultStorageClassResourceQuota --service-cluster-ip-range=10960016 --tls-cert-file=etckubernetespkiapiservercrt --tls-private-key-file=etckubernetespkiapiserver-keypem --token-auth-file=etckubernetespkikube-token --service-account-key-file=etckubernetespkiapiserver-keypem --allow-privileged --anonymous-auth=false

Kubelet

flanneld

etcd

kube-apiserver

kube-controller-manager yaml - kube-controller-manager --master=1270018080 --cluster-cidr=172160016 --cluster-name=kubernetes --allocate-node-cidrs=true --service-account-private-key-file=etckubernetespkiapiserver-keypem --root-ca-file=etckubernetespkicacrt --cluster-signing-cert-file=etckubernetespkicacrt --cluster-signing-key-file=etckubernetespkica-keypem --v=0 --leader-elect=true Kubelet

flanneld

controllermanager

etcd

kube-apiserver

kube-scheduler yaml - usrlocalbinkube-scheduler --master=1270018080 --v=0 --leader-elect=true

Kubelet

flanneld

controllermanager

scheduler

etcd

kube-apiserver

kube-proxy yaml securityContext privileged true command - binsh - -c - usrlocalbinkube-proxy --kubeconfig=runkubeconfig --cluster-cidr=10960016 --v=0

Kubelet

flanneld

controllermanager

scheduler

etcd

kube-apiserver

kube-proxy

OpenStack Controller amp Compute

kubernetes worker-nodes(label controller)

kubernetes worker-nodes (label compute)

MariaDB

RabbitMQ

Keystone

Glance

Cinder

NOVA

Neutron

NOVA

Neutron

VM

VM

OpenStack Controller amp Compute

kubernetes worker-nodes(label controller)

kubernetes worker-nodes (label compute)

OpenStack Process 1 (nova-api)

NOVA

Neutron

VM

VMOpenStack Process 2 (nova-api)

OpenStack Process 3 (nova-api)

OpenStack Controller amp Compute

kubernetes worker-nodes(label controller)

kubernetes worker-nodes (label compute)

NOVA

Neutron

VM

VMOpenStack Process 1 (nova-api)

OpenStack Process 2 (nova-api)

OpenStack Process 3 (nova-api)

OpenStack Process 3 (nova-api)

Database clustering (3 node)jobsmariadb-seedpomariadb-0pomariadb-1pomariadb-2

joiner (3th mariadb)--wsrep_cluster_address=gcomm172165671721675517216815

Neutron network (1 nic)Network

br-ex

br-data

eth0addif

veth1

addif

veth0

link add (veth)

add-port

br-int

1921683033

patch

phy-br-ex

int-br-exqr-xxxqg-xxx

br-tun

patch-int

patch-tun

patch

ovs internal iface

ovs patchqrouter-xxx

vxlan-xxx

ovs vxlan

linux virtual iface

tapxxx

qdhcp-xxx

namespacelocal ip remote ip vni flow

qbrxxxtabxxx

VM

qvbxxx

br-intpatch-intqvoxxx

br-tunvxlan-xxx patch-tun

eth0

br-exphy-br-ex

int-br-ex

br-dataveth1

veth0

1921683034

Network Node Compute Node

OpenStack-Helm Neutron charthellipnetwork interface external veth0 default br-data ml2 agent tunnel_types vxlan type_drivers - flat - vxlan ovs auto_bridge_add null bridge_mappings null neutron default l3_ha_network_type vxlan debug Truersquohellip

Simplified Overlay Network Architecture

Neutron

Sets SwitchingRouting flow rules

Compute-01 Compute-02

nova-compute

br-int (OVS)

Gateway Node Group

nova-compute br-router (OVS)

QuaggaHypervisor Hypervisor

Controls external connectivity

BGP OSPF Multipath

br-int (OVS) br-int (OVS)

Nova OpenStackNetworkingProxies ARP DHCP

vRouter

Sets NAT flow rulesProvisions virtual machine

East-West Traffic North-South Traffic

OpenStack SONA(ONOS)

VXLAN tunnel

Live Demo 1

CI Presentation - Wil

VS

Moving parts - some numbersbull Git repositories 34 local 10 upstream

bull Deployment configurations 4

bull Supported Environments 4

bull Charts per deployment 12

bull Unique docker images in each deployment 23

bull Pods amp jobs in a single deployment 34

bull Pods amp jobs in a 3 node HA deployment 85

Kinds of Workflows

bull Build Kolla Container

bull Build Helm Chart

bull Test

bull Deploy upgrade

Kinds of Jobs

bull Commit PR builds - validate internal changes

bull Live system deploys - validate component upgrades

bull Nightly builds - validate merge CI upstream

bull Nightly redeploy - validate clean slate deploys

bull Upstream builds - identify incoming breaking changes

When yoursquove got twice as many virtualization technologies things fall over twice as fast

Managing the Clutter

bull No upstream forks

bull Isolated CI build amp test environments

bull Incremental upgrades

bull Constant rebuilds

Testing starts from the ground upDeployed Kubernetes

bull Kubernetes e2e

bull Custom

bull Heapster

Testing starts from the ground upKolla Container Builds

bull Bats

bull Clair

Testing starts from the ground upHelm charts

bull Helm test

bull Partial tempest runs

bull Built in (eg horizon)

Testing starts from the ground upDeployed Openstack

bull Full tempest runs

bull Rally

Deployment is only the Beginning

bull Manual validations

bull Launch custom application on OpenStack

bull Additional security tests on running containers

bull Performance baselines and thresholds

Failure testing

bull Risk - Many pods are run mdashprivileged amp with host networking

bull Layered solutions present a lot more interesting failure cases

bull HA rules change

bull Openstack HA behaviors mapped onto Kubernetes HA behavior

Upgrades Become Boring

bull All changes can be simulated multiple times prior to live rollout

bull Minor component patches become trivial

bull OpenStack upgrades become predictable

Live Demo 2

Challenges (pitfalls)

bull Operational Burden

bull Limitations

bull OpenStack as Cloud Native Apps Feasible Needed Gaps and Improvement

bull Kubernetes Stability (fast moving projects)

bull Security

THE END

Page 10: OpenStack on Kubernetes · OpenStack on Kubernetes: Make OpenStack and Kubernetes Fail-Safe Seungkyu Ahn (ahnsk@sk.com) Jaesuk Ahn (jay.ahn@sk.com) Open System Lab Network IT Convergence

Key TechnologiesbullKubernetes (Control Plance Orchestration)

bullHelm (Application Lifecycle Management Automation)

bullCICD Pipeline leveraging Jenkins

bullOpenStack-helm (Managing OpenStack on Kubernetes)

bullKolla (Containerizing OpenStack)

bullONOSSONA (OpenStack Network Management)

bullCeph (Storage)

Our Plan

bullProduction-Ready by the end of 2017

bullFirst Production within 2017 (IT Infrastructure)

bullExpanding to more deployment (Media NFV) in 2018 amp Putting more APPs on this ldquoStreamlinerdquo

Overall Architecture

Todayrsquos Demo System

gitHelm Repo Tiller

Helm Cli

Kubectl

JenkinsMaster

JenkinsSlave

KubernetesMaster

KubernetesNode

What is the HA targetKubernetes Master

bull Etcdbull API server (load balance)bull Scheduler (leader election)bull Controller manager (leader election)

OpenStack Controller (Keystone Glance Nova Cinder Neutron)bull API server (load balance)bull Scheduler (Nova Cinder)bull MariaDBbull RabbitMQbull Neutron network node (SONA)

Kubernetes 3-Masters

apiserver

kube-master01 node01

scheduler

controllermanager

etcd

kubeproxy

kubelet

apiserver

kube-master02

scheduler

controllermanager

etcd

flanneld flanneldflanneld

kubelet kubelet

apiserver

kube-master03

scheduler

controllermanager

etcd

flanneld

kubelet

kubeproxy

kubelet

flanneld

node00

ceph

kubeproxy

kubeproxy

kubeproxy

KubeletKUBELET_OPTS=--kubeconfig=etckuberneteskubeletconf --require-kubeconfig=true --hostname-override=kube-master01 --logtostderr=false --log-dir=varlogkubernetes --pod-manifest-path=etckubernetesmanifests --allow-privileged=true --v=0 --register-schedulable=false --cluster-dns=1096010 --cluster-domain=clusterlocal

Kubelet

flanneld

etcd yaml --name kube-master01 --initial-advertise-peer-urls http19216830132380 --listen-peer-urls http19216830132380 --advertise-client-urls http19216830134001 --listen-client-urls http19216830132379 http1270012379http19216830134001 http1270014001 --data-dir varetcddata --initial-cluster-token 5d3903915c2cda30174970d784075f0a --initial-cluster kube-master01=http19216830132380 kube-master02=http19216830142380 kube-master03=http19216830152380 --initial-cluster-state new

Kubelet

flanneld

etcd

kube-apiserver yaml - usrlocalbinkube-apiserver --etcd-servers=http1270012379 --storage-backend=etcd3 --insecure-bind-address=127001 --insecure-port=8080 --secure-port=6443 --admission-control=NamespaceLifecycleLimitRangerServiceAccount PersistentVolumeLabelDefaultStorageClassResourceQuota --service-cluster-ip-range=10960016 --tls-cert-file=etckubernetespkiapiservercrt --tls-private-key-file=etckubernetespkiapiserver-keypem --token-auth-file=etckubernetespkikube-token --service-account-key-file=etckubernetespkiapiserver-keypem --allow-privileged --anonymous-auth=false

Kubelet

flanneld

etcd

kube-apiserver

kube-controller-manager yaml - kube-controller-manager --master=1270018080 --cluster-cidr=172160016 --cluster-name=kubernetes --allocate-node-cidrs=true --service-account-private-key-file=etckubernetespkiapiserver-keypem --root-ca-file=etckubernetespkicacrt --cluster-signing-cert-file=etckubernetespkicacrt --cluster-signing-key-file=etckubernetespkica-keypem --v=0 --leader-elect=true Kubelet

flanneld

controllermanager

etcd

kube-apiserver

kube-scheduler yaml - usrlocalbinkube-scheduler --master=1270018080 --v=0 --leader-elect=true

Kubelet

flanneld

controllermanager

scheduler

etcd

kube-apiserver

kube-proxy yaml securityContext privileged true command - binsh - -c - usrlocalbinkube-proxy --kubeconfig=runkubeconfig --cluster-cidr=10960016 --v=0

Kubelet

flanneld

controllermanager

scheduler

etcd

kube-apiserver

kube-proxy

OpenStack Controller amp Compute

kubernetes worker-nodes(label controller)

kubernetes worker-nodes (label compute)

MariaDB

RabbitMQ

Keystone

Glance

Cinder

NOVA

Neutron

NOVA

Neutron

VM

VM

OpenStack Controller amp Compute

kubernetes worker-nodes(label controller)

kubernetes worker-nodes (label compute)

OpenStack Process 1 (nova-api)

NOVA

Neutron

VM

VMOpenStack Process 2 (nova-api)

OpenStack Process 3 (nova-api)

OpenStack Controller amp Compute

kubernetes worker-nodes(label controller)

kubernetes worker-nodes (label compute)

NOVA

Neutron

VM

VMOpenStack Process 1 (nova-api)

OpenStack Process 2 (nova-api)

OpenStack Process 3 (nova-api)

OpenStack Process 3 (nova-api)

Database clustering (3 node)jobsmariadb-seedpomariadb-0pomariadb-1pomariadb-2

joiner (3th mariadb)--wsrep_cluster_address=gcomm172165671721675517216815

Neutron network (1 nic)Network

br-ex

br-data

eth0addif

veth1

addif

veth0

link add (veth)

add-port

br-int

1921683033

patch

phy-br-ex

int-br-exqr-xxxqg-xxx

br-tun

patch-int

patch-tun

patch

ovs internal iface

ovs patchqrouter-xxx

vxlan-xxx

ovs vxlan

linux virtual iface

tapxxx

qdhcp-xxx

namespacelocal ip remote ip vni flow

qbrxxxtabxxx

VM

qvbxxx

br-intpatch-intqvoxxx

br-tunvxlan-xxx patch-tun

eth0

br-exphy-br-ex

int-br-ex

br-dataveth1

veth0

1921683034

Network Node Compute Node

OpenStack-Helm Neutron charthellipnetwork interface external veth0 default br-data ml2 agent tunnel_types vxlan type_drivers - flat - vxlan ovs auto_bridge_add null bridge_mappings null neutron default l3_ha_network_type vxlan debug Truersquohellip

Simplified Overlay Network Architecture

Neutron

Sets SwitchingRouting flow rules

Compute-01 Compute-02

nova-compute

br-int (OVS)

Gateway Node Group

nova-compute br-router (OVS)

QuaggaHypervisor Hypervisor

Controls external connectivity

BGP OSPF Multipath

br-int (OVS) br-int (OVS)

Nova OpenStackNetworkingProxies ARP DHCP

vRouter

Sets NAT flow rulesProvisions virtual machine

East-West Traffic North-South Traffic

OpenStack SONA(ONOS)

VXLAN tunnel

Live Demo 1

CI Presentation - Wil

VS

Moving parts - some numbersbull Git repositories 34 local 10 upstream

bull Deployment configurations 4

bull Supported Environments 4

bull Charts per deployment 12

bull Unique docker images in each deployment 23

bull Pods amp jobs in a single deployment 34

bull Pods amp jobs in a 3 node HA deployment 85

Kinds of Workflows

bull Build Kolla Container

bull Build Helm Chart

bull Test

bull Deploy upgrade

Kinds of Jobs

bull Commit PR builds - validate internal changes

bull Live system deploys - validate component upgrades

bull Nightly builds - validate merge CI upstream

bull Nightly redeploy - validate clean slate deploys

bull Upstream builds - identify incoming breaking changes

When yoursquove got twice as many virtualization technologies things fall over twice as fast

Managing the Clutter

bull No upstream forks

bull Isolated CI build amp test environments

bull Incremental upgrades

bull Constant rebuilds

Testing starts from the ground upDeployed Kubernetes

bull Kubernetes e2e

bull Custom

bull Heapster

Testing starts from the ground upKolla Container Builds

bull Bats

bull Clair

Testing starts from the ground upHelm charts

bull Helm test

bull Partial tempest runs

bull Built in (eg horizon)

Testing starts from the ground upDeployed Openstack

bull Full tempest runs

bull Rally

Deployment is only the Beginning

bull Manual validations

bull Launch custom application on OpenStack

bull Additional security tests on running containers

bull Performance baselines and thresholds

Failure testing

bull Risk - Many pods are run mdashprivileged amp with host networking

bull Layered solutions present a lot more interesting failure cases

bull HA rules change

bull Openstack HA behaviors mapped onto Kubernetes HA behavior

Upgrades Become Boring

bull All changes can be simulated multiple times prior to live rollout

bull Minor component patches become trivial

bull OpenStack upgrades become predictable

Live Demo 2

Challenges (pitfalls)

bull Operational Burden

bull Limitations

bull OpenStack as Cloud Native Apps Feasible Needed Gaps and Improvement

bull Kubernetes Stability (fast moving projects)

bull Security

THE END

Page 11: OpenStack on Kubernetes · OpenStack on Kubernetes: Make OpenStack and Kubernetes Fail-Safe Seungkyu Ahn (ahnsk@sk.com) Jaesuk Ahn (jay.ahn@sk.com) Open System Lab Network IT Convergence

Our Plan

bullProduction-Ready by the end of 2017

bullFirst Production within 2017 (IT Infrastructure)

bullExpanding to more deployment (Media NFV) in 2018 amp Putting more APPs on this ldquoStreamlinerdquo

Overall Architecture

Todayrsquos Demo System

gitHelm Repo Tiller

Helm Cli

Kubectl

JenkinsMaster

JenkinsSlave

KubernetesMaster

KubernetesNode

What is the HA targetKubernetes Master

bull Etcdbull API server (load balance)bull Scheduler (leader election)bull Controller manager (leader election)

OpenStack Controller (Keystone Glance Nova Cinder Neutron)bull API server (load balance)bull Scheduler (Nova Cinder)bull MariaDBbull RabbitMQbull Neutron network node (SONA)

Kubernetes 3-Masters

apiserver

kube-master01 node01

scheduler

controllermanager

etcd

kubeproxy

kubelet

apiserver

kube-master02

scheduler

controllermanager

etcd

flanneld flanneldflanneld

kubelet kubelet

apiserver

kube-master03

scheduler

controllermanager

etcd

flanneld

kubelet

kubeproxy

kubelet

flanneld

node00

ceph

kubeproxy

kubeproxy

kubeproxy

KubeletKUBELET_OPTS=--kubeconfig=etckuberneteskubeletconf --require-kubeconfig=true --hostname-override=kube-master01 --logtostderr=false --log-dir=varlogkubernetes --pod-manifest-path=etckubernetesmanifests --allow-privileged=true --v=0 --register-schedulable=false --cluster-dns=1096010 --cluster-domain=clusterlocal

Kubelet

flanneld

etcd yaml --name kube-master01 --initial-advertise-peer-urls http19216830132380 --listen-peer-urls http19216830132380 --advertise-client-urls http19216830134001 --listen-client-urls http19216830132379 http1270012379http19216830134001 http1270014001 --data-dir varetcddata --initial-cluster-token 5d3903915c2cda30174970d784075f0a --initial-cluster kube-master01=http19216830132380 kube-master02=http19216830142380 kube-master03=http19216830152380 --initial-cluster-state new

Kubelet

flanneld

etcd

kube-apiserver yaml - usrlocalbinkube-apiserver --etcd-servers=http1270012379 --storage-backend=etcd3 --insecure-bind-address=127001 --insecure-port=8080 --secure-port=6443 --admission-control=NamespaceLifecycleLimitRangerServiceAccount PersistentVolumeLabelDefaultStorageClassResourceQuota --service-cluster-ip-range=10960016 --tls-cert-file=etckubernetespkiapiservercrt --tls-private-key-file=etckubernetespkiapiserver-keypem --token-auth-file=etckubernetespkikube-token --service-account-key-file=etckubernetespkiapiserver-keypem --allow-privileged --anonymous-auth=false

Kubelet

flanneld

etcd

kube-apiserver

kube-controller-manager yaml - kube-controller-manager --master=1270018080 --cluster-cidr=172160016 --cluster-name=kubernetes --allocate-node-cidrs=true --service-account-private-key-file=etckubernetespkiapiserver-keypem --root-ca-file=etckubernetespkicacrt --cluster-signing-cert-file=etckubernetespkicacrt --cluster-signing-key-file=etckubernetespkica-keypem --v=0 --leader-elect=true Kubelet

flanneld

controllermanager

etcd

kube-apiserver

kube-scheduler yaml - usrlocalbinkube-scheduler --master=1270018080 --v=0 --leader-elect=true

Kubelet

flanneld

controllermanager

scheduler

etcd

kube-apiserver

kube-proxy yaml securityContext privileged true command - binsh - -c - usrlocalbinkube-proxy --kubeconfig=runkubeconfig --cluster-cidr=10960016 --v=0

Kubelet

flanneld

controllermanager

scheduler

etcd

kube-apiserver

kube-proxy

OpenStack Controller amp Compute

kubernetes worker-nodes(label controller)

kubernetes worker-nodes (label compute)

MariaDB

RabbitMQ

Keystone

Glance

Cinder

NOVA

Neutron

NOVA

Neutron

VM

VM

OpenStack Controller amp Compute

kubernetes worker-nodes(label controller)

kubernetes worker-nodes (label compute)

OpenStack Process 1 (nova-api)

NOVA

Neutron

VM

VMOpenStack Process 2 (nova-api)

OpenStack Process 3 (nova-api)

OpenStack Controller amp Compute

kubernetes worker-nodes(label controller)

kubernetes worker-nodes (label compute)

NOVA

Neutron

VM

VMOpenStack Process 1 (nova-api)

OpenStack Process 2 (nova-api)

OpenStack Process 3 (nova-api)

OpenStack Process 3 (nova-api)

Database clustering (3 node)jobsmariadb-seedpomariadb-0pomariadb-1pomariadb-2

joiner (3th mariadb)--wsrep_cluster_address=gcomm172165671721675517216815

Neutron network (1 nic)Network

br-ex

br-data

eth0addif

veth1

addif

veth0

link add (veth)

add-port

br-int

1921683033

patch

phy-br-ex

int-br-exqr-xxxqg-xxx

br-tun

patch-int

patch-tun

patch

ovs internal iface

ovs patchqrouter-xxx

vxlan-xxx

ovs vxlan

linux virtual iface

tapxxx

qdhcp-xxx

namespacelocal ip remote ip vni flow

qbrxxxtabxxx

VM

qvbxxx

br-intpatch-intqvoxxx

br-tunvxlan-xxx patch-tun

eth0

br-exphy-br-ex

int-br-ex

br-dataveth1

veth0

1921683034

Network Node Compute Node

OpenStack-Helm Neutron charthellipnetwork interface external veth0 default br-data ml2 agent tunnel_types vxlan type_drivers - flat - vxlan ovs auto_bridge_add null bridge_mappings null neutron default l3_ha_network_type vxlan debug Truersquohellip

Simplified Overlay Network Architecture

Neutron

Sets SwitchingRouting flow rules

Compute-01 Compute-02

nova-compute

br-int (OVS)

Gateway Node Group

nova-compute br-router (OVS)

QuaggaHypervisor Hypervisor

Controls external connectivity

BGP OSPF Multipath

br-int (OVS) br-int (OVS)

Nova OpenStackNetworkingProxies ARP DHCP

vRouter

Sets NAT flow rulesProvisions virtual machine

East-West Traffic North-South Traffic

OpenStack SONA(ONOS)

VXLAN tunnel

Live Demo 1

CI Presentation - Wil

VS

Moving parts - some numbersbull Git repositories 34 local 10 upstream

bull Deployment configurations 4

bull Supported Environments 4

bull Charts per deployment 12

bull Unique docker images in each deployment 23

bull Pods amp jobs in a single deployment 34

bull Pods amp jobs in a 3 node HA deployment 85

Kinds of Workflows

bull Build Kolla Container

bull Build Helm Chart

bull Test

bull Deploy upgrade

Kinds of Jobs

bull Commit PR builds - validate internal changes

bull Live system deploys - validate component upgrades

bull Nightly builds - validate merge CI upstream

bull Nightly redeploy - validate clean slate deploys

bull Upstream builds - identify incoming breaking changes

When yoursquove got twice as many virtualization technologies things fall over twice as fast

Managing the Clutter

bull No upstream forks

bull Isolated CI build amp test environments

bull Incremental upgrades

bull Constant rebuilds

Testing starts from the ground upDeployed Kubernetes

bull Kubernetes e2e

bull Custom

bull Heapster

Testing starts from the ground upKolla Container Builds

bull Bats

bull Clair

Testing starts from the ground upHelm charts

bull Helm test

bull Partial tempest runs

bull Built in (eg horizon)

Testing starts from the ground upDeployed Openstack

bull Full tempest runs

bull Rally

Deployment is only the Beginning

bull Manual validations

bull Launch custom application on OpenStack

bull Additional security tests on running containers

bull Performance baselines and thresholds

Failure testing

bull Risk - Many pods are run mdashprivileged amp with host networking

bull Layered solutions present a lot more interesting failure cases

bull HA rules change

bull Openstack HA behaviors mapped onto Kubernetes HA behavior

Upgrades Become Boring

bull All changes can be simulated multiple times prior to live rollout

bull Minor component patches become trivial

bull OpenStack upgrades become predictable

Live Demo 2

Challenges (pitfalls)

bull Operational Burden

bull Limitations

bull OpenStack as Cloud Native Apps Feasible Needed Gaps and Improvement

bull Kubernetes Stability (fast moving projects)

bull Security

THE END

Page 12: OpenStack on Kubernetes · OpenStack on Kubernetes: Make OpenStack and Kubernetes Fail-Safe Seungkyu Ahn (ahnsk@sk.com) Jaesuk Ahn (jay.ahn@sk.com) Open System Lab Network IT Convergence

Overall Architecture

Todayrsquos Demo System

gitHelm Repo Tiller

Helm Cli

Kubectl

JenkinsMaster

JenkinsSlave

KubernetesMaster

KubernetesNode

What is the HA targetKubernetes Master

bull Etcdbull API server (load balance)bull Scheduler (leader election)bull Controller manager (leader election)

OpenStack Controller (Keystone Glance Nova Cinder Neutron)bull API server (load balance)bull Scheduler (Nova Cinder)bull MariaDBbull RabbitMQbull Neutron network node (SONA)

Kubernetes 3-Masters

apiserver

kube-master01 node01

scheduler

controllermanager

etcd

kubeproxy

kubelet

apiserver

kube-master02

scheduler

controllermanager

etcd

flanneld flanneldflanneld

kubelet kubelet

apiserver

kube-master03

scheduler

controllermanager

etcd

flanneld

kubelet

kubeproxy

kubelet

flanneld

node00

ceph

kubeproxy

kubeproxy

kubeproxy

KubeletKUBELET_OPTS=--kubeconfig=etckuberneteskubeletconf --require-kubeconfig=true --hostname-override=kube-master01 --logtostderr=false --log-dir=varlogkubernetes --pod-manifest-path=etckubernetesmanifests --allow-privileged=true --v=0 --register-schedulable=false --cluster-dns=1096010 --cluster-domain=clusterlocal

Kubelet

flanneld

etcd yaml --name kube-master01 --initial-advertise-peer-urls http19216830132380 --listen-peer-urls http19216830132380 --advertise-client-urls http19216830134001 --listen-client-urls http19216830132379 http1270012379http19216830134001 http1270014001 --data-dir varetcddata --initial-cluster-token 5d3903915c2cda30174970d784075f0a --initial-cluster kube-master01=http19216830132380 kube-master02=http19216830142380 kube-master03=http19216830152380 --initial-cluster-state new

Kubelet

flanneld

etcd

kube-apiserver yaml - usrlocalbinkube-apiserver --etcd-servers=http1270012379 --storage-backend=etcd3 --insecure-bind-address=127001 --insecure-port=8080 --secure-port=6443 --admission-control=NamespaceLifecycleLimitRangerServiceAccount PersistentVolumeLabelDefaultStorageClassResourceQuota --service-cluster-ip-range=10960016 --tls-cert-file=etckubernetespkiapiservercrt --tls-private-key-file=etckubernetespkiapiserver-keypem --token-auth-file=etckubernetespkikube-token --service-account-key-file=etckubernetespkiapiserver-keypem --allow-privileged --anonymous-auth=false

Kubelet

flanneld

etcd

kube-apiserver

kube-controller-manager yaml - kube-controller-manager --master=1270018080 --cluster-cidr=172160016 --cluster-name=kubernetes --allocate-node-cidrs=true --service-account-private-key-file=etckubernetespkiapiserver-keypem --root-ca-file=etckubernetespkicacrt --cluster-signing-cert-file=etckubernetespkicacrt --cluster-signing-key-file=etckubernetespkica-keypem --v=0 --leader-elect=true Kubelet

flanneld

controllermanager

etcd

kube-apiserver

kube-scheduler yaml - usrlocalbinkube-scheduler --master=1270018080 --v=0 --leader-elect=true

Kubelet

flanneld

controllermanager

scheduler

etcd

kube-apiserver

kube-proxy yaml securityContext privileged true command - binsh - -c - usrlocalbinkube-proxy --kubeconfig=runkubeconfig --cluster-cidr=10960016 --v=0

Kubelet

flanneld

controllermanager

scheduler

etcd

kube-apiserver

kube-proxy

OpenStack Controller amp Compute

kubernetes worker-nodes(label controller)

kubernetes worker-nodes (label compute)

MariaDB

RabbitMQ

Keystone

Glance

Cinder

NOVA

Neutron

NOVA

Neutron

VM

VM

OpenStack Controller amp Compute

kubernetes worker-nodes(label controller)

kubernetes worker-nodes (label compute)

OpenStack Process 1 (nova-api)

NOVA

Neutron

VM

VMOpenStack Process 2 (nova-api)

OpenStack Process 3 (nova-api)

OpenStack Controller amp Compute

kubernetes worker-nodes(label controller)

kubernetes worker-nodes (label compute)

NOVA

Neutron

VM

VMOpenStack Process 1 (nova-api)

OpenStack Process 2 (nova-api)

OpenStack Process 3 (nova-api)

OpenStack Process 3 (nova-api)

Database clustering (3 node)jobsmariadb-seedpomariadb-0pomariadb-1pomariadb-2

joiner (3th mariadb)--wsrep_cluster_address=gcomm172165671721675517216815

Neutron network (1 nic)Network

br-ex

br-data

eth0addif

veth1

addif

veth0

link add (veth)

add-port

br-int

1921683033

patch

phy-br-ex

int-br-exqr-xxxqg-xxx

br-tun

patch-int

patch-tun

patch

ovs internal iface

ovs patchqrouter-xxx

vxlan-xxx

ovs vxlan

linux virtual iface

tapxxx

qdhcp-xxx

namespacelocal ip remote ip vni flow

qbrxxxtabxxx

VM

qvbxxx

br-intpatch-intqvoxxx

br-tunvxlan-xxx patch-tun

eth0

br-exphy-br-ex

int-br-ex

br-dataveth1

veth0

1921683034

Network Node Compute Node

OpenStack-Helm Neutron charthellipnetwork interface external veth0 default br-data ml2 agent tunnel_types vxlan type_drivers - flat - vxlan ovs auto_bridge_add null bridge_mappings null neutron default l3_ha_network_type vxlan debug Truersquohellip

Simplified Overlay Network Architecture

Neutron

Sets SwitchingRouting flow rules

Compute-01 Compute-02

nova-compute

br-int (OVS)

Gateway Node Group

nova-compute br-router (OVS)

QuaggaHypervisor Hypervisor

Controls external connectivity

BGP OSPF Multipath

br-int (OVS) br-int (OVS)

Nova OpenStackNetworkingProxies ARP DHCP

vRouter

Sets NAT flow rulesProvisions virtual machine

East-West Traffic North-South Traffic

OpenStack SONA(ONOS)

VXLAN tunnel

Live Demo 1

CI Presentation - Wil

VS

Moving parts - some numbersbull Git repositories 34 local 10 upstream

bull Deployment configurations 4

bull Supported Environments 4

bull Charts per deployment 12

bull Unique docker images in each deployment 23

bull Pods amp jobs in a single deployment 34

bull Pods amp jobs in a 3 node HA deployment 85

Kinds of Workflows

bull Build Kolla Container

bull Build Helm Chart

bull Test

bull Deploy upgrade

Kinds of Jobs

bull Commit PR builds - validate internal changes

bull Live system deploys - validate component upgrades

bull Nightly builds - validate merge CI upstream

bull Nightly redeploy - validate clean slate deploys

bull Upstream builds - identify incoming breaking changes

When yoursquove got twice as many virtualization technologies things fall over twice as fast

Managing the Clutter

bull No upstream forks

bull Isolated CI build amp test environments

bull Incremental upgrades

bull Constant rebuilds

Testing starts from the ground upDeployed Kubernetes

bull Kubernetes e2e

bull Custom

bull Heapster

Testing starts from the ground upKolla Container Builds

bull Bats

bull Clair

Testing starts from the ground upHelm charts

bull Helm test

bull Partial tempest runs

bull Built in (eg horizon)

Testing starts from the ground upDeployed Openstack

bull Full tempest runs

bull Rally

Deployment is only the Beginning

bull Manual validations

bull Launch custom application on OpenStack

bull Additional security tests on running containers

bull Performance baselines and thresholds

Failure testing

bull Risk - Many pods are run mdashprivileged amp with host networking

bull Layered solutions present a lot more interesting failure cases

bull HA rules change

bull Openstack HA behaviors mapped onto Kubernetes HA behavior

Upgrades Become Boring

bull All changes can be simulated multiple times prior to live rollout

bull Minor component patches become trivial

bull OpenStack upgrades become predictable

Live Demo 2

Challenges (pitfalls)

bull Operational Burden

bull Limitations

bull OpenStack as Cloud Native Apps Feasible Needed Gaps and Improvement

bull Kubernetes Stability (fast moving projects)

bull Security

THE END

Page 13: OpenStack on Kubernetes · OpenStack on Kubernetes: Make OpenStack and Kubernetes Fail-Safe Seungkyu Ahn (ahnsk@sk.com) Jaesuk Ahn (jay.ahn@sk.com) Open System Lab Network IT Convergence

Todayrsquos Demo System

gitHelm Repo Tiller

Helm Cli

Kubectl

JenkinsMaster

JenkinsSlave

KubernetesMaster

KubernetesNode

What is the HA targetKubernetes Master

bull Etcdbull API server (load balance)bull Scheduler (leader election)bull Controller manager (leader election)

OpenStack Controller (Keystone Glance Nova Cinder Neutron)bull API server (load balance)bull Scheduler (Nova Cinder)bull MariaDBbull RabbitMQbull Neutron network node (SONA)

Kubernetes 3-Masters

apiserver

kube-master01 node01

scheduler

controllermanager

etcd

kubeproxy

kubelet

apiserver

kube-master02

scheduler

controllermanager

etcd

flanneld flanneldflanneld

kubelet kubelet

apiserver

kube-master03

scheduler

controllermanager

etcd

flanneld

kubelet

kubeproxy

kubelet

flanneld

node00

ceph

kubeproxy

kubeproxy

kubeproxy

KubeletKUBELET_OPTS=--kubeconfig=etckuberneteskubeletconf --require-kubeconfig=true --hostname-override=kube-master01 --logtostderr=false --log-dir=varlogkubernetes --pod-manifest-path=etckubernetesmanifests --allow-privileged=true --v=0 --register-schedulable=false --cluster-dns=1096010 --cluster-domain=clusterlocal

Kubelet

flanneld

etcd yaml --name kube-master01 --initial-advertise-peer-urls http19216830132380 --listen-peer-urls http19216830132380 --advertise-client-urls http19216830134001 --listen-client-urls http19216830132379 http1270012379http19216830134001 http1270014001 --data-dir varetcddata --initial-cluster-token 5d3903915c2cda30174970d784075f0a --initial-cluster kube-master01=http19216830132380 kube-master02=http19216830142380 kube-master03=http19216830152380 --initial-cluster-state new

Kubelet

flanneld

etcd

kube-apiserver yaml - usrlocalbinkube-apiserver --etcd-servers=http1270012379 --storage-backend=etcd3 --insecure-bind-address=127001 --insecure-port=8080 --secure-port=6443 --admission-control=NamespaceLifecycleLimitRangerServiceAccount PersistentVolumeLabelDefaultStorageClassResourceQuota --service-cluster-ip-range=10960016 --tls-cert-file=etckubernetespkiapiservercrt --tls-private-key-file=etckubernetespkiapiserver-keypem --token-auth-file=etckubernetespkikube-token --service-account-key-file=etckubernetespkiapiserver-keypem --allow-privileged --anonymous-auth=false

Kubelet

flanneld

etcd

kube-apiserver

kube-controller-manager yaml - kube-controller-manager --master=1270018080 --cluster-cidr=172160016 --cluster-name=kubernetes --allocate-node-cidrs=true --service-account-private-key-file=etckubernetespkiapiserver-keypem --root-ca-file=etckubernetespkicacrt --cluster-signing-cert-file=etckubernetespkicacrt --cluster-signing-key-file=etckubernetespkica-keypem --v=0 --leader-elect=true Kubelet

flanneld

controllermanager

etcd

kube-apiserver

kube-scheduler yaml - usrlocalbinkube-scheduler --master=1270018080 --v=0 --leader-elect=true

Kubelet

flanneld

controllermanager

scheduler

etcd

kube-apiserver

kube-proxy yaml securityContext privileged true command - binsh - -c - usrlocalbinkube-proxy --kubeconfig=runkubeconfig --cluster-cidr=10960016 --v=0

Kubelet

flanneld

controllermanager

scheduler

etcd

kube-apiserver

kube-proxy

OpenStack Controller amp Compute

kubernetes worker-nodes(label controller)

kubernetes worker-nodes (label compute)

MariaDB

RabbitMQ

Keystone

Glance

Cinder

NOVA

Neutron

NOVA

Neutron

VM

VM

OpenStack Controller amp Compute

kubernetes worker-nodes(label controller)

kubernetes worker-nodes (label compute)

OpenStack Process 1 (nova-api)

NOVA

Neutron

VM

VMOpenStack Process 2 (nova-api)

OpenStack Process 3 (nova-api)

OpenStack Controller amp Compute

kubernetes worker-nodes(label controller)

kubernetes worker-nodes (label compute)

NOVA

Neutron

VM

VMOpenStack Process 1 (nova-api)

OpenStack Process 2 (nova-api)

OpenStack Process 3 (nova-api)

OpenStack Process 3 (nova-api)

Database clustering (3 node)jobsmariadb-seedpomariadb-0pomariadb-1pomariadb-2

joiner (3th mariadb)--wsrep_cluster_address=gcomm172165671721675517216815

Neutron network (1 nic)Network

br-ex

br-data

eth0addif

veth1

addif

veth0

link add (veth)

add-port

br-int

1921683033

patch

phy-br-ex

int-br-exqr-xxxqg-xxx

br-tun

patch-int

patch-tun

patch

ovs internal iface

ovs patchqrouter-xxx

vxlan-xxx

ovs vxlan

linux virtual iface

tapxxx

qdhcp-xxx

namespacelocal ip remote ip vni flow

qbrxxxtabxxx

VM

qvbxxx

br-intpatch-intqvoxxx

br-tunvxlan-xxx patch-tun

eth0

br-exphy-br-ex

int-br-ex

br-dataveth1

veth0

1921683034

Network Node Compute Node

OpenStack-Helm Neutron charthellipnetwork interface external veth0 default br-data ml2 agent tunnel_types vxlan type_drivers - flat - vxlan ovs auto_bridge_add null bridge_mappings null neutron default l3_ha_network_type vxlan debug Truersquohellip

Simplified Overlay Network Architecture

Neutron

Sets SwitchingRouting flow rules

Compute-01 Compute-02

nova-compute

br-int (OVS)

Gateway Node Group

nova-compute br-router (OVS)

QuaggaHypervisor Hypervisor

Controls external connectivity

BGP OSPF Multipath

br-int (OVS) br-int (OVS)

Nova OpenStackNetworkingProxies ARP DHCP

vRouter

Sets NAT flow rulesProvisions virtual machine

East-West Traffic North-South Traffic

OpenStack SONA(ONOS)

VXLAN tunnel

Live Demo 1

CI Presentation - Wil

VS

Moving parts - some numbersbull Git repositories 34 local 10 upstream

bull Deployment configurations 4

bull Supported Environments 4

bull Charts per deployment 12

bull Unique docker images in each deployment 23

bull Pods amp jobs in a single deployment 34

bull Pods amp jobs in a 3 node HA deployment 85

Kinds of Workflows

bull Build Kolla Container

bull Build Helm Chart

bull Test

bull Deploy upgrade

Kinds of Jobs

bull Commit PR builds - validate internal changes

bull Live system deploys - validate component upgrades

bull Nightly builds - validate merge CI upstream

bull Nightly redeploy - validate clean slate deploys

bull Upstream builds - identify incoming breaking changes

When yoursquove got twice as many virtualization technologies things fall over twice as fast

Managing the Clutter

bull No upstream forks

bull Isolated CI build amp test environments

bull Incremental upgrades

bull Constant rebuilds

Testing starts from the ground upDeployed Kubernetes

bull Kubernetes e2e

bull Custom

bull Heapster

Testing starts from the ground upKolla Container Builds

bull Bats

bull Clair

Testing starts from the ground upHelm charts

bull Helm test

bull Partial tempest runs

bull Built in (eg horizon)

Testing starts from the ground upDeployed Openstack

bull Full tempest runs

bull Rally

Deployment is only the Beginning

bull Manual validations

bull Launch custom application on OpenStack

bull Additional security tests on running containers

bull Performance baselines and thresholds

Failure testing

bull Risk - Many pods are run mdashprivileged amp with host networking

bull Layered solutions present a lot more interesting failure cases

bull HA rules change

bull Openstack HA behaviors mapped onto Kubernetes HA behavior

Upgrades Become Boring

bull All changes can be simulated multiple times prior to live rollout

bull Minor component patches become trivial

bull OpenStack upgrades become predictable

Live Demo 2

Challenges (pitfalls)

bull Operational Burden

bull Limitations

bull OpenStack as Cloud Native Apps Feasible Needed Gaps and Improvement

bull Kubernetes Stability (fast moving projects)

bull Security

THE END

Page 14: OpenStack on Kubernetes · OpenStack on Kubernetes: Make OpenStack and Kubernetes Fail-Safe Seungkyu Ahn (ahnsk@sk.com) Jaesuk Ahn (jay.ahn@sk.com) Open System Lab Network IT Convergence

What is the HA targetKubernetes Master

bull Etcdbull API server (load balance)bull Scheduler (leader election)bull Controller manager (leader election)

OpenStack Controller (Keystone Glance Nova Cinder Neutron)bull API server (load balance)bull Scheduler (Nova Cinder)bull MariaDBbull RabbitMQbull Neutron network node (SONA)

Kubernetes 3-Masters

apiserver

kube-master01 node01

scheduler

controllermanager

etcd

kubeproxy

kubelet

apiserver

kube-master02

scheduler

controllermanager

etcd

flanneld flanneldflanneld

kubelet kubelet

apiserver

kube-master03

scheduler

controllermanager

etcd

flanneld

kubelet

kubeproxy

kubelet

flanneld

node00

ceph

kubeproxy

kubeproxy

kubeproxy

KubeletKUBELET_OPTS=--kubeconfig=etckuberneteskubeletconf --require-kubeconfig=true --hostname-override=kube-master01 --logtostderr=false --log-dir=varlogkubernetes --pod-manifest-path=etckubernetesmanifests --allow-privileged=true --v=0 --register-schedulable=false --cluster-dns=1096010 --cluster-domain=clusterlocal

Kubelet

flanneld

etcd yaml --name kube-master01 --initial-advertise-peer-urls http19216830132380 --listen-peer-urls http19216830132380 --advertise-client-urls http19216830134001 --listen-client-urls http19216830132379 http1270012379http19216830134001 http1270014001 --data-dir varetcddata --initial-cluster-token 5d3903915c2cda30174970d784075f0a --initial-cluster kube-master01=http19216830132380 kube-master02=http19216830142380 kube-master03=http19216830152380 --initial-cluster-state new

Kubelet

flanneld

etcd

kube-apiserver yaml - usrlocalbinkube-apiserver --etcd-servers=http1270012379 --storage-backend=etcd3 --insecure-bind-address=127001 --insecure-port=8080 --secure-port=6443 --admission-control=NamespaceLifecycleLimitRangerServiceAccount PersistentVolumeLabelDefaultStorageClassResourceQuota --service-cluster-ip-range=10960016 --tls-cert-file=etckubernetespkiapiservercrt --tls-private-key-file=etckubernetespkiapiserver-keypem --token-auth-file=etckubernetespkikube-token --service-account-key-file=etckubernetespkiapiserver-keypem --allow-privileged --anonymous-auth=false

Kubelet

flanneld

etcd

kube-apiserver

kube-controller-manager yaml - kube-controller-manager --master=1270018080 --cluster-cidr=172160016 --cluster-name=kubernetes --allocate-node-cidrs=true --service-account-private-key-file=etckubernetespkiapiserver-keypem --root-ca-file=etckubernetespkicacrt --cluster-signing-cert-file=etckubernetespkicacrt --cluster-signing-key-file=etckubernetespkica-keypem --v=0 --leader-elect=true Kubelet

flanneld

controllermanager

etcd

kube-apiserver

kube-scheduler yaml - usrlocalbinkube-scheduler --master=1270018080 --v=0 --leader-elect=true

Kubelet

flanneld

controllermanager

scheduler

etcd

kube-apiserver

kube-proxy yaml securityContext privileged true command - binsh - -c - usrlocalbinkube-proxy --kubeconfig=runkubeconfig --cluster-cidr=10960016 --v=0

Kubelet

flanneld

controllermanager

scheduler

etcd

kube-apiserver

kube-proxy

OpenStack Controller amp Compute

kubernetes worker-nodes(label controller)

kubernetes worker-nodes (label compute)

MariaDB

RabbitMQ

Keystone

Glance

Cinder

NOVA

Neutron

NOVA

Neutron

VM

VM

OpenStack Controller amp Compute

kubernetes worker-nodes(label controller)

kubernetes worker-nodes (label compute)

OpenStack Process 1 (nova-api)

NOVA

Neutron

VM

VMOpenStack Process 2 (nova-api)

OpenStack Process 3 (nova-api)

OpenStack Controller amp Compute

kubernetes worker-nodes(label controller)

kubernetes worker-nodes (label compute)

NOVA

Neutron

VM

VMOpenStack Process 1 (nova-api)

OpenStack Process 2 (nova-api)

OpenStack Process 3 (nova-api)

OpenStack Process 3 (nova-api)

Database clustering (3 node)jobsmariadb-seedpomariadb-0pomariadb-1pomariadb-2

joiner (3th mariadb)--wsrep_cluster_address=gcomm172165671721675517216815

Neutron network (1 nic)Network

br-ex

br-data

eth0addif

veth1

addif

veth0

link add (veth)

add-port

br-int

1921683033

patch

phy-br-ex

int-br-exqr-xxxqg-xxx

br-tun

patch-int

patch-tun

patch

ovs internal iface

ovs patchqrouter-xxx

vxlan-xxx

ovs vxlan

linux virtual iface

tapxxx

qdhcp-xxx

namespacelocal ip remote ip vni flow

qbrxxxtabxxx

VM

qvbxxx

br-intpatch-intqvoxxx

br-tunvxlan-xxx patch-tun

eth0

br-exphy-br-ex

int-br-ex

br-dataveth1

veth0

1921683034

Network Node Compute Node

OpenStack-Helm Neutron charthellipnetwork interface external veth0 default br-data ml2 agent tunnel_types vxlan type_drivers - flat - vxlan ovs auto_bridge_add null bridge_mappings null neutron default l3_ha_network_type vxlan debug Truersquohellip

Simplified Overlay Network Architecture

Neutron

Sets SwitchingRouting flow rules

Compute-01 Compute-02

nova-compute

br-int (OVS)

Gateway Node Group

nova-compute br-router (OVS)

QuaggaHypervisor Hypervisor

Controls external connectivity

BGP OSPF Multipath

br-int (OVS) br-int (OVS)

Nova OpenStackNetworkingProxies ARP DHCP

vRouter

Sets NAT flow rulesProvisions virtual machine

East-West Traffic North-South Traffic

OpenStack SONA(ONOS)

VXLAN tunnel

Live Demo 1

CI Presentation - Wil

VS

Moving parts - some numbersbull Git repositories 34 local 10 upstream

bull Deployment configurations 4

bull Supported Environments 4

bull Charts per deployment 12

bull Unique docker images in each deployment 23

bull Pods amp jobs in a single deployment 34

bull Pods amp jobs in a 3 node HA deployment 85

Kinds of Workflows

bull Build Kolla Container

bull Build Helm Chart

bull Test

bull Deploy upgrade

Kinds of Jobs

bull Commit PR builds - validate internal changes

bull Live system deploys - validate component upgrades

bull Nightly builds - validate merge CI upstream

bull Nightly redeploy - validate clean slate deploys

bull Upstream builds - identify incoming breaking changes

When yoursquove got twice as many virtualization technologies things fall over twice as fast

Managing the Clutter

bull No upstream forks

bull Isolated CI build amp test environments

bull Incremental upgrades

bull Constant rebuilds

Testing starts from the ground upDeployed Kubernetes

bull Kubernetes e2e

bull Custom

bull Heapster

Testing starts from the ground upKolla Container Builds

bull Bats

bull Clair

Testing starts from the ground upHelm charts

bull Helm test

bull Partial tempest runs

bull Built in (eg horizon)

Testing starts from the ground upDeployed Openstack

bull Full tempest runs

bull Rally

Deployment is only the Beginning

bull Manual validations

bull Launch custom application on OpenStack

bull Additional security tests on running containers

bull Performance baselines and thresholds

Failure testing

bull Risk - Many pods are run mdashprivileged amp with host networking

bull Layered solutions present a lot more interesting failure cases

bull HA rules change

bull Openstack HA behaviors mapped onto Kubernetes HA behavior

Upgrades Become Boring

bull All changes can be simulated multiple times prior to live rollout

bull Minor component patches become trivial

bull OpenStack upgrades become predictable

Live Demo 2

Challenges (pitfalls)

bull Operational Burden

bull Limitations

bull OpenStack as Cloud Native Apps Feasible Needed Gaps and Improvement

bull Kubernetes Stability (fast moving projects)

bull Security

THE END

Page 15: OpenStack on Kubernetes · OpenStack on Kubernetes: Make OpenStack and Kubernetes Fail-Safe Seungkyu Ahn (ahnsk@sk.com) Jaesuk Ahn (jay.ahn@sk.com) Open System Lab Network IT Convergence

Kubernetes 3-Masters

apiserver

kube-master01 node01

scheduler

controllermanager

etcd

kubeproxy

kubelet

apiserver

kube-master02

scheduler

controllermanager

etcd

flanneld flanneldflanneld

kubelet kubelet

apiserver

kube-master03

scheduler

controllermanager

etcd

flanneld

kubelet

kubeproxy

kubelet

flanneld

node00

ceph

kubeproxy

kubeproxy

kubeproxy

KubeletKUBELET_OPTS=--kubeconfig=etckuberneteskubeletconf --require-kubeconfig=true --hostname-override=kube-master01 --logtostderr=false --log-dir=varlogkubernetes --pod-manifest-path=etckubernetesmanifests --allow-privileged=true --v=0 --register-schedulable=false --cluster-dns=1096010 --cluster-domain=clusterlocal

Kubelet

flanneld

etcd yaml --name kube-master01 --initial-advertise-peer-urls http19216830132380 --listen-peer-urls http19216830132380 --advertise-client-urls http19216830134001 --listen-client-urls http19216830132379 http1270012379http19216830134001 http1270014001 --data-dir varetcddata --initial-cluster-token 5d3903915c2cda30174970d784075f0a --initial-cluster kube-master01=http19216830132380 kube-master02=http19216830142380 kube-master03=http19216830152380 --initial-cluster-state new

Kubelet

flanneld

etcd

kube-apiserver yaml - usrlocalbinkube-apiserver --etcd-servers=http1270012379 --storage-backend=etcd3 --insecure-bind-address=127001 --insecure-port=8080 --secure-port=6443 --admission-control=NamespaceLifecycleLimitRangerServiceAccount PersistentVolumeLabelDefaultStorageClassResourceQuota --service-cluster-ip-range=10960016 --tls-cert-file=etckubernetespkiapiservercrt --tls-private-key-file=etckubernetespkiapiserver-keypem --token-auth-file=etckubernetespkikube-token --service-account-key-file=etckubernetespkiapiserver-keypem --allow-privileged --anonymous-auth=false

Kubelet

flanneld

etcd

kube-apiserver

kube-controller-manager yaml - kube-controller-manager --master=1270018080 --cluster-cidr=172160016 --cluster-name=kubernetes --allocate-node-cidrs=true --service-account-private-key-file=etckubernetespkiapiserver-keypem --root-ca-file=etckubernetespkicacrt --cluster-signing-cert-file=etckubernetespkicacrt --cluster-signing-key-file=etckubernetespkica-keypem --v=0 --leader-elect=true Kubelet

flanneld

controllermanager

etcd

kube-apiserver

kube-scheduler yaml - usrlocalbinkube-scheduler --master=1270018080 --v=0 --leader-elect=true

Kubelet

flanneld

controllermanager

scheduler

etcd

kube-apiserver

kube-proxy yaml securityContext privileged true command - binsh - -c - usrlocalbinkube-proxy --kubeconfig=runkubeconfig --cluster-cidr=10960016 --v=0

Kubelet

flanneld

controllermanager

scheduler

etcd

kube-apiserver

kube-proxy

OpenStack Controller amp Compute

kubernetes worker-nodes(label controller)

kubernetes worker-nodes (label compute)

MariaDB

RabbitMQ

Keystone

Glance

Cinder

NOVA

Neutron

NOVA

Neutron

VM

VM

OpenStack Controller amp Compute

kubernetes worker-nodes(label controller)

kubernetes worker-nodes (label compute)

OpenStack Process 1 (nova-api)

NOVA

Neutron

VM

VMOpenStack Process 2 (nova-api)

OpenStack Process 3 (nova-api)

OpenStack Controller amp Compute

kubernetes worker-nodes(label controller)

kubernetes worker-nodes (label compute)

NOVA

Neutron

VM

VMOpenStack Process 1 (nova-api)

OpenStack Process 2 (nova-api)

OpenStack Process 3 (nova-api)

OpenStack Process 3 (nova-api)

Database clustering (3 node)jobsmariadb-seedpomariadb-0pomariadb-1pomariadb-2

joiner (3th mariadb)--wsrep_cluster_address=gcomm172165671721675517216815

Neutron network (1 nic)Network

br-ex

br-data

eth0addif

veth1

addif

veth0

link add (veth)

add-port

br-int

1921683033

patch

phy-br-ex

int-br-exqr-xxxqg-xxx

br-tun

patch-int

patch-tun

patch

ovs internal iface

ovs patchqrouter-xxx

vxlan-xxx

ovs vxlan

linux virtual iface

tapxxx

qdhcp-xxx

namespacelocal ip remote ip vni flow

qbrxxxtabxxx

VM

qvbxxx

br-intpatch-intqvoxxx

br-tunvxlan-xxx patch-tun

eth0

br-exphy-br-ex

int-br-ex

br-dataveth1

veth0

1921683034

Network Node Compute Node

OpenStack-Helm Neutron charthellipnetwork interface external veth0 default br-data ml2 agent tunnel_types vxlan type_drivers - flat - vxlan ovs auto_bridge_add null bridge_mappings null neutron default l3_ha_network_type vxlan debug Truersquohellip

Simplified Overlay Network Architecture

Neutron

Sets SwitchingRouting flow rules

Compute-01 Compute-02

nova-compute

br-int (OVS)

Gateway Node Group

nova-compute br-router (OVS)

QuaggaHypervisor Hypervisor

Controls external connectivity

BGP OSPF Multipath

br-int (OVS) br-int (OVS)

Nova OpenStackNetworkingProxies ARP DHCP

vRouter

Sets NAT flow rulesProvisions virtual machine

East-West Traffic North-South Traffic

OpenStack SONA(ONOS)

VXLAN tunnel

Live Demo 1

CI Presentation - Wil

VS

Moving parts - some numbersbull Git repositories 34 local 10 upstream

bull Deployment configurations 4

bull Supported Environments 4

bull Charts per deployment 12

bull Unique docker images in each deployment 23

bull Pods amp jobs in a single deployment 34

bull Pods amp jobs in a 3 node HA deployment 85

Kinds of Workflows

bull Build Kolla Container

bull Build Helm Chart

bull Test

bull Deploy upgrade

Kinds of Jobs

bull Commit PR builds - validate internal changes

bull Live system deploys - validate component upgrades

bull Nightly builds - validate merge CI upstream

bull Nightly redeploy - validate clean slate deploys

bull Upstream builds - identify incoming breaking changes

When yoursquove got twice as many virtualization technologies things fall over twice as fast

Managing the Clutter

bull No upstream forks

bull Isolated CI build amp test environments

bull Incremental upgrades

bull Constant rebuilds

Testing starts from the ground upDeployed Kubernetes

bull Kubernetes e2e

bull Custom

bull Heapster

Testing starts from the ground upKolla Container Builds

bull Bats

bull Clair

Testing starts from the ground upHelm charts

bull Helm test

bull Partial tempest runs

bull Built in (eg horizon)

Testing starts from the ground upDeployed Openstack

bull Full tempest runs

bull Rally

Deployment is only the Beginning

bull Manual validations

bull Launch custom application on OpenStack

bull Additional security tests on running containers

bull Performance baselines and thresholds

Failure testing

bull Risk - Many pods are run mdashprivileged amp with host networking

bull Layered solutions present a lot more interesting failure cases

bull HA rules change

bull Openstack HA behaviors mapped onto Kubernetes HA behavior

Upgrades Become Boring

bull All changes can be simulated multiple times prior to live rollout

bull Minor component patches become trivial

bull OpenStack upgrades become predictable

Live Demo 2

Challenges (pitfalls)

bull Operational Burden

bull Limitations

bull OpenStack as Cloud Native Apps Feasible Needed Gaps and Improvement

bull Kubernetes Stability (fast moving projects)

bull Security

THE END

Page 16: OpenStack on Kubernetes · OpenStack on Kubernetes: Make OpenStack and Kubernetes Fail-Safe Seungkyu Ahn (ahnsk@sk.com) Jaesuk Ahn (jay.ahn@sk.com) Open System Lab Network IT Convergence

KubeletKUBELET_OPTS=--kubeconfig=etckuberneteskubeletconf --require-kubeconfig=true --hostname-override=kube-master01 --logtostderr=false --log-dir=varlogkubernetes --pod-manifest-path=etckubernetesmanifests --allow-privileged=true --v=0 --register-schedulable=false --cluster-dns=1096010 --cluster-domain=clusterlocal

Kubelet

flanneld

etcd yaml --name kube-master01 --initial-advertise-peer-urls http19216830132380 --listen-peer-urls http19216830132380 --advertise-client-urls http19216830134001 --listen-client-urls http19216830132379 http1270012379http19216830134001 http1270014001 --data-dir varetcddata --initial-cluster-token 5d3903915c2cda30174970d784075f0a --initial-cluster kube-master01=http19216830132380 kube-master02=http19216830142380 kube-master03=http19216830152380 --initial-cluster-state new

Kubelet

flanneld

etcd

kube-apiserver yaml - usrlocalbinkube-apiserver --etcd-servers=http1270012379 --storage-backend=etcd3 --insecure-bind-address=127001 --insecure-port=8080 --secure-port=6443 --admission-control=NamespaceLifecycleLimitRangerServiceAccount PersistentVolumeLabelDefaultStorageClassResourceQuota --service-cluster-ip-range=10960016 --tls-cert-file=etckubernetespkiapiservercrt --tls-private-key-file=etckubernetespkiapiserver-keypem --token-auth-file=etckubernetespkikube-token --service-account-key-file=etckubernetespkiapiserver-keypem --allow-privileged --anonymous-auth=false

Kubelet

flanneld

etcd

kube-apiserver

kube-controller-manager yaml - kube-controller-manager --master=1270018080 --cluster-cidr=172160016 --cluster-name=kubernetes --allocate-node-cidrs=true --service-account-private-key-file=etckubernetespkiapiserver-keypem --root-ca-file=etckubernetespkicacrt --cluster-signing-cert-file=etckubernetespkicacrt --cluster-signing-key-file=etckubernetespkica-keypem --v=0 --leader-elect=true Kubelet

flanneld

controllermanager

etcd

kube-apiserver

kube-scheduler yaml - usrlocalbinkube-scheduler --master=1270018080 --v=0 --leader-elect=true

Kubelet

flanneld

controllermanager

scheduler

etcd

kube-apiserver

kube-proxy yaml securityContext privileged true command - binsh - -c - usrlocalbinkube-proxy --kubeconfig=runkubeconfig --cluster-cidr=10960016 --v=0

Kubelet

flanneld

controllermanager

scheduler

etcd

kube-apiserver

kube-proxy

OpenStack Controller amp Compute

kubernetes worker-nodes(label controller)

kubernetes worker-nodes (label compute)

MariaDB

RabbitMQ

Keystone

Glance

Cinder

NOVA

Neutron

NOVA

Neutron

VM

VM

OpenStack Controller amp Compute

kubernetes worker-nodes(label controller)

kubernetes worker-nodes (label compute)

OpenStack Process 1 (nova-api)

NOVA

Neutron

VM

VMOpenStack Process 2 (nova-api)

OpenStack Process 3 (nova-api)

OpenStack Controller amp Compute

kubernetes worker-nodes(label controller)

kubernetes worker-nodes (label compute)

NOVA

Neutron

VM

VMOpenStack Process 1 (nova-api)

OpenStack Process 2 (nova-api)

OpenStack Process 3 (nova-api)

OpenStack Process 3 (nova-api)

Database clustering (3 node)jobsmariadb-seedpomariadb-0pomariadb-1pomariadb-2

joiner (3th mariadb)--wsrep_cluster_address=gcomm172165671721675517216815

Neutron network (1 nic)Network

br-ex

br-data

eth0addif

veth1

addif

veth0

link add (veth)

add-port

br-int

1921683033

patch

phy-br-ex

int-br-exqr-xxxqg-xxx

br-tun

patch-int

patch-tun

patch

ovs internal iface

ovs patchqrouter-xxx

vxlan-xxx

ovs vxlan

linux virtual iface

tapxxx

qdhcp-xxx

namespacelocal ip remote ip vni flow

qbrxxxtabxxx

VM

qvbxxx

br-intpatch-intqvoxxx

br-tunvxlan-xxx patch-tun

eth0

br-exphy-br-ex

int-br-ex

br-dataveth1

veth0

1921683034

Network Node Compute Node

OpenStack-Helm Neutron charthellipnetwork interface external veth0 default br-data ml2 agent tunnel_types vxlan type_drivers - flat - vxlan ovs auto_bridge_add null bridge_mappings null neutron default l3_ha_network_type vxlan debug Truersquohellip

Simplified Overlay Network Architecture

Neutron

Sets SwitchingRouting flow rules

Compute-01 Compute-02

nova-compute

br-int (OVS)

Gateway Node Group

nova-compute br-router (OVS)

QuaggaHypervisor Hypervisor

Controls external connectivity

BGP OSPF Multipath

br-int (OVS) br-int (OVS)

Nova OpenStackNetworkingProxies ARP DHCP

vRouter

Sets NAT flow rulesProvisions virtual machine

East-West Traffic North-South Traffic

OpenStack SONA(ONOS)

VXLAN tunnel

Live Demo 1

CI Presentation - Wil

VS

Moving parts - some numbersbull Git repositories 34 local 10 upstream

bull Deployment configurations 4

bull Supported Environments 4

bull Charts per deployment 12

bull Unique docker images in each deployment 23

bull Pods amp jobs in a single deployment 34

bull Pods amp jobs in a 3 node HA deployment 85

Kinds of Workflows

bull Build Kolla Container

bull Build Helm Chart

bull Test

bull Deploy upgrade

Kinds of Jobs

bull Commit PR builds - validate internal changes

bull Live system deploys - validate component upgrades

bull Nightly builds - validate merge CI upstream

bull Nightly redeploy - validate clean slate deploys

bull Upstream builds - identify incoming breaking changes

When yoursquove got twice as many virtualization technologies things fall over twice as fast

Managing the Clutter

bull No upstream forks

bull Isolated CI build amp test environments

bull Incremental upgrades

bull Constant rebuilds

Testing starts from the ground upDeployed Kubernetes

bull Kubernetes e2e

bull Custom

bull Heapster

Testing starts from the ground upKolla Container Builds

bull Bats

bull Clair

Testing starts from the ground upHelm charts

bull Helm test

bull Partial tempest runs

bull Built in (eg horizon)

Testing starts from the ground upDeployed Openstack

bull Full tempest runs

bull Rally

Deployment is only the Beginning

bull Manual validations

bull Launch custom application on OpenStack

bull Additional security tests on running containers

bull Performance baselines and thresholds

Failure testing

bull Risk - Many pods are run mdashprivileged amp with host networking

bull Layered solutions present a lot more interesting failure cases

bull HA rules change

bull Openstack HA behaviors mapped onto Kubernetes HA behavior

Upgrades Become Boring

bull All changes can be simulated multiple times prior to live rollout

bull Minor component patches become trivial

bull OpenStack upgrades become predictable

Live Demo 2

Challenges (pitfalls)

bull Operational Burden

bull Limitations

bull OpenStack as Cloud Native Apps Feasible Needed Gaps and Improvement

bull Kubernetes Stability (fast moving projects)

bull Security

THE END

Page 17: OpenStack on Kubernetes · OpenStack on Kubernetes: Make OpenStack and Kubernetes Fail-Safe Seungkyu Ahn (ahnsk@sk.com) Jaesuk Ahn (jay.ahn@sk.com) Open System Lab Network IT Convergence

etcd yaml --name kube-master01 --initial-advertise-peer-urls http19216830132380 --listen-peer-urls http19216830132380 --advertise-client-urls http19216830134001 --listen-client-urls http19216830132379 http1270012379http19216830134001 http1270014001 --data-dir varetcddata --initial-cluster-token 5d3903915c2cda30174970d784075f0a --initial-cluster kube-master01=http19216830132380 kube-master02=http19216830142380 kube-master03=http19216830152380 --initial-cluster-state new

Kubelet

flanneld

etcd

kube-apiserver yaml - usrlocalbinkube-apiserver --etcd-servers=http1270012379 --storage-backend=etcd3 --insecure-bind-address=127001 --insecure-port=8080 --secure-port=6443 --admission-control=NamespaceLifecycleLimitRangerServiceAccount PersistentVolumeLabelDefaultStorageClassResourceQuota --service-cluster-ip-range=10960016 --tls-cert-file=etckubernetespkiapiservercrt --tls-private-key-file=etckubernetespkiapiserver-keypem --token-auth-file=etckubernetespkikube-token --service-account-key-file=etckubernetespkiapiserver-keypem --allow-privileged --anonymous-auth=false

Kubelet

flanneld

etcd

kube-apiserver

kube-controller-manager yaml - kube-controller-manager --master=1270018080 --cluster-cidr=172160016 --cluster-name=kubernetes --allocate-node-cidrs=true --service-account-private-key-file=etckubernetespkiapiserver-keypem --root-ca-file=etckubernetespkicacrt --cluster-signing-cert-file=etckubernetespkicacrt --cluster-signing-key-file=etckubernetespkica-keypem --v=0 --leader-elect=true Kubelet

flanneld

controllermanager

etcd

kube-apiserver

kube-scheduler yaml - usrlocalbinkube-scheduler --master=1270018080 --v=0 --leader-elect=true

Kubelet

flanneld

controllermanager

scheduler

etcd

kube-apiserver

kube-proxy yaml securityContext privileged true command - binsh - -c - usrlocalbinkube-proxy --kubeconfig=runkubeconfig --cluster-cidr=10960016 --v=0

Kubelet

flanneld

controllermanager

scheduler

etcd

kube-apiserver

kube-proxy

OpenStack Controller amp Compute

kubernetes worker-nodes(label controller)

kubernetes worker-nodes (label compute)

MariaDB

RabbitMQ

Keystone

Glance

Cinder

NOVA

Neutron

NOVA

Neutron

VM

VM

OpenStack Controller amp Compute

kubernetes worker-nodes(label controller)

kubernetes worker-nodes (label compute)

OpenStack Process 1 (nova-api)

NOVA

Neutron

VM

VMOpenStack Process 2 (nova-api)

OpenStack Process 3 (nova-api)

OpenStack Controller amp Compute

kubernetes worker-nodes(label controller)

kubernetes worker-nodes (label compute)

NOVA

Neutron

VM

VMOpenStack Process 1 (nova-api)

OpenStack Process 2 (nova-api)

OpenStack Process 3 (nova-api)

OpenStack Process 3 (nova-api)

Database clustering (3 node)jobsmariadb-seedpomariadb-0pomariadb-1pomariadb-2

joiner (3th mariadb)--wsrep_cluster_address=gcomm172165671721675517216815

Neutron network (1 nic)Network

br-ex

br-data

eth0addif

veth1

addif

veth0

link add (veth)

add-port

br-int

1921683033

patch

phy-br-ex

int-br-exqr-xxxqg-xxx

br-tun

patch-int

patch-tun

patch

ovs internal iface

ovs patchqrouter-xxx

vxlan-xxx

ovs vxlan

linux virtual iface

tapxxx

qdhcp-xxx

namespacelocal ip remote ip vni flow

qbrxxxtabxxx

VM

qvbxxx

br-intpatch-intqvoxxx

br-tunvxlan-xxx patch-tun

eth0

br-exphy-br-ex

int-br-ex

br-dataveth1

veth0

1921683034

Network Node Compute Node

OpenStack-Helm Neutron charthellipnetwork interface external veth0 default br-data ml2 agent tunnel_types vxlan type_drivers - flat - vxlan ovs auto_bridge_add null bridge_mappings null neutron default l3_ha_network_type vxlan debug Truersquohellip

Simplified Overlay Network Architecture

Neutron

Sets SwitchingRouting flow rules

Compute-01 Compute-02

nova-compute

br-int (OVS)

Gateway Node Group

nova-compute br-router (OVS)

QuaggaHypervisor Hypervisor

Controls external connectivity

BGP OSPF Multipath

br-int (OVS) br-int (OVS)

Nova OpenStackNetworkingProxies ARP DHCP

vRouter

Sets NAT flow rulesProvisions virtual machine

East-West Traffic North-South Traffic

OpenStack SONA(ONOS)

VXLAN tunnel

Live Demo 1

CI Presentation - Wil

VS

Moving parts - some numbersbull Git repositories 34 local 10 upstream

bull Deployment configurations 4

bull Supported Environments 4

bull Charts per deployment 12

bull Unique docker images in each deployment 23

bull Pods amp jobs in a single deployment 34

bull Pods amp jobs in a 3 node HA deployment 85

Kinds of Workflows

bull Build Kolla Container

bull Build Helm Chart

bull Test

bull Deploy upgrade

Kinds of Jobs

bull Commit PR builds - validate internal changes

bull Live system deploys - validate component upgrades

bull Nightly builds - validate merge CI upstream

bull Nightly redeploy - validate clean slate deploys

bull Upstream builds - identify incoming breaking changes

When yoursquove got twice as many virtualization technologies things fall over twice as fast

Managing the Clutter

bull No upstream forks

bull Isolated CI build amp test environments

bull Incremental upgrades

bull Constant rebuilds

Testing starts from the ground upDeployed Kubernetes

bull Kubernetes e2e

bull Custom

bull Heapster

Testing starts from the ground upKolla Container Builds

bull Bats

bull Clair

Testing starts from the ground upHelm charts

bull Helm test

bull Partial tempest runs

bull Built in (eg horizon)

Testing starts from the ground upDeployed Openstack

bull Full tempest runs

bull Rally

Deployment is only the Beginning

bull Manual validations

bull Launch custom application on OpenStack

bull Additional security tests on running containers

bull Performance baselines and thresholds

Failure testing

bull Risk - Many pods are run mdashprivileged amp with host networking

bull Layered solutions present a lot more interesting failure cases

bull HA rules change

bull Openstack HA behaviors mapped onto Kubernetes HA behavior

Upgrades Become Boring

bull All changes can be simulated multiple times prior to live rollout

bull Minor component patches become trivial

bull OpenStack upgrades become predictable

Live Demo 2

Challenges (pitfalls)

bull Operational Burden

bull Limitations

bull OpenStack as Cloud Native Apps Feasible Needed Gaps and Improvement

bull Kubernetes Stability (fast moving projects)

bull Security

THE END

Page 18: OpenStack on Kubernetes · OpenStack on Kubernetes: Make OpenStack and Kubernetes Fail-Safe Seungkyu Ahn (ahnsk@sk.com) Jaesuk Ahn (jay.ahn@sk.com) Open System Lab Network IT Convergence

kube-apiserver yaml - usrlocalbinkube-apiserver --etcd-servers=http1270012379 --storage-backend=etcd3 --insecure-bind-address=127001 --insecure-port=8080 --secure-port=6443 --admission-control=NamespaceLifecycleLimitRangerServiceAccount PersistentVolumeLabelDefaultStorageClassResourceQuota --service-cluster-ip-range=10960016 --tls-cert-file=etckubernetespkiapiservercrt --tls-private-key-file=etckubernetespkiapiserver-keypem --token-auth-file=etckubernetespkikube-token --service-account-key-file=etckubernetespkiapiserver-keypem --allow-privileged --anonymous-auth=false

Kubelet

flanneld

etcd

kube-apiserver

kube-controller-manager yaml - kube-controller-manager --master=1270018080 --cluster-cidr=172160016 --cluster-name=kubernetes --allocate-node-cidrs=true --service-account-private-key-file=etckubernetespkiapiserver-keypem --root-ca-file=etckubernetespkicacrt --cluster-signing-cert-file=etckubernetespkicacrt --cluster-signing-key-file=etckubernetespkica-keypem --v=0 --leader-elect=true Kubelet

flanneld

controllermanager

etcd

kube-apiserver

kube-scheduler yaml - usrlocalbinkube-scheduler --master=1270018080 --v=0 --leader-elect=true

Kubelet

flanneld

controllermanager

scheduler

etcd

kube-apiserver

kube-proxy yaml securityContext privileged true command - binsh - -c - usrlocalbinkube-proxy --kubeconfig=runkubeconfig --cluster-cidr=10960016 --v=0

Kubelet

flanneld

controllermanager

scheduler

etcd

kube-apiserver

kube-proxy

OpenStack Controller amp Compute

kubernetes worker-nodes(label controller)

kubernetes worker-nodes (label compute)

MariaDB

RabbitMQ

Keystone

Glance

Cinder

NOVA

Neutron

NOVA

Neutron

VM

VM

OpenStack Controller amp Compute

kubernetes worker-nodes(label controller)

kubernetes worker-nodes (label compute)

OpenStack Process 1 (nova-api)

NOVA

Neutron

VM

VMOpenStack Process 2 (nova-api)

OpenStack Process 3 (nova-api)

OpenStack Controller amp Compute

kubernetes worker-nodes(label controller)

kubernetes worker-nodes (label compute)

NOVA

Neutron

VM

VMOpenStack Process 1 (nova-api)

OpenStack Process 2 (nova-api)

OpenStack Process 3 (nova-api)

OpenStack Process 3 (nova-api)

Database clustering (3 node)jobsmariadb-seedpomariadb-0pomariadb-1pomariadb-2

joiner (3th mariadb)--wsrep_cluster_address=gcomm172165671721675517216815

Neutron network (1 nic)Network

br-ex

br-data

eth0addif

veth1

addif

veth0

link add (veth)

add-port

br-int

1921683033

patch

phy-br-ex

int-br-exqr-xxxqg-xxx

br-tun

patch-int

patch-tun

patch

ovs internal iface

ovs patchqrouter-xxx

vxlan-xxx

ovs vxlan

linux virtual iface

tapxxx

qdhcp-xxx

namespacelocal ip remote ip vni flow

qbrxxxtabxxx

VM

qvbxxx

br-intpatch-intqvoxxx

br-tunvxlan-xxx patch-tun

eth0

br-exphy-br-ex

int-br-ex

br-dataveth1

veth0

1921683034

Network Node Compute Node

OpenStack-Helm Neutron charthellipnetwork interface external veth0 default br-data ml2 agent tunnel_types vxlan type_drivers - flat - vxlan ovs auto_bridge_add null bridge_mappings null neutron default l3_ha_network_type vxlan debug Truersquohellip

Simplified Overlay Network Architecture

Neutron

Sets SwitchingRouting flow rules

Compute-01 Compute-02

nova-compute

br-int (OVS)

Gateway Node Group

nova-compute br-router (OVS)

QuaggaHypervisor Hypervisor

Controls external connectivity

BGP OSPF Multipath

br-int (OVS) br-int (OVS)

Nova OpenStackNetworkingProxies ARP DHCP

vRouter

Sets NAT flow rulesProvisions virtual machine

East-West Traffic North-South Traffic

OpenStack SONA(ONOS)

VXLAN tunnel

Live Demo 1

CI Presentation - Wil

VS

Moving parts - some numbersbull Git repositories 34 local 10 upstream

bull Deployment configurations 4

bull Supported Environments 4

bull Charts per deployment 12

bull Unique docker images in each deployment 23

bull Pods amp jobs in a single deployment 34

bull Pods amp jobs in a 3 node HA deployment 85

Kinds of Workflows

bull Build Kolla Container

bull Build Helm Chart

bull Test

bull Deploy upgrade

Kinds of Jobs

bull Commit PR builds - validate internal changes

bull Live system deploys - validate component upgrades

bull Nightly builds - validate merge CI upstream

bull Nightly redeploy - validate clean slate deploys

bull Upstream builds - identify incoming breaking changes

When yoursquove got twice as many virtualization technologies things fall over twice as fast

Managing the Clutter

bull No upstream forks

bull Isolated CI build amp test environments

bull Incremental upgrades

bull Constant rebuilds

Testing starts from the ground upDeployed Kubernetes

bull Kubernetes e2e

bull Custom

bull Heapster

Testing starts from the ground upKolla Container Builds

bull Bats

bull Clair

Testing starts from the ground upHelm charts

bull Helm test

bull Partial tempest runs

bull Built in (eg horizon)

Testing starts from the ground upDeployed Openstack

bull Full tempest runs

bull Rally

Deployment is only the Beginning

bull Manual validations

bull Launch custom application on OpenStack

bull Additional security tests on running containers

bull Performance baselines and thresholds

Failure testing

bull Risk - Many pods are run mdashprivileged amp with host networking

bull Layered solutions present a lot more interesting failure cases

bull HA rules change

bull Openstack HA behaviors mapped onto Kubernetes HA behavior

Upgrades Become Boring

bull All changes can be simulated multiple times prior to live rollout

bull Minor component patches become trivial

bull OpenStack upgrades become predictable

Live Demo 2

Challenges (pitfalls)

bull Operational Burden

bull Limitations

bull OpenStack as Cloud Native Apps Feasible Needed Gaps and Improvement

bull Kubernetes Stability (fast moving projects)

bull Security

THE END

Page 19: OpenStack on Kubernetes · OpenStack on Kubernetes: Make OpenStack and Kubernetes Fail-Safe Seungkyu Ahn (ahnsk@sk.com) Jaesuk Ahn (jay.ahn@sk.com) Open System Lab Network IT Convergence

kube-controller-manager yaml - kube-controller-manager --master=1270018080 --cluster-cidr=172160016 --cluster-name=kubernetes --allocate-node-cidrs=true --service-account-private-key-file=etckubernetespkiapiserver-keypem --root-ca-file=etckubernetespkicacrt --cluster-signing-cert-file=etckubernetespkicacrt --cluster-signing-key-file=etckubernetespkica-keypem --v=0 --leader-elect=true Kubelet

flanneld

controllermanager

etcd

kube-apiserver

kube-scheduler yaml - usrlocalbinkube-scheduler --master=1270018080 --v=0 --leader-elect=true

Kubelet

flanneld

controllermanager

scheduler

etcd

kube-apiserver

kube-proxy yaml securityContext privileged true command - binsh - -c - usrlocalbinkube-proxy --kubeconfig=runkubeconfig --cluster-cidr=10960016 --v=0

Kubelet

flanneld

controllermanager

scheduler

etcd

kube-apiserver

kube-proxy

OpenStack Controller amp Compute

kubernetes worker-nodes(label controller)

kubernetes worker-nodes (label compute)

MariaDB

RabbitMQ

Keystone

Glance

Cinder

NOVA

Neutron

NOVA

Neutron

VM

VM

OpenStack Controller amp Compute

kubernetes worker-nodes(label controller)

kubernetes worker-nodes (label compute)

OpenStack Process 1 (nova-api)

NOVA

Neutron

VM

VMOpenStack Process 2 (nova-api)

OpenStack Process 3 (nova-api)

OpenStack Controller amp Compute

kubernetes worker-nodes(label controller)

kubernetes worker-nodes (label compute)

NOVA

Neutron

VM

VMOpenStack Process 1 (nova-api)

OpenStack Process 2 (nova-api)

OpenStack Process 3 (nova-api)

OpenStack Process 3 (nova-api)

Database clustering (3 node)jobsmariadb-seedpomariadb-0pomariadb-1pomariadb-2

joiner (3th mariadb)--wsrep_cluster_address=gcomm172165671721675517216815

Neutron network (1 nic)Network

br-ex

br-data

eth0addif

veth1

addif

veth0

link add (veth)

add-port

br-int

1921683033

patch

phy-br-ex

int-br-exqr-xxxqg-xxx

br-tun

patch-int

patch-tun

patch

ovs internal iface

ovs patchqrouter-xxx

vxlan-xxx

ovs vxlan

linux virtual iface

tapxxx

qdhcp-xxx

namespacelocal ip remote ip vni flow

qbrxxxtabxxx

VM

qvbxxx

br-intpatch-intqvoxxx

br-tunvxlan-xxx patch-tun

eth0

br-exphy-br-ex

int-br-ex

br-dataveth1

veth0

1921683034

Network Node Compute Node

OpenStack-Helm Neutron charthellipnetwork interface external veth0 default br-data ml2 agent tunnel_types vxlan type_drivers - flat - vxlan ovs auto_bridge_add null bridge_mappings null neutron default l3_ha_network_type vxlan debug Truersquohellip

Simplified Overlay Network Architecture

Neutron

Sets SwitchingRouting flow rules

Compute-01 Compute-02

nova-compute

br-int (OVS)

Gateway Node Group

nova-compute br-router (OVS)

QuaggaHypervisor Hypervisor

Controls external connectivity

BGP OSPF Multipath

br-int (OVS) br-int (OVS)

Nova OpenStackNetworkingProxies ARP DHCP

vRouter

Sets NAT flow rulesProvisions virtual machine

East-West Traffic North-South Traffic

OpenStack SONA(ONOS)

VXLAN tunnel

Live Demo 1

CI Presentation - Wil

VS

Moving parts - some numbersbull Git repositories 34 local 10 upstream

bull Deployment configurations 4

bull Supported Environments 4

bull Charts per deployment 12

bull Unique docker images in each deployment 23

bull Pods amp jobs in a single deployment 34

bull Pods amp jobs in a 3 node HA deployment 85

Kinds of Workflows

bull Build Kolla Container

bull Build Helm Chart

bull Test

bull Deploy upgrade

Kinds of Jobs

bull Commit PR builds - validate internal changes

bull Live system deploys - validate component upgrades

bull Nightly builds - validate merge CI upstream

bull Nightly redeploy - validate clean slate deploys

bull Upstream builds - identify incoming breaking changes

When yoursquove got twice as many virtualization technologies things fall over twice as fast

Managing the Clutter

bull No upstream forks

bull Isolated CI build amp test environments

bull Incremental upgrades

bull Constant rebuilds

Testing starts from the ground upDeployed Kubernetes

bull Kubernetes e2e

bull Custom

bull Heapster

Testing starts from the ground upKolla Container Builds

bull Bats

bull Clair

Testing starts from the ground upHelm charts

bull Helm test

bull Partial tempest runs

bull Built in (eg horizon)

Testing starts from the ground upDeployed Openstack

bull Full tempest runs

bull Rally

Deployment is only the Beginning

bull Manual validations

bull Launch custom application on OpenStack

bull Additional security tests on running containers

bull Performance baselines and thresholds

Failure testing

bull Risk - Many pods are run mdashprivileged amp with host networking

bull Layered solutions present a lot more interesting failure cases

bull HA rules change

bull Openstack HA behaviors mapped onto Kubernetes HA behavior

Upgrades Become Boring

bull All changes can be simulated multiple times prior to live rollout

bull Minor component patches become trivial

bull OpenStack upgrades become predictable

Live Demo 2

Challenges (pitfalls)

bull Operational Burden

bull Limitations

bull OpenStack as Cloud Native Apps Feasible Needed Gaps and Improvement

bull Kubernetes Stability (fast moving projects)

bull Security

THE END

Page 20: OpenStack on Kubernetes · OpenStack on Kubernetes: Make OpenStack and Kubernetes Fail-Safe Seungkyu Ahn (ahnsk@sk.com) Jaesuk Ahn (jay.ahn@sk.com) Open System Lab Network IT Convergence

kube-scheduler yaml - usrlocalbinkube-scheduler --master=1270018080 --v=0 --leader-elect=true

Kubelet

flanneld

controllermanager

scheduler

etcd

kube-apiserver

kube-proxy yaml securityContext privileged true command - binsh - -c - usrlocalbinkube-proxy --kubeconfig=runkubeconfig --cluster-cidr=10960016 --v=0

Kubelet

flanneld

controllermanager

scheduler

etcd

kube-apiserver

kube-proxy

OpenStack Controller amp Compute

kubernetes worker-nodes(label controller)

kubernetes worker-nodes (label compute)

MariaDB

RabbitMQ

Keystone

Glance

Cinder

NOVA

Neutron

NOVA

Neutron

VM

VM

OpenStack Controller amp Compute

kubernetes worker-nodes(label controller)

kubernetes worker-nodes (label compute)

OpenStack Process 1 (nova-api)

NOVA

Neutron

VM

VMOpenStack Process 2 (nova-api)

OpenStack Process 3 (nova-api)

OpenStack Controller amp Compute

kubernetes worker-nodes(label controller)

kubernetes worker-nodes (label compute)

NOVA

Neutron

VM

VMOpenStack Process 1 (nova-api)

OpenStack Process 2 (nova-api)

OpenStack Process 3 (nova-api)

OpenStack Process 3 (nova-api)

Database clustering (3 node)jobsmariadb-seedpomariadb-0pomariadb-1pomariadb-2

joiner (3th mariadb)--wsrep_cluster_address=gcomm172165671721675517216815

Neutron network (1 nic)Network

br-ex

br-data

eth0addif

veth1

addif

veth0

link add (veth)

add-port

br-int

1921683033

patch

phy-br-ex

int-br-exqr-xxxqg-xxx

br-tun

patch-int

patch-tun

patch

ovs internal iface

ovs patchqrouter-xxx

vxlan-xxx

ovs vxlan

linux virtual iface

tapxxx

qdhcp-xxx

namespacelocal ip remote ip vni flow

qbrxxxtabxxx

VM

qvbxxx

br-intpatch-intqvoxxx

br-tunvxlan-xxx patch-tun

eth0

br-exphy-br-ex

int-br-ex

br-dataveth1

veth0

1921683034

Network Node Compute Node

OpenStack-Helm Neutron charthellipnetwork interface external veth0 default br-data ml2 agent tunnel_types vxlan type_drivers - flat - vxlan ovs auto_bridge_add null bridge_mappings null neutron default l3_ha_network_type vxlan debug Truersquohellip

Simplified Overlay Network Architecture

Neutron

Sets SwitchingRouting flow rules

Compute-01 Compute-02

nova-compute

br-int (OVS)

Gateway Node Group

nova-compute br-router (OVS)

QuaggaHypervisor Hypervisor

Controls external connectivity

BGP OSPF Multipath

br-int (OVS) br-int (OVS)

Nova OpenStackNetworkingProxies ARP DHCP

vRouter

Sets NAT flow rulesProvisions virtual machine

East-West Traffic North-South Traffic

OpenStack SONA(ONOS)

VXLAN tunnel

Live Demo 1

CI Presentation - Wil

VS

Moving parts - some numbersbull Git repositories 34 local 10 upstream

bull Deployment configurations 4

bull Supported Environments 4

bull Charts per deployment 12

bull Unique docker images in each deployment 23

bull Pods amp jobs in a single deployment 34

bull Pods amp jobs in a 3 node HA deployment 85

Kinds of Workflows

bull Build Kolla Container

bull Build Helm Chart

bull Test

bull Deploy upgrade

Kinds of Jobs

bull Commit PR builds - validate internal changes

bull Live system deploys - validate component upgrades

bull Nightly builds - validate merge CI upstream

bull Nightly redeploy - validate clean slate deploys

bull Upstream builds - identify incoming breaking changes

When yoursquove got twice as many virtualization technologies things fall over twice as fast

Managing the Clutter

bull No upstream forks

bull Isolated CI build amp test environments

bull Incremental upgrades

bull Constant rebuilds

Testing starts from the ground upDeployed Kubernetes

bull Kubernetes e2e

bull Custom

bull Heapster

Testing starts from the ground upKolla Container Builds

bull Bats

bull Clair

Testing starts from the ground upHelm charts

bull Helm test

bull Partial tempest runs

bull Built in (eg horizon)

Testing starts from the ground upDeployed Openstack

bull Full tempest runs

bull Rally

Deployment is only the Beginning

bull Manual validations

bull Launch custom application on OpenStack

bull Additional security tests on running containers

bull Performance baselines and thresholds

Failure testing

bull Risk - Many pods are run mdashprivileged amp with host networking

bull Layered solutions present a lot more interesting failure cases

bull HA rules change

bull Openstack HA behaviors mapped onto Kubernetes HA behavior

Upgrades Become Boring

bull All changes can be simulated multiple times prior to live rollout

bull Minor component patches become trivial

bull OpenStack upgrades become predictable

Live Demo 2

Challenges (pitfalls)

bull Operational Burden

bull Limitations

bull OpenStack as Cloud Native Apps Feasible Needed Gaps and Improvement

bull Kubernetes Stability (fast moving projects)

bull Security

THE END

Page 21: OpenStack on Kubernetes · OpenStack on Kubernetes: Make OpenStack and Kubernetes Fail-Safe Seungkyu Ahn (ahnsk@sk.com) Jaesuk Ahn (jay.ahn@sk.com) Open System Lab Network IT Convergence

kube-proxy yaml securityContext privileged true command - binsh - -c - usrlocalbinkube-proxy --kubeconfig=runkubeconfig --cluster-cidr=10960016 --v=0

Kubelet

flanneld

controllermanager

scheduler

etcd

kube-apiserver

kube-proxy

OpenStack Controller amp Compute

kubernetes worker-nodes(label controller)

kubernetes worker-nodes (label compute)

MariaDB

RabbitMQ

Keystone

Glance

Cinder

NOVA

Neutron

NOVA

Neutron

VM

VM

OpenStack Controller amp Compute

kubernetes worker-nodes(label controller)

kubernetes worker-nodes (label compute)

OpenStack Process 1 (nova-api)

NOVA

Neutron

VM

VMOpenStack Process 2 (nova-api)

OpenStack Process 3 (nova-api)

OpenStack Controller amp Compute

kubernetes worker-nodes(label controller)

kubernetes worker-nodes (label compute)

NOVA

Neutron

VM

VMOpenStack Process 1 (nova-api)

OpenStack Process 2 (nova-api)

OpenStack Process 3 (nova-api)

OpenStack Process 3 (nova-api)

Database clustering (3 node)jobsmariadb-seedpomariadb-0pomariadb-1pomariadb-2

joiner (3th mariadb)--wsrep_cluster_address=gcomm172165671721675517216815

Neutron network (1 nic)Network

br-ex

br-data

eth0addif

veth1

addif

veth0

link add (veth)

add-port

br-int

1921683033

patch

phy-br-ex

int-br-exqr-xxxqg-xxx

br-tun

patch-int

patch-tun

patch

ovs internal iface

ovs patchqrouter-xxx

vxlan-xxx

ovs vxlan

linux virtual iface

tapxxx

qdhcp-xxx

namespacelocal ip remote ip vni flow

qbrxxxtabxxx

VM

qvbxxx

br-intpatch-intqvoxxx

br-tunvxlan-xxx patch-tun

eth0

br-exphy-br-ex

int-br-ex

br-dataveth1

veth0

1921683034

Network Node Compute Node

OpenStack-Helm Neutron charthellipnetwork interface external veth0 default br-data ml2 agent tunnel_types vxlan type_drivers - flat - vxlan ovs auto_bridge_add null bridge_mappings null neutron default l3_ha_network_type vxlan debug Truersquohellip

Simplified Overlay Network Architecture

Neutron

Sets SwitchingRouting flow rules

Compute-01 Compute-02

nova-compute

br-int (OVS)

Gateway Node Group

nova-compute br-router (OVS)

QuaggaHypervisor Hypervisor

Controls external connectivity

BGP OSPF Multipath

br-int (OVS) br-int (OVS)

Nova OpenStackNetworkingProxies ARP DHCP

vRouter

Sets NAT flow rulesProvisions virtual machine

East-West Traffic North-South Traffic

OpenStack SONA(ONOS)

VXLAN tunnel

Live Demo 1

CI Presentation - Wil

VS

Moving parts - some numbersbull Git repositories 34 local 10 upstream

bull Deployment configurations 4

bull Supported Environments 4

bull Charts per deployment 12

bull Unique docker images in each deployment 23

bull Pods amp jobs in a single deployment 34

bull Pods amp jobs in a 3 node HA deployment 85

Kinds of Workflows

bull Build Kolla Container

bull Build Helm Chart

bull Test

bull Deploy upgrade

Kinds of Jobs

bull Commit PR builds - validate internal changes

bull Live system deploys - validate component upgrades

bull Nightly builds - validate merge CI upstream

bull Nightly redeploy - validate clean slate deploys

bull Upstream builds - identify incoming breaking changes

When yoursquove got twice as many virtualization technologies things fall over twice as fast

Managing the Clutter

bull No upstream forks

bull Isolated CI build amp test environments

bull Incremental upgrades

bull Constant rebuilds

Testing starts from the ground upDeployed Kubernetes

bull Kubernetes e2e

bull Custom

bull Heapster

Testing starts from the ground upKolla Container Builds

bull Bats

bull Clair

Testing starts from the ground upHelm charts

bull Helm test

bull Partial tempest runs

bull Built in (eg horizon)

Testing starts from the ground upDeployed Openstack

bull Full tempest runs

bull Rally

Deployment is only the Beginning

bull Manual validations

bull Launch custom application on OpenStack

bull Additional security tests on running containers

bull Performance baselines and thresholds

Failure testing

bull Risk - Many pods are run mdashprivileged amp with host networking

bull Layered solutions present a lot more interesting failure cases

bull HA rules change

bull Openstack HA behaviors mapped onto Kubernetes HA behavior

Upgrades Become Boring

bull All changes can be simulated multiple times prior to live rollout

bull Minor component patches become trivial

bull OpenStack upgrades become predictable

Live Demo 2

Challenges (pitfalls)

bull Operational Burden

bull Limitations

bull OpenStack as Cloud Native Apps Feasible Needed Gaps and Improvement

bull Kubernetes Stability (fast moving projects)

bull Security

THE END

Page 22: OpenStack on Kubernetes · OpenStack on Kubernetes: Make OpenStack and Kubernetes Fail-Safe Seungkyu Ahn (ahnsk@sk.com) Jaesuk Ahn (jay.ahn@sk.com) Open System Lab Network IT Convergence

OpenStack Controller amp Compute

kubernetes worker-nodes(label controller)

kubernetes worker-nodes (label compute)

MariaDB

RabbitMQ

Keystone

Glance

Cinder

NOVA

Neutron

NOVA

Neutron

VM

VM

OpenStack Controller amp Compute

kubernetes worker-nodes(label controller)

kubernetes worker-nodes (label compute)

OpenStack Process 1 (nova-api)

NOVA

Neutron

VM

VMOpenStack Process 2 (nova-api)

OpenStack Process 3 (nova-api)

OpenStack Controller amp Compute

kubernetes worker-nodes(label controller)

kubernetes worker-nodes (label compute)

NOVA

Neutron

VM

VMOpenStack Process 1 (nova-api)

OpenStack Process 2 (nova-api)

OpenStack Process 3 (nova-api)

OpenStack Process 3 (nova-api)

Database clustering (3 node)jobsmariadb-seedpomariadb-0pomariadb-1pomariadb-2

joiner (3th mariadb)--wsrep_cluster_address=gcomm172165671721675517216815

Neutron network (1 nic)Network

br-ex

br-data

eth0addif

veth1

addif

veth0

link add (veth)

add-port

br-int

1921683033

patch

phy-br-ex

int-br-exqr-xxxqg-xxx

br-tun

patch-int

patch-tun

patch

ovs internal iface

ovs patchqrouter-xxx

vxlan-xxx

ovs vxlan

linux virtual iface

tapxxx

qdhcp-xxx

namespacelocal ip remote ip vni flow

qbrxxxtabxxx

VM

qvbxxx

br-intpatch-intqvoxxx

br-tunvxlan-xxx patch-tun

eth0

br-exphy-br-ex

int-br-ex

br-dataveth1

veth0

1921683034

Network Node Compute Node

OpenStack-Helm Neutron charthellipnetwork interface external veth0 default br-data ml2 agent tunnel_types vxlan type_drivers - flat - vxlan ovs auto_bridge_add null bridge_mappings null neutron default l3_ha_network_type vxlan debug Truersquohellip

Simplified Overlay Network Architecture

Neutron

Sets SwitchingRouting flow rules

Compute-01 Compute-02

nova-compute

br-int (OVS)

Gateway Node Group

nova-compute br-router (OVS)

QuaggaHypervisor Hypervisor

Controls external connectivity

BGP OSPF Multipath

br-int (OVS) br-int (OVS)

Nova OpenStackNetworkingProxies ARP DHCP

vRouter

Sets NAT flow rulesProvisions virtual machine

East-West Traffic North-South Traffic

OpenStack SONA(ONOS)

VXLAN tunnel

Live Demo 1

CI Presentation - Wil

VS

Moving parts - some numbersbull Git repositories 34 local 10 upstream

bull Deployment configurations 4

bull Supported Environments 4

bull Charts per deployment 12

bull Unique docker images in each deployment 23

bull Pods amp jobs in a single deployment 34

bull Pods amp jobs in a 3 node HA deployment 85

Kinds of Workflows

bull Build Kolla Container

bull Build Helm Chart

bull Test

bull Deploy upgrade

Kinds of Jobs

bull Commit PR builds - validate internal changes

bull Live system deploys - validate component upgrades

bull Nightly builds - validate merge CI upstream

bull Nightly redeploy - validate clean slate deploys

bull Upstream builds - identify incoming breaking changes

When yoursquove got twice as many virtualization technologies things fall over twice as fast

Managing the Clutter

bull No upstream forks

bull Isolated CI build amp test environments

bull Incremental upgrades

bull Constant rebuilds

Testing starts from the ground upDeployed Kubernetes

bull Kubernetes e2e

bull Custom

bull Heapster

Testing starts from the ground upKolla Container Builds

bull Bats

bull Clair

Testing starts from the ground upHelm charts

bull Helm test

bull Partial tempest runs

bull Built in (eg horizon)

Testing starts from the ground upDeployed Openstack

bull Full tempest runs

bull Rally

Deployment is only the Beginning

bull Manual validations

bull Launch custom application on OpenStack

bull Additional security tests on running containers

bull Performance baselines and thresholds

Failure testing

bull Risk - Many pods are run mdashprivileged amp with host networking

bull Layered solutions present a lot more interesting failure cases

bull HA rules change

bull Openstack HA behaviors mapped onto Kubernetes HA behavior

Upgrades Become Boring

bull All changes can be simulated multiple times prior to live rollout

bull Minor component patches become trivial

bull OpenStack upgrades become predictable

Live Demo 2

Challenges (pitfalls)

bull Operational Burden

bull Limitations

bull OpenStack as Cloud Native Apps Feasible Needed Gaps and Improvement

bull Kubernetes Stability (fast moving projects)

bull Security

THE END

Page 23: OpenStack on Kubernetes · OpenStack on Kubernetes: Make OpenStack and Kubernetes Fail-Safe Seungkyu Ahn (ahnsk@sk.com) Jaesuk Ahn (jay.ahn@sk.com) Open System Lab Network IT Convergence

OpenStack Controller amp Compute

kubernetes worker-nodes(label controller)

kubernetes worker-nodes (label compute)

OpenStack Process 1 (nova-api)

NOVA

Neutron

VM

VMOpenStack Process 2 (nova-api)

OpenStack Process 3 (nova-api)

OpenStack Controller amp Compute

kubernetes worker-nodes(label controller)

kubernetes worker-nodes (label compute)

NOVA

Neutron

VM

VMOpenStack Process 1 (nova-api)

OpenStack Process 2 (nova-api)

OpenStack Process 3 (nova-api)

OpenStack Process 3 (nova-api)

Database clustering (3 node)jobsmariadb-seedpomariadb-0pomariadb-1pomariadb-2

joiner (3th mariadb)--wsrep_cluster_address=gcomm172165671721675517216815

Neutron network (1 nic)Network

br-ex

br-data

eth0addif

veth1

addif

veth0

link add (veth)

add-port

br-int

1921683033

patch

phy-br-ex

int-br-exqr-xxxqg-xxx

br-tun

patch-int

patch-tun

patch

ovs internal iface

ovs patchqrouter-xxx

vxlan-xxx

ovs vxlan

linux virtual iface

tapxxx

qdhcp-xxx

namespacelocal ip remote ip vni flow

qbrxxxtabxxx

VM

qvbxxx

br-intpatch-intqvoxxx

br-tunvxlan-xxx patch-tun

eth0

br-exphy-br-ex

int-br-ex

br-dataveth1

veth0

1921683034

Network Node Compute Node

OpenStack-Helm Neutron charthellipnetwork interface external veth0 default br-data ml2 agent tunnel_types vxlan type_drivers - flat - vxlan ovs auto_bridge_add null bridge_mappings null neutron default l3_ha_network_type vxlan debug Truersquohellip

Simplified Overlay Network Architecture

Neutron

Sets SwitchingRouting flow rules

Compute-01 Compute-02

nova-compute

br-int (OVS)

Gateway Node Group

nova-compute br-router (OVS)

QuaggaHypervisor Hypervisor

Controls external connectivity

BGP OSPF Multipath

br-int (OVS) br-int (OVS)

Nova OpenStackNetworkingProxies ARP DHCP

vRouter

Sets NAT flow rulesProvisions virtual machine

East-West Traffic North-South Traffic

OpenStack SONA(ONOS)

VXLAN tunnel

Live Demo 1

CI Presentation - Wil

VS

Moving parts - some numbersbull Git repositories 34 local 10 upstream

bull Deployment configurations 4

bull Supported Environments 4

bull Charts per deployment 12

bull Unique docker images in each deployment 23

bull Pods amp jobs in a single deployment 34

bull Pods amp jobs in a 3 node HA deployment 85

Kinds of Workflows

bull Build Kolla Container

bull Build Helm Chart

bull Test

bull Deploy upgrade

Kinds of Jobs

bull Commit PR builds - validate internal changes

bull Live system deploys - validate component upgrades

bull Nightly builds - validate merge CI upstream

bull Nightly redeploy - validate clean slate deploys

bull Upstream builds - identify incoming breaking changes

When yoursquove got twice as many virtualization technologies things fall over twice as fast

Managing the Clutter

bull No upstream forks

bull Isolated CI build amp test environments

bull Incremental upgrades

bull Constant rebuilds

Testing starts from the ground upDeployed Kubernetes

bull Kubernetes e2e

bull Custom

bull Heapster

Testing starts from the ground upKolla Container Builds

bull Bats

bull Clair

Testing starts from the ground upHelm charts

bull Helm test

bull Partial tempest runs

bull Built in (eg horizon)

Testing starts from the ground upDeployed Openstack

bull Full tempest runs

bull Rally

Deployment is only the Beginning

bull Manual validations

bull Launch custom application on OpenStack

bull Additional security tests on running containers

bull Performance baselines and thresholds

Failure testing

bull Risk - Many pods are run mdashprivileged amp with host networking

bull Layered solutions present a lot more interesting failure cases

bull HA rules change

bull Openstack HA behaviors mapped onto Kubernetes HA behavior

Upgrades Become Boring

bull All changes can be simulated multiple times prior to live rollout

bull Minor component patches become trivial

bull OpenStack upgrades become predictable

Live Demo 2

Challenges (pitfalls)

bull Operational Burden

bull Limitations

bull OpenStack as Cloud Native Apps Feasible Needed Gaps and Improvement

bull Kubernetes Stability (fast moving projects)

bull Security

THE END

Page 24: OpenStack on Kubernetes · OpenStack on Kubernetes: Make OpenStack and Kubernetes Fail-Safe Seungkyu Ahn (ahnsk@sk.com) Jaesuk Ahn (jay.ahn@sk.com) Open System Lab Network IT Convergence

OpenStack Controller amp Compute

kubernetes worker-nodes(label controller)

kubernetes worker-nodes (label compute)

NOVA

Neutron

VM

VMOpenStack Process 1 (nova-api)

OpenStack Process 2 (nova-api)

OpenStack Process 3 (nova-api)

OpenStack Process 3 (nova-api)

Database clustering (3 node)jobsmariadb-seedpomariadb-0pomariadb-1pomariadb-2

joiner (3th mariadb)--wsrep_cluster_address=gcomm172165671721675517216815

Neutron network (1 nic)Network

br-ex

br-data

eth0addif

veth1

addif

veth0

link add (veth)

add-port

br-int

1921683033

patch

phy-br-ex

int-br-exqr-xxxqg-xxx

br-tun

patch-int

patch-tun

patch

ovs internal iface

ovs patchqrouter-xxx

vxlan-xxx

ovs vxlan

linux virtual iface

tapxxx

qdhcp-xxx

namespacelocal ip remote ip vni flow

qbrxxxtabxxx

VM

qvbxxx

br-intpatch-intqvoxxx

br-tunvxlan-xxx patch-tun

eth0

br-exphy-br-ex

int-br-ex

br-dataveth1

veth0

1921683034

Network Node Compute Node

OpenStack-Helm Neutron charthellipnetwork interface external veth0 default br-data ml2 agent tunnel_types vxlan type_drivers - flat - vxlan ovs auto_bridge_add null bridge_mappings null neutron default l3_ha_network_type vxlan debug Truersquohellip

Simplified Overlay Network Architecture

Neutron

Sets SwitchingRouting flow rules

Compute-01 Compute-02

nova-compute

br-int (OVS)

Gateway Node Group

nova-compute br-router (OVS)

QuaggaHypervisor Hypervisor

Controls external connectivity

BGP OSPF Multipath

br-int (OVS) br-int (OVS)

Nova OpenStackNetworkingProxies ARP DHCP

vRouter

Sets NAT flow rulesProvisions virtual machine

East-West Traffic North-South Traffic

OpenStack SONA(ONOS)

VXLAN tunnel

Live Demo 1

CI Presentation - Wil

VS

Moving parts - some numbersbull Git repositories 34 local 10 upstream

bull Deployment configurations 4

bull Supported Environments 4

bull Charts per deployment 12

bull Unique docker images in each deployment 23

bull Pods amp jobs in a single deployment 34

bull Pods amp jobs in a 3 node HA deployment 85

Kinds of Workflows

bull Build Kolla Container

bull Build Helm Chart

bull Test

bull Deploy upgrade

Kinds of Jobs

bull Commit PR builds - validate internal changes

bull Live system deploys - validate component upgrades

bull Nightly builds - validate merge CI upstream

bull Nightly redeploy - validate clean slate deploys

bull Upstream builds - identify incoming breaking changes

When yoursquove got twice as many virtualization technologies things fall over twice as fast

Managing the Clutter

bull No upstream forks

bull Isolated CI build amp test environments

bull Incremental upgrades

bull Constant rebuilds

Testing starts from the ground upDeployed Kubernetes

bull Kubernetes e2e

bull Custom

bull Heapster

Testing starts from the ground upKolla Container Builds

bull Bats

bull Clair

Testing starts from the ground upHelm charts

bull Helm test

bull Partial tempest runs

bull Built in (eg horizon)

Testing starts from the ground upDeployed Openstack

bull Full tempest runs

bull Rally

Deployment is only the Beginning

bull Manual validations

bull Launch custom application on OpenStack

bull Additional security tests on running containers

bull Performance baselines and thresholds

Failure testing

bull Risk - Many pods are run mdashprivileged amp with host networking

bull Layered solutions present a lot more interesting failure cases

bull HA rules change

bull Openstack HA behaviors mapped onto Kubernetes HA behavior

Upgrades Become Boring

bull All changes can be simulated multiple times prior to live rollout

bull Minor component patches become trivial

bull OpenStack upgrades become predictable

Live Demo 2

Challenges (pitfalls)

bull Operational Burden

bull Limitations

bull OpenStack as Cloud Native Apps Feasible Needed Gaps and Improvement

bull Kubernetes Stability (fast moving projects)

bull Security

THE END

Page 25: OpenStack on Kubernetes · OpenStack on Kubernetes: Make OpenStack and Kubernetes Fail-Safe Seungkyu Ahn (ahnsk@sk.com) Jaesuk Ahn (jay.ahn@sk.com) Open System Lab Network IT Convergence

Database clustering (3 node)jobsmariadb-seedpomariadb-0pomariadb-1pomariadb-2

joiner (3th mariadb)--wsrep_cluster_address=gcomm172165671721675517216815

Neutron network (1 nic)Network

br-ex

br-data

eth0addif

veth1

addif

veth0

link add (veth)

add-port

br-int

1921683033

patch

phy-br-ex

int-br-exqr-xxxqg-xxx

br-tun

patch-int

patch-tun

patch

ovs internal iface

ovs patchqrouter-xxx

vxlan-xxx

ovs vxlan

linux virtual iface

tapxxx

qdhcp-xxx

namespacelocal ip remote ip vni flow

qbrxxxtabxxx

VM

qvbxxx

br-intpatch-intqvoxxx

br-tunvxlan-xxx patch-tun

eth0

br-exphy-br-ex

int-br-ex

br-dataveth1

veth0

1921683034

Network Node Compute Node

OpenStack-Helm Neutron charthellipnetwork interface external veth0 default br-data ml2 agent tunnel_types vxlan type_drivers - flat - vxlan ovs auto_bridge_add null bridge_mappings null neutron default l3_ha_network_type vxlan debug Truersquohellip

Simplified Overlay Network Architecture

Neutron

Sets SwitchingRouting flow rules

Compute-01 Compute-02

nova-compute

br-int (OVS)

Gateway Node Group

nova-compute br-router (OVS)

QuaggaHypervisor Hypervisor

Controls external connectivity

BGP OSPF Multipath

br-int (OVS) br-int (OVS)

Nova OpenStackNetworkingProxies ARP DHCP

vRouter

Sets NAT flow rulesProvisions virtual machine

East-West Traffic North-South Traffic

OpenStack SONA(ONOS)

VXLAN tunnel

Live Demo 1

CI Presentation - Wil

VS

Moving parts - some numbersbull Git repositories 34 local 10 upstream

bull Deployment configurations 4

bull Supported Environments 4

bull Charts per deployment 12

bull Unique docker images in each deployment 23

bull Pods amp jobs in a single deployment 34

bull Pods amp jobs in a 3 node HA deployment 85

Kinds of Workflows

bull Build Kolla Container

bull Build Helm Chart

bull Test

bull Deploy upgrade

Kinds of Jobs

bull Commit PR builds - validate internal changes

bull Live system deploys - validate component upgrades

bull Nightly builds - validate merge CI upstream

bull Nightly redeploy - validate clean slate deploys

bull Upstream builds - identify incoming breaking changes

When yoursquove got twice as many virtualization technologies things fall over twice as fast

Managing the Clutter

bull No upstream forks

bull Isolated CI build amp test environments

bull Incremental upgrades

bull Constant rebuilds

Testing starts from the ground upDeployed Kubernetes

bull Kubernetes e2e

bull Custom

bull Heapster

Testing starts from the ground upKolla Container Builds

bull Bats

bull Clair

Testing starts from the ground upHelm charts

bull Helm test

bull Partial tempest runs

bull Built in (eg horizon)

Testing starts from the ground upDeployed Openstack

bull Full tempest runs

bull Rally

Deployment is only the Beginning

bull Manual validations

bull Launch custom application on OpenStack

bull Additional security tests on running containers

bull Performance baselines and thresholds

Failure testing

bull Risk - Many pods are run mdashprivileged amp with host networking

bull Layered solutions present a lot more interesting failure cases

bull HA rules change

bull Openstack HA behaviors mapped onto Kubernetes HA behavior

Upgrades Become Boring

bull All changes can be simulated multiple times prior to live rollout

bull Minor component patches become trivial

bull OpenStack upgrades become predictable

Live Demo 2

Challenges (pitfalls)

bull Operational Burden

bull Limitations

bull OpenStack as Cloud Native Apps Feasible Needed Gaps and Improvement

bull Kubernetes Stability (fast moving projects)

bull Security

THE END

Page 26: OpenStack on Kubernetes · OpenStack on Kubernetes: Make OpenStack and Kubernetes Fail-Safe Seungkyu Ahn (ahnsk@sk.com) Jaesuk Ahn (jay.ahn@sk.com) Open System Lab Network IT Convergence

Neutron network (1 nic)Network

br-ex

br-data

eth0addif

veth1

addif

veth0

link add (veth)

add-port

br-int

1921683033

patch

phy-br-ex

int-br-exqr-xxxqg-xxx

br-tun

patch-int

patch-tun

patch

ovs internal iface

ovs patchqrouter-xxx

vxlan-xxx

ovs vxlan

linux virtual iface

tapxxx

qdhcp-xxx

namespacelocal ip remote ip vni flow

qbrxxxtabxxx

VM

qvbxxx

br-intpatch-intqvoxxx

br-tunvxlan-xxx patch-tun

eth0

br-exphy-br-ex

int-br-ex

br-dataveth1

veth0

1921683034

Network Node Compute Node

OpenStack-Helm Neutron charthellipnetwork interface external veth0 default br-data ml2 agent tunnel_types vxlan type_drivers - flat - vxlan ovs auto_bridge_add null bridge_mappings null neutron default l3_ha_network_type vxlan debug Truersquohellip

Simplified Overlay Network Architecture

Neutron

Sets SwitchingRouting flow rules

Compute-01 Compute-02

nova-compute

br-int (OVS)

Gateway Node Group

nova-compute br-router (OVS)

QuaggaHypervisor Hypervisor

Controls external connectivity

BGP OSPF Multipath

br-int (OVS) br-int (OVS)

Nova OpenStackNetworkingProxies ARP DHCP

vRouter

Sets NAT flow rulesProvisions virtual machine

East-West Traffic North-South Traffic

OpenStack SONA(ONOS)

VXLAN tunnel

Live Demo 1

CI Presentation - Wil

VS

Moving parts - some numbersbull Git repositories 34 local 10 upstream

bull Deployment configurations 4

bull Supported Environments 4

bull Charts per deployment 12

bull Unique docker images in each deployment 23

bull Pods amp jobs in a single deployment 34

bull Pods amp jobs in a 3 node HA deployment 85

Kinds of Workflows

bull Build Kolla Container

bull Build Helm Chart

bull Test

bull Deploy upgrade

Kinds of Jobs

bull Commit PR builds - validate internal changes

bull Live system deploys - validate component upgrades

bull Nightly builds - validate merge CI upstream

bull Nightly redeploy - validate clean slate deploys

bull Upstream builds - identify incoming breaking changes

When yoursquove got twice as many virtualization technologies things fall over twice as fast

Managing the Clutter

bull No upstream forks

bull Isolated CI build amp test environments

bull Incremental upgrades

bull Constant rebuilds

Testing starts from the ground upDeployed Kubernetes

bull Kubernetes e2e

bull Custom

bull Heapster

Testing starts from the ground upKolla Container Builds

bull Bats

bull Clair

Testing starts from the ground upHelm charts

bull Helm test

bull Partial tempest runs

bull Built in (eg horizon)

Testing starts from the ground upDeployed Openstack

bull Full tempest runs

bull Rally

Deployment is only the Beginning

bull Manual validations

bull Launch custom application on OpenStack

bull Additional security tests on running containers

bull Performance baselines and thresholds

Failure testing

bull Risk - Many pods are run mdashprivileged amp with host networking

bull Layered solutions present a lot more interesting failure cases

bull HA rules change

bull Openstack HA behaviors mapped onto Kubernetes HA behavior

Upgrades Become Boring

bull All changes can be simulated multiple times prior to live rollout

bull Minor component patches become trivial

bull OpenStack upgrades become predictable

Live Demo 2

Challenges (pitfalls)

bull Operational Burden

bull Limitations

bull OpenStack as Cloud Native Apps Feasible Needed Gaps and Improvement

bull Kubernetes Stability (fast moving projects)

bull Security

THE END

Page 27: OpenStack on Kubernetes · OpenStack on Kubernetes: Make OpenStack and Kubernetes Fail-Safe Seungkyu Ahn (ahnsk@sk.com) Jaesuk Ahn (jay.ahn@sk.com) Open System Lab Network IT Convergence

OpenStack-Helm Neutron charthellipnetwork interface external veth0 default br-data ml2 agent tunnel_types vxlan type_drivers - flat - vxlan ovs auto_bridge_add null bridge_mappings null neutron default l3_ha_network_type vxlan debug Truersquohellip

Simplified Overlay Network Architecture

Neutron

Sets SwitchingRouting flow rules

Compute-01 Compute-02

nova-compute

br-int (OVS)

Gateway Node Group

nova-compute br-router (OVS)

QuaggaHypervisor Hypervisor

Controls external connectivity

BGP OSPF Multipath

br-int (OVS) br-int (OVS)

Nova OpenStackNetworkingProxies ARP DHCP

vRouter

Sets NAT flow rulesProvisions virtual machine

East-West Traffic North-South Traffic

OpenStack SONA(ONOS)

VXLAN tunnel

Live Demo 1

CI Presentation - Wil

VS

Moving parts - some numbersbull Git repositories 34 local 10 upstream

bull Deployment configurations 4

bull Supported Environments 4

bull Charts per deployment 12

bull Unique docker images in each deployment 23

bull Pods amp jobs in a single deployment 34

bull Pods amp jobs in a 3 node HA deployment 85

Kinds of Workflows

bull Build Kolla Container

bull Build Helm Chart

bull Test

bull Deploy upgrade

Kinds of Jobs

bull Commit PR builds - validate internal changes

bull Live system deploys - validate component upgrades

bull Nightly builds - validate merge CI upstream

bull Nightly redeploy - validate clean slate deploys

bull Upstream builds - identify incoming breaking changes

When yoursquove got twice as many virtualization technologies things fall over twice as fast

Managing the Clutter

bull No upstream forks

bull Isolated CI build amp test environments

bull Incremental upgrades

bull Constant rebuilds

Testing starts from the ground upDeployed Kubernetes

bull Kubernetes e2e

bull Custom

bull Heapster

Testing starts from the ground upKolla Container Builds

bull Bats

bull Clair

Testing starts from the ground upHelm charts

bull Helm test

bull Partial tempest runs

bull Built in (eg horizon)

Testing starts from the ground upDeployed Openstack

bull Full tempest runs

bull Rally

Deployment is only the Beginning

bull Manual validations

bull Launch custom application on OpenStack

bull Additional security tests on running containers

bull Performance baselines and thresholds

Failure testing

bull Risk - Many pods are run mdashprivileged amp with host networking

bull Layered solutions present a lot more interesting failure cases

bull HA rules change

bull Openstack HA behaviors mapped onto Kubernetes HA behavior

Upgrades Become Boring

bull All changes can be simulated multiple times prior to live rollout

bull Minor component patches become trivial

bull OpenStack upgrades become predictable

Live Demo 2

Challenges (pitfalls)

bull Operational Burden

bull Limitations

bull OpenStack as Cloud Native Apps Feasible Needed Gaps and Improvement

bull Kubernetes Stability (fast moving projects)

bull Security

THE END

Page 28: OpenStack on Kubernetes · OpenStack on Kubernetes: Make OpenStack and Kubernetes Fail-Safe Seungkyu Ahn (ahnsk@sk.com) Jaesuk Ahn (jay.ahn@sk.com) Open System Lab Network IT Convergence

Simplified Overlay Network Architecture

Neutron

Sets SwitchingRouting flow rules

Compute-01 Compute-02

nova-compute

br-int (OVS)

Gateway Node Group

nova-compute br-router (OVS)

QuaggaHypervisor Hypervisor

Controls external connectivity

BGP OSPF Multipath

br-int (OVS) br-int (OVS)

Nova OpenStackNetworkingProxies ARP DHCP

vRouter

Sets NAT flow rulesProvisions virtual machine

East-West Traffic North-South Traffic

OpenStack SONA(ONOS)

VXLAN tunnel

Live Demo 1

CI Presentation - Wil

VS

Moving parts - some numbersbull Git repositories 34 local 10 upstream

bull Deployment configurations 4

bull Supported Environments 4

bull Charts per deployment 12

bull Unique docker images in each deployment 23

bull Pods amp jobs in a single deployment 34

bull Pods amp jobs in a 3 node HA deployment 85

Kinds of Workflows

bull Build Kolla Container

bull Build Helm Chart

bull Test

bull Deploy upgrade

Kinds of Jobs

bull Commit PR builds - validate internal changes

bull Live system deploys - validate component upgrades

bull Nightly builds - validate merge CI upstream

bull Nightly redeploy - validate clean slate deploys

bull Upstream builds - identify incoming breaking changes

When yoursquove got twice as many virtualization technologies things fall over twice as fast

Managing the Clutter

bull No upstream forks

bull Isolated CI build amp test environments

bull Incremental upgrades

bull Constant rebuilds

Testing starts from the ground upDeployed Kubernetes

bull Kubernetes e2e

bull Custom

bull Heapster

Testing starts from the ground upKolla Container Builds

bull Bats

bull Clair

Testing starts from the ground upHelm charts

bull Helm test

bull Partial tempest runs

bull Built in (eg horizon)

Testing starts from the ground upDeployed Openstack

bull Full tempest runs

bull Rally

Deployment is only the Beginning

bull Manual validations

bull Launch custom application on OpenStack

bull Additional security tests on running containers

bull Performance baselines and thresholds

Failure testing

bull Risk - Many pods are run mdashprivileged amp with host networking

bull Layered solutions present a lot more interesting failure cases

bull HA rules change

bull Openstack HA behaviors mapped onto Kubernetes HA behavior

Upgrades Become Boring

bull All changes can be simulated multiple times prior to live rollout

bull Minor component patches become trivial

bull OpenStack upgrades become predictable

Live Demo 2

Challenges (pitfalls)

bull Operational Burden

bull Limitations

bull OpenStack as Cloud Native Apps Feasible Needed Gaps and Improvement

bull Kubernetes Stability (fast moving projects)

bull Security

THE END

Page 29: OpenStack on Kubernetes · OpenStack on Kubernetes: Make OpenStack and Kubernetes Fail-Safe Seungkyu Ahn (ahnsk@sk.com) Jaesuk Ahn (jay.ahn@sk.com) Open System Lab Network IT Convergence

Live Demo 1

CI Presentation - Wil

VS

Moving parts - some numbersbull Git repositories 34 local 10 upstream

bull Deployment configurations 4

bull Supported Environments 4

bull Charts per deployment 12

bull Unique docker images in each deployment 23

bull Pods amp jobs in a single deployment 34

bull Pods amp jobs in a 3 node HA deployment 85

Kinds of Workflows

bull Build Kolla Container

bull Build Helm Chart

bull Test

bull Deploy upgrade

Kinds of Jobs

bull Commit PR builds - validate internal changes

bull Live system deploys - validate component upgrades

bull Nightly builds - validate merge CI upstream

bull Nightly redeploy - validate clean slate deploys

bull Upstream builds - identify incoming breaking changes

When yoursquove got twice as many virtualization technologies things fall over twice as fast

Managing the Clutter

bull No upstream forks

bull Isolated CI build amp test environments

bull Incremental upgrades

bull Constant rebuilds

Testing starts from the ground upDeployed Kubernetes

bull Kubernetes e2e

bull Custom

bull Heapster

Testing starts from the ground upKolla Container Builds

bull Bats

bull Clair

Testing starts from the ground upHelm charts

bull Helm test

bull Partial tempest runs

bull Built in (eg horizon)

Testing starts from the ground upDeployed Openstack

bull Full tempest runs

bull Rally

Deployment is only the Beginning

bull Manual validations

bull Launch custom application on OpenStack

bull Additional security tests on running containers

bull Performance baselines and thresholds

Failure testing

bull Risk - Many pods are run mdashprivileged amp with host networking

bull Layered solutions present a lot more interesting failure cases

bull HA rules change

bull Openstack HA behaviors mapped onto Kubernetes HA behavior

Upgrades Become Boring

bull All changes can be simulated multiple times prior to live rollout

bull Minor component patches become trivial

bull OpenStack upgrades become predictable

Live Demo 2

Challenges (pitfalls)

bull Operational Burden

bull Limitations

bull OpenStack as Cloud Native Apps Feasible Needed Gaps and Improvement

bull Kubernetes Stability (fast moving projects)

bull Security

THE END

Page 30: OpenStack on Kubernetes · OpenStack on Kubernetes: Make OpenStack and Kubernetes Fail-Safe Seungkyu Ahn (ahnsk@sk.com) Jaesuk Ahn (jay.ahn@sk.com) Open System Lab Network IT Convergence

CI Presentation - Wil

VS

Moving parts - some numbersbull Git repositories 34 local 10 upstream

bull Deployment configurations 4

bull Supported Environments 4

bull Charts per deployment 12

bull Unique docker images in each deployment 23

bull Pods amp jobs in a single deployment 34

bull Pods amp jobs in a 3 node HA deployment 85

Kinds of Workflows

bull Build Kolla Container

bull Build Helm Chart

bull Test

bull Deploy upgrade

Kinds of Jobs

bull Commit PR builds - validate internal changes

bull Live system deploys - validate component upgrades

bull Nightly builds - validate merge CI upstream

bull Nightly redeploy - validate clean slate deploys

bull Upstream builds - identify incoming breaking changes

When yoursquove got twice as many virtualization technologies things fall over twice as fast

Managing the Clutter

bull No upstream forks

bull Isolated CI build amp test environments

bull Incremental upgrades

bull Constant rebuilds

Testing starts from the ground upDeployed Kubernetes

bull Kubernetes e2e

bull Custom

bull Heapster

Testing starts from the ground upKolla Container Builds

bull Bats

bull Clair

Testing starts from the ground upHelm charts

bull Helm test

bull Partial tempest runs

bull Built in (eg horizon)

Testing starts from the ground upDeployed Openstack

bull Full tempest runs

bull Rally

Deployment is only the Beginning

bull Manual validations

bull Launch custom application on OpenStack

bull Additional security tests on running containers

bull Performance baselines and thresholds

Failure testing

bull Risk - Many pods are run mdashprivileged amp with host networking

bull Layered solutions present a lot more interesting failure cases

bull HA rules change

bull Openstack HA behaviors mapped onto Kubernetes HA behavior

Upgrades Become Boring

bull All changes can be simulated multiple times prior to live rollout

bull Minor component patches become trivial

bull OpenStack upgrades become predictable

Live Demo 2

Challenges (pitfalls)

bull Operational Burden

bull Limitations

bull OpenStack as Cloud Native Apps Feasible Needed Gaps and Improvement

bull Kubernetes Stability (fast moving projects)

bull Security

THE END

Page 31: OpenStack on Kubernetes · OpenStack on Kubernetes: Make OpenStack and Kubernetes Fail-Safe Seungkyu Ahn (ahnsk@sk.com) Jaesuk Ahn (jay.ahn@sk.com) Open System Lab Network IT Convergence

VS

Moving parts - some numbersbull Git repositories 34 local 10 upstream

bull Deployment configurations 4

bull Supported Environments 4

bull Charts per deployment 12

bull Unique docker images in each deployment 23

bull Pods amp jobs in a single deployment 34

bull Pods amp jobs in a 3 node HA deployment 85

Kinds of Workflows

bull Build Kolla Container

bull Build Helm Chart

bull Test

bull Deploy upgrade

Kinds of Jobs

bull Commit PR builds - validate internal changes

bull Live system deploys - validate component upgrades

bull Nightly builds - validate merge CI upstream

bull Nightly redeploy - validate clean slate deploys

bull Upstream builds - identify incoming breaking changes

When yoursquove got twice as many virtualization technologies things fall over twice as fast

Managing the Clutter

bull No upstream forks

bull Isolated CI build amp test environments

bull Incremental upgrades

bull Constant rebuilds

Testing starts from the ground upDeployed Kubernetes

bull Kubernetes e2e

bull Custom

bull Heapster

Testing starts from the ground upKolla Container Builds

bull Bats

bull Clair

Testing starts from the ground upHelm charts

bull Helm test

bull Partial tempest runs

bull Built in (eg horizon)

Testing starts from the ground upDeployed Openstack

bull Full tempest runs

bull Rally

Deployment is only the Beginning

bull Manual validations

bull Launch custom application on OpenStack

bull Additional security tests on running containers

bull Performance baselines and thresholds

Failure testing

bull Risk - Many pods are run mdashprivileged amp with host networking

bull Layered solutions present a lot more interesting failure cases

bull HA rules change

bull Openstack HA behaviors mapped onto Kubernetes HA behavior

Upgrades Become Boring

bull All changes can be simulated multiple times prior to live rollout

bull Minor component patches become trivial

bull OpenStack upgrades become predictable

Live Demo 2

Challenges (pitfalls)

bull Operational Burden

bull Limitations

bull OpenStack as Cloud Native Apps Feasible Needed Gaps and Improvement

bull Kubernetes Stability (fast moving projects)

bull Security

THE END

Page 32: OpenStack on Kubernetes · OpenStack on Kubernetes: Make OpenStack and Kubernetes Fail-Safe Seungkyu Ahn (ahnsk@sk.com) Jaesuk Ahn (jay.ahn@sk.com) Open System Lab Network IT Convergence

Moving parts - some numbersbull Git repositories 34 local 10 upstream

bull Deployment configurations 4

bull Supported Environments 4

bull Charts per deployment 12

bull Unique docker images in each deployment 23

bull Pods amp jobs in a single deployment 34

bull Pods amp jobs in a 3 node HA deployment 85

Kinds of Workflows

bull Build Kolla Container

bull Build Helm Chart

bull Test

bull Deploy upgrade

Kinds of Jobs

bull Commit PR builds - validate internal changes

bull Live system deploys - validate component upgrades

bull Nightly builds - validate merge CI upstream

bull Nightly redeploy - validate clean slate deploys

bull Upstream builds - identify incoming breaking changes

When yoursquove got twice as many virtualization technologies things fall over twice as fast

Managing the Clutter

bull No upstream forks

bull Isolated CI build amp test environments

bull Incremental upgrades

bull Constant rebuilds

Testing starts from the ground upDeployed Kubernetes

bull Kubernetes e2e

bull Custom

bull Heapster

Testing starts from the ground upKolla Container Builds

bull Bats

bull Clair

Testing starts from the ground upHelm charts

bull Helm test

bull Partial tempest runs

bull Built in (eg horizon)

Testing starts from the ground upDeployed Openstack

bull Full tempest runs

bull Rally

Deployment is only the Beginning

bull Manual validations

bull Launch custom application on OpenStack

bull Additional security tests on running containers

bull Performance baselines and thresholds

Failure testing

bull Risk - Many pods are run mdashprivileged amp with host networking

bull Layered solutions present a lot more interesting failure cases

bull HA rules change

bull Openstack HA behaviors mapped onto Kubernetes HA behavior

Upgrades Become Boring

bull All changes can be simulated multiple times prior to live rollout

bull Minor component patches become trivial

bull OpenStack upgrades become predictable

Live Demo 2

Challenges (pitfalls)

bull Operational Burden

bull Limitations

bull OpenStack as Cloud Native Apps Feasible Needed Gaps and Improvement

bull Kubernetes Stability (fast moving projects)

bull Security

THE END

Page 33: OpenStack on Kubernetes · OpenStack on Kubernetes: Make OpenStack and Kubernetes Fail-Safe Seungkyu Ahn (ahnsk@sk.com) Jaesuk Ahn (jay.ahn@sk.com) Open System Lab Network IT Convergence

Kinds of Workflows

bull Build Kolla Container

bull Build Helm Chart

bull Test

bull Deploy upgrade

Kinds of Jobs

bull Commit PR builds - validate internal changes

bull Live system deploys - validate component upgrades

bull Nightly builds - validate merge CI upstream

bull Nightly redeploy - validate clean slate deploys

bull Upstream builds - identify incoming breaking changes

When yoursquove got twice as many virtualization technologies things fall over twice as fast

Managing the Clutter

bull No upstream forks

bull Isolated CI build amp test environments

bull Incremental upgrades

bull Constant rebuilds

Testing starts from the ground upDeployed Kubernetes

bull Kubernetes e2e

bull Custom

bull Heapster

Testing starts from the ground upKolla Container Builds

bull Bats

bull Clair

Testing starts from the ground upHelm charts

bull Helm test

bull Partial tempest runs

bull Built in (eg horizon)

Testing starts from the ground upDeployed Openstack

bull Full tempest runs

bull Rally

Deployment is only the Beginning

bull Manual validations

bull Launch custom application on OpenStack

bull Additional security tests on running containers

bull Performance baselines and thresholds

Failure testing

bull Risk - Many pods are run mdashprivileged amp with host networking

bull Layered solutions present a lot more interesting failure cases

bull HA rules change

bull Openstack HA behaviors mapped onto Kubernetes HA behavior

Upgrades Become Boring

bull All changes can be simulated multiple times prior to live rollout

bull Minor component patches become trivial

bull OpenStack upgrades become predictable

Live Demo 2

Challenges (pitfalls)

bull Operational Burden

bull Limitations

bull OpenStack as Cloud Native Apps Feasible Needed Gaps and Improvement

bull Kubernetes Stability (fast moving projects)

bull Security

THE END

Page 34: OpenStack on Kubernetes · OpenStack on Kubernetes: Make OpenStack and Kubernetes Fail-Safe Seungkyu Ahn (ahnsk@sk.com) Jaesuk Ahn (jay.ahn@sk.com) Open System Lab Network IT Convergence

Kinds of Jobs

bull Commit PR builds - validate internal changes

bull Live system deploys - validate component upgrades

bull Nightly builds - validate merge CI upstream

bull Nightly redeploy - validate clean slate deploys

bull Upstream builds - identify incoming breaking changes

When yoursquove got twice as many virtualization technologies things fall over twice as fast

Managing the Clutter

bull No upstream forks

bull Isolated CI build amp test environments

bull Incremental upgrades

bull Constant rebuilds

Testing starts from the ground upDeployed Kubernetes

bull Kubernetes e2e

bull Custom

bull Heapster

Testing starts from the ground upKolla Container Builds

bull Bats

bull Clair

Testing starts from the ground upHelm charts

bull Helm test

bull Partial tempest runs

bull Built in (eg horizon)

Testing starts from the ground upDeployed Openstack

bull Full tempest runs

bull Rally

Deployment is only the Beginning

bull Manual validations

bull Launch custom application on OpenStack

bull Additional security tests on running containers

bull Performance baselines and thresholds

Failure testing

bull Risk - Many pods are run mdashprivileged amp with host networking

bull Layered solutions present a lot more interesting failure cases

bull HA rules change

bull Openstack HA behaviors mapped onto Kubernetes HA behavior

Upgrades Become Boring

bull All changes can be simulated multiple times prior to live rollout

bull Minor component patches become trivial

bull OpenStack upgrades become predictable

Live Demo 2

Challenges (pitfalls)

bull Operational Burden

bull Limitations

bull OpenStack as Cloud Native Apps Feasible Needed Gaps and Improvement

bull Kubernetes Stability (fast moving projects)

bull Security

THE END

Page 35: OpenStack on Kubernetes · OpenStack on Kubernetes: Make OpenStack and Kubernetes Fail-Safe Seungkyu Ahn (ahnsk@sk.com) Jaesuk Ahn (jay.ahn@sk.com) Open System Lab Network IT Convergence

When yoursquove got twice as many virtualization technologies things fall over twice as fast

Managing the Clutter

bull No upstream forks

bull Isolated CI build amp test environments

bull Incremental upgrades

bull Constant rebuilds

Testing starts from the ground upDeployed Kubernetes

bull Kubernetes e2e

bull Custom

bull Heapster

Testing starts from the ground upKolla Container Builds

bull Bats

bull Clair

Testing starts from the ground upHelm charts

bull Helm test

bull Partial tempest runs

bull Built in (eg horizon)

Testing starts from the ground upDeployed Openstack

bull Full tempest runs

bull Rally

Deployment is only the Beginning

bull Manual validations

bull Launch custom application on OpenStack

bull Additional security tests on running containers

bull Performance baselines and thresholds

Failure testing

bull Risk - Many pods are run mdashprivileged amp with host networking

bull Layered solutions present a lot more interesting failure cases

bull HA rules change

bull Openstack HA behaviors mapped onto Kubernetes HA behavior

Upgrades Become Boring

bull All changes can be simulated multiple times prior to live rollout

bull Minor component patches become trivial

bull OpenStack upgrades become predictable

Live Demo 2

Challenges (pitfalls)

bull Operational Burden

bull Limitations

bull OpenStack as Cloud Native Apps Feasible Needed Gaps and Improvement

bull Kubernetes Stability (fast moving projects)

bull Security

THE END

Page 36: OpenStack on Kubernetes · OpenStack on Kubernetes: Make OpenStack and Kubernetes Fail-Safe Seungkyu Ahn (ahnsk@sk.com) Jaesuk Ahn (jay.ahn@sk.com) Open System Lab Network IT Convergence

Managing the Clutter

bull No upstream forks

bull Isolated CI build amp test environments

bull Incremental upgrades

bull Constant rebuilds

Testing starts from the ground upDeployed Kubernetes

bull Kubernetes e2e

bull Custom

bull Heapster

Testing starts from the ground upKolla Container Builds

bull Bats

bull Clair

Testing starts from the ground upHelm charts

bull Helm test

bull Partial tempest runs

bull Built in (eg horizon)

Testing starts from the ground upDeployed Openstack

bull Full tempest runs

bull Rally

Deployment is only the Beginning

bull Manual validations

bull Launch custom application on OpenStack

bull Additional security tests on running containers

bull Performance baselines and thresholds

Failure testing

bull Risk - Many pods are run mdashprivileged amp with host networking

bull Layered solutions present a lot more interesting failure cases

bull HA rules change

bull Openstack HA behaviors mapped onto Kubernetes HA behavior

Upgrades Become Boring

bull All changes can be simulated multiple times prior to live rollout

bull Minor component patches become trivial

bull OpenStack upgrades become predictable

Live Demo 2

Challenges (pitfalls)

bull Operational Burden

bull Limitations

bull OpenStack as Cloud Native Apps Feasible Needed Gaps and Improvement

bull Kubernetes Stability (fast moving projects)

bull Security

THE END

Page 37: OpenStack on Kubernetes · OpenStack on Kubernetes: Make OpenStack and Kubernetes Fail-Safe Seungkyu Ahn (ahnsk@sk.com) Jaesuk Ahn (jay.ahn@sk.com) Open System Lab Network IT Convergence

Testing starts from the ground upDeployed Kubernetes

bull Kubernetes e2e

bull Custom

bull Heapster

Testing starts from the ground upKolla Container Builds

bull Bats

bull Clair

Testing starts from the ground upHelm charts

bull Helm test

bull Partial tempest runs

bull Built in (eg horizon)

Testing starts from the ground upDeployed Openstack

bull Full tempest runs

bull Rally

Deployment is only the Beginning

bull Manual validations

bull Launch custom application on OpenStack

bull Additional security tests on running containers

bull Performance baselines and thresholds

Failure testing

bull Risk - Many pods are run mdashprivileged amp with host networking

bull Layered solutions present a lot more interesting failure cases

bull HA rules change

bull Openstack HA behaviors mapped onto Kubernetes HA behavior

Upgrades Become Boring

bull All changes can be simulated multiple times prior to live rollout

bull Minor component patches become trivial

bull OpenStack upgrades become predictable

Live Demo 2

Challenges (pitfalls)

bull Operational Burden

bull Limitations

bull OpenStack as Cloud Native Apps Feasible Needed Gaps and Improvement

bull Kubernetes Stability (fast moving projects)

bull Security

THE END

Page 38: OpenStack on Kubernetes · OpenStack on Kubernetes: Make OpenStack and Kubernetes Fail-Safe Seungkyu Ahn (ahnsk@sk.com) Jaesuk Ahn (jay.ahn@sk.com) Open System Lab Network IT Convergence

Testing starts from the ground upKolla Container Builds

bull Bats

bull Clair

Testing starts from the ground upHelm charts

bull Helm test

bull Partial tempest runs

bull Built in (eg horizon)

Testing starts from the ground upDeployed Openstack

bull Full tempest runs

bull Rally

Deployment is only the Beginning

bull Manual validations

bull Launch custom application on OpenStack

bull Additional security tests on running containers

bull Performance baselines and thresholds

Failure testing

bull Risk - Many pods are run mdashprivileged amp with host networking

bull Layered solutions present a lot more interesting failure cases

bull HA rules change

bull Openstack HA behaviors mapped onto Kubernetes HA behavior

Upgrades Become Boring

bull All changes can be simulated multiple times prior to live rollout

bull Minor component patches become trivial

bull OpenStack upgrades become predictable

Live Demo 2

Challenges (pitfalls)

bull Operational Burden

bull Limitations

bull OpenStack as Cloud Native Apps Feasible Needed Gaps and Improvement

bull Kubernetes Stability (fast moving projects)

bull Security

THE END

Page 39: OpenStack on Kubernetes · OpenStack on Kubernetes: Make OpenStack and Kubernetes Fail-Safe Seungkyu Ahn (ahnsk@sk.com) Jaesuk Ahn (jay.ahn@sk.com) Open System Lab Network IT Convergence

Testing starts from the ground upHelm charts

bull Helm test

bull Partial tempest runs

bull Built in (eg horizon)

Testing starts from the ground upDeployed Openstack

bull Full tempest runs

bull Rally

Deployment is only the Beginning

bull Manual validations

bull Launch custom application on OpenStack

bull Additional security tests on running containers

bull Performance baselines and thresholds

Failure testing

bull Risk - Many pods are run mdashprivileged amp with host networking

bull Layered solutions present a lot more interesting failure cases

bull HA rules change

bull Openstack HA behaviors mapped onto Kubernetes HA behavior

Upgrades Become Boring

bull All changes can be simulated multiple times prior to live rollout

bull Minor component patches become trivial

bull OpenStack upgrades become predictable

Live Demo 2

Challenges (pitfalls)

bull Operational Burden

bull Limitations

bull OpenStack as Cloud Native Apps Feasible Needed Gaps and Improvement

bull Kubernetes Stability (fast moving projects)

bull Security

THE END

Page 40: OpenStack on Kubernetes · OpenStack on Kubernetes: Make OpenStack and Kubernetes Fail-Safe Seungkyu Ahn (ahnsk@sk.com) Jaesuk Ahn (jay.ahn@sk.com) Open System Lab Network IT Convergence

Testing starts from the ground upDeployed Openstack

bull Full tempest runs

bull Rally

Deployment is only the Beginning

bull Manual validations

bull Launch custom application on OpenStack

bull Additional security tests on running containers

bull Performance baselines and thresholds

Failure testing

bull Risk - Many pods are run mdashprivileged amp with host networking

bull Layered solutions present a lot more interesting failure cases

bull HA rules change

bull Openstack HA behaviors mapped onto Kubernetes HA behavior

Upgrades Become Boring

bull All changes can be simulated multiple times prior to live rollout

bull Minor component patches become trivial

bull OpenStack upgrades become predictable

Live Demo 2

Challenges (pitfalls)

bull Operational Burden

bull Limitations

bull OpenStack as Cloud Native Apps Feasible Needed Gaps and Improvement

bull Kubernetes Stability (fast moving projects)

bull Security

THE END

Page 41: OpenStack on Kubernetes · OpenStack on Kubernetes: Make OpenStack and Kubernetes Fail-Safe Seungkyu Ahn (ahnsk@sk.com) Jaesuk Ahn (jay.ahn@sk.com) Open System Lab Network IT Convergence

Deployment is only the Beginning

bull Manual validations

bull Launch custom application on OpenStack

bull Additional security tests on running containers

bull Performance baselines and thresholds

Failure testing

bull Risk - Many pods are run mdashprivileged amp with host networking

bull Layered solutions present a lot more interesting failure cases

bull HA rules change

bull Openstack HA behaviors mapped onto Kubernetes HA behavior

Upgrades Become Boring

bull All changes can be simulated multiple times prior to live rollout

bull Minor component patches become trivial

bull OpenStack upgrades become predictable

Live Demo 2

Challenges (pitfalls)

bull Operational Burden

bull Limitations

bull OpenStack as Cloud Native Apps Feasible Needed Gaps and Improvement

bull Kubernetes Stability (fast moving projects)

bull Security

THE END

Page 42: OpenStack on Kubernetes · OpenStack on Kubernetes: Make OpenStack and Kubernetes Fail-Safe Seungkyu Ahn (ahnsk@sk.com) Jaesuk Ahn (jay.ahn@sk.com) Open System Lab Network IT Convergence

Failure testing

bull Risk - Many pods are run mdashprivileged amp with host networking

bull Layered solutions present a lot more interesting failure cases

bull HA rules change

bull Openstack HA behaviors mapped onto Kubernetes HA behavior

Upgrades Become Boring

bull All changes can be simulated multiple times prior to live rollout

bull Minor component patches become trivial

bull OpenStack upgrades become predictable

Live Demo 2

Challenges (pitfalls)

bull Operational Burden

bull Limitations

bull OpenStack as Cloud Native Apps Feasible Needed Gaps and Improvement

bull Kubernetes Stability (fast moving projects)

bull Security

THE END

Page 43: OpenStack on Kubernetes · OpenStack on Kubernetes: Make OpenStack and Kubernetes Fail-Safe Seungkyu Ahn (ahnsk@sk.com) Jaesuk Ahn (jay.ahn@sk.com) Open System Lab Network IT Convergence

Upgrades Become Boring

bull All changes can be simulated multiple times prior to live rollout

bull Minor component patches become trivial

bull OpenStack upgrades become predictable

Live Demo 2

Challenges (pitfalls)

bull Operational Burden

bull Limitations

bull OpenStack as Cloud Native Apps Feasible Needed Gaps and Improvement

bull Kubernetes Stability (fast moving projects)

bull Security

THE END

Page 44: OpenStack on Kubernetes · OpenStack on Kubernetes: Make OpenStack and Kubernetes Fail-Safe Seungkyu Ahn (ahnsk@sk.com) Jaesuk Ahn (jay.ahn@sk.com) Open System Lab Network IT Convergence

Live Demo 2

Challenges (pitfalls)

bull Operational Burden

bull Limitations

bull OpenStack as Cloud Native Apps Feasible Needed Gaps and Improvement

bull Kubernetes Stability (fast moving projects)

bull Security

THE END

Page 45: OpenStack on Kubernetes · OpenStack on Kubernetes: Make OpenStack and Kubernetes Fail-Safe Seungkyu Ahn (ahnsk@sk.com) Jaesuk Ahn (jay.ahn@sk.com) Open System Lab Network IT Convergence

Challenges (pitfalls)

bull Operational Burden

bull Limitations

bull OpenStack as Cloud Native Apps Feasible Needed Gaps and Improvement

bull Kubernetes Stability (fast moving projects)

bull Security

THE END

Page 46: OpenStack on Kubernetes · OpenStack on Kubernetes: Make OpenStack and Kubernetes Fail-Safe Seungkyu Ahn (ahnsk@sk.com) Jaesuk Ahn (jay.ahn@sk.com) Open System Lab Network IT Convergence

THE END