Enforcing Application SLAs with Congress and Monasca Fabio Giannetti, Ken Owens April 28, 2016

Upload: fabio-giannetti

Post on 12-Apr-2017

TRANSCRIPT

Page 1: Enforcing Application SLA with Congress and Monasca

Enforcing Application SLAs with Congress and Monasca

Fabio Giannetti, Ken Owens

April 28, 2016

Page 2: Enforcing Application SLA with Congress and Monasca

© 2016 Cisco and/or its affiliates. All rights reserved. Cisco Confidential

Outline

• Vision
• Congress and Monasca implementing:
  • OPS/NOC SLA Policies
  • App Intent SLA Policies
• Current State and Next Steps

Page 3: Enforcing Application SLA with Congress and Monasca

Vision

Page 4: Enforcing Application SLA with Congress and Monasca

Vision

• Application owners and developers do not care about the underlying infrastructure unless it causes a problem.

• Microservices-based architectures demand inherently granular application design.

• SLAs for applications must be holistic and independent of the underlying infrastructure.

[Diagram: Host; Virtualization and Container layers; services (Srvc) grouped into Application A and Application B]

Page 5: Enforcing Application SLA with Congress and Monasca

Vision

Enable business and application owners to easily define the aspects that matter for running their applications, within the budget constraints imposed by IT.

Page 6: Enforcing Application SLA with Congress and Monasca

Vision

Monitoring is now holistic: it has to consider several levels of virtualization and harmonize data across the different layers.

Containers are short-lived and are moved around the available infrastructure.

[Diagram: Host; Virtualization layers; Containers]

Page 7: Enforcing Application SLA with Congress and Monasca

Vision

When an application owner's soft limit is reached, an alarm is sent back to the owner; when a hard limit is reached, the corresponding action is performed.

Page 8: Enforcing Application SLA with Congress and Monasca

OPS/NOC SLA using Congress and Monasca

Page 9: Enforcing Application SLA with Congress and Monasca

Underutilized Servers OPS/NOC Policy Example

error(vm, email) :-
    nova:server_owner(vm, owner),
    two_months_before_today(start, end),
    ceilometer:statistics(vm, start, end, "cpu-util", cpu),
    cpu < 5,
    keystone:email(owner, email)

two_months_before_today(start, end) :-
    date:today(end),
    date:minus(end, "2 months", start)

If a VM has less than 5% CPU utilization for the last 2 months, then notify its owner via email
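The two helper conditions in this policy, the two-month window and the `cpu < 5` test, can be sketched in Python. This is only an illustration of the logic, not how Congress evaluates the rule; the function names `minus_months`, `two_months_before_today` and `is_underutilized` are hypothetical, and clamping the day to the end of a shorter month is an assumption about what "2 months" should mean.

```python
import calendar
from datetime import date
from typing import Tuple


def minus_months(end: date, months: int) -> date:
    """Return the date `months` calendar months before `end`.

    The day is clamped to the last day of the target month (an assumption).
    """
    index = end.year * 12 + (end.month - 1) - months
    year, month0 = divmod(index, 12)
    month = month0 + 1
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, min(end.day, last_day))


def two_months_before_today(today: date) -> Tuple[date, date]:
    """Mirror of two_months_before_today(start, end) in the policy."""
    return minus_months(today, 2), today


def is_underutilized(cpu_percent: float, threshold: float = 5.0) -> bool:
    """Mirror of the `cpu < 5` condition in the policy body."""
    return cpu_percent < threshold


start, end = two_months_before_today(date(2016, 4, 28))
print(start, end)
```

Passing the current date in as an argument, rather than calling `date.today()` inside the helper, keeps the window calculation easy to test.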

Page 10: Enforcing Application SLA with Congress and Monasca

Current Solution

Components: Ceilometer API, Ceilometer Datasource, Congress API, Policy Engine

Poll every <n>s: GET /v2/meters/cpu_util/statistics?resource_id=…

VM UUID (Resource ID) | CPU
xxxxxxxx-0001-xxxx-xxxxxxxxxxx | 40
xxxxxxxx-0002-xxxx-xxxxxxxxxxx | 30
xxxxxxxx-0003-xxxx-xxxxxxxxxxx | 2
xxxxxxxx-0004-xxxx-xxxxxxxxxxx | 70
xxxxxxxx-0005-xxxx-xxxxxxxxxxx | 55

Page 11: Enforcing Application SLA with Congress and Monasca

Current Solution

Components: Congress API, Policy Engine, Ceilometer Datasource, Nova Datasource (Nova API), Keystone Datasource (Keystone API)

Ceilometer Datasource:
VM UUID (Resource ID) | CPU
xxxxxxxx-0001-xxxx | 40
xxxxxxxx-0002-xxxx | 30
xxxxxxxx-0003-xxxx | 2
xxxxxxxx-0004-xxxx | 70
xxxxxxxx-0005-xxxx | 55

Nova Datasource:
VM | Owner
xxxxxxxx-0001-xxxx | Ann
xxxxxxxx-0002-xxxx | Fabio
xxxxxxxx-0003-xxxx | Fabio
xxxxxxxx-0004-xxxx | Ken
xxxxxxxx-0005-xxxx | Ken

Keystone Datasource:
Owner | Email
Ann | [email protected]
Fabio | [email protected]
Ken | [email protected]

error(vm, email):
VM | Email
xxxxxxxx-0003-xxxx | [email protected]
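The way the policy engine combines the three datasource tables amounts to a join on VM and owner. Below is a minimal Python sketch of that join. It is an illustration only, not Congress code: the function name `underutilized_vm_emails` is hypothetical, the CPU values 40, 30, 2, 70 and 55 are an assumed reading of the slide's column, and the email addresses are placeholders because the originals are not recoverable.

```python
# CPU utilisation per VM (assumed reading of the Ceilometer table).
cpu_by_vm = {
    "xxxxxxxx-0001-xxxx": 40,
    "xxxxxxxx-0002-xxxx": 30,
    "xxxxxxxx-0003-xxxx": 2,
    "xxxxxxxx-0004-xxxx": 70,
    "xxxxxxxx-0005-xxxx": 55,
}

# VM -> owner, as in the Nova table.
owner_by_vm = {
    "xxxxxxxx-0001-xxxx": "Ann",
    "xxxxxxxx-0002-xxxx": "Fabio",
    "xxxxxxxx-0003-xxxx": "Fabio",
    "xxxxxxxx-0004-xxxx": "Ken",
    "xxxxxxxx-0005-xxxx": "Ken",
}

# Owner -> email; placeholder addresses, not the ones on the slide.
email_by_owner = {
    "Ann": "ann@example.invalid",
    "Fabio": "fabio@example.invalid",
    "Ken": "ken@example.invalid",
}


def underutilized_vm_emails(cpu, owners, emails, threshold=5):
    """Return (vm, email) rows for VMs whose CPU is below the threshold."""
    rows = []
    for vm, value in cpu.items():
        if value >= threshold:
            continue  # `cpu < 5` does not hold for this VM.
        owner = owners.get(vm)
        email = emails.get(owner)
        if owner is None or email is None:
            continue  # A missing join partner means no result row.
        rows.append((vm, email))
    return rows


print(underutilized_vm_emails(cpu_by_vm, owner_by_vm, email_by_owner))
```

Only the VM ending in 0003 falls below the threshold, so the sketch yields a single row for that VM and its owner's address.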

Page 12: Enforcing Application SLA with Congress and Monasca

From Policy to Alarm

error(vm, email) :-
    nova:server_owner(vm, owner),
    two_months_before_today(start, end),
    monasca_alarms:stats(vm, start, end, "cpu.user_perc", cpu),
    cpu < 5,
    keystone:email(owner, email)

two_months_before_today(start, end) :-
    date:today(end),
    date:minus(end, "2 months", start)

{
  "name": "Average CPU percent is less than 5",
  "description": "The average CPU percent is less than 5",
  "expression": "(avg(cpu.user_perc{resource_id=vm}) < 5)",
  "match_by": [ "resource_id" ],
  "severity": "HIGH",
  "ok_actions": [ "action_id_for_ok" ],
  "alarm_actions": [ "action_id_for_alarm" ]
}
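A policy-to-alarm conversion could start from something as small as the following Python sketch, which fills the alarm definition fields from the metric name and threshold used in the policy. The function `build_low_cpu_alarm_definition` and its parameters are hypothetical; the field names and the expression follow the JSON on this slide, and the action IDs are the slide's placeholders rather than real notification IDs.

```python
import json


def build_low_cpu_alarm_definition(
    metric="cpu.user_perc",
    threshold=5,
    dimension="resource_id",
    dimension_value="vm",
    ok_actions=None,
    alarm_actions=None,
):
    """Build an alarm-definition body shaped like the one on the slide."""
    # Doubled braces emit literal "{" and "}" around the dimension filter.
    expression = f"(avg({metric}{{{dimension}={dimension_value}}}) < {threshold})"
    return {
        "name": f"Average CPU percent is less than {threshold}",
        "description": f"The average CPU percent is less than {threshold}",
        "expression": expression,
        "match_by": [dimension],
        "severity": "HIGH",
        "ok_actions": list(ok_actions or ["action_id_for_ok"]),
        "alarm_actions": list(alarm_actions or ["action_id_for_alarm"]),
    }


definition = build_low_cpu_alarm_definition()
print(json.dumps(definition, indent=2))
```

The resulting dictionary is what would be sent as the request body when creating the alarm definition; how the real conversion component parses a policy rule is not covered here.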

Page 13: Enforcing Application SLA with Congress and Monasca

Proposed Solution (receiving notifications)

Monasca components: Monasca Agents, Monasca API, Kafka Cluster, Threshold Engine, Persister, Notification Engine, Metrics DB, Settings DB

Congress components: Congress API, Policy Engine, Monasca Alarm Datasource

Alarm definition: POST /v2.0/alarm-definitions

Notification setup: monasca notification-create congress WEBHOOK http:…/v1/data-sources/monasca_alarm?execute&action=handle_alarm

Webhook: …/v1/data-sources/monasca_alarm?execute&action=handle_alarm, handled by handle_alarm(params)

Monasca Alarm Datasource:
VM UUID (Resource ID) | CPU
xxxxxxxx-0003-xxxx | 2
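The role of `handle_alarm(params)` can be sketched in Python as a function that turns one webhook notification into a row of the datasource table. This is an assumption-heavy illustration: the payload fields used here (`state`, `alarm_name`, `metrics[].dimensions`) are an assumed shape for the notification body, and the sketch records the alarm name per resource instead of the CPU value shown in the slide's table.

```python
from typing import Dict


def handle_alarm(params: dict, table: Dict[str, str]) -> Dict[str, str]:
    """Update `table` (resource_id -> alarm name) from one notification."""
    state = params.get("state")
    alarm_name = params.get("alarm_name", "")
    for metric in params.get("metrics", []):
        resource_id = metric.get("dimensions", {}).get("resource_id")
        if resource_id is None:
            continue  # Alarm is not tied to a VM; nothing to record.
        if state == "ALARM":
            table[resource_id] = alarm_name
        elif state == "OK":
            table.pop(resource_id, None)  # Alarm cleared: drop the row.
    return table


# Hypothetical notification for the VM that appears in the slide's table.
notification = {
    "alarm_name": "Average CPU percent is less than 5",
    "state": "ALARM",
    "metrics": [
        {
            "name": "cpu.user_perc",
            "dimensions": {"resource_id": "xxxxxxxx-0003-xxxx"},
        },
    ],
}

rows = handle_alarm(notification, {})
print(rows)
```

Because rows appear only when an alarm fires and disappear when it clears, the table stays small compared with polling every VM's statistics.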

Page 14: Enforcing Application SLA with Congress and Monasca

Proposed Solution (receiving notifications)

Components: Congress API, Policy Engine, Monasca Alarm Datasource, Nova Datasource (Nova API), Keystone Datasource (Keystone API)

Monasca Alarm Datasource:
VM UUID (Resource ID) | CPU
xxxxxxxx-0003-xxxx | 2

Nova Datasource:
VM | Owner
xxxxxxxx-0003-xxxx | Fabio

Keystone Datasource:
Owner | Email
Fabio | [email protected]

error(vm, email):
VM | Email
xxxxxxxx-0003-xxxx | [email protected]

Page 15: Enforcing Application SLA with Congress and Monasca

Application Intent SLA using Congress and Monasca

Page 16: Enforcing Application SLA with Congress and Monasca

App Intent Policy Example: VM evacuation for a business-critical app if its host has potential health issues

error(vm) :-
    nova:show(vm, hostID),
    monasca_alarm:host_issues(hostID)

If a Host has issues, for instance:

1. Unhealthy: cannot be pinged and/or reached over SSH

2. Network errors and packet loss

3. Disk space below a certain threshold

Page 17: Enforcing Application SLA with Congress and Monasca

App Intent Policy: Metrics Correlation

error(vm) :-
    nova:show(vm, hostID),
    monasca_alarm:host_issues(hostID)

Metric Name | Dimensions | Value
host_alive_status | observer_host=fqdn, hostname=supplied hostname being checked, test_type=ping or ssh | 0=online, 1=offline
disk.space_used_perc | device, mount_point | The percentage of disk space that is being used on a device
net.in_packets_dropped_sec | device | Number of inbound network packets dropped per second
net.out_packets_dropped_sec | device | Number of outbound network packets dropped per second

Page 18: Enforcing Application SLA with Congress and Monasca

App Intent Policy: Multi-Alarms #1

{
  "name": "Host is Unhealthy",
  "description": "The host is considered unhealthy",
  "expression": "(host_alive_status{host_id=hostID} = 1)",
  "match_by": [ "host_id" ],
  ...
}

{
  "name": "Host disk getting full",
  "description": "The host disk is reaching capacity",
  "expression": "(disk.space_used_perc{host_id=hostID} > 90)",
  "match_by": [ "host_id" ],
  ...
}

Metric Name | Value
host_alive_status | 0=online, 1=offline
disk.space_used_perc | The percentage of disk space that is being used on a device
net.in_packets_dropped_sec | Number of inbound network packets dropped per second
net.out_packets_dropped_sec | Number of outbound network packets dropped per second

Page 19: Enforcing Application SLA with Congress and Monasca

App Intent Policy: Multi-Alarms #2

{
  "name": "Host is Unhealthy",
  "description": "The host is considered unhealthy",
  "expression": "(net.in_packets_dropped_sec{host_id=hostID} > 30)",
  "match_by": [ "host_id" ],
  ...
}

{
  "name": "Host disk getting full",
  "description": "The host disk is reaching capacity",
  "expression": "(net.out_packets_dropped_sec{host_id=hostID} > 30)",
  "match_by": [ "host_id" ],
  ...
}

Metric Name | Value
host_alive_status | 0=online, 1=offline
disk.space_used_perc | The percentage of disk space that is being used on a device
net.in_packets_dropped_sec | Number of inbound network packets dropped per second
net.out_packets_dropped_sec | Number of outbound network packets dropped per second
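How the four alarms combine into `host_issues(hostID)` and then into `error(vm)` can be sketched in Python. In the proposed design the thresholds are evaluated by Monasca, so this local evaluation is only an illustration of the correlation logic. The names `HOST_ISSUE_RULES`, `host_issues` and `error_vms` are hypothetical, the thresholds are taken from the alarm expressions on the Multi-Alarms slides, and the sample hosts and VMs are invented for the example.

```python
import operator

# (metric, comparison, threshold) taken from the four alarm expressions.
HOST_ISSUE_RULES = [
    ("host_alive_status", operator.eq, 1),
    ("disk.space_used_perc", operator.gt, 90),
    ("net.in_packets_dropped_sec", operator.gt, 30),
    ("net.out_packets_dropped_sec", operator.gt, 30),
]


def host_issues(metrics_by_host):
    """Return the set of host IDs for which at least one rule fires."""
    flagged = set()
    for host_id, metrics in metrics_by_host.items():
        for name, compare, threshold in HOST_ISSUE_RULES:
            value = metrics.get(name)
            if value is not None and compare(value, threshold):
                flagged.add(host_id)
                break  # One firing alarm is enough to flag the host.
    return flagged


def error_vms(host_by_vm, flagged_hosts):
    """Mirror of error(vm) :- nova:show(vm, hostID), host_issues(hostID)."""
    return sorted(vm for vm, host in host_by_vm.items() if host in flagged_hosts)


# Hypothetical sample data: host-a has a nearly full disk, host-b is healthy.
metrics_by_host = {
    "host-a": {"host_alive_status": 0, "disk.space_used_perc": 95},
    "host-b": {"host_alive_status": 0, "disk.space_used_perc": 40,
               "net.in_packets_dropped_sec": 3},
}
host_by_vm = {"vm-1": "host-a", "vm-2": "host-b", "vm-3": "host-a"}

print(error_vms(host_by_vm, host_issues(metrics_by_host)))
```

With this sample data only the VMs placed on host-a are reported, which is the set a follow-up evacuation action would target.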

Page 20: Enforcing Application SLA with Congress and Monasca

Current State and Future Work

Page 21: Enforcing Application SLA with Congress and Monasca

Overall Architecture

Monasca components: Monasca Agents, Monasca API, Kafka Cluster, Threshold Engine, Persister, Notification Engine, Metrics DB, Settings DB

Congress components: Congress API, Policy Engine, Monasca Alarm Datasource

Other components: Keystone, In Mem DB

Connections: webhook, rpc

Monasca Alarm Datasource:
Metric | Value
metric1 | val1
metricN | valN

Page 22: Enforcing Application SLA with Congress and Monasca

Current Status and Next Steps

• Done:
  • Developed a Monasca Datasource to validate integration.
  • Designed the solution and found the main integration points.

• To be Done:
  • Develop a Monasca Alarm Datasource leveraging the RPC capabilities in Congress.
  • Create a Congress Notification Webhook for Monasca.
  • Develop a policy-to-alarm conversion component for policies prefixed with monasca-alarm.

Page 23: Enforcing Application SLA with Congress and Monasca

OpenStack Summit

Austin, Texas 2016

Thank You!