Monitoring CloudStack and Components May, 17th 2017 Alexander Stock Cloud Infrastructure Architect © 2017 itelligence classification: public | version: 1.1 05/17/2017


Monitoring CloudStack and Components

May, 17th 2017

Alexander Stock

Cloud Infrastructure Architect


About Me


Sysadmin @BIT.Group GmbH – member of itelligence group

Experience in VMware, KVM, Nagios and Ansible

Working with CloudStack since 2015

GitHub:

https://github.com/AlexanderStock

Mail:

[email protected]


CloudStack Berlin & Dresden, Germany https://www.meetup.com/german-CloudStack-user-group

Ansible Dresden, Germany https://www.meetup.com/Ansible-Dresden


Overview BIT.Group GmbH – member of itelligence group


350+ employees in Dresden, Bautzen, Hanover and Shanghai

SAP Consulting, Development and Support

SAP partner and service provider for SAP SE

IT Consulting

Development

Cloud IT Infrastructure Management

SAP BASIS

SAP Solution Manager Application Lifecycle Management

International

BIT Service Desk

SAP Service & Support

ITIL SAP HANA

Workshops

IT Service Management

SAP partner


Since June 2016 BIT.Group GmbH officially part of itelligence and NTT DATA Group

Know-how, flexibility and internationality as part of NTT DATA network

Together internationally leading full IT service provider with:

BIT.Group GmbH as part of itelligence / NTT DATA Group


3,500+ active SAP customers

Locations in 40+ countries

$1.5 billion in SAP revenue worldwide

Over 9,000 SAP experts worldwide


Agenda

1. What do we use for monitoring?

2. MySQL

3. Tomcat

4. CloudStack API

5. Distributed Monitoring


What do we use for Monitoring?


Why do we monitor CloudStack?

Detecting performance issues

Detecting misconfigurations

Detecting resource bottlenecks

Get a long-term overview of our installations


What do we use for Monitoring?

We use Nagios with a front end called Check_MK

Check_MK:

Combines passive and active checks

Auto-inventory of client hosts

Manages hosts, services and reports

Livestatus: module for accessing the core data of Nagios

Can monitor Linux/Unix/Windows/switches/storage… out of the box


Source: https://en.wikipedia.org/wiki/File:Cmk-dashboard.png


What do we use for Monitoring?

[Figure: Check_MK architecture. Labels recoverable from the diagram: Multisite web platform (Status GUI, BI, WATO, Mobile, Custom Applications); Event Console and Event Daemon (Syslog, SNMP traps); NagVis; PNP4Nagios with RRDTool; CMK Notify; monitoring core (Nagios / Icinga) with Check_MK, Livestatus and Livecheck. Monitored targets: Linux, Solaris, VMS, Windows, HP-UX and AIX (TCP or SSH); switch, sensor, appliance and router (SNMP); PING, DNS server, HTTP server and TCP port (Nagios plugins over TCP/IP); inline ICMP.]


What do we use for Monitoring?


1. Nagios core triggers the active check (Check_MK script)
2. Check_MK script polls data from the client over TCP
3. Check_MK script writes long-term data to RRD files
4. Check_MK script distributes check results to passive checks

[Figure: check flow between Nagios (active check, passive checks, current state), Check_MK, RRD and the host agent over TCP, with arrows numbered 1 to 4 matching the steps above.]
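The polling step (the Check_MK script retrieving data from the client over TCP) can be sketched in Python. This is only an illustration, assuming the agent listens on its usual default port 6556 and returns plain text split into sections by `<<<name>>>` header lines; the function names and the sample text are hypothetical and not part of Check_MK.

```python
import socket


def fetch_agent_output(host, port=6556, timeout=10.0):
    """Read the complete plain-text dump from an agent (6556 is assumed as the port)."""
    chunks = []
    with socket.create_connection((host, port), timeout=timeout) as sock:
        while True:
            data = sock.recv(4096)
            if not data:  # the agent closes the connection when it is done
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8", errors="replace")


def split_sections(agent_output):
    """Group agent output lines by their <<<section>>> header."""
    sections = {}
    current = None
    for line in agent_output.splitlines():
        if line.startswith("<<<") and line.endswith(">>>"):
            # Options after a colon (for example ":sep(9)") are dropped here.
            current = line[3:-3].split(":", 1)[0]
            sections.setdefault(current, [])
        elif current is not None:
            sections[current].append(line)
    return sections


# Hypothetical sample, shaped like agent output.
SAMPLE = "<<<uptime>>>\n12345.67 54321.00\n<<<example_section:sep(9)>>>\ncloud\t1048576\t0\n"
parsed = split_sections(SAMPLE)
```

Each section would then be handed to the matching check, which is how one active check can feed many passive service results.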


MySQL


Check_MK Plugin for MySQL

Installation:

wget https://<mycheckmkserver>/<site>/check_mk/agents/mk_mysql
mv mk_mysql /usr/lib/check_mk_agent/plugins/

Configuration Monitoring-Client:

vi /etc/check_mk/mysql.cfg

[client]
user=monitor
password=MyPassWord

Configuration Monitoring-Server:

cmk -I <mydbhost>
cmk -r
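For automated rollouts, the client-side mysql.cfg could also be generated by a script instead of being edited by hand. The following is a minimal sketch, assuming the file only needs the `[client]` section shown above; the function name is hypothetical, and restricting the file to its owner is an added precaution rather than something the slides prescribe.

```python
import configparser
import os
import tempfile


def write_mysql_cfg(path, user, password):
    """Write a [client] section with credentials for the monitoring user."""
    # interpolation=None keeps '%' characters in passwords literal.
    cfg = configparser.ConfigParser(interpolation=None)
    cfg["client"] = {"user": user, "password": password}
    # Request mode 0600 at creation so the password is not world-readable.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w") as handle:
        cfg.write(handle, space_around_delimiters=False)


# Demo against a temporary directory instead of /etc/check_mk.
demo_path = os.path.join(tempfile.mkdtemp(), "mysql.cfg")
write_mysql_cfg(demo_path, "monitor", "MyPassWord")
with open(demo_path) as handle:
    written = handle.read()
```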


MySQL


Checks:

MySQL DB Size <database>
MySQL Connections mysql
MySQL DB Slave mysql
MySQL InnoDB IO mysql
MySQL Version mysql

Alternatives for pure Nagios:

check_mysql_health

Active check for MySQL

Advanced features like "cache hit rates" or "slow queries"


Tomcat


Check_MK plugin for Tomcat using Jolokia (JMX bridge):

Installation:

wget -O jolokia-war-1.3.5.war "http://search.maven.org/remotecontent?filepath=org/jolokia/jolokia-war/1.3.5/jolokia-war-1.3.5.war"
mv jolokia-war-1.3.5.war /usr/share/cloudstack-management/webapps/jolokia.war
service cloudstack-management restart
wget https://<mycheckmkserver>/<site>/check_mk/agents/mk_jolokia
mv mk_jolokia /usr/lib/check_mk_agent/plugins/

Configuration Monitoring-Client:

cd /etc/check_mk/
wget https://<mycheckmkserver>/<site>/check_mk/agents/cfg_examples/jolokia.cfg

Configuration Monitoring-Server:

cmk -I <mytomcathost>
cmk -r


Tomcat


Metrics:

JVM <PORT> <url> Requests
JVM <PORT> <url> Sessions
JVM <PORT> GC PS_MarkSweep
JVM <PORT> GC PS_Scavenge
JVM <PORT> Memory
JVM <PORT> ThreadPool http-8080
JVM <PORT> ThreadPool jk-20400
JVM <PORT> Threads
JVM <PORT> Uptime
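Metrics like these come from JMX and are exposed by Jolokia over HTTP as JSON. As a rough sketch of what such a lookup involves, the Python below builds a Jolokia "read" URL and unpacks the response. The base URL (port 8080 and the /jolokia path, matching the jolokia.war name used earlier) and the function names are assumptions for illustration; this is not the mk_jolokia plugin itself.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen


def jolokia_read_url(base_url, mbean, attribute):
    """Build a Jolokia GET 'read' URL for one MBean attribute."""
    # MBean names containing '/' would need Jolokia's '!' escaping (not handled here).
    return "{}/read/{}/{}".format(
        base_url.rstrip("/"), quote(mbean, safe=":=,"), quote(attribute)
    )


def extract_value(response_text):
    """Return the 'value' field of a Jolokia response; raise if status is not 200."""
    payload = json.loads(response_text)
    if payload.get("status") != 200:
        raise RuntimeError("Jolokia error: {}".format(payload.get("error", "unknown")))
    return payload["value"]


def read_attribute(base_url, mbean, attribute, timeout=10.0):
    """Fetch one attribute from a running Jolokia agent."""
    url = jolokia_read_url(base_url, mbean, attribute)
    with urlopen(url, timeout=timeout) as resp:
        return extract_value(resp.read().decode("utf-8"))


# Assumed base URL; adjust host, port and context path to the installation.
example_url = jolokia_read_url(
    "http://localhost:8080/jolokia", "java.lang:type=Memory", "HeapMemoryUsage"
)
```

Calling `read_attribute("http://localhost:8080/jolokia", "java.lang:type=Memory", "HeapMemoryUsage")` against a running management server would return the heap usage as a dictionary.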


CloudStack API


Check Cloudstack.py:

Developed by BIT.Group to see what's going on inside CloudStack

Python script which can monitor different parts of CloudStack

Built as an active check which can also be used with plain Nagios

Thresholds can be defined in a JSON file (global thresholds and instance thresholds)

Performance data (long-term usage) is produced by the scripts

Two categories:

Availability checks

Resource checks
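An active check that also works with plain Nagios follows the usual plugin conventions: one status line, optional performance data after a pipe character, and an exit code of 0 (OK), 1 (WARNING), 2 (CRITICAL) or 3 (UNKNOWN). The sketch below illustrates that convention only; the function names and values are hypothetical and are not taken from the BIT.Group script.

```python
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3
STATE_NAMES = {OK: "OK", WARNING: "WARNING", CRITICAL: "CRITICAL", UNKNOWN: "UNKNOWN"}


def evaluate(value, warn, crit):
    """Map a measured value to a Nagios state using upper thresholds."""
    if value >= crit:
        return CRITICAL
    if value >= warn:
        return WARNING
    return OK


def format_result(label, value, warn, crit, unit="%"):
    """Return (exit_code, output_line) with performance data after the pipe."""
    state = evaluate(value, warn, crit)
    text = "{}: {} is at {}{}".format(STATE_NAMES[state], label, value, unit)
    perfdata = "{}={}{};{};{}".format(label, value, unit, warn, crit)
    return state, "{} | {}".format(text, perfdata)


def main():
    """Print the status line and return the exit code."""
    state, line = format_result("cpu", 37.2, 60, 95)
    print(line)
    return state

# In a real plugin the script would end with: sys.exit(main())
```

The performance data part is what the monitoring core stores for long-term graphs.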


CloudStack API


Availability checks:

Hoststatus:

Status of hosts per cluster
Detects if hosts are reachable and enabled
Writes performance data

Status for Cluster: kvm01
Host  Result  Status   Enabled
hv05  OK      running  yes
hv03  OK      running  yes
hv02  OK      running  yes
hv04  OK      running  yes
hv01  OK      running  yes

System VM:

Global status of all System VMs
Writes performance data

Name       Status  Running
v-1405-VM  OK      yes
s-1406-VM  OK      yes

Virtual router:

Global status of all virtual routers
Detects if a VR is up or needs an update
Checks redundant routers
Writes performance data

Name       Status    Running  Upgrade
r-1289-VM  OK        yes      no
r-1385-VM  OK        yes      no
r-1272-VM  Critical  yes      yes
r-1173-VM  OK        yes      no
r-1381-VM  OK        yes      no

Status of redundant VPC Routers
Name  Status  Status
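How a host availability check might derive its result can be sketched as follows. This is an illustration under assumptions: the input mimics entries of a CloudStack listHosts response, and the field names `state` and `resourcestate` with the values "Up" and "Enabled" are assumed here; the actual script may evaluate other fields.

```python
def check_hosts(hosts):
    """Return (worst_state, rows); a host is OK only if it is up and enabled."""
    rows = []
    worst = 0
    for host in hosts:
        running = host.get("state") == "Up"
        enabled = host.get("resourcestate") == "Enabled"
        status = 0 if (running and enabled) else 2  # 0 = OK, 2 = Critical
        worst = max(worst, status)
        rows.append((host["name"], status, running, enabled))
    return worst, rows


# Hypothetical sample data for one cluster.
SAMPLE_HOSTS = [
    {"name": "hv01", "state": "Up", "resourcestate": "Enabled"},
    {"name": "hv02", "state": "Down", "resourcestate": "Enabled"},
    {"name": "hv03", "state": "Up", "resourcestate": "Disabled"},
]
worst_state, host_rows = check_hosts(SAMPLE_HOSTS)
```

The worst individual state becomes the state of the whole service, so a single unreachable host turns the cluster check critical.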


CloudStack API


Resource checks:

Capacity:

• Status of all global capacity metrics
• Thresholds can be set in JSON file
• Writes performance data for each metric

OK: CAPACITY_TYPE_CPU is in status ok. Value:37.2%
OK: CAPACITY_TYPE_MEMORY is in status ok. Value:71.11%
OK: CAPACITY_TYPE_STORAGE_ALLOCATED No Thresholds given.Value:26.99%
OK: CAPACITY_TYPE_VIRTUAL_NETWORK_PUBLIC_IP No Thresholds given. Value:63.03%
OK: CAPACITY_TYPE_PRIVATE_IP No Thresholds given. Value:3.92%
OK: CAPACITY_TYPE_VLAN No Thresholds given. Value:92.96%
OK: CAPACITY_TYPE_DIRECT_ATTACHED_PUBLIC_IP No Thresholds given. Value:2.01%
OK: CAPACITY_TYPE_SECONDARY_STORAGE No Thresholds given. Value:45.01%
OK: CAPACITY_TYPE_STORAGE No Thresholds given. Value:19.38%
OK: CAPACITY_TYPE_LOCAL_STORAGE No Thresholds given. Value:0%

Domains/Projects:

• Monitors usage metrics for all domains/projects
• Checks if domains/projects have reached their resource thresholds
• Thresholds can be set in JSON file
• Writes performance data for all metrics

Results for Domain ROOT:
Results for Domain DOM1:
Warning: Domain DOM1 has reached threshold for cpu: 80
Results for Domain DOM2:
Results for Domain DOM3:
Results for Domain DOM4:
Warning: Domain DOM4 has reached threshold for memory: 80
Results for Domain DOM5:

Offerings:

• Monitors if offerings can be deployed on clusters
• Thresholds can be defined in JSON file
• Writes performance data for each offering

Statistics for Cluster: kvm01
! Offering ! Count!
! XL       !    21!
! XXL      !    12!
! XXXL     !     5!
! XXXXL    !     0!
! XXXXXL   !     0!
--> Critical: Offering: XXXXL can not be deployed anymore
--> Critical: Offering: XXXXXL can not be deployed anymore
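The offerings check reports how many more instances of each offering still fit on a cluster. One plausible way to compute such a count is sketched below. It is an assumption about the approach, not the script's actual logic: each host contributes as many instances as its free CPU and free memory both allow, and all dictionary keys are hypothetical.

```python
def deployable_count(hosts, offering):
    """Count how many instances of an offering fit on the given hosts."""
    total = 0
    for host in hosts:
        by_cpu = host["free_cpu_mhz"] // offering["cpu_mhz"]
        by_mem = host["free_mem_mb"] // offering["mem_mb"]
        # An instance must fit entirely on one host, so the scarcer resource wins.
        total += min(by_cpu, by_mem)
    return total


# Hypothetical free capacity per host and two hypothetical offerings.
HOSTS = [
    {"free_cpu_mhz": 8000, "free_mem_mb": 16384},
    {"free_cpu_mhz": 4000, "free_mem_mb": 65536},
]
SMALL = {"cpu_mhz": 2000, "mem_mb": 8192}
HUGE = {"cpu_mhz": 16000, "mem_mb": 131072}

small_count = deployable_count(HOSTS, SMALL)
huge_count = deployable_count(HOSTS, HUGE)
```

A count of zero corresponds to the "can not be deployed anymore" message in the sample output.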


CloudStack API


Execution:

./cloudstack-resources.py -m <MODE> -f <configfile> -d <optional DomainID> -p <optional ProjectID>

Configfiles:

For domain and project checks:

{
  "thresholds": {
    "DOM1": {
      "cpu": { "warn": "50", "critical": "90" }
    }
  },
  "global": {
    "cpu": { "warn": "60", "critical": "95" }
  }
}

For capacity and offering checks:

{
  "thresholds": {
    "CAPACITY_TYPE_MEMORY": { "warn": "50", "critical": "80" },
    "CAPACITY_TYPE_CPU": { "warn": "30", "critical": "70" }
  }
}
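How the threshold files for domain and project checks might be evaluated can be sketched in Python. The precedence used here (an instance entry such as DOM1 overrides the global entry, and a metric without any entry is reported as OK) is an assumption based on the slide text and sample output, not confirmed behaviour of the script; the function names are hypothetical.

```python
import json

CONFIG = json.loads("""
{
  "thresholds": {"DOM1": {"cpu": {"warn": "50", "critical": "90"}}},
  "global": {"cpu": {"warn": "60", "critical": "95"}}
}
""")


def lookup(config, instance, metric):
    """Return (warn, critical) for a metric, or None if no thresholds are given."""
    entry = config.get("thresholds", {}).get(instance, {}).get(metric)
    if entry is None:
        # Fall back to the global thresholds (assumed precedence).
        entry = config.get("global", {}).get(metric)
    if entry is None:
        return None
    # The files store numbers as strings, so convert them before comparing.
    return float(entry["warn"]), float(entry["critical"])


def classify(value, limits):
    """Map a usage value in percent to OK, Warning or Critical."""
    if limits is None:
        return "OK"  # no thresholds given
    warn, critical = limits
    if value >= critical:
        return "Critical"
    if value >= warn:
        return "Warning"
    return "OK"
```

For example, `classify(80, lookup(CONFIG, "DOM1", "cpu"))` yields a warning because the instance threshold of 50 applies instead of the global 60.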


CloudStack API


Outlook:

Checks to come:

Monitoring of network usage
Monitoring of optimal VM placement
Resource forecasting
Monitoring of old snapshots

Download:

https://exchange.nagios.org/directory/Plugins/Cloud/Check_Cloudstack/details


Distributed Monitoring


One Master Server which holds all configurations of the slaves

Status of objects will be queried on demand via Livestatus

All data is stored on the slaves

Configurations of the slaves will be done via API and HTTPS

Slaves provide UI functionality for the customers

Setup can be done over UI
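Querying status on demand relies on the Livestatus query language: a plain-text request terminated by a blank line, answered here in JSON. The sketch below shows how a master could ask a slave site for host states. It assumes the slave exposes Livestatus on a TCP socket (port 6557 is a common choice but depends on the site configuration), and the function names are hypothetical.

```python
import json
import socket


def build_query(table, columns, filters=()):
    """Assemble a Livestatus query; the trailing blank line ends the request."""
    lines = ["GET {}".format(table), "Columns: {}".format(" ".join(columns))]
    lines.extend("Filter: {}".format(f) for f in filters)
    lines.append("OutputFormat: json")
    return "\n".join(lines) + "\n\n"


def parse_rows(raw, columns):
    """Turn the JSON list-of-lists answer into one dictionary per row."""
    return [dict(zip(columns, row)) for row in json.loads(raw)]


def query_site(address, query, timeout=10.0):
    """Send a query to a Livestatus TCP socket and return the raw answer."""
    chunks = []
    with socket.create_connection(address, timeout=timeout) as sock:
        sock.sendall(query.encode("utf-8"))
        sock.shutdown(socket.SHUT_WR)  # tell the server the request is complete
        while True:
            data = sock.recv(4096)
            if not data:
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8")


COLUMNS = ["name", "state"]
QUERY = build_query("hosts", COLUMNS, ["state != 0"])
```

Against a reachable slave, `parse_rows(query_site(("slave1.example.com", 6557), QUERY), COLUMNS)` would list all hosts that are not up; the host name is a placeholder.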


[Figure: distributed setup with a Master Site, Slave Site 1 and Slave Site 2. Each site is drawn with its own core, state, RRDs and monitored systems; the sites are connected via Livestatus.]


Distributed Monitoring


[Figure: isolated networks of Customer A and Customer B and an isolated monitoring network, annotated with the labels below.]

UI access for users

Configuration of hosts and settings over UI or API

Automation with Chef, Ansible…

Central overview of all systems

Rules can be maintained centrally

Replication of settings and query of Livestatus

Check of servers


Summary


Detecting performance issues:

Solved through MySQL and Tomcat checks

Detecting misconfigurations:

Solved through availability checks via the API

Detecting resource bottlenecks:

Solved through resource checks via the API

Get a long-term overview of our installations:

All checks produce RRD files, which can be used for analysis over a long period


Other Platforms


Zabbix: https://github.com/ke4qqq/zabbix-cloudstack

Zenoss: https://www.zenoss.com/product/zenpacks/cloudstack


Alexander Stock
Cloud Infrastructure Architect
[email protected]
BIT.Group GmbH – member of itelligence group

We make the most of SAP® solutions!


Contact

Questions?


No part of this publication may be reproduced or transmitted in any form or for any purpose without the express permission of itelligence AG. The information contained herein may be changed without prior notice.

Some software products marketed by itelligence AG and its distributors contain proprietary software components of other software vendors. All product and service names mentioned and associated logos displayed are the trademarks of their respective companies. Data contained in this document serves informational purposes only. National product specifications may vary.

The information in this document is proprietary to itelligence. This document is a preliminary version and not subject to your license agreement or any other agreement with itelligence. This document contains only intended strategies, developments and product functionalities and is not intended to be binding upon itelligence to any particular course of business, product strategy, and/or development. itelligence assumes no responsibility for errors or omissions in this document. itelligence does not warrant the accuracy or completeness of the information, text, graphics, links, or other items contained within this material. This document is provided without a warranty of any kind, either express or implied, including but not limited to the implied warranties of merchantability, fitness for a particular purpose, or non-infringement.

itelligence shall have no liability for damages of any kind including without limitation direct, special, indirect, or consequential damages that may result from the use of these materials. This limitation shall not apply in cases of intent or gross negligence.

The statutory liability for personal injury and defective products is not affected. itelligence has no control over the information that you may access through the use of hot links contained in these materials and does not endorse your use of third-party Web pages nor provide any warranty whatsoever relating to third-party Web pages.

Copyright itelligence AG - All rights reserved