Integrating Automated Scalability Assessment into DevOps Alberto Avritzer et al. The Eighth International Workshop on Load Testing and Benchmarking of Software Systems (LTB 2020) eSulabSolutions

Post on 18-Mar-2021

Page 1: eSulabSolutions - York Universityltb2020.eecs.yorku.ca/slides/AlbertoAvritzer--LTB2020.pdf · The Eighth International Workshop on Load Testing and Benchmarking of Software Systems

Integrating Automated Scalability Assessment into DevOps

Alberto Avritzer et al.

The Eighth International Workshop on Load Testing and Benchmarking of Software Systems (LTB 2020)

eSulabSolutions

Page 2: Our Team

Avritzer et al.: Automated Scalability Assessment in DevOps

eSulabSolutions, near Princeton University

Alberto Avritzer

André van Hoorn

Henning Schulz

Vincenzo Ferme

Barbara Russo

Andrea Janes

Daniel Menasché

Vilc Rufino

Page 3: About Alberto Avritzer

• Senior Researcher at Siemens and AT&T Bell Labs for about 24 years
• Published over 70 papers in journals, refereed conference proceedings, and book chapters (http://dblp.uni-trier.de/pers/hd/a/Avritzer:Alberto)
  • Resilience Assessment and Evaluation of Computing Systems. Springer, 2012
  • Handbook of Software Aging and Rejuvenation. World Scientific, May 2020 (https://worldscientific.com/worldscibooks/10.1142/11673)
• Siemens/MTA: The NY Subway PA/CIS Scalability Assessment
• Load testing and performance analysis of AT&T operations support systems:
  • Monitoring for software aging and rejuvenation (1993)
  • Performance testing using Markov chains (1995)
• Founder of eSulabSolutions (https://esulabsolutions.godaddysites.com/):
  • Automated scalability assessment in DevOps and microservice architectures

Page 4: Motivation: Recent Scalability Related Disasters

NYS Labor website crashes

Americans crash the Canadian immigration

Page 5: Motivation: Influence of Poor Performance on the Success of Businesses

Page 6: Examples of Operational Profile Representations

[Figure: observed load situations (x-axis: Time, y-axis: Load Level)]

[Figure: empirical distribution of load situations (x-axis: Load level, y-axis: Rel. Freq.)]
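The step from the first representation to the second can be sketched in a few lines. This is a minimal illustration, assuming the observed load is available as a list of load levels (for example, concurrent users per time window); the function name `empirical_distribution` and the sample values are hypothetical and not taken from the slides.

```python
from collections import Counter


def empirical_distribution(load_samples):
    """Map each observed load level to its relative frequency."""
    if not load_samples:
        raise ValueError("need at least one observation")
    counts = Counter(load_samples)
    total = len(load_samples)
    # Sort by load level so the result reads like the histogram on the slide.
    return {level: counts[level] / total for level in sorted(counts)}


# Hypothetical observations: concurrent users seen in consecutive time windows.
observed = [50, 100, 100, 150, 150, 150, 200, 100]
distribution = empirical_distribution(observed)
```

The relative frequencies sum to 1, which is what later allows them to be used as weights for load tests.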

Page 7: DevOps is …

»DevOps is … a set of practices intended to reduce the time between committing a change to a system and the change being placed into normal production, while ensuring high quality.«

High quality: Secure, Reliable, Resilient, Fast, Cost-efficient, Elastic

Practices: Chaos Engineering, Live Experimentation, Automation, CD Pipelines, Cultural Change, Microservices, Monitoring, Polyglot, Agile, Fast Feedback, Docker, *-as-Code

Page 8: My Questions of Interest

Avritzer and van Hoorn: Operational Profile Data for Continuous Dependability Assessment in DevOps

A. van Hoorn (U Stuttgart) – Continuous Engineering of Dependable Distributed Software Systems

• How can DevOps practices support quality assurance?
• How to integrate "established" quality assurance with DevOps practices?

Page 9: PPTAM: Production and Performance Testing Based Application Monitoring

Page 10: Scalability Requirement Automated Measurement

Page 11: The Classic Load Testing Approach … and Classic Problems

[Figure: the classic load testing approach. Labels: Workload Specification (Expert), Load Driver, LOAD, SUT, Test Results, Evaluation (Expert/Basic Rules)]

Classic problems:

• There are no suitable load tests
• Complex analysis of performance regressions
• High manual effort for maintaining load tests
• Load tests need much time to execute

Page 12: Load Testing in Continuous Delivery Pipelines … How Problems Get Worse

[Figure: continuous delivery pipeline. Stages shown: Implementation, Build, Functional Testing, Performance Testing]

Continuous delivery characteristics:

• Pipeline automation
• Fast & frequent releases
• Complex load tests for every release impossible
• Service-focus requires multiple tests

vs. the classic load testing problems:

• There are no suitable load tests
• Complex analysis of performance regressions
• High manual effort for maintaining load tests
• Load tests need much time to execute

Page 13: PPTAM: Production and Performance Testing Based Application Monitoring

Page 14: Quantitative Assessment of Deployment Alternatives

• Challenge: assess performance of architectural deployment alternatives (e.g., number of replicas, CPU/memory allocation, technology stack) under fuzzy requirements
• Our approach
  • Use operational data to generate and weight load tests
  • Measure baseline requirements
  • Metric allows quantitative comparison of deployment alternatives
  • Builds on previous work from telecommunication systems: Avritzer, A., Weyuker, E.J.: The automatic generation of load test suites and the assessment of the resulting software. IEEE Trans. Softw. Eng. 21(9) (Sep 1995)

Alberto Avritzer, Vincenzo Ferme, Andrea Janes, Barbara Russo, Henning Schulz, and André van Hoorn: A Quantitative Approach for the Assessment of Microservice Architecture Deployment Alternatives by Automated Performance Testing. In Proceedings of the 12th European Conference on Software Architecture (ECSA). LNCS, Springer, 2018 (Accepted)

Page 15: Overview of Approach

[Figure: overview of the approach. Production: observed load situations (x-axis: Time, y-axis: Load Level). Step 1: empirical distribution of load situations (x-axis: Load level, y-axis: Rel. Freq.). Step 2: empirical distribution of load situations over sampled load levels (x-axis: Sampled load level, y-axis: Aggr. Rel. Freq.; values shown: 0.12, 0.14, 0.20, 0.16, 0.11). Step 3: test results from the baseline test and the sampled load tests, given the scalability criteria and the deployment conf. Domain Metric: 0.734]
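The aggregation of the empirical distribution onto a few sampled load levels can be illustrated as follows. The slide does not say how observed levels are assigned to sampled levels, so this sketch assumes each observed level is counted toward the smallest sampled level at or above it, with anything beyond the highest sampled level counted toward that highest level. The function name and all numbers are hypothetical.

```python
import bisect


def aggregate_to_sampled_levels(distribution, sampled_levels):
    """Sum relative frequencies per sampled load level.

    Assumption: an observed level belongs to the smallest sampled level
    that is >= the observed level; loads above the top sampled level are
    added to the top level.
    """
    levels = sorted(sampled_levels)
    aggregated = {level: 0.0 for level in levels}
    for observed_level, freq in distribution.items():
        idx = bisect.bisect_left(levels, observed_level)
        idx = min(idx, len(levels) - 1)
        aggregated[levels[idx]] += freq
    return aggregated


# Hypothetical empirical distribution: load level -> relative frequency.
observed_distribution = {30: 0.25, 50: 0.25, 120: 0.25, 400: 0.25}
aggregated = aggregate_to_sampled_levels(observed_distribution, [50, 100, 150])
```

Whatever binning rule is used, the aggregated frequencies still sum to 1, so each sampled load test carries a weight that reflects how often production operates near that level.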

Page 16: Experiments

• Production: 12 microservices, custom op. mix
• Sampled load tests (steps 1, 2): empirical distribution of load situations with 6 load levels: 50, 100, 150, 200, 250, 300 concurrent user sessions
• Deployment config. (step 3): 10 configurations, varying replicas, CPU, and RAM
• Scalability criterion: Scal = avg + 3σ
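The criterion Scal = avg + 3σ can be sketched as a simple pass/fail check. The slide does not state which measurements avg and σ are computed from, so this example assumes they are the mean and population standard deviation of response times from the baseline test; the function names and sample values are hypothetical.

```python
from statistics import mean, pstdev


def scalability_threshold(baseline_response_times):
    """Return avg + 3 * sigma over the baseline measurements.

    Assumption: sigma is the population standard deviation; a sample
    standard deviation (statistics.stdev) would be an equally plausible reading.
    """
    return mean(baseline_response_times) + 3 * pstdev(baseline_response_times)


def passes(measured_avg_response_time, threshold):
    """An API passes when its measured average stays at or below the threshold."""
    return measured_avg_response_time <= threshold


# Hypothetical baseline response times in seconds for one API.
baseline = [1.0, 2.0, 3.0]
threshold = scalability_threshold(baseline)
```

With these values the threshold is about 4.45 s, so a load test measuring an average of 4.0 s would pass and one measuring 5.0 s would fail.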

Page 17: Experiment Results: Computation of Domain Metric (1/2)

Deployment Configuration: 1 GB RAM, 0.25 CPU, 1 Replica

Custom Op. Mix

Users   Aggr. Rel. Freq.
50      0.10582
100     0.18519
150     0.22222
200     0.22222
250     0.20370
300     0.06085

Contrib. to Domain Metric at 250 users: Max: 0.20370, Actual: 0.13580

API          Scalability Criteria
GET /        PASS
GET /cart    PASS
POST /item   FAIL
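The numbers on this slide are consistent with a contribution equal to the aggregated relative frequency multiplied by the share of APIs that pass: 0.20370 × 2/3 ≈ 0.13580. The sketch below encodes that reading. It is an assumption inferred from the three-API example, not a definition given on the slide, and in particular the equal weighting of APIs is assumed.

```python
def contribution(aggr_rel_freq, api_results):
    """Contribution of one load level to the domain metric.

    Assumption: every API counts equally, so the contribution is the
    load level's frequency times the fraction of APIs that passed.
    """
    if not api_results:
        raise ValueError("need at least one API result")
    passed = sum(1 for ok in api_results.values() if ok)
    return aggr_rel_freq * passed / len(api_results)


# Values from the slide for the load level with frequency 0.20370.
api_results = {"GET /": True, "GET /cart": True, "POST /item": False}
actual = contribution(0.20370, api_results)
maximum = contribution(0.20370, {name: True for name in api_results})
```

If every API passed, the contribution would equal the frequency itself, which matches the "Max" value on the slide.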

Page 18: Experiment Results: Computation of Domain Metric (2/2)

Deployment Configuration: 1 GB RAM, 0.25 CPU, 1 Replica

Users   Contrib. to Domain Metric
50      0.10582
100     0.18519
150     0.22222
200     0.07999
250     0.13580
300     0.04729

Domain Metric (step 4): Max: 1, Actual: 0.77631
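The domain metric on this slide is the sum of the per-load-level contributions, which can be checked directly with the values shown. The variable names below are illustrative; the numbers are the ones from the two result slides.

```python
# Aggregated relative frequencies per load level (users -> frequency).
frequencies = {
    50: 0.10582, 100: 0.18519, 150: 0.22222,
    200: 0.22222, 250: 0.20370, 300: 0.06085,
}

# Actual contributions per load level for the configuration
# 1 GB RAM, 0.25 CPU, 1 replica.
contributions = {
    50: 0.10582, 100: 0.18519, 150: 0.22222,
    200: 0.07999, 250: 0.13580, 300: 0.04729,
}

# The best possible score is the sum of all frequencies, the actual
# score is the sum of the contributions that were achieved.
max_domain_metric = sum(frequencies.values())
domain_metric = sum(contributions.values())

# Load levels at which the configuration loses part of its weight.
degraded_levels = [u for u in frequencies if contributions[u] < frequencies[u]]
```

The sums reproduce the slide's values of 1 and 0.77631, and the degraded levels are 200, 250, and 300 users.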

Page 19: Experiment Results: Single-Metric Comparison of Alternatives

Page 20: Experiment Results: Visual Comparison of Alternatives

[Figure: Contrib. to Domain Metric (y-axis) per sampled load test (x-axis); series: Max Contrib. and each Depl. Conf.]

Page 21: PPTAM: Production and Performance Testing Based Application Monitoring

Page 22: PPTAM Mirai

Avritzer and Russo: Operational Profile Data for Continuous Dependability Assessment in DevOps

Page 23: Mirai BotNet

• Mirai is malware that has been used to turn networked devices running Linux into remotely controlled bots
• We use it to attack the system. It can perform different types of attack
  • So far, we have explored HTTP, SYN, and ACK attacks

Page 24: PPTAM, No attack and attack plots with 50 users

Page 25: Identify Scalability Impacting Architecture Components

• Apply the approach to a large telecom system

[Figure labels: Arch Cockpit, APM, Mirai]

Page 26: Identify Scalability Impacting Architecture Components from Performance Testing Log

Counter_name       Measurement   Load   Service_name
Response Time_X    Y             L      S
Response Time_X    Y             L      S
Response Time_X    Y             L      S
Response Time_X    Y             L      S

What is the problem with this log?

Page 27: Compute Scalability Baseline, Normalized Distance from Baseline, and Linear Regression Slope

Counter_Name      Low_Load_RT   Std_RT     Baseline_RT
Response Time X   2.488602      3.513397   12.033352
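The slide names two per-counter quantities without defining them. The sketch below is one plausible reading, not the authors' definition: the slope is an ordinary least-squares fit of response time against load, and the normalized distance is assumed to be the measured response time's excess over the baseline, divided by the baseline. Function names and sample values are hypothetical.

```python
def linear_regression_slope(loads, response_times):
    """Ordinary least-squares slope of response time versus load."""
    n = len(loads)
    if n < 2 or n != len(response_times):
        raise ValueError("need two or more paired observations")
    mean_x = sum(loads) / n
    mean_y = sum(response_times) / n
    sxx = sum((x - mean_x) ** 2 for x in loads)
    if sxx == 0:
        raise ValueError("loads must not all be identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(loads, response_times))
    return sxy / sxx


def normalized_distance(response_time, baseline_rt):
    """Assumed definition: relative excess of a measurement over the baseline."""
    return (response_time - baseline_rt) / baseline_rt


# Hypothetical measurements for one counter at four load levels.
loads = [50, 100, 150, 200]
response_times = [2.0, 4.0, 6.0, 8.0]
slope = linear_regression_slope(loads, response_times)
distance = normalized_distance(15.0, 12.0)
```

A steep slope indicates a counter whose response time grows quickly with load, and a large positive distance indicates a counter already beyond its baseline; both are natural inputs for ranking components.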


Page 29: Multivariate analysis used to prioritize re-factoring using Slope and Normalized Distance Profile

Page 30: Summary

• DevOps provides access to real-time data and enables the integration of production and testing
• Use cases
  • Performance, scalability, security
  • Other "-ilities": survivability, resilience
• Provide detailed guidance to architects and developers in real time:
  • Domain-metric-based scalability assessment dashboards
  • Prioritize component re-factoring based on load testing results

Page 31: Publications

• A. Avritzer, D. S. Menasché, V. Rufino, B. Russo, A. Janes, V. Ferme, A. van Hoorn, H. Schulz, PPTAM: Production and Performance Testing Based Application Monitoring. ICPE Companion 2019: 39-40
• A. Avritzer, V. Ferme, A. Janes, B. Russo, H. Schulz, A. van Hoorn, A Quantitative Approach for the Assessment of Microservice Architecture Deployment Alternatives by Automated Performance Testing. ECSA 2018: 159-174
• A. Avritzer, D. S. Menasché, V. Rufino, B. Russo, A. Janes, V. Ferme, A. van Hoorn, H. Schulz, Scalability Assessment of Microservice Architectural Configurations: A Domain-based Approach Leveraging Operational Profiles and Load Tests. Journal of Systems and Software, to appear 2020.