Continuous Large-Scale Testing of Real-Time Communication Systems


Upload: robert-luebke

Post on 15-Apr-2017


Page 1: Continuous Large-Scale Testing of Real-Time Communication Systems

© 2016 Citrix

Continuous Large-Scale Testing of Real-Time Communication Systems

Robert Lübke, PhD

Robin Lungwitz

Agile Testing Days 2016

Page 2: Continuous Large-Scale Testing of Real-Time Communication Systems


Who are we?

Robert Lübke, PhD

Senior Software Engineer 2

Quality Engineering Tech Lead

bluetoptesting.com

Robin Lungwitz

Software Engineer 2

Page 3: Continuous Large-Scale Testing of Real-Time Communication Systems


To deliver a more human, intelligent way to connect at work

Page 4: Continuous Large-Scale Testing of Real-Time Communication Systems


Agenda

I. RTC: special requirements

II. Large-Scale Testing: What? Why? How?

III. Our Large-Scale Testing Journey

IV. Learnings, Good Practices

V. Discussion

45'

Page 5: Continuous Large-Scale Testing of Real-Time Communication Systems


Possible Problems with online conferencing

Bad video quality:

• Artifacts in the video image

• Reduced resolution

Bad audio quality:

• Short interruption in speech

• High delay, hence no interactivity

Disconnections:

• Partial or complete interruption of the audio/video transfer

[Speech bubble illustrating choppy audio: "I cano... hea... CHR... ything!"]

Page 6: Continuous Large-Scale Testing of Real-Time Communication Systems


Real-Time Communication: Specific Issues & Requirements

Internet

Page 7: Continuous Large-Scale Testing of Real-Time Communication Systems


Large-Scale Testing – Why do it? What is it?

A couple of things are not covered by unit / integration / system-level tests:

• Need to verify functionality under day-to-day usage patterns

– Load Testing: usually the expected load over a longer period of time, to reflect real-life scenarios

• Check worst-case situations beyond normal operating parameters

– Stress / Overload Testing: exceed specified limits; does the system degrade gracefully?

– identify starting points to improve scaling / efficiency

• Find out concrete performance characteristics on concrete hardware

– Performance Testing: measure transaction timings to find out system limitations
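To make "measure transaction timings" concrete, here is a minimal sketch, not taken from the talk. It times a transaction repeatedly and reports a few summary figures. The function names, the choice of a nearest-rank 95th percentile, and the stand-in transaction are all assumptions for illustration.

```python
import time


def measure_transactions(transaction, repetitions):
    """Run `transaction` repeatedly and return each duration in seconds."""
    durations = []
    for _ in range(repetitions):
        start = time.perf_counter()
        transaction()
        durations.append(time.perf_counter() - start)
    return durations


def summarize(durations):
    """Condense raw timings (at least one sample) into summary figures."""
    ordered = sorted(durations)
    count = len(ordered)
    # Nearest-rank 95th percentile, computed with integer arithmetic
    # to avoid floating-point surprises: rank = ceil(0.95 * count).
    p95_rank = (95 * count + 99) // 100
    return {
        "count": count,
        "mean": sum(ordered) / count,
        "p95": ordered[p95_rank - 1],
        "max": ordered[-1],
    }


# Hypothetical transaction: a stand-in for one request against the system.
report = summarize(measure_transactions(lambda: sum(range(1000)), 200))
print(report)
```

In a load test the same summary would be collected at the expected load; in a stress test the load would be raised until these figures degrade.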

Page 8: Continuous Large-Scale Testing of Real-Time Communication Systems


Large-Scale Testing – How is it done?

load generators and telemetry interfaces

Ideally one should use the same tools / framework, but the tests have a different focus or goal

Page 9: Continuous Large-Scale Testing of Real-Time Communication Systems

© 2015 Citrix | Confidential

Our Large-Scale Testing Journey

Page 10: Continuous Large-Scale Testing of Real-Time Communication Systems


Large-Scale Testing the Video Platform

• Requirements:

– Large-Scale Testing: Check functionality and media quality in high-load scenarios

– Network Testing: implications of bandwidth limitations, delay, jitter, loss, reordering, …
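The network impairments listed above can be illustrated with a small simulation. This is only a sketch under stated assumptions: it models delay, jitter and loss on a list of send timestamps, lets reordering emerge from jitter, and does not model bandwidth limits. The `Impairment` type and both functions are hypothetical names, not part of the platform described in the talk.

```python
import random
from dataclasses import dataclass


@dataclass
class Impairment:
    delay_ms: float = 0.0   # fixed one-way delay
    jitter_ms: float = 0.0  # maximum random deviation from the fixed delay
    loss_rate: float = 0.0  # probability in [0, 1] that a packet is dropped


def degrade(send_times_ms, impairment, seed=0):
    """Return (arrival_time_ms, sequence_number) pairs in arrival order."""
    rng = random.Random(seed)  # seeded so that test runs are repeatable
    arrivals = []
    for seq, sent in enumerate(send_times_ms):
        if rng.random() < impairment.loss_rate:
            continue  # packet lost
        jitter = rng.uniform(-impairment.jitter_ms, impairment.jitter_ms)
        arrivals.append((sent + impairment.delay_ms + jitter, seq))
    arrivals.sort()
    return arrivals


def count_reordered(arrivals):
    """Count packets that arrive after a packet with a higher sequence number."""
    highest_seen = -1
    reordered = 0
    for _, seq in arrivals:
        if seq < highest_seen:
            reordered += 1
        else:
            highest_seen = seq
    return reordered


# One packet every 20 ms, as a rough stand-in for an audio stream.
sends = [i * 20.0 for i in range(100)]
received = degrade(sends, Impairment(delay_ms=50.0, jitter_ms=30.0, loss_rate=0.05))
print(len(received), "packets received,", count_reordered(received), "reordered")
```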

[Diagram labels: Test Driver, Monitoring, Remote Interface, Test Fixture, Video Conferencing Engine, Video Server Cluster]

Page 11: Continuous Large-Scale Testing of Real-Time Communication Systems


• cooperation project

• Emulation platform for creating and performing tests of distributed systems

• Network tests

• Large-scale tests

• Functional tests

Page 12: Continuous Large-Scale Testing of Real-Time Communication Systems


Large-Scale Testing the RTC Gateway

[Diagram labels: Test Driver, Script, REST, Hub, Node (several), RTC Gateway, Monitoring Platform]

x1200 (30 machines with 40 Chrome browsers each)

Problems: stability of Selenium Grid and Chrome, maintenance

Page 13: Continuous Large-Scale Testing of Real-Time Communication Systems


Current Large-Scale Testing Approach

[Architecture diagram labels: Test Driver, Cucumber Suite, Unit Test Framework; Test Client, Test Server; Message Broker, Network Degrader, Evaluation Service, Input Generator; Cloud Provider]

Libraries: Evaluation, Input Generation, Event-based Communication, Network Degradation, Infrastructure Provisioning

Page 14: Continuous Large-Scale Testing of Real-Time Communication Systems


Summary of our Large-Scale Testing Journey

Own development

– fits our needs exactly, but high maintenance effort

– need for dedicated developers

Using existing solutions

– Selenium Grid (not designed for load testing with thousands of browsers)

– JMeter (limited scope)

Hybrid approach

– custom implementation of test clients and fixtures

– standardized communication interfaces

– shared modules for recurring testing tasks (deployment, evaluation, network emulation, quality analysis, monitoring)

Page 15: Continuous Large-Scale Testing of Real-Time Communication Systems


Continuous Large-Scale Testing

• How to automate it?

– integrate test solution into CI system

– Plan when to execute automated load tests (costs! 30 c4.2xlarge instances * 24h ~$550)

– nightly / weekly / branches?

– AWS spot instances

• Automatic evaluation of statistics and graphs can be challenging.

– Find common performance metrics to check during load testing (CPU, memory, network, …)

– common load test reporting

[Chart, x-axis: Build #]
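The cost figure quoted on this slide (30 c4.2xlarge instances for 24 h at roughly $550) implies about $0.76 per instance-hour. The sketch below backs out that rate and compares two schedules. The schedules (four 24-hour runs versus thirty 2-hour runs per month) are hypothetical examples, not figures from the talk, and spot pricing is deliberately left out.

```python
# Figures taken from the slide: 30 instances, 24 hours, roughly $550 in total.
INSTANCES = 30
HOURS = 24
TOTAL_COST_USD = 550.0


def implied_hourly_rate(total_cost, instances, hours):
    """Back out the per-instance hourly price implied by one run."""
    return total_cost / (instances * hours)


def monthly_cost(runs_per_month, hours_per_run, instances, hourly_rate):
    """Cost of a recurring load test at a fixed per-instance hourly rate."""
    return runs_per_month * hours_per_run * instances * hourly_rate


rate = implied_hourly_rate(TOTAL_COST_USD, INSTANCES, HOURS)

# Hypothetical schedules: (runs per month, hours per run).
schedules = {"weekly, 24 h": (4, 24), "nightly, 2 h": (30, 2)}
for name, (runs, hours) in schedules.items():
    cost = monthly_cost(runs, hours, INSTANCES, rate)
    print(f"{name}: ${cost:,.0f} per month")
```

A calculation like this makes the trade-off between nightly, weekly and per-branch execution explicit before the tests are wired into the CI system.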

Page 16: Continuous Large-Scale Testing of Real-Time Communication Systems


Large-Scale Testing in the Test Pyramid

• As much large-scale testing as possible should be done on lower layers.

• But in the end, systems are integrated with each other, forming a platform.

• Issues occurred that could have been caught by an E2E load test of all (many) systems in integration.

[Test pyramid layers: Unit, Component, Integration, System, E2E]

Page 17: Continuous Large-Scale Testing of Real-Time Communication Systems


Learnings, Good Practices

Page 18: Continuous Large-Scale Testing of Real-Time Communication Systems


Learnings

• RTC has special requirements – RTC testing as well!

– Existing test tools & frameworks mostly don't scale (to our requirements).

– Existing load test tools often have a very limited scope.

– Developing proper solutions for large-scale testing can take as much effort as developing the product itself. Start developing test solutions early!

Page 19: Continuous Large-Scale Testing of Real-Time Communication Systems


Good practice: Run existing tests on top of load tests

• Often load tests are executed like "fire & forget"

• Combine (automated) functional tests with load tests by running them in parallel

– just putting a lot of load onto the servers and checking statistics is often not enough

– reuse existing (automated) tests

– evaluation of functionality & media quality

– network / failure mode / reconnect tests
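Running an existing functional test on top of a load test can be sketched as follows. This is a minimal illustration, not the authors' tooling: `FakeServer` is a hypothetical stand-in for the system under test, and the worker count and sleep interval are arbitrary. Background threads generate load while an unchanged functional check runs in the foreground.

```python
import threading
import time


class FakeServer:
    """Hypothetical stand-in for the system under test."""

    def __init__(self):
        self._lock = threading.Lock()
        self.requests = 0

    def handle(self, payload):
        with self._lock:
            self.requests += 1
        return payload.upper()


def generate_load(server, stop_event):
    """Keep sending requests until the functional test has finished."""
    while not stop_event.is_set():
        server.handle("load")
        time.sleep(0.001)


def run_functional_test_under_load(server, functional_test, workers=4):
    """Start background load, run the functional test, then stop the load."""
    stop_event = threading.Event()
    threads = [
        threading.Thread(target=generate_load, args=(server, stop_event))
        for _ in range(workers)
    ]
    for thread in threads:
        thread.start()
    try:
        return functional_test(server)
    finally:
        stop_event.set()
        for thread in threads:
            thread.join()


def echo_test(server):
    """An existing functional check, reused without modification."""
    return server.handle("hello") == "HELLO"


server = FakeServer()
print("functional test passed under load:", run_functional_test_under_load(server, echo_test))
```

The point of the design is that the functional test does not know about the load: the same check can run with `workers=0` in a normal test suite.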

Page 20: Continuous Large-Scale Testing of Real-Time Communication Systems


Good practice: Reuse of upstream test solutions

• usually other teams are consuming RTC platform services, libraries, …

• running (load) tests from these teams can show integration bugs earlier

• especially beneficial if upstream test solutions are already automated

• challenge: build your current version of the service you want to test into the existing test solution

• shared test cases among product layers / teams

Page 21: Continuous Large-Scale Testing of Real-Time Communication Systems


Good practice: What to use to generate the load?

[Diagram labels: load generator, robot, real clients (E2E testing)]

Page 22: Continuous Large-Scale Testing of Real-Time Communication Systems


Good practice: What should be mocked in a load test?

[Diagram labels: complete environment (no mock), isolation (all dependencies mocked)]

Page 23: Continuous Large-Scale Testing of Real-Time Communication Systems


Good practice: Generic interfaces to control test clients

• previously often proprietary interfaces were used

• better: make it reusable from other test solutions as well

• examples: MQTT, REST

– client libraries for most common languages and platforms

– keep the protocol simple!

– asynchronous message based communication

– efficiency
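A generic, asynchronous, message-based control interface with a simple protocol might look like the sketch below. Everything here is an assumption for illustration: `InProcessBroker` is an in-process stand-in for a real broker (for example an MQTT broker), and the topic names and the JSON command format are invented, not the protocol used by the authors.

```python
import json
import queue


class InProcessBroker:
    """Hypothetical stand-in for a real message broker such as an MQTT broker."""

    def __init__(self):
        self._topics = {}

    def subscribe(self, topic):
        """Return a queue that receives every message published to `topic`."""
        inbox = queue.Queue()
        self._topics.setdefault(topic, []).append(inbox)
        return inbox

    def publish(self, topic, message):
        """Serialize `message` as JSON and deliver it to all subscribers."""
        payload = json.dumps(message)
        for inbox in self._topics.get(topic, []):
            inbox.put(payload)


class LoadClient:
    """A test client (load generator) controlled only through messages."""

    def __init__(self, client_id, broker):
        self.client_id = client_id
        self.broker = broker
        self.commands = broker.subscribe(f"clients/{client_id}/commands")
        self.in_call = False

    def process_pending(self):
        """Handle all queued commands and report each result as an event."""
        while True:
            try:
                payload = self.commands.get_nowait()
            except queue.Empty:
                return
            command = json.loads(payload)
            if command["action"] == "join":
                self.in_call = True
            elif command["action"] == "leave":
                self.in_call = False
            self.broker.publish(
                "events",
                {"client": self.client_id, "action": command["action"], "in_call": self.in_call},
            )


# Any test solution can drive the client by publishing plain JSON commands.
broker = InProcessBroker()
events = broker.subscribe("events")
client = LoadClient("c1", broker)
broker.publish("clients/c1/commands", {"action": "join"})
client.process_pending()
print(events.get_nowait())
```

Because the commands are plain JSON on named topics, a different test solution written in another language could control the same client without sharing any code.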

[Diagram labels: Test Solution 1, Test Solution 2, Test Solution n; GENERIC INTERFACE; Test Client (load generator); Server under Test]

Page 24: Continuous Large-Scale Testing of Real-Time Communication Systems


Good practice: Use production data pattern as load test input

• Various issues slipped through our testing

• Need for data to create representative load test cases

– usage patterns

– client platform distributions (desktop / mobile / browser / telephone)

– network conditions

– codec distribution, resulting in different load profiles
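Turning a production distribution into a load test setup can be as simple as splitting the simulated clients proportionally. The sketch below does this with the largest-remainder method. The platform shares are made-up placeholder values, not production data from the talk, and the function name is hypothetical.

```python
# Hypothetical platform shares in percent; real values would come from telemetry.
PLATFORM_SHARES = {"desktop": 55, "mobile": 25, "browser": 15, "telephone": 5}


def allocate_clients(shares, total_clients):
    """Split `total_clients` across platforms in proportion to `shares`.

    Uses the largest-remainder method with integer arithmetic, so the
    counts always add up to exactly `total_clients`.
    """
    total_share = sum(shares.values())
    counts = {}
    remainders = {}
    for name, share in shares.items():
        counts[name], remainders[name] = divmod(total_clients * share, total_share)
    leftover = total_clients - sum(counts.values())
    # Hand the remaining clients to the platforms with the largest remainders.
    for name in sorted(remainders, key=remainders.get, reverse=True)[:leftover]:
        counts[name] += 1
    return counts


print(allocate_clients(PLATFORM_SHARES, 1200))
```

The same allocation could be applied to codecs or network conditions once their production distributions are known.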

Page 25: Continuous Large-Scale Testing of Real-Time Communication Systems


That's it! Conclusion

• Real-Time Communications: special requirements

• Start developing proper tools and test solutions early.

• Modular approach: provide small reusable libraries / services

• Telemetry and monitoring are essential to capture real-world scenarios and derive representative load test cases

• Run large-scale tests continuously

• Pay attention to the costs

Page 26: Continuous Large-Scale Testing of Real-Time Communication Systems


I. RTC: special requirements

II. Large-Scale Testing: What? Why? How?

III. Our Large-Scale Testing Journey

IV. Learnings, Good Practices

V. Discussion

That's it!

45'