advancing testing program maturity in your organization

Intended for Knowledge Sharing only

Advancing Testing Program Maturity

Optimizely User Group – San Francisco

Oct 2017


RAMKUMAR RAVICHANDRAN
Director, Analytics at Visa, Inc.
Data Driven Decision Making via Insights from Analytics, A/B Testing and Research

ROGER CHANG
Manager, Analytics & AB Testing
Data Driven Decision Making via Insights from Analytics, A/B Testing and Research

MIMI LE
Senior Director, Product Launch Management


Quick recap of what it is


Just venting it out a little…


JUST CHANGE THE COLOR OF THE BUTTON, DAMN IT!


WAIT A MINUTE! WHAT WILL YOU DO WITH THAT MUCH MONEY, SENORE?


YOU BROKE IT, DUDE!!!


THE NUMBERS AREN’T IN LINE – YOU MUST HAVE SCREWED UP!


WHY WOULD I NEED YOU, WHEN I GOT ARTIFICIAL INTELLIGENCE?





A typical Testing Program in Utopia…


IF GOD DECIDED TO CREATE AN A/B TEST PROGRAM, WHAT WOULD IT LOOK LIKE…


• Every major product change has been iterated, quantified & contextualized

• A centralized but modular, seamless & integrated Learn, Listen and Test Framework covering all domains

• A Single-Source-Of-Truth Testing Datamart within the Organization’s Datalake for year end Program effectiveness studies

• Unified Workflow & Project Management with searchable Knowledge repository & centralized Admin capabilities

• Programmatic Testing with human intervention protocols




What is a Testing Maturity Curve?


TESTING PROGRAM MATURITY CURVE

[Figure: maturity curve. X-axis: Phases of Maturity (SELL, SCALE, EXPAND, DEEPEN, TRANSFORM). Y-axis: Value Add. A "We are here" marker sits on the curve.]

TESTING PROGRAM MATURITY - PHASES

PHASES | KEY ACTION ITEMS | SUCCESS CRITERIA

SELL
• Buy-ins across leadership & stakeholders
• Scrappy quick win tests
• Allaying fears of Dev/QA/Security org
• Tangible KPI impact
• Sponsor Business Units and victory use cases: Prod, UX, Mktg
• Approval for a cross functional team and Testing environment set up

SCALE
• Agile Workflow set up
• Test Pipeline created & shared
• Testing Dashboard
• Readouts shared with stakeholders
• A successful rollout because of Test & Learn Initiative (Use Case Driven -> Numbers Driven -> Experience Driven)

EXPAND
• Complexity and scope of the tests
• Multi (Variate, Pages, Experience, Device, Domain)
• Enterprise Framework (Server Side Integration, Datamart)
• Enterprise rollup of Operational and Strategic Impact
• Searchable Knowledge bank & Feedback loop into Design Stage

DEEPEN
• Content Delivery, Personalization, Champion-Challenger Set up
• Platform: Cross Integration with Analytics, ML, Session Replays & Research at Application Layer
• Predictive Analytics on Test Impact: Test-Mix Models (Scenario Planning & Scoring)
• Unified Workflow & Project Management

TRANSFORM
• Testing formalized within Dev Cycle
• Algorithmic Test Management (Traffic adjustments, winner ramps, combinatorial tests)
• Test Modularity & Portability
• Testing as “Monetizable” Product
• Test & Learn made self serve via Trainings for Citizen Experimenters
• Cross Pollination across BU – within the DNA of the organization

TESTING PROGRAM MATURITY - EXPERIENCE & LEARNING

PHASES | CHALLENGES | RESOLUTION THAT WORKED FOR US

SELL
• Executive Buy-ins
• Pushback from Security, Branding, Integration, Development & QA teams
• Proof-Of-Concept (guard against weak POC & Sponsor BU)
• Risk Ownership with executive air cover/Shared limelight
• CMS/Bug Fixes – Good and bad!

SCALE
• Resourcing & Funding support: Availability & size of shared team
• Sandbox availability/sync with release cycles/broken tagging
• Production ramp
• Show progress even with persisting challenges
• Successful Project delivery
• Dashboards & Communication readouts

EXPAND
• Sample size, cookie issues and cross domain traffic. Interaction problems.
• Consistency and integration issues of tagging and logic between front end and back end, and within the back ends themselves
• Knowledge Management site and dashboard
• Instrumentation request for the Engineering team to link the various cookies and identifiers

DEEPEN
• Huge investment and potential tradeoffs with re-architecting instrumentation
• Resourcing and Funding into platform set-up & product builds
• Potentially Test-Mix Models on manually scraped metadata (less rigorous)
• Server Side product set up

TRANSFORM
• Dependent on successful transition from previous phase
• Resourcing & funding
• Test & Learn made self serve via Trainings for Citizen Experimenters. Brown bags, whitepapers?




What has worked for us so far…


KNOWING WHAT WE ARE TESTING & HOW MUCH TO EXPECT…

• Message: Clear and crisp Value Prop and Call to Action (CTA)
• Prominence: Trendy and easy to spot
• Placement: Easily spotted and fitting with the Consumer’s mental model
• Flow: Navigation and Pathing
• Form: Minimal and relevant elements only
• Personalization: Personalized with behavioral insights
• Content: Algorithmic delivery/Contextualized Content
• Performance: Platform Performance (Latency, Uptime, Errors)

[Axis label on the slide: Expected Impact]


PLANNING IT END-TO-END

PHASES: Strategy, Measure, Analyze, Launch

ACTIONS
• Define the question to be answered and why, design the changes, know the cost and finalize success criteria
• Analytics team creates direct/proxy metrics to measure the performance
• Instrument metrics if needed
• Decide on the Research Methodology based on Analytical findings
• Quantify/Analyze the impact
• Size the potential impact of launching

DETAILS

Questions
• Target Customers
• Where and What is being checked?
• Why is this even being considered?
• Target Metrics and success criteria

Primary Metrics, e.g.,
• Click Through Rate
• NPS

Secondary Metrics
• Repeat Visits
• Lifetime Value

Research Methods
• Attitudinal vs. Behavioral
• Qualitative vs. Quantitative
• Context for Product Use

Factors deciding Research Methods
• Speed of execution
• Cost of execution
• Reliability
• Product Development Stage

Factors deciding eventual rollout (in order of priority)
• Strategic need
• Estimated impact calculation from Analytics
• Findings from other sources (Data Analytics/Mining, Consumer Feedback


KNOWING WHEN TO SET UP AN A/B TEST AND WHEN NOT TO…

Each method is listed with its Description and four Factors: Speed, Cost, Inference, Dev Stage.

Prototyping
• Description: Create & test prototypes internally (externally, if needed)
• Speed: Quickest (HTML Prototypes)
• Cost: Inexpensive (Feedback incentives)
• Inference: Directional
• Dev Stage: Ideation Stage

Usability Studies
• Description: Standardized Lab experiments – Panel/s of employees/friends/family
• Speed: Quick (Panel, Questions, Read)
• Cost: Relatively expensive (+Lab)
• Inference: +Consistency across users
• Dev Stage: Ideation Stage

Focus Group
• Description: In-depth interviews for Feedback
• Speed: Slow (+Detailed interviews)
• Cost: Expensive (+Incentive +Time)
• Inference: +Additional context on Why?
• Dev Stage: Ideation Stage

Surveys & Feedback
• Description: Email/Pop-up Surveys
• Speed: Slower (+Response rate)
• Cost: Expensive (Infra to send, track & read)
• Inference: +Strength of numbers
• Dev Stage: Ideation/Dev/Post Launch

Pre-Post
• Description: Roll out the changes and then test for impact
• Speed: Slower (Dev+QA+Launch+Release cycle)
• Cost: Costly (+Tech resources)
• Inference: +Possible Statistical Significance but risk of bad experience
• Dev Stage: Post Launch

A/B Testing
• Description: Different experiences to users and then measure delta
• Speed: Slowest (+Sampling +Profiling +Statistical Inferencing)
• Cost: Very Costly (+Tech +Analytics +Time)
• Inference: +Rigorous (Statistical Significance). *Risk of bad experience reduced.
• Dev Stage: Pre Launch (after Dev)


PAYING YOUR DUES – RIGHT WORKFLOW MANAGEMENT & CROSS-FUNCTIONAL OWNERSHIP

• A/B Test Analyst (Analytics): The driver of the testing program. Involved from start to finish, up until the hand-off of a successful test to its respective product owner. An SME in the Optimizely tool and owner of test setup, deployment, and analysis.

• Product Partner: Talks to and brings in the right people for different steps of the process. Offers product’s perspective in terms of gatekeeping duties on test ideas. Well connected to different product owners and acts as the liaison towards the product team.

• QA Partner: Helps ensure that there are no bugs in the test setup, from a usability standpoint.

• Technology Partner: Offers consultation on feasibility for tests, assists in setup of advanced tests.

• Design Partner: Helps the team germinate ideas and gives the team visuals to work from in a test.

Workflow stages: Ideation -> Prioritization/Grooming -> Setup -> QA -> Deployment -> Analysis -> Implementation

Owner groups shown across the stages:
• Analytics, Product, Design, Tech
• Analytics, QA
• Analytics, Product


…AND FOCUS ON FULL SPECTRUM OF OPERATIONAL METRICS

Operational Program KPIs

• # of Tests run per month

• % Successful tests

• % Learning Tests

• % Workaround/Bug fix Tests

• #Channels Tested on

• Time from ideation to deployment

• Time from test outcome to product implementation

• Program RoI

• Stakeholder NPS

• KPI Delta vs. Universal Control


…both raw and YoY growth forms
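A few of the operational KPIs above can be computed from a simple test log. The sketch below is only an illustration: the `ExperimentRecord` fields, the outcome labels ("win", "learning", "bugfix") and the choice to count only months that had at least one deployment are assumptions, not part of the original program.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import date


@dataclass
class ExperimentRecord:
    """One row of a hypothetical test log (field names are assumptions)."""
    name: str
    ideated: date
    deployed: date
    outcome: str   # assumed labels: "win", "learning", "bugfix"
    channel: str


def program_kpis(records):
    """Summarize a test log into a handful of operational program KPIs."""
    if not records:
        raise ValueError("at least one test record is required")
    total = len(records)
    outcomes = Counter(r.outcome for r in records)
    # Months are counted only if at least one test was deployed in them.
    months = {(r.deployed.year, r.deployed.month) for r in records}
    lead_times = [(r.deployed - r.ideated).days for r in records]
    return {
        "tests_per_month": total / len(months),
        "pct_successful": 100 * outcomes["win"] / total,
        "pct_learning": 100 * outcomes["learning"] / total,
        "pct_bugfix": 100 * outcomes["bugfix"] / total,
        "channels_tested": len({r.channel for r in records}),
        "avg_days_ideation_to_deployment": sum(lead_times) / total,
    }
```

Running the same function on last year's log and on this year's log gives both the raw values and the inputs for the YoY growth view.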

& ANALYTICS VALUE CHAIN: STRATEGY DRIVES EVERY INITIATIVE & ANALYTICS MEASURES ITS EFFECTIVENESS!

Analytics provides insights into “actions”, Research provides context on “motivations” & Testing helps verify the “tactics” in the field, and everything has to be productized…

[Figure: Iterative Loop. Elements as listed on the slide: Strategy, Data Tagging, Data Platform, Reporting, Analytics, Research, Cognitive, Optimization.]

Key benefits
• Focus on Big Wins
• Reduced Wastage
• Quick Fixes
• Adaptability
• Assured execution
• Learning for future initiatives

…& TIMING IT CORRECTLY WITHIN THE ANALYTICS MATURITY RAMP

Testing makes sense after we know what the baseline actually looks like…


[Chart, ILLUSTRATIVE: Primary source of insights for Decision Making along the Analytics Maturity Curve. X-axis: Year 1 to Year 5. Series: Reporting, Data Analytics, User Research, A/B Testing, Advanced Analytics/Machine Learning, Data Products, Cognitive Analytics. Values are percentage shares per year.]

MAKE OR BREAK DIMENSION: PROJECT TRACKER & PERFORMANCE DASHBOARD

Test Details
• Priority: 1
• Test Description: Remove Ad banner on Yahoo home page
• Requestors/Key Stakeholders: User Experience
• Type of Change: Prominence
• Hypothesis: Removing Ad banners would reduce distraction and focus users on the CTA
• How did we arrive at this hypothesis: Product/Design Judgement
• Where will the Test happen?: Home Page
• Target Audience: All Consumers

Other details from the Test
• Standard Test Plan Document Ready: Yes
• # Test Cells: 2
• # Days needed for the Test to run for a statistically significant sample: 40
• Design Ready?: Yes
• Specific Technical Requirements?: (blank on the slide)
• Estimated Tech Effort/Cost (USD): (blank on the slide)
• Overall Test Cost (USD): (blank on the slide)
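The "# Days needed" column in the tracker can be estimated before a test is queued. The sketch below uses the standard normal-approximation sample size formula for comparing two proportions; the function names, the default 90% confidence and 80% power, and the assumption that traffic is split evenly across cells are illustrative choices, not the method used to produce the numbers on the slide.

```python
import math
from statistics import NormalDist


def sample_size_per_cell(baseline_rate, relative_lift, alpha=0.10, power=0.80):
    """Visitors needed in each cell to detect the lift (two-sided test)."""
    p1 = baseline_rate
    p2 = baseline_rate * (1 + relative_lift)
    if not (0 < p1 < 1 and 0 < p2 < 1) or p1 == p2:
        raise ValueError("rates must lie in (0, 1) and the lift must be non-zero")
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for confidence
    z_beta = NormalDist().inv_cdf(power)           # critical value for power
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return math.ceil((z_alpha + z_beta) ** 2 * variance / (p2 - p1) ** 2)


def days_needed(n_per_cell, cells, daily_visitors):
    """Days to fill every cell, assuming traffic is split evenly across cells."""
    if daily_visitors <= 0 or cells < 1:
        raise ValueError("daily_visitors and cells must be positive")
    return math.ceil(n_per_cell * cells / daily_visitors)
```

For example, `days_needed(sample_size_per_cell(0.05, 0.10), cells=2, daily_visitors=2000)` gives a rough run length for a two-cell test on a page with a 5% baseline rate. Smaller expected lifts or more test cells push the run length up quickly, which is why the tracker records both.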

ILLUSTRATIVE

MAKE OR BREAK DIMENSION: PROJECT TRACKER & PERFORMANCE DASHBOARD CONTD


ILLUSTRATIVE

Expected Impact from the Test
• Primary Metrics: Click Through Rate x%, Net Promoter Score y%
• Secondary Metrics: Repeat Visits z%, Customer Lifetime Value a%
• Estimated Benefit (USD)

Actual Impact from the Test
• Primary Metrics: Click Through Rate x%, Net Promoter Score y%
• Secondary Metrics: Repeat Visits z%, Customer Lifetime Value a%
• Estimated Benefit (USD)

& COMMUNICATION READOUT AT REGULAR CADENCE!

Objective

Understand if removing the Ad banner on the home page improves click through rate on articles and increases consumer satisfaction

[Chart: Test metrics - Click Through Rate. Series: Delta, Test, Control. Axes: Delta between Test & Control; Test/Control Values.]

Key Findings

1. Removing the banner increased CTR by 100% and NPS by 20 points, which translates to $40 M in Lifetime Value impact.

2. All of the above lifts are statistically significant at the 90% confidence level. These lifts were also consistent over the two-week time window.
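A readout claim such as "significant at the 90% confidence level" can be checked with a two-proportion z-test on the click-through counts. The sketch below is one common way to do that and is not necessarily what the testing tool computes; the function name and the input counts in the usage note are made up for illustration.

```python
import math
from statistics import NormalDist


def two_proportion_z_test(conv_control, n_control, conv_test, n_test):
    """Pooled two-sided z-test for a difference in conversion rates."""
    if n_control <= 0 or n_test <= 0:
        raise ValueError("both cells need at least one visitor")
    p_control = conv_control / n_control
    p_test = conv_test / n_test
    pooled = (conv_control + conv_test) / (n_control + n_test)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_control + 1 / n_test))
    if std_err == 0 or p_control == 0:
        raise ValueError("rates of 0% or 100% cannot be tested this way")
    z = (p_test - p_control) / std_err
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return {
        "relative_lift": (p_test - p_control) / p_control,
        "z": z,
        "p_value": p_value,
    }
```

With hypothetical counts, `two_proportion_z_test(800, 10000, 1600, 10000)` reports a relative lift of about 1.0 (that is, 100%), and the lift counts as significant at 90% confidence when `p_value < 0.10`. Repeating the check on each week of the window separately is a simple way to back the "consistent over time" statement.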


Performance data Time window: Apr 1, 1980 to Apr 14, 1980

ILLUSTRATIVE

THINGS WE WATCH OUT FOR

• Engineering overheads – every time a new flow or any major addition to the experience is introduced, new development is required. It has to go through the standard engineering prioritization route unless a SWAT team is dedicated to it.

• Tricky QA situations – the QA team should be trained to handle A/B Testing scenarios and use cases, including integration with automated QA tools. Security and FE load failure considerations apply on top of standard checks.

• Operational excellence requirements – testing of the tests in Sandbox, Staging and Live Site Testing areas. End-to-end dry runs are mandatory before launching the tests.

• Analytical nuances – Experiment Design is the supreme need! External factors can easily invalidate A/B Testing. Sample fragmentation grows with increasing #tests and complexity; there is a need for a Universal Control; impact should be checked for significance over time.

• Data needs – reliable instrumentation, Testing Tool JavaScript put in the right place with minimal performance overhead, integration with the Web Analytics tool, and a data feed with the ability to tie to other data sources (for deep dives).

• Branding Guidelines – don’t overwhelm and confuse users in the quest for multiple and complex tests; standardize but customize the experience across various channels and platforms; soft launches should be avoided as much as possible.

• Proactive internal communication, specifically to client facing teams.

• Strategic Decisions – some changes have to go in irrespective of A/B Testing findings; the question then is how to make them happen right. This is a gradual ramp with progressive learning and iterative improvements: a collection of A/B Tests and not a one-off big one.

…A/B Testing can never be a failure; by definition it is a learning on whether the change was well received by the user or not, which informs the next steps.
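One way to picture the Universal Control mentioned under analytical nuances is a program-wide holdout decided once per user, independent of any individual test. The sketch below is a generic hash-based bucketing illustration, not how Optimizely assigns traffic; the salt string, the 5% default holdout and the function names are all assumptions.

```python
import hashlib

BUCKETS = 10_000  # each bucket represents 0.01% of users


def _bucket(user_id, salt):
    """Map a user to a stable bucket number for a given salt."""
    digest = hashlib.sha256(f"{salt}:{user_id}".encode("utf-8")).hexdigest()
    return int(digest, 16) % BUCKETS


def assign(user_id, test_name, variants=("control", "test"), holdout_pct=5.0):
    """Return the experience for a user in a test, honoring the holdout."""
    if not variants:
        raise ValueError("at least one variant is required")
    if not 0 <= holdout_pct <= 100:
        raise ValueError("holdout_pct must be between 0 and 100")
    # The holdout uses a program-wide salt, so a held-out user is excluded
    # from every test and always sees the untouched experience.
    if _bucket(user_id, "universal-control") < holdout_pct * BUCKETS / 100:
        return "universal_control"
    # Salting by test name keeps assignments independent across tests.
    return variants[_bucket(user_id, test_name) % len(variants)]
```

Because the assignment is a pure function of the user id, the same visitor gets the same experience on every visit, and the "KPI Delta vs. Universal Control" metric can be read by comparing the held-out group with everyone else.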




Discussion items


QUESTIONS FOR THE AUDIENCE

Where are you in the Testing Maturity Curve?

What were your biggest bottlenecks and how did you solve them?

Were you successful in up-leveling the conversation in your organization?

How did you crack the Resourcing & Funding problem?

What are the things that worked best for you in your journey?

How did you protect Testing resources from being used up for CMS or Bug Fixes?

How did you manage the nuance between Learning and Business Objectives?

How did you convince the organization to use Testing as a driver of accountability without getting dragged into political issues?





The parting words…


KEY TAKEAWAYS


An advanced Experimentation program is the hallmark of a “Data Driven Decision Making Culture” of accountability & transparency

The benefit from Experimentation is best realized when it is anchored to Strategic goals and driven with insights from Analytics and Research

Mature organizations leverage an Algorithmic Test Management framework to achieve scalability and efficiency at optimal Program RoI levels

Organizations with a disciplined Experimentation culture within their DNA are poised to reap the benefits of higher accountability, focus on business performance and optimized Customer Experience Management

A Testing Program is a high reward but high investment, high political risk function, so executive leadership & support are imperative




Appendix


THANK YOU!

Would love to hear from you on any of the following forums…

RAMKUMAR RAVICHANDRAN

https://twitter.com/decisions_2_0

http://www.slideshare.net/RamkumarRavichandran

https://www.youtube.com/channel/UCODSVC0WQws607clv0k8mQA/videos

http://www.odbms.org/2015/01/ramkumar-ravichandran-visa/

https://www.linkedin.com/pub/ramkumar-ravichandran/10/545/67a

ROGER CHANG

https://www.linkedin.com/in/rogervchang/











