Ensuring Data Quality - A Two-Tier Strategy

Ensuring Sample Quality: A Two-Tiered Approach

Lisa Wilding-Brown, VP, Global Panel and Sampling Operations, uSamp

Peter Gold, CEO, VeraQuest

Upload: usamp

Post on 11-Jan-2015



DESCRIPTION

Peter Gold, CEO, VeraQuest & Lisa Wilding-Brown, VP Global Panel & Sampling Operations, uSamp offer tips for obtaining quality data. During this Webinar, they will share some of their experiences in the field and offer up five tips for optimizing your data quality. 1. Source Testing 2. Registration 3. Sample Frame Balance 4. Respondent Monitoring 5. Research Design and Execution

TRANSCRIPT

Page 1: Ensuring Data Quality - A Two-Tier Strategy

Ensuring Sample Quality: A Two-Tiered Approach

Lisa Wilding-Brown, VP, Global Panel and Sampling Operations, uSamp

Peter Gold, CEO, VeraQuest

Page 2: Ensuring Data Quality - A Two-Tier Strategy

Today’s Topics

• Sample Quality Landscape

• The Quality Continuum

• 5 Tips for Optimizing Sample Quality

• Research on Research: A Case Study

Page 3: Ensuring Data Quality - A Two-Tier Strategy

State of the Industry

• Online market research has matured and stabilized

• Host of quality-focused consortiums and sample validation products available

• Data quality is an evolving and dynamic topic

• Our work is never done!

Page 4: Ensuring Data Quality - A Two-Tier Strategy

Vigilance Required

Page 5: Ensuring Data Quality - A Two-Tier Strategy

Spectrum Challenge

Page 6: Ensuring Data Quality - A Two-Tier Strategy

There is no silver bullet

Page 7: Ensuring Data Quality - A Two-Tier Strategy

Damage Control

Page 8: Ensuring Data Quality - A Two-Tier Strategy

How well do you know your sample?

• Demographic, behavioral and attitudinal data together provide a complete picture of source quality

• Analyzing benchmark data against known population characteristics helps to identify skews
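The benchmark comparison described above can be sketched in a few lines. This is a minimal illustration, not the presenters' method: the band cut-offs borrow the +/- 5%, +/- 6-10%, +/- 11-20% and +/- 20%+ labels from the source testing chart in this deck, they are assumed here to be measured in percentage points, and the measure names and figures are hypothetical.

```python
def deviation_band(sample_pct, benchmark_pct):
    """Return the band label for the absolute gap between a source and the benchmark.

    Assumption: gaps are measured in percentage points.
    """
    gap = abs(sample_pct - benchmark_pct)
    if gap <= 5:
        return "+/- 5%"
    if gap <= 10:
        return "+/- 6-10%"
    if gap <= 20:
        return "+/- 11-20%"
    return "+/- 20%+"


def score_source(sample, benchmark):
    """Band every benchmark measure for one sample source."""
    return {measure: deviation_band(sample[measure], benchmark[measure])
            for measure in benchmark}


# Hypothetical benchmark and source figures, in percent.
benchmark = {"female": 51.0, "age_18_34": 30.0, "owns_smartphone": 60.0}
source_a = {"female": 54.0, "age_18_34": 42.0, "owns_smartphone": 85.0}

bands = score_source(source_a, benchmark)
for measure, band in bands.items():
    print(f"{measure}: {band}")
```

A source whose measures mostly fall in the wider bands would be a candidate for closer review before it is blended into a sample frame.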

Page 9: Ensuring Data Quality - A Two-Tier Strategy

Source Testing At Work

[Chart: Sources 1 through 9 placed in deviation bands of +/- 5%, +/- 6-10%, +/- 11-20% and +/- 20%+, with each source marked Yes, No or Maybe]

Page 10: Ensuring Data Quality - A Two-Tier Strategy

What’s your first impression?

• Leveraging available tools helps verify respondent identity at the point of registration:

– Email and IP address verification

– Geo-IP look-up

– Digital fingerprinting

– Proxy server detection
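Some of the registration checks listed above can be approximated with the Python standard library. The sketch below is illustrative only and rests on several assumptions: the field names (`email`, `ip`, `device`) are hypothetical, the email test is a loose syntax check rather than true verification, and the "fingerprint" is simply a hash of whatever device attributes are collected. Geo-IP look-up and proxy detection need external data sources and are left out.

```python
import hashlib
import ipaddress
import re

# Loose syntax check only; it does not prove the mailbox exists.
EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def email_looks_valid(email):
    return bool(EMAIL_RE.match(email))


def ip_is_public(ip):
    """True for a well-formed, globally routable IP address."""
    try:
        return ipaddress.ip_address(ip).is_global
    except ValueError:
        return False


def fingerprint(device_attrs):
    """Hash the collected device attributes into a stable identifier."""
    joined = "|".join(f"{key}={device_attrs[key]}" for key in sorted(device_attrs))
    return hashlib.sha256(joined.encode("utf-8")).hexdigest()


def screen_registration(reg, seen_fingerprints):
    """Return the list of checks a registration fails (empty if none)."""
    reasons = []
    if not email_looks_valid(reg["email"]):
        reasons.append("email")
    if not ip_is_public(reg["ip"]):
        reasons.append("ip")
    fp = fingerprint(reg["device"])
    if fp in seen_fingerprints:
        reasons.append("duplicate_device")
    seen_fingerprints.add(fp)
    return reasons


seen = set()
device = {"user_agent": "ExampleBrowser/1.0", "screen": "1920x1080", "tz": "UTC-5"}
first = screen_registration({"email": "a@example.com", "ip": "8.8.8.8", "device": device}, seen)
second = screen_registration({"email": "not-an-email", "ip": "10.0.0.5", "device": device}, seen)
print(first, second)
```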

Page 11: Ensuring Data Quality - A Two-Tier Strategy

How good is your balancing act?

• Using demographically balanced sample helps achieve more representative results

• Stratifying sample frames by activity levels, tenure and source helps to minimize bias
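One way to turn target proportions into a balanced sample frame is to allocate completes per cell. The sketch below assumes a simple largest-remainder allocation; the age cells and target shares are hypothetical, and the same function could be applied to activity level, tenure or source strata.

```python
import math


def allocate_quotas(target_shares, total_completes):
    """Split total_completes across cells in proportion to target_shares.

    Uses largest-remainder rounding so the quotas always sum to the total.
    """
    raw = {cell: share * total_completes for cell, share in target_shares.items()}
    quotas = {cell: math.floor(value) for cell, value in raw.items()}
    leftover = total_completes - sum(quotas.values())
    # Hand out any remaining completes to the cells with the largest remainders.
    by_remainder = sorted(raw, key=lambda cell: raw[cell] - quotas[cell], reverse=True)
    for cell in by_remainder[:leftover]:
        quotas[cell] += 1
    return quotas


# Hypothetical target shares for three age cells.
shares = {"18-34": 0.30, "35-54": 0.35, "55+": 0.35}
quotas = allocate_quotas(shares, 1000)
print(quotas)
```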

Page 12: Ensuring Data Quality - A Two-Tier Strategy

Are you getting consistent performance?

• Monitoring quality behaviors throughout the respondent's lifetime, not just at the point of registration, helps to maintain consistency

• Utilizing Outlier / Black Swan algorithms helps to lessen the data impact of highly improbable characteristics or events
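The deck does not describe how its Outlier / Black Swan algorithms work. As a rough illustration of the idea only, the sketch below multiplies assumed incidence rates for the traits a respondent has claimed and flags the profile when the joint rate falls below a cut-off. The trait names, incidence rates and threshold are all hypothetical placeholders, and treating traits as independent is a simplifying assumption.

```python
import math


def joint_incidence(claimed_traits, incidence):
    """Naive joint probability of all claimed traits, assuming independence."""
    return math.prod(incidence[trait] for trait in claimed_traits)


def is_improbable(claimed_traits, incidence, threshold=1e-6):
    """Flag a profile whose combined incidence falls below the threshold."""
    return joint_incidence(claimed_traits, incidence) < threshold


# Hypothetical incidence rates; real values would come from benchmark data.
incidence = {
    "rare_condition": 0.001,
    "c_suite_executive": 0.002,
    "show_dog_owner": 0.0005,
}

everything = is_improbable(list(incidence), incidence)
single = is_improbable(["rare_condition"], incidence)
print(everything, single)
```

A flag of this kind is better treated as a prompt for review than as proof of fraud, since rare but genuine respondents do exist.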

Page 14: Ensuring Data Quality - A Two-Tier Strategy

Caucasian CEO of a Fortune 500 company…

Page 15: Ensuring Data Quality - A Two-Tier Strategy

…suffering from alopecia…

Page 16: Ensuring Data Quality - A Two-Tier Strategy

…and a super rare skin condition…

Page 17: Ensuring Data Quality - A Two-Tier Strategy

…currently living in New York…

Page 18: Ensuring Data Quality - A Two-Tier Strategy

…No! San Francisco…

Page 19: Ensuring Data Quality - A Two-Tier Strategy

…who drives a cherry red Lincoln…

Page 20: Ensuring Data Quality - A Two-Tier Strategy

…and owns a show dog…

Page 21: Ensuring Data Quality - A Two-Tier Strategy

…who won Best in Show at Westminster.

Page 22: Ensuring Data Quality - A Two-Tier Strategy

How can researchers help?

• Using thoughtful, disguised screening ensures the intended audience is reached

• Including red-herring questions weeds out overzealous and inattentive respondents

• Maintaining an open feedback loop with sample suppliers helps manage potential quality offenders

Page 23: Ensuring Data Quality - A Two-Tier Strategy

All Hands On Deck!

Page 24: Ensuring Data Quality - A Two-Tier Strategy

Research on Research

A case study in identifying fraudulent or inattentive respondents

Page 25: Ensuring Data Quality - A Two-Tier Strategy

Artisan Bread Study

Objective

Assess brand awareness among the national population for a West Coast artisan bread brand relative to other artisan brands in the same region.

Brands

- Brand X (Client Brand)
- Tribeca Oven
- Maple Leaf
- California Goldminer
- Cuisine de France
- Chabaso
- Ace Bakery
- Ecce Panis

Page 26: Ensuring Data Quality - A Two-Tier Strategy

Artisan Bread Study Percent Straight-Liners

Straight-Liners: 3%

Other respondents: 97%
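Straight-lining is usually taken to mean giving the same answer to every item in a grid. The sketch below is one simple way to compute a straight-liner rate under that definition; the minimum grid length and the respondent data are hypothetical, and the study's own rule is not stated in the deck.

```python
def is_straight_liner(grid_answers, min_items=5):
    """True when a respondent gave one identical answer across a grid.

    Short grids are ignored because identical answers there are unremarkable.
    """
    return len(grid_answers) >= min_items and len(set(grid_answers)) == 1


def straight_liner_rate(respondents):
    """Share of respondents flagged as straight-liners."""
    flagged = sum(is_straight_liner(answers) for answers in respondents.values())
    return flagged / len(respondents)


# Hypothetical grid responses on a 1-5 scale.
respondents = {
    "r1": [3, 3, 3, 3, 3, 3],
    "r2": [1, 4, 2, 5, 3, 2],
    "r3": [5, 5],
}

rate = straight_liner_rate(respondents)
print(f"{rate:.0%} straight-liners")
```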

Page 27: Ensuring Data Quality - A Two-Tier Strategy

Artisan Bread Study Percent Aware of 5+ Artisan Bread Brands

Aware of 5+ Brands: 11%

Other respondents: 89%

Page 28: Ensuring Data Quality - A Two-Tier Strategy

Most Recent Case Study

Design:

All respondents were asked brand awareness for ten or twelve brands in three categories

• Shampoo

• Juice

• Chips

• Cell 1: yes/no grid - 10 fictitious brands and 0 real brands

• Cell 2: pick list - 10 fictitious brands and 0 real brands

• Cell 3: yes/no grid - 10 fictitious brands and 2 real brands

• Cell 4: pick list - 10 fictitious brands and 2 real brands

Page 29: Ensuring Data Quality - A Two-Tier Strategy

The Questions We Set Out to Answer

1. Will lists of fictitious brands help us to ID fraudulent responders?

2. Are pick lists preferable to yes/no grids?

3. Does it make sense to include at least two real brands in the brand list?

Page 30: Ensuring Data Quality - A Two-Tier Strategy

Number of Respondents Claiming Awareness of Fictitious Brands: Shampoo

Fictitious brands aware   0     1     2    3    4    5    6    7    8    9    10
Respondents               767   104   61   47   27   25   15   13   7    3    32

3+ Brands = 15%

4+ Brands = 11%

Page 31: Ensuring Data Quality - A Two-Tier Strategy

Number of Respondents Claiming Awareness of Fictitious Brands: Juice

Fictitious brands aware   0     1     2    3    4    5    6    7    8    9    10
Respondents               610   182   98   73   43   36   19   6    11   2    21

3+ Brands = 19%

4+ Brands = 13%

Page 32: Ensuring Data Quality - A Two-Tier Strategy

Number of Respondents Claiming Awareness of Fictitious Brands: Chips

Fictitious brands aware   0     1     2    3    4    5    6    7    8    9    10
Respondents               684   177   90   50   29   22   10   9    5    5    20

3+ Brands = 14%

4+ Brands = 9%
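The "3+ Brands" and "4+ Brands" figures on the three preceding charts can be checked directly from the respondent counts. The sketch below assumes the counts are read in order against the 0 to 10 axis, a reading that reproduces the percentages shown on the slides.

```python
# Respondent counts by number of fictitious brands claimed (index 0 through 10),
# as read from the Shampoo, Juice and Chips charts.
COUNTS = {
    "Shampoo": [767, 104, 61, 47, 27, 25, 15, 13, 7, 3, 32],
    "Juice": [610, 182, 98, 73, 43, 36, 19, 6, 11, 2, 21],
    "Chips": [684, 177, 90, 50, 29, 22, 10, 9, 5, 5, 20],
}


def share_at_least(hist, k):
    """Share of respondents claiming awareness of k or more fictitious brands."""
    return sum(hist[k:]) / sum(hist)


for category, hist in COUNTS.items():
    three_plus = round(100 * share_at_least(hist, 3))
    four_plus = round(100 * share_at_least(hist, 4))
    print(f"{category}: n = {sum(hist)}, 3+ = {three_plus}%, 4+ = {four_plus}%")
# Shampoo: n = 1101, 3+ = 15%, 4+ = 11%
# Juice: n = 1101, 3+ = 19%, 4+ = 13%
# Chips: n = 1101, 3+ = 14%, 4+ = 9%
```

Each category sums to the same 1,101 respondents, which is consistent with every respondent having answered all three categories.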

Page 33: Ensuring Data Quality - A Two-Tier Strategy

Fictitious Brands Correlation, by Category

Category pair       R²
Shampoo & Juice     0.763
Shampoo & Chips     0.776
Juice & Chips       0.843
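The deck does not say how these R² values were computed. One plausible reading is the squared Pearson correlation between each respondent's fictitious-brand count in one category and the same respondent's count in another. The sketch below shows that calculation on hypothetical per-respondent counts; it does not reproduce the study data.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def r_squared(x, y):
    return pearson_r(x, y) ** 2


# Hypothetical fictitious-brand counts for the same seven respondents.
shampoo = [0, 0, 1, 3, 10, 0, 2]
juice = [0, 1, 1, 4, 10, 0, 1]

r2 = r_squared(shampoo, juice)
print(f"R squared = {r2:.3f}")
```

A high value would indicate that the same respondents tend to over-claim across categories, which is what makes fictitious brand lists useful as a respondent-level flag.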

Page 34: Ensuring Data Quality - A Two-Tier Strategy

Using Fictitious Brand Names to Identify Fraudulent Responders

Question 1:

Will lists of fictitious brands help us to ID fraudulent responders?

Answer: Yes. Or at least we think so.
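In practice, a fictitious brand list becomes a per-respondent flag. The sketch below counts how many fictitious brands each respondent claimed to know and flags those at or above a cut-off. The brand names are hypothetical placeholders, and the cut-off of 3 is an assumption for illustration; the deck reports the shares at 3+ and 4+ but does not prescribe a threshold.

```python
# Hypothetical placeholder names; the study's fictitious brands are not listed.
FICTITIOUS_BRANDS = {"Fictitious A", "Fictitious B", "Fictitious C", "Fictitious D"}


def fictitious_count(selected_brands):
    """Number of fictitious brands a respondent claimed to be aware of."""
    return len(set(selected_brands) & FICTITIOUS_BRANDS)


def flag_respondent(selected_brands, threshold=3):
    """True when the respondent claims at least `threshold` fictitious brands."""
    return fictitious_count(selected_brands) >= threshold


# Hypothetical responses mixing real and fictitious brand names.
responses = {
    "r1": ["Real Brand 1"],
    "r2": ["Real Brand 1", "Fictitious A"],
    "r3": ["Fictitious A", "Fictitious B", "Fictitious C", "Real Brand 2"],
}

flagged = [rid for rid, brands in responses.items() if flag_respondent(brands)]
print(flagged)
```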

Page 35: Ensuring Data Quality - A Two-Tier Strategy

Percent of Respondents Claiming Awareness of Fictitious Brands

Fictitious       Yes/No Grid                 Pick List
Brands Aware     Shampoo  Juice   Chips      Shampoo  Juice   Chips
10               4%       3%      3%         1%       1%      1%
9+               5%       3%      4%         1%       2%      1%
8+               6%       5%      4%         1%       1%      1%
7+               7%       5%      6%         2%       2%      1%
6+               8%       8%      7%         4%       2%      1%
5+               11%      13%     10%        6%       4%      3%
4+               14%      17%     13%        8%       8%      5%
3+               19%      23%     18%        12%      15%     9%
2+               23%      31%     25%        18%      25%     18%
1+               32%      48%     39%        28%      41%     37%
0+               100%     100%    100%       100%     100%    100%

Yes/No vs. Pick List

Page 36: Ensuring Data Quality - A Two-Tier Strategy

Percent of Respondents Aware of at Least One of the Real Brands

Yes/No Grid respondents aware of at least one real brand: 97%

Pick List respondents aware of at least one real brand: 97%

Yes/No vs. Pick List

Page 37: Ensuring Data Quality - A Two-Tier Strategy

Using Fictitious Brand Names to Identify Fraudulent Responders

Question 2:

Are pick lists preferable to yes/no grids for detecting fraudulent respondents?

Answer: Probably.

We believe yes/no grids may actually exacerbate fraudulent behavior.

Page 38: Ensuring Data Quality - A Two-Tier Strategy

Percent of Respondents Claiming Awareness of Fictitious Brands

Fictitious       10 Fictitious/No Real Brands    10 Fictitious/2 Real Brands
Brands Aware     Shampoo  Juice   Chips          Shampoo  Juice   Chips
10               4%       2%      2%             2%       2%      1%
9+               4%       3%      3%             2%       2%      1%
8+               5%       4%      4%             3%       2%      2%
7+               6%       4%      5%             4%       3%      2%
6+               8%       6%      6%             5%       5%      3%
5+               11%      9%      8%             6%       8%      5%
4+               14%      15%     11%            9%       10%     8%
3+               18%      22%     15%            13%      17%     12%
2+               25%      30%     23%            17%      26%     21%
1+               33%      43%     38%            28%      47%     38%
0+               100%     100%    100%           100%     100%    99%

10 Fictitious/No Real Brands vs. 10 Fictitious Brands/2 Real Brands

Page 39: Ensuring Data Quality - A Two-Tier Strategy

Using Fictitious Brand Names to Identify Fraudulent Responders

Question 3:

Does it make sense to include at least two real brands in the brand list?

Answer: Yes.

The absence of real brands from which to choose likely causes respondents to erroneously select a fictitious brand.

Page 40: Ensuring Data Quality - A Two-Tier Strategy

In Summary

Tips for Optimizing Sample Quality:

Source Testing

Registration Verification

Sample Balancing

Respondent Monitoring

Research Design Best Practices

Research on Research Case Study:

Red Herring questions are effective

Pick lists are preferable to yes/no grid designs

Inclusion of 2 or more real brands is optimal if using fictitious brand lists

Page 41: Ensuring Data Quality - A Two-Tier Strategy

Thank you!

Lisa Wilding-Brown, VP, Global Panel and Sampling Operations, uSamp

[email protected]

Peter Gold, CEO, VeraQuest

[email protected]