Ensuring Data Quality - A Two-Tier Strategy
DESCRIPTION
Peter Gold, CEO, VeraQuest, and Lisa Wilding-Brown, VP Global Panel & Sampling Operations, uSamp, offer tips for obtaining quality data. During this webinar, they share some of their experiences in the field and offer five tips for optimizing your data quality:
1. Source Testing
2. Registration
3. Sample Frame Balance
4. Respondent Monitoring
5. Research Design and Execution
TRANSCRIPT
Ensuring Sample Quality: A Two-Tiered Approach
Lisa Wilding-Brown, VP, Global Panel and Sampling Operations, uSamp
Peter Gold, CEO, VeraQuest
Today’s Topics
• Sample Quality Landscape
• The Quality Continuum
• 5 Tips for Optimizing Sample Quality
• Research on Research: A Case Study
State of the Industry
• Online market research has matured and stabilized
• Host of quality-focused consortiums and sample validation products available
• Data quality is an evolving and dynamic topic
• Our work is never done!
Vigilance Required
Spectrum Challenge
There is no silver bullet
Damage Control
How well do you know your sample?
• Demographic, behavioral and attitudinal data together provide a complete picture of source quality
• Analyzing benchmark data against known population characteristics helps to identify skews
Source Testing At Work
[Chart: Source 1 through Source 9; deviation bands +/- 5%, +/- 6-10%, +/- 11-20%, +/- 20%+; verdicts: Yes (4 sources), No (2 sources), Maybe (3 sources)]
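As a rough illustration of source testing, the sketch below compares one source's benchmark measures against known population figures and assigns each gap to one of the four bands from the chart legend. It assumes the bands refer to absolute gaps in percentage points; the function names and all figures are hypothetical and not taken from the presentation.

```python
# Upper bounds for the first three bands shown in the chart legend.
# Treating the bands as absolute percentage-point gaps is an assumption.
BANDS = [(5, "+/- 5%"), (10, "+/- 6-10%"), (20, "+/- 11-20%")]


def deviation_band(source_pct, benchmark_pct):
    """Return the band for the absolute gap between a source and the benchmark."""
    gap = abs(source_pct - benchmark_pct)
    for upper, label in BANDS:
        if gap <= upper:
            return label
    return "+/- 20%+"


def profile_source(source, benchmark):
    """Band every benchmark measure for one sample source."""
    return {measure: deviation_band(source[measure], benchmark[measure])
            for measure in benchmark}


# Hypothetical figures, for illustration only.
benchmark = {"female": 51.0, "age_18_34": 30.0, "owns_smartphone": 56.0}
source_a = {"female": 54.0, "age_18_34": 42.0, "owns_smartphone": 81.0}

report = profile_source(source_a, benchmark)
for measure, band in report.items():
    print(f"{measure}: {band}")
```

A source whose measures mostly land in the tighter bands would be a candidate for "Yes"; where to draw the Yes/No/Maybe line is a judgment call the slides do not spell out.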
What’s your first impression?
• Leveraging available tools helps verify respondent identity at the point of registration:
– Email and IP address verification
– Geo-IP look-up
– Digital fingerprinting
– Proxy server detection
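The checks listed above could be combined at registration along the following lines. This is a minimal sketch, not a description of any vendor's product: the fingerprint is simply a hash of a few device attributes, the proxy list is a hypothetical placeholder (a documentation-reserved network), and the geo-IP country is assumed to have been looked up elsewhere.

```python
import hashlib
import ipaddress

# Hypothetical proxy ranges; a real list would come from a maintained feed.
KNOWN_PROXY_NETWORKS = [ipaddress.ip_network("203.0.113.0/24")]


def device_fingerprint(user_agent, screen, timezone):
    """Hash a few device attributes into a stable identifier (simplified)."""
    raw = "|".join([user_agent, screen, timezone])
    return hashlib.sha256(raw.encode("utf-8")).hexdigest()


def is_known_proxy(ip):
    """True when the IP address falls inside a listed proxy network."""
    addr = ipaddress.ip_address(ip)
    return any(addr in net for net in KNOWN_PROXY_NETWORKS)


def screen_registration(reg, seen_fingerprints):
    """Return a list of quality flags for one registration attempt."""
    flags = []
    fp = device_fingerprint(reg["user_agent"], reg["screen"], reg["timezone"])
    if fp in seen_fingerprints:
        flags.append("duplicate_device")
    seen_fingerprints.add(fp)
    if is_known_proxy(reg["ip"]):
        flags.append("proxy")
    if reg["geo_ip_country"] != reg["stated_country"]:
        flags.append("geo_mismatch")
    return flags


# Hypothetical registrations, for illustration only.
seen = set()
first = {"user_agent": "Mozilla/5.0", "screen": "1920x1080", "timezone": "UTC-5",
         "ip": "198.51.100.7", "geo_ip_country": "US", "stated_country": "US"}
second = dict(first, ip="203.0.113.9", geo_ip_country="RO")

print(screen_registration(first, seen))
print(screen_registration(second, seen))
```

Flags are returned rather than acted on, so a panel could decide separately whether one flag means rejection or just closer monitoring.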
How good is your balancing act?
• Using demographically balanced sample helps achieve more representative results
• Stratifying sample frames by activity levels, tenure and source helps to minimize bias
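One simple way to turn a balancing target into per-stratum quotas is largest-remainder allocation, sketched below. The strata and percentages are hypothetical; the same function could be applied to activity level, tenure, or source strata rather than age.

```python
def allocate_quotas(total_completes, population_pct):
    """Split total_completes across strata in proportion to integer percentages.

    Uses largest-remainder rounding so the quotas always sum to the total.
    """
    quotas, remainders = {}, {}
    for stratum, pct in population_pct.items():
        quotas[stratum], remainders[stratum] = divmod(total_completes * pct, 100)
    shortfall = total_completes - sum(quotas.values())
    # Hand the leftover completes to the strata with the largest remainders.
    for stratum in sorted(remainders, key=remainders.get, reverse=True)[:shortfall]:
        quotas[stratum] += 1
    return quotas


# Hypothetical population percentages, for illustration only.
age_pct = {"18-34": 30, "35-54": 36, "55+": 34}
print(allocate_quotas(105, age_pct))
```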
Are you getting consistent performance?
• Monitoring quality behaviors throughout a respondent's lifetime, not just at the point of registration, helps to maintain consistency
• Utilizing Outlier / Black Swan algorithms helps to lessen the data impact of highly improbable characteristics or events
Caucasian CEO of a Fortune 500 company…
…suffering from alopecia…
…and a super rare skin condition…
…currently living in New York…
…No! San Francisco…
…who drives a cherry red Lincoln…
…and owns a show dog…
…who won Best in Show at Westminster.
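The profile above is implausible both because the traits are individually rare and because the answers change over time. A minimal sketch of both checks follows; it is not the presenters' algorithm. The incidence rates are invented for illustration, and multiplying them treats the traits as independent, which is a simplifying assumption.

```python
import math

# Hypothetical incidence rates, invented purely for illustration.
INCIDENCE = {
    "fortune_500_ceo": 0.000002,
    "show_dog_owner": 0.001,
    "drives_red_lincoln": 0.0005,
    "lives_in_new_york": 0.03,
}


def joint_incidence(claimed_traits, incidence=INCIDENCE):
    """Multiply the incidence of each claimed trait (assumes independence)."""
    return math.prod(incidence[t] for t in claimed_traits)


def is_improbable(claimed_traits, threshold=1e-6, incidence=INCIDENCE):
    """True when the combined profile is rarer than the chosen threshold."""
    return joint_incidence(claimed_traits, incidence) < threshold


def conflicting_answers(history):
    """Given one dict of answers per survey, list questions answered inconsistently."""
    seen = {}
    for answers in history:
        for question, value in answers.items():
            seen.setdefault(question, set()).add(value)
    return sorted(q for q, values in seen.items() if len(values) > 1)


print(is_improbable(["fortune_500_ceo", "show_dog_owner"]))
print(conflicting_answers([{"city": "New York"}, {"city": "San Francisco"}]))
```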
How can researchers help?
• Using thoughtful, disguised screening ensures the intended audience is reached
• Inclusion of red-herring questions weeds out over-zealous and inattentive respondents
• Maintaining an open feedback loop with sample suppliers helps manage potential quality offenders
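Two of the checks a researcher can run on completed interviews are sketched below: a straight-lining test on a rating grid and a red-herring test against a list of fictitious brands. The brand names and the tolerance are hypothetical, chosen only to make the example runnable.

```python
def is_straight_liner(grid_answers):
    """True when every item in a multi-item grid received the same answer."""
    return len(grid_answers) > 1 and len(set(grid_answers)) == 1


def fails_red_herring(selected_brands, fictitious_brands, max_allowed=0):
    """True when more fictitious brands are selected than the tolerance allows."""
    hits = set(selected_brands) & set(fictitious_brands)
    return len(hits) > max_allowed


# Hypothetical brand names, for illustration only.
FICTITIOUS = {"Brand F1", "Brand F2", "Brand F3"}

print(is_straight_liner([3, 3, 3, 3, 3]))
print(fails_red_herring(["Brand R1", "Brand F2"], FICTITIOUS))
```

Raising max_allowed trades fewer false positives for a weaker screen.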
All Hands On Deck!
Research on Research
A case study in identifying fraudulent or inattentive respondents
Artisan Bread Study
Objective
Assess brand awareness among the national population for a West Coast artisan bread brand relative to other artisan brands in the same region.
Brands
- Brand X (Client Brand)
- Tribeca Oven
- Maple Leaf
- California Goldminer
- Cuisine de France
- Chabaso
- Ace Bakery
- Ecce Panis
Artisan Bread Study Percent Straight-Liners
97%
Straight-Liners
3%
Artisan Bread Study Percent Aware of 5+ Artisan Bread Brands
89%
Aware of 5+ Brands
11%
Most Recent Case Study
Design:
All respondents were asked brand awareness for ten or twelve brands in three categories
• Shampoo • Juice • Chips
• Cell 1: yes/no grid - 10 fictitious brands and 0 real brands
• Cell 2: pick list - 10 fictitious brands and 0 real brands
• Cell 3: yes/no grid - 10 fictitious brands and 2 real brands
• Cell 4: pick list - 10 fictitious brands and 2 real brands
The Questions We Set Out to Answer
1. Will lists of fictitious brands help us to ID fraudulent responders?
2. Are pick lists preferable to yes/no grids?
3. Does it make sense to include at least two real brands in the brand list?
Number of Respondents Claiming Awareness of Fictitious Brands: Shampoo
Brands aware:  0    1    2    3    4    5    6    7    8    9    10
Respondents:   767  104  61   47   27   25   15   13   7    3    32
3+ Brands = 15%
4+ Brands = 11%
Number of Respondents Claiming Awareness of Fictitious Brands: Juice
Brands aware:  0    1    2    3    4    5    6    7    8    9    10
Respondents:   610  182  98   73   43   36   19   6    11   2    21
3+ Brands = 19%
4+ Brands = 13%
Number of Respondents Claiming Awareness of Fictitious Brands: Chips
Brands aware:  0    1    2    3    4    5    6    7    8    9    10
Respondents:   684  177  90   50   29   22   10   9    5    5    20
3+ Brands = 14%
4+ Brands = 9%
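The "3+" and "4+" percentages on these charts can be reproduced from the bar values. The sketch below assumes the eleven counts for each category are listed in axis order (0 through 10 fictitious brands); under that reading each category sums to 1,101 respondents and the rounded shares match the slides.

```python
# Respondent counts by number of fictitious brands claimed (0 through 10),
# read from the charts in axis order (an assumption about the extraction).
SHAMPOO = [767, 104, 61, 47, 27, 25, 15, 13, 7, 3, 32]
JUICE = [610, 182, 98, 73, 43, 36, 19, 6, 11, 2, 21]
CHIPS = [684, 177, 90, 50, 29, 22, 10, 9, 5, 5, 20]


def share_at_least(counts, k):
    """Percent of respondents claiming awareness of k or more fictitious brands."""
    return 100 * sum(counts[k:]) / sum(counts)


for name, counts in [("Shampoo", SHAMPOO), ("Juice", JUICE), ("Chips", CHIPS)]:
    print(f"{name}: n={sum(counts)}, "
          f"3+ = {share_at_least(counts, 3):.0f}%, "
          f"4+ = {share_at_least(counts, 4):.0f}%")
```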
Fictitious Brands Correlation, by Category (R²)
Shampoo & Juice: 0.763
Shampoo & Chips: 0.776
Juice & Chips: 0.843
Question 1: Will lists of fictitious brands help us to ID fraudulent responders? Answer: Yes. Or at least we think so.
Using Fictitious Brand Names to Identify Fraudulent Responders
Percent of Respondents Claiming Awareness of Fictitious Brands
Fictitious       Yes/No Grid                Pick List
Brands Aware     Shampoo  Juice  Chips      Shampoo  Juice  Chips
10               4%       3%     3%         1%       1%     1%
9+               5%       3%     4%         1%       2%     1%
8+               6%       5%     4%         1%       1%     1%
7+               7%       5%     6%         2%       2%     1%
6+               8%       8%     7%         4%       2%     1%
5+               11%      13%    10%        6%       4%     3%
4+               14%      17%    13%        8%       8%     5%
3+               19%      23%    18%        12%      15%    9%
2+               23%      31%    25%        18%      25%    18%
1+               32%      48%    39%        28%      41%    37%
0+               100%     100%   100%       100%     100%   100%
Yes/No vs. Pick List
Percent of Respondents Aware of at Least One of the Real Brands
97%
Yes/No Grid Respondents Aware of at Least One Real Brand
97%
Pick List Respondents Aware of at Least One Real Brand
Yes/No vs. Pick List
Using Fictitious Brand Names to Identify Fraudulent Responders
Question 2:
Are pick lists preferable to yes/no grids for detecting fraudulent respondents?
Answer: Probably.
We believe yes/no grids may actually exacerbate fraudulent behavior.
Percent of Respondents Claiming Awareness of Fictitious Brands
Fictitious       10 Fictitious/No Real Brands    10 Fictitious/2 Real Brands
Brands Aware     Shampoo  Juice  Chips           Shampoo  Juice  Chips
10               4%       2%     2%              2%       2%     1%
9+               4%       3%     3%              2%       2%     1%
8+               5%       4%     4%              3%       2%     2%
7+               6%       4%     5%              4%       3%     2%
6+               8%       6%     6%              5%       5%     3%
5+               11%      9%     8%              6%       8%     5%
4+               14%      15%    11%             9%       10%    8%
3+               18%      22%    15%             13%      17%    12%
2+               25%      30%    23%             17%      26%    21%
1+               33%      43%    38%             28%      47%    38%
0+               100%     100%   100%            100%     100%   99%
10 Fictitious/No Real Brands vs. 10 Fictitious Brands/2 Real Brands
Using Fictitious Brand Names to Identify Fraudulent Responders
Question 3:
Does it make sense to include at least two real brands in the brand list?
Answer: Yes.
The absence of real brands from which to choose likely causes respondents to erroneously select a fictitious brand.
In Summary
Tips for Optimizing Sample Quality:
Source Testing
Registration Verification
Sample Balancing
Respondent Monitoring
Research Design Best Practices
Research on Research Case Study:
Red Herring questions are effective
Pick lists are preferable to yes/no grid designs
Inclusion of 2 or more real brands is optimal if using fictitious brand lists
Thank you!
Lisa Wilding-Brown, VP, Global Panel and Sampling Operations, uSamp
Peter Gold, CEO, VeraQuest