Quick Growth through ML Model A/B Testing
Introducing the eBay Experimentation Platform for Paid Search Ads
- Sleven Liu, Martin Zhang, Yi Liu
Agenda
• Why Growth hacking and A/B testing?
• Search Ads: The most important marketing channel
• Challenges and Solutions for A/B Testing
• Machine Learning Models Integration
Hadoop Summit 2
Quick Growth in eBay Paid Marketing through A/B Testing & ML Models
50+ Models/Year
5+ Years
60+ Experiments/Year
Growth Hacking
Data
A/B test
Marketing
“Growth hackers are a hybrid of marketer and coder,
one who…answers with A/B tests, landing pages,
viral factor, email deliverability, and Open Graph.
On top of this, they layer the discipline of direct
marketing, with its emphasis on quantitative
measurement, scenario modeling via spreadsheets,
and a lot of database queries.”
- 《Growth Hacker is the new VP Marketing》, Andrew Chen
A/B Testing
• Key Elements
– Statistical hypothesis
– Sampling
• Benefits
– Customer behavior vs. expert opinion
– Early launch and adoption in marketing
– Continuous delivery and integration
– Based on data and statistics
• Limitations
– Statistical power
– Imbalanced samples
Growth Hacking Channels
• “Poor distribution, not product, is the number one cause of failure” – Peter Thiel, 《Zero to One》
UGC / SEO
Ads
Affiliate Net
Viral Marketing
Google Text Ads
• Google Ads, CPC
• Content
– Headline
– Display URL
– Description
• SRP + Search Network
• Exact vs. Broad match
• Campaign Structure
Google Product Listing Ads / Shopping Campaign
• More info (price/picture), more qualified traffic
• Catch more eyeballs
• Product/Brand match
• Higher barrier, less competition
• Backend structure
Challenges of A/B Testing in Paid Search Ads
Sampling
• No control over users/visits
• Accurate user targeting
• Skewed data & low coverage
Test Setup
• “Black box” on the third-party partner / ads platform
• Limitations on test objects
Tracking
• External data loop
A/B Testing Solution Example in the Text Ads
Sampling
• Based on the keywords
• Stratified sampling to resolve skewed data
Test Setup
• Campaign structure management
• Test object: bidding models
Tracking
• Internal + external tracking
• Data loop for the model
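The keyword-based stratified sampling mentioned on this slide could look roughly like the sketch below. It is only an illustration under stated assumptions: the function name `stratified_split`, the click-count `boundaries`, and the 50/50 split are hypothetical and are not taken from the eBay platform. The idea is to bucket keywords by click volume first and then randomize inside each bucket, so that head and long-tail keywords are represented in both groups.

```python
import random
from collections import defaultdict


def stratified_split(keyword_clicks, boundaries, treatment_share=0.5, seed=42):
    """Assign keywords to control/treatment within click-volume strata.

    keyword_clicks: dict mapping keyword -> historical click count
    boundaries: ascending click thresholds that separate the strata
                (hypothetical values chosen by the experimenter)
    """
    rng = random.Random(seed)
    strata = defaultdict(list)
    for keyword, clicks in keyword_clicks.items():
        # Stratum index = number of thresholds this keyword reaches.
        stratum = sum(clicks >= b for b in boundaries)
        strata[stratum].append(keyword)

    assignment = {}
    for keywords in strata.values():
        keywords.sort()        # fixed order so the seed gives a repeatable split
        rng.shuffle(keywords)  # randomize inside the stratum only
        cut = round(len(keywords) * treatment_share)
        for kw in keywords[:cut]:
            assignment[kw] = "treatment"
        for kw in keywords[cut:]:
            assignment[kw] = "control"
    return assignment


# Toy data: 4 head keywords with many clicks, 6 long-tail keywords with few.
clicks = {f"head{i}": 10_000 + i for i in range(4)}
clicks.update({f"tail{i}": i for i in range(6)})
groups = stratified_split(clicks, boundaries=[100])
```

With purely random sampling, all four head keywords could land in one group by chance; splitting inside each stratum rules that out.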
Why Is Sampling Important for A/B Testing?
Choose the right sample size
• Is a larger sample always better for speeding up an A/B test? Or does it put the business at real risk?
Choose the right method
• Why not just use random sampling?
An unrepresentative sample might hurt the business after rollout
• Does the model work for all ads? Or only for the sampled ads?
A trustworthy sample makes the A/B result trustworthy
• Does the difference in the A/B test result really come from the model? Or from a difference in sampling?
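One way to reason about "the right sample size" is the textbook normal-approximation formula for comparing two conversion rates. The sketch below is not from the slides; the function name, the default significance level, and the default power are assumptions chosen for illustration.

```python
import math
from statistics import NormalDist


def sample_size_per_arm(p_control, p_treatment, alpha=0.05, power=0.8):
    """Approximate units needed per group to detect p_control vs. p_treatment.

    Uses the two-sided normal approximation for two proportions:
    n = (z_{1-alpha/2} + z_{power})^2 * (p1(1-p1) + p2(1-p2)) / (p1 - p2)^2
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    variance = p_control * (1 - p_control) + p_treatment * (1 - p_treatment)
    effect = p_treatment - p_control
    return math.ceil((z_alpha + z_beta) ** 2 * variance / effect ** 2)


# Detecting a lift from a 2.0% to a 2.2% conversion rate needs far more
# traffic than detecting a lift from 2.0% to 3.0%.
n_small_lift = sample_size_per_arm(0.020, 0.022)
n_large_lift = sample_size_per_arm(0.020, 0.030)
```

The smaller the expected lift, the larger the sample that must be exposed to the test, which is the trade-off between test speed and business risk raised above.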
Sampling Challenge – Huge Volume of Data
• Billions of ads
• New ads sourcing – is the process scalable as more ads are added to marketing?
• Ads history tracking – how does the process deal with historical data?
Sampling Challenge – Skewed Data & Low Coverage
• Top click queries
• Long tail queries
• Low conversion rate – Impression -> Click -> Transaction
• Dealing with ads that have no impressions on the partner
[Figure: click distribution (hot -> cold) plotted against ad count; ads count by stage: total ads, ads with impressions, clicks, valued clicks]
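The skew shown in the click-distribution chart can be quantified with a cumulative share curve: sort ads from hot to cold and track what fraction of all clicks the top ads account for. The sketch below is a minimal illustration with hypothetical function names and toy numbers, not the analysis behind the original figure.

```python
def cumulative_click_share(clicks_per_ad):
    """Return the running share of total clicks, with ads sorted hot -> cold."""
    ordered = sorted(clicks_per_ad, reverse=True)
    total = sum(ordered)
    if total == 0:
        return [0.0] * len(ordered)
    shares, running = [], 0
    for clicks in ordered:
        running += clicks
        shares.append(running / total)
    return shares


def ads_needed_for_share(clicks_per_ad, target_share):
    """Smallest number of top ads whose clicks reach target_share of the total."""
    for rank, share in enumerate(cumulative_click_share(clicks_per_ad), start=1):
        if share >= target_share:
            return rank
    return len(clicks_per_ad)


# Toy data: one hot ad, a few warm ads, and several ads with no clicks at all.
toy_clicks = [900, 50, 30, 10, 5, 5, 0, 0, 0, 0]
top_ads_for_97_percent = ads_needed_for_share(toy_clicks, 0.97)
```

When a handful of ads carries most of the clicks, a uniform random sample is dominated by cold ads with no signal, which is why the deck turns to stratified sampling.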
Sampling Solution - Method
Sampling Solution - Tech
• HBase + HDFS
– Active ads stored in HBase
– Ads history stored in HDFS
• Spark
– Pre-aggregation of huge data volumes
– Optimized joins of huge data with ads history, user behavior…
– Data stored as Parquet to improve Spark job efficiency
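The "pre-aggregate, then join" idea can be shown in plain Python. This is only a stand-in for the Spark jobs, which the slides do not show; the function names and record layout are hypothetical. Collapsing raw click events to one row per ad before the join keeps the join input proportional to the number of ads rather than the number of events.

```python
from collections import Counter


def pre_aggregate(click_events):
    """Collapse raw (ad_id, clicks) events into one click total per ad."""
    totals = Counter()
    for ad_id, clicks in click_events:
        totals[ad_id] += clicks
    return totals


def join_with_history(click_totals, ad_history):
    """Left-join the ad history against the aggregated click totals."""
    return [
        {"ad_id": ad_id, "clicks": click_totals.get(ad_id, 0), **attrs}
        for ad_id, attrs in ad_history.items()
    ]


# Toy inputs: three raw events, two ads in the history (one never clicked).
events = [("a", 1), ("a", 2), ("b", 5)]
history = {"a": {"keyword": "shoes"}, "c": {"keyword": "hat"}}
joined = join_with_history(pre_aggregate(events), history)
```

Ads with no events still appear in the output with zero clicks, which matters for the low-coverage problem described earlier.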
Machine Learning Model Integration
Where is the data?
What is a model?
How to manage the model lifecycle?
Challenge for data
• Data extraction
• Data processing
• Data gathering
• Original Solution
– Regular ETL data pipeline to build factors for each model
– Move gathered factors to the model running environment based on the scenario
• Bottleneck
– Some effort is duplicated across models
– Factors are not reusable because each is built to meet one model’s requirements
– Factors take more effort to maintain because they come from different sources and are built for a specific model
New Solution - Factor System
Factor: the model input
Heterogeneous data sources
Syntax + Semantic layer
Calculated on Hadoop
Factor life-cycle
What the Factor System Provides
• Register Service
– Factor code integration and deployment
– External factor registration
• Download Service
– Online model input
– Offline data exploration and model development
• Scheduling Service
– Schedules factor code in the factor system according to each source’s data latency
• Dashboard
– Factor status monitor: helps in understanding the running status of factor code
– Factor meta definition: helps data scientists understand the factors when building models
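To make the register, download, and meta-definition roles concrete, here is a minimal in-memory sketch. Every name in it (`Factor`, `FactorRegistry`, `register`, `download`, `describe`) is hypothetical; the slides do not show the real factor system's interfaces, and the real system runs on Hadoop rather than in a single process.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Factor:
    name: str
    description: str                  # meta definition shown to data scientists
    compute: Callable[[dict], float]  # factor code applied to one source row


class FactorRegistry:
    def __init__(self) -> None:
        self._factors: Dict[str, Factor] = {}

    def register(self, factor: Factor) -> None:
        """Register service: add a factor once so every model can reuse it."""
        if factor.name in self._factors:
            raise ValueError(f"factor already registered: {factor.name}")
        self._factors[factor.name] = factor

    def describe(self) -> Dict[str, str]:
        """Dashboard-style view of the factor meta definitions."""
        return {name: f.description for name, f in self._factors.items()}

    def download(self, names: List[str], rows: List[dict]) -> List[dict]:
        """Download service: compute the requested factors for each row."""
        return [
            {name: self._factors[name].compute(row) for name in names}
            for row in rows
        ]


registry = FactorRegistry()
registry.register(Factor(
    name="ctr",
    description="clicks divided by impressions",
    compute=lambda r: r["clicks"] / r["impressions"] if r["impressions"] else 0.0,
))
model_input = registry.download(["ctr"], [{"clicks": 5, "impressions": 100}])
```

Registering a factor once and letting several models download it is the reuse that the original per-model ETL pipelines lacked.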
Capacity of Factor System
• PB-level source data volume
• 10+ TB daily increment
• 1000+ permanent factors, historical data backed up on HDFS
• Use Cases
– Batch models: serve all the machine learning models for Paid IM marketing
– Ad hoc: support offline data exploration for data scientists and data developers
– NRT/Real-time (future): build a factor cache for NRT or real-time model use cases
What model requires
// Model Logic
Model result
Data Stream 1
Data Stream 2
• The model can access the data it needs based on its logical design
• The model can be executed in the expected environment, using the right technology for each use case
• The model result can be delivered to meet real business needs
What is a model – Model Engine
• Onboards data from the factor system to the model engine
• Executes models using different tech solutions to fit real scenarios
• Lands results in different systems to integrate with the ads publisher
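The three responsibilities of the model engine (onboard factors, execute the model, land the result) can be pictured as a small pipeline. This is a hedged sketch only: `run_model`, `toy_bid_model`, and the row fields are hypothetical names for illustration, and the real engine dispatches to different technologies rather than calling a Python function.

```python
from typing import Callable, Dict, List

Row = Dict[str, float]


def run_model(
    load_factors: Callable[[], List[Row]],
    model: Callable[[Row], float],
    deliver: Callable[[List[float]], None],
) -> List[float]:
    """Run one model execution: onboard data, execute, land the result."""
    rows = load_factors()                   # onboard data from the factor system
    results = [model(row) for row in rows]  # execute the model logic per row
    deliver(results)                        # land results for the ads publisher
    return results


def toy_bid_model(row: Row) -> float:
    """Hypothetical bidding rule: raise the base bid in proportion to CTR."""
    return round(row["base_bid"] * (1.0 + row["ctr"]), 4)


landed: List[float] = []
bids = run_model(
    load_factors=lambda: [{"base_bid": 1.0, "ctr": 0.05}],
    model=toy_bid_model,
    deliver=landed.extend,
)
```

Keeping the data source and the delivery target as parameters is what lets the same model logic run in a staging environment during A/B testing and in production afterwards.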
How the Model Engine Further Helps Data Scientists
• Sampled data for model training
– Data scientists can get pre-sampled, representative ads to train/test models
• Access to real production factors
– Avoids duplicated effort when data scientists develop new models with existing factors
• Self Service
– Integration: provides a staging environment similar to production for model execution, to avoid integration issues after model deployment
– Model deployment
– Online debugging: all model results/logs are kept in the system so data scientists can debug during A/B testing
• Dashboard
– Model status monitor
Model Lifecycle (Batch)
Model Lifecycle (NRT)
Anything Else for the Model?
• Is the model result reliable?
“SafeNet”
• Collect the historical behavior of the model
• Detect any significant difference
• Block the result from being sent to the publisher
• How to track?
Ads Monitor & Alert
• Expose online model results to scientists/analysts
• Dashboard
• Hourly & daily reports
• Alerts delivered to the model owner & business owner
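The SafeNet steps on this slide (collect history, detect a significant difference, block the result) could be sketched as a simple z-score guard. The slides do not say which statistic SafeNet uses, so the metric (for example, the average bid per run), the threshold `max_z`, and the function names are all assumptions.

```python
from statistics import mean, stdev


def is_safe(history, current, max_z=3.0):
    """Return True if `current` is within max_z standard deviations of history."""
    if len(history) < 2:
        return False  # too little history to judge, so block by default
    mu, sigma = mean(history), stdev(history)
    if sigma == 0:
        return current == mu
    return abs(current - mu) / sigma <= max_z


def publish_if_safe(history, current, publish):
    """Send the model result to the publisher only when it passes the check."""
    if is_safe(history, current):
        publish(current)
        return True
    return False  # blocked: in practice this is where an alert would be raised


# Toy history of a model's average bid over past runs.
past_avg_bids = [1.00, 1.02, 0.98, 1.01, 0.99]
sent = []
normal_run_published = publish_if_safe(past_avg_bids, 1.01, sent.append)
abnormal_run_published = publish_if_safe(past_avg_bids, 2.50, sent.append)
```

A blocked run keeps the previously published values live on the ads platform, which limits the damage a faulty model run can do during an experiment.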
Summary
• A/B Testing
HBase, HDFS, MySQL, Oracle, MongoDB
Java, Scala, SQL
• Machine learning model
HDFS, Kafka, Cassandra
Hive, Spark, Spark Streaming
Java, Scala, R, Python
• Dashboard
InfluxDB
Grafana