Using machine learning to determine drivers of bounce and conversion (part 2)
![Page 1: Using machine learning to determine drivers of bounce and conversion (part 2)](https://reader035.vdocuments.site/reader035/viewer/2022070603/58718dc21a28ab2c198b6d17/html5/thumbnails/1.jpg)
Using machine learning to determine drivers
of bounce and conversion (part 2)
Velocity 2016 New York
![Page 2: Using machine learning to determine drivers of bounce and conversion (part 2)](https://reader035.vdocuments.site/reader035/viewer/2022070603/58718dc21a28ab2c198b6d17/html5/thumbnails/2.jpg)
Pat Meenan (@patmeenan)
Tammy Everts (@tameverts)
![Page 3: Using machine learning to determine drivers of bounce and conversion (part 2)](https://reader035.vdocuments.site/reader035/viewer/2022070603/58718dc21a28ab2c198b6d17/html5/thumbnails/3.jpg)
What we did (and why we did it)
![Page 4: Using machine learning to determine drivers of bounce and conversion (part 2)](https://reader035.vdocuments.site/reader035/viewer/2022070603/58718dc21a28ab2c198b6d17/html5/thumbnails/4.jpg)
![Page 5: Using machine learning to determine drivers of bounce and conversion (part 2)](https://reader035.vdocuments.site/reader035/viewer/2022070603/58718dc21a28ab2c198b6d17/html5/thumbnails/5.jpg)
Get the code
https://github.com/WPO-Foundation/beacon-ml
![Page 6: Using machine learning to determine drivers of bounce and conversion (part 2)](https://reader035.vdocuments.site/reader035/viewer/2022070603/58718dc21a28ab2c198b6d17/html5/thumbnails/6.jpg)
Deep learning
weights
![Page 7: Using machine learning to determine drivers of bounce and conversion (part 2)](https://reader035.vdocuments.site/reader035/viewer/2022070603/58718dc21a28ab2c198b6d17/html5/thumbnails/7.jpg)
Random forest
Lots of random decision trees
![Page 8: Using machine learning to determine drivers of bounce and conversion (part 2)](https://reader035.vdocuments.site/reader035/viewer/2022070603/58718dc21a28ab2c198b6d17/html5/thumbnails/8.jpg)
Vectorizing the data
• Everything needs to be numeric
• Strings converted to several inputs as yes/no (1/0)
• e.g. Device manufacturer
• “Apple” would be a discrete input
• Watch out for input explosion (UA String)
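A minimal sketch of the yes/no encoding described above, using only the standard library. The `one_hot` helper, the `max_categories` cap, and the `__other__` bucket are illustrative assumptions, not taken from the beacon-ml code; the cap is one possible way to limit input explosion on fields like the UA string.

```python
from collections import Counter


def one_hot(values, max_categories=10):
    """Turn a list of strings into 1/0 columns, one per frequent value.

    Values outside the `max_categories` most common ones share a single
    "__other__" column, so a high-cardinality field does not explode
    into thousands of inputs.
    """
    top = [v for v, _ in Counter(values).most_common(max_categories)]
    columns = top + ["__other__"]
    rows = []
    for v in values:
        key = v if v in top else "__other__"
        rows.append([1 if c == key else 0 for c in columns])
    return columns, rows


# Hypothetical device-manufacturer field from six sessions.
manufacturers = ["Apple", "Samsung", "Apple", "LG", "Apple", "Samsung"]
columns, rows = one_hot(manufacturers, max_categories=2)
for name, row in zip(manufacturers, rows):
    print(name, row)
```

Each session ends up with exactly one 1 among the columns generated for that field.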
![Page 9: Using machine learning to determine drivers of bounce and conversion (part 2)](https://reader035.vdocuments.site/reader035/viewer/2022070603/58718dc21a28ab2c198b6d17/html5/thumbnails/9.jpg)
Balancing the data
• 3% conversion rate
• 97% accurate by always guessing no
• Subsample the data for 50/50 mix
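One way the 50/50 subsampling could look, as a sketch on toy data. The `balance` function and the fixed seed are assumptions for illustration; it downsamples whichever class is larger so that a model can no longer score 97% by always answering "no".

```python
import random


def balance(rows, labels, seed=0):
    """Subsample the majority class so both classes have equal counts."""
    rng = random.Random(seed)
    pos = [i for i, y in enumerate(labels) if y == 1]
    neg = [i for i, y in enumerate(labels) if y == 0]
    n = min(len(pos), len(neg))
    keep = rng.sample(pos, n) + rng.sample(neg, n)
    rng.shuffle(keep)  # avoid all positives coming first
    return [rows[i] for i in keep], [labels[i] for i in keep]


# Toy data: 3 converting sessions out of 100 (a 3% conversion rate).
labels = [1] * 3 + [0] * 97
rows = [[i] for i in range(100)]
x_bal, y_bal = balance(rows, labels)
print(len(y_bal), sum(y_bal))
```

The cost is that most of the majority-class sessions are thrown away, which is one reason to gather a lot of data first.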
![Page 10: Using machine learning to determine drivers of bounce and conversion (part 2)](https://reader035.vdocuments.site/reader035/viewer/2022070603/58718dc21a28ab2c198b6d17/html5/thumbnails/10.jpg)
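Smoothing the data
ML works best on normally distributed data

    from sklearn.preprocessing import StandardScaler

    scaler = StandardScaler()
    # Fit the scaler on the training set only, then reuse it for validation
    x_train = scaler.fit_transform(x_train)
    x_val = scaler.transform(x_val)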
![Page 11: Using machine learning to determine drivers of bounce and conversion (part 2)](https://reader035.vdocuments.site/reader035/viewer/2022070603/58718dc21a28ab2c198b6d17/html5/thumbnails/11.jpg)
Validation data
• Train on 80% of the data
• Validate on 20% to prevent overfitting
– Accuracy is measured on the validation set
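A small sketch of the 80/20 split, assuming rows and labels are plain Python lists. The `split_train_val` helper is hypothetical; scikit-learn's `train_test_split` does the same job and would be the more usual choice in a real pipeline.

```python
import random


def split_train_val(x, y, val_fraction=0.2, seed=0):
    """Shuffle row indices, then hold out `val_fraction` of them."""
    idx = list(range(len(x)))
    random.Random(seed).shuffle(idx)
    n_val = int(len(idx) * val_fraction)
    val, train = idx[:n_val], idx[n_val:]
    return ([x[i] for i in train], [y[i] for i in train],
            [x[i] for i in val], [y[i] for i in val])


# Toy data: ten rows whose label is the parity of the single feature.
x = [[i] for i in range(10)]
y = [i % 2 for i in range(10)]
x_train, y_train, x_val, y_val = split_train_val(x, y)
print(len(x_train), len(x_val))
```

The model never sees the held-out rows during training, so accuracy on them is a fairer estimate than accuracy on the training rows.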
![Page 12: Using machine learning to determine drivers of bounce and conversion (part 2)](https://reader035.vdocuments.site/reader035/viewer/2022070603/58718dc21a28ab2c198b6d17/html5/thumbnails/12.jpg)
Input/output relationships
• SSL highly correlated with conversions
• Long sessions highly correlated with not bouncing
• Remove correlated features from training
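A sketch of dropping such features before training. The column names (`ssl`, `session_length`, `dom_ready`) and the `drop_features` helper are hypothetical; the real beacon fields may be named differently. Session length gives the answer away for bounce because a bounce is a single-page session.

```python
# Hypothetical column names; which features to drop depends on the target.
LEAKY_FOR_CONVERSION = {"ssl"}
LEAKY_FOR_BOUNCE = {"session_length"}


def drop_features(names, rows, leaky):
    """Return names and rows with the leaky columns removed."""
    keep = [i for i, name in enumerate(names) if name not in leaky]
    return [names[i] for i in keep], [[row[i] for i in keep] for row in rows]


names = ["dom_ready", "ssl", "session_length"]
rows = [[1200, 1, 5], [3400, 0, 1]]
conv_names, conv_rows = drop_features(names, rows, LEAKY_FOR_CONVERSION)
print(conv_names, conv_rows)
```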
![Page 13: Using machine learning to determine drivers of bounce and conversion (part 2)](https://reader035.vdocuments.site/reader035/viewer/2022070603/58718dc21a28ab2c198b6d17/html5/thumbnails/13.jpg)
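Training random forest

    from sklearn.ensemble import RandomForestClassifier

    clf = RandomForestClassifier(
        n_estimators=FOREST_SIZE,  # number of trees in the forest
        criterion='gini',
        max_depth=None,
        min_samples_split=2,
        min_samples_leaf=1,
        min_weight_fraction_leaf=0.0,
        max_features='auto',  # same as 'sqrt' for classifiers; 'auto' was removed in newer scikit-learn
        max_leaf_nodes=None,
        bootstrap=True,
        oob_score=False,
        n_jobs=12,  # build trees on 12 cores in parallel
        random_state=None,
        verbose=2,  # log progress while training
        warm_start=False,
        class_weight=None)
    clf.fit(x_train, y_train)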
![Page 14: Using machine learning to determine drivers of bounce and conversion (part 2)](https://reader035.vdocuments.site/reader035/viewer/2022070603/58718dc21a28ab2c198b6d17/html5/thumbnails/14.jpg)
Feature importances

    clf.feature_importances_
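A minimal sketch of turning `feature_importances_` into a ranked list. The data is synthetic and the feature names are hypothetical; only the first column drives the label, so it should come out on top. scikit-learn returns one impurity-based score per input column, normalized to sum to 1.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.RandomState(0)
feature_names = ["dom_ready", "noise_a", "noise_b"]  # hypothetical names
x_train = rng.rand(500, 3)
y_train = (x_train[:, 0] > 0.5).astype(int)  # only column 0 drives the label

clf = RandomForestClassifier(n_estimators=50, random_state=0)
clf.fit(x_train, y_train)

# Pair each importance with its name and sort from most to least important.
ranked = sorted(zip(feature_names, clf.feature_importances_),
                key=lambda pair: pair[1], reverse=True)
for name, score in ranked:
    print(f"{name}: {score:.3f}")
```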
![Page 15: Using machine learning to determine drivers of bounce and conversion (part 2)](https://reader035.vdocuments.site/reader035/viewer/2022070603/58718dc21a28ab2c198b6d17/html5/thumbnails/15.jpg)
Training deep learning

    from keras.models import Sequential

    model = Sequential()
    model.add(...)
    model.compile(optimizer='adagrad',
                  loss='binary_crossentropy',
                  metrics=["accuracy"])
    model.fit(x_train, y_train,
              nb_epoch=EPOCH_COUNT,  # Keras 1.x name; later releases call it `epochs`
              batch_size=32,
              validation_data=(x_val, y_val),
              verbose=2,
              shuffle=True)
![Page 16: Using machine learning to determine drivers of bounce and conversion (part 2)](https://reader035.vdocuments.site/reader035/viewer/2022070603/58718dc21a28ab2c198b6d17/html5/thumbnails/16.jpg)
Understanding deep learning
![Page 17: Using machine learning to determine drivers of bounce and conversion (part 2)](https://reader035.vdocuments.site/reader035/viewer/2022070603/58718dc21a28ab2c198b6d17/html5/thumbnails/17.jpg)
Brute force FTW
• 93 input “features”
• Train 93 models with 1 input
– Measuring the prediction accuracy of each
• Train 92 models with 2 inputs
– Top feature from first round
– Measure combined prediction accuracy
• Lather, rinse, repeat…
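The rounds above amount to greedy forward selection, which can be sketched as follows. The `forward_select` function and the `score` callback are assumptions for illustration; in a real run, `score(subset)` would train a model on just those inputs and return its validation accuracy. The toy scorer uses made-up numbers only so the example runs on its own.

```python
def forward_select(feature_names, score, rounds):
    """Greedy forward selection.

    Each round tries every remaining feature on top of the ones already
    chosen, keeps the best, and records the combined score.
    """
    chosen, history = [], []
    remaining = list(feature_names)
    for _ in range(min(rounds, len(remaining))):
        best = max(remaining, key=lambda f: score(chosen + [f]))
        chosen.append(best)
        remaining.remove(best)
        history.append((list(chosen), score(chosen)))
    return history


# Stand-in scorer with made-up gains, for illustration only.
TOY_GAIN = {"dom_ready": 0.30, "session_load_time": 0.08, "bandwidth": 0.01}


def toy_score(subset):
    return 0.5 + sum(TOY_GAIN[f] for f in subset)


history = forward_select(list(TOY_GAIN), toy_score, rounds=2)
for subset, accuracy in history:
    print(subset, round(accuracy, 2))
```

With 93 features the first round calls `score` 93 times and the second 92 times, matching the model counts on the slide.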
![Page 18: Using machine learning to determine drivers of bounce and conversion (part 2)](https://reader035.vdocuments.site/reader035/viewer/2022070603/58718dc21a28ab2c198b6d17/html5/thumbnails/18.jpg)
Visualizing the model
• Take trained model (X inputs)
• Vary inputs
– 100ms to 20 seconds in 100ms intervals
• Apply the data smoothing from training set
• model.predict_proba
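A sketch of that sweep with a single input. The data is synthetic, and a scikit-learn `LogisticRegression` stands in for the trained network purely so the example is self-contained; the point is the order of operations: build the range of input values, apply the scaler fitted on the training set, then ask the model for probabilities.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import StandardScaler

# Synthetic single-input data: load time in ms, with slower sessions made
# more likely to bounce. Purely illustrative.
rng = np.random.RandomState(0)
load_ms = rng.uniform(100, 20000, size=(2000, 1))
bounced = (rng.uniform(0, 20000, size=2000) < load_ms[:, 0]).astype(int)

scaler = StandardScaler()
model = LogisticRegression()
model.fit(scaler.fit_transform(load_ms), bounced)

# Sweep the input from 100 ms to 20 s in 100 ms steps.
sweep = np.arange(100, 20001, 100).reshape(-1, 1)
# Reuse the training-set scaler (transform, not fit_transform).
p_bounce = model.predict_proba(scaler.transform(sweep))[:, 1]

for ms, p in zip(sweep[::50, 0], p_bounce[::50]):
    print(f"{int(ms):>6} ms -> P(bounce) = {p:.2f}")
```

Plotting `p_bounce` against `sweep` gives the kind of curve the slide describes; with more than one input, the others would be held at fixed values while one is varied.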
![Page 19: Using machine learning to determine drivers of bounce and conversion (part 2)](https://reader035.vdocuments.site/reader035/viewer/2022070603/58718dc21a28ab2c198b6d17/html5/thumbnails/19.jpg)
What we learned
![Page 20: Using machine learning to determine drivers of bounce and conversion (part 2)](https://reader035.vdocuments.site/reader035/viewer/2022070603/58718dc21a28ab2c198b6d17/html5/thumbnails/20.jpg)
What’s in our beacon?
• Top-level – domain, timestamp, SSL
• Session – start time, length (in pages), total load time
• User agent – browser, OS, mobile ISP
• Geo – country, city, organization, ISP, network speed
• Bandwidth
• Timers – base, custom, user-defined
• Custom metrics
• HTTP headers
https://docs.soasta.com/whatsinbeacon/
![Page 21: Using machine learning to determine drivers of bounce and conversion (part 2)](https://reader035.vdocuments.site/reader035/viewer/2022070603/58718dc21a28ab2c198b6d17/html5/thumbnails/21.jpg)
Finding 1
Maybe everything doesn’t matter after all
![Page 22: Using machine learning to determine drivers of bounce and conversion (part 2)](https://reader035.vdocuments.site/reader035/viewer/2022070603/58718dc21a28ab2c198b6d17/html5/thumbnails/22.jpg)
Bounce rate
![Page 23: Using machine learning to determine drivers of bounce and conversion (part 2)](https://reader035.vdocuments.site/reader035/viewer/2022070603/58718dc21a28ab2c198b6d17/html5/thumbnails/23.jpg)
Finding 2
DOM ready (aka DOM content loaded) and average session load time were the best indicators of bounce rate
![Page 24: Using machine learning to determine drivers of bounce and conversion (part 2)](https://reader035.vdocuments.site/reader035/viewer/2022070603/58718dc21a28ab2c198b6d17/html5/thumbnails/24.jpg)
Up to 89.5% accuracy
![Page 25: Using machine learning to determine drivers of bounce and conversion (part 2)](https://reader035.vdocuments.site/reader035/viewer/2022070603/58718dc21a28ab2c198b6d17/html5/thumbnails/25.jpg)
![Page 26: Using machine learning to determine drivers of bounce and conversion (part 2)](https://reader035.vdocuments.site/reader035/viewer/2022070603/58718dc21a28ab2c198b6d17/html5/thumbnails/26.jpg)
Finding 3
When it came to getting high predictability, conversion data was tougher than bounce data
![Page 27: Using machine learning to determine drivers of bounce and conversion (part 2)](https://reader035.vdocuments.site/reader035/viewer/2022070603/58718dc21a28ab2c198b6d17/html5/thumbnails/27.jpg)
81% prediction accuracy was as high as we got
![Page 28: Using machine learning to determine drivers of bounce and conversion (part 2)](https://reader035.vdocuments.site/reader035/viewer/2022070603/58718dc21a28ab2c198b6d17/html5/thumbnails/28.jpg)
Finding 4
Pages with more scripts were less likely to convert
![Page 29: Using machine learning to determine drivers of bounce and conversion (part 2)](https://reader035.vdocuments.site/reader035/viewer/2022070603/58718dc21a28ab2c198b6d17/html5/thumbnails/29.jpg)
![Page 30: Using machine learning to determine drivers of bounce and conversion (part 2)](https://reader035.vdocuments.site/reader035/viewer/2022070603/58718dc21a28ab2c198b6d17/html5/thumbnails/30.jpg)
Finding 5
The number of DOM elements matters…a lot
![Page 31: Using machine learning to determine drivers of bounce and conversion (part 2)](https://reader035.vdocuments.site/reader035/viewer/2022070603/58718dc21a28ab2c198b6d17/html5/thumbnails/31.jpg)
![Page 32: Using machine learning to determine drivers of bounce and conversion (part 2)](https://reader035.vdocuments.site/reader035/viewer/2022070603/58718dc21a28ab2c198b6d17/html5/thumbnails/32.jpg)
Finding 6
Mobile-related measurements weren’t meaningful predictors of conversions
![Page 33: Using machine learning to determine drivers of bounce and conversion (part 2)](https://reader035.vdocuments.site/reader035/viewer/2022070603/58718dc21a28ab2c198b6d17/html5/thumbnails/33.jpg)
![Page 34: Using machine learning to determine drivers of bounce and conversion (part 2)](https://reader035.vdocuments.site/reader035/viewer/2022070603/58718dc21a28ab2c198b6d17/html5/thumbnails/34.jpg)
Finding 7
Some conventional metrics were not as important as we thought
![Page 35: Using machine learning to determine drivers of bounce and conversion (part 2)](https://reader035.vdocuments.site/reader035/viewer/2022070603/58718dc21a28ab2c198b6d17/html5/thumbnails/35.jpg)
Feature Importance (bounce)
Start render 69 ~top 3
![Page 36: Using machine learning to determine drivers of bounce and conversion (part 2)](https://reader035.vdocuments.site/reader035/viewer/2022070603/58718dc21a28ab2c198b6d17/html5/thumbnails/36.jpg)
Things to watch out for
(other than dangling prepositions)
![Page 37: Using machine learning to determine drivers of bounce and conversion (part 2)](https://reader035.vdocuments.site/reader035/viewer/2022070603/58718dc21a28ab2c198b6d17/html5/thumbnails/37.jpg)
Yep, checkout pages are SLOW
![Page 38: Using machine learning to determine drivers of bounce and conversion (part 2)](https://reader035.vdocuments.site/reader035/viewer/2022070603/58718dc21a28ab2c198b6d17/html5/thumbnails/38.jpg)
![Page 39: Using machine learning to determine drivers of bounce and conversion (part 2)](https://reader035.vdocuments.site/reader035/viewer/2022070603/58718dc21a28ab2c198b6d17/html5/thumbnails/39.jpg)
Takeaways
![Page 40: Using machine learning to determine drivers of bounce and conversion (part 2)](https://reader035.vdocuments.site/reader035/viewer/2022070603/58718dc21a28ab2c198b6d17/html5/thumbnails/40.jpg)
1. YMMV
2. Do try this at home
3. Gather your RUM data (lots of it)
4. Run the machine learning against it
5. If you get unexpected results, keep digging
![Page 41: Using machine learning to determine drivers of bounce and conversion (part 2)](https://reader035.vdocuments.site/reader035/viewer/2022070603/58718dc21a28ab2c198b6d17/html5/thumbnails/41.jpg)
Thanks!
@patmeenan
@tameverts