How Can Behavioral Targeting Help Online Advertising? Yan et al. 2009. October 23, 2014. Sam Hewitt



What is behavioral targeting?

• Behavioral targeting (BT) is used by advertisers to target ads to different segments of users based on the users' web history.
• Web history consists of search queries and page visits.

The authors answer three questions related to BT

• Can BT help online marketers?
  • Are users who clicked on the same ad different from those who did not?
• How much can BT help?
  • They use the metric click-through rate (CTR)
• Which BT strategy is the best?
  • Short window vs. long window
  • Browsing history vs. query history

Modeling user history

• Browsing history using weighted TFIDF
• Users are represented by a matrix
  • g = number of users, l = total number of URLs
  • Users can be thought of as the documents, and the URLs can be thought of as the terms in the TFIDF model
  • Users removed if they clicked on more than 100 ads in one day
  • Ads removed if they have fewer than 30 clicks over seven days
• Each entry in the matrix is given the value:

$$u_{ij} = \log\left[\mathrm{count}(\text{clicks on URL } j \text{ by user } i) + 1\right] \times \log\left[\frac{l}{\mathrm{count}(\text{users who have clicked on URL } j)}\right]$$
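The weighting above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the function name `browsing_weights`, the nested-dict input layout, and the example data are all assumptions. It follows the slide's formula literally, including l (the number of URLs) in the second logarithm; standard TF-IDF would put the number of documents (here users, g) in that position.

```python
import math

def browsing_weights(click_counts):
    """Weight each (user, URL) pair following the slide's formula literally.

    click_counts: {user: {url: number of clicks}} (hypothetical input layout).
    Returns {user: {url: weight}} for URLs the user clicked at least once.
    """
    urls = {url for per_user in click_counts.values() for url in per_user}
    l = len(urls)  # total number of URLs, the numerator used on the slide

    # Count how many distinct users clicked each URL.
    users_per_url = dict.fromkeys(urls, 0)
    for per_user in click_counts.values():
        for url, n in per_user.items():
            if n > 0:
                users_per_url[url] += 1

    return {
        user: {
            url: math.log(n + 1) * math.log(l / users_per_url[url])
            for url, n in per_user.items()
            if n > 0
        }
        for user, per_user in click_counts.items()
    }

# Made-up example: three users, three URLs.
example = {
    "u1": {"a.com": 3, "b.com": 1},
    "u2": {"a.com": 1},
    "u3": {"c.com": 2},
}
weights = browsing_weights(example)
print(weights["u1"])
```

URLs clicked by many users get a small second factor, so they contribute less to user similarity than rarely clicked URLs.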

Modeling user history

• Query history uses bag of words to populate the TFIDF matrix
  • Authors used Porter stemming
  • Stop words were removed, as well as terms that appeared only once
    • If a term appeared only once, it could not be used to determine similarity between users
    • Number of terms reduced from 765k to 294k
• Term frequency is discounted based on clicks that a query led to
  • If a user searched a term three times and clicked once on an ad that was served because of that term, the term frequency will be two
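The click discount in the last bullet amounts to subtracting ad clicks from search counts per term. A small sketch under stated assumptions: the function name, the list-of-terms inputs, and the clamping at zero are illustrative choices, not details from the paper, and stemming and stop-word removal are assumed to have been applied already.

```python
from collections import Counter

def discounted_term_frequency(searched_terms, ad_click_terms):
    """Subtract ad clicks attributed to a term from that term's search count.

    searched_terms: one entry per time the user searched the term.
    ad_click_terms: one entry per ad click that was served because of the term.
    Counts are clamped at zero (an assumption, to avoid negative frequencies).
    """
    tf = Counter(searched_terms)
    tf.subtract(Counter(ad_click_terms))
    return {term: max(count, 0) for term, count in tf.items()}

# The slide's example: three searches, one resulting ad click.
tf = discounted_term_frequency(["shoe", "shoe", "shoe", "hotel"], ["shoe"])
print(tf)
```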

Modeling user history

• Four BT strategies were assessed
  • Long Term behavior based on Page Views (LP)
  • Long Term behavior based on Query Terms (LQ)
  • Short Term behavior based on Page Views (SP)
  • Short Term behavior based on Query Terms (SQ)
• Long term = seven days
• Short term = one day

Calculating similarity

• : the number of users who clicked ad j

• Within ad similarity: users who clicked the same ad

Calculating similarity

• Between ad similarity: users who clicked different ads

• Ratio of within and between ad similarity
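The formulas for these three quantities did not survive in the transcript, so the following is only a sketch of one plausible reading. It assumes cosine similarity between sparse user vectors, a plain average over user pairs, and a ratio defined as the mean of the two within-ad scores divided by the between-ad score. All function names and the example vectors are hypothetical.

```python
import math
from itertools import combinations

def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(key, 0.0) for key, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)

def within_similarity(users):
    """Average similarity over all pairs of users who clicked the same ad."""
    pairs = list(combinations(users, 2))
    if not pairs:
        raise ValueError("need at least two users")
    return sum(cosine(a, b) for a, b in pairs) / len(pairs)

def between_similarity(users_a, users_b):
    """Average similarity between users who clicked two different ads."""
    total = sum(cosine(a, b) for a in users_a for b in users_b)
    return total / (len(users_a) * len(users_b))

# Made-up user vectors for two ads.
ad_a = [{"sports": 1.0, "news": 1.0}, {"sports": 1.0}]
ad_b = [{"travel": 1.0}, {"travel": 2.0, "news": 1.0}]

s_w_a = within_similarity(ad_a)
s_w_b = within_similarity(ad_b)
s_b = between_similarity(ad_a, ad_b)
# Assumed definition of the ratio: mean within-ad score over between-ad score.
ratio = ((s_w_a + s_w_b) / 2) / s_b
print(s_w_a, s_w_b, s_b, ratio)
```

A ratio well above 1 means users who clicked the same ad resemble each other more than they resemble users who clicked a different ad.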

Results from similarity calculation

Strategy  Sw      Sb      R
LP        0.1417  0.0252  28.9217
LQ        0.2239  0.0196  44.2908
SP        0.1532  0.0281  24.5086
SQ        0.2594  0.0161  91.1890

* Scores are averages across all ads or query terms

• Users who clicked the same ad are up to 90 times more similar
• The authors chose to segment groups by k-means and CLUTO (a clustering software package)
• Of all ad pairs, 99.37% had higher within-ad user similarity than between-ad similarity

How is this helpful

• Marketers want to increase CTR on ads AND keep impressions high
• We have to deal with the precision vs. recall tradeoff
• In this case
  • Precision of user segment = CTR of that segment
  • Recall of user segment = clicks of that segment / total clicks
• Authors use F-measure to calculate tradeoff
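The definitions above translate directly into code. The slide only says "F-measure", so this sketch assumes the balanced F1 score (the harmonic mean of precision and recall). The function name and the example numbers are made up for illustration.

```python
def segment_scores(segment_clicks, segment_impressions, total_clicks):
    """Precision, recall, and F-measure of one user segment for one ad.

    precision = CTR of the segment (clicks / impressions within the segment)
    recall    = share of all the ad's clicks that come from the segment
    f_measure = harmonic mean of the two (balanced F1, an assumption)
    """
    precision = segment_clicks / segment_impressions
    recall = segment_clicks / total_clicks
    if precision + recall == 0:
        return precision, recall, 0.0
    f_measure = 2 * precision * recall / (precision + recall)
    return precision, recall, f_measure

# Made-up numbers: the segment saw the ad 500 times and clicked 50 times,
# while the ad received 200 clicks across all users.
precision, recall, f_measure = segment_scores(50, 500, 200)
print(precision, recall, f_measure)
```

A small segment can reach a high CTR while capturing few of the ad's clicks; the F-measure penalizes that by staying low unless both values are reasonably high.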

Improvement to CTR by segmentation

• CTR improved by up to 670% over the non-segmented CTR

Overall performance by different methods

Method (segments)       Metric  LP      LQ      SP      SQ
K-means (20 segments)   Pre     8.67%   8.60%   13.35%  17.08%
                        Rec     10.20%  22.34%  7.63%   25.58%
                        F       0.08    0.1     0.08    0.16
CLUTO (20 segments)     Pre     8.62%   8.56%   14.61%  19.13%
                        Rec     10.01%  20.51%  7.86%   21.43%
                        F       0.08    0.1     0.07    0.15
K-means (40 segments)   Pre     8.84%   9.23%   19.76%  20.53%
                        Rec     9.48%   18.20%  4.83%   20.75%
                        F       0.08    0.1     0.06    0.16
CLUTO (40 segments)     Pre     8.76%   9.14%   19.38%  22.80%
                        Rec     8.44%   17.88%  4.52%   17.78%
                        F       0.08    0.1     0.06    0.14
K-means (80 segments)   Pre     9.02%   9.63%   23.47%  23.49%
                        Rec     8.93%   17.62%  4.06%   19.35%
                        F       0.08    0.1     0.06    0.16
CLUTO (80 segments)     Pre     8.85%   9.51%   23.09%  27.00%
                        Rec     7.82%   16.65%  4.00%   15.55%
                        F       0.07    0.1     0.06    0.15
K-means (160 segments)  Pre     9.09%   9.93%   25.68%  25.81%
                        Rec     8.54%   17.98%  3.92%   19.78%
                        F       0.074   0.1     0.06    0.17
CLUTO (160 segments)    Pre     8.87%   9.84%   25.43%  31.02%
                        Rec     7.24%   15.58%  3.78%   14.52%
                        F       0.07    0.1     0.06    0.15

Issues with paper

• Authors are from Microsoft, which makes money from ads
• They do not reveal their source for this data (likely Yahoo!/Bing)
• They do not discuss the topic of over-segmentation
• They use outdated methodologies that work, but there is likely a better way (mentioned in future work)

Summary

• Behavioral targeting is useful to advertisers
• It is better to use a short window to segment
• It is better to use query terms to segment
• Increasing the number of segments provides better targeting

Ad Entropy

• As mentioned, marketers want high CTR and impressions
• If specific user segments contribute highly to impressions, then one could just send the ad to them
• If an ad is clicked more uniformly across all segments, then you need to send the ad to some users who might not be interested in the ad, hurting CPC

Ad Entropy

• Probability that a user is in a segment given that they clicked on ad j

• Entropy of an ad is defined as

• Worst case of entropy is a uniform distribution across all groups.
• Desired outcome is that user segmentation decreases entropy
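The entropy formula itself is missing from the transcript. The sketch below assumes the standard Shannon entropy over the per-segment click distribution of one ad, with natural logarithms; the function name and the click counts are made up.

```python
import math

def ad_entropy(clicks_per_segment):
    """Shannon entropy of one ad's clicks across user segments.

    clicks_per_segment: click counts for the ad, one entry per segment.
    Each count is turned into P(segment | user clicked the ad); segments
    with zero clicks contribute nothing to the sum.
    """
    total = sum(clicks_per_segment)
    probabilities = [c / total for c in clicks_per_segment if c > 0]
    return -sum(p * math.log(p) for p in probabilities)

# Clicks spread evenly over four segments: the worst case, log(4).
uniform = ad_entropy([10, 10, 10, 10])
# Clicks concentrated in one segment: much lower entropy.
concentrated = ad_entropy([97, 1, 1, 1])
print(uniform, concentrated)
```

Lower entropy means an ad's clicks come from a few segments, which is the case where delivering the ad only to those segments loses few clicks.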

Ad Entropy

• Results