Computational advertising

Kira Radinsky

Slides based on material from the paper “Bandits for Taxonomies: A Model-based Approach” by Sandeep Pandey, Deepak Agarwal, Deepayan Chakrabarti, Vanja Josifovski, in SDM 2007

The Content Match Problem

[Figure: Advertisers → Ads DB → Ads]

Ad impression: showing an ad to a user


Ad click: user click leads to revenue for ad server and content provider

The Content Match Problem: Match ads to pages to maximize clicks

Maximizing the number of clicks means:
• For each webpage, find the ad with the best Click-Through Rate (CTR)
• But without wasting too many impressions in learning this.

Background: Bandits

[Figure: three bandit “arms” with unknown payoff probabilities p1, p2, p3]

Pull arms sequentially so as to maximize the total expected reward:
• Estimate payoff probabilities
• Bias the estimation process towards ‘better’ arms.

Background: Bandits Solutions

Try 1: Greedy solution:
• Compute the sample mean of an arm ‘A’ by dividing the total reward received from the arm by the number of times the arm has been pulled.
• At each time step, choose the arm with the highest sample mean.

Try 2: Naïve solution:
• Pull each arm an equal number of times.

Epsilon-greedy strategy:
• The best arm is selected for a proportion 1 − ε of the trials.
• Another arm is randomly selected (with uniform probability) for a proportion ε of the trials.
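The strategies above can be sketched in a few lines of Python. This is a minimal simulation under stated assumptions, not code from the paper: the function name `epsilon_greedy`, the Bernoulli reward model, and the default `epsilon=0.1` are illustrative choices, and the exploration step draws uniformly over all arms (including the current best), which is a common simplification.

```python
import random

def epsilon_greedy(payoff_probs, n_pulls, epsilon=0.1, seed=0):
    """Simulate epsilon-greedy on Bernoulli arms; returns (pulls, rewards) per arm."""
    rng = random.Random(seed)
    n_arms = len(payoff_probs)
    pulls = [0] * n_arms      # number of times each arm has been pulled
    rewards = [0] * n_arms    # total reward received from each arm

    def sample_mean(arm):
        # Total reward divided by number of pulls; unpulled arms score 0.
        return rewards[arm] / pulls[arm] if pulls[arm] else 0.0

    for _ in range(n_pulls):
        if rng.random() < epsilon:
            arm = rng.randrange(n_arms)                 # explore: uniform random arm
        else:
            arm = max(range(n_arms), key=sample_mean)   # exploit: best sample mean
        reward = 1 if rng.random() < payoff_probs[arm] else 0
        pulls[arm] += 1
        rewards[arm] += reward
    return pulls, rewards
```

Setting `epsilon=0.0` recovers the purely greedy policy of Try 1. In this sketch ties go to the first arm, so `epsilon_greedy([0.1, 0.9], 100, epsilon=0.0)` never leaves arm 0, which illustrates why some exploration is needed.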

Background: Bandits

[Figure: pages vs. ads, with one bandit each for Webpage 1, Webpage 2, Webpage 3]

Bandit “arms” are ads

Background: Bandits

[Figure: matrix with webpages as rows and ads as columns; one row is one instance of the MAB problem; each cell holds an unknown CTR]

Content Match = A matrix
• Each row is a bandit
• Each cell has an unknown CTR
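To make the matrix view concrete, here is a small simulator sketch. Everything in it is hypothetical (the class name `ContentMatchSimulator`, its methods, and the idea of drawing clicks as independent Bernoulli trials); it only illustrates that each row is a bandit whose arms are ads and whose cells hide a CTR that can be learned only through impressions.

```python
import random

class ContentMatchSimulator:
    """Hypothetical simulator: ctr[page][ad] is the hidden click probability."""

    def __init__(self, ctr, seed=0):
        self.ctr = ctr
        self.rng = random.Random(seed)
        n_pages, n_ads = len(ctr), len(ctr[0])
        # Per-cell counters: the only information a policy gets to see.
        self.impressions = [[0] * n_ads for _ in range(n_pages)]
        self.clicks = [[0] * n_ads for _ in range(n_pages)]

    def show(self, page, ad):
        """One impression of `ad` on `page`; returns 1 on click, else 0."""
        click = 1 if self.rng.random() < self.ctr[page][ad] else 0
        self.impressions[page][ad] += 1
        self.clicks[page][ad] += click
        return click

    def observed_ctr(self, page, ad):
        """Observed clicks / impressions for a cell, or None if never shown."""
        n = self.impressions[page][ad]
        return self.clicks[page][ad] / n if n else None
```

A policy would call `show(page, ad)` once per impression and base its next choice on the counters, never on the hidden `ctr` matrix.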

Background: Bandits

Bandit Policy:
1. Assign priority to each arm (Priority 1, Priority 2, Priority 3)
2. “Pull” arm with max priority and observe reward (Allocation)
3. Update priorities (Estimation)
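The three steps of a bandit policy map directly onto a loop. The sketch below plugs in UCB1 as the priority function; that is an assumption made for illustration, since the slides do not say which priority is used. The names `ucb1_priority` and `run_priority_policy` and the Bernoulli reward simulation are likewise hypothetical.

```python
import math
import random

def ucb1_priority(clicks, pulls, total_pulls):
    """Example priority: sample mean plus an exploration bonus (UCB1)."""
    if pulls == 0:
        return float("inf")  # force every arm to be tried once
    return clicks / pulls + math.sqrt(2 * math.log(total_pulls) / pulls)

def run_priority_policy(payoff_probs, n_pulls, priority=ucb1_priority, seed=0):
    rng = random.Random(seed)
    n_arms = len(payoff_probs)
    clicks, pulls = [0] * n_arms, [0] * n_arms
    for t in range(1, n_pulls + 1):
        # 1. Assign a priority to each arm.
        priorities = [priority(clicks[a], pulls[a], t) for a in range(n_arms)]
        # 2. Allocation: pull the arm with max priority and observe the reward.
        arm = max(range(n_arms), key=priorities.__getitem__)
        reward = 1 if rng.random() < payoff_probs[arm] else 0
        # 3. Estimation: update the statistics the priorities are computed from.
        clicks[arm] += reward
        pulls[arm] += 1
    return clicks, pulls
```

Any other priority with the same signature can be passed as `priority`, which keeps allocation and estimation separable in the way the policy description suggests.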

Background: Bandits

Why not simply apply a bandit policy directly to the problem?
• Converges too slowly: there are many instances of the MAB problem (one per webpage), each with many arms (one per ad)
• Additional structure is available, and we wish to use it.

Multi-level Policy

[Figure: ads grouped into ad classes; webpages grouped into webpage classes]

Consider only two levels.

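Multi-level Policy

Idea: CTRs in a block are homogeneous

[Figure: matrix whose columns are ad child classes grouped under the ad parent classes Apparel, Computers, Travel, and whose rows are grouped under the page classes Apparel, Computers, Travel; labels: “Block”, “One MAB problem instance”]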

Multi-level Policy

CTRs in a block are homogeneous
• Used in allocation (picking an ad for each new page)
• Used in estimation (updating priorities after each observation)

Multi-level Policy - Allocation

[Figure: block matrix over classes A, C, T (Apparel, Computers, Travel); a page classifier assigns the new page “?” to a row]

• Classify webpage → page class, parent page class
• Run bandit on ad parent classes → pick one ad parent class
• The two steps above result in a block
• Run bandit among cells → pick one ad class
• (In general, continue from root to leaf → final ad)


Bandits at higher levels:
• Use aggregated information
• Have fewer bandit arms
→ Quickly figure out the best ad parent class
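The two-level allocation can be sketched as two nested bandit choices. This is an illustration under assumptions, not the paper's implementation: the child class names in `AD_TAXONOMY`, the class `TwoLevelAllocator`, and the use of a UCB1-style priority at both levels are all hypothetical, and the page classifier is assumed to have already produced `page_class` and `parent_page_class`.

```python
import math

AD_TAXONOMY = {  # hypothetical two-level ad taxonomy: parent class -> child classes
    "Apparel": ["Shoes", "Jackets"],
    "Computers": ["Laptops", "Printers"],
    "Travel": ["Flights", "Hotels"],
}

class TwoLevelAllocator:
    def __init__(self, ad_taxonomy):
        self.ad_taxonomy = ad_taxonomy
        self.stats = {}  # key -> [clicks, impressions]; keys are blocks or cells

    def _priority(self, key, total):
        clicks, n = self.stats.get(key, (0, 0))
        if n == 0:
            return float("inf")  # try every arm at least once
        return clicks / n + math.sqrt(2 * math.log(total) / n)

    def _pick(self, keys):
        total = sum(self.stats.get(k, (0, 0))[1] for k in keys) + 1
        return max(keys, key=lambda k: self._priority(k, total))

    def allocate(self, page_class, parent_page_class):
        # Level 1: bandit over ad parent classes -> selects a block.
        blocks = [("block", parent_page_class, p) for p in self.ad_taxonomy]
        block = self._pick(blocks)
        # Level 2: bandit among the cells of that block -> selects an ad class.
        cells = [("cell", page_class, c) for c in self.ad_taxonomy[block[2]]]
        cell = self._pick(cells)
        return block, cell

    def update(self, block, cell, click):
        # The block aggregates the observations of all of its cells.
        for key in (block, cell):
            s = self.stats.setdefault(key, [0, 0])
            s[0] += click
            s[1] += 1
```

Because the level-1 bandit has only as many arms as there are ad parent classes and pools clicks from every cell in a block, it can settle on a good parent class with far fewer impressions than a flat bandit over all ad classes.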

Multi-level Policy - Estimation

CTRs in a block are homogeneous
• Observations from one cell also give information about others in the block.

How can we model this dependence?

Shrinkage Model

• Each cell is described by its #clicks and its #impressions
• All cells in a block come from the same distribution
• Intuitively, this leads to shrinkage of cell CTRs towards block CTRs

Estimated CTR: a combination of the Beta prior (“block CTR”) and the observed CTR
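One standard way to realize this kind of shrinkage is a Beta-Binomial model: if a cell's CTR has a Beta(a, b) prior shared by its block and its clicks are binomial given its impressions N, the posterior-mean CTR is (a + clicks) / (a + b + N). That equals w · (block CTR) + (1 − w) · (observed CTR) with w = (a + b) / (a + b + N). The sketch below assumes this model; the function names and the `strength` pseudo-count are hypothetical, and how the paper fits the Beta parameters is not covered here.

```python
def block_prior(block_ctr, strength):
    """Hypothetical helper: Beta parameters with mean block_ctr and pseudo-count `strength`."""
    return block_ctr * strength, (1.0 - block_ctr) * strength

def shrunk_ctr(clicks, impressions, prior_alpha, prior_beta):
    """Posterior-mean CTR of a cell under a Beta(prior_alpha, prior_beta) block prior."""
    return (prior_alpha + clicks) / (prior_alpha + prior_beta + impressions)

def shrinkage_weight(impressions, prior_alpha, prior_beta):
    """Weight placed on the block CTR; it decays as the cell gathers impressions."""
    strength = prior_alpha + prior_beta
    return strength / (strength + impressions)
```

For example, with `a, b = block_prior(0.02, 100)`, a cell with no impressions is estimated at the block CTR of 0.02, while a cell with 5 clicks in 100 impressions (observed CTR 0.05) is pulled halfway back, to about 0.035.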

Experiments (S. Pandey et al. 2007)

Taxonomy Structure
• Depth 0: Root
• Depth 1: 20 nodes
• Depth 2: 221 nodes
• Depth 7: ~7000 nodes

Only 2 levels of the taxonomy are used (marked in the original figure).


• Data collected over a 1-day period
• Collected from only one server, under some other ad-matching rules (not our bandit)
• ~229M impressions
• CTR values have been linearly transformed for the purpose of confidentiality


[Plot: clicks vs. number of pulls]

Multi-level gives much higher #clicks!


[Plot: mean-squared error vs. number of pulls]

Multi-level gives much better MSE – it learnt more from its explorations.

Conclusions

• In a CTR-guided system, exploration is a key component.

• The short-term penalty for exploration needs to be limited (exploration budget).

• Most exploration mechanisms use a weighted combination of the predicted CTR (average) and the CTR uncertainty (variance).

• Exploration in a reduced-dimensional space: the class hierarchy.

• Top-down traversal of the hierarchy to determine the class of the ad to show.
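The "weighted combination of average and uncertainty" mentioned in the conclusions can be written as a one-line score. The sketch below is one possible form, assuming a Beta posterior over each CTR with a uniform Beta(1, 1) prior and using the posterior standard deviation (the square root of the variance) as the uncertainty term; the function names and the `weight` parameter are hypothetical.

```python
import math

def beta_mean_and_variance(clicks, impressions, prior_alpha=1.0, prior_beta=1.0):
    """Mean and variance of the Beta posterior over a CTR."""
    a = prior_alpha + clicks
    b = prior_beta + impressions - clicks
    mean = a / (a + b)
    variance = a * b / ((a + b) ** 2 * (a + b + 1))
    return mean, variance

def exploration_score(clicks, impressions, weight=1.0):
    """Predicted CTR plus `weight` times its uncertainty (posterior std. dev.)."""
    mean, variance = beta_mean_and_variance(clicks, impressions)
    return mean + weight * math.sqrt(variance)
```

Two ads with the same observed CTR then get different scores: `exploration_score(1, 2)` is higher than `exploration_score(50, 100)` because the first estimate rests on far fewer impressions. A larger `weight` spends more of the exploration budget on uncertain ads.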
