online ad serving: theory and practice - sigmetrics

Online Ad Serving: Theory and Practice Aranyak Mehta Vahab Mirrokni June 7, 2011

Page 1: Online Ad Serving: Theory and Practice - SIGMETRICS

Online Ad Serving: Theory and Practice

Aranyak Mehta, Vahab Mirrokni

June 7, 2011

Outline of this talk

- Ad delivery for contract-based settings
  - Ad Serving
  - Planning
- Ad serving in repeated auction settings
  - General architecture
  - Allocation for budget-constrained advertisers
- Other interactions
  - Learning + allocation
  - Learning + auction
  - Auction + contracts

Contract-based Ad Delivery: Outline

- Basic Information
- Ad Serving
  - Targeting
  - Online Allocation
- Ad Planning: Reservation

Contract-based Online Advertising

- Pageviews (impressions) instead of queries.
- Display/banner ads, video ads, mobile ads.
- Per-impression pricing (CPM: cost per thousand impressions).
- Not auction-based: offline negotiations + online allocations.

Display/Banner Ads:

- Q1 2010: one trillion display ads in the US; $2.7 billion.
- Top advertisers: AT&T, Verizon, Scottrade.
- Ad serving systems, e.g. Facebook, Google DoubleClick, AdMob.

Internet Advertising Revenues - 2010

[Figure: Internet Ad Revenues - 2010. Labels: 24% increase; Total $26.0 billion.]

Display Ad Delivery: Overview

[Figure: Display Ad Delivery. Components: Ad Serving, Targeting, Allocation. Annotations: Delivery Constraints, Budget; Strategic, Stochastic; Online, Stochastic; Offline, Online; Forecasting (demand for ads, supply of impressions).]

1. Planning: Contracts/commitments with advertisers.
2. Ad Serving:
   - Targeting: Predicting the value of impressions.
   - Ad Allocation: Assigning impressions to ads online.

- Objective functions:
  - Efficiency: Users and advertisers. Revenue of the publisher.
  - Smoothness, fairness, delivery penalty.

Contract-based Ad Delivery: Outline

- Basic Information
- Ad Serving
  - Targeting
  - Online Ad Allocation
- Ad Planning: Reservation

Estimating Value of an Impression

- Behavioral Targeting
  - Interest-based advertising.
  - Yan, Liu, Wang, Zhang, Jiang, Chen, 2009, "How much can Behavioral Targeting Help Online Advertising?"
- Contextual Targeting
  - Information Retrieval (IR).
  - Broder, Fontoura, Josifovski, Riedel, "A semantic approach to contextual advertising"
- Creative Optimization
  - Experimentation

Predicting Value of Impressions for Display Ads

- Estimating Click-Through-Rate (CTR).
  - Budgeted Multi-armed Bandit
- Probability of Conversion.
- Long-term vs. short-term value of display ads?
  - Archak, Mirrokni, Muthukrishnan, 2010: Graph-based Models.
    - Computing Adfactors based on AdGraphs
    - Markov Models for Advertiser-specific User Behavior

Contract-based Ad Delivery: Outline

- Basic Information
- Ad Planning: Reservation
- Ad Serving
  - Targeting
  - Online Ad Allocation

Outline: Online Allocation

- Online Stochastic Assignment Problems
  - Online (Stochastic) Matching
  - Online Stochastic Packing
  - Online Generalized Assignment (with free disposal)
  - Experimental Results
- Online Learning and Allocation

Online Ad Allocation

- When a page arrives, assign an eligible ad.
  - Value of assigning page i to ad a: $v_{ia}$
- Display Ads (DA) problem:
  - Maximize value of ads served: $\max \sum_{i,a} v_{ia} x_{ia}$
  - Capacity of ad a: $\sum_{i \in A(a)} x_{ia} \le C_a$

Online Ad Allocation

- When a page arrives, assign an eligible ad.
  - Revenue from assigning page i to ad a: $b_{ia}$
- "AdWords" (AW) problem:
  - Maximize revenue of ads served: $\max \sum_{i,a} b_{ia} x_{ia}$
  - Budget of ad a: $\sum_{i \in A(a)} b_{ia} x_{ia} \le B_a$

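General Form of LP

$$
\begin{aligned}
\max\ & \sum_{i,a} v_{ia} x_{ia} \\
\text{s.t.}\ & \sum_{a} x_{ia} \le 1 && (\forall i) \\
& \sum_{i} s_{ia} x_{ia} \le C_a && (\forall a) \\
& x_{ia} \ge 0 && (\forall i, a)
\end{aligned}
$$

Special cases and worst-case results:

- Online Matching ($v_{ia} = s_{ia} = 1$): Greedy: 1/2; [KVV]: $(1 - 1/e)$-aprx
- Disp. Ads (DA) ($s_{ia} = 1$)
- AdWords (AW) ($s_{ia} = v_{ia}$): [MSVV, BJN]: $(1 - 1/e)$-aprx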
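The worst-case factor of 1/2 for greedy can be seen on a two-page instance of online matching. The sketch below is illustrative only: the instance, the function name `greedy_online_matching`, and the tie-breaking rule (take the first eligible ad with spare capacity) are assumptions, not taken from the slides.

```python
def greedy_online_matching(arrivals, capacity):
    """Assign each arriving page to the first eligible ad that still has capacity."""
    remaining = dict(capacity)
    assignment = {}
    for page, eligible_ads in arrivals:
        for ad in eligible_ads:
            if remaining[ad] > 0:
                remaining[ad] -= 1
                assignment[page] = ad
                break  # page is served; move on to the next arrival
    return assignment


# Hypothetical instance: p1 can go to either ad, p2 only to ad "a".
arrivals = [("p1", ["a", "b"]), ("p2", ["a"])]
capacity = {"a": 1, "b": 1}

greedy = greedy_online_matching(arrivals, capacity)
greedy_size = len(greedy)  # greedy uses "a" for p1, so p2 finds no free ad
optimal_size = 2           # offline optimum: p1 -> "b", p2 -> "a"
```

On this instance greedy matches one page while the offline optimum matches two, which is the 1/2 ratio quoted for the worst case.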

Ad Allocation: Problems and Models

- Online Matching ($v_{ia} = s_{ia} = 1$)
  - Worst case: Greedy: 1/2; [KVV]: $(1 - 1/e)$-aprx
  - Stochastic (i.i.d.): [FMMM09, MOS11]: 0.702-aprx, i.i.d. with known distribution
- Disp. Ads (DA) ($s_{ia} = 1$)
  - Worst case: ?
- AdWords (AW) ($s_{ia} = v_{ia}$)
  - Worst case: [MSVV, BJN]: $(1 - 1/e)$-aprx
  - Stochastic (i.i.d.): [DH09]: $(1 - \epsilon)$-aprx, if $\mathrm{opt} \gg \max v_{ia}$

Stochastic i.i.d. model:

- i.i.d. model with known distribution
- random order model (i.i.d. model with unknown distribution)

Ad Allocation: Problems and Models

- Online Matching ($v_{ia} = s_{ia} = 1$)
  - Worst case: Greedy: 1/2; [KVV]: $(1 - 1/e)$-aprx
  - Stochastic (i.i.d.): [FMMM09]: 0.67-aprx, i.i.d. with known distribution
- Disp. Ads (DA) ($s_{ia} = 1$)
  - Worst case: ?
- AdWords (AW) ($s_{ia} = v_{ia}$)
  - Worst case: [MSVV, BJN]: $(1 - 1/e)$-aprx
  - Stochastic (i.i.d.): [DH09]: $(1 - \epsilon)$-aprx, if $\mathrm{opt} \gg \max v_{ia}$

Ad Allocation: Problems and Models

- Online Matching ($v_{ia} = s_{ia} = 1$)
  - Worst case: Greedy: 1/2; [KVV]: $(1 - 1/e)$-aprx
  - Stochastic (i.i.d.): [FMMM09, MOS11]: 0.702-aprx, i.i.d. with known distribution
- Disp. Ads (DA) ($s_{ia} = 1$)
  - Worst case: ?
  - Stochastic (i.i.d.): [FHKMS10, AWY]: $(1 - \epsilon)$-aprx, if $\mathrm{opt} \gg \max v_{ia}$ and $C_a \gg \max s_{ia}$
- AdWords (AW) ($s_{ia} = v_{ia}$)
  - Worst case: [MSVV, BJN]: $(1 - 1/e)$-aprx
  - Stochastic (i.i.d.): [DH09]: $(1 - \epsilon)$-aprx, if $\mathrm{opt} \gg \max v_{ia}$

Random order = i.i.d. model with unknown distribution.

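Stochastic DA: Dual Algorithm

Primal:

$$
\begin{aligned}
\max\ & \sum_{i,a} v_{ia} x_{ia} \\
\text{s.t.}\ & \sum_{a} x_{ia} \le 1 && (\forall i) \\
& \sum_{i} x_{ia} \le C_a && (\forall a) \\
& x_{ia} \ge 0 && (\forall i, a)
\end{aligned}
$$

Dual:

$$
\begin{aligned}
\min\ & \sum_{a} C_a \beta_a + \sum_{i} z_i \\
\text{s.t.}\ & z_i \ge v_{ia} - \beta_a && (\forall i, a) \\
& \beta_a, z_i \ge 0 && (\forall i, a)
\end{aligned}
$$

Algorithm:

- Observe the first $\epsilon$ fraction sample of impressions.
- Learn a dual variable $\beta_a$ for each ad by solving the dual program on the sample.
- Assign each impression i to the ad a that maximizes $v_{ia} - \beta_a$.

Feldman, Henzinger, Korula, M., Stein 2010

Thm [FHKMS10, AWY]: W.h.p., this algorithm is a $(1 - O(\epsilon))$-aprx, as long as each item has low value ($v_{ia} \le \frac{\epsilon\,\mathrm{opt}}{m \log n}$) and large capacity ($C_a \ge \frac{m \log n}{\epsilon^3}$).

Fact: If the optimum $\beta^*_a$ are known, this alg. finds opt.

- Proof: Comp. slackness. Given $\beta^*_a$, compute $x^*$ as follows: $x^*_{ia} = 1$ if $a = \arg\max_{a'} (v_{ia'} - \beta^*_{a'})$.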
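The Fact can be checked by hand on a tiny instance. The following is a minimal sketch under stated assumptions: the two-impression instance, the hand-computed dual `beta_star`, and all function names are hypothetical, ties are ignored, and an impression is left unassigned when no ad gives a positive $v_{ia} - \beta_a$.

```python
from itertools import product


def assign_with_duals(values, beta):
    """Send each impression to the ad maximizing v_ia - beta_a (skip if not positive)."""
    assignment = {}
    for imp, row in values.items():
        best_ad = max(row, key=lambda ad: row[ad] - beta[ad])
        if row[best_ad] - beta[best_ad] > 0:
            assignment[imp] = best_ad
    return assignment


def total_value(values, assignment):
    return sum(values[imp][ad] for imp, ad in assignment.items())


def dual_objective(values, capacity, beta):
    """sum_a C_a * beta_a + sum_i z_i, with z_i = max(0, max_a (v_ia - beta_a))."""
    slack = sum(max(0, max(row[ad] - beta[ad] for ad in row)) for row in values.values())
    return sum(capacity[ad] * beta[ad] for ad in capacity) + slack


def brute_force_optimum(values, capacity):
    """Try every assignment (None = unassigned) and keep the best feasible value."""
    imps, ads = list(values), list(capacity)
    best = 0
    for choice in product(ads + [None], repeat=len(imps)):
        load = {ad: 0 for ad in ads}
        value = 0
        for imp, ad in zip(imps, choice):
            if ad is not None:
                load[ad] += 1
                value += values[imp][ad]
        if all(load[ad] <= capacity[ad] for ad in ads):
            best = max(best, value)
    return best


# Hypothetical instance: both impressions prefer ad A, but A has capacity 1.
values = {"i1": {"A": 5, "B": 1}, "i2": {"A": 4, "B": 3}}
capacity = {"A": 1, "B": 1}
beta_star = {"A": 2, "B": 0}  # assumed optimal dual, computed by hand

dual_assignment = assign_with_duals(values, beta_star)
dual_value = total_value(values, dual_assignment)
optimum = brute_force_optimum(values, capacity)
```

With all duals at zero both impressions would pick ad A and violate its capacity; pricing A at 2 pushes `i2` to B, and the resulting value matches both the brute-force optimum and the dual objective.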

Stochastic DA: Dual Algorithm

Lemma: In the random order model, w.h.p., the sample duals $\beta'_a$ are close to $\beta^*_a$.

- Extending [DH09].

General Stochastic Packing LPs

- m fixed resources with capacity $C_a$
- Items i arrive online with options $O_i$, values $v_{io}$, rsrc. use $s_{ioa}$.
  - Choose $o \in O_i$, using up capacity $s_{ioa}$ in all a.

Thm [FHKMS10, AWY]: W.h.p., the PD algorithm is a $(1 - O(\epsilon))$-aprx, as long as items have low value ($v_{io} \le \frac{\epsilon\,\mathrm{opt}}{\log n}$) and small size ($s_{ioa} \le \frac{\epsilon^3 C_a}{\log n}$).

Other Results and Extensions (random order model):

- Agrawal, Wang, Ye: Updating dual variables by periodic solution of the dual program: $C_a \ge \frac{m \log n}{\epsilon^2}$ or $s_{ioa} \le \frac{\epsilon^2 C_a}{m \log n}$.
- Vee, Vassilvitskii, Shanmugasundaram 2010: extension to convex objective functions, using KKT conditions.

Ad Allocation: Problems and Models

- Online Matching ($v_{ia} = s_{ia} = 1$)
  - Worst case: Greedy: 1/2; [KVV]: $(1 - 1/e)$-aprx
  - Stochastic (i.i.d.): [FMMM09, MOS11]: 0.702-aprx, i.i.d. with known distribution
- Disp. Ads (DA) ($s_{ia} = 1$)
  - Worst case: Free Disposal [FKMMP09]: $(1 - 1/e)$-aprx, if $C_a \gg \max s_{ia}$
  - Stochastic (i.i.d.): [FHKMS10, AWY]: $(1 - \epsilon)$-aprx, if $\mathrm{opt} \gg \max v_{ia}$ and $C_a \gg \max s_{ia}$
- AdWords (AW) ($s_{ia} = v_{ia}$)
  - Worst case: [MSVV, BJN]: $(1 - 1/e)$-aprx
  - Stochastic (i.i.d.): [DH09]: $(1 - \epsilon)$-aprx, if $\mathrm{opt} \gg \max v_{ia}$

DA: Free Disposal Model

[Figure: free disposal example. Recoverable labels: edge value 0.07; Ad 1: $C_1 = 1$.]

- Advertisers may not complain about extra impressions, but no bonus points for extra impressions, either.
- Value of advertiser = sum of values of top $C_a$ items she gets.
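Under free disposal, an advertiser's value is easy to compute from the list of impression values it received. A minimal sketch follows; the function name `advertiser_value` and the sample numbers are assumptions for illustration.

```python
import heapq


def advertiser_value(assigned_values, capacity):
    """Sum of the `capacity` largest values; extra impressions count for nothing."""
    return sum(heapq.nlargest(capacity, assigned_values))


# Three impressions delivered to an ad with capacity 2: only the top two count.
example_value = advertiser_value([3, 1, 2], 2)
```

Delivering more than $C_a$ impressions never lowers the value, which is what makes the allocation rules on the following slides meaningful.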

Greedy Algorithm

Assign impression to an advertiser maximizing Marginal Gain = (imp. value - min. impression value).

- Competitive Ratio: 1/2. [NWF78]
  - Follows from submodularity of the value function.

[Figure: example with Ad 1 ($C_1 = n$) and Ad 2 ($C_2 = n$); edge values $1 + \epsilon$ and 1; groups of n copies of impressions. Caption: Evenly Split?]
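The greedy rule can be sketched with one min-heap per ad. The instance below is an assumption in the spirit of the figure (the exact figure is not recoverable): the first n impressions are worth $1 + \epsilon$ to Ad 1 and 1 to Ad 2, and the next n are worth $1 + \epsilon$ to Ad 1 only. Function and variable names are hypothetical, and capacities are assumed to be at least 1.

```python
import heapq


def greedy_free_disposal(impressions, capacity):
    """Greedy by marginal gain; each ad keeps its top-capacity values in a min-heap."""
    kept = {ad: [] for ad in capacity}
    for values in impressions:
        best_ad, best_gain = None, 0.0
        for ad, v in values.items():
            heap = kept[ad]
            # A full ad would drop its current minimum; a non-full ad drops nothing.
            floor = heap[0] if len(heap) == capacity[ad] else 0.0
            if v - floor > best_gain:
                best_ad, best_gain = ad, v - floor
        if best_ad is None:
            continue  # no positive marginal gain anywhere: leave unassigned
        heap = kept[best_ad]
        if len(heap) == capacity[best_ad]:
            heapq.heapreplace(heap, values[best_ad])  # free disposal of the minimum
        else:
            heapq.heappush(heap, values[best_ad])
    return sum(sum(heap) for heap in kept.values())


n, eps = 4, 0.25
impressions = [{"ad1": 1 + eps, "ad2": 1.0}] * n + [{"ad1": 1 + eps}] * n
capacity = {"ad1": n, "ad2": n}

greedy_value = greedy_free_disposal(impressions, capacity)
optimal_value = n * 1.0 + n * (1 + eps)  # first n to ad2, last n to ad1
```

Greedy fills Ad 1 with the first n impressions, so the last n bring no marginal gain; as $\epsilon \to 0$ the ratio of greedy to optimal approaches 1/2.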

A better algorithm?

Assign impression to an advertiser a maximizing (imp. value - $\beta_a$), where $\beta_a$ = average value of top $C_a$ impressions assigned to a.

[Figure: example with Ad 1 ($C_1 = n$) and Ad 2 ($C_2 = n$); edge values $1 + \epsilon$ and 1; groups of n copies of impressions.]

- Competitive Ratio: 1/2 if $C_a \gg 1$. [FKMMP09]
- Primal-Dual Approach.

An Optimal Algorithm

Assign impression to an advertiser a maximizing (imp. value - $\beta_a$).

- Greedy: $\beta_a$ = min. impression assigned to a.
- Better (pd-avg): $\beta_a$ = average value of top $C_a$ impressions assigned to a.
- Optimal (pd-exp): order the values of the edges assigned to a, $v(1) \ge v(2) \ge \dots \ge v(C_a)$:

$$
\beta_a = \frac{1}{C_a (e - 1)} \sum_{j=1}^{C_a} v(j) \left(1 + \frac{1}{C_a}\right)^{j-1}
$$

- Thm: pd-exp achieves the optimal competitive ratio: $1 - \frac{1}{e} - \epsilon$ if $C_a > O(\frac{1}{\epsilon})$. [Feldman, Korula, M., Muthukrishnan, Pal 2009]
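The three choices of $\beta_a$ differ only in how they summarize the top $C_a$ values held by an ad. The sketch below is a rough illustration under two assumptions that are not spelled out on the slide: unfilled slots are treated as impressions of value 0, and pd-exp uses the exponential weights $(1 + 1/C_a)^{j-1}$ over values sorted in non-increasing order. All function names are hypothetical.

```python
import math


def top_values(assigned, capacity):
    """Top `capacity` values in non-increasing order, padded with zeros (assumption)."""
    top = sorted(assigned, reverse=True)[:capacity]
    return top + [0.0] * (capacity - len(top))


def beta_greedy(assigned, capacity):
    """Greedy: the smallest value among the kept impressions."""
    return top_values(assigned, capacity)[-1]


def beta_avg(assigned, capacity):
    """pd-avg: the average of the kept impressions."""
    return sum(top_values(assigned, capacity)) / capacity


def beta_exp(assigned, capacity):
    """pd-exp: exponentially weighted average, heavier weight on smaller values."""
    top = top_values(assigned, capacity)
    weighted = sum(v * (1 + 1 / capacity) ** j for j, v in enumerate(top))
    return weighted / (capacity * (math.e - 1))


held = [4.0, 3.0, 2.0, 1.0]
b_min = beta_greedy(held, 4)
b_avg = beta_avg(held, 4)
b_exp = beta_exp(held, 4)
```

For this sample the pd-exp price falls between the greedy price (the minimum) and the pd-avg price (the mean), which matches the intuition that it interpolates between the two rules.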

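Online Generalized Assignment (with free disposal)

- Multiple Knapsack: Item i may have a different value ($v_{ia}$) and a different size $s_{ia}$ for different ads a.
- DA: $s_{ia} = 1$; AW: $v_{ia} = s_{ia}$.

Primal:

$$
\begin{aligned}
\max\ & \sum_{i,a} v_{ia} x_{ia} \\
\text{s.t.}\ & \sum_{a} x_{ia} \le 1 && (\forall i) \\
& \sum_{i} s_{ia} x_{ia} \le C_a && (\forall a) \\
& x_{ia} \ge 0 && (\forall i, a)
\end{aligned}
$$

Dual:

$$
\begin{aligned}
\min\ & \sum_{a} C_a \beta_a + \sum_{i} z_i \\
\text{s.t.}\ & s_{ia} \beta_a + z_i \ge v_{ia} && (\forall i, a) \\
& \beta_a, z_i \ge 0 && (\forall i, a)
\end{aligned}
$$

- Offline Optimization: $(1 - \frac{1}{e} - \delta)$-aprx [FGMS07, FV08].
- Thm [FKMMP09]: There exists a $(1 - \frac{1}{e} - \epsilon)$-approximation algorithm if $\frac{C_a}{\max s_{ia}} \ge \frac{1}{\epsilon}$.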

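Proof Idea: Primal-Dual Analysis [BJN]

Primal:

$$
\begin{aligned}
\max\ & \sum_{i,a} v_{ia} x_{ia} \\
\text{s.t.}\ & \sum_{a} x_{ia} \le 1 && (\forall i) \\
& \sum_{i} s_{ia} x_{ia} \le C_a && (\forall a) \\
& x_{ia} \ge 0 && (\forall i, a)
\end{aligned}
$$

Dual:

$$
\begin{aligned}
\min\ & \sum_{a} C_a \beta_a + \sum_{i} z_i \\
\text{s.t.}\ & s_{ia} \beta_a + z_i \ge v_{ia} && (\forall i, a) \\
& \beta_a, z_i \ge 0 && (\forall i, a)
\end{aligned}
$$

- Proof:
  1. Start from feasible primal and dual ($x_{ia} = 0$, $\beta_a = 0$, and $z_i = 0$, i.e., Primal = Dual = 0).
  2. After each assignment, update the x, $\beta$, z variables and keep the primal and dual solutions feasible.
  3. Show $\Delta(\text{Primal}) \ge (1 - \frac{1}{e})\,\Delta(\text{Dual})$.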
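For a maximization primal, the per-step bound is used in the following way. This is a sketch of the standard argument, with $P_t$ and $D_t$ introduced here as notation for the primal and dual objective values after step t, and $\mathrm{OPT}$ for the offline optimum.

```latex
\begin{align*}
P_0 = D_0 = 0, \quad
\Delta P_t \ge \Bigl(1 - \tfrac{1}{e}\Bigr)\,\Delta D_t \ \ (\forall t)
\;&\Longrightarrow\;
P_T = \sum_{t} \Delta P_t
\;\ge\; \Bigl(1 - \tfrac{1}{e}\Bigr) \sum_{t} \Delta D_t
= \Bigl(1 - \tfrac{1}{e}\Bigr) D_T, \\
D_T \ge \mathrm{OPT} \ \ \text{(weak duality, dual feasible)}
\;&\Longrightarrow\;
P_T \;\ge\; \Bigl(1 - \tfrac{1}{e}\Bigr)\,\mathrm{OPT}.
\end{align*}
```

The value of the algorithm is the final primal objective, so the bound on every increment carries over to the competitive ratio.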

Outline: Online Allocation

- Online Stochastic Assignment Problems
  - Online Stochastic Packing
  - Online Generalized Assignment (with free disposal)
  - Experimental Evaluation
  - Online Stochastic Weighted Matching

Dual-based Algorithms in Practice

- Algorithm:
  - Assign each item i to the ad a that maximizes $v_{ia} - \beta_a$.
- More practical compared to primal algorithms:
  - Just keep one number $\beta_a$ per advertiser.
  - Suitable for distributed ad serving schemes.
- Training-based algorithms
  - Compute $\beta_a$ based on historical/sample data.
- Hybrid approach (see also [MNS07]):
  - Start with trained $\beta_a$ (past history), blend in online algorithm.
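One way to read the hybrid approach is as a per-advertiser convex combination of a trained dual value and an online one. The sketch below assumes exactly that; the blending weight, the function names, and the sample numbers are hypothetical, and the slides do not specify how the weight is chosen.

```python
def hybrid_beta(trained, online, weight):
    """Blend trained and online duals: weight * trained + (1 - weight) * online."""
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    return {ad: weight * trained[ad] + (1.0 - weight) * online[ad] for ad in trained}


def choose_ad(values, beta):
    """Pick the ad maximizing v_ia - beta_a, or None if no ad has positive surplus."""
    best = max(values, key=lambda ad: values[ad] - beta[ad])
    return best if values[best] - beta[best] > 0 else None


trained = {"A": 2.0, "B": 0.0}  # e.g. learned from historical data
online = {"A": 1.0, "B": 1.0}   # e.g. a pd-avg style estimate from live traffic
blended = hybrid_beta(trained, online, 0.5)
chosen = choose_ad({"A": 4.0, "B": 3.5}, blended)
```

Serving only needs the one number per advertiser in `blended`, which is why this style of rule suits a distributed ad server.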

Experiments: setup

- Real ad impression data from several large publishers
- 200k - 1.5M impressions in simulation period
- 100 - 2600 advertisers
- Edge weights = predicted click probability
- Efficiency: free disposal model
- Algorithms:
  - greedy: maximum marginal value
  - pd-avg, pd-exp: pure online primal-dual from [FKMMP09]
  - dualbase: training-based primal-dual [FHKMS10]
  - hybrid: convex combo of training-based, pure online
  - lp-weight: optimum efficiency

Experimental Evaluation: Summary

Algorithm   Avg Efficiency (%)
opt         100
greedy      69
pd-avg      77
pd-exp      82
dualbase    87
hybrid      89

- pd-avg and pd-exp outperform greedy by 9% and 14% (with more improvements in tight competition).
- dualbase outperforms pure online algorithms by 6% to 12%.
- Hybrid has a mild improvement of 2% (up to 10%).
- pd-avg performs much better than the theoretical analysis.

Other Metrics: Fairness

- Qualitative definition: advertisers are "treated equally."
- One suggestion [FHKMS10]: compute a "fair" solution $x^*$, measure the $\ell_1$ distance to $x^*$.
- Fair solution:
  - Each a chooses its best $C_a$ impressions (highest $v_{ia}$).
  - Repeat:
    - Impressions are shared among those who chose them.
    - If some a is not receiving $C_a$ imps, a chooses an additional imp.
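The $\ell_1$ distance to the fair solution is a plain sum of absolute differences over (impression, ad) pairs. A minimal sketch follows, assuming both allocations are stored as dictionaries keyed by (impression, ad) with missing pairs meaning 0; the function name and the sample allocations are hypothetical.

```python
def l1_distance(x, x_fair):
    """Sum over all (impression, ad) pairs of |x_ia - x*_ia|."""
    pairs = set(x) | set(x_fair)
    return sum(abs(x.get(p, 0.0) - x_fair.get(p, 0.0)) for p in pairs)


# The fair solution splits impression i1 evenly; the algorithm gave it all to A.
x_alg = {("i1", "A"): 1.0, ("i2", "B"): 1.0}
x_fair = {("i1", "A"): 0.5, ("i1", "B"): 0.5, ("i2", "B"): 1.0}
distance = l1_distance(x_alg, x_fair)
```

A smaller distance means the allocation is closer to the fractional fair solution, so the metric can be compared across algorithms on the same traffic.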

Experiments: highlights

[Figure: two plots, data not recoverable. Surviving labels: legend entries dualbase, greedy, hybrid, fair; x-axis ticks 1 to 15; scale ticks 40 to 100 and 70 to 100; axis label fragment "cy (".]

In Production

- Smooth Delivery of Display Ads
  - Delivery of impressions throughout time should follow the traffic smoothly.
  - Model this with multiple nested capacity constraints: $(1 - 1/e)$-competitive algorithm for this extension.
  - Bhalgat, Feldman, M., 2011
- Combined Allocation with Ad Exchange
  - Yield Optimization of Display Advertising with Ad Exchange
  - Balseiro, Feldman, M., Muthukrishnan, 2011
- React adaptively and quickly to changes in traffic:
  - Use a control loop on the dual variable.
  - Asymptotically optimal policy: B. Tan and R. Srikant, 2011
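A control loop on the dual variable could look roughly like the proportional update below. This is only an illustration of the idea: the update rule, the gain parameter, and the names are assumptions, and it is not the policy from the cited work.

```python
def update_beta(beta, delivered, target, gain):
    """Raise beta when delivery runs ahead of target, lower it when behind (beta >= 0)."""
    return max(0.0, beta + gain * (delivered - target))


# Hypothetical pacing check: 120 impressions delivered against a target of 100.
ahead = update_beta(1.0, delivered=120, target=100, gain=0.01)
# Delivery is far behind, so the price drops and is clamped at zero.
behind = update_beta(0.1, delivered=50, target=100, gain=0.01)
```

A higher $\beta_a$ makes the ad win fewer impressions under the rule "maximize $v_{ia} - \beta_a$", so the loop slows delivery when it runs ahead and speeds it up when it lags.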

Page 80: Online Ad Serving: Theory and Practice - SIGMETRICS

Online Ad Allocation: Interesting Problems

I Online Stochastic DA
  I Simultaneous online worst-case & stochastic optimization.
  I Tradeoff between delivery penalty and efficiency: Covering Constraints?
  I More complex stochastic modeling (drift, seasonality, etc.)

Page 81: Online Ad Serving: Theory and Practice - SIGMETRICS

Outline: Online Allocation

I Online Stochastic Ad Allocation
  I Online Stochastic Packing
  I Online Generalized Assignment (with free disposal)
  I Experimental Results
  I Online Stochastic Weighted Matching

Page 82: Online Ad Serving: Theory and Practice - SIGMETRICS

Online Stochastic Weighted Matching

“ALG is an α-approximation” if $\frac{E[\mathrm{ALG}(H)]}{E[\mathrm{OPT}(H)]} \ge \alpha$.

Power of Two Choices

I Offline:
  1. Find an optimal fractional solution $x^*_e$ to a discounted matching LP, where $x_e \le 1 - \frac{1}{e}$.
  2. Sample a matching $M_s$ from $x^*$.
  3. Let $M' = M_1 \setminus M_s$, where $M_1$ is the maximum weighted matching.

I Online: try the edge in $M_s$ first, and if it doesn't work, try $M'$.

I Thm: Approximation factor is better than 0.66.

(Haeupler, M., Zadimoghaddam, 2011).
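The online phase of the two-choice scheme is simple enough to sketch. The offline phase (solving the LP and sampling the matchings) is not implemented here; `first_choice` and `second_choice` are hypothetical stand-ins for the two offline matchings, each given as a map from impression type to advertiser, and every advertiser is assumed to have unit capacity.

```python
def two_choice_online(arrivals, first_choice, second_choice):
    """For each arriving type, try its edge in the first matching, then the second."""
    used = set()          # advertisers that are already matched
    assignment = []       # (impression type, advertiser or None) per arrival
    for t in arrivals:
        for plan in (first_choice, second_choice):
            a = plan.get(t)
            if a is not None and a not in used:
                used.add(a)
                assignment.append((t, a))
                break
        else:
            # Both planned advertisers are taken (or absent): leave unmatched.
            assignment.append((t, None))
    return assignment

result = two_choice_online(
    ["x", "x", "y"],
    first_choice={"x": "A", "y": "B"},
    second_choice={"x": "B", "y": "C"},
)
```

The second arrival of type `x` falls back to its second choice, which in turn forces type `y` onto its own second choice.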

Page 84: Online Ad Serving: Theory and Practice - SIGMETRICS

Open Problems

I Online Stochastic Display Ad Allocation
  I Simultaneous online worst-case & stochastic optimization.
  I Tradeoff between delivery penalty and efficiency: Covering Constraints?
  I More complex stochastic modeling (drift, seasonality, etc.)

I Online Stochastic Weighted Matching
  I Online Stochastic Matching: Close gap between 0.703 & 0.81.
  I Online Stochastic Weighted Matching: Power of many choices? Better lower bound?
  I Online Weighted Matching (with Free Disposal): Is a (1 − 1/e)-approximation possible?

Page 85: Online Ad Serving: Theory and Practice - SIGMETRICS

Contract-based Ad Delivery: Outline

I Basic Information
I Ad Serving.
  I Targeting.
  I Online Allocation
I Ad Planning: Reservation

Page 86: Online Ad Serving: Theory and Practice - SIGMETRICS

Display Ad Delivery: Overview


[Diagram: Display Ad Delivery. Labels: Ad Serving; Targeting; Allocation; Forecasting; Delivery Constraints, Budget; Strategic, Stochastic; Online, Stochastic; Offline, Online; Demand for ads; Supply of impressions.]

Page 87: Online Ad Serving: Theory and Practice - SIGMETRICS

Ad Planning: Research Issues

Which set of contracts should we accept?

Related Research Issues.

I Pricing Uncertain Inventory.
  I Stochastic Supply and Demand
  I Bundling Opportunities

I Online Mechanisms for Signing Contracts.

I Contracts with Delivery Penalty.

I Offline Optimization of Contracts.

Page 89: Online Ad Serving: Theory and Practice - SIGMETRICS

General Ad Planning: Weighted Matching

I $n$ advertisers, and a set $Y$ of impressions (items).
I Each advertiser $i$:
  I Is interested in a set $J_i$ of impressions (e.g., young women in Seattle),
  I Needs $d_i$ impressions (Demand),
  I Has value $v_{it}$ (or bid $b_i$) for each impression $t$.

[Figure: bipartite graph of advertisers and impressions; each advertiser $i$ is labeled $(J_i, b_i, d_i)$, with example demand $d_i = 4$.]

Efficiency (or Revenue) Maximization: Find an assignment with the maximum value.

Page 90: Online Ad Serving: Theory and Practice - SIGMETRICS

Contracts with Delivery Penalty

If we don’t meet the demand of an advertiser this month, we should give him/her free impressions in the next month.

I Each advertiser $i$:
  I Needs $d_i$ impressions,
  I Bids $b_i$ for each impression,
  I Penalty $\lambda b_i$ for each unsatisfied unit (Guaranteed Delivery).

Page 92: Online Ad Serving: Theory and Practice - SIGMETRICS

Ad Planning With Penalties

I $n$ advertisers, and a set $Y$ of impressions (items).
I Each advertiser $i$:
  I Is interested in a set $J_i$ of impressions (e.g., young women in Seattle),
  I Bids $b_i$ for each impression,
  I Needs $d_i$ impressions,
  I Penalty $\lambda b_i$ for each unsatisfied unit (Guaranteed Delivery).

[Figure: bipartite graph of advertisers and impressions; each advertiser $i$ is labeled $(J_i, b_i, d_i)$, with example demand $d_i = 4$.]

Goal: Choose a set $T$ of advertisers to maximize revenue, $f(T)$.

Page 94: Online Ad Serving: Theory and Practice - SIGMETRICS

Relation to Weighted Matching

If we give $q$ items to advertiser $i$, we get

$q b_i - \lambda b_i (d_i - q) = q(1 + \lambda) b_i - d_i \lambda b_i = q(1 + \lambda) b_i - c_i$

If we commit to a set $T$ of advertisers:

$f(T) = P(T) - \sum_{i \in T} c_i = P(T) - C(T)$.

I $C(T) = \sum_{i \in T} c_i$, where $c_i = d_i \lambda b_i$.

I $P(T)$ is the maximum weighted matching in the following bipartite graph with $(1 + \lambda) b_i$ as weights of edges.
  I Advertiser $i \in T$ needs at most $d_i$ ads.
  I Each ad can go to at most one advertiser.

[Figure: bipartite graph with edge weights $(1 + \lambda) b_i$ and $(1 + \lambda) b_j$; $P(T)$ = maximum weighted matching for this graph.]
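The decomposition $f(T) = P(T) - C(T)$ can be checked on a toy instance. The sketch below is an assumption-laden illustration, not the authors' implementation: because every edge of advertiser $i$ has the same weight $(1 + \lambda) b_i$, it computes $P(T)$ by taking advertisers in decreasing order of $b_i$ and filling their demand slots with augmenting paths. The names `matching_value` and `revenue` and the example data are hypothetical.

```python
def matching_value(T, J, b, d, lam):
    """P(T): each i in T gets at most d[i] impressions from J[i], each worth (1+lam)*b[i]."""
    owner = {}  # impression -> (advertiser, slot index) currently holding it

    def augment(slot, seen):
        # Try to give `slot` an impression, re-routing other slots if needed.
        for t in J[slot[0]]:
            if t in seen:
                continue
            seen.add(t)
            if t not in owner or augment(owner[t], seen):
                owner[t] = slot
                return True
        return False

    total = 0.0
    # Higher bids first: with per-advertiser weights this greedy order is optimal.
    for i in sorted(T, key=lambda i: -b[i]):
        for k in range(d[i]):
            if not augment((i, k), set()):
                break  # further slots of the same advertiser would fail too
            total += (1 + lam) * b[i]
    return total

def revenue(T, J, b, d, lam):
    """f(T) = P(T) - C(T), with c_i = d_i * lam * b_i."""
    return matching_value(T, J, b, d, lam) - sum(lam * b[i] * d[i] for i in T)

J = {"A": [1, 2], "B": [2, 3]}   # targeted impressions
b = {"A": 3, "B": 2}             # per-impression bids
d = {"A": 2, "B": 2}             # demands
f_both = revenue({"A", "B"}, J, b, d, lam=0.5)
```

Here advertiser A receives both of its impressions and B receives one, so the direct formula gives $2 \cdot 3 + (1 \cdot 2 - 0.5 \cdot 2 \cdot 1) = 7$, which matches $P(T) - C(T) = 12 - 5$.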

Page 96: Online Ad Serving: Theory and Practice - SIGMETRICS

Discussion Summary

Given a set T of advertisers, maximizing f(T) is easy as it is a maximum weighted matching.

Challenge: Which set of advertisers T should we accept to maximize f(T)?


I Offline Optimization.

I Can be used in a negotiation process.

I Can have different penalty factors λi for each advertiser i .

Page 99: Online Ad Serving: Theory and Practice - SIGMETRICS

Ad Planning with Penalties

Feige, Immorlica, M., Nazerzadeh, 2008.

I Hardness: No Constant-factor Approximation.
I Heuristic Greedy Algorithms:
  I Simple Greedy Algorithm: Bicriteria Approximation.
    I At each step, add the advertiser with the maximum profit-per-impression, fixing the existing assignment.
    Theorem: This is a good approximation compared to the optimum with larger penalty factor.
  I Greedy-Rate Algorithm: Structural Approximation.

Page 102: Online Ad Serving: Theory and Practice - SIGMETRICS

Greedy Algorithms

Greedy-Rate Algorithm:

I At each step, add the advertiser with the maximum marginal-profit per marginal-cost ratio.

$T$: Set of advertisers we committed to (initialized to $\emptyset$).

I At each step, add an advertiser $i \in X \setminus T$ to $T$ that maximizes this ratio, if $P(T \cup \{i\}) - P(T) - c_i > 0$.

Theorem: The greedy-rate algorithm achieves the best structural approximation.

Proof: Uses submodularity of $P(T)$ and $f(T)$.

I The approximation factor of the algorithm is a function of the structure of the solution, i.e., a Signature: $\alpha = C(\mathrm{OPT}) / \ldots$

I Greedy-rate algorithm achieves at least factor $1 - \alpha - \alpha \ln(1/\ldots)$ (improving factor $1 + \alpha - 2\sqrt{\alpha}$).
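A minimal sketch of the greedy-rate loop follows. The exact ratio did not survive on the slide, so this assumes it is the marginal matching gain $P(T \cup \{i\}) - P(T)$ divided by the commitment cost $c_i = d_i \lambda b_i$, and it assumes $\lambda > 0$ so that costs are positive. Ties are broken by name for reproducibility. All function names and the example data are hypothetical.

```python
def matching_value(T, J, b, d, lam):
    """P(T): max-weight assignment with per-advertiser weights (1+lam)*b[i]."""
    owner = {}

    def augment(slot, seen):
        for t in J[slot[0]]:
            if t not in seen:
                seen.add(t)
                if t not in owner or augment(owner[t], seen):
                    owner[t] = slot
                    return True
        return False

    total = 0.0
    for i in sorted(T, key=lambda i: -b[i]):
        for k in range(d[i]):
            if not augment((i, k), set()):
                break
            total += (1 + lam) * b[i]
    return total

def greedy_rate(X, J, b, d, lam):
    """Repeatedly commit to the advertiser with the best gain-per-cost ratio."""
    T = frozenset()
    while True:
        base = matching_value(T, J, b, d, lam)
        best, best_ratio = None, 0.0
        for i in sorted(X - T):
            gain = matching_value(T | {i}, J, b, d, lam) - base
            cost = lam * b[i] * d[i]          # c_i, assumed positive
            # Only advertisers whose marginal profit exceeds their cost qualify.
            if gain - cost > 0 and gain / cost > best_ratio:
                best, best_ratio = i, gain / cost
        if best is None:
            return T
        T = T | {best}

J = {"A": [1, 2], "B": [2], "C": [3, 4]}
b = {"A": 3, "B": 2, "C": 1}
d = {"A": 2, "B": 2, "C": 2}
chosen = greedy_rate(frozenset(J), J, b, d, lam=0.5)
```

In this instance B competes with A for impression 2 and cannot be served once A is accepted, so its marginal gain drops to zero and the loop commits to A and C only.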

Page 106: Online Ad Serving: Theory and Practice - SIGMETRICS

Ad Planning with Delivery Penalty


[Diagram: Banner Ad Delivery. Labels: "... ads/advertisers should we commit to?"; Constraints: Guaranteed Delivery, Fairness, ...; BHK 09.]

• Babaioff, Hartline, Kleinberg: Online Algorithms with Buyback
  • Special cases: Single item, Matroid, Knapsack

• Constantin, Feldman, Muthukrishnan, Pal: Ad slotting with Cancellations.
  • Special case: Demand=1, Truthful Mechanism: Constant-factor

Interesting Algorithmic Problem:

I Online mechanism for general ad planning with delivery penalty: bicriteria approximation?

Page 108: Online Ad Serving: Theory and Practice - SIGMETRICS

Ad Planning: Offline Optimization of Contracts

I Fast Algorithms to Verify Feasibility of Contracts:
  I Lopsided Bipartite Graphs.
  I Improved Algorithms for Bipartite Network Flow: Ahuja, Orlin, Stein, Tarjan 94
  I Faster Algorithm for max-flow: running time depends on the size of the smaller part.

I Sampling for Max-Cardinality Matching: Charles, Chickering, Devanur, Jain, Sanghi 2010
  I Sampling and Concise Allocation.
  I Algorithm: Iteratively assign nodes, and minimize the future failure probability.
  I With high probability (WHP), verifies whether there exists a feasible matching.

I Online algorithms for accepting contracts: Alaei, Arcaute, Khuller, Ma, Malekian, Tomlin 2009
  I Utility model to combine contract-based advertisers & sales-based advertisers.
  I Online algorithm for accepting contracts (under assumptions)

Page 111: Online Ad Serving: Theory and Practice - SIGMETRICS

Outline of this talk

I Ad delivery for contract based settings
  I Planning
  I Ad Serving

I Ad serving in repeated auction settings
  I General architecture.
  I Allocation for budget constrained advertisers.

I Other interactions
  I Learning + allocation
  I Learning + auction
  I Auction + contracts

Page 113: Online Ad Serving: Theory and Practice - SIGMETRICS

Three main theory/practice problems

Page 114: Online Ad Serving: Theory and Practice - SIGMETRICS

Combined Allocation with AdX: Objective

Short-term: boost revenue from AdX

VS

Long-term: prioritize quality of guaranteed contracts.

Find allocation policy to maximize

yield = revenue(AdX) + γ · quality(advertisers),

where γ ≥ 0 is a tradeoff parameter (or Lagrange multiplier).

Page 115: Online Ad Serving: Theory and Practice - SIGMETRICS

Publisher’s Decisions

[Flowchart: the n-th impression arrives with quality $Q_n$. Branches: assign to an advertiser or discard; or submit to AdX with price $p$, then either obtain payment, or assign to an advertiser or discard.]

Optimal to always test the exchange!

Page 117: Online Ad Serving: Theory and Practice - SIGMETRICS

Impact of AdX








[Figure: Pareto efficient frontier. One axis is AdX revenue (ticks 350 to 450); the other axis label did not survive.]

I γ = 0: maximum revenue from AdX.

I γ = ∞: maximum placement quality for contracts.

Page 118: Online Ad Serving: Theory and Practice - SIGMETRICS

Display Ad Delivery


[Diagram: Display Ad Delivery. Labels: Ad Serving; Targeting; Allocation; Forecasting; Delivery Constraints, Budget; Strategic, Stochastic; Online, Stochastic; Offline, Online; Demand for ads; Supply of impressions.]

Page 120: Online Ad Serving: Theory and Practice - SIGMETRICS

Online Learning & Allocation

I Value: Estimated Click-Through-Rate (CTR).

I Combined online capacity planning & learning?
  I Budgeted Active Learning
    I Madani, Lizotte, Greiner 2004, Active Model Selection.
  I Bayesian Budgeted Multi-armed Bandits:
    I Guha, Munagala, Multi-armed Bandits with Metric Switching Costs.
    I Goel, Khanna, Null, The Ratio Index for Budgeted Learning, with Applications.
    I Guha, Munagala, Pal, Multi-armed Bandit with Delayed ...
  I Budgeted Unknown-CTR Multi-armed Bandit
    I Pandey, Olston 2007, Handling Advertisement of Unknown Quality.


Page 124: Online Ad Serving: Theory and Practice - SIGMETRICS

Online CTR Learning: Mixed Explore/Exploit

I Pandey, Olston 2007, Handling Advertisement of Unknown Quality.

I Algorithm: Revised Greedy
  I Upon arrival of a query of type $i$, assign it to an ad $a$ maximizing

    $P_{ia} = \left(c_{ia} + \sqrt{\frac{2 \ln n_i}{n_{ia}}}\right) \cdots$

    where $c_{ia}$ is the current estimate of CTR, $n_{ia}$ is the number of times $i$ has been assigned to $a$, and $n_i$ is the number of queries of type $i$ so far.

I Thm [PO07]: ALG $\ge \frac{\mathrm{opt}}{2} - O(\ln n)$, where $n$ is the number of ...
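The selection rule above can be sketched as an optimistic (UCB-style) score. The slide's formula is cut off after the closing parenthesis, so the optional `weight` argument below is purely an assumption standing in for whatever factor followed; with no weight the score is just the CTR estimate plus the confidence bonus. The function name and the rule of trying unseen ads first are also assumptions.

```python
import math

def revised_greedy_pick(ads, ctr_est, n_ia, n_i, weight=None):
    """Pick the ad with the highest optimistic score for a query type i."""
    def priority(a):
        if n_ia[a] == 0:
            return math.inf   # assumption: show each ad once before scoring it
        bonus = math.sqrt(2 * math.log(n_i) / n_ia[a])
        scale = weight[a] if weight else 1.0   # hypothetical trailing factor
        return (ctr_est[a] + bonus) * scale
    return max(ads, key=priority)

# Ad "b" has a lower CTR estimate but far fewer trials, so its bonus dominates.
pick = revised_greedy_pick(
    ["a", "b"],
    ctr_est={"a": 0.10, "b": 0.08},
    n_ia={"a": 100, "b": 4},
    n_i=104,
)
```

The bonus shrinks as an ad accumulates trials, so the rule drifts from exploration toward exploiting the best estimated CTR.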


Page 127: Online Ad Serving: Theory and Practice - SIGMETRICS

Hybrid ad serving: Contracts + Spot Auctions

Given a page view, and two types of advertisers:

I Contract-based.

I Auction-based.

I Decide who wins and how much they pay.
I Requirements:
  I For each contract-advertiser, meet its demand.
  I Implement the scheme using proxy-bidding for contract-advertisers in the spot auction.

Page 129: Online Ad Serving: Theory and Practice - SIGMETRICS

Hybrid ad serving: Contracts + Spot Auctions

I Naive solution: If a contract-adv is eligible and has not finished demand, then let it win the spot. Bid infinity for all auctions.

I Optimize for revenue: If the auction pressure (price) is low then let the contract-adv win. Bid a low bid for all auctions.

I Unfair to contract-adv, since low auction-price ⇒ it is a lower value impression.

I Ideally:
  I Provide contract-adv with a representative allocation, an equal slice of impressions from each price-point.
  I A price-oblivious scheme, i.e., bid without seeing the auction bids.
  I Revenue per auction: average auction-price of impressions given away to contract-advertisers is at most some target t.

Page 133: Online Ad Serving: Theory and Practice - SIGMETRICS

Obtaining representative allocations

Two main ideas:

1. Can implement any decreasing function $a(p)$ for the fraction of impressions of auction-price $p$.

2. Solve the system for well chosen distance functions:

   Minimize $\mathrm{dist}(U, a)$
   subject to
   $\int_p a(p) f(p)\,dp = d$
   $\int_p p\,a(p) f(p)\,dp \le t d$
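One way to see why a decreasing $a(p)$ is implementable without looking at the auction bids: draw a random proxy bid whose probability of being at least $p$ equals $a(p)$. The sketch below does this on a discrete price grid. It is an illustration under assumptions (the grid, the rule that the contract advertiser wins when its bid is at least the auction price, and the name `oblivious_bid` are not from the slides).

```python
def oblivious_bid(prices, a_vals, u):
    """Map a uniform draw u in [0, 1) to a proxy bid, or None for no bid.

    prices must be ascending and a_vals non-increasing with values in [0, 1].
    The returned bid is the largest price p_j with a_vals[j] > u, so
    Pr[bid >= p_j] = Pr[u < a_vals[j]] = a_vals[j].
    """
    bid = None
    for p, a in zip(prices, a_vals):
        if a > u:
            bid = p
    return bid

prices = [1, 2, 3]
a_vals = [0.9, 0.5, 0.2]

# Empirical check on an evenly spaced grid of u values: fraction of draws
# whose bid would win an impression with auction price 2.
grid = [(k + 0.5) / 1000 for k in range(1000)]
wins_at_2 = sum(1 for u in grid
                if (bid := oblivious_bid(prices, a_vals, u)) is not None and bid >= 2)
```

Since the bid is drawn before the auction price is seen, the scheme is price-oblivious, and the win frequency at each price point matches the chosen $a(p)$.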

Page 135: Online Ad Serving: Theory and Practice - SIGMETRICS

Online Learning & Auction Incentives

[Devanur, Kakade ’09; Babaioff, Sharma, Slivkins ’09]

I Multi-Armed Bandit algorithms achieve an “implicit” exploration-exploitation tradeoff to get a regret of $O(\sqrt{T})$ (e.g., UCB).

I Can these be run in tandem with truthful auctions? (e.g., 2nd price for a single slot).

I A naive explore-exploit method gets $O(T^{2/3})$ regret:
  I Explore ads for the first phase, giving them out for free.
  I Fix the CTRs thus learned in the first phase.
  I Run 2nd price auction for the 2nd phase.

I Can you do better than this simple decoupling?

I No!

Theorem [DK09, BSS09]: For every truthful auction (under certain assumptions), there exist bids, ctrs, s.t. regret = $\Omega(T^{2/3})$.
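The naive decoupled scheme can be sketched as a small simulation. Everything specific here is an assumption for illustration: the exploration phase lasts about $T^{2/3}$ rounds and shows ads round-robin for free, the winner is the ad with the highest bid times estimated CTR, and the per-click price is the runner-up's score divided by the winner's estimated CTR. The function name and the example numbers are hypothetical, and at least two ads are assumed.

```python
import random

def explore_then_auction(bids, true_ctr, T, rng):
    """Phase 1: free round-robin exploration. Phase 2: fixed 2nd-price auction."""
    ads = list(bids)
    explore_rounds = min(T, max(len(ads), round(T ** (2 / 3))))
    shows = {a: 0 for a in ads}
    clicks = {a: 0 for a in ads}
    for t in range(explore_rounds):
        a = ads[t % len(ads)]
        shows[a] += 1
        clicks[a] += rng.random() < true_ctr[a]   # simulated click, no charge
    # Freeze the CTR estimates learned during exploration.
    ctr = {a: (clicks[a] / shows[a] if shows[a] else 0.0) for a in ads}
    ranked = sorted(ads, key=lambda a: bids[a] * ctr[a], reverse=True)
    winner, runner_up = ranked[0], ranked[1]
    # Per-click price: the smallest bid that would still have won.
    price = bids[runner_up] * ctr[runner_up] / ctr[winner] if ctr[winner] > 0 else 0.0
    revenue = 0.0
    for _ in range(explore_rounds, T):
        if rng.random() < true_ctr[winner]:
            revenue += price
    return winner, price, revenue

winner, price, revenue = explore_then_auction(
    bids={"a": 1.0, "b": 1.0},
    true_ctr={"a": 0.5, "b": 0.05},
    T=8000,
    rng=random.Random(0),
)
```

Because the estimates are frozen before any payment is computed, bids cannot influence what is learned, which is what keeps the second phase a plain second-price auction.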

Page 138: Online Ad Serving: Theory and Practice - SIGMETRICS

Display Ad Delivery


[Diagram: Display Ad Delivery. Labels: Ad Serving; Targeting; Allocation; Delivery Constraints, Budget; Strategic, Stochastic; Online, Stochastic; Offline, Online; Demand for ads; Supply of impressions.]

Open Problems:
I Optimal combined online allocation & learning.
I Feature selection and correlation in learning CTR.
I Optimal combined stochastic planning and serving?

Page 139: Online Ad Serving: Theory and Practice - SIGMETRICS

Thank You