Post on 01-Apr-2015

Output URL Bidding

Panagiotis Papadimitriou, Hector Garcia-Molina (Stanford University)

Ali Dasdan, Santanu Kolay (eBay Inc.)

Related papers: VLDB 2011, InfoLab TR-939, AdAuctions 2009

Search Engine Results Page (SERP)

Organic Results

Sponsored Ads

Query

Sponsored Search Ads

Keyword Bidding

Advertiser Search Engines

KEYWORDS (# keywords = ~10K)

• the social network
• the matrix
• lord of the rings
• lotr III
• ...

Example SERPs

en.wikipedia.org/wiki/The_Social_Network

www.imdb.com/title/tt1285016/

www.imdb.com/title/tt133093/

en.wikipedia.org/wiki/The_Matrix

en.wikipedia.org/wiki/The_Lord_of_the_rings

en.wikipedia.org/wiki/The_Lord_of_the_rings

www.imdb.com/title/tt167260/

www.imdb.com/title/tt120737/

the social network

the matrix

the lord of the rings

lotr iii

Output Bidding

Advertiser Search Engines

URLs (# URLs = 2)

imdb.com AND wikipedia.org

Outline

• Architectures

• Bid Language

• Output bid/expression generation

• Spill Evaluation

• Experiments

Architectures: Current Search Engine Architecture

Architectures: Serialization

• Overview
– First, retrieve organic results
– Then, retrieve ads

• Pros
– Simplicity

• Cons
– Results latency

O: Organic Search System
S: Sponsored Search System

SERP

Architectures: Pipelining

• Split the organic search system into
– Or: retrieval subsystem (retrieves relevant docs)
– Op: post-processing subsystem (creates result snippets)

• Op and S run in parallel

• Pros
– No additional latency

• Cons
– Sponsored search system depends on the organic system

O: Organic Search System = Or + Op
S: Sponsored Search System

SERP

Architectures: Parallelization

• URLs with ads are known a priori

• S can use
– Or’: Or replica that indexes only URLs with ads

• Pros
– No additional latency
– Independent organic and sponsored search systems

• Cons
– More resources

O: Organic Search System (Or + Op)
S: Sponsored Search System
Or’: Small replica of Or
V: Ad validation

SERP
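The latency trade-off among the three architectures can be sketched with a back-of-the-envelope model. The stage timings below and the way the stages compose, in particular where the validation step V sits, are illustrative assumptions and not figures or formulas from the slides.

```python
from dataclasses import dataclass


@dataclass
class StageTimes:
    """Hypothetical per-stage latencies in milliseconds (illustrative only)."""
    t_or: float          # Or: organic retrieval
    t_op: float          # Op: organic post-processing (snippets)
    t_s: float           # S: sponsored search (ad retrieval)
    t_or_replica: float  # Or': small replica of Or over URLs with ads
    t_v: float           # V: ad validation


def serialization(t: StageTimes) -> float:
    # Organic results first, then ads: every stage adds up.
    return t.t_or + t.t_op + t.t_s


def pipelining(t: StageTimes) -> float:
    # S starts once Or has produced URLs and overlaps with Op.
    return t.t_or + max(t.t_op, t.t_s)


def parallelization(t: StageTimes) -> float:
    # Assumption: S works off Or' from the start; V needs both the ads
    # and the real Or output, and overlaps with Op.
    ads_ready = t.t_or_replica + t.t_s
    validated = max(ads_ready, t.t_or) + t.t_v
    return max(t.t_or + t.t_op, validated)


times = StageTimes(t_or=40, t_op=30, t_s=20, t_or_replica=10, t_v=5)
print(serialization(times), pipelining(times), parallelization(times))
```

With these made-up numbers the organic path alone takes 70 ms; serialization adds the full ad-retrieval time on top, while the other two designs hide it behind organic processing.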

Bid Language Model

• Output Expression
– e.g., a := (u1 ∧ u2) ∨ u3 ∨ (h1 ∧ h2)
– u: URL
  • e.g., en.wikipedia.org/wiki/The_Social_Network
– h: host
  • e.g., en.wikipedia.org

• Questions
– URLs or hosts or both?
– Complex or simple?
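A minimal sketch of how such an expression could be matched against the organic results of a query. It assumes the expression is a disjunction of conjuncts whose terms are URLs or hosts, that URLs are written without a scheme as in the slides, and that hosts are compared by exact string equality; the names `Expression`, `host_of`, and `matches` are hypothetical.

```python
from typing import FrozenSet, Iterable, List

# An output expression as a list of conjuncts; each conjunct is a set of
# terms, and a term is either a full URL or a bare host.
Expression = List[FrozenSet[str]]


def host_of(url: str) -> str:
    """Host part of a scheme-less URL such as 'en.wikipedia.org/wiki/X'."""
    return url.split("/", 1)[0]


def matches(expr: Expression, result_urls: Iterable[str]) -> bool:
    """True if some conjunct has all of its terms present in the results."""
    urls = set(result_urls)
    present = urls | {host_of(u) for u in urls}
    return any(conjunct <= present for conjunct in expr)


expr = [frozenset({"www.imdb.com", "en.wikipedia.org"})]
serp = ["en.wikipedia.org/wiki/The_Matrix", "www.imdb.com/title/tt133093/"]
print(matches(expr, serp))  # True: both hosts appear in the results
```

Treating each conjunct as a set keeps the check to a single subset test per conjunct, whether the terms are URLs, hosts, or a mix of both.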

Output Expression Generation: Motivation

• Use existing keyword campaigns to generate realistic output expressions to study

• the social network
• the matrix
• lord of the rings
• lotr III
• ...

Output Expression Generator

imdb.com AND wikipedia.org

Output Expression Generation: Motivating Example

• Problem
– INPUT: keyword set R
– OUTPUT: expression a that “covers” R

• Candidate solutions
– a1 := u1 u2 u3
– a2 := u1 u4
– a3 := u5

Output Expression Generation: Objectives

• Compactness: contain few URLs

• Spill minimization: do not match “irrelevant” queries

Output expression    Size |a|    Spill spill(a, R)
a1 := u1 u2 u3       3           {}
a2 := u1 u4          2           {q5}
a3 := u5             1           {q4, q5, q6}

Output Expression Generation: Problem Statement

• Query Set Output Cover
– minimize γ|a| + (1 − γ)|spill(a, R)|
– subject to m(a, q), ∀ q ∈ R

• γ: regularization parameter

• Related to
– Set Cover
– Red-Blue Set Cover
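As a worked example of the objective, the sketch below scores the three candidate expressions using the sizes and spill sets listed on the Objectives slide, assuming all three satisfy the coverage constraint. The γ values tried here and the helper names `cost` and `best` are illustrative choices, not taken from the slides.

```python
# Size |a| and spill(a, R) of each candidate, as listed on the Objectives slide.
candidates = {
    "a1": (3, set()),
    "a2": (2, {"q5"}),
    "a3": (1, {"q4", "q5", "q6"}),
}


def cost(size: int, spill: set, gamma: float) -> float:
    """gamma * |a| + (1 - gamma) * |spill(a, R)|."""
    return gamma * size + (1 - gamma) * len(spill)


def best(gamma: float) -> str:
    """Name of the candidate with the lowest cost for this gamma."""
    return min(candidates, key=lambda name: cost(*candidates[name], gamma))


for gamma in (0.2, 0.6, 0.8):
    print(gamma, best(gamma))
# 0.2 a1
# 0.6 a2
# 0.8 a3
```

A small γ penalizes spill most and favors the longer, spill-free a1; a large γ penalizes size most and favors the single-URL a3.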

Output Expression Generation: Greedy Algorithm

• Pre-compute
– C[u]: queries covered by URL u
– S[u]: spill of URL u w.r.t. R

• Algorithm
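The algorithm itself did not survive in this transcript. The following is a generic greedy heuristic in the spirit of Set Cover, not necessarily the authors' procedure: it assumes the expression is a plain disjunction of URLs, and at each step it picks the URL with the lowest added cost per newly covered query. The toy data is made up to be consistent with the sizes and spills of a1, a2, and a3, and the name `greedy_cover` is hypothetical.

```python
from typing import Dict, List, Set


def greedy_cover(
    R: Set[str],
    C: Dict[str, Set[str]],
    S: Dict[str, Set[str]],
    gamma: float,
) -> List[str]:
    """Pick URLs until every query in R is covered.

    C[u]: queries of R covered by URL u; S[u]: spill of u w.r.t. R.
    """
    uncovered = set(R)
    spill: Set[str] = set()
    chosen: List[str] = []
    while uncovered:
        best_url, best_ratio = None, float("inf")
        for u in sorted(C):  # sorted for deterministic tie-breaking
            gain = len(C[u] & uncovered)
            if gain == 0:
                continue
            # Cost of adding u: one more URL plus any spill not yet incurred.
            added = gamma + (1 - gamma) * len(S.get(u, set()) - spill)
            if added / gain < best_ratio:
                best_url, best_ratio = u, added / gain
        if best_url is None:
            raise ValueError("R cannot be covered by the given URLs")
        chosen.append(best_url)
        uncovered -= C[best_url]
        spill |= S.get(best_url, set())
    return chosen


# Hypothetical toy instance (not from the slides).
R = {"q1", "q2", "q3"}
C = {"u1": {"q1"}, "u2": {"q2"}, "u3": {"q3"},
     "u4": {"q2", "q3"}, "u5": {"q1", "q2", "q3"}}
S = {"u4": {"q5"}, "u5": {"q4", "q5", "q6"}}
print(greedy_cover(R, C, S, gamma=0.2))  # ['u1', 'u2', 'u3']
print(greedy_cover(R, C, S, gamma=0.8))  # ['u5']
```

Like greedy Set Cover, this is a heuristic: it returns a feasible cover when one exists but does not guarantee the minimum of the objective.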

Spill Evaluation

• Spill queries may be relevant to R

• Divide spill(a, R) into
– positive: relevant
– negative: irrelevant

• Use query clustering for evaluation

• Example
– a := u2 u3
– Positive spill = {q1}
– Negative spill = {q5}
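One way the clustering-based split could work is sketched below. It assumes a precomputed mapping from queries to cluster ids and treats a spill query as positive when it shares a cluster with at least one query in R; the slides do not give the exact rule, and the keyword set and cluster assignment in the example are hypothetical.

```python
from typing import Dict, Set, Tuple


def split_spill(
    spill: Set[str],
    R: Set[str],
    cluster_of: Dict[str, int],
) -> Tuple[Set[str], Set[str]]:
    """Split spill into (positive, negative) using query clusters.

    A spill query is positive when it falls in the same cluster as at least
    one query of R; queries without a cluster are treated as negative.
    """
    relevant_clusters = {cluster_of[q] for q in R if q in cluster_of}
    positive = {q for q in spill if cluster_of.get(q) in relevant_clusters}
    return positive, spill - positive


# Hypothetical keyword set and clusters chosen to reproduce the slide's split.
R = {"q2", "q3"}
cluster_of = {"q1": 0, "q2": 0, "q3": 0, "q5": 1}
positive, negative = split_spill({"q1", "q5"}, R, cluster_of)
print(positive, negative)  # {'q1'} {'q5'}
```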

Experimental Evaluation: Goals

• Compare output URL bidding variations
– 1-URL, 2-URL, 3-URL
  • e.g., 2-URL: use only URLs, up to 2 URLs in a disjunct
– 1-host, 2-host, 3-host
– 1-mixed, 2-mixed

• Comparison criteria
– Compactness-spill trade-off
– Spill evaluation

Experimental Evaluation: Setup

• Dataset (from Yahoo query logs)
– 12,931,117 queries
– 62,666,514 URLs
– 7,185,392 hosts
– 2,251 ads

• Process
– For each variation (1-URL, 2-URL, …)
  • For different γ values
    – Generate output expressions for all 2,251 ads

Experimental Evaluation: Compactness vs. Spill

Experimental Evaluation: Positive and Negative Spill

Experimental Evaluation: Summary

• Compactness-spill trade-off
– Using both URLs and hosts outperforms the other options
– Up to 2 conjuncts in a disjunct is sufficient

• Spill evaluation
– Output expressions can bring additional queries

• Other experiments in combining keyword and output bidding
– Output expressions are suitable for half of the keywords
– Using only hosts seems to be sufficient

Conclusions

• Output URL bidding can be implemented efficiently

• Advantages over keyword bidding
– Bid compactness
– More relevant queries

THANK YOU!
