Online Expansion of Rare Queries for Sponsored Search

Defended by Mykell Miller

Summary: The Short Version

This paper describes and evaluates a method of determining which ads to display on a search engine result page. Users input varied queries, so it is beneficial to post ads pertaining not only to the query, but to related queries as well. However, previous methods of finding these related queries and transforming them into ads take a long time, and are therefore done offline. This paper describes a method that allows some of the work to be done on the fly without too much overhead.

Why it’s good: The Short Version

Useful
• Ads fund search engines
• If ads were more relevant, Jared might actually click on them
• The method shows statistically significant improvement in making ads more relevant, at a low overhead

Interesting
• Interestingness is subjective, but this is MY defense

Well-written
• Well-organized
• I could actually understand the math because they very clearly told me what all the variables meant
• They defined all the relevant terms and summarized all the references so I didn’t have to read 32 other papers.

Time Travel
• This paper is only three weeks old
• A paper that was published in April cited it

Now for the long version…

What this paper is about

Broad matching is where an ad is displayed when its bid phrase is similar to, but not exactly the same as, the query the user entered.

What this paper is about

Sponsored Search
• A.K.A. Paid search advertising
• On Search Engine Result Pages
• All major web search engines do this

Context Match
• A.K.A. Contextual Advertising
• On other websites
• What we looked at last Wednesday

More on Sponsored Search

The authors assume a pay-per-click model
• Google, Yahoo, and Microsoft all use this model

Bid Phrases
• This is the query that will result in showing this ad.

Bidding system
• An advertiser pays the search company whatever it wants to associate its ad with a bid phrase
• If an advertiser pays more, its ad gets a higher ranking.

Example:
• High Bidders pays $1,000,000,000,000,000,000,000 for the bid phrase “Dummy Query”
• Low Bidders pays $1 for the bid phrase “Dummy Query”
• When I search for “Dummy Query” I see High Bidders’ ad first, then Low Bidders’ ad.
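The Dummy Query example can be written out as a tiny sketch. It follows the slide's simplification that rank depends only on the bid amount; the `Bid` class and `rank_ads` function are names made up for this illustration, and real engines typically mix in other signals as well.

```python
from dataclasses import dataclass


@dataclass
class Bid:
    advertiser: str
    bid_phrase: str
    amount: float  # dollars offered for this bid phrase


def rank_ads(bids, query):
    """Return advertisers whose bid phrase exactly matches the query, highest bid first."""
    matching = [b for b in bids if b.bid_phrase.lower() == query.lower()]
    matching.sort(key=lambda b: b.amount, reverse=True)
    return [b.advertiser for b in matching]


bids = [
    Bid("Low Bidders", "Dummy Query", 1.0),
    Bid("High Bidders", "Dummy Query", 1e21),  # $1,000,000,000,000,000,000,000
]
ranking = rank_ads(bids, "Dummy Query")
print(ranking)
```

Note that this is exact matching only: a query such as "Dummy Queries" returns nothing, which is the gap broad matching is meant to fill.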

More on Sponsored Search

[Diagram: the advertiser hierarchy. The system holds many advertisers; an advertiser has accounts; an account has ad campaigns; an ad campaign has ad groups; an ad group contains a creative and bid phrases.]

Why Do This Paper?

30-40% of search engine result pages have no ads on them because Google, Yahoo, etc. don’t know what queries are similar to the bid phrase

Previous work has developed systems that are far too inefficient to use in real life

My Own Experiment

Query: Banana Bread

Query: Nut-Free Banana Bread

Query: Vegan Banana Bread

Why do tail queries have so few ads?

They are often harder to interpret than more common (head and torso) queries

There are rarely exact matches for bid queries

There is little historical click data

Search engines don’t like posting irrelevant ads

What does this paper accomplish?

Online query expansion for tail queries

New way to index query expansions for fast computation of query similarity

A way to go from pre-expanded queries to expanding related queries on the fly

A ranking and scoring method

The Architecture of their system

Query Feature Extraction

Unigrams
• Process them via:
  Stemming
  • Taking words like “Extraction” and “Extracting” and stemming them to “Extract”
  Stop words
  • Ignoring words you don’t like

Phrases
• Multi-word phrases are from a dictionary of ~10 million phrases gathered from query logs and web pages

Semantic Classes
• Developed a hierarchical taxonomy of 6000 semantic classes
• Annotate each query with the 5 most likely semantic classes
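Here is a minimal sketch of the unigram and phrase steps, under heavy assumptions: the stop-word list, the three-suffix stemmer, and the two-entry phrase dictionary are toy stand-ins invented for this example (the real system uses a dictionary of ~10 million phrases, and its stemmer is not specified on the slide). Semantic classes are left out because they need a trained classifier.

```python
# Toy stand-ins, NOT the paper's resources.
STOP_WORDS = {"a", "an", "the", "of", "for", "to", "in"}
PHRASES = {"banana bread", "query expansion"}
SUFFIXES = ("ion", "ing", "s")  # crude stemmer: strip one known suffix


def stem(word):
    """Strip the first matching suffix, keeping at least three characters."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def extract_features(query):
    """Return stemmed, stop-word-free unigrams and dictionary bigram phrases."""
    tokens = query.lower().split()
    unigrams = [stem(t) for t in tokens if t not in STOP_WORDS]
    bigrams = [" ".join(tokens[i:i + 2]) for i in range(len(tokens) - 1)]
    phrases = [p for p in bigrams if p in PHRASES]
    return {"unigrams": unigrams, "phrases": phrases}


print(extract_features("Extracting the Extraction"))
print(extract_features("vegan banana bread"))
```

Both "Extracting" and "Extraction" come out as "extract", and "banana bread" is picked up as a phrase because it is in the toy dictionary.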

Related Query Retrieval

Now we have a pseudo-query made up of features.

Compare this pseudo-query to our inverted index and pull out related pseudo-queries

Runs a system that pulls out key words then calculates the similarity using a dot product
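The retrieval step can be sketched roughly like this. The stored pseudo-queries, their weights, and the function names are all invented for illustration; the point is only that the inverted index narrows the candidates to pseudo-queries sharing at least one feature, and a dot product then scores them.

```python
from collections import defaultdict


def build_inverted_index(stored):
    """Map each feature to the ids of the stored pseudo-queries containing it."""
    index = defaultdict(set)
    for qid, features in stored.items():
        for feature in features:
            index[feature].add(qid)
    return index


def dot(u, v):
    """Dot product of two sparse feature -> weight vectors."""
    return sum(w * v[f] for f, w in u.items() if f in v)


def related_queries(query, stored, index, top_k=2):
    """Score only candidates that share a feature with the query."""
    candidates = set()
    for feature in query:
        candidates |= index.get(feature, set())
    scored = [(qid, dot(query, stored[qid])) for qid in candidates]
    scored.sort(key=lambda pair: (-pair[1], pair[0]))
    return scored[:top_k]


# Hypothetical pre-processed pseudo-queries (feature -> weight).
stored = {
    "banana bread recipe": {"banana": 1.0, "bread": 1.0, "recipe": 0.5},
    "vegan cookies": {"vegan": 1.0, "cookie": 1.0},
    "car insurance": {"car": 1.0, "insurance": 1.0},
}
index = build_inverted_index(stored)
query = {"vegan": 0.8, "banana": 1.0, "bread": 1.0}
results = related_queries(query, stored, index)
print(results)
```

"car insurance" shares no feature with the query, so it is never scored at all.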

Query Expansion

Q* is the set of features describing the original query and its related queries

The weight of a given feature in Q* is a linear combination of its weight in the original and related queries

This expansion is efficient because you’re only looking at the features in related queries
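A rough sketch of the linear combination follows. The mixing parameter `alpha` and the choice to weight each related query by its normalized similarity are assumptions made for this example; the slide only says that the weight in Q* is a linear combination of the weights in the original and related queries.

```python
def expand_query(original, related, alpha=0.5):
    """Build Q* from the original query and its related queries.

    original: feature -> weight
    related:  list of (similarity, feature -> weight) pairs
    alpha:    hypothetical share of the weight kept by the original query
    """
    expanded = {f: alpha * w for f, w in original.items()}
    total_sim = sum(sim for sim, _ in related)
    if total_sim > 0:
        # Only features that occur in related queries are visited here.
        for sim, features in related:
            share = (1 - alpha) * sim / total_sim
            for f, w in features.items():
                expanded[f] = expanded.get(f, 0.0) + share * w
    return expanded


original = {"banana": 1.0, "bread": 1.0}
related = [(2.0, {"banana": 1.0, "bread": 1.0, "recipe": 0.5})]
expanded = expand_query(original, related)
print(expanded)
```

The expanded query gains a "recipe" feature that the original never mentioned, which is what lets it match ads the original query would miss.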

Ad Feature Weighting

Extract the same features from the bid phrases of ad groups as from queries (unigrams, phrases, semantic classes)

Since the weighting from the queries would unfairly benefit short ad groups, use the BM25 weighting scheme.
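For reference, here is the textbook Okapi BM25 weight for one feature in one ad group. This is the standard formula with the commonly used defaults `k1=1.2` and `b=0.75` and a smoothed IDF; the slides do not say which variant or parameter values the authors used, so treat those as assumptions.

```python
import math


def bm25_weight(tf, doc_len, avg_doc_len, n_docs, doc_freq, k1=1.2, b=0.75):
    """Standard Okapi BM25 weight of a feature in one document (here: an ad group).

    tf:          occurrences of the feature in the ad group
    doc_len:     number of features in the ad group
    avg_doc_len: average ad group length
    n_docs:      number of ad groups
    doc_freq:    number of ad groups containing the feature
    """
    idf = math.log((n_docs - doc_freq + 0.5) / (doc_freq + 0.5) + 1.0)
    length_norm = k1 * (1 - b + b * doc_len / avg_doc_len)
    return idf * tf * (k1 + 1) / (tf + length_norm)


# Same feature, same count, different ad group lengths.
print(bm25_weight(tf=1, doc_len=5, avg_doc_len=10, n_docs=1000, doc_freq=10))
print(bm25_weight(tf=1, doc_len=20, avg_doc_len=10, n_docs=1000, doc_freq=10))
```

The two things BM25 brings are saturation (ten occurrences are worth less than ten times one occurrence) and an explicit knob, `b`, for how much ad group length is allowed to matter.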

Title Match Boosting

Increases the score of ads whose titles match the original query very well

Scoring Function

The end result of all this

A weighted sum of dot products between features and the title match boost
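One way to read that sentence, as a sketch only: take one dot product per feature type, weight and sum them, then add the title boost. The per-type weights below and the additive form of the boost are guesses made for illustration; the slide does not give the actual formula.

```python
def dot(u, v):
    """Dot product of two sparse feature -> weight vectors."""
    return sum(w * v[f] for f, w in u.items() if f in v)


# Hypothetical weights per feature type; not taken from the paper.
FEATURE_TYPE_WEIGHTS = {"unigrams": 1.0, "phrases": 2.0, "classes": 0.5}


def score_ad(query_features, ad_features, title_boost=0.0):
    """Weighted sum of per-type dot products, plus an (assumed additive) title boost."""
    total = 0.0
    for ftype, weight in FEATURE_TYPE_WEIGHTS.items():
        total += weight * dot(query_features.get(ftype, {}), ad_features.get(ftype, {}))
    return total + title_boost


query_features = {
    "unigrams": {"banana": 1.0, "bread": 1.0},
    "phrases": {"banana bread": 1.0},
}
ad_features = {
    "unigrams": {"bread": 0.5},
    "phrases": {"banana bread": 1.0},
}
print(score_ad(query_features, ad_features))
print(score_ad(query_features, ad_features, title_boost=0.5))
```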

Now on to the results!

Test Set

Test set: 400 random rare queries from Yahoo
• 121 were in the lookup table, 279 were not
• Eliminated the 10% of rare queries that were foreign

Human editors judged the top 3 ads.
• 3556 judgments

The system was built from every ad Yahoo has and 100 million queries from U.S. Yahoo

Metrics

Discounted Cumulative Gain (DCG)
• “a measure of effectiveness of a Web search engine algorithm or related applications, often used in information retrieval. Using a graded relevance scale of documents in a search engine result set, DCG measures the usefulness, or gain, of a document based on its position in the result list. The gain is accumulated cumulatively from the top of the result list to the bottom with the gain of each result discounted at lower ranks.” – Wikipedia
• DCG is a number; higher numbers are better

Precision-Recall Curves
• Precision: Fraction of results returned that are relevant
• Recall: Fraction of relevant results that are returned
• A way to visualize it; higher is better
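Both metrics are short enough to write out. The DCG below uses the common form with a log2(rank + 1) discount; several variants exist, and the slides do not say which one, or which gain values per editorial grade, the paper used. The example grades and ad ids are made up.

```python
import math


def dcg(gains):
    """Discounted cumulative gain: the gain at rank r is divided by log2(r + 1)."""
    return sum(g / math.log2(rank + 1) for rank, g in enumerate(gains, start=1))


def precision_recall(returned, relevant):
    """Precision = relevant share of what was returned; recall = returned share of what is relevant."""
    returned, relevant = set(returned), set(relevant)
    hits = len(returned & relevant)
    precision = hits / len(returned) if returned else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# The same three editorial grades in two different orders.
print(dcg([3, 2, 0]))  # best ad shown first
print(dcg([0, 2, 3]))  # best ad shown last
print(precision_recall(["ad1", "ad2", "ad3"], ["ad1", "ad3", "ad4", "ad5"]))
```

Putting the best ad first gives the higher DCG even though the set of ads is identical, which is why the metric suits a ranked list of top 3 ads.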

Ad Matching Algorithms Tested

Baseline
• The original, unexpanded version of the query vector

Offline Expansion
• Expands the original query by pre-processing offline only

Online Expansion
• Expands the original query by processing online only

Online + Offline Expansion
• Expands the original query using both offline and online expansion algorithms

Test Results: Queries not found in lookup table

Tested the baseline vs online expansion

The online expansion gave statistically significant improvements

Test Results: Queries found in lookup table

Tested all 4 algorithms

Best: offline expansion

Second best: online + offline expansion

Difference between the two was not statistically significant

Test results: full set

Tested on all four algorithms

Best: online + offline expansion

Online expansion also offers statistically significant improvement

Even better: hybrid

Efficiency

The table lookup takes only 1 ms

Least efficient when a query is not in the lookup table

When a query is not in the lookup table, there is a 50% overhead
• This is bad

But given the small proportion of queries not in the lookup table, the estimated average is 12.5% overhead
• This is good
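A quick back-of-the-envelope check on those two numbers. The slides do not state what fraction of queries miss the lookup table; the figure below is only what the 50% and 12.5% values imply if queries that hit the table add no overhead at all, which is an assumption of this sketch.

```python
miss_overhead = 0.50      # overhead when the query is not in the lookup table
average_overhead = 0.125  # estimated average overhead over all queries
hit_overhead = 0.0        # ASSUMPTION: lookup hits cost nothing extra

# average = miss_fraction * miss_overhead + (1 - miss_fraction) * hit_overhead
implied_miss_fraction = (average_overhead - hit_overhead) / (miss_overhead - hit_overhead)
print(f"Implied share of queries missing the lookup table: {implied_miss_fraction:.0%}")
```

Under that assumption, roughly one query in four would be paying the 50% penalty.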
