  • Personalization in Online Services

    Measurement, Analysis, and Implications

    A Dissertation Presented

    by

    Aniko Hannak

    to

    The College of Computer and Information Science

    in partial fulfillment of the requirements

    for the degree of

    Doctor of Philosophy

    in

    Computer Science

    Northeastern University

    Boston, Massachusetts

    April 2016

  • Contents

    List of Figures
    List of Tables
    Acknowledgments
    Abstract of the Dissertation

    1 Introduction
      1.1 Outline of the Presented Studies
        1.1.1 Methodology for Measuring Personalization
        1.1.2 Web Search Personalization
        1.1.3 Location-based Personalization
        1.1.4 Personalization of E-commerce

    2 Background and Related Work
      2.1 Search Personalization
      2.2 Personalization of E-commerce
      2.3 Methodology

    3 Measuring Personalization
      3.1 Experiment Design
      3.2 Data Collection
      3.3 Implementation
      3.4 Measurement Metrics

    4 Measuring Web Search Personalization
      4.1 Introduction
      4.2 Methods
        4.2.1 Terminology
        4.2.2 Experiment Design
        4.2.3 Search Queries
        4.2.4 Scope
      4.3 Real-World Personalization
        4.3.1 Collecting Real-World Data
        4.3.2 Results
      4.4 Personalization Features
        4.4.1 Collecting Synthetic Account Data
        4.4.2 Basic Features
        4.4.3 Historical Features
      4.5 Quantifying Personalization
        4.5.1 Temporal Dynamics
        4.5.2 Personalization of Query Categories
        4.5.3 Personalization and Result Ranking
        4.5.4 Personalization and Aggregated Search
      4.6 Concluding Discussion

    5 The Impact of Geolocation on Web Search Personalization
      5.1 Introduction
      5.2 Methodology
        5.2.1 Locations and Search Terms
        5.2.2 Data Collection and Parsing
        5.2.3 Measuring Personalization
      5.3 Analysis and Findings
        5.3.1 Noise
        5.3.2 Personalization
      5.4 Concluding Discussion

    6 Measuring Personalization of Ecommerce Sites
      6.1 Introduction
      6.2 Methodology
        6.2.1 Definitions
        6.2.2 E-commerce Sites
        6.2.3 Searches
      6.3 Real-World Personalization
        6.3.1 Data Collection
        6.3.2 Price Steering
        6.3.3 Price Discrimination
        6.3.4 Per-User Personalization
        6.3.5 Summary
      6.4 Personalization Features
        6.4.1 Experimental Overview
        6.4.2 Hotels
        6.4.3 General Retailers
      6.5 Concluding Discussion

    7 Conclusion

    Bibliography

  • List of Figures

    4.1 Example page of Google Search results.
    4.2 Example page of Bing Search results.
    4.3 Example of result carry-over, searching for hawaii then searching for urban outfitters.
    4.4 Overlap of results when searching for test followed by touring compared to just touring for different waiting periods.
    4.5 Results for the no-SSL versus SSL experiment on Google Search.
    4.6 Usage of Google/Microsoft services by AMT workers.
    4.7 % of AMT and control results changed at each rank.
    4.8 Results for the cookie tracking experiments on Google and Bing.
    4.9 Results for the browser experiments on Google, Bing, and DuckDuckGo.
    4.10 Results for the geolocation experiments on Google, Bing, and DuckDuckGo.
    4.11 Results for the User Profile: Gender experiments on Google and Bing.
    4.12 Results for the Search History: Age Bracket experiments on Google and Bing.
    4.13 Results for the targeted domain clicking experiments on Google and Bing.
    4.14 Day-to-day consistency of results for the cookie tracking experiments.
    4.15 Day-to-day consistency within search query categories for the cookie tracking experiments.
    4.16 Differences in search results for five query categories on Google Search and Bing.
    4.17 The percentage of results changed at each rank on Google Search and Bing.
    4.18 Movement of results to and from rank 1 for personalized searches.
    4.19 Rank of embedded services in search results from Google and Bing.
    4.20 Percentage of embeddings of different services on Google and Bing.

    5.1 Example search results from the mobile version of Google Search.
    5.2 Average noise levels across different query types and granularities. Error bars show standard deviations.
    5.3 Noise levels for local queries across three granularities.
    5.4 Amount of noise caused by different types of search results for local queries.
    5.5 Average personalization across different query types and granularities. Black bars show average noise levels from Figure 5.2.
    5.6 Personalization of each search term for local queries.
    5.7 Amount of personalization caused by different types of search results.
    5.8 Personalization of 25 locations, each compared to a baseline location, for local queries. The red line compares two treatments at the baseline location (i.e., the experimental control), and thus shows the noise floor.
    5.9 Correlation of physical distance and edit distance.

    6.1 Previous usage (i.e., having an account and making a purchase) of different e-commerce sites by my AMT users.
    6.2 Average Jaccard index (top), Kendall's τ (middle), and nDCG (bottom) across all users and searches for each web site.
    6.3 Percent of products with inconsistent prices (bottom), and the distribution of price differences for sites with 0.5% of products showing differences (top), across all users and searches for each web site. The top plot shows the mean (thick line), 25th and 75th percentile (box), and 5th and 95th percentile (whisker).
    6.4 Exam
