Context Adaptation in Image Search

Post on 05-Jun-2015





Presentation about context adaptation in image search, given at the 4th Twente/SIKS workshop (held on the occasion of Robin Aly's PhD defense).


  • 1. Context Adaptation in Image Search

2. Context Adaptation
     • GOAL: Present different photos to a sports journalist who queries for Beckham than to the glossy magazine editor issuing the same query

3. IPTC Categories
     • ACE (arts, culture, entertainment)
     • CLJ (crime, law & justice)
     • DIS (disasters & accidents)
     • EBF (economy, business & finance)
     • EDU (education)
     • ENV (environment)
     • HTH (health)
     • HUM (human interest)
     • LAB (labour, work)
     • LIF (lifestyle & leisure)
     • POL (politics)
     • REL (religion)
     • SCI (science & technology)
     • SOI (social issues)
     • SPO (sports)
     • WAR (unrest, conflicts, war)
     • WEA (weather)

4. What Context?
     • Collection context
         • One main IPTC category per image
         • 96,351 out of 97,760 images in 100k Belga Collection
         • Note: noisy data, in spite of it being edited content! E.g., we found lifestyle Beckham images annotated as SPO, and even typos in IPTC category assignment!
     • User context
         • Classified 813 users into IPTC categories to represent their main interest (based on Belga input about the users' organizations)

5. Filter on IPTC?
     • //image[@IPTC eq SPO][about(.,Beckham)]
     • Bad for recall: Not all images have been assigned IPTC categories
     • Bad for precision: Noisy assignment of IPTC categories to images
         • At least 4 of the top 10 SPO Beckham results do not show Beckham taking part in sporting activities

6. Retrieval Model
     • Re-rank results based on cluster membership:
       $$\lambda \, d(q) + (1 - \lambda) \sum_{c \in \mathrm{Clusters}} c(q) \, c(d)$$
       with d(q) = P(Q|D), c(q) = P(Q|c) and c(d) = P(D|c)
     • Modify scores based on the document's context: Oren Kurland and Lillian Lee. ACM Transactions on Information Systems (TOIS), 27(3), 2009.
     • Novelty in Vitalas: Modify scores based on the user's context
         • Cluster formation based on user clicks
         • Cluster selection based on user context

7. Retrieval Model
     • Cluster formation:
         • IPTC-image categories; forms disjoint clusters
         • IPTC-user categories of users who clicked the image; gives overlapping clusters
     • Cluster selection:
         • {dc}: cluster contains document
         • {uc}: cluster/@category corresponds to user's interests

8. Results on Click Prediction

              |        | image-based clusters              | user-based clusters
     NDCG     | D      | λ=0.0   λ=0.1   λ=0.4   λ=0.7     | λ=0.0   λ=0.1   λ=0.4   λ=0.7
     ---------+--------+-----------------------------------+----------------------------------
     ACE      | 0.1724 | 0.1423  0.1741  0.1721  0.1721    | 0.2070  0.1978  0.1767  0.1747
     EBF      | 0.5527 | 0.4744  0.5460  0.5497  0.5504    | 0.4882  0.5519  0.5509  0.5509
     EDU      | 0.0145 | 0.0163  0.0145  0.0145  0.0145    | 0.0165  0.0167  0.0155  0.0146
     HTH      | 0.1308 | 0.1347  0.1308  0.1308  0.1308    | 0.6342  0.3712  0.1934  0.1414
     HUM      | 0.1849 | 0.1612  0.1798  0.1772  0.1849    | 0.2109  0.2043  0.1776  0.1760
     LAB      | 0.1331 | 0.1543  0.1331  0.1331  0.1331    | 0.2164  0.2339  0.1817  0.1380
     LIF      | 0.1245 | 0.0888  0.1234  0.1233  0.1232    | 0.1894  0.1555  0.1121  0.1253
     POL      | 0.0723 | 0.0586  0.0704  0.0717  0.0721    | 0.1054  0.0990  0.0916  0.0769
     SOI      | 0.2880 | 0.1806  0.2883  0.2880  0.2880    | 0.2964  0.2970  0.2968  0.3008
     SPO      | 0.1811 | 0.1801  0.1809  0.1806  0.1807    | 0.2151  0.2005  0.1839  0.1820

     • Related literature on evaluation methodology: Carterette and Jones, NIPS 2007, and Carterette, Allan, and Sitaraman, SIGIR 2006.

9. No Adaptation: Greece

10. SPO Adaptation: Greece, collection-based clusters, λ=0.1

11. SPO Adaptation: Greece, collection-based clusters, λ=0.0

12. SPO Adaptation: Greece, user-based clusters, λ=0.1

13. SPO Adaptation: Greece, user-based clusters, λ=0.0

14. SPO Observations
     • Re-ranking pushes the sports-related images to the top
         • No more images about the fires
     • When λ=0.0 the initial retrieval score is not taken into account (initial text ranking ignored)
     • Minimal differences between collection-based and user-based cluster formation
         • Archivists consider as sports-related those images that users with sports-related interests click on

15. POL Adaptation: Greece, collection-based clusters, λ=0.1

16. POL Adaptation: Greece, collection-based clusters, λ=0.0

17. POL Adaptation: Greece, user-based clusters, λ=0.1

18. POL Adaptation: Greece, user-based clusters, λ=0.0

19. POL Observations
     • Re-ranking for a politics context shows a difference in interpretation between the archivist and the user group
         • Archivists focussed on the actual political rallies etc.
         • Users focussed on the forest fires

20. ACE Adaptation: Greece, collection-based clusters, λ=0.1

21. ACE Adaptation: Greece, collection-based clusters, λ=0.0

22. ACE Observations
     • Re-ranking for arts, culture and entertainment requires λ=0.0, to ignore the initial ranking and let the right images shine

23. No Adaptation: Beckham

24. SPO Adaptation: Beckham, collection-based clusters, λ=0.1

25. SPO Adaptation: Beckham, collection-based clusters, λ=0.0

26. HUM Adaptation: Beckham, collection-based clusters, λ=0.1

27. Conclusions so far
     • Adaptation also retrieves images not assigned an IPTC category, by considering clusters formed by the images clicked by users with the same interests
     • Alternative cluster formation approaches can be investigated; e.g., using visual features
     • Method easily adapted for personalised and/or collaborative search

28. Potential for Personalization
     • Which queries have the potential to benefit from context adaptation (personalisation)?
         • The ones for which different users click on different results
     • Can be studied by looking at the nDCG of one user, assuming another user's clicks are ideal
         • Jaime Teevan, Susan T. Dumais and Eric Horvitz. Potential for Personalization. ACM Transactions on Computer-Human Interaction (ToCHI), special issue on Data Mining for Understanding User Needs, 17(1), March 2010.
     • Novel in Vitalas: compare IPTC-defined user groups (instead of individual users)

29. P4P in Belga 100K

30. P4P in Belga 100K
     • nDCG high: low potential
         • Dean (0.8067)
         • King Albert II (0.7810)
         • Greece (0.3910)
     • nDCG low: high potential

31. No Adaptation: King Albert II

32. EBF Adaptation: King Albert II

33. POL Adaptation: King Albert II

34. No Adaptation: Dean

35. ACE Adaptation: Dean, user-based clusters

36. ACE Adaptation: Dean, collection-based clusters

37. Dean: Temporal Effect
     • Log files: Dean = Hurricane Dean
     • Still, query is quite ambiguous:
         • James Dean
         • Agyness Dean (a model)
         • a (university) dean
         • Dean Dealannoi
         • Howard Dean
         • Dean Martin
     • Context adaptation for Dean requires archivist

38. Future Work
     • Address various normalization issues
         • In context adaptation (due to NLLR approximation)
         • In potential for personalization/adaptation
     • Explore temporal dimension
         • Combinations of collection and user context?
     • Explore cross-media cluster-based retrieval
         • Use visual features in cluster formation

39. See also
     • CWI Vitalas demonstrations: context instead of user context: trained by query log
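
The re-ranking described on slides 6 and 7 (Retrieval Model) can be illustrated with a small sketch. It interpolates each image's initial retrieval score with the scores of the clusters that both contain the image and match the user's interest category. This is not the Vitalas implementation: the function name `rerank`, the weight name `lam`, the toy scores, and the choice of a uniform P(D|c) over cluster members are all assumptions made for illustration.

```python
from typing import Dict, List, Set, Tuple


def rerank(
    doc_scores: Dict[str, float],      # initial retrieval score per image, P(Q|D)
    clusters: Dict[str, Set[str]],     # cluster category -> images in that cluster
    cluster_scores: Dict[str, float],  # query score per cluster, P(Q|c)
    user_categories: Set[str],         # IPTC categories describing the user's interest
    lam: float = 0.1,                  # weight of the initial score (hypothetical name)
) -> List[Tuple[str, float]]:
    """Return (image, score) pairs sorted by the interpolated score."""
    reranked = {}
    for doc, p_q_d in doc_scores.items():
        cluster_part = 0.0
        for category, members in clusters.items():
            # Cluster selection: the cluster must contain the image and
            # its category must correspond to the user's interests.
            if doc in members and category in user_categories:
                # Assumption: P(D|c) is uniform over the cluster's members.
                p_d_c = 1.0 / len(members)
                cluster_part += cluster_scores[category] * p_d_c
        reranked[doc] = lam * p_q_d + (1.0 - lam) * cluster_part
    return sorted(reranked.items(), key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    # Toy data, invented for this sketch.
    scores = {"fire1": 0.5, "match1": 0.3, "rally1": 0.2}
    clusters = {"DIS": {"fire1"}, "SPO": {"match1"}, "POL": {"rally1"}}
    cluster_scores = {"DIS": 0.6, "SPO": 0.4, "POL": 0.4}
    for doc, score in rerank(scores, clusters, cluster_scores, {"SPO"}, lam=0.1):
        print(doc, round(score, 3))
```

With `lam=1.0` the sketch reproduces the initial ranking, and with `lam=0.0` the initial score is ignored, which mirrors the behaviour noted in the SPO observations. Overlapping user-based clusters need no special handling here, because an image simply accumulates a contribution from every selected cluster it belongs to.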