Contextual Search and Exploration, RuSSIR 2015, Saint Petersburg, Russia. Charles L. A. Clarke (University of Waterloo, Canada), Jaap Kamps (University of Amsterdam, The Netherlands), Julia Kiseleva (Eindhoven University of Technology, The Netherlands), Grace Hui Yang (Georgetown University, USA), with special thanks to Adriel Dean-Hall, Waterloo.


Contextual Search and Exploration
RuSSIR 2015
Saint Petersburg, Russia

Charles L. A. Clarke
University of Waterloo, Canada

Jaap Kamps
University of Amsterdam, The Netherlands

Julia Kiseleva
Eindhoven University of Technology, The Netherlands

Grace Hui Yang
Georgetown University, USA

(with special thanks to Adriel Dean-Hall, Waterloo)


Part 3: Hackathon

• You have two days (until Thursday at 18:00)

• Do something interesting with our data (or just something interesting on this topic), ideally in a group

• Presentation and prizes on Thursday


Hackathon

• Basic task: Take our profiles (and a bunch of training data and resources) and make recommendations for us.

• Could be done in a variety of ways (including manually).

• Or do something else with the data


Presentations on Thursday

• Everyone who participated gets up to five minutes to speak (with or without slides).

• Tell us what you did

• Tell us the results


The data

• http://plg.uwaterloo.ca/~claclark/russir2015/


Directory “Data”

Everything you really need to do the task.

• contexts2015spb.csv

• collection_2015_batch_requests_spb.csv

• batch_requests_combined.json

• sample_batch_response_combined.json

• batch_validate.py


contexts2015spb.csv

• Context contains all locations/cities.

id,city,state
151,New York City,NY
152,Chicago,IL
...
421,Walla Walla,WA
422,Lewiston,ID
423,Saint Petersburg,Russia
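
A minimal Python sketch of how the contexts file might be loaded into a lookup table. It assumes the file starts with the `id,city,state` header shown on the slide; the function name `load_contexts` and the inline sample are illustrative, not part of the distributed data.

```python
import csv
import io

# Sample rows copied from the slide; the real input would be contexts2015spb.csv.
SAMPLE_CONTEXTS = """id,city,state
151,New York City,NY
152,Chicago,IL
423,Saint Petersburg,Russia
"""


def load_contexts(fileobj):
    """Map numeric context id -> (city, state).

    Assumes a header row with the columns id, city, state.
    """
    reader = csv.DictReader(fileobj)
    return {int(row["id"]): (row["city"], row["state"]) for row in reader}


contexts = load_contexts(io.StringIO(SAMPLE_CONTEXTS))
print(contexts[423])
```

With the real file, the same function could be called as `load_contexts(open("contexts2015spb.csv", newline="", encoding="utf-8"))`, assuming the file is UTF-8 encoded.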


collection_2015_batch_requests_spb.csv

• Collection contains all venues (ID, ContextID, URL, title)

TRECCS-00000005-418,418,http://www.greatfallsmt.net/people_offices/park_rec/gibson.php,"Gibson Park"
TRECCS-00000007-418,418,http://www.bostons.com,"Bostons Restaurant Sports Bar"
TRECCS-00000101-423,https://foursquare.com/v/vinostudia/51401b0ee4b052f64a18688c,"Vinostudia"
TRECCS-00000102-423,https://foursquare.com/v/le-tour-de-vin/5370e6d8498e666a1bfe1c09,"Le Tour de Vin"
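
A minimal Python sketch of how the collection might be grouped by context. Because the sample rows above do not all carry a separate ContextID column, this sketch takes the context id from the numeric suffix of the venue ID and reads the URL and title from the last two fields; that is an assumption based only on the rows shown here. The name `load_venues` is hypothetical.

```python
import csv
import io
from collections import defaultdict

# Sample rows copied from the slide; the real input would be
# collection_2015_batch_requests_spb.csv.
SAMPLE_COLLECTION = '''TRECCS-00000005-418,418,http://www.greatfallsmt.net/people_offices/park_rec/gibson.php,"Gibson Park"
TRECCS-00000007-418,418,http://www.bostons.com,"Bostons Restaurant Sports Bar"
TRECCS-00000101-423,https://foursquare.com/v/vinostudia/51401b0ee4b052f64a18688c,"Vinostudia"
'''


def load_venues(fileobj):
    """Group venues by context id: {context_id: [(venue_id, url, title), ...]}."""
    venues = defaultdict(list)
    for row in csv.reader(fileobj):
        if not row:
            continue  # skip blank lines
        venue_id, url, title = row[0], row[-2], row[-1]
        # Assumption: the context id is the suffix after the last hyphen.
        context_id = int(venue_id.rsplit("-", 1)[1])
        venues[context_id].append((venue_id, url, title))
    return venues


venues = load_venues(io.StringIO(SAMPLE_COLLECTION))
for venue_id, url, title in venues[423]:
    print(venue_id, title)
```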


batch_requests_combined.json

• This is the main file: profiles and candidates in JSON

{
  "body" : {
    "group" : "Friends",
    "duration" : "Longer",
    "season" : "Autumn",
    "trip_type" : "Holiday",
    "person" : {
      …
      "id" : 1234568,
      "age" : "47",
      "gender" : "male"
    },
    "location" : {
      "id" : 423,
      "lat" : 59.95,
      "lng" : 30.3,
      "name" : "Saint Petersburg"
    }
  },
  "id" : 901,
  "candidates" : [
    "TRECCS-00000001-423",
    …
    "TRECCS-00000102-423"
  ]
}
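
A minimal Python sketch of how one request might be read. The sample is a cleaned-up copy of the slide with the elided parts left out, and it parses a single request object only: how the requests are laid out inside the real file (one array, or one object per line) is not shown on the slide. The name `summarize_request` is hypothetical.

```python
import json

# Cleaned-up copy of the request on the slide, with the elided parts omitted.
SAMPLE_REQUEST = """
{
  "id": 901,
  "body": {
    "group": "Friends", "duration": "Longer",
    "season": "Autumn", "trip_type": "Holiday",
    "person": {"id": 1234568, "age": "47", "gender": "male"},
    "location": {"id": 423, "lat": 59.95, "lng": 30.3,
                 "name": "Saint Petersburg"}
  },
  "candidates": ["TRECCS-00000001-423", "TRECCS-00000102-423"]
}
"""


def summarize_request(request):
    """Pull out the fields a recommender is likely to need first."""
    body = request["body"]
    return {
        "request_id": request["id"],
        "context_id": body["location"]["id"],
        "city": body["location"]["name"],
        # Trip attributes are read with .get() in case a request omits one.
        "trip": (body.get("group"), body.get("duration"),
                 body.get("season"), body.get("trip_type")),
        "num_candidates": len(request["candidates"]),
    }


request = json.loads(SAMPLE_REQUEST)
summary = summarize_request(request)
print(summary)
```

Note that `age` arrives as a string in the sample, so it would need converting before any numeric use.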


batch_requests_combined.json (profile)

• Preferences elsewhere:

"person" : {
  "preferences" : [
    { "documentId" : "TRECCS-00247656-160",
      "tags" : [ "Bar-hopping", "Clubbing" ],
      "rating" : "4" },
    { "documentId" : "TRECCS-00211603-161",
      "tags" : [ "Fast Food", "Restaurants" ],
      "rating" : "0" },
    …
  ],
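
One possible way to use the preferences is to turn them into a per-tag score, sketched below in Python. The neutral point of 2, the treatment of negative or unparsable ratings as "not rated", and the name `tag_profile` are all assumptions made for illustration; the slide only shows that ratings are strings such as "4" and "0".

```python
from collections import defaultdict

# The two preferences shown on the slide.
preferences = [
    {"documentId": "TRECCS-00247656-160",
     "tags": ["Bar-hopping", "Clubbing"], "rating": "4"},
    {"documentId": "TRECCS-00211603-161",
     "tags": ["Fast Food", "Restaurants"], "rating": "0"},
]

NEUTRAL_RATING = 2.0  # assumption: midpoint of an assumed 0..4 scale


def tag_profile(preferences, neutral=NEUTRAL_RATING):
    """Average (rating - neutral) per tag; positive means the tag is liked."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for pref in preferences:
        try:
            rating = float(pref["rating"])
        except (KeyError, TypeError, ValueError):
            continue  # missing or unparsable rating
        if rating < 0:
            continue  # assumption: a negative value means "not rated"
        for tag in pref.get("tags", []):
            totals[tag] += rating - neutral
            counts[tag] += 1
    return {tag: totals[tag] / counts[tag] for tag in totals}


profile = tag_profile(preferences)
print(sorted(profile.items(), key=lambda item: -item[1]))
```

Scoring candidates against such a profile would additionally need tags or categories for the candidate venues, which have to come from another resource.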


sample_batch_response_combined.json

• Example of a valid response (+ script to validate the format)

{
  "groupid" : "demo",
  "runid" : "demoA",
  "id" : 901,
  "body" : {
    "suggestions" : [
      "TRECCS-00000099-423",
      "TRECCS-00000006-423",
      …
      "TRECCS-00000079-423"
    ]
  }
}

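A minimal Python sketch of how a response in this shape might be assembled from a request and some per-candidate scores. The scoring itself is left open: `scores` is a hypothetical dictionary produced by whatever method a group chooses, and `build_response` is an illustrative name. Whether the output passes batch_validate.py would still need to be checked with that script.

```python
import json


def build_response(request, scores, groupid="demo", runid="demoA"):
    """Rank the request's candidates by descending score.

    Candidates without a score default to 0.0; sorted() is stable, so
    ties keep the order in which the candidates were listed.
    """
    ranked = sorted(request["candidates"],
                    key=lambda venue_id: -scores.get(venue_id, 0.0))
    return {
        "groupid": groupid,
        "runid": runid,
        "id": request["id"],
        "body": {"suggestions": ranked},
    }


# Toy request and hypothetical scores, for illustration only.
toy_request = {
    "id": 901,
    "candidates": ["TRECCS-00000001-423",
                   "TRECCS-00000006-423",
                   "TRECCS-00000099-423"],
}
toy_scores = {"TRECCS-00000099-423": 2.0, "TRECCS-00000001-423": 1.0}

response = build_response(toy_request, toy_scores)
print(json.dumps(response, indent=2))
```

Only candidates from the request appear in the output, so the ranking never suggests a venue outside the candidate list.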

Again, “Data”

Everything you really need to do the task.

• contexts2015spb.csv

• collection_2015_batch_requests_spb.csv

• batch_requests_combined.json

• sample_batch_response_combined.json

• batch_validate.py


Directory “Evaluation”

Everything you need to evaluate on the U.S./non-Spb data.

• TRECCS15_Batch_Candidates_graded.qrels: crowdsourced judgments on the candidates

• batch_response_to_trec.py: turn a JSON response into TREC format

• trec_eval.8.1.tar.gz: evaluate with trec_eval
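
For orientation, a Python sketch of what converting a response to the standard TREC run format involves. This is not the contents of batch_response_to_trec.py; it only illustrates the usual six-column run line (query id, the literal Q0, document id, rank, score, run tag). The function name and the way scores are derived from rank are assumptions.

```python
def response_to_trec_lines(response):
    """Return one run line per suggestion: qid Q0 docno rank score run_tag."""
    qid = response["id"]
    run_tag = response["runid"]
    suggestions = response["body"]["suggestions"]
    total = len(suggestions)
    lines = []
    for index, venue_id in enumerate(suggestions):
        rank = index + 1
        # Hypothetical score: trec_eval orders documents by score, so the
        # score only needs to decrease as the rank increases.
        score = total - index
        lines.append(f"{qid} Q0 {venue_id} {rank} {score} {run_tag}")
    return lines


sample_response = {
    "groupid": "demo",
    "runid": "demoA",
    "id": 901,
    "body": {"suggestions": ["TRECCS-00000099-423", "TRECCS-00000006-423"]},
}

trec_lines = response_to_trec_lines(sample_response)
print("\n".join(trec_lines))
```

A run file built this way is typically evaluated with `trec_eval <qrels file> <run file>`.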


Directory “Crawl”

If you want the crawled URLs (WARC format).

• crawls_batch_requests_TRECCS.zip: all web pages of U.S. venues

• collection_2015_spb_nodesc.zip: all web pages of Spb venues


Directory “Data”

Everything you really need to do the task.

• contexts2015spb.csv

• collection_2015_batch_requests_spb.csv

• batch_requests_combined.json

• sample_batch_response_combined.json

• batch_validate.py


Additional Information about USA attractions

(Directory “infoUS”)

• cat_dict.json: categories for each attraction id (from a commercial service)

• rating_dict.json: ratings for each attraction id (from a commercial service)


Full TREC collection of USA Attractions

(directory TREC)

• contexts2015.csv: mapping between numeric context ids and cities

• collection_2015.csv: triples mapping attraction id, context id, attraction URL


Discussion

• Ideas?

• Groups?