Entity Search: Are you searching for what you want?

Entity Search Are you searching for what you want? Kevin C. Chang Joint work with: Bin He, Zhen Zhang, Chengkai Li, Govind Kabra, Shui-Lung Chuang, Joe Kelley, Tao Cheng, Bill Davis, Mitesh Patel, Dave Killian


Post on 05-Feb-2016


TRANSCRIPT

Page 1: Entity Search Are you searching for what you want?

Entity Search
Are you searching for what you want?

Kevin C. Chang
Joint work with: Bin He, Zhen Zhang, Chengkai Li, Govind Kabra, Shui-Lung Chuang, Joe Kelley, Tao Cheng, Bill Davis, Mitesh Patel, Dave Killian

Page 2: Entity Search Are you searching for what you want?

Let’s start with the new universal greeting…

What have you been reading lately?

What have you been searching lately?

Page 3: Entity Search Are you searching for what you want?

From the MetaQuerier to WISDM: I am becoming superficial…

Kevin’s 2 projects in the 4 quadrants:

[Figure: four-quadrant diagram; labels: Access, Structure, Deep Web, Surface Web]

Page 4: Entity Search Are you searching for what you want?


First Question:

Where is U. of Illinois?

Can we search it?

Page 5: Entity Search Are you searching for what you want?

What have you been searching lately?

The university and area of Kevin Chang?
The email of Marc Snir?
Customer service phone number of Amazon?
What profs are doing databases at UIUC?
The papers and presentations of ICDE 2007?
Due date of SIGMOD 2007?
Sale price of “Canon PowerShot A400”?
“Hamlet” books available at bookstores?

Page 6: Entity Search Are you searching for what you want?

Are we searching for what we want?

Challenge of the surface Web: Despite all the glorious search engines…

Page 7: Entity Search Are you searching for what you want?


What you search is not what you want.

Page 8: Entity Search Are you searching for what you want?


Function follows view:

What is “the Web”? Or: How do search engines view the Web?

Page 9: Entity Search Are you searching for what you want?


They say: Web is a corpus of PAGES.

Page 10: Entity Search Are you searching for what you want?


We take an entity view of the Web:

Page 11: Entity Search Are you searching for what you want?

What is an “entity”? Your target of information, or anything.

Phone number
Email address
PDF
Image
Person name
Book title, author, …
Price (of something)

Page 12: Entity Search Are you searching for what you want?

From pages to entities

[Figure: two panels, Traditional Search and Entity Search]

Page 13: Entity Search Are you searching for what you want?

Demo. We build Ver. 0.1, to understand the promises and issues.

Three scenarios:
Academic: CS sites, DBLP homepages.
ECommerce: Books, Cellphones.
Yellowpage: Comprehensive corpus.

Page 14: Entity Search Are you searching for what you want?

Special Thanks: Data from Stanford WebBase.

Page 15: Entity Search Are you searching for what you want?

Example application: Question answering

Q: Who are DB profs at UIUC?

WISDM

Steps: Query Generation, Querying, Filtering & Validation

query: #dtf-nnuw100(#entity(professor) #entity(university) #entity(research Database Systems, Data Mining, IR))

results: ranked list of (<prof, univ, research>, )

A: Geneva Belford, Kevin C. Chang, AnHai Doan, Jiawei Han, Marianne Winslett, ChengXiang Zhai

Page 16: Entity Search Are you searching for what you want?

Example application: Relation construction

Target relation <prof, phone, email>:

email            | phone | prof
[email protected] |       | Winslett
[email protected] |       | DeWitt
…                | …     | …

WISDM

Steps: App-specific Entity Tagging, Querying, Relation Construction

tagging: #entity(prof)

query: #tf-nnow50(#entity(professor) #tf-nnuw20(#entity(email) #entity(phone)))

results: ranked list of (<prof, phone, email>, )
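The last step, turning the ranked list of tuples into a relation, could look roughly like the Python sketch below. It assumes that each result carries a numeric score and that one row is kept per professor (the highest-scoring one); the slides state neither. The function name `build_relation`, the scores, and the sample phone numbers and addresses are hypothetical.

```python
def build_relation(ranked):
    """Collapse a ranked list of ((prof, phone, email), score) results
    into one <prof, phone, email> row per professor.

    Assumption (not from the slides): the highest-scoring tuple wins.
    """
    best = {}
    for (prof, phone, email), score in ranked:
        if prof not in best or score > best[prof][2]:
            best[prof] = (phone, email, score)
    # Sort by professor name so the output is deterministic.
    return [(prof, phone, email)
            for prof, (phone, email, _score) in sorted(best.items())]


# Hypothetical ranked results; values are placeholders, not real data.
ranked = [
    (("Winslett", "555-0101", "winslett@example.edu"), 0.92),
    (("DeWitt", "555-0102", "dewitt@example.edu"), 0.88),
    (("Winslett", "555-0199", "winslett@example.edu"), 0.31),
]

for row in build_relation(ranked):
    print(row)
```

Other policies (keeping the top k rows per professor, or voting on each column separately) would fit the same interface.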

Page 17: Entity Search Are you searching for what you want?

Example application: Best-effort integration

Price of “Hamlet”?

WISDM

Steps: Query Generation, Querying, Validation & Ranking

query: #od50(#entity(title Hamlet) #entity(price))

results: ranked list of (<title, price>, )

Buy.com: $10.99, Amazon.com: $12.00, …
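The query above pairs a title with a price inside a window. A minimal Python sketch of that kind of match follows. It assumes `#od50` means "in this order, within 50 tokens", which the slides do not define; the price regex, the `find_title_price` helper, and the sample page text are hypothetical.

```python
import re

# Hypothetical price recognizer: a dollar amount such as $10.99 or $12.
PRICE = re.compile(r"\$\d+(?:\.\d{2})?")


def find_title_price(text, title, window=50):
    """Return (title, price) pairs where a price follows the title
    within `window` whitespace-separated tokens."""
    tokens = text.split()
    pairs = []
    for i, tok in enumerate(tokens):
        if tok.strip('".,:').lower() != title.lower():
            continue
        # Scan forward only: the pattern is ordered (title, then price).
        for later in tokens[i + 1 : i + 1 + window]:
            match = PRICE.search(later)
            if match:
                pairs.append((title, match.group()))
                break
    return pairs


page = "Hamlet by William Shakespeare, paperback. Our price: $10.99 today"
print(find_title_price(page, "Hamlet"))
```

Running the same function over pages from several stores and collecting the pairs gives the raw material for the validation and ranking step; this sketch handles single-word titles only.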

Page 18: Entity Search Are you searching for what you want?

How different is “entity search”?

How to define such searches?

Page 19: Entity Search Are you searching for what you want?

Why is Entity Search different…

Probabilistic entities vs. A page is for sure a page.

Contextual patterns vs. Match a page by its content.

Holistic aggregates vs. A page occurs only once.

Associative results vs. We never search for pairs of pages.

Page 20: Entity Search Are you searching for what you want?

Consider the entire process: Page Retrieval

1. Input: pages.

2. Criteria: content keywords.

3. Scope: Each page itself.

4. Output: one page per result.

[Figure: example for “Marc Snir”]

Page 21: Entity Search Are you searching for what you want?

Entity search is thus different… Entity Search

1. Input: probabilistic entities.

2. Criteria: contextual patterns.

3. Scope: holistic aggregates.

4. Output: associative results.
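To make the contrast concrete, here is a small Python sketch that runs both kinds of search over the same toy corpus. Everything in it is an assumption for illustration: the corpus, the email regex (a deterministic stand-in for a probabilistic entity extractor), the 6-token window, and the function names `page_search` and `entity_search`.

```python
import re
from collections import defaultdict

# Stand-in extractor: a real system would attach a confidence to each match.
EMAIL = re.compile(r"[\w.]+@[\w.]+\w")

# Hypothetical toy corpus: page id -> text.
PAGES = {
    "p1": "Marc Snir email snir@example.edu office 4323",
    "p2": "Contact Marc Snir at snir@example.edu",
    "p3": "Marc Snir gave a talk; questions to admin@example.edu",
}


def page_search(pages, keywords):
    """Page retrieval: each page is judged alone; output is one page per result."""
    words = [k.lower() for k in keywords]
    return [pid for pid, text in pages.items()
            if all(w in text.lower() for w in words)]


def entity_search(pages, name, window=6):
    """Entity search: find emails shortly after `name` (contextual pattern),
    count support across all pages (holistic aggregate), and return
    (name, email) pairs (associative results)."""
    support = defaultdict(int)
    name_tokens = name.lower().split()
    for text in pages.values():
        tokens = text.split()
        lowered = [t.lower() for t in tokens]
        for i in range(len(tokens) - len(name_tokens) + 1):
            if lowered[i:i + len(name_tokens)] != name_tokens:
                continue
            start = i + len(name_tokens)
            for tok in tokens[start:start + window]:
                m = EMAIL.search(tok)
                if m:
                    support[(name, m.group())] += 1
    return sorted(support.items(), key=lambda kv: -kv[1])


print(page_search(PAGES, ["Marc", "Snir"]))
print(entity_search(PAGES, "Marc Snir"))
```

Page search returns all three pages and leaves the reading to the user, while the entity search returns the address that two pages agree on ahead of the one seen only once.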

Page 22: Entity Search Are you searching for what you want?

What are technical challenges?

Or, how to write (reviewer-friendly) papers?

Page 23: Entity Search Are you searching for what you want?

Issue #1. EntityRank: How to rank entities?

Say, Jiawei Han with #email, #phone, #researcharea

Entity matters: Is “jhan@” an email? Is “2-3457” a phone?
Context matters: Order, distance
Frequency matters: How often is Jiawei Han – “data mining”?
Associativity matters: “[email protected]” “algorithm”
Source matters: Where did you get this info from?
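One way to see how several of these factors might interact is the Python sketch below. It is not the EntityRank formula, which the slides do not give; it is an illustrative combination in which extraction confidence, proximity, and source trust are multiplied per occurrence and then summed per pair so that frequency counts. The `Occurrence` record, its fields, the linear proximity decay, and the 50-token window are all assumptions.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Occurrence:
    """One co-occurrence of a (name, value) pair on a page (hypothetical record)."""
    pair: tuple           # e.g. ("Jiawei Han", "jhan@example.edu")
    confidence: float     # entity matters: belief that the value is an email
    distance: int         # context matters: tokens between name and value
    source_weight: float  # source matters: trust in the originating page


def local_score(occ, window=50):
    """Score a single occurrence: closer and more confident is better."""
    if occ.distance > window:
        return 0.0
    proximity = 1.0 - occ.distance / window
    return occ.confidence * proximity * occ.source_weight


def rank(occurrences):
    """Frequency matters: sum local scores so repeated evidence raises a pair."""
    totals = defaultdict(float)
    for occ in occurrences:
        totals[occ.pair] += local_score(occ)
    return sorted(totals.items(), key=lambda kv: kv[1], reverse=True)


# Hypothetical evidence gathered from three pages.
occurrences = [
    Occurrence(("Jiawei Han", "jhan@example.edu"), 0.9, 5, 1.0),
    Occurrence(("Jiawei Han", "jhan@example.edu"), 0.8, 10, 0.5),
    Occurrence(("Jiawei Han", "info@example.org"), 0.9, 45, 1.0),
]

for pair, score in rank(occurrences):
    print(pair, round(score, 2))
```

The sketch leaves out the associativity factor, which would need a model of how often a value co-occurs with unrelated keywords by chance.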

Page 24: Entity Search Are you searching for what you want?

Issue #2: Query Processing: How to optimize?

Q: #tf-nnow50(#entity(professor[David DeWitt]) fax #entity(phone))

[Figure: query plan with nodes tf, nnow50, #entity(professor) with prof=“…”, phone, and “fax”-#entity(phone) (pre-materialized context index)]
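The idea of materializing a keyword-entity pair such as “fax”-#entity(phone) ahead of time can be sketched in Python as follows. This is a guess at the general shape, not the system's actual index: the index layout, the phone regex, the reading of `nnow50` as "ordered, within 50 tokens", and the names `build_pair_index` and `fax_near_name` are all assumptions.

```python
import re
from collections import defaultdict

# Hypothetical phone recognizer for numbers like 555-0199.
PHONE = re.compile(r"\d{3}-\d{4}")


def build_pair_index(pages, keyword, entity_pattern, window=3):
    """Offline step: per page, record where `keyword` is followed by an
    entity match within `window` tokens, as (keyword_pos, entity_pos, value)."""
    index = defaultdict(list)
    for pid, text in pages.items():
        tokens = text.split()
        for i, tok in enumerate(tokens):
            if tok.lower().strip(":,.") != keyword:
                continue
            for j in range(i + 1, min(i + 1 + window, len(tokens))):
                m = entity_pattern.search(tokens[j])
                if m:
                    index[pid].append((i, j, m.group()))
                    break
    return dict(index)


def fax_near_name(pages, index, name, window=50):
    """Query time: only pages present in the index are inspected, and the
    keyword-entity join has already been done."""
    name_tokens = name.lower().split()
    hits = []
    for pid, entries in index.items():
        lowered = [t.lower() for t in pages[pid].split()]
        for start in range(len(lowered) - len(name_tokens) + 1):
            if lowered[start:start + len(name_tokens)] != name_tokens:
                continue
            for kw_pos, _entity_pos, value in entries:
                # Ordered window: the keyword must come after the name.
                if 0 < kw_pos - start <= window:
                    hits.append((pid, value))
    return hits


# Hypothetical pages; the numbers are placeholders.
pages = {
    "d1": "David DeWitt professor phone: 555-0100 fax: 555-0199",
    "d2": "Department office fax 555-0123",
    "d3": "No numbers here",
}
fax_phone = build_pair_index(pages, "fax", PHONE)
print(fax_near_name(pages, fax_phone, "David DeWitt"))
```

The trade-off is the usual one for materialized views: more storage and build time in exchange for skipping the keyword-entity join on every query.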

Page 25: Entity Search Are you searching for what you want?

Conclusion: One step at a time towards …

[Figure labels: Search, Integration, Mining; surface, deep]

What You Search Is What You Want!

Page 26: Entity Search Are you searching for what you want?

Thank You!

Chengkai Li, Zhen Zhang, Shui-Lung Chuang, Tao Cheng, Govind Kabra

And the warriors behind …

Arpit Jain, Amit Behal, David Killian, Yuping Tseng, Hanna Zhong, Ngoc Bui, Sonia Jahid, Aniruddh Nath, Paul Yuan, Raj Sodhi, Quoc Le, Hemanta Maji, Sung-Eun Kim
