Research on Enterprise Track of TREC 2007

Huizhong Duan, Qi Zhou, Zhen Lu, Ou Jin, Shenghua Bao, Yunbo Cao and Yong Yu. Apex Knowledge & Data Management Lab. Presenter: Yangbo Zhu.

Post on 19-Jan-2016

DESCRIPTION

Research on Enterprise Track of TREC 2007. Huizhong Duan, Qi Zhou, Zhen Lu, Ou Jin, Shenghua Bao, Yunbo Cao and Yong Yu. Apex Knowledge & Data Management Lab. Presenter: Yangbo Zhu. Document Search. Outline. Static Ranking Approaches. Link Sparse; Similar, Small Rank. HostRank Algorithm.

TRANSCRIPT

Page 1: Research on Enterprise Track of TREC 2007

Huizhong Duan, Qi Zhou, Zhen Lu, Ou Jin, Shenghua Bao, Yunbo Cao and Yong Yu

Apex Knowledge & Data Management Lab

Presenter: Yangbo Zhu

Page 4: Research on Enterprise Track of TREC 2007

Link Sparse; Similar, Small Rank.

Page 6: Research on Enterprise Track of TREC 2007

http://www.ento.csiro.au

http://www.atnf.csiro.au

http://www.csiro.au/science

www.atnf.csiro.au/~rgooch

Page 7: Research on Enterprise Track of TREC 2007

Hierarchical Weight Structure

www.atnf.csiro.au/computing

www.atnf.csiro.au/computing/software

www.atnf.csiro.au/computing/software/smongo
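The three URLs above form a chain in which each one is a path prefix of the next. A minimal sketch of how such a chain could be derived from a single URL follows; the function name `url_hierarchy` is hypothetical and is not taken from the slides, and the sketch only builds the hierarchy, it does not compute any weights.

```python
from urllib.parse import urlsplit


def url_hierarchy(url):
    """Return the ancestors of a URL, from the host down to the URL itself.

    Illustrative helper (an assumption, not the authors' code).
    """
    # urlsplit only recognises the host when a scheme or "//" is present,
    # so prepend "//" for bare inputs such as "www.atnf.csiro.au/computing".
    if "//" not in url:
        url = "//" + url
    parts = urlsplit(url)
    # Drop empty segments produced by leading or trailing slashes.
    segments = [s for s in parts.path.split("/") if s]
    chain = [parts.netloc]
    for seg in segments:
        # Each level extends its parent by one path segment.
        chain.append(chain[-1] + "/" + seg)
    return chain


if __name__ == "__main__":
    for level in url_hierarchy("www.atnf.csiro.au/computing/software/smongo"):
        print(level)
```

Run on the deepest URL from this slide, the sketch lists the host first and then the three URLs shown above, one level per line.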

Page 8: Research on Enterprise Track of TREC 2007

The factor ω is defined as:

◦ Index(p) is a Boolean value denoting whether page p is an index page.

◦ Link(p) is defined as the percentage of the inlinks of page p.

Reference: G. Xue, Q. Yang, H. Zeng, Y. Yu, Z. Chen: Exploiting the Hierarchical Structure for Link Analysis. In: Proceedings of SIGIR 2005.

Page 10: Research on Enterprise Track of TREC 2007

Title

H1

H2

H1

Page 11: Research on Enterprise Track of TREC 2007

Dividing the page based on DOM tree structure.
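As a rough illustration of dividing a page into parts, the sketch below splits an HTML document at its Title, H1 and H2 elements. It is a simplification under stated assumptions: it walks the tag stream with Python's standard `html.parser` rather than building a full DOM tree, and the class name `HeadingSegmenter`, the function `segment` and the sample page are all hypothetical, not the authors' implementation.

```python
from html.parser import HTMLParser

# Tags that open a new part of the page (an assumption for this sketch).
HEADING_TAGS = {"title", "h1", "h2"}


class HeadingSegmenter(HTMLParser):
    """Collect [tag, heading text, body text] for each heading-delimited part."""

    def __init__(self):
        super().__init__()
        self.blocks = []
        self._in_heading = None  # name of the heading tag currently open

    def handle_starttag(self, tag, attrs):
        if tag in HEADING_TAGS:
            self._in_heading = tag
            self.blocks.append([tag, "", ""])

    def handle_endtag(self, tag):
        if tag == self._in_heading:
            self._in_heading = None

    def handle_data(self, data):
        text = data.strip()
        # Ignore whitespace and any text that precedes the first heading.
        if not text or not self.blocks:
            return
        slot = 1 if self._in_heading else 2
        block = self.blocks[-1]
        block[slot] = (block[slot] + " " + text).strip()


def segment(html):
    parser = HeadingSegmenter()
    parser.feed(html)
    parser.close()
    return [tuple(block) for block in parser.blocks]


if __name__ == "__main__":
    page = ("<html><head><title>ATNF</title></head><body>"
            "<h1>Software</h1><p>Tools list</p>"
            "<h2>SMongo</h2><p>Plotting</p>"
            "<h1>Contact</h1></body></html>")
    for part in segment(page):
        print(part)
```

Each resulting tuple holds the tag that opened the part, its heading text, and the text that follows it up to the next heading, so the parts can be inspected or filtered one by one.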

Page 12: Research on Enterprise Track of TREC 2007

Filtering divided parts

Page 18: Research on Enterprise Track of TREC 2007

S. Bao, H. Duan, Q. Zhou, M. Xiong, Y. Cao and Y. Yu: Research on Expert Search at Enterprise Track of TREC 2006. In: Proceedings of the 15th Text Retrieval Conference (TREC 2006), 2006.

Page 22: Research on Enterprise Track of TREC 2007

[Figure: hierarchy of URL components (au; csiro; atnf, ffp; atoa, vo; www; reprints; em) with the labels "Cutting Level", "Expert Rank" and "Topic Sensitive".]

Page 25: Research on Enterprise Track of TREC 2007

Some anti-spam format

Page 26: Research on Enterprise Track of TREC 2007

[address obscured]@noble.org

[address obscured]@csiro.au

[address obscured]@csiro.au

[address obscured]@dem.csiro.au

Emails with a single letter in the person-name part
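One way to read this case is that the part of the address before the "@" contains a one-letter token, such as an initial. The sketch below flags such addresses. This reading, the separator set, the function names and the example.org addresses are all assumptions made for illustration; the slides do not specify how these emails are detected or handled.

```python
import re


def name_tokens(email):
    """Split the part before '@' on common separators (dot, underscore, hyphen)."""
    local = email.split("@", 1)[0]
    return [token for token in re.split(r"[._\-]+", local) if token]


def has_single_letter_token(email):
    """True if any token of the person-name part is exactly one letter."""
    return any(len(token) == 1 and token.isalpha() for token in name_tokens(email))


if __name__ == "__main__":
    # Hypothetical addresses, not taken from the slides.
    for address in ("j.smith@example.org", "john.smith@example.org"):
        print(address, has_single_letter_token(address))
```

An address flagged this way carries only an initial for one part of the name, so matching it to a full person name is ambiguous when several people share that initial and surname.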

Page 27: Research on Enterprise Track of TREC 2007

VisualPageRank

◦ Too simple:

◦ Too complicated:

Page 28: Research on Enterprise Track of TREC 2007

Example of Expert Homepage
