Google-opoly

Post on 30-Dec-2015

Amy N. Langville
Mathematics Department
College of Charleston
langvillea@cofc.edu

Math Meet 2/20/10

Outline

• Short History of Web Search
• Link Analysis and Google’s PageRank
• The Random Surfer
• Google-opoly
• March Madness
• Conclusion

Thesis

1998

Pre-1998 Web

Trip back in time to 1995
– How did you find information then?
– Better question: how old were you then?

Inverted Index

Main tool of pre-1998 search engines
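
To make the idea concrete, here is a minimal sketch of an inverted index in Python. The three toy documents and the names `build_inverted_index` and `search` are made up for illustration and do not come from the talk; a real engine would add tokenization, stemming, and compression, none of which is shown.

```python
from collections import defaultdict

def build_inverted_index(docs):
    """Map each lowercase word to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for word in text.lower().split():
            index[word].add(doc_id)
    return index

def search(index, query):
    """Return the ids of documents that contain every word of the query."""
    words = query.lower().split()
    if not words:
        return set()
    result = set(index.get(words[0], set()))
    for word in words[1:]:
        result &= index.get(word, set())
    return result

# Hypothetical toy collection (not from the talk).
docs = {
    1: "math meet at the college",
    2: "college basketball rankings",
    3: "math of basketball rankings",
}
index = build_inverted_index(docs)
```

A lookup such as `search(index, "basketball rankings")` only says which pages contain the words. It says nothing about which page deserves to be listed first, which is the gap that link analysis fills.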

Problems with the Inverted Index

• Too many pages
• Spam: human eyes vs. spider eyes

Learn how to make millions

Win an iPod

Text 8 if you’re awake

Link Analysis

• Pre-1998 engines used only text analysis.

• Link analysis saved search from SEOs and built companies like Google, Yahoo, Ask.

• Nearly every major search engine uses link analysis.

[Timeline figure: text analysis before 1998, link analysis from 1998 on]

Moral #1

Sometimes being perceived as an expert forces you to become one.

What happens when you google?

All the old text analysis + the new link analysis

[Figure: ranked list of results, numbered 1 to 8]

Why are rankings so important?

Web as a graph

Each node is a webpage.

Each arrow is a hyperlink.

In-links vs. Out-links
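
As a small illustration of the graph view, the sketch below stores a web as a table of out-links and derives the in-links from it. The 4-page web and the function name `in_links` are hypothetical, chosen only for this example.

```python
def in_links(out_links):
    """Invert an out-link table: for each page, list the pages that link to it."""
    result = {page: [] for page in out_links}
    for source, targets in out_links.items():
        for target in targets:
            result[target].append(source)
    return result

# Hypothetical 4-page web: each key is a page, each value lists its out-links.
web = {1: [2, 3], 2: [3], 3: [1], 4: [3]}
incoming = in_links(web)
```

Here page 3 has three in-links (from pages 1, 2, and 4) while page 4 has none, which hints at why page 3 might be judged more important.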

A Trip to Google-topia

Emmie

Randy, the Random Surfer

video clip

A Random Walk on the Web graph

Matrix Notation
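
The matrix on the slide did not survive in this transcript, so the following is only a sketch of one common convention: row i of the matrix holds the probabilities that the surfer moves from page i to each other page, with each out-link equally likely. The function name `hyperlink_matrix` and the 3-page web are assumptions made for illustration.

```python
def hyperlink_matrix(out_links, n):
    """Build the n-by-n matrix of one-step surfing probabilities.

    Pages are numbered 0..n-1. A page with k out-links gives probability
    1/k to each target; a page with no out-links keeps an all-zero row.
    """
    H = [[0.0] * n for _ in range(n)]
    for page, targets in out_links.items():
        for target in targets:
            H[page][target] = 1.0 / len(targets)
    return H

# Hypothetical 3-page web: 0 links to 1 and 2, 1 links to 2, 2 links nowhere.
H = hyperlink_matrix({0: [1, 2], 1: [2], 2: []}, 3)
```

The all-zero last row is a preview of the first problem discussed next.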

BUT THERE ARE SOME PROBLEMS!

The surfer gets stuck!

This is called a dangling node.

How does Google fix this?

The surfer can “teleport”

We add a link from the dangling node to every other node.

When web surfing, this is equivalent to typing an address in the URL bar.

Probability Matrix

We must also take this into consideration for our probability matrix.
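
A minimal sketch of that adjustment, following the slide's wording literally: every all-zero row is replaced by equal probabilities to every other page. The name `fix_dangling` is hypothetical, and many treatments instead spread the probability over all n pages, including the dangling page itself.

```python
def fix_dangling(H):
    """Return a copy of H in which each all-zero row links to every other page.

    Assumes the web has at least two pages.
    """
    n = len(H)
    S = [row[:] for row in H]
    for i, row in enumerate(S):
        if sum(row) == 0:
            # Equal chance of jumping to each of the other n - 1 pages.
            S[i] = [0.0 if j == i else 1.0 / (n - 1) for j in range(n)]
    return S

# Hypothetical 3-page example in which page 2 is a dangling node.
H = [[0.0, 0.5, 0.5],
     [0.0, 0.0, 1.0],
     [0.0, 0.0, 0.0]]
S = fix_dangling(H)
```

After the fix every row sums to 1, so the surfer always has somewhere to go.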

Dangling nodes and teleportation

video clip

Let’s look at another problem.

Our surfer gets stuck in webpages 4, 5, and 6.

This is called a cycle.

How do we fix this?

Cycling

video clip

Full Teleportation

At any time, the surfer might type an address into the URL bar instead of following a link.

We add an extra link from every vertex to every other vertex.

Surfing vs. teleporting

Do people always use the URL bar as much as they use hyperlinks?

Google doesn’t think so. They think you use the URL bar only about 15% of the time.
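
The 85/15 split can be sketched as a blend of two matrices: 85% of the surfing matrix plus 15% of a teleportation matrix. The sketch assumes that a teleport lands on any of the n pages with equal probability 1/n, which is a common convention rather than something stated on the slide; the name `google_matrix` and the 2-page example are also illustrative.

```python
def google_matrix(S, teleport=0.15):
    """Blend link-following (probability 1 - teleport) with uniform teleporting.

    S must already have rows that sum to 1 (no dangling nodes left).
    """
    n = len(S)
    return [[(1 - teleport) * S[i][j] + teleport / n for j in range(n)]
            for i in range(n)]

# Hypothetical 2-page web in which each page links only to the other.
S = [[0.0, 1.0],
     [1.0, 0.0]]
G = google_matrix(S)
```

Every entry of the result is positive, so the surfer can reach any page from any other page and can no longer be trapped in a cycle.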

Computing PageRank by observing Randy

video clip
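
The video itself is not part of this transcript. As a stand-in, here is a small simulation sketch: let the surfer wander for many steps and record the fraction of time spent on each page. The 4-page web, the function name `simulate_surfer`, the fixed seed, and the choice to teleport uniformly over all pages are assumptions made for illustration.

```python
import random

def simulate_surfer(out_links, steps, teleport=0.15, seed=0):
    """Estimate page importance as the share of steps spent on each page."""
    rng = random.Random(seed)          # fixed seed so runs are repeatable
    pages = sorted(out_links)
    visits = {page: 0 for page in pages}
    current = pages[0]
    for _ in range(steps):
        visits[current] += 1
        targets = out_links[current]
        if not targets or rng.random() < teleport:
            current = rng.choice(pages)    # teleport (also used when stuck)
        else:
            current = rng.choice(targets)  # follow a random hyperlink
    return {page: visits[page] / steps for page in pages}

# Hypothetical 4-page web: each key is a page, each value lists its out-links.
web = {1: [2, 3], 2: [3], 3: [1], 4: [3]}
shares = simulate_surfer(web, steps=100_000)
```

Page 4 has no in-links, so Randy reaches it only by teleporting and it collects the smallest share of visits.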

Summary of Ranking

Search query

Pull out relevant webpages from inverted index

Use PageRank and other information to rank webpages

Creators of Google

Sergey Brin and Larry Page

Computer Science majors

Now entire PhD programs in information retrieval

The world’s largest eigenvector computation
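
The eigenvector in question is the vector of long-run visit probabilities: it is left unchanged when multiplied by the matrix of surfing probabilities. A standard way to approximate it is the power method, sketched below on a made-up 2-page matrix. The function name `pagerank`, the tolerance, and the example matrix are illustrative assumptions, not details from the talk.

```python
def pagerank(G, tol=1e-10, max_iter=1000):
    """Power method: repeat pi <- pi * G until pi stops changing.

    G is a square matrix whose rows sum to 1; pi is a row vector.
    """
    n = len(G)
    pi = [1.0 / n] * n                      # start with equal importance
    for _ in range(max_iter):
        new = [sum(pi[i] * G[i][j] for i in range(n)) for j in range(n)]
        if sum(abs(a - b) for a, b in zip(new, pi)) < tol:
            return new
        pi = new
    return pi

# Hypothetical 2-page matrix of surfing probabilities.
G = [[0.1, 0.9],
     [0.5, 0.5]]
ranks = pagerank(G)
```

For this small matrix the exact answer is 5/14 for the first page and 9/14 for the second, and the iteration settles on those values after a few dozen steps. On the real Web the same idea is applied to a matrix with billions of rows.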

Moral #2

Take a leave of absence for brilliant ideas.

More on PageRank

SIAM’s WhydoMath? Project
– url = http://dev.whydomath.org/node/google/index.html

DDL on PageRank
– url = http://spinner.cofc.edu/~langvillea/DISSECTION-LAB/ClarePageRankModule/1_WebLetter.html?referrer=webcluster&

LOCI: Google-opoly
– url = http://mathdl.maa.org/mathDL/23/?pa=content&sa=viewDocument&nodeId=3355

Moral #3

The more ways you can view a problem, the more likely you are to truly understand it, and hence, solve it.

Google-opoly

applets

March Madness

How should teams vote?

• Losing teams give one vote to each team that beats them.

• Losing teams vote with margin of victory.

• Both winning and losing teams vote with # points scored.

Point Differential Voting
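
One way to picture point differential voting is to treat teams as webpages and each loss as a weighted link from the loser to the winner. The sketch below builds such a table from made-up scores and turns it into voting probabilities. The team names, scores, and function names are hypothetical, and letting an undefeated team vote equally for every team (the same fix used for dangling nodes) is an assumption of this sketch.

```python
def vote_matrix(games, teams):
    """Each loser casts votes for the winner equal to the margin of victory."""
    idx = {team: k for k, team in enumerate(teams)}
    n = len(teams)
    V = [[0.0] * n for _ in range(n)]
    for winner, winner_pts, loser, loser_pts in games:
        V[idx[loser]][idx[winner]] += winner_pts - loser_pts
    return V

def normalize_rows(V):
    """Turn vote counts into probabilities that sum to 1 in each row.

    A team with no losses has no votes to cast, so it votes evenly for all.
    """
    n = len(V)
    rows = []
    for row in V:
        total = sum(row)
        rows.append([v / total for v in row] if total else [1.0 / n] * n)
    return rows

# Made-up results: (winner, winner points, loser, loser points).
teams = ["A", "B", "C"]
games = [("A", 70, "B", 60), ("B", 80, "C", 75), ("A", 65, "C", 50)]
V = vote_matrix(games, teams)
P = normalize_rows(V)
```

Team C lost to A by 15 and to B by 5, so it gives three quarters of its vote to A and one quarter to B. The resulting table can then be ranked the same way webpages are.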

Moral #4

Now is a great time to do math.

Conclusion

PageRank is a sophisticated algorithm that set Google apart.

The Web can be represented with graphs and matrices.

PageRank’s idea of voting has many applications.

Acknowledgements

Tim Chartier, Carl Meyer, Emmie Douglas, Kathryn Pedings, Clare Rodgers, Erich Kreutzer, Ben Kovanich, Ryan Dumville, Luke Ingram, Anjela Govan, Nick Dovidio, Yoshi Yamamoto, Neil Goodson, Colin Stephenson
