How a Search Engine Works
Step 1
Database Module
• A graph is constructed in which the nodes are web pages and the edges are the links pointing from one page to another.
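The link graph described above can be sketched as an adjacency list. This is only an illustration: the function name build_link_graph, the input shape (a dict of URL to outgoing links) and the example URLs are assumptions, not part of the original slides.

```python
from collections import defaultdict


def build_link_graph(pages):
    """Build an adjacency list from crawled pages.

    pages: dict mapping a page URL to the list of URLs it links to
    (hypothetical input shape, chosen for this sketch).
    """
    graph = defaultdict(set)
    for url, links in pages.items():
        graph[url]  # make sure the page is a node even with no outgoing links
        for target in links:
            if target != url:  # ignore self-links
                graph[url].add(target)
                graph[target]  # link targets are nodes too
    return {node: sorted(targets) for node, targets in graph.items()}


pages = {
    "a.example": ["b.example", "c.example"],
    "b.example": ["c.example"],
    "c.example": ["a.example"],
    "d.example": ["c.example", "e.example"],
}
graph = build_link_graph(pages)
inbound = {node: sum(node in targets for targets in graph.values()) for node in graph}
```

Here inbound counts how many pages point at each node, which is the raw material for link-based ranking.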
Bots pass the full text of each web page to the
indexer.
Stop words (such as for, in, at) and
punctuation are ignored.
The text is converted to lowercase and stored.
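The indexing steps above (lowercasing, dropping punctuation and stop words, storing the result) can be sketched as a small inverted index. The stop-word list, the function names and the sample documents are assumptions made for illustration; a real indexer would use a much larger stop list and store more than document ids.

```python
import re
from collections import defaultdict

# Illustrative stop-word list; an assumption, not the list any real engine uses.
STOP_WORDS = {"for", "in", "at", "the", "a", "an", "of", "and", "to", "is"}


def tokenize(text):
    """Lowercase the text, drop punctuation, and remove stop words."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    return [word for word in words if word not in STOP_WORDS]


def build_index(docs):
    """Map each term to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in tokenize(text):
            index[term].add(doc_id)
    return index


docs = {
    1: "Search engines crawl the web.",
    2: "Bots crawl pages for the indexer, in order.",
}
index = build_index(docs)
```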
Term Weighting Factor
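Term Frequency
How many times the term occurs in the collected text.
Collection Frequency
Used to discriminate one document from another: a term that appears in few documents of the collection is more distinctive than one that appears in most of them.
Length Normalization
Long documents have larger term sets than short ones, so term weights are scaled by document length to keep long documents from being favored.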
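One common way to combine the three factors above is TF-IDF weighting with cosine length normalization. The sketch below assumes that particular formulation (raw term count, log inverse document frequency, unit-length vectors); the slides do not say which formula is meant, and the function name and sample documents are hypothetical.

```python
import math
from collections import Counter


def tf_idf_vectors(docs):
    """docs: dict of doc id -> list of terms. Returns doc id -> {term: weight}."""
    n_docs = len(docs)

    # Collection frequency: in how many documents does each term appear?
    doc_freq = Counter()
    for terms in docs.values():
        doc_freq.update(set(terms))

    vectors = {}
    for doc_id, terms in docs.items():
        term_freq = Counter(terms)  # term frequency within this document
        raw = {t: term_freq[t] * math.log(n_docs / doc_freq[t]) for t in term_freq}
        # Length normalization: scale the vector to unit length.
        norm = math.sqrt(sum(w * w for w in raw.values()))
        vectors[doc_id] = {t: (w / norm if norm else 0.0) for t, w in raw.items()}
    return vectors


docs = {
    "short": ["search", "engine"],
    "long": ["search", "index", "index", "crawl", "crawl", "crawl", "crawl"],
    "other": ["engine", "rank"],
}
weights = tf_idf_vectors(docs)
```

After normalization every document vector has the same length, so the long document does not score higher just because it contains more terms.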
It is not possible to keep up with the
growth of the web and update the indexed content
immediately. By the time a bot finishes crawling,
some of the content it indexed is already outdated.
So the web is divided into segments, and the index is
updated incrementally.
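A minimal sketch of the segment idea follows, assuming each segment keeps its own postings so that re-crawling one segment replaces only that part of the index. The class name SegmentedIndex, its methods and the segment names are hypothetical.

```python
from collections import defaultdict


class SegmentedIndex:
    """Toy index split into segments that can be refreshed independently."""

    def __init__(self):
        self.segments = {}  # segment name -> {term -> set of URLs}

    def refresh_segment(self, segment, pages):
        """Rebuild one segment from freshly crawled pages; others are untouched."""
        postings = defaultdict(set)
        for url, text in pages.items():
            for term in text.lower().split():
                postings[term].add(url)
        self.segments[segment] = dict(postings)

    def search(self, term):
        """Collect matching URLs across all segments."""
        hits = set()
        for postings in self.segments.values():
            hits |= postings.get(term.lower(), set())
        return hits


index = SegmentedIndex()
index.refresh_segment("news", {"news.example/1": "election results"})
index.refresh_segment("docs", {"docs.example/a": "python results"})
before = index.search("results")

# Only the fast-changing "news" segment is re-crawled and replaced.
index.refresh_segment("news", {"news.example/2": "weather results"})
after = index.search("results")
```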
PageRank – Google’s Secret Algorithm
Ranking signals shown in the diagram:
Latest, Reputation, Popularity, Authority, Trustworthy, Freshness, Relevance
Query Terms: Position, Size, Proximity
User Content: Geographic Region, Web History
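The basic PageRank idea named in the heading, that a page is important when important pages link to it, can be sketched as a simple iteration over a link graph. This is a simplified textbook-style version and not Google's production ranking; the damping value 0.85, the iteration count and the toy graph are assumptions.

```python
def pagerank(graph, damping=0.85, iterations=50):
    """graph: dict of node -> list of nodes it links to (every target is a key)."""
    nodes = list(graph)
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}

    for _ in range(iterations):
        # Every page keeps a small baseline share of rank.
        new_rank = {node: (1.0 - damping) / n for node in nodes}
        for node, targets in graph.items():
            if targets:
                # A page splits its rank evenly among the pages it links to.
                share = damping * rank[node] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its rank over all pages.
                for other in nodes:
                    new_rank[other] += damping * rank[node] / n
        rank = new_rank
    return rank


graph = {"a": ["b", "c"], "b": ["c"], "c": ["a"], "d": ["c"]}
ranks = pagerank(graph)
```

In this toy graph page c receives links from three pages and ends up with the highest score, while d, which nothing links to, ends up with the lowest.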
The algorithms get to the deeper
meaning of the words you type in
the search bar.
A search engine identifies and
corrects possible spelling errors and
provides alternatives.
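As a rough illustration of spelling alternatives, the sketch below compares a query word against a known vocabulary using the similarity matching in Python's standard difflib module. The vocabulary, the 0.7 cutoff and the function name suggest are assumptions; real engines rely on far richer signals such as query logs.

```python
import difflib

# Hypothetical vocabulary; a real engine would derive this from its index and logs.
VOCABULARY = ["search", "engine", "index", "crawler", "ranking", "query"]


def suggest(word, vocabulary=VOCABULARY, max_suggestions=3):
    """Return close vocabulary matches for a word that is not in the vocabulary."""
    word = word.lower()
    if word in vocabulary:
        return []  # spelled correctly, nothing to suggest
    return difflib.get_close_matches(word, vocabulary, n=max_suggestions, cutoff=0.7)


suggestions = suggest("serch")
```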
Autocomplete predicts what you
might be searching for. This includes understanding
terms with more than one meaning.
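Autocomplete can be sketched as prefix matching over past queries, ordered by how often each query was seen. The query log, the function name autocomplete and the ranking rule (most frequent first, ties broken alphabetically) are assumptions for illustration only.

```python
from collections import Counter


def autocomplete(prefix, query_log, limit=3):
    """Return the most frequent past queries that start with the prefix."""
    counts = Counter(query.lower() for query in query_log)
    prefix = prefix.lower()
    matches = [(query, count) for query, count in counts.items() if query.startswith(prefix)]
    # Most frequent first; alphabetical order breaks ties.
    matches.sort(key=lambda item: (-item[1], item[0]))
    return [query for query, _ in matches[:limit]]


query_log = [
    "python tutorial", "python snake", "python tutorial", "pyramid",
    "java tutorial", "python snake", "python tutorial",
]
completions = autocomplete("pyt", query_log)
```

The two completions for "pyt" show a term with more than one meaning: the same word leads to both a programming query and an animal query.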
Previous searches help the
engine understand what the user might be
looking for.
A lot more goes into displaying the
most relevant results to the user.
Search engines like Google rank results based on more
than 200 factors.