




These slides introduce overlap analysis in the context of search engines and web crawling. They give the formula for the overlap between two sets of web pages and show how overlap relates to independence. They also cover ways to improve search engine coverage through meta-search engines and web crawlers, and explain the difference between breadth-first and depth-first crawlers and their respective algorithms.
Typology: Slides





$$P(W_a \cap W_b \mid W_b) = \frac{P(W_a \cap W_b)}{P(W_b)} = \frac{|W_a \cap W_b|}{|W_b|}$$

If a and b are independent:

$$P(W_a \cap W_b) = P(W_a)\,P(W_b)$$

$$P(W_a \cap W_b \mid W_b) = \frac{P(W_a)\,P(W_b)}{P(W_b)} = \frac{(|W_a|/|W|)\,(|W_b|/|W|)}{|W_b|/|W|} = \frac{|W_a|}{|W|} = P(W_a)$$
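As a rough illustration of how the relations above can be used, the sketch below estimates P(Wa) from the measured overlap and then rearranges P(Wa) = |Wa| / |W| to estimate |W|. It assumes a and b are independent. The function name `estimate_web_size` and the sample numbers are hypothetical, not taken from the slides.

```python
def estimate_web_size(size_a: int, size_b: int, overlap: int) -> float:
    """Estimate |W| from |Wa|, |Wb| and |Wa ∩ Wb|, assuming independence.

    Under independence, P(Wa) = |Wa ∩ Wb| / |Wb|, and |W| = |Wa| / P(Wa).
    """
    if size_b <= 0 or overlap <= 0:
        raise ValueError("size_b and overlap must be positive")
    p_a = overlap / size_b   # fraction of b's pages that are also in a
    return size_a / p_a      # rearranged from P(Wa) = |Wa| / |W|


# Hypothetical numbers, for illustration only.
estimate = estimate_web_size(size_a=100, size_b=50, overlap=10)
print(round(estimate))
```

If the two sets are positively correlated rather than independent, the measured overlap is larger than independence would predict, so this estimate tends to understate |W|.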
Using $|W| = |W_a| / P(W_a)$, the researchers found:
of the web
and follows all the links on that page
Breadth First
Depth First
Use breadth-first search (BFS) algorithm
them to a queue
1st link from the queue, get all links on the page and add to the queue
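The queue-based steps above might be sketched as follows. This is a minimal illustration over an in-memory toy link graph rather than real HTTP fetching. `bfs_crawl`, `get_links`, and `TOY_WEB` are hypothetical names introduced here, and the visited set is an added assumption to avoid fetching a page twice.

```python
from collections import deque
from typing import Callable, Iterable, List

# Hypothetical toy web: page -> links found on that page.
TOY_WEB = {"A": ["B", "C"], "B": ["D"], "C": ["D", "E"], "D": [], "E": ["A"]}


def bfs_crawl(start: str, get_links: Callable[[str], Iterable[str]]) -> List[str]:
    """Visit pages level by level, returning them in visit order."""
    visited = {start}          # pages already queued (assumption: skip repeats)
    queue = deque([start])
    order = []
    while queue:
        page = queue.popleft()             # take the 1st link from the queue
        order.append(page)
        for link in get_links(page):       # get all links on the page
            if link not in visited:
                visited.add(link)
                queue.append(link)         # add them to the queue
    return order


print(bfs_crawl("A", lambda page: TOY_WEB.get(page, [])))
```

In a real crawler, `get_links` would fetch and parse the page; here it is only a dictionary lookup.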
Use depth first search (DFS) algorithm
1st link not visited from the start page
1st non-visited link
level and repeat 2nd step
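One way to read the depth-first steps above is: follow the first unvisited link, keep going deeper, and when a page has no unvisited links left, go back up a level and continue. The sketch below uses recursion over an in-memory toy link graph. `dfs_crawl`, `get_links`, and `TOY_WEB` are hypothetical names, and this is an interpretation of the partially preserved steps, not necessarily the slides' exact procedure.

```python
from typing import Callable, Iterable, List

# Hypothetical toy web: page -> links found on that page.
TOY_WEB = {"A": ["B", "C"], "B": ["D"], "C": ["D", "E"], "D": [], "E": ["A"]}


def dfs_crawl(start: str, get_links: Callable[[str], Iterable[str]]) -> List[str]:
    """Follow links as deep as possible before backtracking."""
    visited = set()
    order = []

    def visit(page: str) -> None:
        visited.add(page)
        order.append(page)
        for link in get_links(page):
            if link not in visited:   # follow the 1st non-visited link
                visit(link)
        # returning from visit() is the "go back up a level" step

    visit(start)
    return order


print(dfs_crawl("A", lambda page: TOY_WEB.get(page, [])))
```

On very deep link chains, recursion can hit Python's recursion limit. An explicit stack is a common alternative.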