Computing Weights - E-Commerce - Lecture Slides, Slides of Fundamentals of E-Commerce

Students of Communication study E-Commerce as an auxiliary subject. These are the key points discussed in these E-Commerce lecture slides: Computing Weights, Hub Weight, Inverse Degree, Product, Downweight, Sequential Clustering, Community Problem, Large Scale, Invisible Text, Search Engine

Typology: Slides

2012/2013

Uploaded on 07/29/2013

sheil_34

Computing Weights

Hub weight computed from the sum of the product of the inverse degree of the in-links and the out-links
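One reading of this slide, following the SALSA construction of Lempel and Moran, is that the transition probability between two hubs sums, over every authority both hubs link to, the product of the inverse out-degree of the hub and the inverse in-degree of the authority. Hub weights are then the stationary distribution of that chain. The sketch below assumes that reading; the function names and the toy edge list are illustrative and do not come from the slides.

```python
from collections import defaultdict


def salsa_hub_chain(edges):
    """Build the hub-to-hub transition matrix under the assumed SALSA reading."""
    out_links = defaultdict(set)  # hub -> authorities it links to
    in_links = defaultdict(set)   # authority -> hubs linking to it
    for hub, authority in edges:
        out_links[hub].add(authority)
        in_links[authority].add(hub)

    hubs = sorted(out_links)
    index = {hub: i for i, hub in enumerate(hubs)}
    matrix = [[0.0] * len(hubs) for _ in hubs]
    for hub in hubs:
        for authority in out_links[hub]:
            # Product of the inverse out-degree (hub) and inverse in-degree (authority).
            step = 1.0 / (len(out_links[hub]) * len(in_links[authority]))
            for other in in_links[authority]:
                matrix[index[hub]][index[other]] += step
    return hubs, matrix


def stationary_weights(matrix, iterations=200):
    """Approximate the stationary distribution by repeated left-multiplication."""
    n = len(matrix)
    weights = [1.0 / n] * n
    for _ in range(iterations):
        weights = [sum(weights[i] * matrix[i][j] for i in range(n)) for j in range(n)]
    return weights


# Hypothetical toy graph given as (hub, authority) pairs.
toy_edges = [("h1", "a1"), ("h1", "a2"), ("h2", "a2"), ("h3", "a2"), ("h3", "a3")]
hubs, hub_matrix = salsa_hub_chain(toy_edges)
hub_weights = dict(zip(hubs, stationary_weights(hub_matrix)))
print(hub_weights)
```

On this connected toy graph the weights come out proportional to each hub's out-degree (2 : 1 : 2), which matches the known closed form for SALSA on a connected graph.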

Why We Care

Lempel and Moran (2001) showed theoretically that SALSA weights are more robust than HITS weights in the presence of the Tightly Knit Community (TKC) Effect.

This effect occurs when a small collection of pages (related to a given topic) is connected so that every hub links to every authority, and includes as a special case the mutual reinforcement effect

- The pages in a community connected in this way can be ranked highly by HITS, higher than pages in a much larger collection where only some hubs link to some authorities

- TKC could be exploited by spammers hoping to increase their page weight (e.g. link farms)
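The effect described above can be reproduced on a small synthetic graph. The sketch below runs a plain HITS power iteration (a minimal version written for this example, not taken from the slides) on two disconnected communities: a tightly knit one where every hub links to every authority, and a larger one where each hub links to only some authorities. All node names and sizes are made up for illustration.

```python
def hits_authorities(edges, iterations=50):
    """Plain HITS authority scores via power iteration, normalised to sum to 1."""
    hubs = sorted({h for h, _ in edges})
    authorities = sorted({a for _, a in edges})
    auth = {a: 1.0 for a in authorities}
    for _ in range(iterations):
        # Hub update: sum of the authority scores the hub points to.
        hub = {h: 0.0 for h in hubs}
        for h, a in edges:
            hub[h] += auth[a]
        # Authority update: sum of the hub scores pointing at the authority.
        new_auth = {a: 0.0 for a in authorities}
        for h, a in edges:
            new_auth[a] += hub[h]
        total = sum(new_auth.values())
        auth = {a: score / total for a, score in new_auth.items()}
    return auth


# Tightly knit community: 3 hubs, each linking to all 3 authorities (9 links).
tkc_edges = [(f"th{i}", f"ta{j}") for i in range(3) for j in range(3)]
# Larger, sparser community: 6 hubs, each linking to 2 of 6 authorities (12 links).
sparse_edges = [(f"sh{i}", f"sa{j % 6}") for i in range(6) for j in (i, i + 1)]

scores = hits_authorities(tkc_edges + sparse_edges)
tkc_share = sum(v for k, v in scores.items() if k.startswith("ta"))
print(round(tkc_share, 6))
```

Although the sparse community has more pages and more links, essentially all of the HITS authority mass ends up on the three tightly knit authorities, because HITS converges to the principal eigenvector and the dense block dominates it.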

Overcoming TKC

Similarity downweight sequencing and sequential clustering (Roberts and Rosenthal 2003)

Consider the underlying structure of clusters

Suggest downweight sequencing to avoid the Tightly Knit Community problem

Results indicate the approach is effective for the few queries tested, but it is still untested on a large scale

PHITS and More

PHITS: Cohn and Chang (2000)

Only the principal eigenvector is extracted using SALSA, so the authority along the remaining eigenvectors is completely neglected

Account for more eigenvectors of the co-citation matrix

See also Lempel, Moran (2003)
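As a rough illustration of what looking at more eigenvectors of the co-citation matrix can recover, the sketch below extracts the first two eigenvectors by power iteration with deflation. This is not the PHITS algorithm itself, which fits a probabilistic model; it is only a linear-algebra stand-in, and the graph, names, and helper functions are assumptions made for this example.

```python
import math


def cocitation_matrix(edges, authorities):
    """C[i][j] = number of hubs that link to both authority i and authority j."""
    index = {a: i for i, a in enumerate(authorities)}
    cited = {}
    for hub, authority in edges:
        cited.setdefault(hub, []).append(index[authority])
    n = len(authorities)
    matrix = [[0.0] * n for _ in range(n)]
    for targets in cited.values():
        for i in targets:
            for j in targets:
                matrix[i][j] += 1.0
    return matrix


def dominant_eigenpair(matrix, iterations=500):
    """Power iteration from a uniform start; it finds the largest eigenvalue
    whose eigenvector is not orthogonal to that start vector."""
    n = len(matrix)
    vector = [1.0 / math.sqrt(n)] * n
    for _ in range(iterations):
        product = [sum(row[j] * vector[j] for j in range(n)) for row in matrix]
        norm = math.sqrt(sum(x * x for x in product))
        vector = [x / norm for x in product]
    product = [sum(row[j] * vector[j] for j in range(n)) for row in matrix]
    eigenvalue = sum(v * p for v, p in zip(vector, product))
    return eigenvalue, vector


def deflate(matrix, eigenvalue, vector):
    """Remove an extracted eigenpair so the next one becomes dominant."""
    n = len(matrix)
    return [[matrix[i][j] - eigenvalue * vector[i] * vector[j] for j in range(n)]
            for i in range(n)]


# Hypothetical graph: a dense community (ta*) and a larger sparse one (sa*).
edges = [(f"th{i}", f"ta{j}") for i in range(3) for j in range(3)]
edges += [(f"sh{i}", f"sa{j % 6}") for i in range(6) for j in (i, i + 1)]
authorities = [f"ta{j}" for j in range(3)] + [f"sa{j}" for j in range(6)]

C = cocitation_matrix(edges, authorities)
first_value, first_vector = dominant_eigenpair(C)
second_value, second_vector = dominant_eigenpair(deflate(C, first_value, first_vector))
print(round(first_value, 6), round(second_value, 6))
```

The first eigenvector puts all of its weight on the dense community, while the second one is supported entirely on the sparse community, whose authorities a principal-eigenvector-only method would score as zero.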

Limits of Link Analysis

Stability

Adding even a small number of nodes/edges to the graph has a significant impact

Topic drift – similar to TKC

A top authority may be a hub of pages on a different topic, resulting in increased rank of the authority page

Content evolution

Adding/removing links/content can affect the intuitive authority rank of a page, requiring recalculation of page ranks
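The stability point can be made concrete with a small experiment: adding one page with two links is enough to move essentially all HITS authority from one community to another. The sketch below uses a minimal HITS power iteration written for this example; the graph and node names are hypothetical.

```python
def hits_authorities(edges, iterations=300):
    """Plain HITS authority scores via power iteration, normalised to sum to 1."""
    hubs = sorted({h for h, _ in edges})
    authorities = sorted({a for _, a in edges})
    auth = {a: 1.0 for a in authorities}
    for _ in range(iterations):
        # Hub update: sum of the authority scores the hub points to.
        hub = {h: 0.0 for h in hubs}
        for h, a in edges:
            hub[h] += auth[a]
        # Authority update: sum of the hub scores pointing at the authority.
        new_auth = {a: 0.0 for a in authorities}
        for h, a in edges:
            new_auth[a] += hub[h]
        total = sum(new_auth.values())
        auth = {a: score / total for a, score in new_auth.items()}
    return auth


# Community A: x1 and x2 link to a1 and a2, and x3 links to a1 only.
# Community B: y1 and y2 link to b1 and b2.
before = [("x1", "a1"), ("x1", "a2"), ("x2", "a1"), ("x2", "a2"), ("x3", "a1"),
          ("y1", "b1"), ("y1", "b2"), ("y2", "b1"), ("y2", "b2")]
# Perturbation: one new hub page with two links into community B.
after = before + [("y3", "b1"), ("y3", "b2")]

scores_before = hits_authorities(before)
scores_after = hits_authorities(after)
top_before = max(scores_before, key=scores_before.get)
top_after = max(scores_after, key=scores_after.get)
print(top_before, top_after)
```

Before the change the top authority is in community A; after it, community B holds nearly all of the authority weight and community A's scores collapse toward zero, even though nothing about A changed.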

Further Reading

R. Lempel and S. Moran, Rank Stability and Rank Similarity of Link-Based Web Ranking Algorithms in Authority Connected Graphs, submitted to Information Retrieval, special issue on Advances in Mathematics/Formal Methods in Information Retrieval, 2003.

M. Henzinger, Link Analysis in Web Information Retrieval, Bulletin of the IEEE Computer Society Technical Committee on Data Engineering, 2000.

L. Getoor, N. Friedman, D. Koller, and A. Pfeffer, Relational Data Mining, S. Dzeroski and N. Lavrac, Eds., Springer-Verlag, 2001.