Recommendation systems, Slides of Computer science

Recommendation systems subject unit-3

Typology: Slides

2025/2026

Available from 06/30/2026

poojitha-chougani

UNIT III: User-Based collaborative filtering,
Similarity Function Variants, Variants of the
Prediction Function, Item-Based Collaborative
filtering, Comparing User-Based and Item-Based
Methods, Strengths and Weaknesses of
Neighborhood-Based Methods

Neighborhood-Based Collaborative Filtering

Introduction

• Neighborhood-based collaborative filtering (also called memory-based filtering) relies on user and item similarity.

• Two main types:

– User-based collaborative filtering: Predicts ratings based on similar users' ratings.

– Item-based collaborative filtering: Predicts ratings based on a user's ratings of similar items.

Key Properties of Ratings Matrices

1. Definition and Structure of Ratings Matrices

  • The ratings matrix R is an m × n matrix whose m rows represent users and whose n columns represent items.
  • Ratings are typically sparse, with only a small subset of the entries specified.
  • Specified entries act as training data; unspecified entries act as test data.
  • Recommendation is a generalization of classification and regression: missing values can occur in any column, not only in a single designated target column.
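
As a minimal sketch of this structure, the snippet below stores a small ratings matrix as a list of rows and separates specified from unspecified entries. The 4 × 5 matrix and its values are made up for illustration, and using `None` for an unspecified entry is an assumption of this sketch, not something the slides prescribe.

```python
# Hypothetical 4-user x 5-item ratings matrix; None marks an unspecified entry.
R = [
    [5, None, 3, None, 1],
    [None, 4, None, None, 2],
    [1, None, None, 5, None],
    [None, None, 4, None, None],
]

m, n = len(R), len(R[0])  # m users (rows), n items (columns)

# Specified entries play the role of training data.
observed = [(u, j, R[u][j]) for u in range(m) for j in range(n) if R[u][j] is not None]

# Unspecified entries are the ones a recommender is asked to predict.
missing = [(u, j) for u in range(m) for j in range(n) if R[u][j] is None]

# Fraction of the matrix that is unspecified.
sparsity = len(missing) / (m * n)
print(f"{m} users x {n} items, {len(observed)} specified, sparsity = {sparsity:.2f}")
```

Real ratings matrices are far sparser than this toy one, which is why a sparse format (for example a dictionary of dictionaries) is usually a better fit than a dense list of rows.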

2. Types of Ratings

Continuous Ratings

  • Ratings can take any value within a range (e.g., Jester joke system: -10 to 10).
  • Drawback: Users find it difficult to choose from an infinite set of values.

Interval-Based Ratings

  • Ratings are selected from a fixed scale (e.g., 1-5, -2 to 2, 1-7).
  • Assumes equal distance between rating levels.

Ordinal Ratings

  • Categorical but ordered values (e.g., “Strongly Disagree” to “Strongly Agree”).
  • No assumption that differences between categories are equal.

3. Implicit Feedback & Unary Ratings

  • Implicit feedback: User actions (e.g., purchases, clicks) are interpreted as preferences.
  • More common than explicit ratings, as users interact more frequently than they rate.
  • Can be seen as a positive-unlabeled (PU) learning problem in classification.

4. The Long-Tail Property in Ratings Distribution

  • Observation: A small fraction of items are rated frequently (popular items), while the majority have few ratings (long-tail items).
  • Graph representation:
    • X-axis: Items ranked by frequency of ratings.
    • Y-axis: Number of ratings per item.
    • Results in a skewed distribution.
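
The long-tail plot described above can be approximated in a few lines: count the ratings per item, rank the items by that count, and print one bar per item. This is only a sketch; the `events` list, the user IDs, and the item names are hypothetical.

```python
from collections import Counter

# Hypothetical (user, item) rating events; names are made up for illustration.
events = [
    ("u1", "A"), ("u2", "A"), ("u3", "A"), ("u4", "A"), ("u5", "A"),
    ("u1", "B"), ("u2", "B"), ("u3", "B"),
    ("u1", "C"), ("u4", "D"), ("u5", "E"),
]

# Number of ratings per item (the Y-axis of the long-tail plot).
counts = Counter(item for _, item in events)

# Items ranked by rating frequency, most rated first (the X-axis).
ranked = counts.most_common()

# Text version of the plot: one bar per item.
for rank, (item, count) in enumerate(ranked, start=1):
    print(f"{rank:>2} {item} {'#' * count}")

# Share of all ratings that goes to the single most popular item.
head_share = ranked[0][1] / len(events)
```

In this toy data one item collects almost half of all ratings while three items have a single rating each, which is the skew the slide refers to.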

5. Implications of the Long-Tail Property

  • Merchant Profitability
    • Popular items are competitive but low-profit.
    • Less popular items (long-tail) often have higher profit margins (e.g., Amazon’s strategy).

Predicting Ratings with Neighborhood-Based Methods

1. Concept of Neighborhood-Based Methods

  • Uses user-user similarity or item-item similarity to make recommendations.
  • Relies on the principle that similar users or similar items have similar ratings.

2. Two Basic Principles

  • User-Based Models
    • Users with similar rating patterns tend to rate items similarly.
    • Example: If Alice and Bob have rated movies similarly in the past, Alice’s rating for "Terminator" can predict Bob’s rating for the same movie.
  • Item-Based Models
    • Similar items receive similar ratings from the same user.
    • Example: Bob's ratings for "Alien" and "Predator" can predict his rating for "Terminator."

3. Connection to Nearest Neighbor Classification

  • Collaborative filtering is a generalization of classification/regression modeling.
  • Neighborhood-based models are similar to nearest neighbor classifiers in machine learning.
  • Nearest neighbor classifiers compare rows only, whereas collaborative filtering can determine nearest neighbors along either rows (users) or columns (items).

5. Item-Item Similarity Computation (Example from Table 2.2)

  • Adjusted cosine similarity is used for item similarity calculations.
  • Each user's mean rating is subtracted from that user's ratings before items are compared, which removes per-user rating bias.
  • The resulting score between two items lies between -1 and 1; higher values indicate more similar items.
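
A minimal sketch of adjusted cosine similarity follows. The matrix `R` is made up for illustration and is not the data of Table 2.2. The helper names `user_means` and `adjusted_cosine` are hypothetical, and the choice to restrict the sums to users who rated both items is an assumption of this sketch.

```python
import math

# Hypothetical ratings: rows are users, columns are items, None = unrated.
R = [
    [5, 4, 1],
    [4, 5, 2],
    [1, 2, 5],
    [None, 1, 4],
]

def user_means(R):
    """Mean of each user's specified ratings (0.0 if the user rated nothing)."""
    means = []
    for row in R:
        rated = [r for r in row if r is not None]
        means.append(sum(rated) / len(rated) if rated else 0.0)
    return means

def adjusted_cosine(R, i, j):
    """Adjusted cosine similarity between item columns i and j."""
    mu = user_means(R)
    # Mean-center by user, keeping only users who rated both items.
    pairs = [
        (row[i] - mu[u], row[j] - mu[u])
        for u, row in enumerate(R)
        if row[i] is not None and row[j] is not None
    ]
    num = sum(a * b for a, b in pairs)
    den = math.sqrt(sum(a * a for a, _ in pairs)) * math.sqrt(sum(b * b for _, b in pairs))
    return num / den if den else 0.0  # no usable overlap: treat as unrelated

sim_01 = adjusted_cosine(R, 0, 1)
sim_02 = adjusted_cosine(R, 0, 2)
print(f"sim(item0, item1) = {sim_01:.3f}, sim(item0, item2) = {sim_02:.3f}")
```

In this toy data items 0 and 1 are liked and disliked by the same users, so their score is positive, while items 0 and 2 are rated in opposite directions and score negative.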

User-Based Neighborhood Models

1. Concept of User-Based Neighborhoods

  • Defines user neighborhoods by identifying users similar to the target user.
  • Uses these similar users' ratings to predict missing ratings for the target user.
  • A similarity function is required, and it must account for different rating scales among users.

2. Key Challenges in User-Based Similarity Computation

  • Different rating scales: Some users consistently give higher or lower ratings than others.
  • Sparse ratings: Many users rate only a small subset of items, making similarity computation challenging.
  • Mutual rating sets: Similarity is computed only over the items that both users have rated.
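
One common similarity function that addresses these points is the Pearson correlation coefficient, sketched below as one possible choice rather than the method the slides necessarily use. It subtracts each user's mean rating (handling different rating scales) and sums only over mutually rated items. The rating values, the user "carol", and the movie "Titanic" are made up for illustration, and the helper names are hypothetical.

```python
import math

# Hypothetical ratings keyed by user, then item; absent keys are unrated.
ratings = {
    "alice": {"Alien": 5, "Predator": 4, "Terminator": 5, "Titanic": 1},
    "bob":   {"Alien": 4, "Predator": 5, "Titanic": 2},
    "carol": {"Alien": 1, "Predator": 2, "Terminator": 1, "Titanic": 5},
}

def mean(values):
    values = list(values)
    return sum(values) / len(values)

def pearson(ratings, u, v):
    """Pearson correlation of users u and v over their mutually rated items."""
    common = sorted(ratings[u].keys() & ratings[v].keys())
    if len(common) < 2:
        return 0.0  # too little overlap to estimate a correlation
    # Each user's mean is taken over all items that user rated.
    mu_u, mu_v = mean(ratings[u].values()), mean(ratings[v].values())
    du = [ratings[u][i] - mu_u for i in common]
    dv = [ratings[v][i] - mu_v for i in common]
    num = sum(a * b for a, b in zip(du, dv))
    den = math.sqrt(sum(a * a for a in du)) * math.sqrt(sum(b * b for b in dv))
    return num / den if den else 0.0

for other in ("alice", "carol"):
    print(f"sim(bob, {other}) = {pearson(ratings, 'bob', other):.3f}")
```

Bob's ratings move with Alice's and against Carol's, so the first similarity is positive and the second negative.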

5. Variations & Enhancements

  • Some implementations compute each user's mean rating only over the items that overlap with the other user, so the mean changes from pair to pair.
  • Heuristic filtering removes users with low or negative similarity to improve accuracy.
  • The method allows for different similarity measures and weighting strategies to fine-tune recommendations.
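
The sketch below combines these variations in one user-based predictor, under stated assumptions: Pearson similarity, a mean-centered weighted average as the prediction function, and a similarity threshold as the heuristic filter. The data, the helper names, and the parameters `k`, `min_sim`, and `overlap_means` are all hypothetical choices for illustration.

```python
import math

# Hypothetical ratings keyed by user, then item; absent keys are unrated.
ratings = {
    "alice": {"Alien": 5, "Predator": 4, "Terminator": 5, "Titanic": 1},
    "bob":   {"Alien": 4, "Predator": 5, "Titanic": 2},
    "carol": {"Alien": 1, "Predator": 2, "Terminator": 1, "Titanic": 5},
}

def mean(values):
    values = list(values)
    return sum(values) / len(values)

def pearson(ratings, u, v, overlap_means=False):
    """Pearson similarity over mutually rated items.

    With overlap_means=True the user means are recomputed over the
    overlapping items only, so they differ from pair to pair.
    """
    common = sorted(ratings[u].keys() & ratings[v].keys())
    if len(common) < 2:
        return 0.0
    if overlap_means:
        mu_u = mean(ratings[u][i] for i in common)
        mu_v = mean(ratings[v][i] for i in common)
    else:
        mu_u, mu_v = mean(ratings[u].values()), mean(ratings[v].values())
    du = [ratings[u][i] - mu_u for i in common]
    dv = [ratings[v][i] - mu_v for i in common]
    num = sum(a * b for a, b in zip(du, dv))
    den = math.sqrt(sum(a * a for a in du)) * math.sqrt(sum(b * b for b in dv))
    return num / den if den else 0.0

def predict(ratings, u, item, k=2, min_sim=0.0):
    """Mean-centered weighted average over the top-k sufficiently similar peers."""
    mu_u = mean(ratings[u].values())
    # Candidate peers are the other users who rated the item.
    peers = [(pearson(ratings, u, v), v) for v in ratings if v != u and item in ratings[v]]
    # Heuristic filter: drop peers with low or negative similarity, keep the top k.
    peers = sorted((p for p in peers if p[0] > min_sim), reverse=True)[:k]
    if not peers:
        return mu_u  # fall back to the user's own mean
    num = sum(s * (ratings[v][item] - mean(ratings[v].values())) for s, v in peers)
    den = sum(abs(s) for s, _ in peers)
    return mu_u + num / den

print(f"Predicted rating of bob for Terminator: {predict(ratings, 'bob', 'Terminator'):.2f}")
```

Carol is filtered out because her similarity to Bob is negative, so the prediction is Bob's mean rating plus Alice's above-average deviation for "Terminator". Raising `min_sim` or lowering `k` trades coverage for a stricter neighborhood.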