Rule-Based Collaborative Filtering, Association
Rules, Naive Bayes Collaborative Filtering,
Neural Network, Singular Value Decomposition,
Stochastic Gradient Descent, Regularization.
Rule-Based Collaborative Filtering
Using Association Rules
Relationship Between Association Rules and Collaborative Filtering
- Association rule mining was originally used to discover relationships in supermarket transaction data.
- It is naturally defined over binary data but can be extended to categorical and numerical data by first converting them to binary form.
- In supermarket transactions and implicit feedback datasets, unary data is common, where 1s indicate a purchase and 0s indicate missing values (often approximated as "not purchased").
Example of Support in Market Basket Data (Table 3.1)
- Two frequent itemsets identified:
  - {Bread, Butter, Milk}
  - {Fish, Beef, Ham}
- These itemsets have a support of at least 0.2, meaning they appear in at least 20% of transactions.
- Implication for Recommendation Systems:
  - If a customer buys {Butter, Milk}, they are likely to buy Bread (like Mary in the table).
  - If a customer buys {Fish, Ham}, they are likely to buy Beef (like John in the table).
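The notion of support can be illustrated with a minimal sketch. The baskets below are hypothetical toy data, not the contents of Table 3.1, and the function names `support` and `frequent_itemsets` are chosen here for illustration only. The enumeration is brute force, which is fine for a handful of items but not how a practical miner would work.

```python
from itertools import combinations

# Hypothetical toy baskets (NOT the contents of Table 3.1).
transactions = [
    {"Bread", "Butter", "Milk"},
    {"Bread", "Butter", "Milk", "Fish"},
    {"Fish", "Beef", "Ham"},
    {"Fish", "Beef", "Ham", "Milk"},
    {"Butter", "Milk"},
]

def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)

def frequent_itemsets(transactions, min_support, size):
    """Brute-force list of all itemsets of the given size meeting min_support."""
    items = sorted(set().union(*transactions))
    return [
        set(candidate)
        for candidate in combinations(items, size)
        if support(candidate, transactions) >= min_support
    ]
```

On this toy data, `frequent_itemsets(transactions, 0.4, 3)` returns the two three-item sets {Bread, Butter, Milk} and {Beef, Fish, Ham}, each appearing in 2 of the 5 baskets.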
Association Rules and Confidence
- Definition: An association rule is an implication of the form X ⇒ Y, where:
  - X (antecedent): Items already purchased.
  - Y (consequent): Items that can be recommended.
- Example Rule: {Butter, Milk} ⇒ {Bread}
  - Useful for recommending Bread to Mary, since she has already bought Butter and Milk.
- Confidence Measure: The strength of the rule is measured by its confidence, the fraction of transactions containing X that also contain Y: confidence(X ⇒ Y) = support(X ∪ Y) / support(X).
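As a rough illustration of the confidence measure, the sketch below computes it as support(X ∪ Y) / support(X) on hypothetical toy baskets (again, not Table 3.1). The names `support` and `confidence` are illustrative choices, and returning 0.0 for an antecedent that never occurs is an assumption made here to avoid division by zero.

```python
# Hypothetical toy baskets (NOT the contents of Table 3.1).
transactions = [
    {"Bread", "Butter", "Milk"},
    {"Bread", "Butter", "Milk", "Fish"},
    {"Fish", "Beef", "Ham"},
    {"Fish", "Beef", "Ham", "Milk"},
    {"Butter", "Milk"},
]

def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    return sum(1 for t in transactions if itemset <= t) / len(transactions)

def confidence(antecedent, consequent, transactions):
    """Confidence of the rule antecedent => consequent."""
    x = set(antecedent)
    xy = x | set(consequent)
    support_x = support(x, transactions)
    if support_x == 0:
        return 0.0  # assumption: a rule whose antecedent never occurs gets confidence 0
    return support(xy, transactions) / support_x

rule_conf = confidence({"Butter", "Milk"}, {"Bread"}, transactions)
```

Here three baskets contain {Butter, Milk} and two of those also contain Bread, so the rule {Butter, Milk} ⇒ {Bread} has a confidence of 2/3 on this toy data.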
Importance of Association Rule Mining in Collaborative Filtering
- Helps in discovering hidden correlations between products in transactional data.
- Useful for:
  - Personalized recommendations (e.g., suggesting items frequently bought together).
  - Targeted marketing strategies (e.g., offering discounts on complementary items).
- Comparison with Collaborative Filtering:
  - Unlike traditional collaborative filtering, association rules do not require explicit user ratings.
  - They are well suited to cases where only implicit feedback (e.g., purchase history) is available.
Leveraging Association Rules for Collaborative Filtering
Association Rules and Unary Ratings Matrices
- Unary ratings matrices arise from customer activities (e.g., purchases) where a customer only indicates a "like" (not a dislike).
- Unary Data Representation:
  - Items purchased (liked) → 1
  - Missing items (not purchased) → 0
- Unlike typical rating matrices, missing values in unary matrices are approximated as 0 to simplify processing.
- Unary matrices are sparse, meaning most values are 0, which makes it acceptable to assume missing values are "not purchased."
- The matrix is treated as binary data, allowing association rules to be applied.
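The conversion from a purchase log to a binary matrix can be sketched as follows. The purchase log and the helper name `build_binary_matrix` are hypothetical; the only point illustrated is that every item a customer has not purchased is filled in as 0.

```python
# Hypothetical purchase log: customer -> items bought.
purchases = {
    "Mary": ["Butter", "Milk"],
    "John": ["Fish", "Ham"],
    "Ann": ["Bread", "Butter", "Milk"],
}

def build_binary_matrix(purchases):
    """Return (column labels, rows) with 1 = purchased and 0 = missing."""
    items = sorted({item for basket in purchases.values() for item in basket})
    matrix = {
        customer: [1 if item in basket else 0 for item in items]
        for customer, basket in purchases.items()
    }
    return items, matrix

items, matrix = build_binary_matrix(purchases)
```

Each row of `matrix` can then be read as one transaction for association rule mining.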
Recommending Items to a Customer
- Consider a customer A to whom we want to recommend relevant items.
- Steps:
  - Identify all rules "fired" for customer A, meaning the rule's antecedent is contained in the set of items A has purchased.
  - Sort fired rules by decreasing confidence.
  - The top-k items found in the consequents of these rules are recommended to the customer.
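The three steps above can be sketched as follows. The rule list and its confidence values are hypothetical, as is the function name `recommend`. Skipping items the customer already owns and removing duplicates are assumptions made here; the steps above do not spell them out.

```python
# Hypothetical mined rules: (antecedent, consequent, confidence).
rules = [
    (frozenset({"Butter", "Milk"}), frozenset({"Bread"}), 0.67),
    (frozenset({"Milk"}), frozenset({"Cereal"}), 0.50),
    (frozenset({"Fish", "Ham"}), frozenset({"Beef"}), 0.90),
    (frozenset({"Butter"}), frozenset({"Bread", "Jam"}), 0.40),
]

def recommend(purchased, rules, k):
    """Recommend up to k items from the consequents of fired rules."""
    purchased = set(purchased)
    # Step 1: a rule fires when its antecedent is a subset of the purchases.
    fired = [rule for rule in rules if rule[0] <= purchased]
    # Step 2: sort fired rules by decreasing confidence.
    fired.sort(key=lambda rule: rule[2], reverse=True)
    # Step 3: collect the first k new items found in the consequents.
    recommendations = []
    for _, consequent, _ in fired:
        for item in sorted(consequent):
            # Assumption: skip items already owned or already recommended.
            if item in purchased or item in recommendations:
                continue
            recommendations.append(item)
            if len(recommendations) == k:
                return recommendations
    return recommendations
```

For a customer who has bought Butter and Milk, `recommend({"Butter", "Milk"}, rules, 2)` yields Bread followed by Cereal, since the rule with Fish and Ham in its antecedent does not fire.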
Handling Numeric Ratings in Association Rules
- Unary matrices only capture "likes," but real-world ratings involve numeric values (e.g., 1-5 stars).
- Approach for Numeric Ratings:
  - Convert each (item, rating) pair into a pseudo-item.
  - Example: (Item = Bread, Rating = Dislike) is treated as a distinct item.
  - Construct rules using pseudo-items rather than simple item names.
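A minimal sketch of the pseudo-item conversion is shown below. The ratings and the `"Item=Rating"` string encoding are hypothetical choices made for illustration; any encoding that keeps each (item, rating) pair distinct would serve.

```python
# Hypothetical ratings: user -> {item: rating label}.
ratings = {
    "Mary": {"Bread": "Like", "Fish": "Dislike"},
    "John": {"Bread": "Dislike", "Beef": "Like"},
}

def to_pseudo_items(user_ratings):
    """Encode each (item, rating) pair as one pseudo-item string."""
    return {f"{item}={rating}" for item, rating in user_ratings.items()}

# One transaction of pseudo-items per user, ready for rule mining.
transactions = {user: to_pseudo_items(r) for user, r in ratings.items()}
```

Note that "Bread=Like" and "Bread=Dislike" are different pseudo-items, so rules mined from these transactions can predict a specific rating rather than just a purchase.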
Weighted Voting for Prediction
- Instead of strict rules, ratings can be numerically aggregated.
- Steps:
  - Identify all fired rules predicting ratings for a given item.
  - Sum up votes for each rating based on the rule’s confidence.
  - The highest weighted rating determines the predicted rating.
- The sorted list of top-rated items is recommended to the user.
Using Interval-Based Ratings
- When the rating scale has many possible values (e.g., 1-5 stars):
  - Convert the scale into a smaller set of intervals (e.g., 1-2 = "Low", 3 = "Medium", 4-5 = "High").
  - Apply association rule mining on the interval-based ratings.
- This allows fine-grained ratings to be handled in a structured way.
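The interval conversion and the confidence-weighted voting described earlier can be combined in one rough sketch. The rules, their confidences, and the function names are hypothetical. The interval boundaries follow the example above (1-2 = "Low", 3 = "Medium", 4-5 = "High"), and returning `None` when no rule fires is an assumption made here.

```python
from collections import defaultdict

def to_interval(stars):
    """Map a 1-5 star rating to the interval labels used in the example."""
    if stars <= 2:
        return "Low"
    if stars == 3:
        return "Medium"
    return "High"

# Hypothetical rules: (antecedent pseudo-items, (item, predicted label), confidence).
rules = [
    (frozenset({"Butter=High"}), ("Bread", "High"), 0.8),
    (frozenset({"Milk=High"}), ("Bread", "High"), 0.5),
    (frozenset({"Fish=Low"}), ("Bread", "Low"), 0.9),
    (frozenset({"Ham=High"}), ("Bread", "Low"), 0.7),
]

def predict_rating(user_stars, target_item, rules):
    """Predict an interval label for target_item by confidence-weighted voting."""
    profile = {f"{item}={to_interval(s)}" for item, s in user_stars.items()}
    votes = defaultdict(float)
    for antecedent, (item, label), conf in rules:
        # Only fired rules that predict a rating for the target item vote.
        if item == target_item and antecedent <= profile:
            votes[label] += conf
    if not votes:
        return None  # assumption: no prediction when no rule fires
    return max(votes, key=votes.get)

prediction = predict_rating({"Butter": 5, "Milk": 4, "Fish": 1}, "Bread", rules)
```

In this toy case the two rules voting "High" contribute 0.8 + 0.5, which outweighs the single "Low" vote of 0.9, so the predicted label is "High".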
Item-Specific Support for Better Recommendations
- Instead of one global support threshold, different items can be assigned different minimum support thresholds.
- Example:
  - A rarely purchased item may still be important, so a lower support threshold should be used.
  - A frequently purchased item should have a higher support threshold.
- Using item-specific support can improve the quality of recommendations.
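One possible way to apply item-specific support is sketched below. The baskets, the per-item thresholds, and the function names are hypothetical. In particular, taking the threshold of an itemset to be the smallest threshold among its items is an assumption made for this sketch; the text above does not say how thresholds of individual items should be combined.

```python
# Hypothetical toy baskets; Caviar is rarely purchased.
transactions = [
    {"Bread", "Milk"},
    {"Bread", "Milk"},
    {"Bread", "Caviar"},
    {"Bread", "Milk"},
    {"Milk"},
]

# Hypothetical per-item minimum supports: lower for the rare item.
item_min_support = {"Caviar": 0.1, "Bread": 0.5, "Milk": 0.5}

def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    return sum(1 for t in transactions if itemset <= t) / len(transactions)

def is_frequent(itemset, transactions, item_min_support, default=0.3):
    """Check an itemset against the lowest threshold among its items."""
    # Assumption: the itemset inherits the smallest per-item threshold.
    threshold = min(item_min_support.get(item, default) for item in itemset)
    return support(itemset, transactions) >= threshold
```

With a single global threshold of 0.3, {Caviar} (support 0.2) would be discarded; with its own threshold of 0.1 it is retained, and so is {Bread, Caviar}.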
Naïve Bayes Model in Collaborative Filtering
Application of Bayes' Theorem
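- We compute the probability of a missing rating based on observed ratings: P(rating of item j = v | observed ratings) = P(rating of item j = v) · P(observed ratings | rating of item j = v) / P(observed ratings).
Estimating Conditional Probabilities
- The Naïve Bayes assumption is applied: the user's observed ratings are assumed to be conditionally independent of one another given a specific rating for item j.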
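A rough sketch of how such a prediction might be computed is given below. The ratings, the function name `predict_rating`, and the use of Laplace smoothing (the `alpha` parameter) are assumptions made for this sketch. The prior is estimated from the users who rated item j, each conditional probability is estimated from the users who gave item j the candidate value, and the shared denominator P(observed ratings) is dropped because it does not affect which value scores highest.

```python
# Hypothetical like (+1) / dislike (-1) ratings; "target" has not rated item "C".
ratings = {
    "u1": {"A": 1, "B": 1, "C": 1},
    "u2": {"A": 1, "B": 1, "C": 1},
    "u3": {"A": -1, "B": -1, "C": -1},
    "u4": {"A": -1, "B": 1, "C": -1},
    "target": {"A": 1, "B": 1},
}

def predict_rating(ratings, user, item, values, alpha=1.0):
    """Score each candidate value v by prior(v) * product of likelihoods."""
    observed = ratings[user]
    # Other users who have rated the target item.
    raters = [r for u, r in ratings.items() if u != user and item in r]
    scores = {}
    for v in values:
        with_v = [r for r in raters if r[item] == v]
        # Prior P(r_j = v), Laplace-smoothed (alpha is an assumption).
        prior = (len(with_v) + alpha) / (len(raters) + alpha * len(values))
        likelihood = 1.0
        for k, r_uk in observed.items():
            if k == item:
                continue
            # P(r_k = r_uk | r_j = v), estimated among users who rated both.
            both = [r for r in with_v if k in r]
            match = sum(1 for r in both if r[k] == r_uk)
            likelihood *= (match + alpha) / (len(both) + alpha * len(values))
        scores[v] = prior * likelihood
    return max(scores, key=scores.get), scores

predicted, scores = predict_rating(ratings, "target", "C", values=[-1, 1])
```

On this toy data the users who liked C also liked A and B, matching the target user's profile, so the score for +1 exceeds the score for -1.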