
CCS360: RECOMMENDER SYSTEMS

UNIT 1 - INTRODUCTION
Introduction and basic taxonomy of recommender systems - Traditional and non-personalized recommender systems - Overview of data mining methods for recommender systems - Similarity measures - Dimensionality reduction - Singular Value Decomposition (SVD).

Introduction and basic taxonomy of recommender systems:

A recommendation system filters information in order to retain buyers on e-commerce sites and applications. It is used on e-commerce sites, social media platforms, and multimedia platforms. Recommendations are based either on a user's own experience or on the experience of similar users.

 For a business, it can increase sales.
 Customer service can be personalized, thereby gaining customer trust and loyalty.
 It increases a business's knowledge about its customers.
 It can help persuade customers and decide on discount offers.
How Do Content-Based Recommender Systems Work?
A content-based recommender works with data that the user provides, either explicitly (e.g., ratings) or implicitly (e.g., clicking on a link). Based on that data, a user profile is generated, which is then used to make suggestions to the user.
Example: Netflix suggesting movies based on a user's past viewing history and the genre, cast, and plot features of the movies.
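Below is a minimal content-based sketch in Python (the movie names, genre features, and liked items are assumptions for illustration, not taken from the source): a user profile is built from the features of liked items, and remaining items are ranked by cosine similarity to that profile.

```python
# Minimal content-based sketch: rank unseen items by similarity to a user profile.
import numpy as np

# Hypothetical item feature matrix: rows = movies, columns = genres.
items = {
    "Movie A": np.array([1, 0, 1, 0]),   # action, sci-fi
    "Movie B": np.array([1, 1, 0, 0]),   # action, comedy
    "Movie C": np.array([0, 1, 0, 1]),   # comedy, romance
    "Movie D": np.array([0, 0, 1, 1]),   # sci-fi, romance
}
liked = ["Movie A", "Movie B"]           # explicit or implicit feedback (assumed)

# Build the user profile as the average of the liked items' features.
profile = np.mean([items[name] for name in liked], axis=0)

def cosine(a, b):
    return float(a @ b) / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-9)

# Rank the items the user has not seen yet by similarity to the profile.
candidates = [name for name in items if name not in liked]
ranked = sorted(candidates, key=lambda n: cosine(profile, items[n]), reverse=True)
print(ranked)
```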

What is collaborative filtering for recommender systems?

In collaborative filtering, we find similar users and recommend what those similar users like. In this type of recommendation system, we don't use the features of the item to recommend it; rather, we group users into clusters of similar types and recommend to each user according to the preferences of its cluster.
Examples:
Amazon and Netflix.
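A minimal user-based collaborative filtering sketch (the users, items, and ratings below are assumed for illustration): users are compared by cosine similarity over their rating vectors, and an unseen item is scored by a similarity-weighted average of the neighbours' ratings.

```python
# Minimal user-based collaborative filtering sketch on toy data.
import numpy as np

ratings = {                      # user -> {item: rating}
    "alice": {"i1": 5, "i2": 3, "i3": 4},
    "bob":   {"i1": 4, "i2": 3, "i4": 5},
    "carol": {"i2": 1, "i3": 2, "i4": 4},
}
items = sorted({i for r in ratings.values() for i in r})

def vector(user):
    # Dense rating vector with 0 for unrated items.
    return np.array([ratings[user].get(i, 0.0) for i in items])

def cosine(a, b):
    return float(a @ b) / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-9)

def predict(user, item):
    # Similarity-weighted average of other users' ratings for the item.
    num = den = 0.0
    for other in ratings:
        if other != user and item in ratings[other]:
            sim = cosine(vector(user), vector(other))
            num += sim * ratings[other][item]
            den += sim
    return num / den if den else 0.0

print(predict("alice", "i4"))    # predicted score for an item alice has not rated
```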
Traditional and non-personalized Recommender Systems:

Traditional and non-personalized recommender systems are


techniques used in data science to provide recommendations to users based
on general preferences and popular items rather than individual user profiles.
These systems are often used when there is limited or no information
available about specific users or when personalization is not a primary
requirement.

Traditional Recommender Systems:


a. Popularity-Based Recommender Systems: Recommend items that are popular or have received high overall ratings or interactions from all users. Use cases include recommending trending news articles, top-rated movies, or best-selling products to a broad audience.

b. Average Rating Recommender Systems: Recommend items with high average ratings from users. Suitable for suggesting well-liked products or services, such as highly rated books or restaurants. (A short sketch of approaches a and b follows this list.)

c. Content-Based Recommender Systems: Recommend items that are similar in content to items the user has interacted with in the past. Typically used for suggesting related articles, music, or products based on textual or feature similarity.
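A minimal sketch of the popularity-based and average-rating recommenders in (a) and (b) above, on assumed toy data: rank items either by how many users interacted with them, or by their average rating with a minimum-count threshold so a single high rating does not dominate.

```python
# Minimal popularity-based and average-rating recommenders on toy data.
from collections import defaultdict

ratings = [                      # (user, item, rating) tuples, assumed for illustration
    ("u1", "book_a", 5), ("u2", "book_a", 4), ("u3", "book_a", 4),
    ("u1", "book_b", 5), ("u2", "book_c", 2), ("u3", "book_c", 3),
]

counts, sums = defaultdict(int), defaultdict(float)
for _, item, r in ratings:
    counts[item] += 1
    sums[item] += r

# (a) Popularity-based: most-interacted-with items first.
by_popularity = sorted(counts, key=counts.get, reverse=True)

# (b) Average-rating-based, requiring at least 2 ratings per item.
by_avg_rating = sorted(
    (i for i in counts if counts[i] >= 2),
    key=lambda i: sums[i] / counts[i],
    reverse=True,
)
print(by_popularity, by_avg_rating)
```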
Non-Personalized Recommender Systems:

a. Item-Based Collaborative Filtering: Recommends items that are similar to a given item based on user interactions with those items. It doesn't consider individual user preferences, making it a non-personalized approach. Useful for suggesting "people who liked this also liked" items (see the sketch after this list).

b. User-Based Collaborative Filtering: Recommends items that other users with similar interaction patterns have liked. While it incorporates some level of personalization, it's not entirely personalized since it groups users with similar behaviors together.
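A minimal "people who liked this also liked" sketch (the purchase baskets are assumed for illustration): for a given item, count how often other items co-occur with it across user histories and return the most frequent ones.

```python
# Minimal item-to-item co-occurrence recommender on toy purchase baskets.
from collections import Counter

baskets = [                       # one set of items per user
    {"camera", "tripod", "sd_card"},
    {"camera", "sd_card"},
    {"camera", "lens"},
    {"tripod", "lens"},
]

def also_liked(item, top_n=2):
    # Count co-occurrences of every other item with the given item.
    co = Counter()
    for basket in baskets:
        if item in basket:
            co.update(basket - {item})
    return [i for i, _ in co.most_common(top_n)]

print(also_liked("camera"))       # e.g. ['sd_card', 'tripod']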

Hybrid Systems:
Hybrid recommender systems combine various recommendation
techniques, including non-personalized approaches. For example, a hybrid
system might use a popularity-based recommendation as a fallback for new
users without any interaction history.
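A minimal sketch of that fallback idea (the function and variable names here are assumptions, not a prescribed API): return personalized recommendations when the user has interaction history, otherwise fall back to the non-personalized popularity-based list.

```python
# Minimal hybrid fallback sketch: personalized if possible, popular otherwise.
def recommend(user, history, cf_recommender, popular_items, top_n=5):
    """Return personalized results if the user has history, else popular items."""
    if history.get(user):                      # user has past interactions
        return cf_recommender(user)[:top_n]
    return popular_items[:top_n]               # cold-start fallback

# Usage with a hypothetical new user and no collaborative-filtering results.
history = {"new_user": []}
popular = ["item_a", "item_b", "item_c"]
print(recommend("new_user", history, lambda u: [], popular))
```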
Overview of data mining methods for recommender systems:
Data mining methods play a crucial role in developing effective
recommender systems in data science. Recommender systems aim to
provide personalized recommendations to users, and data mining techniques
help extract valuable insights from user data and item information. Here is
an overview of data mining methods commonly used in recommender
systems:

1. Collaborative Filtering:
Collaborative filtering is one of the most widely used data mining
methods for recommendation. It relies on the behavior and preferences
of users to make recommendations.
 User-Based Collaborative Filtering: This method recommends
items to a user based on the preferences of users with similar
behaviors or historical interactions.
 Item-Based Collaborative Filtering: It recommends items
similar to those a user has interacted with in the past based on
item-item similarity.

2. Matrix Factorization:
 Matrix factorization techniques like Singular Value Decomposition (SVD) and Alternating Least Squares (ALS) are used to factorize the user-item interaction matrix into latent factors.
 These latent factors represent the underlying features or characteristics of users and items, enabling personalized recommendations.
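As an illustrative sketch (not a full ALS implementation), the interaction matrix can be factorized into low-rank user and item factor matrices with a few passes of stochastic gradient descent over the observed ratings; the data, rank, and hyperparameters below are assumptions.

```python
# Minimal latent-factor sketch: learn P (user factors) and Q (item factors)
# so that P[u] . Q[i] approximates observed ratings.
import numpy as np

rng = np.random.default_rng(0)
observed = [(0, 0, 5.0), (0, 1, 3.0), (1, 0, 4.0), (2, 1, 1.0)]  # (user, item, rating)
n_users, n_items, k = 3, 2, 2

P = rng.normal(scale=0.1, size=(n_users, k))   # user factors
Q = rng.normal(scale=0.1, size=(n_items, k))   # item factors

lr, reg = 0.05, 0.02
for _ in range(200):                            # plain SGD over the observed entries
    for u, i, r in observed:
        err = r - P[u] @ Q[i]
        pu = P[u].copy()                        # keep old user factors for the item update
        P[u] += lr * (err * Q[i] - reg * P[u])
        Q[i] += lr * (err * pu - reg * Q[i])

print(P @ Q.T)                                  # reconstructed rating estimates
```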
3. Content-Based Filtering:
 Content-based filtering recommends items to users based on
the content or features of items and the user's past preferences.
 Data mining methods, such as natural language processing
(NLP) and text analysis, are used to extract item features and
user profiles.

4. Association Rule Mining:


 Association rule mining, a common data mining technique,
is used to discover patterns and associations among items in
user transactions.
 It can be used to recommend items that are frequently purchased
together or items that exhibit a strong association based on user
behavior.
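A minimal sketch of the support and confidence behind such rules, computed for a single assumed rule "bread -> butter" over toy transactions:

```python
# Minimal association-rule sketch: support and confidence for bread -> butter.
transactions = [
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"bread", "jam"},
    {"milk", "jam"},
]
n = len(transactions)
both = sum(1 for t in transactions if {"bread", "butter"} <= t)
bread = sum(1 for t in transactions if "bread" in t)

support = both / n               # fraction of transactions containing both items
confidence = both / bread        # estimate of P(butter | bread)
print(support, confidence)       # 0.5, 0.666...
```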

5. Clustering and Segmentation:


 Clustering algorithms, like k-means or hierarchical clustering,
are used to group users or items with similar characteristics.
 Users or items within the same cluster can receive
recommendations based on the preferences of others in the
same cluster.
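A minimal clustering sketch (the user feature vectors are assumed for illustration), grouping users with scikit-learn's KMeans so that members of a cluster can share recommendations:

```python
# Minimal user-clustering sketch with k-means.
import numpy as np
from sklearn.cluster import KMeans

# Hypothetical per-user feature vectors (e.g. genre interaction counts).
user_features = np.array([
    [5, 0, 1],
    [4, 1, 0],
    [0, 5, 4],
    [1, 4, 5],
])
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(user_features)
print(labels)   # users with the same label can receive recommendations from their cluster
```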

6. Deep Learning:
 Deep learning techniques, particularly neural collaborative
filtering, utilize deep neural networks to capture complex
patterns and representations of user-item interactions.
 These models can learn intricate relationships between users
and items, leading to highly personalized recommendations.
7. Hybrid Methods:
 Hybrid recommender systems combine multiple data mining
methods to leverage their strengths and compensate for their
weaknesses.
 Hybrid systems may incorporate both collaborative filtering and content-based filtering, or a combination of any of the above methods.
8. Time-Series Analysis:
 For recommendations that change over time, time-series
analysis can be employed to capture temporal patterns in user
behavior.
 This is essential for applications like recommending news
articles or seasonal products.

9. Context-Aware Recommendation:
 Data mining methods can be used to incorporate contextual
information, such as user location, device, or time, into the
recommendation process.
 This is crucial for providing recommendations that are
sensitive to the user's current context.

10. Evaluation Metrics:
 Data mining methods also play a role in selecting appropriate evaluation metrics to measure the performance of recommender systems. Common metrics include Mean Absolute Error (MAE), Root Mean Square Error (RMSE), Precision, Recall, and F1-score.
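A minimal sketch of these metrics on assumed toy data: MAE and RMSE over predicted ratings, and precision, recall, and F1 over a top-N recommendation list versus the items the user actually liked.

```python
# Minimal evaluation-metric sketch: MAE, RMSE, precision, recall, F1.
import math

actual    = [4.0, 3.0, 5.0, 2.0]
predicted = [3.5, 3.0, 4.0, 2.5]

mae  = sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)
rmse = math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))

recommended = {"i1", "i2", "i3"}        # top-N list produced by the system (assumed)
relevant    = {"i2", "i3", "i5"}        # items the user actually liked (assumed)
hits = len(recommended & relevant)
precision = hits / len(recommended)
recall    = hits / len(relevant)
f1 = 2 * precision * recall / (precision + recall)

print(mae, rmse, precision, recall, f1)
```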

The choice of data mining method depends on the nature of the


data, the goals of the recommender system, and the available resources.
Similarity measures:
Similarity in a recommender system is about finding items (or users, or a user and an item) that are similar. How to measure it depends on which type of recommender you use. In collaborative filtering, two items are similar if a certain number of people like or dislike them in the same way.
Similarity Measures for Collaborative Filtering based
Recommender Systems
1. Jaccard Similarity:
 Used to measure the similarity between sets.
 It calculates the size of the intersection divided by the size of the union of two sets.
 Commonly applied in document similarity and recommendation systems.

2. Euclidean Distance:
 Calculates the straight-line distance between two points in Euclidean space.
 Appropriate for continuous data.
 Commonly used in k-means clustering and hierarchical clustering.

3. Manhattan Distance (L1 Norm):
 Measures the distance between two points by summing the absolute differences of their coordinates.
 Suitable for data with attributes that have different units or scales.
 Used in k-medians clustering.
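A minimal Python sketch of these three measures, computed on assumed toy data:

```python
# Minimal similarity/distance measures: Jaccard, Euclidean, Manhattan.
import math

def jaccard(a: set, b: set) -> float:
    """|A intersect B| / |A union B| for two sets of items."""
    return len(a & b) / len(a | b) if (a | b) else 0.0

def euclidean(x, y) -> float:
    """Straight-line (L2) distance between two points."""
    return math.sqrt(sum((xi - yi) ** 2 for xi, yi in zip(x, y)))

def manhattan(x, y) -> float:
    """Sum of absolute coordinate differences (L1 distance)."""
    return sum(abs(xi - yi) for xi, yi in zip(x, y))

print(jaccard({"i1", "i2", "i3"}, {"i2", "i3", "i4"}))  # 0.5
print(euclidean([5, 3, 0], [4, 3, 1]))                   # ~1.414
print(manhattan([5, 3, 0], [4, 3, 1]))                   # 2
```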
Dimensionality reduction:
1. Matrix Factorization :
 Singular Value Decomposition (SVD): SVD is a popular technique for
matrix factorization. It decomposes the user-item interaction matrix into
three matrices: U (user features), Σ (singular values), and V (item
features). You can retain only the top-k singular values to reduce
dimensionality.
 Alternating Least Squares (ALS): ALS is an iterative optimization
algorithm commonly used for matrix factorization in collaborative
filtering. It alternates between optimizing user and item factors.

2. Principal Component Analysis (PCA) :


 PCA is a dimensionality reduction technique that can be applied to user and
item feature matrices to extract the most important components while
reducing dimensionality. It is useful when there are
many irrelevant or redundant features.
3. Non-Negative Matrix Factorization (NMF):
 NMF factorizes the user-item interaction matrix into two non-negative matrices, which can be interpreted as user and item features.
 NMF is suitable for non-negative data, such as ratings or counts.
4. Autoencoders :
 Autoencoders are neural network architectures that can learn a
lower-dimensional representation of the input data. They can be used for
both collaborative filtering and content-based recommendation systems.

5. Truncated SVD :
 Similar to PCA, Truncated Singular Value Decomposition (Truncated
SVD) reduces the dimensionality of the user-item interaction matrix by
retaining the top-k singular values and their corresponding
singular vectors.
6. Random Projection :
 Random projection methods reduce dimensionality by projecting data
onto a lower-dimensional subspace using random matrices. They are
computationally efficient and useful for large-scale
recommendation systems.
7. Sparse Matrix Factorization:
 When dealing with very sparse user-item interaction matrices, specialized sparse matrix factorization techniques can be used to optimize recommendation algorithms while handling the sparsity.

8. Word Embeddings:
 In content-based recommendation, word embeddings (e.g., Word2Vec, GloVe) can be used to represent textual content in a lower-dimensional space. These embeddings capture semantic relationships between words and items.
9. Tensor Factorization:
 For recommendation systems with multi-dimensional data (e.g., user, item, and context), tensor factorization techniques can be applied to reduce the dimensionality of the data.
10. Feature Selection:
 Instead of reducing the dimensionality of the entire dataset, you can select a subset of relevant features for the recommendation task. Feature selection methods help retain the most informative attributes.

Singular Value Decomposition (SVD):

 Singular Value Decomposition (SVD) representation of a


matrix A involves expressing A as a product of three separate
matrices: U, Σ, and V^T.

Here's how these matrices are represented:

1. Original Matrix (A):
- The original matrix A is typically represented as an m x n matrix, where m is the number of rows (usually representing users) and n is the number of columns (typically representing items).

2. Left Singular Vectors Matrix (U):
- The left singular vectors matrix U is an m x m orthogonal (or unitary) matrix.
- U represents users in a lower-dimensional latent space, capturing user preferences.
- Each column of U is a unit vector that represents a user's position in the latent space.

3. Singular Values Matrix (Σ):
- The singular values matrix Σ is an m x n diagonal matrix with non-negative singular values on the diagonal.
- The singular values are typically ordered in descending order, with the most significant singular values at the top.
- The singular values represent the importance of each latent feature or component in the decomposition.

4. Right Singular Vectors Matrix (V^T):
- The right singular vectors matrix V^T is an n x n orthogonal (or unitary) matrix.
- V^T represents items in the same lower-dimensional latent space, capturing item characteristics.
- Each row of V^T is a unit vector that represents an item's position in the latent space.

Mathematically, the SVD representation of a matrix A can be expressed as:



A = UΣV^T
Where:

- U is an m x m matrix of left singular vectors.

- Σ is an m x n diagonal matrix of singular values.

- V^T is an n x n matrix of right singular vectors.

 The SVD representation allows for dimensionality reduction, as you can choose to keep only the top-k singular values and the corresponding columns of U and rows of V^T to approximate the original matrix A.
 This lower-dimensional approximation captures the most significant information and is used in applications like matrix approximation, recommendation systems, and feature reduction.
 In practical applications, SVD is often used with real-world data to uncover latent patterns and relationships, making it a powerful tool in data analysis, recommendation systems, and other domains.
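A minimal numerical sketch of this truncation using NumPy's SVD routine (the rating matrix and the value of k are assumptions for illustration):

```python
# Minimal truncated-SVD sketch: decompose an assumed user-item matrix A into
# U, Σ, V^T, keep the top-k singular values, and form a rank-k approximation.
import numpy as np

A = np.array([          # rows = users, columns = items (toy ratings)
    [5.0, 4.0, 0.0, 1.0],
    [4.0, 5.0, 1.0, 0.0],
    [0.0, 1.0, 5.0, 4.0],
])

U, s, Vt = np.linalg.svd(A, full_matrices=False)   # A = U @ diag(s) @ Vt

k = 2                                              # number of latent factors kept
A_k = U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]        # rank-k approximation of A

print(np.round(A_k, 2))
```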

ADVANTAGES:

1. Dimensionality Reduction: SVD allows for dimensionality


reduction by retaining only the top-k singular values and their
corresponding vectors. This reduces the data's dimensionality while
preserving the most significant information, which is useful in
applications like data compression and noise reduction.

2. Noise Reduction: SVD can be used to separate signals from noise


in data, making it particularly useful in signal processing and image
analysis.

3. Matrix Approximation: SVD provides a lower-rank


approximation of a matrix, which can be helpful in simplifying
complex data structures while maintaining the essential patterns and
relationships within the data.

4. Latent Factor Discovery: In recommendation systems and data


mining, SVD is used to discover latent factors or hidden patterns
within data, enabling more accurate recommendations and insights.

5. Collaborative Filtering: SVD-based collaborative filtering is


effective for generating personalized recommendations, particularly in
situations with sparse user-item interaction data.

6. Principal Component Analysis (PCA): PCA, a dimensionality


reduction technique, is a specific case of SVD applied to the data
covariance matrix.

7. Efficiency and Numerical Stability: SVD is a numerically stable


and well-established technique that is efficient to compute, and there
are numerous optimized algorithms and libraries available for its
implementation.

8. Interpretability: In some applications, the singular vectors extracted


through SVD provide interpretable insights, making it easier to
understand the underlying patterns in the data.

Applications in various fields:


SVD has various applications in fields like image processing, natural
language processing, and recommendation systems.
