RS Unit 1

A recommendation system is an AI algorithm that uses big data to suggest additional products to consumers based on criteria like purchase history, search history, demographics, and other factors. There are several types of recommender systems including content-based filtering, collaborative filtering, hybrid systems, and knowledge-based systems. Traditional recommender systems rely on general trends rather than personalization, recommending popular, trending, top-rated, or new items. Non-personalized systems make recommendations without considering individual preferences, instead focusing on demographics, context, rules, or random suggestions. Dimensionality reduction techniques like matrix factorization and feature transformation simplify data while preserving important information to build more efficient recommendation models.

Uploaded by

2k21cse093

UNIT 1

INTRODUCTION

What Is a Recommendation System?

• A recommendation system is an artificial intelligence (AI) algorithm,
usually associated with machine learning, that uses Big Data to suggest
or recommend additional products to consumers.
• These can be based on various criteria, including past purchases, search
history, demographic information, and other factors. Recommender
systems are highly useful as they help users discover products and
services.
• They play a vital role in enhancing user experience, increasing user
engagement, and driving business revenue in many online platforms
such as e-commerce, streaming services, social networks, and more.
Basic Taxonomy of Recommender Systems:
Recommender systems can be broadly categorized into several types based on
their underlying techniques and methodologies. Here are some common types:

Content-Based Filtering:
Content-based filtering recommends items to users based on the similarity
between the content of the items and the user's preferences. It analyzes the
attributes or features of items and suggests items that are similar to those the
user has liked in the past. For example, in a movie recommendation system, if a
user has liked action movies in the past, the system might recommend other
action movies with similar themes or actors.
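
As a minimal sketch of the idea above, the following assumes that each movie is described by a set of attribute tags and that the user profile is simply the union of the tags of liked movies. The catalogue, the tag names, and the function names are hypothetical, chosen only for illustration.

```python
def jaccard(a, b):
    """Overlap between two attribute sets, from 0 (disjoint) to 1 (identical)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def recommend_content_based(item_features, liked, top_n=2):
    """Rank items the user has not liked yet by similarity to the user profile."""
    # User profile: every attribute that appears in an item the user liked.
    profile = set().union(*(item_features[i] for i in liked))
    scores = {
        item: jaccard(features, profile)
        for item, features in item_features.items()
        if item not in liked
    }
    # Highest score first; ties are broken alphabetically.
    return sorted(scores, key=lambda item: (-scores[item], item))[:top_n]


# Hypothetical catalogue: each movie is described by genre and lead-actor tags.
movies = {
    "Die Hard": {"action", "thriller", "willis"},
    "Speed": {"action", "thriller", "reeves"},
    "John Wick": {"action", "reeves"},
    "Notting Hill": {"romance", "comedy"},
}
recommendations = recommend_content_based(movies, liked={"Die Hard", "Speed"})
```

Here the action movie sharing tags with the liked movies is ranked above the romantic comedy, which shares none.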
Collaborative Filtering:
• Collaborative filtering recommends items to users based on the
preferences of other users. It identifies patterns of user behavior by
analyzing interactions (e.g., ratings, purchases) between users and items.
There are two main approaches within collaborative filtering:
• User-Based Collaborative Filtering: This approach recommends items to a
user based on the preferences of users who are similar to them. If users
A and B have similar tastes and preferences, items liked by user B but not
yet seen by user A might be recommended to user A.
• Item-Based Collaborative Filtering: In this approach, similarities between
items are calculated based on the ratings given to them by users. Items
that are highly rated by users who have also rated the current item
highly are recommended to the user.
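
The user-based approach can be sketched as follows. This is one possible formulation, not the only one: it assumes ratings are stored as nested dictionaries, uses cosine similarity between users (unrated items count as 0), and predicts a rating as the similarity-weighted average of other users' ratings. The users, movies, and ratings are made up for illustration.

```python
from math import sqrt


def cosine(u, v):
    """Cosine similarity between two users' rating dicts (missing items count as 0)."""
    dot = sum(u[i] * v[i] for i in set(u) & set(v))
    norm_u = sqrt(sum(r * r for r in u.values()))
    norm_v = sqrt(sum(r * r for r in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def predict(ratings, user, item):
    """Similarity-weighted average of other users' ratings for `item`."""
    num = den = 0.0
    for other, other_ratings in ratings.items():
        if other == user or item not in other_ratings:
            continue
        sim = cosine(ratings[user], other_ratings)
        num += sim * other_ratings[item]
        den += abs(sim)
    return num / den if den else None  # None: no neighbour has rated the item


# Hypothetical ratings on a 1-5 scale.
ratings = {
    "A": {"Inception": 5, "Heat": 4},
    "B": {"Inception": 5, "Heat": 4, "Ronin": 5},
    "C": {"Inception": 1, "Titanic": 5, "Ronin": 2},
}
predicted = predict(ratings, "A", "Ronin")
```

User A has not seen "Ronin". Because A's ratings resemble B's far more than C's, the prediction is pulled towards B's high rating rather than C's low one.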
Hybrid Recommender Systems:
Hybrid recommender systems combine multiple recommendation techniques
to provide more accurate and diverse recommendations. These systems
leverage the strengths of different approaches to overcome the limitations of
individual methods. For instance, a hybrid system might combine collaborative
filtering with content-based filtering to provide recommendations that consider
both user preferences and item attributes.
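
One simple way to combine two recommenders is a weighted blend of their scores. The sketch below assumes both components already produce scores on a comparable 0 to 1 scale; the weight `alpha` and the score values are illustrative assumptions, and other hybrid designs (switching, cascading) are equally valid.

```python
def hybrid_scores(cf_scores, cb_scores, alpha=0.7):
    """Blend collaborative (cf) and content-based (cb) scores.

    alpha is the weight on the collaborative score; an item missing from
    one source is scored 0 by that source.
    """
    items = set(cf_scores) | set(cb_scores)
    return {
        item: alpha * cf_scores.get(item, 0.0)
        + (1 - alpha) * cb_scores.get(item, 0.0)
        for item in items
    }


# Hypothetical scores from the two component recommenders.
cf = {"Heat": 0.9, "Ronin": 0.4}
cb = {"Ronin": 0.8, "Speed": 0.6}
blended = hybrid_scores(cf, cb, alpha=0.5)
ranking = sorted(blended, key=blended.get, reverse=True)
```

With equal weights, "Ronin" comes first because both components support it, even though neither ranks it highest on its own.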
Knowledge-Based Recommender Systems:
Knowledge-based recommender systems make recommendations based on
explicit knowledge about user preferences and item properties. These systems
typically require a knowledge base or domain-specific information to generate
recommendations. They are particularly useful when there is limited user data
available or when explicit user preferences are known.
Context-Aware Recommender Systems:
Context-aware recommender systems take into account contextual information
such as time, location, and device to provide more relevant recommendations.
By considering the context in which users interact with the system, these
systems can offer personalized recommendations tailored to the user's current
situation or environment.

TRADITIONAL AND NON-PERSONALIZED RECOMMENDER SYSTEMS:
Traditional Recommender Systems:
Traditional recommender systems are those that rely on general trends or
popularity to make recommendations rather than personalized user data.
These systems often employ straightforward algorithms that are not tailored to
individual user preferences. Here are a few examples:

1. Popularity-Based Recommender Systems:
These systems recommend items that are popular among all users or
within a specific demographic. For example, a movie streaming platform
might recommend the most-watched movies of the week to all users,
regardless of their individual tastes.

2. Trending Recommendations:
Similar to popularity-based systems, trending recommendations suggest
items that are currently popular or experiencing a surge in interest. This
approach is commonly used in social media platforms, where users are
shown trending topics, hashtags, or content.

3. Top-Rated Recommendations:
Some recommender systems simply recommend items that have
received the highest ratings or reviews from users. For instance, an e-
commerce website might showcase products with the highest average
ratings in a particular category.

4. New Arrivals or Latest Additions:
Another traditional approach is to recommend new items or additions to
the catalog. This can be particularly useful for platforms with regularly
updated content, such as streaming services or online retailers.
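
The popularity-based and top-rated approaches above amount to simple counting and averaging. The sketch below assumes interaction logs are lists of tuples; the `min_ratings` threshold is an added assumption (not part of the notes) that stops an item with a single 5-star rating from topping the list.

```python
from collections import Counter, defaultdict


def most_popular(interactions, top_n=2):
    """Items ranked by number of interactions (views, purchases, ...)."""
    counts = Counter(item for _, item in interactions)
    return [item for item, _ in counts.most_common(top_n)]


def top_rated(ratings, min_ratings=2, top_n=2):
    """Items ranked by mean rating, ignoring items with too few ratings."""
    by_item = defaultdict(list)
    for _, item, rating in ratings:
        by_item[item].append(rating)
    means = {
        item: sum(rs) / len(rs)
        for item, rs in by_item.items()
        if len(rs) >= min_ratings
    }
    return sorted(means, key=lambda item: (-means[item], item))[:top_n]


# Hypothetical logs: (user, item) views and (user, item, rating) ratings.
views = [("u1", "Heat"), ("u2", "Heat"), ("u3", "Heat"),
         ("u1", "Ronin"), ("u2", "Ronin"), ("u3", "Speed")]
rated = [("u1", "Heat", 4), ("u2", "Heat", 3), ("u1", "Ronin", 5),
         ("u2", "Ronin", 4), ("u3", "Speed", 5)]

popular = most_popular(views)
best = top_rated(rated)
```

Every user receives the same two lists, which is what makes these approaches non-personalized.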

Non-Personalized Recommender Systems:
Non-personalized recommender systems make recommendations without
considering individual user preferences or behavior. Instead, they focus on
providing generic recommendations that apply to all users equally. Here are a
few examples:

1. Demographic-Based Recommendations:
These systems recommend items based on demographic information
such as age, gender, or location. For instance, a music streaming service
might suggest playlists curated for specific age groups or geographic
regions.

2. Context-Based Recommendations:
Non-personalized recommendations can also be based on contextual
information such as time of day, weather, or location. For example, a
restaurant recommendation app might suggest nearby dining options
based on the user's current location.

3. Rule-Based Systems:
Rule-based recommender systems use predefined rules or heuristics to
generate recommendations. These rules may be based on domain
knowledge or expert opinion rather than user data. For instance, a recipe
website might recommend side dishes based on the main course
selected by the user.

4. Random Recommendations:
In some cases, recommender systems may provide random
recommendations when personalized or data-driven suggestions are not
available or applicable. This approach can be used as a fallback option or to
introduce users to new content.
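
A rule-based recommender with a random fallback can be sketched in a few lines. The rule table and dish names below are invented for illustration; in practice the rules would come from domain experts.

```python
import random

# Hypothetical expert rules: main course -> suggested side dishes.
SIDE_DISH_RULES = {
    "steak": ["mashed potatoes", "grilled asparagus"],
    "curry": ["naan", "basmati rice"],
}
ALL_SIDES = ["mashed potatoes", "grilled asparagus", "naan",
             "basmati rice", "green salad"]


def recommend_sides(main_course, k=2, rng=random):
    """Apply a hand-written rule if one matches; otherwise pick at random."""
    rule = SIDE_DISH_RULES.get(main_course.lower())
    if rule is not None:
        return rule[:k]
    # Fallback: no rule applies, so sample k distinct side dishes.
    return rng.sample(ALL_SIDES, k)


by_rule = recommend_sides("Steak")
fallback = recommend_sides("paella", rng=random.Random(0))
```

Passing a seeded `random.Random` instance makes the fallback reproducible, which is convenient for testing.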

Data Mining:
• Data mining can be defined as the “non-trivial extraction of meaningful
information from large amounts of data by automatic or semi-automatic
means”.
• Data mining uses methods and techniques drawn from machine
learning, artificial intelligence, statistics, and database systems. However,
most of these “traditional” techniques need to be adapted to account for
the high dimensionality and heterogeneity of data that is pervasive in
data mining problems.
• The process of data mining typically consists of 3 steps, carried out in
succession: Data Preprocessing, Data Analysis, and Result Interpretation.
Examples of Predictive Analysis: Data mining can be applied to various
domains, such as predicting crop yields, assessing the likelihood of a person
having a disease based on symptoms, forecasting sales of groceries, estimating
the number of customers purchasing clothes, and projecting the expected
profit or loss percentage in the coming year.

Objective: The main objective of data mining is to analyze patterns in the
dataset and obtain useful information relevant to the desired outcome or
target. This involves training accurate models, identifying relevant patterns, and
ensuring a sufficient amount of data for accurate and efficient results.

Techniques Used: Data processing in data mining draws upon statistical
methodologies such as sampling, estimation, and hypothesis testing. It also
involves employing search algorithms, modeling techniques, and learning
theories from computing, pattern recognition, and machine learning to analyze
and interpret the data effectively.
Dimensionality Reduction
• Dimensionality reduction is a technique used to reduce the number of
input variables or features in a dataset while preserving the most
important information.
• In other words, it simplifies the dataset by representing it in a lower-
dimensional space, where each dimension captures a combination of the
original features.
• This process helps in addressing the curse of dimensionality, which
refers to the challenges associated with high-dimensional data, such as
increased computational complexity and overfitting.
The dimensionality reduction techniques that are commonly used in
recommender systems:

• Matrix Factorization: Matrix factorization methods, such as Singular
Value Decomposition (SVD), factorize the user-item interaction matrix
into lower-dimensional matrices representing users and items. By doing
so, they uncover latent factors (e.g., user preferences and item
characteristics) that influence user-item interactions. These latent factors
are then used to make personalized recommendations.

• Feature Transformation: Dimensionality reduction techniques like
Principal Component Analysis (PCA) can be applied to transform
high-dimensional item feature vectors into a lower-dimensional space. This
reduces the complexity of the feature space while preserving as much
variance as possible. The transformed feature vectors can then be used
to compute item similarities or to build more efficient recommendation
models.

• Embedding Learning: Deep learning techniques, such as autoencoders
and neural network-based embeddings, can learn low-dimensional
representations (embeddings) of users and items directly from raw or
pre-processed data. These embeddings capture semantically meaningful
representations of users and items in a lower-dimensional space, which
can improve the quality of recommendations and enable efficient
computation.
• Clustering and Neighbourhood Methods: Dimensionality reduction can
also be employed indirectly in recommender systems through clustering
or neighbourhood-based methods. By grouping similar users or items
into clusters or neighbourhoods, the dimensionality of the
recommendation problem can be effectively reduced. Users or items
within the same cluster or neighbourhood are then treated as
representatives of the cluster or neighbourhood, simplifying the
recommendation process.
Overall, dimensionality reduction techniques play a crucial role in
recommender systems by simplifying the representation of user-item
interaction data, capturing latent factors or patterns, and improving the
efficiency and effectiveness of recommendation algorithms. These techniques
enable recommender systems to handle large-scale datasets and provide
personalized recommendations to users efficiently.
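
To make the feature-transformation idea concrete, the sketch below reduces 2-dimensional item feature vectors to 1 dimension by projecting them onto the first principal component. It is an illustration under stated assumptions, not a production PCA: the component is found by power iteration on the sample covariance matrix, the function names and data are hypothetical, and the method assumes the data is not constant and the starting vector is not orthogonal to the leading direction.

```python
from math import sqrt


def first_principal_component(rows, iterations=100):
    """Return (feature means, unit direction of greatest variance)."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    # Sample covariance matrix (d x d).
    cov = [[sum(c[a] * c[b] for c in centered) / (n - 1) for b in range(d)]
           for a in range(d)]
    v = [1.0] * d
    for _ in range(iterations):
        # Power iteration: repeatedly multiply by cov and renormalise.
        w = [sum(cov[a][b] * v[b] for b in range(d)) for a in range(d)]
        norm = sqrt(sum(x * x for x in w))
        if norm == 0:  # constant data: no direction of variance
            break
        v = [x / norm for x in w]
    return means, v


def project(rows, means, v):
    """1-D coordinate of each row along direction v."""
    return [sum((r[j] - means[j]) * v[j] for j in range(len(v))) for r in rows]


# Hypothetical item features in which the second feature is twice the first.
items = [[1.0, 2.0], [2.0, 4.0], [3.0, 6.0], [4.0, 8.0]]
means, direction = first_principal_component(items)
coords = project(items, means, direction)
```

Because the two features are perfectly correlated here, a single coordinate per item retains all of the variance; with real data some variance is lost.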

Singular Value Decomposition

Singular Value Decomposition (SVD) is a method from linear algebra that is
widely used as a dimensionality reduction technique in machine learning.
SVD is a matrix factorisation technique that reduces the number of
features of a dataset by reducing the dimension of the space from N to K
(where K < N). In the context of recommender systems, SVD is used as a
collaborative filtering technique. It works on a matrix in which each row
represents a user and each column represents an item. The elements
of this matrix are the ratings that users have given to items.

The factorisation of this matrix is done by the singular value decomposition,
which extracts latent factors from the high-level (user-item-rating) matrix.
The singular value decomposition decomposes a matrix into three other
matrices:

$$A = U S V^{T}$$

where A is an m x n utility matrix, U is an m x r orthogonal left singular
matrix, which represents the relationship between users and latent factors,
S is an r x r diagonal matrix, which describes the strength of each latent
factor, and V is an n x r orthogonal right singular matrix (so V^T is r x n),
which indicates the similarity between items and latent factors.
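
As a sketch of how this factorisation reduces dimension, assume the singular values σ_1 ≥ σ_2 ≥ ... ≥ σ_r on the diagonal of S are sorted in decreasing order (the usual convention), and let k < r be a chosen number of latent factors. The symbols k, σ_j, u_j, and v_j (the j-th columns of U and V) are introduced here for illustration. Keeping only the k largest singular values gives a rank-k approximation of A, and the squared error of that approximation is the sum of the discarded squared singular values:

```latex
A = U S V^{T} = \sum_{j=1}^{r} \sigma_j \, u_j v_j^{T},
\qquad
A_k = U_k S_k V_k^{T} = \sum_{j=1}^{k} \sigma_j \, u_j v_j^{T},
\qquad
\lVert A - A_k \rVert_F^{2} = \sum_{j=k+1}^{r} \sigma_j^{2}
```

So if the first few singular values dominate, a small k already reproduces most of the utility matrix.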

The latent factors here are the characteristics of the items, for example, the
genre of the music. The SVD decreases the dimension of the utility matrix A by
extracting its latent factors. It maps each user and each item into an
r-dimensional latent space. This mapping facilitates a clear representation of
relationships between users and items.

Let each item be represented by a vector x_i and each user by a vector y_u.
The expected rating by user u on item i can be given as:

$$\hat{r}_{ui} = x_i^{T} y_u$$

Here, \hat{r}_{ui} is a form of factorisation in singular value decomposition.
The x_i and y_u can be obtained such that the squared error between their dot
product and the known rating r_{ui} in the user-item matrix is minimum. It can
be expressed as:

$$\min_{x, y} \sum_{(u,i) \in \mathcal{K}} \left( r_{ui} - x_i^{T} y_u \right)^{2}$$

where \mathcal{K} is the set of (u, i) pairs for which a rating is known.
In order to let the model generalise well and not overfit the training data, a
regularisation term is added as a penalty to the above formula. In order to
reduce the error between the value predicted by the model and the actual
value, the algorithm uses a bias term. For a user-item pair (u, i), let μ be
the average rating of all items, b_i the average rating of item i minus μ, and
b_u the average rating given by user u minus μ. The final equation after adding
the regularisation term and bias can be given as:

$$\min_{x, y, b} \sum_{(u,i) \in \mathcal{K}} \left( r_{ui} - \mu - b_u - b_i - x_i^{T} y_u \right)^{2} + \lambda \left( \lVert x_i \rVert^{2} + \lVert y_u \rVert^{2} + b_u^{2} + b_i^{2} \right)$$

where λ is the regularisation weight. The above equation is the main
component of the algorithm behind a singular value decomposition based
recommendation system.
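
A regularised, bias-aware objective of this kind (squared error on known ratings, with prediction μ + b_u + b_i + x_i · y_u, plus a penalty on the size of the factors and biases) is commonly minimised by stochastic gradient descent. The sketch below is one such training loop under stated assumptions: the learning rate, regularisation weight, number of factors, epochs, and the toy ratings are all illustrative choices, and the biases are learned by gradient descent rather than computed as fixed averages.

```python
import random
from math import sqrt


def train_mf(ratings, n_factors=2, lr=0.01, reg=0.02, epochs=500, seed=0):
    """Fit mu, b_u, b_i, x_i, y_u by SGD on the regularised squared error."""
    rng = random.Random(seed)
    mu = sum(r for _, _, r in ratings) / len(ratings)  # global mean rating
    users = sorted({u for u, _, _ in ratings})
    items = sorted({i for _, i, _ in ratings})
    b_u = {u: 0.0 for u in users}
    b_i = {i: 0.0 for i in items}
    # Small random starting factors; all-zero factors would never move.
    y = {u: [rng.gauss(0, 0.1) for _ in range(n_factors)] for u in users}
    x = {i: [rng.gauss(0, 0.1) for _ in range(n_factors)] for i in items}
    for _ in range(epochs):
        for u, i, r in ratings:
            pred = mu + b_u[u] + b_i[i] + sum(a * b for a, b in zip(x[i], y[u]))
            err = r - pred
            # Gradient steps (the constant factor 2 is absorbed into lr).
            b_u[u] += lr * (err - reg * b_u[u])
            b_i[i] += lr * (err - reg * b_i[i])
            for f in range(n_factors):
                xf, yf = x[i][f], y[u][f]
                x[i][f] += lr * (err * yf - reg * xf)
                y[u][f] += lr * (err * xf - reg * yf)
    return mu, b_u, b_i, x, y


def predict(model, u, i):
    """Predicted rating for a user and item that both appeared in training."""
    mu, b_u, b_i, x, y = model
    return mu + b_u[u] + b_i[i] + sum(a * b for a, b in zip(x[i], y[u]))


def rmse(model, ratings):
    return sqrt(sum((r - predict(model, u, i)) ** 2 for u, i, r in ratings)
                / len(ratings))


# Hypothetical (user, item, rating) triples on a 1-5 scale.
ratings = [("A", "Heat", 5), ("A", "Ronin", 4), ("A", "Titanic", 1),
           ("B", "Heat", 4), ("B", "Ronin", 5),
           ("C", "Titanic", 5), ("C", "Heat", 1), ("C", "Ronin", 2)]
model = train_mf(ratings)
mean_rating = model[0]
baseline_rmse = sqrt(sum((r - mean_rating) ** 2 for _, _, r in ratings)
                     / len(ratings))
trained_rmse = rmse(model, ratings)
unseen = predict(model, "B", "Titanic")  # B has not rated Titanic
```

Comparing `trained_rmse` with `baseline_rmse` (always predicting the global mean) shows how much the biases and latent factors explain on the training data; a held-out set would be needed to judge generalisation.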

Similarity Measures

Similarity measures play a crucial role in recommender systems as they are
used to quantify the similarity between users, items, or their representations in
the feature space. These measures help in identifying items that are similar to
those already interacted with by a user or in finding users with similar
preferences. Here are some common similarity measures used in
recommender systems:

• Cosine Similarity: Cosine similarity measures the cosine of the angle
between two vectors in the feature space. It is widely used in
recommender systems to compute similarity between item vectors or
user vectors in collaborative filtering-based approaches. Cosine similarity
ranges from -1 to 1, where 1 indicates vectors pointing in the same
direction, 0 indicates orthogonal vectors (no similarity), and -1 indicates
vectors pointing in opposite directions. For vectors with only
non-negative entries, such as raw ratings, it lies between 0 and 1.

• Pearson Correlation Coefficient: Pearson correlation coefficient
measures the linear correlation between two variables. In recommender
systems, Pearson correlation is often used to compute similarity between
users based on their rating patterns. It ranges from -1 to 1, where 1
indicates perfect positive correlation, 0 indicates no correlation, and -1
indicates perfect negative correlation.

• Jaccard Similarity: Jaccard similarity measures the similarity between
two sets by comparing their intersection to their union. It is commonly
used in item-based collaborative filtering to compute similarity between
items based on the overlap of users who have interacted with them.
Jaccard similarity ranges from 0 to 1, where 1 indicates identical sets and
0 indicates no overlap.
• Euclidean Distance: Euclidean distance measures the straight-line
distance between two points in the feature space. It is often used to
compute dissimilarity between items or users represented as vectors.
Smaller distances indicate higher similarity between points.

• Manhattan Distance: Manhattan distance (also known as city block
distance or L1 distance) measures the sum of the absolute differences
between corresponding coordinates of two points. Unlike Euclidean
distance, it measures distance along paths parallel to the coordinate axes
rather than along a straight line. Manhattan distance is often used in
recommendation systems to measure dissimilarity between items or users;
smaller distances indicate higher similarity.

• Spearman Rank Correlation: Spearman rank correlation measures the
strength and direction of the monotonic relationship between two
variables. It is particularly useful when dealing with ordinal data or when
the relationship between variables is not linear. Spearman correlation is
used in recommender systems to compute similarity based on rankings
or preferences.

• TF-IDF (Term Frequency-Inverse Document Frequency): TF-IDF is a
statistical measure used to evaluate the importance of a term in a
document relative to a collection of documents. It is a weighting scheme
rather than a similarity measure in itself: content-based recommender
systems commonly use it to turn documents or items into weighted term
vectors, based on the frequency of terms and their rarity in the corpus,
and then compare those vectors with a measure such as cosine similarity.
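
As an illustration of TF-IDF in a content-based setting, the sketch below weights the terms of three short item descriptions and compares the resulting vectors with cosine similarity. It assumes one common variant of the weighting (term frequency divided by document length, times the natural log of N over document frequency); other variants exist, and the descriptions are invented.

```python
from collections import Counter
from math import log, sqrt


def tfidf_vectors(docs):
    """Map each tokenised document to a {term: tf * idf} dict."""
    n = len(docs)
    # Document frequency: in how many documents each term occurs.
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        vectors.append({t: (c / len(doc)) * log(n / df[t]) for t, c in tf.items()})
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse {term: weight} vectors."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm = sqrt(sum(w * w for w in a.values())) * sqrt(sum(w * w for w in b.values()))
    return dot / norm if norm else 0.0


# Hypothetical item descriptions, already lower-cased and split into tokens.
docs = [
    "space adventure with alien robots".split(),
    "alien invasion space battle".split(),
    "romantic comedy in paris".split(),
]
v0, v1, v2 = tfidf_vectors(docs)
sim_scifi = cosine(v0, v1)
sim_romance = cosine(v0, v2)
```

The two science-fiction descriptions share the terms "space" and "alien" and so have a positive similarity, while the first and third share no terms and score 0.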

These similarity measures help recommender systems identify relevant items
for users or similar users for personalized recommendations. The choice of
similarity measure depends on factors such as the nature of the data, the
recommendation algorithm used, and the specific requirements of the
application.
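
The vector-based and set-based measures listed above are short enough to write out directly. The sketch below assumes users or items are given as equal-length lists of numbers (or as sets, for Jaccard); the example values are made up. It computes Pearson correlation as the cosine similarity of mean-centred vectors, which is a standard identity.

```python
from math import sqrt


def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def pearson(a, b):
    """Pearson correlation = cosine similarity of the mean-centred vectors."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    return cosine_similarity([x - mean_a for x in a], [y - mean_b for y in b])


def jaccard(s, t):
    union = s | t
    return len(s & t) / len(union) if union else 0.0


def euclidean(a, b):
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def manhattan(a, b):
    return sum(abs(x - y) for x, y in zip(a, b))


# Two hypothetical users who rate the same three items; v is always 1 lower.
u = [5, 3, 4]
v = [4, 2, 3]
rating_corr = pearson(u, v)
rating_cos = cosine_similarity(u, v)
overlap = jaccard({"u1", "u2", "u3"}, {"u2", "u3", "u4"})
straight = euclidean([0, 0], [3, 4])
blocks = manhattan([0, 0], [3, 4])
```

The example shows why the choice matters: the two users have a Pearson correlation of 1 because their ratings rise and fall together, even though their raw rating vectors are not identical.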
