Recommended System
Recommender systems are a subset of information filtering systems that use Machine Learning
algorithms to suggest products similar to those a user has previously interacted with.
Recommender Systems in Machine Learning work with a huge amount of data to filter out
suggestions efficiently. They are often considered among the most profitable applications of Machine
Learning for businesses because of their user-specific approach and their potential to expand into
various categories like text, videos, movies, books, e-commerce products, etc.
Or
A recommendation system is an artificial intelligence or AI algorithm, usually associated with
machine learning, that uses Big Data to suggest or recommend additional products to
consumers.
These can be based on various criteria, including past purchases, search history, demographic
information, and other factors.
Types of RS:
1. Collaborative filtering
2. Content-based filtering (or recommendation)
3. Hybrid
4. Deep learning
Explain each in detail:
1.Collaborative Filtering:
Collaborative filtering is a technique that can filter out items that a user might like on the basis
of reactions by similar users.
It works by searching a large group of people and finding a smaller set of users with tastes
similar to a particular user. It looks at the items they like and combines them to create a ranked
list of suggestions.
There are many ways to decide which users are similar and combine their choices to create a
list of recommendations.
Collaborative Filtering recommender systems in machine learning work on two parameters -
Similarity between users and similarity between products. Based on these results, suggestions
are drawn to the user.
This concept is widely used in recommending movies, news, applications, and so many
other items.
Types of Collaborative Filtering
There are two types of Collaborative Filtering available:
1. User-User-based similarity/Collaborative Filtering:
User-based collaborative filtering (UB-CF) in ML is a method for predicting the products that a
user may like based on the ratings given to those products by other users.
Or
UB-CF is a technique used to predict the items that a user might like on the basis of ratings
given to those items by other users who have tastes similar to those of the target user.
Step 1: Finding the similarity of users to the target user U. Similarity for any two users ‘a’ and ‘b’
can be calculated.
2. Item-Item-based similarity/Collaborative Filtering: The very first step is to build the model by finding the similarity between all
the item pairs. The similarity between item pairs can be found in different ways. One of the
most common methods is to use cosine similarity.
Formula:
Prediction computation:
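The two steps named above can be sketched in Python. This is a minimal illustration, not the notes' own formula: it assumes the standard cosine similarity, cos(a, b) = (a · b) / (|a| |b|), and a similarity-weighted average for the prediction. The rating values, the convention that 0 means "not rated", and the function names are made up for the example.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length rating vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # no information, so treat as "not similar"
    return dot / (norm_a * norm_b)

def predict_rating(ratings, user, item):
    """Predict ratings[user][item] as a similarity-weighted average
    of the ratings this user gave to the other items."""
    n_items = len(ratings[0])
    target_col = [row[item] for row in ratings]
    num = den = 0.0
    for j in range(n_items):
        if j == item or ratings[user][j] == 0:
            continue  # skip the target item and items the user has not rated
        other_col = [row[j] for row in ratings]
        sim = cosine_similarity(target_col, other_col)
        num += sim * ratings[user][j]
        den += abs(sim)
    return num / den if den else 0.0

# Made-up rating matrix: rows are users, columns are items, 0 = not rated.
ratings = [
    [5, 4, 0],
    [4, 5, 1],
    [1, 2, 5],
]
predicted = predict_rating(ratings, user=0, item=2)
```

Counting unrated cells as zeros inside the similarity is a simplification; a fuller implementation would compare only the users who rated both items.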
The Dataset: To experiment with recommendation algorithms, you’ll need data that contains a
set of items and a set of users who have reacted to some of the items.
The reaction can be explicit (rating on a scale of 1 to 5, likes or dislikes) or implicit (viewing an
item, adding it to a wish list, the time spent on an article).
While working with such data, you’ll mostly see it in the form of a matrix consisting of the
reactions given by a set of users to some items from a set of items. Each row would contain the
ratings given by a user, and each column would contain the ratings received by an item. A
matrix with five users and five items could look like this:
Rating Matrix
The matrix shows five users who have rated some of the items on a scale of 1 to 5. For example,
the first user has given a rating 4 to the third item.
In most cases, the cells in the matrix are empty, as users only rate a few items. It’s highly
unlikely for every user to rate or react to every item available. A matrix with mostly empty cells
is called sparse, and the opposite to that (a mostly filled matrix) is called dense.
There are a lot of datasets that have been collected and made available to the public for
research and benchmarking.
A good one to get started with is the MovieLens dataset collected by GroupLens Research.
In particular, the MovieLens 100k dataset is a stable benchmark dataset with 100,000 ratings
given by 943 users for 1682 movies, with each user having rated at least 20 movies.
This dataset consists of many files that contain information about the movies, the users, and
the ratings given by users to the movies they have watched. The ones that are of interest are
the following:
u.item: the list of movies
u.data: the list of ratings given by users
The file u.data that contains the ratings is a tab separated list of user ID, item ID, rating, and
timestamp. The first few lines of the file look like this:
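As a rough sketch of how a file in this layout could be read, the snippet below parses tab-separated rows of user ID, item ID, rating, and timestamp into a nested dictionary. The three sample rows are invented for illustration and are not taken from the real u.data file; `load_ratings` is a hypothetical helper name.

```python
import csv
import io

def load_ratings(lines):
    """Parse tab-separated 'user item rating timestamp' rows into
    a nested dict of the form {user_id: {item_id: rating}}."""
    ratings = {}
    reader = csv.reader(lines, delimiter="\t")
    for user_id, item_id, rating, _timestamp in reader:
        ratings.setdefault(int(user_id), {})[int(item_id)] = int(rating)
    return ratings

# Made-up rows in the u.data column layout (NOT real MovieLens data).
sample = io.StringIO(
    "1\t10\t4\t881250949\n"
    "1\t20\t3\t881250950\n"
    "2\t10\t5\t881250951\n"
)
ratings = load_ratings(sample)
```

The same function should also accept an open file object, for example one returned by `open("u.data", newline="")`, assuming the file has been downloaded to the working directory.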
Let’s suppose there are four movies and a user has seen and liked the first two.
Item Representation: First, each item in the system is represented by a set of features or
attributes. For example, in a movie recommendation system, these features could include
genre, director, actors, release year, etc.
User Profile: The system creates a profile for each user based on their past interactions or
explicit feedback. This profile typically contains information about the user's preferences, such
as liked items, rated items, or items they have interacted with.
Similarity Calculation: Content-based filtering calculates the similarity between items based on
their features and the user profile.
Recommendation Generation: Once the similarity between items is calculated, the system can
recommend items to the user based on the similarity scores. It selects items that are similar to
those the user has already liked or interacted with.
Identification of attributes and features - Based on the user’s searches, browsing, and purchases,
an inventory of attributes or features is compiled.
Feature matrix or utility matrix - The feature matrix maps products to their features and assigns
each a numerical or binary value based on its resemblance to the searched product. This
sets up the basis for accepting the product for recommendation or rejecting it.
Judging acceptance or rejection - Either the binary values or the dot product of the feature
vectors decides whether the product is to be considered. A higher value indicates acceptance, and a
lower one indicates rejection.
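The content-based steps above (item representation, user profile, similarity calculation, recommendation generation) can be sketched as follows. This is only an illustration under stated assumptions: the four movies, their binary genre features, and the choice of averaging liked items into a profile and scoring with a dot product are all made up for the example.

```python
def build_user_profile(item_features, liked_items):
    """Average the feature vectors of the items the user liked."""
    n_features = len(next(iter(item_features.values())))
    profile = [0.0] * n_features
    for item in liked_items:
        for k, value in enumerate(item_features[item]):
            profile[k] += value
    return [p / len(liked_items) for p in profile]

def recommend(item_features, liked_items, top_n=2):
    """Rank unseen items by the dot product of their features with the profile."""
    profile = build_user_profile(item_features, liked_items)
    scores = {}
    for item, features in item_features.items():
        if item in liked_items:
            continue  # do not recommend what the user already liked
        scores[item] = sum(p * f for p, f in zip(profile, features))
    # A higher score means the item is accepted earlier in the ranking.
    return sorted(scores, key=scores.get, reverse=True)[:top_n]

# Hypothetical binary features per movie: [action, comedy, sci-fi]
movies = {
    "movie_a": [1, 0, 1],
    "movie_b": [1, 0, 0],
    "movie_c": [0, 1, 0],
    "movie_d": [1, 0, 1],
}
suggestions = recommend(movies, liked_items=["movie_a", "movie_b"])
```

With the first two movies liked, the profile leans toward action, so the action and sci-fi movie outranks the comedy.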
Important terms
Utility matrix
A utility matrix contains the interaction information between the user and the preferred items.
Data gathered from the day-to-day activities of the user is saved in a structured format to find
the likes and dislikes of different items the user has interacted with. A value is assigned to every
interaction, known as the ‘degree of preference’.
A few values are missing in the below example of a utility matrix. This is because some users do
not interact with every item available on the platform. Note that the goal of the recommender
model is to suggest new items based on this utility matrix.
User profile
A user profile is the collection of vectors that define a user’s preferences. The profile is based
on the activities and tastes of the user; for example, user ratings, number of clicks on different
items, thumbs up or thumbs down on content, etc. This information helps the recommender
engine to best estimate newer suggestions.
Or: content-based filtering, by contrast, uses the attributes or features of an item (this is the
content part) to recommend other items similar to the user’s preferences. This approach is
based on the similarity between items.
Content-based filtering uses item features to recommend other items similar to what the user
likes, based on their previous actions or explicit feedback.
A content-based recommender works with data that the user provides, either explicitly (rating)
or implicitly (clicking on a link). Based on that data, a user profile is generated, which is then
used to make suggestions to the user. As the user provides more inputs or takes actions on
those recommendations, the engine becomes more and more accurate.
A recommender system has to decide between two methods for information delivery when
providing the user with recommendations:
Exploitation. The system chooses documents similar to those for which the user has
already expressed a preference.
Exploration. The system chooses documents where the user profile does not provide
evidence to predict the user’s reaction.
1. Collaborative
2. Content-based
3. Social and demographic
4. Contextual
An artificial neural network (ANN), or neural network, is a computational model
inspired by the biological neural networks of the human brain. It is a fundamental concept in
machine learning and deep learning and is widely used for solving various tasks such as
classification, regression, pattern recognition, and more.
A neural network is a machine learning algorithm based on the model of a human neuron.
The human brain consists of billions of neurons, which send and process signals in the form of
electrical and chemical signals. These neurons are connected through special structures known as
synapses, which allow neurons to pass signals to one another. Neural networks are formed from
large numbers of simulated neurons.
An Artificial Neural Network is an information processing technique. It works in a way similar to
how the human brain processes information. An ANN includes a large number of connected processing
units that work together to process information and generate meaningful results from
it.
Neural networks can be applied not only to classification but also to regression of
continuous target attributes.
A neural network is a collection of neurons that work together, performing mathematical
calculations to solve a complex problem. Neural networks are used in technologies such as deep
learning and machine learning, which are part of artificial intelligence.
Artificial neural networks try to replicate the way we humans learn. A network consists of an input
layer, a hidden layer, and an output layer. Each node is connected to other nodes and
has an associated weight and threshold. If the output of a particular node is greater than
its specified threshold, then the node gets activated.
A neural network may contain the following 3 layers:
Input layer – The activity of the input units represents the raw information that is fed
into the network.
Hidden layer – The activity of each hidden unit is determined by the activities of the input
units and the weights on the connections between the input and the hidden units. There
may be one or more hidden layers.
Output layer – The behavior of the output units depends on the activity of the hidden
units and the weights between the hidden and output units.
a. Input layer: The nodes of the input layer are passive, meaning they do not change the data. They
receive a single value on their input and duplicate the value to their many outputs. Each value
from the input layer is duplicated and sent to all the hidden nodes.
b. Hidden layer: The hidden layers apply given transformations to the input values inside the
network. Each hidden node has incoming arcs from other hidden nodes or from input nodes,
and outgoing arcs to output nodes or to other hidden nodes. In the
hidden layer, the actual processing is done via a system of weighted ‘connections’. There may
be one or more hidden layers. The values entering a hidden node are multiplied by weights, a set of
predetermined numbers stored in the program. The weighted inputs are then added to
produce a single number.
c. Output layer: The hidden layers then link to an ‘output layer’. The output layer receives
connections from the hidden layers or from the input layer. It returns an output value that
corresponds to the prediction of the response variable. In binary classification problems, there is
usually only one output node. The active nodes of the output layer combine and change the data to
produce the output values.
The ability of the neural network to provide useful data manipulation lies in the proper
selection of the weights. This is different from conventional information processing.
1.Input layer
2.hidden layer
3.output layer
How do ANNs work?
An ANN includes a huge number of neurons working in parallel and arranged in layers. A neuron is a
simple or multiple linear regression model with an activation function at the end.
The first layer, also called the input layer receives the raw data. The hidden layers extract the
most important information from the inputs and discard the redundant information. The
output layer finally gives us the desired result of all the data processed by the artificial neural
network. It can either have single or multiple nodes.
Each neuron in the first hidden layer is connected to all the inputs of the
previous layer. The artificial
neural network takes the inputs, computes their weighted sum, and adds a bias to
it.
The result of this weighted sum is then passed through an activation function like sigmoid,
ReLU, tanh, etc. If this output is greater than a given threshold, then it “fires” the node and
passes the data to the next layer in the network.
The outputs of one layer are used as the inputs of the next layer.
All the neurons are connected to one another with some weight and bias.
Weights are important for ANNs because that is how neural networks learn. By changing the
weight values, the NN decides which signal is significant and which is not. The process of passing
data from one layer to the next layer is called forward propagation.
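Forward propagation as described here (weighted sum, plus bias, through an activation, layer by layer) can be sketched in a few lines. The function names and the choice of sigmoid as the default activation are assumptions made for the illustration.

```python
import math

def sigmoid(z):
    """S-shaped activation that maps any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def layer_forward(inputs, weights, biases, activation=sigmoid):
    """Compute one layer's outputs.
    weights[j] holds the incoming weights of neuron j in this layer."""
    outputs = []
    for w_row, b in zip(weights, biases):
        z = sum(w * x for w, x in zip(w_row, inputs)) + b  # weighted sum + bias
        outputs.append(activation(z))
    return outputs

def forward(inputs, layers):
    """Pass the data through every (weights, biases) pair in order."""
    activations = inputs
    for weights, biases in layers:
        # The outputs of one layer become the inputs of the next.
        activations = layer_forward(activations, weights, biases)
    return activations
```

For example, `forward([1.0, 2.0], [([[0.0, 0.0]], [0.0]), ([[0.0]], [0.0])])` runs a two-input network with one hidden neuron and one output neuron whose weights are all zero.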
Types of NN:
1. Feed-Forward ANN:
Here the data or the input travels in one direction. The data passes through the input nodes
and exits at the output nodes. This neural network may or may not have hidden layers. In
simple words, it has a forward-propagated wave and no backpropagation, and it usually uses a
classifying activation function.
Below is a Single layer feed-forward network. Here, the sum of the products of inputs and
weights are calculated and fed to the output.
Or
In this network, the flow of information is unidirectional. A unit sends information to another
unit from which it does not receive any information. Also, no feedback loops are present.
These networks are used in pattern recognition, as they have fixed inputs and outputs.
2. Feedback ANN: Signals can travel in both directions in feedback neural networks.
Feedback neural networks are very powerful and can get very complicated. Feedback neural
networks are dynamic. The ‘state’ in such a network keeps changing until it reaches an
equilibrium point. It remains at the equilibrium point until the input changes and a new
equilibrium needs to be found.
Neural networks perform well with linear and nonlinear data but a common criticism of
neural networks, particularly in robotics, is that they require a large diversity of training
for real-world operation. This is so because any learning machine needs sufficient
representative examples in order to capture the underlying structure that allows it to
generalize to new cases.
Neural networks work even if one or a few units fail to respond to the network, but to
implement large and effective software neural networks, much processing and storage
resources need to be committed. While the brain has hardware tailored to the task of
processing signals through a graph of neurons, simulating even a most simplified form
on Von Neumann technology may compel a neural network designer to fill millions of
database rows for its connections – which can consume vast amounts of computer
memory and hard disk space.
A neural network learns from the analyzed data and does not require reprogramming,
but such networks are referred to as “black box” models and provide very little insight into what
they really do. The user just needs to feed in input, watch the network train, and await
the output.
What is Perceptron
Output: The output of the perceptron is a single binary value, either 0 or 1, which indicates the
class or category to which the input data belongs.
Training Algorithm: The perceptron is typically trained using a supervised learning algorithm
such as the perceptron learning algorithm or backpropagation. During training, the weights and
biases of the perceptron are adjusted to minimize the error between the predicted output and
the true output for a given set of training examples.
Or Perceptron
A perceptron is a neural network unit (an artificial neuron) that does certain computations to
detect features or business intelligence in the input data.
Perceptron was introduced by Frank Rosenblatt in 1957. He proposed a Perceptron learning
rule based on the original MCP neuron.
A Perceptron is an algorithm for supervised learning of binary classifiers. This algorithm enables
neurons to learn and processes elements in the training set one at a time.
The Perceptron receives multiple input signals, and if the sum of the input signals exceeds a
certain threshold, it either outputs a signal or does not return an output. In the context of
supervised learning and classification, this can then be used to predict the class of a sample.
How does perceptron work?
The perceptron model begins with the multiplication of all input values and their weights, then
adds these values together to create the weighted sum. Then this weighted sum is applied to
the activation function 'f' to obtain the desired output. This activation function is also known as
the step function and is represented by 'f'.
Perceptron model works in two important steps as follows:
Step-1
In the first step, multiply all input values by their corresponding weight values and then add
them to determine the weighted sum. Mathematically, we can calculate the weighted sum as
follows:
∑wi*xi = w1*x1 + w2*x2 + … + wn*xn
Add a special term called bias 'b' to this weighted sum to improve the model's performance.
∑wi*xi + b
Step-2
In the second step, an activation function is applied with the above-mentioned weighted sum,
which gives us output either in binary form or a continuous value as follows:
Y = f(∑wi*xi + b)
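The two steps can be written directly as code. In this minimal sketch the step activation, the weights [1, 1], and the bias -1.5 are illustrative choices (they happen to make the perceptron behave like a logical AND); none of them come from the notes.

```python
def step(z):
    """Step activation: 1 when the input is positive, otherwise 0."""
    return 1 if z > 0 else 0

def perceptron_output(x, w, b, f=step):
    """Y = f(sum(wi * xi) + b)."""
    weighted_sum = sum(wi * xi for wi, xi in zip(w, x))  # Step 1
    return f(weighted_sum + b)                           # Step 2

# Illustrative weights and bias: the unit fires only when both inputs are 1.
weights = [1.0, 1.0]
bias = -1.5
both_on = perceptron_output([1, 1], weights, bias)
one_on = perceptron_output([1, 0], weights, bias)
```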
Perceptron Function
The perceptron function f(x) is obtained from the product of the input vector (x)
and the learned weight vector (w). In mathematical notation, it can be described as:
f(x) = 1 if w·x + b > 0, f(x) = 0 otherwise
or
A perceptron is a function that maps its input “x”, multiplied by the learned weight
coefficient, to an output value “f(x)”.
“b” = bias (an element that adjusts the boundary away from origin without any dependence on
the input value)
The output can be represented as “1” or “0.” It can also be represented as “1” or “-1” depending
on which activation function is used.
Inputs of a Perceptron
A Perceptron accepts inputs, moderates them with certain weight values, then applies the
transformation function to output the final result. The figure shows a Perceptron with a
Boolean output.
A Boolean output is based on inputs such as salaried, married, age, past credit profile, etc. It has
only two values: Yes and No or True and False. The summation function “∑” multiplies all
inputs of “x” by weights “w” and then adds them up as follows:
The activation function applies a step rule (convert the numerical output into +1 or -1) to check if
the output of the weighting function is greater than zero or not.
For example:
Step function gets triggered above a certain value of the neuron output; else it outputs zero. Sign
Function outputs +1 or -1 depending on whether neuron output is greater than zero or not.
Sigmoid is the S-curve and outputs a value between 0 and 1.
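The three activation functions mentioned here can be sketched as below. The default threshold of 0 for the step function is an assumption made for the example.

```python
import math

def step(z, threshold=0.0):
    """Outputs 1 above the threshold, otherwise 0."""
    return 1 if z > threshold else 0

def sign(z):
    """Outputs +1 when z is greater than zero, otherwise -1."""
    return 1 if z > 0 else -1

def sigmoid(z):
    """S-curve that outputs a value between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))
```

Step and sign give hard decisions, while sigmoid gives a graded value that can be read as a degree of activation.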
Output of Perceptron
Inputs: x1…xn
Output: o(x1….xn)
If ∑w.x > 0, output is +1, else -1. The neuron gets triggered only when weighted input reaches a
certain threshold value.
An output of +1 specifies that the neuron is triggered. An output of -1 specifies that the neuron
did not get triggered.
Error in Perceptron
In the Perceptron Learning Rule, the predicted output is compared with the known output. If it
does not match, the error is propagated backward to allow weight adjustment to happen.
A decision function φ(z) of Perceptron is defined to take a linear combination of x and w vectors.
Bias Unit:
For simplicity, the threshold θ can be brought to the left and represented as w0x0, where w0= -θ
and x0= 1.
Output
The figure shows how the decision function squashes wTx to either +1 or -1 and how it can be
used to discriminate between two linearly separable classes.
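A minimal sketch of the perceptron learning rule on a linearly separable problem follows. The logical AND data set, the learning rate of 0.5, the number of epochs, and the function names are assumptions chosen for illustration; the bias b is kept as a separate term, which is equivalent to a weight w0 on a constant input x0 = 1.

```python
def predict(x, w, b):
    """Decision function: +1 if w.x + b > 0, otherwise -1."""
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1 if z > 0 else -1

def train_perceptron(samples, labels, lr=0.5, epochs=20):
    """Adjust weights only when the prediction differs from the known output."""
    w = [0.0] * len(samples[0])
    b = 0.0
    for _ in range(epochs):
        for x, target in zip(samples, labels):
            error = target - predict(x, w, b)  # 0 when correct, +2 or -2 otherwise
            if error != 0:
                w = [wi + lr * error * xi for wi, xi in zip(w, x)]
                b += lr * error
    return w, b

# Logical AND with +1/-1 labels (linearly separable).
X = [[0, 0], [0, 1], [1, 0], [1, 1]]
y = [-1, -1, -1, 1]
w, b = train_perceptron(X, y)
predictions = [predict(x, w, b) for x in X]
```

Because the two classes are linearly separable, the rule settles on weights that classify all four points correctly within a few passes.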
Single layer NN
A multilayer neural network, also known as a feedforward neural network or a deep neural
network, is a type of artificial neural network with multiple layers of artificial neurons, including
at least one hidden layer between the input and output layers. These networks are capable of
learning and representing complex patterns and relationships in data, making them suitable for a
wide range of machine learning tasks, including image recognition, natural language processing,
and more.
A Multi-Layer Perceptron (MLP) Neural Network is a neural network with multiple layers in which all
layers are connected. It uses the back-propagation algorithm for training the model. The multilayer
perceptron, also known as MLP, is a class of deep learning model.
Or
A multilayer perceptron (MLP) neural network belongs to the class of feedforward neural networks.
It is an Artificial Neural Network in which all nodes are interconnected with nodes of different
layers.
Frank Rosenblatt first defined the word Perceptron in his perceptron program. Perceptron is a
basic unit of an artificial neural network that defines the artificial neuron in the neural network. It
is a supervised learning algorithm containing nodes’ values, activation functions, inputs, and
weights to calculate the output.
The Multilayer Perceptron (MLP) Neural Network passes data only in the forward direction. All
nodes are fully connected to the network. Each node passes its value to the next node only in
the forward direction. The MLP neural network uses the backpropagation algorithm to increase the
accuracy of the trained model.
In this figure, we have used circles to also denote the inputs to the network. The circles labeled
“+1” are called bias units, and correspond to the intercept term. The leftmost layer of the
network is called the input layer, and the rightmost layer the output layer (which, in this
example, has only one node). The middle layer of nodes is called the hidden layer, because its
values are not observed in the training set. We also say that our example neural network has 3
input units (not counting the bias unit), 3 hidden units, and 1 output unit.
Step 1: Find the summation r by taking the dot product of the inputs and weights and adding the
bias unit.
On feeding r into the activation function F(r), we find the output for the hidden layers. For the
first hidden layer h1, the neuron can be calculated as:
h11 = F(r)
For all the other hidden layers, repeat the same procedure. Keep repeating the process until you
reach the last weight set.
Backpropagation Algorithm
In an artificial neural network, the values of weights and biases are randomly initialized. Due to
random initialization, the neural network probably has errors in giving the correct output. We
need to reduce error values as much as possible. So, to reduce these error values, we need a
mechanism that can compare the desired output of the neural network with the network’s output
that consists of errors and adjust its weights and biases such that it gets closer to the desired
output after each iteration. For this, we train the network such that it back propagates and updates
the weights and biases. This is the concept of the back propagation algorithm.
Or
The principle behind the back propagation algorithm is to reduce the error values in randomly
allocated weights and biases such that it produces the correct output. The system is trained in the
supervised learning method, where the error between the system’s output and a known expected
output is presented to the system and used to modify its internal state. We need to update the
weights so that we get the global loss minimum. This is how back propagation in neural
networks works.
Input values
X1=0.05
X2=0.10
Initial weights
w1=0.15 w5=0.40
w2=0.20 w6=0.45
w3=0.25 w7=0.50
w4=0.30 w8=0.55
Bias Values
b1=0.35 b2=0.60
Target Values
T1=0.01
T2=0.99
Forward Pass
To find the value of H1, we first multiply the input values by the weights and add the bias:
H1=x1×w1+x2×w2+b1
H1=0.05×0.15+0.10×0.20+0.35
H1=0.3775
H2=x1×w3+x2×w4+b1
H2=0.05×0.25+0.10×0.30+0.35
H2=0.3925
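Applying the sigmoid function to H1 and H2 gives the hidden-layer outputs 0.593269992 and
0.596884378.
Now, we calculate the values of y1 and y2 in the same way as we calculated H1 and H2.
To find the value of y1, we first multiply the input values, i.e., the outputs of H1 and H2, by the
weights as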
y1=H1×w5+H2×w6+b2
y1=0.593269992×0.40+0.596884378×0.45+0.60
y1=1.10590597
To calculate the final result of y1, we apply the sigmoid function to this value.
Our target values are 0.01 and 0.99. Our y1 and y2 values do not match our target values
T1 and T2.
Now, we will find the total error, which measures the difference between the outputs and the
target outputs. The total error is calculated as
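The forward pass can be checked numerically with a short script. Two things here are assumptions rather than statements from the notes: that y2 uses weights w7 and w8 with bias b2 (by symmetry with y1), and that the total error is the sum of half squared differences, 0.5 * (target - output)^2. Under those assumptions the script reproduces the total error of 0.298371109 quoted later in these notes.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

# Values from the worked example.
x1, x2 = 0.05, 0.10
w1, w2, w3, w4 = 0.15, 0.20, 0.25, 0.30
w5, w6, w7, w8 = 0.40, 0.45, 0.50, 0.55
b1, b2 = 0.35, 0.60
t1, t2 = 0.01, 0.99

# Hidden layer: net input, then sigmoid.
h1 = x1 * w1 + x2 * w2 + b1
h2 = x1 * w3 + x2 * w4 + b1
out_h1, out_h2 = sigmoid(h1), sigmoid(h2)

# Output layer (the y2 wiring is assumed by symmetry with y1).
y1 = out_h1 * w5 + out_h2 * w6 + b2
y2 = out_h1 * w7 + out_h2 * w8 + b2
out_y1, out_y2 = sigmoid(y1), sigmoid(y2)

# Assumed error measure: sum of half squared differences.
e_total = 0.5 * (t1 - out_y1) ** 2 + 0.5 * (t2 - out_y2) ** 2
```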
Now, we will backpropagate this error to update the weights using a backward pass.
Backward pass at the output layer
To update a weight, we calculate the error corresponding to that weight with the help of the total
error. The error on weight w is calculated by differentiating the total error with respect to w.
From equation (2), it is clear that we cannot partially differentiate it with respect to w5 directly
because it contains no w5 term. We split equation (1) into multiple terms so that we can easily
differentiate it with respect to w5 as
Now, we calculate each term one by one to differentiate Etotal with respect to w5 as
So, we substitute these values into equation (3) to find the final result.
Now, we will calculate the updated weight w5new with the help of the following formula
In the same way, we calculate w6new, w7new, and w8new, and this will give us the following values
w5new=0.35891648
w6new=0.408666186
w7new=0.511301270
w8new=0.561370121
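The output-layer update can be reproduced with the chain rule. This sketch assumes sigmoid activations, the half squared error, and a learning rate of 0.5; the learning rate is not stated in the surviving text, but 0.5 is the value that reproduces the updated weights listed above.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

x1, x2 = 0.05, 0.10
w1, w2, w3, w4 = 0.15, 0.20, 0.25, 0.30
w5, w6, w7, w8 = 0.40, 0.45, 0.50, 0.55
b1, b2 = 0.35, 0.60
t1, t2 = 0.01, 0.99
eta = 0.5  # assumed learning rate (not given in the notes)

# Forward pass.
out_h1 = sigmoid(x1 * w1 + x2 * w2 + b1)
out_h2 = sigmoid(x1 * w3 + x2 * w4 + b1)
out_y1 = sigmoid(out_h1 * w5 + out_h2 * w6 + b2)
out_y2 = sigmoid(out_h1 * w7 + out_h2 * w8 + b2)

# Chain rule for an output weight:
# dE/dw = (output - target) * output * (1 - output) * hidden_output
delta_y1 = (out_y1 - t1) * out_y1 * (1 - out_y1)
delta_y2 = (out_y2 - t2) * out_y2 * (1 - out_y2)

# Gradient descent step: w_new = w - eta * dE/dw
w5_new = w5 - eta * delta_y1 * out_h1
w6_new = w6 - eta * delta_y1 * out_h2
w7_new = w7 - eta * delta_y2 * out_h1
w8_new = w8 - eta * delta_y2 * out_h2
```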
Now, we will backpropagate to our hidden layer and update the weights w1, w2, w3, and w4 as
we have done with the w5, w6, w7, and w8 weights.
From equation (2), it is clear that we cannot partially differentiate it with respect to w1 directly
because it contains no w1 term. We split equation (1) into multiple terms so that we can easily
differentiate it with respect to w1 as
Now, we calculate each term one by one to differentiate Etotal with respect to w1 as
We again split both terms because there is no y1 or y2 term in E1 and E2. We split
them as
Now, we find the value by putting values in equations (18) and (19) as
We calculate the partial derivative of the total net input to H1 with respect to w1 the same way as
we did for the output neuron:
So, we substitute these values into equation (13) to find the final result.
Now, we will calculate the updated weight w1new with the help of the following formula
In the same way, we calculate w2new, w3new, and w4new, and this will give us the following values
w1new=0.149780716
w2new=0.19956143
w3new=0.24975114
w4new=0.29950229
We have updated all the weights. We found the error 0.298371109 on the network when we fed
forward the 0.05 and 0.1 inputs. After the first round of backpropagation, the total error is down to
0.291027924. After repeating this process 10,000 times, the total error is down to 0.0000351085. At
this point, the output neurons generate 0.015912196 and 0.984065734, i.e., close to our target
values, when we feed forward 0.05 and 0.1.
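The whole procedure, one full backward pass followed by many repetitions, can be sketched as a small training loop. The assumptions are the same as elsewhere in this example and are not stated explicitly in the notes: sigmoid activations, half squared error, a learning rate of 0.5, biases held fixed, and hidden-layer gradients computed with the old output weights (before w5..w8 are updated).

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def forward(w, x, b):
    """w = [w1..w8]; returns hidden outputs and final outputs."""
    out_h1 = sigmoid(x[0] * w[0] + x[1] * w[1] + b[0])
    out_h2 = sigmoid(x[0] * w[2] + x[1] * w[3] + b[0])
    out_y1 = sigmoid(out_h1 * w[4] + out_h2 * w[5] + b[1])
    out_y2 = sigmoid(out_h1 * w[6] + out_h2 * w[7] + b[1])
    return out_h1, out_h2, out_y1, out_y2

def total_error(w, x, b, t):
    _, _, y1, y2 = forward(w, x, b)
    return 0.5 * (t[0] - y1) ** 2 + 0.5 * (t[1] - y2) ** 2

def train_step(w, x, b, t, eta=0.5):
    """One forward pass plus one backward pass; returns the new weights."""
    out_h1, out_h2, y1, y2 = forward(w, x, b)
    # Output-layer error terms.
    d1 = (y1 - t[0]) * y1 * (1 - y1)
    d2 = (y2 - t[1]) * y2 * (1 - y2)
    # Hidden-layer error terms: each hidden neuron feeds both outputs.
    dh1 = (d1 * w[4] + d2 * w[6]) * out_h1 * (1 - out_h1)
    dh2 = (d1 * w[5] + d2 * w[7]) * out_h2 * (1 - out_h2)
    grads = [dh1 * x[0], dh1 * x[1], dh2 * x[0], dh2 * x[1],
             d1 * out_h1, d1 * out_h2, d2 * out_h1, d2 * out_h2]
    return [wi - eta * g for wi, g in zip(w, grads)]

x, t, b = (0.05, 0.10), (0.01, 0.99), (0.35, 0.60)
w0 = [0.15, 0.20, 0.25, 0.30, 0.40, 0.45, 0.50, 0.55]

w_after_one = train_step(w0, x, b, t)
w_trained = w0
for _ in range(10000):
    w_trained = train_step(w_trained, x, b, t)
final_error = total_error(w_trained, x, b, t)
```

After one step the first four entries of `w_after_one` should match the w1new to w4new values listed above, and the error after 10,000 steps should be very small.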
Deep learning is a branch of machine learning that is based on artificial neural
networks with three or more layers. Just as a neural network mimics the human brain,
deep learning is also a kind of mimicry of the human brain. In deep learning, we don’t need to
explicitly program everything.
Deep learning has aided image classification, language translation, and speech recognition. It can
be used to solve many pattern recognition problems with little human intervention.
Deep learning is the branch of machine learning which is based on artificial neural network
architecture. An artificial neural network or ANN uses layers of interconnected nodes called
neurons that work together to process and learn from the input data.
In a fully connected Deep neural network, there is an input layer and one or more hidden layers
connected one after the other. Each neuron receives input from the previous layer neurons or the
input layer. The output of one neuron becomes the input to other neurons in the next layer of the
network, and this process continues until the final layer produces the output of the network. The
layers of the neural network transform the input data through a series of nonlinear
transformations, allowing the network to learn complex representations of the input data.
Examples of DL:
1.self-driving cars
2.chatbots
3.facial recognition
4.speech recognition
Architectures:
1. Deep Neural Network – It is a neural network with a certain level of complexity (having
multiple hidden layers in between input and output layers). They are capable of modeling
and processing non-linear relationships.
2. Deep Belief Network (DBN) – It is a class of Deep Neural Network made up of multi-layer
belief networks.
Steps for performing DBN:
a. Learn a layer of features from visible units using the Contrastive Divergence algorithm.
b. Treat activations of previously trained features as visible units and then learn features
of features.
c. Finally, the whole DBN is trained when the learning for the final hidden layer is
achieved.
3. Recurrent Neural Network (performs the same task for every element of a sequence) –
Allows for parallel and sequential computation. Similar to the human brain (large
feedback network of connected neurons). They are able to remember important things
about the input they received, which enables them to be more precise.
Machine Learning | Deep Learning
Takes less time to train the model. | Takes more time to train the model.
A model is created by relevant features which are manually extracted from images to detect an object in the image. | Relevant features are automatically extracted from images. It is an end-to-end learning process.
It can work on the CPU or requires less computing power as compared to deep learning. | It requires a high-performance computer with GPU.
Working:
First, we need to identify and understand the actual problem in order to get the right solution,
and check the feasibility of deep learning (whether the problem fits
deep learning or not). Second, we need to identify the relevant data, which should correspond to
the actual problem and be prepared accordingly. Third, choose the deep learning
algorithm appropriately. Fourth, use the algorithm to train on the dataset. Fifth,
perform final testing on the dataset.