
Recommender Systems

What is a recommender system in ML?

Recommender systems are a subset of data filtering systems that suggest similar products to the user, based on the user's previous products, using Machine Learning algorithms.

Recommender Systems in Machine Learning work with a huge amount of data to filter out
suggestions efficiently. They can be considered the most profitable application of Machine
Learning for businesses because of the user-specific approach and their potential to expand in
various categories like text, videos, movies, books, e-commerce products, etc.
Or
A recommendation system is an artificial intelligence or AI algorithm, usually associated with
machine learning, that uses Big Data to suggest or recommend additional products to
consumers.

These can be based on various criteria, including past purchases, search history, demographic
information, and other factors.
Types of RS:
1. Collaborative filtering
2. Content-based filtering (recommendation)
3. Hybrid
4. Deep learning
Explain each in detail:
1.Collaborative Filtering:

Collaborative filtering is a technique that can filter out items that a user might like on the basis
of reactions by similar users.

It works by searching a large group of people and finding a smaller set of users with tastes
similar to a particular user. It looks at the items they like and combines them to create a ranked
list of suggestions.
There are many ways to decide which users are similar and to combine their choices into a list of recommendations.
Collaborative Filtering recommender systems in machine learning work on two parameters: similarity between users and similarity between products. Based on these results, suggestions are presented to the user.

This concept is widely used in recommending movies, news, applications, and so many
other items.
Types of Collaborative Filtering
There are two types of Collaborative Filtering available:
1. User-based similarity / Collaborative Filtering (UB-CF):
UB-CF in ML is a method to predict the products that a user may like, based on the ratings submitted for those products by other users.
Or
UB-CF is a technique used to predict the items that a user might like on the basis of ratings given to those items by other users who have tastes similar to those of the target user.

Step 1: Finding the similarity of users to the target user U. The similarity between any two users ‘a’ and ‘b’ can be calculated (for example, with cosine similarity or Pearson correlation).

Step 2: Prediction of the missing rating of an item, as a similarity-weighted combination of the ratings given by similar users:
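A minimal sketch of these two steps in Python (the rating matrix, the choice of cosine similarity, and the helper names are illustrative assumptions, not part of the original notes):

```python
import numpy as np

# Toy rating matrix: rows = users, columns = items, 0 means "not rated".
ratings = np.array([
    [4, 0, 3, 5],
    [5, 1, 0, 4],
    [0, 2, 4, 0],
], dtype=float)

def user_similarity(a, b):
    # Step 1: cosine similarity between two users over co-rated items.
    mask = (a > 0) & (b > 0)
    if not mask.any():
        return 0.0
    return float(np.dot(a[mask], b[mask]) /
                 (np.linalg.norm(a[mask]) * np.linalg.norm(b[mask])))

def predict_rating(R, target_user, item):
    # Step 2: predict the missing rating as a similarity-weighted average
    # of the ratings other users gave to the same item.
    num, den = 0.0, 0.0
    for other in range(R.shape[0]):
        if other == target_user or R[other, item] == 0:
            continue
        sim = user_similarity(R[target_user], R[other])
        num += sim * R[other, item]
        den += abs(sim)
    return num / den if den else 0.0

print(predict_rating(ratings, target_user=2, item=0))  # e.g. third user's rating for item 1
```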



2. Item-based similarity / Collaborative Filtering (IB-CF)

Item-item collaborative filtering is a recommendation method which looks for similar items based on the items users have already liked or positively interacted with.
IB-CF works by suggesting items based on the items the user has previously consumed: it looks at the items the user has consumed, finds other items similar to those items, and recommends them accordingly.

Item to Item Similarity: The very first step is to build the model by finding similarity between all
the item pairs. The similarity between item pairs can be found in different ways. One of the
most common methods is to use cosine similarity.

Formula: sim(i, j) = (vi · vj) / (||vi|| × ||vj||), i.e., the cosine of the angle between the rating vectors of items i and j.

Prediction computation:
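As a hedged illustration of the two computations just described (cosine similarity between item columns, then a similarity-weighted prediction), here is a small Python sketch; the matrix values are made up:

```python
import numpy as np

# Rows = users, columns = items; 0 = unrated.
R = np.array([
    [5, 3, 0, 1],
    [4, 0, 4, 1],
    [1, 1, 0, 5],
], dtype=float)

def item_similarity(R, i, j):
    # Cosine similarity between item columns i and j, over users who rated both.
    mask = (R[:, i] > 0) & (R[:, j] > 0)
    if not mask.any():
        return 0.0
    a, b = R[mask, i], R[mask, j]
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def predict(R, user, item):
    # Weighted average of the user's own ratings on items similar to `item`.
    num = den = 0.0
    for j in range(R.shape[1]):
        if j == item or R[user, j] == 0:
            continue
        s = item_similarity(R, item, j)
        num += s * R[user, j]
        den += abs(s)
    return num / den if den else 0.0

print(round(predict(R, user=1, item=1), 2))
```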

The Dataset: To experiment with recommendation algorithms, you’ll need data that contains a
set of items and a set of users who have reacted to some of the items.

The reaction can be explicit (rating on a scale of 1 to 5, likes or dislikes) or implicit (viewing an
item, adding it to a wish list, the time spent on an article).

While working with such data, you’ll mostly see it in the form of a matrix consisting of the
reactions given by a set of users to some items from a set of items. Each row would contain the
ratings given by a user, and each column would contain the ratings received by an item. A
matrix with five users and five items could look like this:

Rating Matrix
The matrix shows five users who have rated some of the items on a scale of 1 to 5. For example, the first user has given a rating of 4 to the third item.
In most cases, the cells in the matrix are empty, as users only rate a few items. It’s highly
unlikely for every user to rate or react to every item available. A matrix with mostly empty cells
is called sparse, and the opposite to that (a mostly filled matrix) is called dense.
There are a lot of datasets that have been collected and made available to the public for
research and benchmarking. Here’s a list of high-quality data sources that you can choose from.
The best one to get started would be the MovieLens dataset collected by GroupLens Research.
In particular, the MovieLens 100k dataset is a stable benchmark dataset with 100,000 ratings
given by 943 users for 1682 movies, with each user having rated at least 20 movies.
This dataset consists of many files that contain information about the movies, the users, and
the ratings given by users to the movies they have watched. The ones that are of interest are
the following:
 u.item: the list of movies
 u.data: the list of ratings given by users
The file u.data that contains the ratings is a tab separated list of user ID, item ID, rating, and
timestamp. The first few lines of the file look like this:

First 5 Rows of MovieLens 100k Data


As shown above, the file tells what rating a user gave to a particular movie. This file contains
100,000 such ratings, which will be used to predict the ratings of the movies not seen by the
users.
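A small sketch of how such a file could be loaded with pandas (it assumes the MovieLens 100k files are present locally under ml-100k/; the column names follow the description above):

```python
import pandas as pd

# u.data is tab-separated: user ID, item ID, rating, timestamp.
ratings = pd.read_csv(
    "ml-100k/u.data",
    sep="\t",
    names=["user_id", "item_id", "rating", "timestamp"],
)

print(ratings.head())                      # first few ratings
print(ratings["user_id"].nunique(), "users,",
      ratings["item_id"].nunique(), "movies")
```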
2. Content-based Filtering:
Content-based filtering is a technique used in machine learning and recommender systems to recommend items to the user based on the items' attributes and features.
Content-based filtering is a recommendation algorithm that finds similar suggestions. Here, every unique item in a dataset is assigned keywords or attributes which help it to be recognized. Then, based on these patterns, information about the user's likes and dislikes is saved and used to recommend relevant items. The recommender system stores previous user data like clicks, ratings, and likes to create a user profile.
To understand this, let’s use a simple example of how a content-based recommender system
might work to suggest movies.

Let’s suppose there are four movies and a user has seen and liked the first two.

The model automatically suggests the third movie rather than the fourth, since it is more similar to the first two. This similarity can be calculated based on a number of features, like the actors and actresses in the movie, the director, the genre, the duration of the film, etc.
Why use Content based filtering
In the case of Content based filtering, user privacy is maintained. The recommendation system
works on browsed, purchased, and past products. It does not need any other personal
information or inputs from the user.
Since the system utilizes browsed products, the features to be looked for remain the same, making the results specific and user-based. Every user will therefore have a distinct set of results, making the experience unique.
Another reason to work with Content based filtering is that it is relatively uncomplicated to use
and even easier to build.

Basic working of this:

Item Representation: First, each item in the system is represented by a set of features or
attributes. For example, in a movie recommendation system, these features could include
genre, director, actors, release year, etc.

User Profile: The system creates a profile for each user based on their past interactions or
explicit feedback. This profile typically contains information about the user's preferences, such
as liked items, rated items, or items they have interacted with.

Similarity Calculation: Content-based filtering calculates the similarity between items based on
their features and the user profile.

Recommendation Generation: Once the similarity between items is calculated, the system can
recommend items to the user based on the similarity scores. It selects items that are similar to
those the user has already liked or interacted with.
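The four steps above can be put together in a few lines; the following sketch uses made-up movie names, binary genre features, and a user profile built by averaging liked items (all values are illustrative assumptions):

```python
import numpy as np

# Item representation: each movie as a binary feature vector
# over [action, comedy, drama, sci-fi].
movies = {
    "Movie A": np.array([1, 0, 0, 1], dtype=float),
    "Movie B": np.array([1, 0, 0, 1], dtype=float),
    "Movie C": np.array([0, 1, 1, 0], dtype=float),
    "Movie D": np.array([1, 0, 0, 0], dtype=float),
}

# User profile: average of the feature vectors of the liked items.
liked = ["Movie A", "Movie B"]
profile = np.mean([movies[m] for m in liked], axis=0)

def cosine(a, b):
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Similarity calculation + recommendation generation:
# rank unseen movies by similarity to the user profile.
scores = {m: cosine(profile, v) for m, v in movies.items() if m not in liked}
print(sorted(scores.items(), key=lambda kv: kv[1], reverse=True))
```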

Method to Perform Content based Filtering

Let us now understand how Content based filtering works.

Identification of attributes and features - Based on the search results, browsing history, and purchases, an inventory of attributes or features is compiled.

Feature Matrix or utility matrix - Feature matrix maps products and their features and assigns
them a numerical or a binary value based on the resemblance to the searched product. This
sets up the basis for accepting the product for recommendation or rejecting it.

Judging acceptance or rejection - The binary values assigned, or the dot product of the feature vectors, decide whether the product is to be considered. A higher value indicates acceptance, and a lower one indicates rejection.

Important terms
Utility matrix
A utility matrix contains the interaction information between the user and the preferred items.
Data gathered from the day-to-day activities of the user is saved in a structured format to find
the likes and dislikes of different items the user has interacted with. A value is assigned to every
interaction, known as the ‘degree of preference’.

A few values are missing in the below example of a utility matrix. This is because some users do
not interact with every item available on the platform. Note that the goal of the recommender
model is to suggest new items based on this utility matrix.
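An illustrative utility matrix with hypothetical values can be built with pandas; NaN marks items the user never interacted with (names and numbers below are invented for illustration):

```python
import numpy as np
import pandas as pd

# Degree of preference on a 1-5 scale; NaN = no recorded interaction.
utility = pd.DataFrame(
    {
        "Item 1": [5, np.nan, 2],
        "Item 2": [np.nan, 4, np.nan],
        "Item 3": [3, 1, np.nan],
    },
    index=["User A", "User B", "User C"],
)
print(utility)
print("Fraction of missing cells:", utility.isna().mean().mean())
```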

User profile
A user profile is the collection of vectors that define a user’s preferences. The profile is based
on the activities and tastes of the user; for example, user ratings, number of clicks on different
items, thumbs up or thumbs down on content, etc. This information helps the recommender
engine to best estimate newer suggestions.

Or: Content-based filtering, by contrast, uses the attributes or features of an item (this is the "content" part) to recommend other items similar to the user's preferences. This approach is based on the similarity of items.

Content-based filtering uses item features to recommend other items similar to what the user
likes, based on their previous actions or explicit feedback.

How do content-based recommender systems work?

A content-based recommender works with data that the user provides, either explicitly (rating)
or implicitly (clicking on a link). Based on that data, a user profile is generated, which is then
used to make suggestions to the user. As the user provides more inputs or takes actions on
those recommendations, the engine becomes more and more accurate.

A recommender system has to decide between two methods for information delivery when
providing the user with recommendations:

 Exploitation. The system chooses documents similar to those for which the user has
already expressed a preference.
 Exploration. The system chooses documents where the user profile does not provide
evidence to predict the user’s reaction.

Q. Explain Recommendation paradigms

Broadly speaking, recommender systems are of 4 types:

1. Collaborative
2. Content-based
3. Social and demographic
4. Contextual

Artificial Neural Network (ANN) in Machine Learning

Artificial Neural Networks (ANNs), or neural networks, are computational models inspired by the biological neural networks of the human brain, and they form the foundation of deep learning. They are a fundamental concept in machine learning and are widely used for solving various tasks such as classification, regression, pattern recognition, and more.
A neural network is a machine learning algorithm based on a model of the human neuron. The human brain consists of millions of neurons, which send and process signals in the form of electrical and chemical signals. These neurons are connected through special structures known as synapses, which allow neurons to pass signals. Neural networks are formed from large numbers of simulated neurons.
An Artificial Neural Network is an information processing technique. It works in a way similar to how the human brain processes information. An ANN includes a large number of connected processing units that work together to process information and generate meaningful results from it.
Neural networks can be applied not only to classification; they can also be applied to regression of continuous target attributes.

Neural networks are a collection of neurons working together, performing mathematical calculations to decode a complex problem. They underpin technologies such as deep learning and machine learning, as part of artificial intelligence.

Artificial neural networks try to replicate the way we humans learn. They consist of an input layer, a hidden layer, and an output layer. Each node in a layer is connected to the nodes of the next layer and has an associated weight and threshold. If the output of a particular node is greater than the specified threshold, the node is activated.
A neural network may contain the following 3 layers:
 Input layer – The activity of the input units represents the raw information that is fed into the network.
 Hidden layer – The activity of each hidden unit is determined by the activities of the input units and the weights on the connections between the input and the hidden units. There may be one or more hidden layers.
 Output layer – The behavior of the output units depends on the activity of the hidden units and the weights between the hidden and output units.

Artificial Neural Network Layers


An artificial neural network is typically organized in layers. Layers are made up of many interconnected ‘nodes’, each containing an ‘activation function’. A neural network may contain the following 3 layers:

a. Input layer: The purpose of the input layer is to receive as input the values of the explanatory attributes for each observation. Usually, the number of nodes in the input layer is equal to the number of explanatory variables. The input layer presents the patterns to the network, which communicates them to one or more ‘hidden layers’.

The nodes of the input layer are passive, meaning they do not change the data. Each node receives a single value on its input and duplicates that value to its many outputs, so every value from the input layer is duplicated and sent to all the hidden nodes.

b. Hidden layer: The hidden layers apply transformations to the input values inside the network. Each hidden node has incoming arcs from input nodes or from other hidden nodes, and outgoing arcs to output nodes or to other hidden nodes. In the hidden layer, the actual processing is done via a system of weighted ‘connections’. There may be one or more hidden layers. The values entering a hidden node are multiplied by weights, a set of predetermined numbers stored in the program. The weighted inputs are then added to produce a single number.

c. Output layer: The hidden layers then link to an ‘output layer‘. Output layer receives
connections from hidden layers or from input layer. It returns an output value that corresponds
to the prediction of the response variable. In classification problems, there is usually only one
output node. The active nodes of the output layer combine and change the data to produce the
output values.

The ability of the neural network to provide useful data manipulation lies in the proper
selection of the weights. This is different from conventional information processing.

The architecture or Structure of a Neural Network


To understand the architecture of an artificial neural network, we have to understand what a neural network consists of: a large number of artificial neurons, termed units, arranged in a sequence of layers. Let us look at the various types of layers available in an artificial neural network.
Artificial Neural Network primarily consists of three layers:

1. Input layer
2. Hidden layer
3. Output layer
How do ANNs work?
An ANN includes a huge number of neurons working in parallel, arranged in layers. A neuron is a simple or multiple linear regression model with an activation function at the end.
The first layer, also called the input layer, receives the raw data. The hidden layers extract the most important information from the inputs and discard the redundant information. The output layer finally gives us the desired result from all the data processed by the artificial neural network; it can have either a single node or multiple nodes.
Each neuron of the first hidden layer is connected to all the inputs of the previous layer, and the same holds for all the neurons in that layer. The artificial neural network takes the input, computes the weighted sum of the inputs, and adds a bias to it.

The output of the above equation is then passed through an activation function like sigmoid, ReLU, tanh, etc. If this output is greater than a given threshold, the node "fires" and passes its data to the next layer in the network.

The outputs of the previous hidden layer are treated as the inputs of the next layer. All the neurons are connected to one another with some weight and bias.

Weights are important for an ANN because that is how neural networks learn: by changing the weight values, the network decides which signals are significant and which are not. The process of passing data from one layer to the next layer is called forward propagation, sketched below.
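A minimal sketch of forward propagation through one hidden layer (the layer sizes, random weights, and sigmoid activation are illustrative assumptions):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)
x = np.array([0.5, 0.1, 0.4])          # input layer (3 features)

W1 = rng.normal(size=(4, 3))           # weights: input -> hidden (4 hidden units)
b1 = np.zeros(4)
W2 = rng.normal(size=(1, 4))           # weights: hidden -> output (1 unit)
b2 = np.zeros(1)

h = sigmoid(W1 @ x + b1)               # weighted sum + bias, then activation
y = sigmoid(W2 @ h + b2)               # output layer
print(y)
```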

Types of NN:
1. Feed-Forward ANN:
In a feed-forward network, the data or input travels in one direction. The data passes through the input nodes and exits at the output nodes. This neural network may or may not have hidden layers. In simple words, it has a forward-propagated wave and no backpropagation, usually using a classifying activation function.
Below is a single-layer feed-forward network. Here, the sum of the products of the inputs and weights is calculated and fed to the output.
Or

In this network, the flow of information is unidirectional: a unit sends information to another unit from which it does not receive any information, and no feedback loops are present. Such networks are used in pattern recognition, as they have fixed inputs and outputs.

2. Feedback ANN: Signals can travel in both directions in feedback neural networks. Feedback neural networks are very powerful and can get very complicated. Feedback neural networks are dynamic: the ‘state’ of such networks keeps changing until they reach an equilibrium point. They remain at the equilibrium point until the input changes and a new equilibrium needs to be found.

Feedback neural network architecture is also referred to as interactive or recurrent, although the latter term is often used to denote feedback connections in single-layer organisations. Feedback loops are allowed in such networks. They are used in content-addressable memories.

Advantages and Disadvantages of Neural Networks

Let us see a few advantages and disadvantages of neural networks:

 Neural networks perform well with both linear and nonlinear data, but a common criticism of neural networks, particularly in robotics, is that they require a large diversity of training examples for real-world operation. This is because any learning machine needs sufficiently representative examples in order to capture the underlying structure that allows it to generalize to new cases.
 Neural networks keep working even if one or a few units fail to respond, but to implement large and effective software neural networks, considerable processing and storage resources need to be committed. While the brain has hardware tailored to the task of processing signals through a graph of neurons, simulating even a most simplified form on Von Neumann technology may compel a neural network designer to fill millions of database rows for its connections, which can consume vast amounts of computer memory and hard disk space.
 Neural networks learn from the analyzed data and do not require reprogramming, but they are referred to as "black box" models and provide very little insight into what these models really do. The user just needs to feed in input, watch the network train, and await the output.

What is Perceptron

A neural network is an interconnected system of perceptrons, so it is safe to say that perceptrons are the foundation of any neural network. Perceptrons can be viewed as the building blocks of a single layer in a neural network, made up of four different parts:

1. Input values or one input layer
2. Weights and bias
3. Net sum
4. Activation function
Basic Components of Perceptron
Perceptron is a type of artificial neural network, which is a fundamental concept in machine
learning. The basic components of a perceptron are:
Input Layer: The input layer consists of one or more input neurons, which receive input signals
from the external world or from other layers of the neural network.
Weights: Each input neuron is associated with a weight, which represents the strength of the
connection between the input neuron and the output neuron.
Bias: A bias term is added to the input layer to provide the perceptron with additional flexibility
in modeling complex patterns in the input data.
Activation Function (or step function): The activation function determines the output of the perceptron based on the weighted sum of the inputs and the bias term, and thereby helps to determine whether the neuron will fire or not. Common activation functions used in perceptrons include the step function, sigmoid function, and ReLU function. In the basic perceptron, the activation function can be considered primarily a step function.
Types of Activation functions:
o Sign function
o Step function, and
o Sigmoid function
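A short sketch of the three activation functions listed above (the zero threshold in the step function is an illustrative choice):

```python
import math

def sign_fn(z):
    # Sign function: +1 or -1 depending on the sign of z.
    return 1 if z > 0 else -1

def step_fn(z, threshold=0.0):
    # Step function: fires (1) above the threshold, otherwise 0.
    return 1 if z > threshold else 0

def sigmoid_fn(z):
    # Sigmoid: smooth S-curve with output between 0 and 1.
    return 1.0 / (1.0 + math.exp(-z))

print(sign_fn(-0.3), step_fn(0.7), round(sigmoid_fn(0.0), 2))
```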

Output: The output of the perceptron is a single binary value, either 0 or 1, which indicates the
class or category to which the input data belongs.

Training Algorithm: The perceptron is typically trained using a supervised learning algorithm
such as the perceptron learning algorithm or backpropagation. During training, the weights and
biases of the perceptron are adjusted to minimize the error between the predicted output and
the true output for a given set of training examples.

Or Perceptron
A perceptron is a neural network unit (an artificial neuron) that does certain computations to
detect features or business intelligence in the input data.
Perceptron was introduced by Frank Rosenblatt in 1957. He proposed a Perceptron learning
rule based on the original MCP neuron.
A Perceptron is an algorithm for supervised learning of binary classifiers. This algorithm enables neurons to learn and to process elements in the training set one at a time.

There are two types of Perceptrons: Single layer and Multilayer.



Single layer Perceptrons can learn only linearly separable patterns.


Multilayer Perceptrons, or feedforward neural networks with two or more layers, have greater processing power.
The Perceptron algorithm learns the weights for the input signals in order to draw a linear
decision boundary.
This enables you to distinguish between the two linearly separable classes +1 and -1.
Perceptron Learning Rule:
Perceptron Learning Rule states that the algorithm would automatically learn the optimal
weight coefficients. The input features are then multiplied with these weights to determine if a
neuron fires or not.

The Perceptron receives multiple input signals, and if the sum of the input signals exceeds a
certain threshold, it either outputs a signal or does not return an output. In the context of
supervised learning and classification, this can then be used to predict the class of a sample.
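A hedged sketch of this learning rule applied to a tiny linearly separable dataset (the AND data, the learning rate, and the epoch count are illustrative choices, not part of the original notes):

```python
import numpy as np

# Tiny dataset: logical AND, labels in {0, 1}.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([0, 0, 0, 1])

w = np.zeros(2)      # weights
b = 0.0              # bias (threshold brought to the left)
lr = 0.1             # learning rate

for epoch in range(20):
    for xi, target in zip(X, y):
        output = 1 if np.dot(w, xi) + b > 0 else 0   # step activation
        error = target - output
        w += lr * error * xi                          # perceptron weight update
        b += lr * error

print(w, b)
print([1 if np.dot(w, xi) + b > 0 else 0 for xi in X])  # should reproduce AND
```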
How does a perceptron work?
The perceptron model begins with the multiplication of all input values and their weights, then
adds these values together to create the weighted sum. Then this weighted sum is applied to
the activation function 'f' to obtain the desired output. This activation function is also known as
the step function and is represented by 'f'.
Perceptron model works in two important steps as follows:
Step-1
In the first step, multiply all input values by their corresponding weight values and then add them up to determine the weighted sum. Mathematically, we can calculate the weighted sum as follows:

∑ wi*xi = x1*w1 + x2*w2 + … + xn*wn
Add a special term called bias 'b' to this weighted sum to improve the model's performance.
∑wi*xi + b
Step-2
In the second step, an activation function is applied with the above-mentioned weighted sum,
which gives us output either in binary form or a continuous value as follows:
Y = f(∑wi*xi + b)
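These two steps in code (the input values, weights, bias, and the sigmoid choice of f are example assumptions; a step function could be used instead):

```python
import math

x = [0.4, 0.3, 0.7]        # inputs x1..xn
w = [0.2, 0.8, 0.5]        # corresponding weights w1..wn
b = 0.1                    # bias term

# Step 1: weighted sum plus bias.
weighted_sum = sum(wi * xi for wi, xi in zip(w, x)) + b

# Step 2: pass the weighted sum through an activation function f.
y = 1.0 / (1.0 + math.exp(-weighted_sum))   # sigmoid as f
print(weighted_sum, round(y, 3))
```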

Perceptron Function

The perceptron function f(x) is computed from the product of the input vector (x) and the learned weight vector (w). In mathematical notation, it can be described as:

f(x) = 1 if w·x + b > 0, and f(x) = 0 otherwise

or

Perceptron is a function that maps its input “x”, which is multiplied with the learned weight coefficients; an output value “f(x)” is generated.

In the equation given above:

“w” = vector of real-valued weights

“b” = bias (an element that adjusts the boundary away from origin without any dependence on
the input value)

“x” = vector of input x values

“m” = number of inputs to the Perceptron

The output can be represented as “1” or “0.” It can also be represented as “1” or “-1” depending
on which activation function is used.

Inputs of a Perceptron

A Perceptron accepts inputs, moderates them with certain weight values, then applies the transformation function to output the final result. The figure below shows a Perceptron with a Boolean output.

A Boolean output is based on inputs such as salaried, married, age, past credit profile, etc. It has
only two values: Yes and No or True and False. The summation function “∑” multiplies all
inputs of “x” by weights “w” and then adds them up as follows:

Activation Functions of Perceptron:



The activation function applies a step rule (converting the numerical output into +1 or -1) to check whether the output of the weighting function is greater than zero.

For example:

If ∑ wi*xi > 0 => then final output “o” = 1 (issue bank loan)

Else, final output “o” = -1 (deny bank loan)

Step function gets triggered above a certain value of the neuron output; else it outputs zero. Sign
Function outputs +1 or -1 depending on whether neuron output is greater than zero or not.
Sigmoid is the S-curve and outputs a value between 0 and 1.

Output of Perceptron

Perceptron with a Boolean output:

Inputs: x1…xn

Output: o(x1….xn)

Weights: wi=> contribution of input xi to the Perceptron output;

w0=> bias or threshold

If ∑w.x > 0, output is +1, else -1. The neuron gets triggered only when weighted input reaches a
certain threshold value.

An output of +1 specifies that the neuron is triggered. An output of -1 specifies that the neuron
did not get triggered.

“sgn” stands for sign function with output +1 or -1.

Error in Perceptron

In the Perceptron Learning Rule, the predicted output is compared with the known output. If it
does not match, the error is propagated backward to allow weight adjustment to happen.

Perceptron: Decision Function

A decision function φ(z) of the Perceptron is defined to take a linear combination of the x and w vectors.

The value z in the decision function is this weighted sum: z = w1x1 + w2x2 + … + wmxm.

The decision function is +1 if z is greater than a threshold θ, and it is -1 otherwise.

This is the Perceptron algorithm.



Bias Unit:

For simplicity, the threshold θ can be brought to the left and represented as w0x0, where w0= -θ
and x0= 1.

The value w0 is called the bias unit.

The decision function then becomes: φ(z) = +1 if z = w0x0 + w1x1 + … + wmxm ≥ 0, and -1 otherwise.

Output

The figure shows how the decision function squashes wᵀx to either +1 or -1 and how it can be used to discriminate between two linearly separable classes.

Multi-Layer Neural Network

Single layer NN

A multilayer neural network, also known as a feedforward neural network or a deep neural
network, is a type of artificial neural network with multiple layers of artificial neurons, including
at least one hidden layer between the input and output layers. These networks are capable of
learning and representing complex patterns and relationships in data, making them suitable for a
wide range of machine learning tasks, including image recognition, natural language processing,
and more.
Multi-Layer Perceptron Neural Network is a Neural Network with multiple layers, and all its
layers are connected. It uses a Back-Propagation algorithm for training the model. Multilayer
Perceptron is a class of Deep Learning, also known as MLP.
Or
A multilayer perceptron (MLP) Neural network belongs to the feedforward neural network. It is
an Artificial Neural Network in which all nodes are interconnected with nodes of different
layers.
Frank Rosenblatt first defined the word Perceptron in his perceptron program. Perceptron is a
basic unit of an artificial neural network that defines the artificial neuron in the neural network. It
is a supervised learning algorithm containing nodes’ values, activation functions, inputs, and
weights to calculate the output.
The Multilayer Perceptron (MLP) neural network works only in the forward direction. All nodes are fully connected in the network, and each node passes its value to the next node only in the forward direction. The MLP neural network uses a backpropagation algorithm to increase the accuracy of the trained model.

In this figure, we have used circles to also denote the inputs to the network. The circles labeled
“+1” are called bias units, and correspond to the intercept term. The leftmost layer of the
network is called the input layer, and the rightmost layer the output layer (which, in this
example, has only one node). The middle layer of nodes is called the hidden layer, because its
values are not observed in the training set. We also say that our example neural network has 3
input units (not counting the bias unit), 3 hidden units, and 1 output unit.

Step 1: Find the weighted sum by performing a dot product between the inputs and the weights, and add the bias unit:

r = Σ (i = 1 to m) wi*xi + bias



Feeding r into the activation function F gives the output of a hidden-layer neuron. For the first neuron of the first hidden layer h1, this is:

h11 = F(r)

Repeat the same procedure for all the other hidden layers, until the last weight set is reached (a short sketch follows below).
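A short sketch of repeating this computation layer by layer until the last weight set is reached (the layer sizes, random weights, and sigmoid activation are illustrative assumptions):

```python
import numpy as np

def F(r):
    return 1.0 / (1.0 + np.exp(-r))     # activation function

rng = np.random.default_rng(1)
layer_sizes = [3, 4, 4, 1]              # input, two hidden layers, output
weights = [rng.normal(size=(m, n)) for n, m in zip(layer_sizes[:-1], layer_sizes[1:])]
biases = [np.zeros(m) for m in layer_sizes[1:]]

h = np.array([0.05, 0.10, 0.20])        # input vector
for W, b in zip(weights, biases):
    r = W @ h + b                       # r = sum_i w_i*x_i + bias
    h = F(r)                            # output of this layer feeds the next
print(h)
```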

Backpropagation Algorithm

Back-propagation: Backpropagation is an algorithm that propagates the errors from the output nodes back to the input nodes; therefore, it is simply referred to as the backward propagation of errors. It is used in many applications of neural networks in data mining, such as character recognition and signature verification.

In an artificial neural network, the values of weights and biases are randomly initialized. Due to
random initialization, the neural network probably has errors in giving the correct output. We
need to reduce error values as much as possible. So, to reduce these error values, we need a
mechanism that can compare the desired output of the neural network with the network’s output
that consists of errors and adjust its weights and biases such that it gets closer to the desired
output after each iteration. For this, we train the network such that it back propagates and updates
the weights and biases. This is the concept of the back propagation algorithm.

Or

The principle behind the back propagation algorithm is to reduce the error values in randomly
allocated weights and biases such that it produces the correct output. The system is trained in the
supervised learning method, where the error between the system’s output and a known expected
output is presented to the system and used to modify its internal state. We need to update the
weights so that we get the global loss minimum. This is how back propagation in neural
networks works.

Input values

X1=0.05
X2=0.10

Initial weight

W1=0.15 w5=0.40
W2=0.2 w6=0.45
W3=0.25 w7=0.50
W4=0.30 w8=0.55

Bias Values

b1=0.35 b2=0.60

Target Values

T1=0.01
T2=0.99

Now, we first calculate the values of H1 and H2 by a forward pass.

Forward Pass

To find the value of H1 we first multiply the input value from the weights as
H1=x1×w1+x2×w2+b1
H1=0.05×0.15+0.10×0.20+0.35
H1=0.3775

To calculate the final result of H1, we apply the sigmoid function:

H1final = 1 / (1 + e^(-H1)) = 1 / (1 + e^(-0.3775)) = 0.593269992

We will calculate the value of H2 in the same way as H1

H2=x1×w3+x2×w4+b1
H2=0.05×0.25+0.10×0.30+0.35
H2=0.3925

To calculate the final result of H2, we apply the sigmoid function in the same way:

H2final = 1 / (1 + e^(-H2)) = 1 / (1 + e^(-0.3925)) = 0.596884378



Now, we calculate the values of y1 and y2 in the same way as we calculated H1 and H2.
To find the value of y1, we first multiply the input values, i.e., the outcomes of H1 and H2, by the weights:
y1=H1×w5+H2×w6+b2
y1=0.593269992×0.40+0.596884378×0.45+0.60
y1=1.10590597
To calculate the final result of y1, we apply the sigmoid function:

y1final = 1 / (1 + e^(-y1)) = 1 / (1 + e^(-1.10590597)) = 0.75136507

We will calculate the value of y2 in the same way as y1


y2=H1×w7+H2×w8+b2
y2=0.593269992×0.50+0.596884378×0.55+0.60
y2=1.2249214
To calculate the final result of y2, we apply the sigmoid function:

y2final = 1 / (1 + e^(-y2)) = 1 / (1 + e^(-1.2249214)) = 0.772928465

Our target values are 0.01 and 0.99. Our y1 and y2 values do not match the target values T1 and T2.
Now, we will find the total error, which is the sum of the squared differences between the target outputs and the actual outputs:

Etotal = Σ ½ (target − output)²

So, the total error is

Etotal = ½ (0.01 − 0.75136507)² + ½ (0.99 − 0.772928465)² = 0.274811083 + 0.023560026 = 0.298371109



Now, we will backpropagate this error to update the weights using a backward pass.
Backward pass at the output layer
To update a weight, we calculate the error corresponding to that weight with the help of the total error. The error on weight w is calculated by differentiating the total error with respect to w.

We perform the backward pass, so first consider the last weight, w5:

From equation (2), it is clear that we cannot partially differentiate it with respect to w5 because it contains no w5 term. We split equation (1) into multiple terms, using the chain rule, so that we can easily differentiate it with respect to w5:

Now, we calculate each term one by one to differentiate Etotal with respect to w5 as

Putting the value of e^(−y1) into equation (5):

So, we put these values into equation (3) to find the final result.

Now, we will calculate the updated weight w5new with the help of the following formula (with learning rate η = 0.5, as implied by the updated values below):

w5new = w5 − η × ∂Etotal/∂w5

In the same way, we calculate w6new, w7new, and w8new, and this gives us the following values:

w5new=0.35891648
w6new=0.408666186
w7new=0.511301270
w8new=0.561370121

Backward pass at Hidden layer

Now, we will backpropagate to our hidden layer and update the weight w1, w2, w3, and w4 as
we have done with w5, w6, w7, and w8 weights.

We will calculate the error at w1 as

From equation (2), it is clear that we cannot partially differentiate it with respect to w1 because it contains no w1 term. We split equation (1) into multiple terms so that we can easily differentiate it with respect to w1:

Now, we calculate each term one by one to differentiate Etotal with respect to w1 as

We split this again because there is no H1final term in Etotal:

We will split again because there is no H1 term in E1 and E2. The splitting is done as follows:

We split both again, because there are no y1 and y2 terms in E1 and E2. We split them as:

Now, we find the value of this term by putting values into equations (18) and (19):

From equation (18)

From equation (8)

From equation (19)



Putting the value of e^(−y2) into equation (23):

From equation (21)



Now from equation (16) and (17)

Put this value into equation (15):



We now need to figure out the remaining term:

Putting the value of e^(−H1) into equation (30):

We calculate the partial derivative of the total net input to H1 with respect to w1 the same as we
did for the output neuron:

So, we put these values into equation (13) to find the final result.

Now, we will calculate the updated weight w1new with the help of the following formula (again with learning rate η = 0.5):

w1new = w1 − η × ∂Etotal/∂w1

In the same way, we calculate w2new, w3new, and w4new, and this gives us the following values:

w1new=0.149780716
w2new=0.19956143
w3new=0.24975114
w4new=0.29950229

We have now updated all the weights. The error on the network was 0.298371109 when we fed forward the inputs 0.05 and 0.1. After the first round of backpropagation, the total error is down to 0.291027924. After repeating this process 10,000 times, the total error drops to 0.0000351085. At this point, the output neurons generate 0.015912196 and 0.984065734, i.e., values close to our targets, when we feed forward 0.05 and 0.1.
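The whole worked example can be checked numerically. The sketch below reproduces the forward pass and one gradient-descent update; the learning rate of 0.5 is inferred from the updated weights given above rather than stated in the notes, and numpy is used for the matrix arithmetic:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

x = np.array([0.05, 0.10])
t = np.array([0.01, 0.99])                       # targets T1, T2
W1 = np.array([[0.15, 0.20], [0.25, 0.30]])      # w1..w4
W2 = np.array([[0.40, 0.45], [0.50, 0.55]])      # w5..w8
b1, b2, lr = 0.35, 0.60, 0.5                     # lr = assumed learning rate

# Forward pass.
h = sigmoid(W1 @ x + b1)                         # H1final, H2final
y = sigmoid(W2 @ h + b2)                         # y1final, y2final
E = 0.5 * np.sum((t - y) ** 2)
print("outputs:", y, "total error:", E)          # error ~ 0.298371109

# Backward pass (chain rule), then one weight update.
delta_out = (y - t) * y * (1 - y)                # dE/dz at the output layer
delta_hid = (W2.T @ delta_out) * h * (1 - h)     # dE/dz at the hidden layer
W2 -= lr * np.outer(delta_out, h)                # updates w5..w8
W1 -= lr * np.outer(delta_hid, x)                # updates w1..w4
print("updated W2:", W2)                         # ~ [[0.3589, 0.4087], [0.5113, 0.5614]]
print("updated W1:", W1)                         # ~ [[0.1498, 0.1996], [0.2498, 0.2995]]
```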

What is Deep Learning?

Deep learning is a branch of machine learning that is based entirely on artificial neural networks with three or more layers. Since a neural network mimics the human brain, deep learning is also a kind of mimicry of the human brain. In deep learning, we do not need to explicitly program everything.
Deep learning has aided image classification, language translation, and speech recognition. It can be used to solve many pattern recognition problems without human intervention.

Deep learning is the branch of machine learning which is based on artificial neural network
architecture. An artificial neural network or ANN uses layers of interconnected nodes called
neurons that work together to process and learn from the input data.
In a fully connected Deep neural network, there is an input layer and one or more hidden layers
connected one after the other. Each neuron receives input from the previous layer neurons or the
input layer. The output of one neuron becomes the input to other neurons in the next layer of the
network, and this process continues until the final layer produces the output of the network. The
layers of the neural network transform the input data through a series of nonlinear
transformations, allowing the network to learn complex representations of the input data.

Examples of DL:
1. Self-driving cars
2. Chatbots
3. Facial recognition
4. Speech recognition
Architectures:

1. Deep Neural Network (DNN) – A neural network with a certain level of complexity (having multiple hidden layers between the input and output layers). DNNs are capable of modeling and processing non-linear relationships.
2. Deep Belief Network (DBN) – A class of deep neural network made up of multiple layers of belief networks. Steps for training a DBN:
   a. Learn a layer of features from the visible units using the Contrastive Divergence algorithm.
   b. Treat the activations of the previously trained features as visible units and then learn features of features.
   c. Finally, the whole DBN is trained when the learning for the final hidden layer is achieved.
3. Recurrent Neural Network (RNN) – Performs the same task for every element of a sequence and allows for parallel and sequential computation. Similar to the human brain (a large feedback network of connected neurons), RNNs are able to remember important things about the input they received, which enables them to be more precise.

Difference between Machine Learning and Deep Learning:

Machine Learning | Deep Learning
Applies statistical algorithms to learn the hidden patterns and relationships in the dataset. | Uses artificial neural network architectures to learn the hidden patterns and relationships in the dataset.
Can work with a smaller amount of data. | Requires a larger volume of data compared to machine learning.
Better for simpler, low-label tasks. | Better for complex tasks like image processing, natural language processing, etc.
Takes less time to train the model. | Takes more time to train the model.
A model is created from relevant features that are manually extracted from images to detect an object in the image. | Relevant features are automatically extracted from images; it is an end-to-end learning process.
Less complex, and it is easy to interpret the results. | More complex; it works like a black box, and interpretations of the results are not easy.
Can work on a CPU, or requires less computing power compared to deep learning. | Requires a high-performance computer with a GPU.

Working:
First, we need to identify the actual problem in order to get the right solution, and the feasibility of deep learning for it should be checked (whether it fits deep learning or not). Second, we need to identify the relevant data corresponding to the actual problem and prepare it accordingly. Third, choose the deep learning algorithm appropriately. Fourth, use the algorithm to train the model on the dataset. Fifth, perform final testing on the dataset.
