
UNIT VI

INTRODUCTION
TO
ARTIFICIAL NEURAL NETWORK
CONTENTS:
 Perceptron Learning – Biological Neuron, Introduction to
ANN, McCulloch Pitts Neuron, Perceptron and its Learning
Algorithm, Sigmoid Neuron, Activation Functions: Tanh,
ReLU
 Multi-layer Perceptron Model – Introduction, Learning
parameters: Weight and Bias, Loss function: Mean Square
Error
 Introduction to Deep Learning
PERCEPTRON:
 The perceptron is a mathematical model of a biological neuron. While in
actual neurons the dendrite receives electrical signals from the axons of
other neurons, in the perceptron these electrical signals are represented as
numerical values.
 At the synapses between the axons and dendrites, electrical signals are
modulated in various amounts.
 This is also modeled in the perceptron by multiplying each input value by
a value called the weight.
 An actual neuron fires an output signal only when the total strength of the
input signals exceeds a certain threshold.
 We model this phenomenon in a perceptron by calculating the weighted
sum of the inputs to represent the total strength of the input signals, and
applying a step function to the sum to determine its output.
 As in biological neural networks, this output is fed to other perceptrons.
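A minimal sketch of this weighted-sum-and-step computation in Python (NumPy is assumed; the inputs, weights, and threshold below are made-up illustration values, and the function name perceptron_output is hypothetical):

```python
import numpy as np

def perceptron_output(x, w, threshold):
    """Weighted sum of the inputs followed by a step function."""
    weighted_sum = np.dot(w, x)            # total strength of the input signals
    return 1 if weighted_sum >= threshold else 0

# Hypothetical example: two inputs, two weights, one threshold
x = np.array([1.0, 0.0])
w = np.array([0.6, 0.4])
print(perceptron_output(x, w, threshold=0.5))   # -> 1
```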

BIOLOGICAL NEURON:
Neuron Definition
 “Neurons are the fundamental unit of the nervous
system specialized to transmit information to
different parts of the body.”
What is a Neuron?
 Neurons are the building blocks of the nervous system.
They receive and transmit signals to different parts of
the body. This is carried out in both chemical and
electrical forms. There are several different types of
neurons that facilitate the transmission of information.
 The sensory neurons carry information from the
sensory receptor cells present throughout the body to
the brain, whereas the motor neurons transmit
information from the brain to the muscles. The
interneurons transmit information between different
neurons in the body.
Neuron Structure
A neuron varies in shape and size depending on its function and
location. All neurons have three different parts – dendrites, cell body
and axon.

 Parts of Neuron
 Following are the different parts of a neuron:

 Dendrites
 These are branch-like structures that receive messages from other
neurons and allow the transmission of messages to the cell body.

 Cell Body
 Each neuron has a cell body with a nucleus, Golgi body, endoplasmic
reticulum, mitochondria and other components.

 Axon
 The axon is a tube-like structure that carries electrical impulses from the
cell body to the axon terminals, which pass the impulse to another
neuron.
 Synapse
 It is the chemical junction between the terminal of one neuron and the
dendrites of another neuron.
INTRODUCTION TO ANN:
 Artificial Neural Networks (ANNs) are algorithms inspired by brain function
and are used to model complex patterns and solve prediction problems.
 The Artificial Neural Network (ANN) is a deep learning method that arose
from the concept of the human brain’s Biological Neural Networks.
 The development of ANN was the result of an attempt to replicate the
workings of the human brain.
 The workings of ANN are extremely similar to those of biological neural
networks, although they are not identical.
 The ANN algorithm accepts only numeric and structured data.
WHAT IS ARTIFICIAL NEURAL NETWORK (ANN)?
 An Artificial Neural Network (ANN) is a computational
model inspired by the human brain’s neural structure. It
consists of interconnected nodes (neurons) organized into
layers. Information flows through these nodes, and the
network adjusts the connection strengths (weights) during
training to learn from data, enabling it to recognize patterns,
make predictions, and solve various tasks in machine
learning and artificial intelligence.
MCCULLOCH PITTS NEURON
 The McCulloch-Pitts neural model, which was the earliest ANN
model, has only two types of inputs — Excitatory and
Inhibitory. The excitatory inputs have weights of positive
magnitude and the inhibitory inputs have weights of negative
magnitude. The inputs of the McCulloch-Pitts neuron can be
either 0 or 1. It has a threshold function as an activation function.
So, the output signal yout is 1 if the input ysum is greater than or
equal to a given threshold value, else 0. The diagrammatic
representation of the model is on the next slide:
MCCULLOCH PITTS NEURON
Fig: McCulloch-Pitts Model
MCCULLOCH PITTS NEURON
 Simple McCulloch-Pitts neurons can be used to design logical
operations. For that purpose, the connection weights need to be
correctly decided along with the threshold value of the activation
function. For a better understanding, let me consider an example:
 John carries an umbrella if it is sunny or if it is raining. There are
four given situations. I need to decide when John will carry the
umbrella. The situations are as follows:
• First scenario: It is not raining, nor is it sunny
• Second scenario: It is not raining, but it is sunny
• Third scenario: It is raining, and it is not sunny
• Fourth scenario: It is raining as well as sunny
MCCULLOCH PITTS NEURON
 To analyze the situations using the McCulloch-Pitts neural model, I
can consider the input signals as follows:
• X1: Is it raining?
• X2: Is it sunny?
 So, the value of both inputs can be either 0 or 1. We can set the
weights of both X1 and X2 to 1 and the threshold value to 1.
So, the neural network model will look like:
MCCULLOCH PITTS NEURON
 Truth Table for this case will be:

Situation   x1   x2   ysum   yout
    1        0    0     0      0
    2        0    1     1      1
    3        1    0     1      1
    4        1    1     2      1
 So, I can say that,
 The truth table built with respect to the
problem is depicted above. From the truth
table, I can conclude that in the situations
where the value of yout is 1, John needs to
carry an umbrella. Hence, he will need to
carry an umbrella in scenarios 2, 3 and 4.
EXAMPLE: MCCULLOCH PITTS NEURON
 The fundamental block of Deep Learning is an Artificial Neuron.
 This is what an Artificial Neuron looks like. It takes a bunch
of inputs (say x1, x2, x3 and so on); these inputs would be factors/features
based on which we make the decision, and there are some weights assigned
to each of the inputs. So, an Artificial Neuron takes a weighted aggregate
of the inputs (the weights are just 1 in the case of the MP Neuron), applies
some function to this weighted aggregate, and gives an output.
 The MP Neuron model is also known as a linear threshold gate.
MCCULLOCH PITTS NEURON:
 The function is split into two parts: g and f.
 g sums up all the inputs (weighted sum) and then f takes g as its
input.
 In this case, g just sums up all the inputs, and since all
the inputs are Boolean, we are basically counting
the number of things which are on (have a value of 1) in the
input set; that’s what the summation means when all
the weights are just 1.
 Now, this value of g is passed to the other function f, which
outputs 1 (meaning the neuron fires) if the summation
of inputs (stored in g) is greater than or equal to some
threshold, and outputs 0 if the summation of the inputs
is less than the threshold.
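A minimal sketch of this g/f decomposition, applied to the umbrella example from the earlier slides (the function name mp_neuron is hypothetical):

```python
def mp_neuron(inputs, threshold):
    """McCulloch-Pitts neuron: g sums the Boolean inputs, f applies the threshold."""
    g = sum(inputs)                    # all weights are 1, so g counts the inputs that are on
    return 1 if g >= threshold else 0  # f: fire only when the count reaches the threshold

# Umbrella example: x1 = "is it raining?", x2 = "is it sunny?", threshold = 1
for x1 in (0, 1):
    for x2 in (0, 1):
        print(x1, x2, "->", mp_neuron([x1, x2], threshold=1))
```

Running this reproduces the truth table shown earlier: the neuron fires in scenarios 2, 3 and 4.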
MCCULLOCH PITTS NEURON :
SIGMOID NEURON:
 Introducing sigmoid neurons, where the output
function is much smoother than the step function. In
the sigmoid neuron, a small change in the input only
causes a small change in the output, as opposed to the
stepped output. There are many functions with the
characteristic “S”-shaped curve known as
sigmoid functions. The most commonly used function
is the logistic function.
 We no longer see a sharp transition at the threshold b.
The output from the sigmoid neuron is not 0 or 1.
Instead, it is a real value between 0 and 1, which can be
interpreted as a probability.
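Since the contents also list Tanh and ReLU, here is a minimal sketch of a sigmoid neuron alongside those two activation functions (NumPy is assumed; the input values are made up):

```python
import numpy as np

def sigmoid_neuron(x, w, b):
    """Logistic activation over the weighted sum plus bias: smooth output in (0, 1)."""
    z = np.dot(w, x) + b
    return 1.0 / (1.0 + np.exp(-z))

def tanh(z):
    """S-shaped like the sigmoid, but outputs values in (-1, 1)."""
    return np.tanh(z)

def relu(z):
    """Rectified Linear Unit: passes positive values, zeroes out negatives."""
    return np.maximum(0.0, z)

x = np.array([1.0, 0.5])
w = np.array([0.8, -0.4])
print(sigmoid_neuron(x, w, b=0.1))   # ~0.67, interpretable as a probability
```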
SIGMOID NEURON:
LINEAR MODEL:
 Linear regression is a linear approach for modelling the relationship
between a scalar response and one or more explanatory variables (also
known as dependent and independent variables).
TERMINOLOGIES RELATED TO THE REGRESSION ANALYSIS:
 Dependent Variable: The main factor in regression analysis that we want to
predict or understand is called the dependent variable. It is also called the target
variable.
 Independent Variable: The factors that affect the dependent variable, or that are
used to predict the values of the dependent variable, are called independent variables,
also called predictors.
 Outliers: An outlier is an observation that contains either a very low value or a very high
value in comparison to other observed values. Outliers are abnormal values in a
dataset that don’t follow the regular distribution and have the potential to
significantly distort any regression model, so they should be handled carefully.
 Multicollinearity: If the independent variables are highly correlated with each
other, this condition is called multicollinearity. It should not be
present in the dataset, because it creates problems when ranking the most important
variables.
WHY DO WE USE REGRESSION ANALYSIS?
 Regression estimates the relationship between the target and the independent
variable.
 It is used to find the trends in data.
 It helps to predict real/continuous values.
 By performing the regression, we can confidently determine the most important
factor, the least important factor, and how each factor is affecting the other
factors.

 Types of Regression
 Linear Regression
 Logistic Regression
 Polynomial Regression
 Support Vector Regression
 Decision Tree Regression
 Random Forest Regression
 Ridge Regression
 Lasso Regression
LINEAR REGRESSION
 Linear regression is one of the easiest and most popular
Machine Learning algorithms.
 It is a statistical method that is used for predictive analysis.
Linear regression makes predictions for continuous/real or
numeric variables such as sales, salary, age, product price, etc.
 The linear regression algorithm shows a linear relationship between
a dependent (y) and one or more independent (x) variables,
hence it is called linear regression.
 Since linear regression shows a linear relationship, it
finds how the value of the dependent variable
changes according to the value of the independent variable.
LINEAR REGRESSION
LINEAR REGRESSION
 Mathematically, we can represent a linear regression as:
 y = a0 + a1x + ε
 y = Dependent Variable (Target Variable)
x = Independent Variable (Predictor Variable)
a0 = Intercept of the line (gives an additional degree of freedom)
a1 = Linear regression coefficient (scale factor to each input value)
ε = Random error
 The values for the x and y variables are training datasets for the Linear
Regression model representation.
TYPES OF LINEAR REGRESSION
 Linear regression can be further divided into two types of
algorithm:
 Simple Linear Regression:
If a single independent variable is used to predict the value of a
numerical dependent variable, then such a Linear Regression
algorithm is called Simple Linear Regression.
 Multiple Linear Regression:
If more than one independent variable is used to predict the
value of a numerical dependent variable, then such a Linear
Regression algorithm is called Multiple Linear Regression.
FINDING THE BEST FIT LINE
 When working with linear regression, our main goal is to find the best
fit line, which means the error between predicted values and actual values
should be minimized. The best fit line will have the least error.
 Different values for the weights or coefficients of the line (a0, a1) give
a different line of regression, so we need to calculate the best values
for a0 and a1 to find the best fit line; to calculate this we use a cost
function.
FINDING THE BEST FIT LINE
UNIVARIATE REGRESSION - LEAST SQUARE METHOD:
 Univariate linear regression focuses on determining the relationship between one
independent variable (explanatory variable) and one dependent variable.
 Univariate data is the type of data in which the result depends only on one
variable.
UNIVARIATE REGRESSION:
 It is also called simple linear regression.
 The equation of univariate linear regression is of the form
 y = b + mx

Features of the best fit regression line
 The regression line results in the minimum sum of errors.
 It does not need to go through all data points.
 It does not need the same number of data points above and below it.
MODEL REPRESENTATION:
COST FUNCTIONS:
 It is a mechanism utilized in supervised machine learning.
 The cost function returns the error between predicted outcomes and
the actual outcomes.
 In other words, it quantifies how costly the model’s mistakes are.
 A cost function is a measure of how wrong the model is in terms of its
ability to estimate the relationship between X and y.
 Loss function: Used when we refer to the error for a single training
example.
 Cost function: Used to refer to an average of the loss functions over an
entire training dataset.
WHY ON EARTH DO WE NEED A COST FUNCTION?
 Why on earth do we need a cost function? Consider a scenario where we
wish to classify data. Suppose we have the height & weight details of
some cats & dogs.
WHY ON EARTH DO WE NEED A COST FUNCTION?
 Blue dots are cats & red dots are dogs. Following are some solutions
to the above classification problem.
 Essentially all three classifiers have very high accuracy but the third
solution is the best because it does not misclassify any point. The
reason why it classifies all the points perfectly is that the line is
almost exactly in between the two groups, and not closer to any one
of the groups. This is where the concept of cost function comes in.
Cost function helps us reach the optimal solution. The cost function
is the technique of evaluating “the performance of our
algorithm/model”.
REGRESSION COST FUNCTION:
 Regression models deal with predicting a continuous value,
for example the salary of an employee, the price of a car, loan
prediction, etc. A cost function used in a regression
problem is called a “Regression Cost Function”. They are
calculated on the distance-based error as follows:
 Error = y − y′
 Where,
 y – Actual output
 y′ – Predicted output
The most used Regression cost functions are below:
 Mean Error (ME)
 Mean Squared Error (MSE)
 Mean Absolute Error (MAE)
 R-Squared
MEAN ERROR (ME):
 In this cost function, the error for each training data is
calculated and then the mean value of all these errors is
derived.
 The errors can be both negative and positive. So they can
cancel each other out during summation giving zero mean
error for the model.
MEAN SQUARED ERROR (MSE):
 This addresses the drawback we encountered in Mean Error above. Here the
square of the difference between the actual and predicted value is calculated
to avoid any possibility of negative error.
 It is measured as the average of the sum of squared differences between
predictions and actual observations.
 MSE = (sum of squared errors)/n
 It is also known as L2 loss.
 In MSE, since each error is squared, it helps to penalize even small
deviations in prediction when compared to MAE. But if our dataset has
outliers that contribute to larger prediction errors, then squaring this error
will magnify it many times over and lead to a higher MSE.
 Hence we can say that it is less robust to outliers.
MEAN ABSOLUTE ERROR (MAE):
 This cost function also addresses the shortcoming of mean
error differently. Here an absolute difference between the
actual and predicted value is calculated to avoid any
possibility of negative error.
 So in this cost function, MAE is measured as the average of
the sum of absolute differences between predictions and
actual observations.
 MAE = (sum of absolute errors)/n
 It is also known as L1 Loss.
 It is robust to outliers thus it will give better results even
when our dataset has noise or outliers.
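A minimal sketch of ME, MSE and MAE as code (NumPy is assumed; the sample actual and predicted values are fabricated for illustration):

```python
import numpy as np

def mean_error(y, y_pred):
    return np.mean(y - y_pred)           # signed errors can cancel out to zero

def mse(y, y_pred):
    return np.mean((y - y_pred) ** 2)    # squaring penalizes larger deviations more

def mae(y, y_pred):
    return np.mean(np.abs(y - y_pred))   # absolute values; more robust to outliers

y      = np.array([3.0, 5.0, 2.0, 7.0])
y_pred = np.array([2.5, 5.5, 2.0, 8.0])
print(mean_error(y, y_pred), mse(y, y_pred), mae(y, y_pred))
# -> -0.25 0.375 0.5
```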
MEAN SQUARED ERROR (MSE)
 For Linear Regression, we use the Mean Squared Error
(MSE) cost function, which is the average of squared error
occurred between the predicted values and actual values. It
can be written as:
 For the above linear equation, MSE can be calculated as:
 MSE = (1/N) Σᵢ (Yi − (a1xi + a0))²
 Where,
 N = Total number of observations
Yi = Actual value
(a1xi + a0) = Predicted value
 Residuals: The distance between the actual value and the
predicted value is called the residual. If the observed points are
far from the regression line, the residuals will be high, and
so the cost function will be high. If the scatter points are close to the
regression line, the residuals will be small, and hence so will the
cost function.
GRADIENT DESCENT:
 Gradient descent is used to minimize the MSE by
calculating the gradient of the cost function.
 A regression model uses gradient descent to update
the coefficients of the line by reducing the cost
function.
 It is done by randomly selecting initial values of the
coefficients and then iteratively updating the values to
reach the minimum of the cost function.
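A minimal sketch of gradient descent for the line y = a0 + a1·x (NumPy is assumed; the learning rate, number of epochs, and data points are hypothetical choices):

```python
import numpy as np

def gradient_descent(x, y, lr=0.01, epochs=2000):
    """Fit y ~ a0 + a1*x by repeatedly stepping against the gradient of the MSE."""
    a0, a1 = 0.0, 0.0                       # arbitrary starting coefficients
    n = len(x)
    for _ in range(epochs):
        y_pred = a0 + a1 * x
        d_a0 = (-2.0 / n) * np.sum(y - y_pred)        # dMSE/da0
        d_a1 = (-2.0 / n) * np.sum((y - y_pred) * x)  # dMSE/da1
        a0 -= lr * d_a0
        a1 -= lr * d_a1
    return a0, a1

x = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([3.1, 4.9, 7.2, 8.8])          # roughly y = 1 + 2x
print(gradient_descent(x, y))               # approaches (1.0, 2.0)
```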

 Model Performance:
 The goodness of fit determines how well the line of
regression fits the set of observations. The process of
finding the best model out of various models is
called optimization. It can be achieved by the methods below:
MEAN SQUARED ERROR (MSE)
 Mean squared error (MSE) is the average of the sum of
squared differences between the actual value and the predicted
or estimated value. It is also termed mean squared
deviation (MSD). This is how it is represented
mathematically:
 MSE = (1/n) Σᵢ (Yᵢ − Yᵢ′)²
 The value of MSE is always greater than or equal to zero. A
value close to zero represents better quality of the
estimator/predictor (regression model). An MSE of zero
(0) means that the predictor is a
perfect predictor. When you take the square root of the
MSE value, it becomes the root mean squared error
(RMSE). In the above equation, Y represents the actual
value and Y′ is the predicted value. Here is the
diagrammatic representation of MSE:
MEAN SQUARED ERROR (MSE)
MEAN ABSOLUTE ERROR (MAE)
 MAE is a very simple metric that calculates the
absolute difference between actual and predicted
values.
 To better understand, let’s take an example: you
have input data and output data and use Linear
Regression, which draws a best-fit line.
 Now you have to find the MAE of your model, which
is basically the mistake made by the model, known
as the error. Find the difference between the
actual value and the predicted value; that is one
absolute error, but we have to find the mean
absolute error over the complete dataset.
 So, sum all the errors and divide by the total
number of observations, and this is the MAE. We
aim for a minimum MAE because this is a loss.
MEAN ABSOLUTE ERROR (MAE)
R-SQUARED METHOD:
 R-squared is a statistical method that determines the
goodness of fit.
 It measures the strength of the relationship between the
dependent and independent variables on a scale of 0-100%.
 0% indicates that the model explains none of the variability
of the response data around its mean.
 100% indicates that the model explains all the variability of
the response data around its mean.
 The high value of R-square determines the less difference
between the predicted values and actual values and hence
represents a good model.
 It is also called the coefficient of
determination, or the coefficient of multiple
determination for multiple regression.
 It can be calculated from the formula below:
 R² = 1 − (Σ(Yᵢ − Yᵢ′)² / Σ(Yᵢ − Ȳ)²)
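A minimal sketch of this computation (NumPy is assumed; the sample values are fabricated):

```python
import numpy as np

def r_squared(y, y_pred):
    ss_res = np.sum((y - y_pred) ** 2)        # unexplained variation
    ss_tot = np.sum((y - np.mean(y)) ** 2)    # total variation around the mean
    return 1.0 - ss_res / ss_tot

y      = np.array([3.0, 5.0, 2.0, 7.0])
y_pred = np.array([2.8, 5.3, 2.1, 6.9])
print(r_squared(y, y_pred))                   # ~0.99: close to 1 means a good fit
```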
LEAST-SQUARE METHOD
 The least-squares method is a form of mathematical
regression analysis used to determine the
line of best fit for a set of data, providing a visual
demonstration of the relationship between the data
points. Each point of data represents the
relationship between a known independent variable
and an unknown dependent variable.
 For the line y = b + mx, the least-squares estimates are:
 m = Σ(xᵢ − x̅)(yᵢ − ȳ) / Σ(xᵢ − x̅)²
 b = ȳ − m·x̅
 Here x̅ is the mean of all the values in the
input X and ȳ is the mean of all the values in the
desired output Y. This is the Least Squares method.
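A minimal sketch of these formulas as code (NumPy is assumed; the data points are fabricated):

```python
import numpy as np

def least_squares_fit(x, y):
    """Closed-form slope m and intercept b for the line y = b + m*x."""
    x_mean, y_mean = np.mean(x), np.mean(y)
    m = np.sum((x - x_mean) * (y - y_mean)) / np.sum((x - x_mean) ** 2)
    b = y_mean - m * x_mean
    return b, m

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 4.2, 5.9, 8.1, 9.8])
print(least_squares_fit(x, y))   # roughly b = 0.23, m = 1.93
```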
SIMPLE LINEAR REGRESSION
 Simple Linear Regression is a type of regression algorithm
that models the relationship between a dependent variable
and a single independent variable. The relationship shown
by a Simple Linear Regression model is linear, or a sloped
straight line, hence it is called Simple Linear Regression.
 The key point in Simple Linear Regression is that
the dependent variable must be a continuous/real
value. However, the independent variable can be
measured on continuous or categorical values.
 The Simple Linear Regression algorithm has mainly two
objectives:
 Model the relationship between the two
variables, such as the relationship between income and
expenditure, experience and salary, etc.
 Forecast new observations, such as weather
forecasting according to temperature, or the revenue of a
company according to the investments in a year, etc.
SIMPLE LINEAR REGRESSION MODEL:
 The Simple Linear Regression model can be represented
using the equation below:
 y = a0 + a1x + ε
 Where,
 a0 = The intercept of the regression line (can be
obtained by putting x = 0)
a1 = The slope of the regression line, which tells
whether the line is increasing or decreasing.
ε = The error term. (For a good model it will be
negligible.)
MULTIVARIATE REGRESSION
 Multivariate Regression is a supervised machine learning algorithm
involving multiple data variables for analysis. Multivariate regression
is an extension of multiple regression with one dependent variable and
multiple independent variables. Based on the number of independent
variables, we try to predict the output.
 Multivariate regression tries to find a formula that can explain how
factors in variables respond simultaneously to changes in others.
 The equation for a model with two input variables can be written as:
 y = β0 + β1.x1 + β2.x2
 The equation for a model with three input variables can be written as:
 y = β0 + β1.x1 + β2.x2 + β3.x3
 Below is the generalized equation for the multivariate regression
model:
 y = β0 + β1.x1 + β2.x2 + ... + βn.xn
 Where n represents the number of independent variables, β0…βn
represent the coefficients, and x1…xn are the independent variables.
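A minimal sketch of fitting such a model with two input variables, using NumPy’s least-squares solver (the data are fabricated so that y = 1 + 2·x1 + 3·x2 holds exactly):

```python
import numpy as np

# Fabricated data satisfying y = 1 + 2*x1 + 3*x2 exactly
X = np.array([[1.0, 2.0],
              [2.0, 1.0],
              [3.0, 4.0],
              [4.0, 3.0]])
y = np.array([9.0, 8.0, 19.0, 18.0])

# Prepend a column of ones so the intercept beta0 is estimated too
X_b = np.hstack([np.ones((X.shape[0], 1)), X])

# Least-squares solution for beta = [beta0, beta1, beta2]
beta, *_ = np.linalg.lstsq(X_b, y, rcond=None)
print(beta)   # -> approximately [1. 2. 3.]
```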
COST FUNCTION
 In simple words, it is a function that assigns a cost to instances where the
model deviates from the observed data. In this case, our cost is the sum of
squared errors. The cost function for multiple linear regression is given by:
 J(β) = (1/2n) Σᵢ (ŷᵢ − yᵢ)²
 We can understand this equation as the summation of the squares of the
differences between our predicted values and the actual values, divided by twice
the length of the data set. A smaller mean squared error implies better performance.
Generally a cost function is used along with the Gradient Descent algorithm
to find the best parameters.
 Cost functions are used to estimate how badly models are performing. Put
simply, a cost function is a measure of how wrong the model is in terms of its
ability to estimate the relationship between X and y. This is typically
expressed as a difference or distance between the predicted value and the
actual value.
POLYNOMIAL REGRESSION
 Polynomial Regression is a regression algorithm that models the
relationship between a dependent (y) and independent variable (x) as an nth
degree polynomial. The Polynomial Regression equation is given below:
y = b0 + b1x1 + b2x1² + b3x1³ + ... + bnx1ⁿ
 It is also called the special case of Multiple Linear Regression in ML,
because we add some polynomial terms to the Multiple Linear
Regression equation to convert it into Polynomial Regression.
 It is a linear model with some modification in order to increase the
accuracy.
 The dataset used in Polynomial Regression for training is of a non-linear
nature.
 It makes use of a linear regression model to fit complicated, non-
linear functions and datasets.
 Hence, "In Polynomial regression, the original features are converted
into Polynomial features of the required degree (2, 3, ..., n) and then modeled
using a linear model."
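A minimal sketch of this idea: expand the original feature into polynomial features and fit an ordinary linear model on them (NumPy is assumed; the quadratic data are fabricated):

```python
import numpy as np

# Fabricated non-linear data: y = 1 + 2x + 0.5x^2
x = np.array([-2.0, -1.0, 0.0, 1.0, 2.0, 3.0])
y = 1.0 + 2.0 * x + 0.5 * x ** 2

# Convert the original feature into polynomial features [1, x, x^2] ...
X_poly = np.column_stack([np.ones_like(x), x, x ** 2])

# ... then model them with plain linear least squares
coeffs, *_ = np.linalg.lstsq(X_poly, y, rcond=None)
print(coeffs)   # -> approximately [1.0, 2.0, 0.5]
```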
NEED FOR POLYNOMIAL REGRESSION:
 The need for Polynomial Regression in ML can be
understood from the points below:
 If we apply a linear model to a linear dataset,
it provides a good result, as we have seen
in Simple Linear Regression, but if we apply the
same model without any modification to a non-
linear dataset, it will produce a drastically worse
output: the loss function will increase,
the error rate will be high, and the accuracy will
decrease.
 So for such cases, where data points are
arranged in a non-linear fashion, we need
the Polynomial Regression model. We can
understand it in a better way using the
comparison diagram of the linear dataset and non-linear dataset below.
NEED FOR POLYNOMIAL REGRESSION:
 In the image below, we have taken a dataset which is arranged
non-linearly. So if we try to cover it with a linear model, we
can clearly see that it hardly covers any data point. On the other
hand, a curve is suitable to cover most of the data points, which
is what the Polynomial model provides.
 Hence, if the datasets are arranged in a non-linear fashion, then
we should use the Polynomial Regression model instead of
Simple Linear Regression.
NEED FOR POLYNOMIAL REGRESSION:
 When we compare the above three equations, we
can clearly see that all three equations are
polynomial equations but differ by the degree of the
variables. The Simple and Multiple Linear equations
are also polynomial equations of degree one,
and the Polynomial Regression equation is a linear
equation with nth-degree terms. So if we add a
degree to our linear equations, they will be
converted into Polynomial Regression equations.
GENERALIZATION
 The main goal of each machine learning model
is to generalize well.
 Here generalization defines the ability of an ML
model to provide a suitable output for a
given set of unseen inputs.
 It means that after being trained on the dataset, the model
can produce reliable and accurate output.
 Hence, underfitting and overfitting are the two
terms that need to be checked to judge the
performance of the model and whether the model
is generalizing well or not.
BIAS AND VARIANCE
 Bias: Bias is a prediction error that is introduced in
the model due to oversimplifying the machine
learning algorithms. Or it is the difference between
the predicted values and the actual values.

 Variance: If the machine learning model performs


well with the training dataset, but does not perform
well with the test dataset, then variance occurs.
BIAS VS. VARIANCE
BIAS-VARIANCE TRADEOFF
 The two are complementary to each other. In other
words, if the bias of a model is decreased, the
variance of the model automatically increases. The
vice-versa is also true; that is, if the variance of a
model decreases, the bias starts to increase.
 Hence, it can be concluded that it is nearly
impossible to have a model with no bias and no
variance, since decreasing one increases the other.
This phenomenon is known as the Bias-Variance
Trade-off.
BIAS-VARIANCE TRADEOFF
 Another way of looking at the Bias-Variance Tradeoff graphically is
to plot the graphical representation for error, bias, and variance
versus the complexity of the model. In the graph shown below, the
green dotted line represents variance, the blue dotted line
represents bias and the red solid line represents the error in the
prediction of the concerned model.
 Since bias is high for a simpler model and decreases with an
increase in model complexity, the line representing bias
exponentially decreases as the model complexity increases.
 Similarly, Variance is high for a more complex model and is low for
simpler models. Hence, the line representing variance increases
exponentially as the model complexity increases.
 Finally, it can be seen that on either side, the generalization error is
quite high. Both high bias and high variance lead to a higher error
rate.
 The most optimal complexity of the model is right in the middle,
where the bias and variance intersect. This part of the graph is
shown to produce the least error and is preferred.
 Also, as discussed earlier, the model underfits for high-bias
situations and overfits for high-variance situations.
OVERFITTING
 Overfitting occurs when our machine learning
model tries to cover all the data points, or more than the
required data points, present in the given dataset. Because of
this, the model starts capturing noise and inaccurate values
present in the dataset, and all these factors reduce the
efficiency and accuracy of the model. The overfitted model
has low bias and high variance.
 The chances of overfitting increase the more training
we provide to our model: the more we train
our model, the greater the chance of obtaining an overfitted
model.
 Overfitting is the main problem that occurs in
supervised learning.
UNDERFITTING
 Underfitting occurs when our machine learning
model is not able to capture the underlying trend
of the data. To avoid overfitting in the model,
the feeding of training data can be stopped at an early
stage, due to which the model may not learn
enough from the training data. As a result, it may
fail to find the best fit of the dominant trend in the
data.
 In the case of underfitting, the model is not able to
learn enough from the training data, and hence it
has reduced accuracy and produces unreliable
predictions.
 An underfitted model has high bias and low
variance.
INTRODUCTION TO DEEP LEARNING
 Deep learning is the branch of
machine learning that is based on artificial neural network
architecture. An artificial neural network (ANN) uses layers of
interconnected nodes called neurons that work together to
process and learn from the input data.
 In a fully connected deep neural network, there is an input layer
and one or more hidden layers connected one after the other.
Each neuron receives input from the previous layer’s neurons or
the input layer. The output of one neuron becomes the input to
other neurons in the next layer of the network, and this process
continues until the final layer produces the output of the network.
The layers of the neural network transform the input data
through a series of nonlinear transformations, allowing the
network to learn complex representations of the input data.
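A minimal sketch of this forward pass through a fully connected network (NumPy is assumed; the layer sizes and random weights are placeholders, and ReLU is applied at every layer purely for illustration):

```python
import numpy as np

def relu(z):
    return np.maximum(0.0, z)

def forward(x, layers):
    """Each layer applies a nonlinear transformation; its output feeds the next."""
    a = x
    for W, b in layers:
        a = relu(W @ a + b)
    return a

rng = np.random.default_rng(0)
# Placeholder fully connected net: 4 inputs -> 8 hidden units -> 3 outputs
layers = [(rng.standard_normal((8, 4)), np.zeros(8)),
          (rng.standard_normal((3, 8)), np.zeros(3))]
print(forward(rng.standard_normal(4), layers))
```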
INTRODUCTION TO DEEP LEARNING
• Today deep learning has become one of the most popular and visible areas of
machine learning, due to its success in a variety of applications, such as computer
vision, natural language processing, and reinforcement learning.
• Deep learning can be used for supervised, unsupervised, as well as reinforcement
machine learning, and it uses a variety of approaches to process each.
DEEP LEARNING APPLICATIONS
 Deep learning can be used in a wide variety of applications,
including:
• Image recognition: To identify objects and features in images,
such as people, animals, places, etc.
• Natural language processing: To help understand the meaning
of text, such as in customer service chatbots and spam filters.
• Finance: To help analyze financial data and make predictions
about market trends
• Text to image: Converting text descriptions into images, as in
modern image-generation models.
TYPES OF DEEP LEARNING
 There are many different types of deep learning models.
Some of the most common types include:
 Convolutional neural networks (CNNs):
CNNs are used for image recognition and processing.
They are particularly good at identifying objects in images,
even when those objects are partially obscured or distorted.
 Deep reinforcement learning:
Deep reinforcement learning is used for robotics and
game playing. It is a type of machine learning that allows
an agent to learn how to behave in an environment by
interacting with it and receiving rewards or punishments.
TYPES OF DEEP LEARNING
 Recurrent neural networks (RNNs):
RNNs are used for natural language processing and
speech recognition. They are particularly good at
understanding the context of a sentence or phrase, and they
can be used to generate text or translate languages.
WHAT ARE THE BENEFITS OF USING DEEP LEARNING MODELS?
 There are a number of benefits to using deep learning
models, including:
• Can learn complex relationships between features in
data: This makes them more powerful than traditional
machine learning methods.
• Large dataset training: This makes them very scalable
and able to learn from a wider range of experiences,
making more accurate predictions.
• Data-driven learning: DL models can learn in a data-
driven way, requiring less human intervention to train
them, increasing efficiency and scalability. These models
can learn from data that is constantly being generated, such as
data from sensors or social media.
CHALLENGES OF USING DEEP LEARNING MODELS
 Deep learning also has a number of challenges, including:
• Data requirements: Deep learning models require large
amounts of data to learn from, making it difficult to apply
deep learning to problems where there is not a lot of data
available.
• Overfitting: DL models may be prone to overfitting. This
means that they can learn the noise in the data rather than
the underlying relationships.
• Bias: These models can potentially be biased, depending
on the data they are trained on. This can lead to unfair or
inaccurate predictions. It is important to take steps to
mitigate bias in deep learning models.
THANK YOU
