Support Vector Machines and Artificial Neural Networks
Dr. S. Veena, Associate Professor/CSE, Ramapuram Campus
SVM and ANN
Support Vector Machines
Maximum margin classifier
• Infinitely many hyperplanes can be drawn to classify the same set of data.
• To select an ideal hyperplane, the maximum margin classifier considers the hyperplane with the maximum margin of separation width.
Hyperplanes:
• In n-dimensional space, a hyperplane is a flat affine subspace of dimension n-1.
• In 2-dimensional space, the hyperplane is a straight line which separates the 2-dimensional space into two halves.
• The hyperplane is defined by the following equation:
    β0 + β1x1 + β2x2 + ... + βnxn = 0
• Points which lie on the hyperplane have to satisfy the above equation.
• However, there are regions above and below the hyperplane as well. This means observations can fall in either of these regions, also called the regions of the classes.
We already know that the projection of one vector onto another vector is called the dot product. Hence, we take the dot product of the x and w vectors. If the dot product is greater than 'c', the point lies on the right side; if the dot product is less than 'c', the point is on the left side; and if the dot product is equal to 'c', the point lies on the decision boundary (see the sketch below).
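A minimal NumPy sketch of this decision rule; the vector w, the offset c, and the sample points are illustrative assumptions:

import numpy as np

w = np.array([2.0, 1.0])   # normal vector to the decision boundary (assumed)
c = 4.0                    # offset of the boundary along w (assumed)

def side_of_boundary(x, w, c):
    """Classify a point by comparing the dot product w.x with c."""
    score = np.dot(w, x)
    if score > c:
        return "right side"
    elif score < c:
        return "left side"
    return "on the decision boundary"

for x in [np.array([3.0, 1.0]), np.array([1.0, 1.0]), np.array([1.5, 1.0])]:
    print(x, "->", side_of_boundary(x, w, c))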
• Infinitely many separating hyperplanes can be drawn to separate the two classes (blue and red).
• However, the maximum margin classifier attempts to fit the widest slab between the two classes (maximizing the margin between the positive and negative hyperplanes); the observations touching both the positive and negative hyperplanes are called support vectors.
• Classifier performance depends purely on the support vectors; changes to observations that are not support vectors do not affect the performance of the maximum margin classifier at all, as only the extreme points are considered by the algorithm (as the sketch below illustrates).
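A short scikit-learn sketch on small made-up 2-D data that exposes the fitted support vectors (the data points are illustrative assumptions):

import numpy as np
from sklearn.svm import SVC

# Two linearly separable clusters (made-up data)
X = np.array([[1, 1], [2, 1], [1, 2],      # class 0
              [4, 4], [5, 4], [4, 5]])     # class 1
y = np.array([0, 0, 0, 1, 1, 1])

clf = SVC(kernel='linear', C=1.0)
clf.fit(X, y)

# Only these extreme points define the maximum margin boundary
print("Support vectors:\n", clf.support_vectors_)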
• A high value of C (the budget for margin violations) makes the model more tolerant, leaving space for violations (errors), as in the left diagram.
• A lower value of C gives no scope for accepting violations, which leads to a reduction in margin width.
• Note that in scikit-learn's SVC, used below, the C parameter is the penalty on violations and works in the opposite direction: a larger C tolerates fewer violations.
Support vector machines
• Support vector machines are used when the decision boundary is non-linear and the data would not be separable with support vector classifiers, whatever the cost value is.
• The following diagram shows the non-linearly separable cases for both 1 dimension and 2 dimensions.
• We need another way of handling the data, called the kernel trick, which uses a kernel function to work with non-linearly separable data.
• A polynomial kernel with degree 2 has been applied to transform the data from 1-dimensional to 2-dimensional.
• In the left diagram, the different classes (red and blue) are plotted on X1 only, whereas after applying degree 2 we have 2 dimensions, X1 and X1² (the original and a new dimension).
• The degree of the polynomial kernel is a tuning parameter; the practitioner needs to try various values to check where higher accuracies are possible with the model. A sketch of the transform follows.
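A minimal NumPy sketch of the degree-2 feature map on 1-D data (the sample values are illustrative assumptions):

import numpy as np

# 1-D data that is not linearly separable: class 1 sits between class 0 points
x = np.array([-3.0, -2.0, -0.5, 0.0, 0.5, 2.0, 3.0])
y = np.array([0, 0, 1, 1, 1, 0, 0])

# Degree-2 map: x -> (x, x^2). In the (X1, X1^2) plane, a horizontal
# line such as X1^2 = 1 now separates the two classes.
X_mapped = np.column_stack([x, x ** 2])
print(X_mapped)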
• The observations have been classified successfully using a linear plane after projecting the data into higher dimensions.
Kernel functions
• Kernel functions are functions that, given the original feature vectors, return the same value as the dot product of their corresponding mapped feature vectors.
• The main reason for using kernel functions is to eliminate the computational requirement of deriving the higher-dimensional vector space from the given basic vector space, so that observations can be separated linearly in higher dimensions without computing the mapping explicitly.
• Need: the derived vector space grows exponentially with the increase in dimensions, and it becomes almost too difficult to continue computation, even with around 30 variables. The sketch below verifies the kernel identity numerically.
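A small NumPy check that a degree-2 polynomial kernel, K(x, z) = (1 + x·z)², returns the same value as the dot product of the explicitly mapped feature vectors (the points and the 2-D feature map are illustrative assumptions):

import numpy as np

def poly_kernel(x, z):
    """Degree-2 polynomial kernel: K(x, z) = (1 + x.z)^2."""
    return (1.0 + np.dot(x, z)) ** 2

def phi(v):
    """Explicit feature map for 2-D input corresponding to the kernel above."""
    x1, x2 = v
    return np.array([1.0,
                     np.sqrt(2) * x1, np.sqrt(2) * x2,
                     x1 ** 2, x2 ** 2,
                     np.sqrt(2) * x1 * x2])

x = np.array([1.0, 2.0])   # illustrative points
z = np.array([3.0, 0.5])

print(poly_kernel(x, z))          # kernel value in the original space
print(np.dot(phi(x), phi(z)))     # identical value via explicit mapping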
• Example - how the number of dimensions grows:
• When we have two variables, x and y, a polynomial kernel of degree 2 needs to compute the x², y², and xy dimensions in addition.
• Whereas, if we have three variables x, y, and z, then we need to calculate the x², y², z², xy, yz, xz, and xyz vector spaces.
• Adding one more dimension creates so many combinations; hence, care needs to be taken to reduce the computational complexity.
• Thus kernels are used, defined more formally by the following equation, where φ is the implicit feature mapping:
    K(xi, xj) = φ(xi) · φ(xj)
• Polynomial kernel:
• Polynomial kernels are popularly used, especially with degree 2.
• In fact, the inventor of support vector machines, Vladimir N. Vapnik, developed a degree-2 kernel for classifying handwritten digits.
• Polynomial kernels are given by the following equation (in its common form, where d is the degree):
    K(xi, xj) = (1 + xi · xj)^d
For reference: SVM | Support Vector Machine Algorithm in Machine Learning (analyticsvidhya.com)
• Radial Basis Function (RBF) / Gaussian kernel:
• RBF kernels are a good first choice for problems requiring non-linear models.
• A decision boundary that is a hyperplane in the mapped feature space is similar to a decision boundary that is a hypersphere in the original space.
• The feature space produced by the Gaussian kernel can have an infinite number of dimensions, a feat that would be impossible otherwise. RBF kernels are represented by the following equation, where γ > 0:
    K(xi, xj) = exp(−γ ‖xi − xj‖²)
• A large value of gamma gives a pointed, narrow bump in the higher dimensions, whereas a smaller value gives a softer, broader bump.
• Consequently, a large gamma gives low-bias, high-variance solutions, and a small gamma gives high-bias, low-variance solutions, as the sketch below shows.
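A quick numeric sketch of how gamma controls the width of the kernel bump (the distances and gamma values are illustrative assumptions):

import numpy as np

def rbf(dist, gamma):
    """RBF kernel value as a function of the distance between two points."""
    return np.exp(-gamma * dist ** 2)

for gamma in (0.1, 10.0):          # small vs large gamma
    values = [round(rbf(d, gamma), 4) for d in (0.0, 0.5, 1.0, 2.0)]
    print(f"gamma={gamma}: {values}")
# Small gamma: values decay slowly with distance (broad bump).
# Large gamma: values drop to ~0 almost immediately (pointed bump).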
>>> import os
>>> # First change the following directory to where the input files exist
>>> os.chdir("D:\\Book writing\\Codes\\Chapter 6")
>>> import pandas as pd
>>> letterdata = pd.read_csv("letterdata.csv")
>>> print (letterdata.head())
The following code removes the target variable from the x variables and, at the same time, creates a new y variable for convenience:
>>> x_vars = letterdata.drop(['letter'], axis=1)
>>> y_var = letterdata["letter"]
The data is first split 70-30 into train and test (following the split used elsewhere in this chapter), and a linear SVC is fitted with a cost value of C=1.0:
# Linear Classifier
>>> from sklearn.model_selection import train_test_split
>>> from sklearn.svm import SVC
>>> from sklearn.metrics import accuracy_score, classification_report
>>> x_train, x_test, y_train, y_test = train_test_split(x_vars, y_var, train_size=0.7, random_state=42)
>>> svm_fit = SVC(kernel='linear', C=1.0)
>>> svm_fit.fit(x_train, y_train)
>>> print ("\nSVM Linear Classifier - Train Confusion Matrix\n\n", pd.crosstab(y_train, svm_fit.predict(x_train), rownames=["Actual"], colnames=["Predicted"]))
Maximum margin classifier - linear kernel
From the above results, we can see that the test accuracy for the linear classifier is 85%.
Polynomial kernel
• A polynomial kernel with degree 2 has been used to check whether any improvement in accuracy is possible.
• The cost value has been kept the same as for the linear classifier in order to determine the impact of the non-linear kernel:
#Polynomial Kernel
>>> svm_poly_fit = SVC(kernel='poly', C=1.0, degree=2)
>>> svm_poly_fit.fit(x_train, y_train)
>>> print ("\nSVM Polynomial Kernel Classifier - Train Confusion Matrix\n\n", pd.crosstab(y_train, svm_poly_fit.predict(x_train), rownames=["Actual"], colnames=["Predicted"]))
>>> print ("\nSVM Polynomial Kernel Classifier - Train accuracy:", round(accuracy_score(y_train, svm_poly_fit.predict(x_train)), 3))
>>> print ("\nSVM Polynomial Kernel Classifier - Train Classification Report\n", classification_report(y_train, svm_poly_fit.predict(x_train)))
>>> print ("\n\nSVM Polynomial Kernel Classifier - Test Confusion Matrix\n\n", pd.crosstab(y_test, svm_poly_fit.predict(x_test), rownames=["Actual"], colnames=["Predicted"]))
>>> print ("\nSVM Polynomial Kernel Classifier - Test accuracy:", round(accuracy_score(y_test, svm_poly_fit.predict(x_test)), 3))
>>> print ("\nSVM Polynomial Kernel Classifier - Test Classification Report\n", classification_report(y_test, svm_poly_fit.predict(x_test)))
RBF kernel
The cost value is kept the same as for the other kernels, but the gamma value has been chosen as 0.1 to fit the model:
#RBF Kernel
>>> svm_rbf_fit = SVC(kernel='rbf', C=1.0, gamma=0.1)
>>> svm_rbf_fit.fit(x_train, y_train)
>>> print ("\nSVM RBF Kernel Classifier - Train Confusion Matrix\n\n", pd.crosstab(y_train, svm_rbf_fit.predict(x_train), rownames=["Actual"], colnames=["Predicted"]))
>>> print ("\nSVM RBF Kernel Classifier - Train accuracy:", round(accuracy_score(y_train, svm_rbf_fit.predict(x_train)), 3))
>>> print ("\nSVM RBF Kernel Classifier - Train Classification Report\n", classification_report(y_train, svm_rbf_fit.predict(x_train)))
Artificial Neural Networks - ANN
• Artificial neural networks (ANNs) model the relationship between a set of input signals and output signals using a model derived from a replica of the biological brain, which responds to stimuli from its sensory inputs.
• The human brain consists of about 90 billion neurons, with around 1 trillion connections between them.
• ANN methods try to model problems using interconnected artificial neurons (or nodes) to solve machine learning problems.
• In the human brain:
– Incoming signals are received by the cell's dendrites through a biochemical process that allows the impulses to be weighted according to their relative importance.
– As the cell body accumulates the incoming signals, a threshold is reached at which the cell fires, and the output signal is then transmitted via an electrochemical process down the axon.
– At the axon terminal, the electrical signal is again processed as a chemical signal to be passed to the neighboring neurons' dendrites.
• A similar working principle is loosely used in building an artificial neural network, in which each neuron has a set of inputs, each of which is given a specific weight.
• The neuron computes a function on these weighted inputs.
• A linear neuron takes a linear combination of the weighted inputs and applies an activation function (sigmoid, tanh, ReLU, and so on) to the aggregated sum.
• The network feeds the weighted sum of the inputs into the logistic function (in the case of the sigmoid activation).
• The logistic function returns a value between 0 and 1 based on a set threshold; for example, here we set the threshold as 0.7.
• Any accumulated signal greater than 0.7 gives a signal of 1, and any accumulated signal less than 0.7 returns the value 0.
• A typical artificial neuron with n input dendrites can be represented by the following formula:
    y(x) = f( Σ(i=1..n) wi · xi )
• The w weights allow each of the n inputs of x to contribute a greater or lesser amount to the sum of input signals.
• The accumulated value is passed to the activation function, f(x), and the resulting signal, y(x), is the output axon. A minimal sketch of such a neuron follows.
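A minimal NumPy sketch of the thresholded sigmoid neuron described above (the weights, the inputs, and the 0.7 threshold follow the description; the specific numbers are illustrative assumptions):

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def neuron(x, w, threshold=0.7):
    """Weighted sum -> sigmoid activation -> hard threshold at 0.7."""
    activation = sigmoid(np.dot(w, x))
    return 1 if activation > threshold else 0

w = np.array([0.8, 0.4, 0.3])   # illustrative weights
x = np.array([1.0, 0.5, 1.0])   # illustrative inputs

print(neuron(x, w))  # weighted sum = 1.3, sigmoid ~ 0.786 > 0.7 -> outputs 1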
The parameters to choose when building neural networks are the following:
• Activation function: Choosing an activation function plays a major role in aggregating signals into the output signal to be propagated to the other neurons of the network.
• Network architecture or topology: This represents the number of layers required and the number of neurons in each layer. More layers and neurons will create a highly non-linear decision boundary, whereas reducing the architecture makes the model less flexible and more robust.
• Training optimization algorithm: The selection of an optimization algorithm also plays a critical role in converging quickly and accurately to the optimal solution.
• Applications of neural networks:
– Images and videos: To identify an object in an image, or to classify whether it is a dog or a cat
– Text processing (NLP): Deep-learning-based chatbots and so on
– Speech: Speech recognition
– Structured data processing: Building highly powerful models to obtain a non-linear decision boundary
Activation functions
• Activation functions are the mechanisms by which an artificial neuron processes information and passes it throughout the network.
• The activation function takes a single number and performs a fixed mathematical mapping on it.
• The different types of activation functions are:
• Sigmoid function: Sigmoid has the mathematical form σ(x) = 1 / (1 + e^(−x)). It takes a real-valued number and squashes it into the range between 0 and 1. Sigmoid is a popular choice because it makes calculating derivatives easy and is easy to interpret.
• Tanh function: Tanh squashes a real-valued number into the range [-1, 1]. Its output is zero-centered. In practice, tanh non-linearity is always preferred to sigmoid non-linearity. Also, it can be shown that tanh is a scaled sigmoid: tanh(x) = 2σ(2x) − 1.
• Rectified Linear Unit (ReLU) function: ReLU has become very popular in the last few years. It computes the function f(x) = max(0, x): activation is simply thresholded at zero.
• Linear function: The linear activation function is used in linear regression problems, where it always provides a derivative of 1, because the function used is f(x) = x.
• ReLU is now popularly used in place of sigmoid or tanh due to its better convergence property. A sketch of these functions follows.
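Minimal NumPy implementations of the four activation functions described above:

import numpy as np

def sigmoid(x):
    """Squashes input into (0, 1)."""
    return 1.0 / (1.0 + np.exp(-x))

def tanh(x):
    """Zero-centered squashing into (-1, 1); tanh(x) = 2*sigmoid(2x) - 1."""
    return np.tanh(x)

def relu(x):
    """Thresholds activation at zero: f(x) = max(0, x)."""
    return np.maximum(0, x)

def linear(x):
    """Identity activation, f(x) = x, with derivative 1 everywhere."""
    return x

x = np.array([-2.0, -0.5, 0.0, 0.5, 2.0])
for f in (sigmoid, tanh, relu, linear):
    print(f.__name__, np.round(f(x), 3))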
Forward propagation and backpropagation
• Forward propagation and backpropagation are illustrated with a two-hidden-layer deep neural network in the following example, in which both hidden layers have three neurons each, in addition to the input and output layers.
• The number of neurons in the input layer is based on the number of x (independent) variables, whereas the number of neurons in the output layer is decided by the number of classes the model needs to predict.
• For ease, we have shown only one neuron in each layer.
• Weights and biases are initialized with random numbers, so that in both the forward and backward passes they can be updated in order to minimize the error.
• During forward propagation, features are input to the network and fed through the layers to produce the output activation.
• In hidden layer 1, the activation obtained is the combination of bias weight 1 and the weighted combination of input values; if the overall value crosses the threshold, it triggers the next layer, otherwise the signal to the next layer is 0. Bias values are necessary to control the trigger points.
• In some cases the weighted combination of signals is low; in those cases, the bias compensates by the extra amount needed to push the aggregated value past the trigger point for the next level.
• Once all the neurons in hidden layer 1 are calculated (the Hidden1, Hidden2, and Hidden3 neurons), the neurons of the next layer are calculated in a similar way from the outputs of the first layer's hidden neurons, with the addition of a bias (bias weight 4).
• The following figure describes hidden neuron 4, in layer 2:
• In the last layer (also known as the output layer), outputs are calculated in the same way, by taking the weighted combination of the weights and the outputs obtained from hidden layer 2.
• Once we obtain the output from the model, it is compared with the actual value, and the errors are backpropagated across the network in order to correct the weights of the entire neural network:
• In the following diagram, we take the derivative of the output value and multiply the error component by it, where the error component is obtained by differencing the actual value and the model output:
• In a similar way, we backpropagate the error through the second hidden layer as well.
• In the following diagram, errors are computed for the Hidden 4 neuron in the second hidden layer:
• In the following diagram, errors are calculated for the Hidden 1 neuron in layer 1, based on the errors obtained from all the neurons in layer 2:
• Once all the neurons in hidden layer 1 are updated, the weights between the inputs and the hidden layer also need to be updated.
• In the following diagram, we update both the input weights and, at the same time, the neurons in hidden layer 1, as the layer-1 neurons utilize the weights from the inputs only:
• Finally, in the following figure, the layer 2 neurons are updated in the forward propagation pass. A compact sketch of one forward and one backward pass follows.
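A compact NumPy sketch of one forward and one backward pass through a small network with two hidden layers (three neurons each, as in the example) and sigmoid activations; the data, the learning rate, and the squared-error loss are illustrative assumptions:

import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Architecture: 2 inputs -> 3 hidden -> 3 hidden -> 1 output
W1, b1 = rng.normal(size=(3, 2)), np.zeros(3)   # input  -> hidden layer 1
W2, b2 = rng.normal(size=(3, 3)), np.zeros(3)   # hidden -> hidden layer 2
W3, b3 = rng.normal(size=(1, 3)), np.zeros(1)   # hidden -> output

x = np.array([0.5, -1.0])   # illustrative input
y = np.array([1.0])         # illustrative target
lr = 0.1                    # learning rate

# ---- Forward propagation ----
h1 = sigmoid(W1 @ x + b1)    # hidden layer 1 activations
h2 = sigmoid(W2 @ h1 + b2)   # hidden layer 2 activations
out = sigmoid(W3 @ h2 + b3)  # output activation

# ---- Backpropagation (squared-error loss) ----
d_out = (out - y) * out * (1 - out)        # output delta
d_h2 = (W3.T @ d_out) * h2 * (1 - h2)      # backpropagate through W3
d_h1 = (W2.T @ d_h2) * h1 * (1 - h1)       # backpropagate through W2

# ---- Weight updates (one gradient descent step) ----
W3 -= lr * np.outer(d_out, h2); b3 -= lr * d_out
W2 -= lr * np.outer(d_h2, h1);  b2 -= lr * d_h2
W1 -= lr * np.outer(d_h1, x);   b1 -= lr * d_h1

print("output before update:", out)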
Optimization of neural networks
• Various techniques have been used for optimizing the weights of
neural networks:
– Stochastic gradient descent (SGD)
– Momentum
– Nesterov accelerated gradient (NAG)
– Adaptive gradient (Adagrad)
– Adadelta
– RMSprop
– Adaptive moment estimation (Adam)
– Limited memory Broyden-Fletcher-Goldfarb-Shanno (L-BFGS)
Optimization of neural networks - Stochastic gradient descent (SGD)
• Gradient descent is a way to minimize an objective function J(θ), parameterized by the model's parameters θ ∈ R^d, by updating the parameters in the direction opposite to the gradient of the objective function with respect to the parameters:
    θ = θ − η · ∇θ J(θ)
• The learning rate η determines the size of the steps taken to reach the minimum.
– Batch gradient descent (all training observations used in each iteration)
– SGD (one observation per iteration)
– Mini-batch gradient descent (about 50 training observations for each iteration)
The sketch below contrasts the three variants.
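A minimal NumPy sketch contrasting the three variants on a simple least-squares objective; the synthetic data, the learning rate, and the epoch count are illustrative assumptions:

import numpy as np

rng = np.random.default_rng(42)

# Synthetic linear-regression data: y = 3*x + noise
X = rng.normal(size=(200, 1))
y = 3.0 * X[:, 0] + 0.1 * rng.normal(size=200)

def gradient(theta, Xb, yb):
    """Gradient of mean squared error for a linear model."""
    return 2.0 * Xb.T @ (Xb @ theta - yb) / len(yb)

def train(batch_size, lr=0.1, epochs=20):
    theta = np.zeros(1)
    for _ in range(epochs):
        idx = rng.permutation(len(X))
        for start in range(0, len(X), batch_size):
            batch = idx[start:start + batch_size]
            theta -= lr * gradient(theta, X[batch], y[batch])
    return theta

print("batch GD     :", train(batch_size=len(X)))  # all observations per update
print("SGD          :", train(batch_size=1))       # one observation per update
print("mini-batch GD:", train(batch_size=50))      # ~50 observations per update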
● The following image compares, in a 2-D projection, the convergence characteristics of full-batch gradient descent and stochastic gradient descent with batch size 1.
● Full-batch updates are smoother due to the consideration of all the observations in each update.
● SGD, by contrast, has wiggly convergence characteristics, because each update uses only one observation:
Introduction to deep learning
Deep learning is a class of machine learning algorithms which utilizes neural networks to build models that solve both supervised and unsupervised problems on structured and unstructured datasets such as images, videos, text (NLP), and voice.
• A deep neural network (deep architecture) consists of multiple hidden layers of units between the input and output layers.
• Each layer is fully connected with the subsequent layer.
• The output of each artificial neuron in a layer is an input to every artificial neuron in the next layer, towards the output:
As more hidden layers are added to the neural network, more complex decision boundaries are created to classify different categories.
An example of a complex decision boundary can be seen in the following graph:
Solving methodology
• Backpropagation is used to train deep networks by calculating the error of the network at the output units and propagating it back through the layers, updating the weights to reduce the error terms.
• Thumb rules in designing deep neural networks:
– All hidden layers should have the same number of neurons per layer
– Typically, two hidden layers are good enough to solve the majority of problems
– Using scaling/batch normalization (mean 0, variance 1) for all input variables after each layer improves convergence effectiveness
– Reducing the step size after each iteration improves convergence, in addition to the use of momentum and dropout
Deep learning software
• Deep learning software has evolved multi-fold in recent times.
• Different types of deep learning software are:
– Theano: Python-based deep learning library developed by the University of Montreal
– TensorFlow: Google's deep learning library, which runs on top of Python/C++
– Keras / Lasagne: Lightweight wrappers which sit on top of Theano/TensorFlow and enable faster model prototyping
– Torch: Lua-based deep learning library with wide support for machine learning algorithms
– Caffe: Deep learning library primarily used for processing images
TensorFlow has recently been picking up momentum in the deep learning community, as it is backed by Google and has good visualization capabilities using TensorBoard.
Deep learning software - Deep neural network classifier applied on handwritten digits using Keras
We use the same data that we previously trained the model on using scikit-learn, in order to perform an apples-to-apples comparison between scikit-learn and the deep learning software Keras.
Data loading steps
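The loading code itself is not shown; a plausible setup, assuming the standard scikit-learn and plotting utilities used by the rest of this example, is:

>>> import numpy as np
>>> import pandas as pd
>>> import matplotlib.pyplot as plt
>>> from sklearn.datasets import load_digits
>>> from sklearn.model_selection import train_test_split
>>> from sklearn.preprocessing import StandardScaler
>>> from sklearn.metrics import accuracy_score, classification_report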
Keras Library modules
>>> from keras.models import Sequential
>>> from keras.layers.core import Dense, Dropout, Activation
>>> from keras.optimizers import Adadelta,Adam,RMSprop
>>> from keras.utils import np_utils
The following code loads the digits data from the scikit-learn datasets, with a quick piece of code to check the shape of the data. As the data is already embedded in NumPy arrays, we do not need to change it into any other format, because deep learning models get trained on NumPy arrays:
>>> digits = load_digits()
>>> X = digits.data
>>> y = digits.target
>>> print (X.shape)
>>> print (y.shape)
>>> print ("\nPrinting first digit")
>>> plt.matshow(digits.images[0])
>>> plt.show()
The previous code prints the first digit in matrix form; the plotted digit looks like a 0:
We standardize the data with the following code, demeaning each series and then dividing by its standard deviation, to put all 64 dimensions on a similar scale:
>>> x_vars_stdscle = StandardScaler().fit_transform(X)
The following section of the code splits the data into train and test based on a 70-30 split:
>>> x_train, x_test, y_train, y_test = train_test_split(x_vars_stdscle, y, train_size=0.7, random_state=42)
We use nb_classes as 10, because the digits range from 0-9; batch_size as 128, which means we utilize 128 observations per batch to update the weights; and nb_epochs as 200, which means the model is trained for 200 epochs (see the variable definitions below).
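The corresponding variable definitions, reconstructed from the prose above:

>>> nb_classes = 10    # digits 0-9
>>> batch_size = 128   # observations per weight update
>>> nb_epochs = 200    # full passes over the training data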
The following code creates the n-dimensional vector for the multiclass values based on the nb_classes value. Here, we get dimension 10 for all the train observations, for training with the softmax classifier:
>>> Y_train = np_utils.to_categorical(y_train, nb_classes)
The core model-building code, which stacks like Lego blocks, is shown as follows. Here, we initiate the model as sequential rather than parallel and so on:
#Deep Layer Model building in Keras
>>> model = Sequential()
In the first layer, we use 100 neurons, with the input shape as 64 columns (as the number of columns in X is 64), followed by a relu activation function with a dropout value of 0.5:
>>> model.add(Dense(100,input_shape= (64,)))
>>> model.add(Activation('relu'))
>>> model.add(Dropout(0.5))
In the second layer, we are using 50 neurons (to compare the results obtained using the
scikit-learn methodology, we have used a similar architecture):
>>> model.add(Dense(50))
>>> model.add(Activation('relu'))
>>> model.add(Dropout(0.5))
In the output layer, the number of classes needs to be used with the softmax classifier:
>>> model.add(Dense(nb_classes))
>>> model.add(Activation('softmax'))
Here, we compile with categorical_crossentropy, as the output is multiclass; whereas, if we wanted to use binary classes, we would need to use binary_crossentropy instead:
>>> model.compile(loss='categorical_crossentropy', optimizer='adam')
The model is trained in the following step with the given batch size and number of epochs:
#Model training
>>> model.fit(x_train, Y_train, batch_size=batch_size, nb_epoch=nb_epochs, verbose=1)
#Model Prediction
>>> y_train_predclass = model.predict_classes(x_train, batch_size=batch_size)
>>> y_test_predclass = model.predict_classes(x_test, batch_size=batch_size)
>>> print ("\n\nDeep Neural Network - Train accuracy:", round(accuracy_score(y_train, y_train_predclass), 3))
>>> print ("\nDeep Neural Network - Train Classification Report")
>>> print (classification_report(y_train, y_train_predclass))
>>> print ("\nDeep Neural Network - Train Confusion Matrix\n")
>>> print (pd.crosstab(y_train, y_train_predclass, rownames=["Actual"], colnames=["Predicted"]))
Testing
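The test-side evaluation plausibly mirrors the training code above (a sketch, assuming the same variables):

>>> print ("\nDeep Neural Network - Test accuracy:", round(accuracy_score(y_test, y_test_predclass), 3))
>>> print ("\nDeep Neural Network - Test Classification Report")
>>> print (classification_report(y_test, y_test_predclass))
>>> print ("\nDeep Neural Network - Test Confusion Matrix\n")
>>> print (pd.crosstab(y_test, y_test_predclass, rownames=["Actual"], colnames=["Predicted"]))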