Artificial Intelligence Coursework Report


Data Classification using Neural Networks
55-700241 Applicable Artificial Intelligence

BY
AMAL THOMAS JOSE
30046158

Table of Contents
1. INTRODUCTION
2. REQUIREMENT ANALYSIS
3. DESIGN CONSIDERATIONS
4. IMPLEMENTATION AND TESTING
4.1 IMPLEMENTATION
(a) Data Analysis
(b) Data Standardization and Normalization
4.2 TESTING
5. EVALUATION/CONCLUSION
6. REFERENCES
7. APPENDIX


1. INTRODUCTION

The objective of this coursework is to design, implement and evaluate a neural
network for data classification and to report on the results. The neural network
is developed using MATLAB.
Neural networks are models built from interconnected nodes (neurons) arranged in
layers, which are categorised as input, hidden and output layers. The neurons are
loosely modelled on those of the human brain, which enables the network to learn
classification decisions. They are also called artificial neural networks and are
widely used in deep learning. Neural networks serve two main purposes:
regression analysis and data classification.
In this coursework a subset of the Heart Disease (Cleveland) data set is provided.
The data has the following properties:
• The data consists of a cleaned-up subset of 13 features from a full set of 75.
• It contains multiple classes (0: no heart disease, 1: mild heart disease, 2:
severe heart disease).
• Most experiments in the literature focus on detecting presence (1 or 2)
from absence (0).
• The current state of the art is around 90% accuracy.
The task is to design a neural network that achieves a cross-validated
classification rate as close as possible to the current state of the art.

2. REQUIREMENT ANALYSIS

The task is to classify data using a neural network in order to detect heart
disease from a cleaned-up data set. The backpropagation algorithm makes it possible
to achieve the required results. Software such as MATLAB or Python is generally
used to solve neural network problems.
MATLAB version R2021a is used in this coursework to design, train and simulate the
neural network, with a target accuracy of 90 percent. The system is designed to be
reliable, robust and accurate when processing the given dataset.
Input parameters and the corresponding target parameters are used to train a network
until it can approximate a function. Networks with biases, a sigmoid layer, and a
linear output layer can approximate any function with a finite number of
discontinuities. Standard backpropagation is a gradient descent algorithm, in which
the network weights are moved along the negative of the gradient of the performance
function.
The aim of this coursework is to design and implement a system that can
distinguish the classes (0, 1 and 2) from a set of 13 known input features.


3. DESIGN CONSIDERATIONS

The neural network is designed using MATLAB. To build an efficient and accurate
neural network, three main steps are followed:
• Design
• Training
• Simulation
The first step in designing a neural network is to determine whether the given
input data is linearly separable, because this decides whether to use the
“perceptron” or the “feedforwardnet” command. If the data is linearly separable,
“perceptron” is used; if not, “feedforwardnet” is used. The check is done by
plotting the inputs and seeing whether a straight line can separate the classes.
The second design consideration is to identify how many hidden layers and nodes
are needed to obtain an accurate network for the given dataset.
The third consideration is to decide which transfer function (tansig, logsig, purelin),
data division function (dividetrain, dividerand, divideblock etc.) and training
function (traingd, trainlm etc.) to use to obtain a robust network.

4. IMPLEMENTATION AND TESTING

4.1 IMPLEMENTATION

Neural networks are data-driven, i.e. their behaviour depends on the data given.
The input dataset (x) is an array of size 297 x 13 and the response or target (t)
is an array of size 297 x 1. The aim is to build and train a neural network whose
output is as close as possible to the target. As mentioned in the design
considerations, the first step is to design the network.

4.1.1. Design:
(a) Data Analysis:
Using the “find” command in MATLAB, the samples belonging to class 0, class 1 and
class 2 can be located and plotted on one graph, where:
Class 0: No heart disease.
Class 1: Mild heart disease.
Class 2: Severe heart disease.
By plotting each class with the MATLAB commands “figure” and “plot”, we can
distinguish classes 0, 1 and 2 and determine the nature of the data, i.e. whether
it is linearly separable or not.


Fig 1: Data analysis of classes 0, 1 & 2.


It is clear from the figure that the data are not linearly separable, so the
“perceptron” command cannot be used; the “feedforwardnet” command is used instead.
(b) Data Standardization and Normalization:
By default, the MATLAB function normalize(x) standardizes the data, i.e. it
rescales each column to have mean 0 and standard deviation 1. Applying
xs = normalize(x) gives the following figure.


Fig 2: Standardized Dataset.

Normalization is used to scale the data between 0 and 1.
The command “help normalize” in MATLAB describes the function as follows:
N = normalize(A) normalizes data in A using the 'zscore' method, which centres the
data to have mean 0 and scales it to have standard deviation 1. NaN values are
ignored. If A is a matrix or a table, normalize operates on each column separately. If
A is an N-D array, normalize operates along the first array dimension whose size does
not equal 1.
Method 'range' - normalizes by rescaling the range of the data to the interval [0,1].
Applying the 'range' method to the dataset gives the normalized data:
xn = normalize(x,'range');
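The two scalings can be written out by hand to make clear what normalize does. The following is a minimal sketch on a small made-up matrix (the values and the variable names A, As_manual and An_manual are illustrative assumptions, not part of the coursework data); it compares the manual formulas with the built-in function.

```matlab
% Toy feature matrix: 4 samples (rows) x 2 features (columns); made-up values.
A = [1 10; 2 20; 3 30; 4 45];

% Standardization (z-score): subtract the column mean and divide by the
% column standard deviation (std uses the N-1 normalization by default).
As_manual = (A - mean(A)) ./ std(A);

% Range normalization: map each column linearly onto the interval [0, 1].
An_manual = (A - min(A)) ./ (max(A) - min(A));

% Built-in equivalents, as used in this report.
As = normalize(A);            % default method is 'zscore'
An = normalize(A, 'range');   % rescale each column to [0, 1]

% Largest absolute difference between manual and built-in results.
errStd   = max(abs(As(:) - As_manual(:)));
errRange = max(abs(An(:) - An_manual(:)));
fprintf('zscore diff: %g, range diff: %g\n', errStd, errRange);
```

Both differences should be at the level of floating-point rounding, which indicates that the default method is the z-score and that 'range' is a min-max rescaling applied column by column.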

Fig 3: Normalized data.

The only way to find out which dataset is better is to build the neural network
with each of them. In this coursework, to understand how a neural network depends
on its data, three case studies are conducted, with raw data, normalized data and
standardized data, to see which yields the best accuracy.

4.1.2. Training:

Step 1: The MATLAB command “feedforwardnet” is used to create the network and
define its neurons, i.e. the nodes of the neural network.
Step 2: Train the network using the command mynet = train(mynet,inputs,target);
Here, we can change the activation function, the divide function and the training
function to improve the quality of the neural network.
In this coursework, all three datasets are trained using the functions shown
in the table below.
Input Data   Activation Function   Training Function   Divide Function
x            logsig                traingd             dividetrain
xs           logsig                traingd             dividetrain
xn           logsig                traingd             dividetrain
Table 1: Algorithm Table
The functions used are discussed in detail below:
Activation Function
In artificial neural networks, the activation function of a node defines the
output of that node given an input or set of inputs [6]. In all three cases in
this coursework, logsig (the sigmoid curve) is used as the activation function.
The MATLAB function “logsig” is a transfer function; it calculates a layer’s
output from its net input.
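As a small illustration of what the transfer function computes, the sketch below evaluates the logistic sigmoid 1/(1 + exp(-n)) by hand and compares it with logsig from the Deep Learning Toolbox. The net-input values in n are made up for the example.

```matlab
n = [-5 -1 0 1 5];                 % example net inputs (made-up values)

% Logistic sigmoid written out explicitly.
a_manual = 1 ./ (1 + exp(-n));

% Toolbox transfer function used in this coursework.
a = logsig(n);

% The outputs always lie strictly between 0 and 1, and logsig(0) is 0.5.
maxDiff = max(abs(a - a_manual));
fprintf('max difference: %g\n', maxDiff);
```

Because the output of a logsig layer is confined to the interval (0, 1), class labels of 0, 1 and 2 can only be produced if the final layer of the network uses a different transfer function, such as the linear one.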


Fig 4: Sigmoid curve (logsig).

Training Gradient Descent


Gradient descent is an optimization algorithm for finding a local minimum of
a differentiable function [2]. The gradient measures how the error changes
with respect to a change in each weight.
For gradient descent to reach the local minimum, the learning rate must be set
to an appropriate value, neither too low nor too high. If the learning rate is
too large, gradient descent may never reach the local minimum because it
bounces back and forth across the convex cost function. If the learning rate is
very small, gradient descent will eventually reach the local minimum, but that
may take a long time [2]. The learning rate therefore has to be chosen
carefully, because it changes the outcome of training.
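The effect of the learning rate can be seen on a one-dimensional example. The sketch below is an illustration only: it minimizes the assumed cost function f(w) = w^2 (minimum at w = 0) rather than a neural network error, and the three learning rates and the starting point are arbitrary choices.

```matlab
gradF  = @(w) 2*w;            % gradient of the assumed cost f(w) = w^2
rates  = [0.01 0.1 1.1];      % too small, reasonable, too large (illustrative)
nSteps = 50;                  % number of gradient descent updates
wFinal = zeros(size(rates));

for j = 1:numel(rates)
    w = 5;                    % arbitrary starting point
    for k = 1:nSteps
        w = w - rates(j) * gradF(w);   % gradient descent update rule
    end
    wFinal(j) = w;
    fprintf('lr = %.2f -> w after %d steps = %g\n', rates(j), nSteps, w);
end
```

With the smallest rate the weight is still far from 0 after 50 steps, with the middle rate it is very close to 0, and with the largest rate each step overshoots the minimum by more than the previous one, so the weight grows instead of shrinking.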


Fig 5: Cost function vs epochs graph.


A good way to make sure gradient descent runs properly is to plot the cost
function against the epochs (iterations). The x-axis represents the number of
iterations and the y-axis the value of the cost function. If gradient descent
is working properly, the cost function decreases after each iteration.
Stopping criteria in the training algorithm indicate automatically whether
gradient descent has converged. In this coursework all three cases use batch
gradient descent, “traingd”, as the training algorithm.
The function “traingd” is a network training function that updates weight and
bias values according to gradient descent [1].


Divide Training Function


In this coursework, the function “dividetrain” is used for all three cases. It
assigns all targets to the training set and none to the validation or
test sets [1].
With these settings, each dataset is trained using the command:
mynet = train(mynet, inputs, target);
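The settings from Table 1 can be applied to a network before training as sketched below. This is a minimal configuration sketch: the hidden layer sizes are the ones used later in Section 4.2, while the learning rate is an assumed value that the report does not state. Note that trainParam is set after trainFcn, because changing the training function resets its parameters to their defaults.

```matlab
hiddenSizes = [120 40 60 80];          % hidden layer sizes used in Section 4.2
mynet = feedforwardnet(hiddenSizes);

% Apply logsig to every hidden layer; the output layer keeps its default
% linear transfer function (purelin).
for k = 1:numel(hiddenSizes)
    mynet.layers{k}.transferFcn = 'logsig';
end

mynet.divideFcn = 'dividetrain';       % all samples go to the training set
mynet.trainFcn  = 'traingd';           % batch gradient descent
mynet.trainParam.epochs = 5000;        % iteration limit used in Section 4.2
mynet.trainParam.lr     = 0.01;        % learning rate (assumed value)
```

After this configuration the call mynet = train(mynet, inputs, target); trains the network with the chosen functions.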

4.1.3 Simulation:

Once the neural network has been designed and trained with the chosen functions,
the next step is to simulate the network to see how well it learned the
data during training. The MATLAB command “sim” is used to simulate
the neural network.
E.g.: results = sim(mynet, inputs);
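Simulating on the training inputs measures how well the network fits data it has already seen. Since the introduction asks for a cross-validated classification rate, one possible way to estimate it is k-fold cross-validation. The sketch below is an illustration under stated assumptions, not the procedure used in this coursework: it uses synthetic stand-in data instead of the Cleveland set, an assumed K = 5, and a small single-hidden-layer network with few epochs to keep it fast.

```matlab
% Synthetic stand-in data (NOT the Cleveland set): 13 features x 90 samples.
rng(0);
x = rand(13, 90);
t = randi([0 2], 1, 90);

K = 5;                                  % number of folds (assumed)
N = size(x, 2);
foldId = mod(randperm(N), K) + 1;       % random fold label 1..K per sample
acc = zeros(1, K);

for k = 1:K
    testMask = (foldId == k);           % samples held out in this fold

    net = feedforwardnet(10);           % small network to keep the sketch fast
    net.divideFcn = 'dividetrain';
    net.trainFcn  = 'traingd';
    net.trainParam.epochs = 200;
    net.trainParam.showWindow = false;  % suppress the training window
    net = train(net, x(:, ~testMask), t(~testMask));

    y = sim(net, x(:, testMask));       % outputs for the held-out samples
    pred = min(max(round(y), 0), 2);    % nearest class, clamped to 0..2
    acc(k) = mean(pred == t(testMask)); % accuracy on the held-out fold
end
fprintf('Mean %d-fold accuracy: %.3f\n', K, mean(acc));
```

Each sample is used for testing exactly once, so the mean of acc estimates the accuracy on unseen data. On the random labels used here it is expected to be close to chance level; the point of the sketch is the splitting procedure, not the number.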

4.2 TESTING
We use three forms of the data (raw, standardized and normalized) to train the
neural network, and the results of each are recorded to identify which
form gives the most accurate network.
(a) Case 1: Design, train and simulate a neural network using raw data.

In this case, the raw dataset is used for training and testing, and the results
are recorded to measure the accuracy of the network obtained with this dataset.

Step 1: Design

The “feedforwardnet” command is used to build the nodes of the neural network.
We use the code “mynet = feedforwardnet([120 40 60 80]);”, which gives a neural
network with four hidden layers, as shown in the figure.


Fig 6: Feed-forward Neural Network of Raw dataset.

Step 2: Train

The network is trained using the code “mynet = train(mynet, x, t);”

where mynet = the neural network,
x = raw input data,
and t = target.

We also change the divideFcn, transferFcn and trainFcn as discussed in Table
1. The MATLAB code is in Appendix (1).

After changing the functions, the network should be reinitialised using the
command

“mynet = init(mynet);”

otherwise the weights will not be reset.


Fig 7: Training Parameters.


Fig 8: Performance Curve


Fig 9: Training State curve

Fig 10: Regression Curve


Step 3: Simulation

Once the neural network is trained, we simulate the network using the code
“results = sim(mynet, x);”
To check the accuracy of the network, we create a truth table of predicted
class against true class. The network outputs are first mapped to class labels
with a for loop. The logic used is shown below and in Appendix (1).
results = sim(mynet,x);
for i = 1:numel(t)
    if results(i) >= 1.5
        results(i) = 2;
    elseif results(i) >= 0.5
        results(i) = 1;
    else
        results(i) = 0;
    end
end

The MATLAB function “confusionchart” is used to form this truth table. Using the
code confusionchart(t, results); (true classes first, predicted classes second)
we get the following table, from which the accuracy can be calculated.
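The mapping from raw outputs to class labels and the accuracy calculation can also be written without a loop. The sketch below uses made-up outputs and targets (y and t here are illustrative, not results from the trained network); it assigns each output to the nearest class label, with 0.5 and 1.5 as the cut-offs, and clamps the result to the valid range 0 to 2.

```matlab
y = [-0.2 0.4 0.5 1.2 1.5 2.7];      % made-up raw network outputs
t = [ 0   1   1   1   2   2 ];       % made-up true classes

% round() sends values in [0.5, 1.5) to 1 and values in [1.5, 2.5) to 2;
% max/min clamp anything below 0 or above 2 to the nearest valid class.
pred = min(max(round(y), 0), 2);

% Accuracy is the fraction of samples whose predicted class equals the
% true class, i.e. the sum of the confusion chart diagonal over the total.
accuracy = mean(pred == t);
fprintf('accuracy = %.3f\n', accuracy);
```

In this toy example five of the six samples are classified correctly (the second output, 0.4, is mapped to class 0 while its true class is 1).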


Fig 11: Truth Table

The values in the blue squares along the diagonal of the confusion chart
are the correctly classified samples, hence

accuracy = (151 + 74 + 30) / 297 = 0.858 ≅ 85.8 %

(b) Case 2: Design, train and simulate a neural network using standardized data.

In this case, the standardized dataset is used for training and testing, and the
results are recorded to measure the accuracy of the network obtained with this
dataset.

Step 1: Design

The “feedforwardnet” command is used to build the nodes of the neural network.
We use the code “mynet = feedforwardnet([120 40 60 80]);”, which gives a neural
network with four hidden layers, as shown in the figure.

Fig 12: Feed-forward Neural Network of Standardized dataset.

Step 2: Train

The network is trained using the code “mynet = train(mynet, xs, t);”

where mynet = the neural network,
xs = standardized input data,
and t = target.


We also change the divideFcn, transferFcn and trainFcn as discussed in Table
1. The MATLAB code is in Appendix (2).

Epochs or iterations are set to 5000.

After changing the functions, the network should be reinitialised using the
command

“mynet = init(mynet);”

otherwise the weights will not be reset.

Fig 13: Training parameters.


Fig 14: Performance Curve.


Fig 15: Training state curve.


Fig 16: Regression Curve.


Step 3: Simulation

Once the neural network is trained, we simulate the network using the code
“results = sim(mynet, xs);”
To check the accuracy of the network, we create a truth table of predicted
class against true class. The network outputs are first mapped to class labels
with a for loop. The logic used is shown below and in Appendix (2).
results = sim(mynet,xs);
for i = 1:numel(t)
    if results(i) >= 1.5
        results(i) = 2;
    elseif results(i) >= 0.5
        results(i) = 1;
    else
        results(i) = 0;
    end
end

The MATLAB function “confusionchart” is used to form this truth table. Using the
code confusionchart(t, results); (true classes first, predicted classes second)
we get the following table, from which the accuracy can be calculated.

Fig 17: Truth Table.


The values in the blue squares along the diagonal of the confusion chart
are the correctly classified samples, hence

accuracy = (147 + 77 + 30) / 297 = 0.855 ≅ 85.5 %
(c) Case 3: Design, train and simulate a neural network using normalized data.

In this case, the normalized dataset is used for training and testing, and the
results are recorded to measure the accuracy of the network obtained with this
dataset.

Step 1: Design

The “feedforwardnet” command is used to build the nodes of the neural network.
We use the code “mynet = feedforwardnet([120 40 60 80]);”, which gives a neural
network with four hidden layers, as shown in the figure.

Fig 18: Feed-forward Neural Network of Normalized dataset.

Step 2: Train

The network is trained using the code “mynet = train(mynet, xn, t);”

where mynet = the neural network,
xn = normalized input data,
and t = target.

We also change the divideFcn, transferFcn and trainFcn as discussed in Table
1. The MATLAB code is in Appendix (3).

Epochs or iterations are set to 5000.

After changing the functions, the network should be reinitialised using the
command

“mynet = init(mynet);”

otherwise the weights will not be reset.


Fig 19: Training parameters.


Fig 20: Performance Curve.


Fig 21: Neural Network Training State curve.

Fig 22: Regression Curve of Normalized neural network.

Step 3: Simulation


Once the neural network is trained, we simulate the network using the code
“results = sim(mynet, xn);”
To check the accuracy of the network, we create a truth table of predicted
class against true class. The network outputs are first mapped to class labels
with a for loop. The logic used is shown below and in Appendix (3).
results = sim(mynet,xn);
for i = 1:numel(t)
    if results(i) >= 1.5
        results(i) = 2;
    elseif results(i) >= 0.5
        results(i) = 1;
    else
        results(i) = 0;
    end
end

The MATLAB function “confusionchart” is used to form this truth table. Using the
code confusionchart(t, results); (true classes first, predicted classes second)
we get the following table, from which the accuracy can be calculated.

Fig 23: Confusion Matrix Chart


The values in the blue squares along the diagonal of the confusion chart
are the correctly classified samples, hence

accuracy = (149 + 75 + 36) / 297 = 0.875 ≅ 87.5 %


5. EVALUATION/CONCLUSION

In this coursework, data classification was carried out by building neural
networks. They were designed, implemented and tested in three different cases.
The table below shows the accuracy and final training results of the neural
network in each case.
Dataset   Gradient value   Performance   Epochs   Accuracy (%)
x         0.036658         0.10987       5000     85.8
xs        0.036285         0.10790       5000     85.5
xn        0.035478         0.092901      5000     87.5
Table 2: Results
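The accuracy column can be re-derived from the diagonal counts of the three confusion charts reported in Section 4.2. The sketch below is a small arithmetic check using only those counts and the dataset size of 297 samples; the variable names are illustrative.

```matlab
% Correctly classified samples per class (diagonal of each confusion chart)
% for the raw (x), standardized (xs) and normalized (xn) cases.
correct = [151 74 30;    % x
           147 77 30;    % xs
           149 75 36];   % xn
nSamples = 297;          % total number of samples in the dataset

% One accuracy value per dataset: correctly classified / total.
accuracy = sum(correct, 2) / nSamples;
fprintf('%.4f\n', accuracy);
```

The three values agree with the accuracy column to within rounding in the last digit and preserve the ranking: the normalized data gives the highest accuracy, followed by the raw data and then the standardized data.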
From the above table it is clear that the accuracy of a neural network depends on
the dataset provided, i.e. it is data-driven. The performance curves decay smoothly
without oscillating; they do not fully converge within 5000 epochs but approach
convergence (small errors). The performance curves also show that the network is
sensitive to the type and quality of the data.
In conclusion, in this coursework the normalized dataset gives the highest accuracy
of the three cases, 87.5%. By changing the algorithms and other parameters, it
should still be possible to reach the 90% benchmark accuracy.

6. REFERENCES

1. MathWorks, (2021), “Deep Learning Toolbox (R2021a)”,
https://uk.mathworks.com/help/deeplearning.
2. Dongas, Niklas, (2020), “Gradient Descent: An Introduction to 1 of Machine
Learning’s most Popular Algorithms”,
https://builtin.com/data-science/gradient-descent.
3. CALLAN, R. (2003) Artificial intelligence, Palgrave Macmillan.
4. NEGNEVITSKY, M. (2010) Artificial intelligence: a guide to intelligent systems, Addison
Wesley.
5. RUSSELL, S.J. and NORVIG, P. (2010) Artificial intelligence: a modern approach, Pearson
Education.
6. Hinkelmann, Knut. "Neural Networks". University of Applied Sciences Northwestern
Switzerland.


7. APPENDIX
MATLAB CODES
load cleveland_heart_disease_dataset_labelled.mat

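(1) MATLAB code for Raw Data

% Indices of the samples belonging to each class
Class0_idx = find(t==0);
Class1_idx = find(t==1);
Class2_idx = find(t==2);

% Data Analysis: plot the 13 features of every sample, coloured by class

figure, plot(x(Class0_idx,:)','y'), title('Data Analysis')
hold on
plot(x(Class1_idx,:)','b')
plot(x(Class2_idx,:)','r')
hold off

% Designing the Network

mynet = feedforwardnet([120 40 60 80]);
%view(mynet);
% train and sim expect one sample per column, so transpose inputs and targets
x = x';
t = t';

% Training the Network

% First pass with the default settings
mynet = train(mynet,x,t);

% Use logsig in all four hidden layers (the output layer stays linear)
mynet.layers{1}.transferFcn = 'logsig';
mynet.layers{2}.transferFcn = 'logsig';
mynet.layers{3}.transferFcn = 'logsig';
mynet.layers{4}.transferFcn = 'logsig';

mynet.divideFcn = 'dividetrain';
mynet.trainFcn = 'traingd';
mynet.trainParam.epochs = 5000;
mynet = init(mynet);   % reinitialise the weights after changing the functions
mynet = train(mynet,x,t);

% Simulating the Network

results = sim(mynet,x);
% Map each network output to the nearest class label (0, 1 or 2)
for i = 1:numel(t)
    if results(i) >= 1.5
        results(i) = 2;
    elseif results(i) >= 0.5
        results(i) = 1;
    else
        results(i) = 0;
    end
end
confusionchart(t, results)   % true classes first, then predicted classes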


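(2) MATLAB code for Standardized Data

% Reload the data so that x and t are in their original (untransposed) form
load cleveland_heart_disease_dataset_labelled.mat

Class0_idx = find(t==0);
Class1_idx = find(t==1);
Class2_idx = find(t==2);

% Standardization: each feature (column) gets mean 0 and standard deviation 1

xs = normalize(x);
figure, plot(xs(Class0_idx,:)','y'), title('Standardized Data')
hold on
plot(xs(Class1_idx,:)','b')
plot(xs(Class2_idx,:)','r')
hold off

% Designing the Network

mynet = feedforwardnet([120 40 60 80]);
%view(mynet);
% train and sim expect one sample per column, so transpose inputs and targets
xs = xs';
t = t';

% Training the Network

% First pass with the default settings
mynet = train(mynet,xs,t);

% Use logsig in all four hidden layers (the output layer stays linear)
mynet.layers{1}.transferFcn = 'logsig';
mynet.layers{2}.transferFcn = 'logsig';
mynet.layers{3}.transferFcn = 'logsig';
mynet.layers{4}.transferFcn = 'logsig';

mynet.divideFcn = 'dividetrain';
mynet.trainFcn = 'traingd';
mynet.trainParam.epochs = 5000;
mynet = init(mynet);   % reinitialise the weights after changing the functions
mynet = train(mynet,xs,t);

% Simulating the Network

results = sim(mynet,xs);
% Map each network output to the nearest class label (0, 1 or 2)
for i = 1:numel(t)
    if results(i) >= 1.5
        results(i) = 2;
    elseif results(i) >= 0.5
        results(i) = 1;
    else
        results(i) = 0;
    end
end
confusionchart(t, results)   % true classes first, then predicted classes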


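(3) MATLAB code for Normalized Data

% Reload the data so that x and t are in their original (untransposed) form
load cleveland_heart_disease_dataset_labelled.mat

Class0_idx = find(t==0);
Class1_idx = find(t==1);
Class2_idx = find(t==2);

% Normalization: each feature (column) is rescaled to the interval [0,1]

xn = normalize(x,'range');
figure, plot(xn(Class0_idx,:)','y'), title('Normalized Dataset')
hold on
plot(xn(Class1_idx,:)','b')
plot(xn(Class2_idx,:)','r')
hold off

% Designing the Network

mynet = feedforwardnet([120 40 60 80]);
%view(mynet);
% train and sim expect one sample per column, so transpose inputs and targets
xn = xn';
t = t';

% Training the Network

% First pass with the default settings
mynet = train(mynet,xn,t);

% Use logsig in all four hidden layers (the output layer stays linear)
mynet.layers{1}.transferFcn = 'logsig';
mynet.layers{2}.transferFcn = 'logsig';
mynet.layers{3}.transferFcn = 'logsig';
mynet.layers{4}.transferFcn = 'logsig';

mynet.divideFcn = 'dividetrain';
mynet.trainFcn = 'traingd';
mynet.trainParam.epochs = 5000;
mynet = init(mynet);   % reinitialise the weights after changing the functions
mynet = train(mynet,xn,t);

% Simulating the Network

results = sim(mynet,xn);
% Map each network output to the nearest class label (0, 1 or 2)
for i = 1:numel(t)
    if results(i) >= 1.5
        results(i) = 2;
    elseif results(i) >= 0.5
        results(i) = 1;
    else
        results(i) = 0;
    end
end
confusionchart(t, results)   % true classes first, then predicted classes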
