K-Nearest Neighbors in MATLAB & Classification Learner App - Machine Learning - @MATLABHelper
0:04
Welcome! In the last video we looked at the theory behind the K-nearest neighbors
0:08
algorithm, and we went through the steps needed to perform the K-nearest neighbors
0:13
algorithm. In this video we are going to implement those steps in MATLAB. To
0:18
perform the KNN algorithm we will look at two approaches in MATLAB. First we will
0:23
see the usual approach of writing the code in a script, and after that I am going
0:27
to use the Classification Learner app to perform the KNN algorithm. So let us begin.
MATLAB implementation of K-Nearest Neighbors
0:39
In a classification problem we classify the data based on the available information.
0:44
So when building a model, we first identify which variables in the data are the predictors and which variable
0:49
is the response. First let us look at the data and note some important points. Over
0:53
here, in the Workspace window, you can see that I have already imported the data. The data which
0:58
I have imported is used to predict the tumour type based on certain
1:02
features. We can ignore the first feature in the analysis, as it just holds the
1:07
id number. Looking at the second feature, diagnosis, we can say this one is our response
1:13
and all other features are the predictors. Our goal is to use the available information
1:18
to predict whether the patient has a benign tumour or
1:24
a malignant tumour. So for this data set the diagnosis feature is our output label
1:30
and all other features are the input features. From the input features I am going to delete
1:34
the id feature, and I am going to scale all the other features so that they are on the same scale. So
1:40
here you can see the steps that I am going to follow to perform the KNN algorithm:
1:45
In the very first step I will delete unnecessary columns. After that I will handle missing
1:50
values and outliers in the data set. Then I will check for categorical data: if
1:55
the data has categorical information, I will convert that categorical data to numerical
1:59
data. After that we can scale and split the data. Then we create the model on the training set
2:04
and predict on the testing set. At the end we calculate the accuracy, and for that we will
2:09
use the actual y values and the predicted y values. Now let us see the coding approach in the script
2:14
window. In the first section, we are going to remove the id feature from the data, because
2:18
the id feature does not provide any useful information. In the next section of code I am going to split
2:24
the data into xtrain, ytrain, xtest and ytest using the hold-out method. For more information
2:30
about the splitting method, kindly watch lesson number 23. After that I have scaled the input
2:35
data. In MATLAB, the function that fits the algorithm accepts the data as an array,
2:41
so I have used the table2array function to get the data in array format. The fitcknn function with
2:47
the given arguments will create the model on the training data set, considering the 5 nearest neighbors.
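The script steps described so far (drop the id column, hold-out split, scaling, table2array, fitcknn with 5 neighbors) could look roughly like the sketch below. It is not the code shown in the video: the table is a small synthetic stand-in, the feature names radius, texture and area are hypothetical, and the 70/30 split ratio is an assumption.

```matlab
% Hypothetical stand-in for the imported tumour table: an id column, a
% diagnosis label (B = benign, M = malignant) and three numeric features.
rng(1);                                      % reproducible synthetic data
n = 200;
isMalignant = rand(n, 1) > 0.6;
features = randn(n, 3) + 2 * isMalignant;    % malignant rows shifted by +2
data = array2table(features, 'VariableNames', {'radius', 'texture', 'area'});
data.id = (1:n)';
data.diagnosis = categorical(double(isMalignant), [0 1], {'B', 'M'});

% Step 1: remove the id column, which carries no predictive information.
data = removevars(data, 'id');

% Step 2: hold-out split, 70% training and 30% testing (ratio assumed).
cv = cvpartition(data.diagnosis, 'HoldOut', 0.3);
trainTbl = data(training(cv), :);
testTbl  = data(test(cv), :);

% Step 3: convert the predictors to arrays and separate the labels.
x_train = table2array(removevars(trainTbl, 'diagnosis'));
x_test  = table2array(removevars(testTbl,  'diagnosis'));
y_train = trainTbl.diagnosis;
y_test  = testTbl.diagnosis;

% Step 4: scale both sets using the training mean and standard deviation.
mu    = mean(x_train);
sigma = std(x_train);
x_train = (x_train - mu) ./ sigma;
x_test  = (x_test  - mu) ./ sigma;

% Step 5: fit a KNN classifier on the training set with 5 neighbors.
model = fitcknn(x_train, y_train, 'NumNeighbors', 5);
```

Scaling the test set with the training statistics is a design choice made here so that no information from the test set leaks into the model; the video does not say which statistics it uses.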
2:52
To get information about the trained model, run the code and type the name of the model in the Command
2:57
Window. Now run the code and check the basic information of the model. Here you can
3:09
see the basic information of the trained model. Now predict the values for the testing data. The Y_predict
3:16
variable will give us the predicted class for the x_test data. Now we can easily calculate
3:25
the accuracy using the confusion matrix. First let us discuss the confusion matrix.
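The prediction step just described might be sketched as follows. The names x_test and Y_predict follow the video; everything else is an assumption, including the use of the built-in fisheriris sample data as a stand-in for the tumour table, the 70/30 split, and the hypothetical newObs row.

```matlab
% Stand-in data: Fisher's iris set ships with the Statistics and Machine
% Learning Toolbox; it is not the tumour data used in the video.
load fisheriris                  % loads meas (150x4) and species (150x1 cell)

rng(1);                          % reproducible split
cv = cvpartition(species, 'HoldOut', 0.3);
x_train = meas(training(cv), :);
y_train = species(training(cv));
x_test  = meas(test(cv), :);
y_test  = species(test(cv));

% Fit on the training set; Standardize scales the predictors internally.
model = fitcknn(x_train, y_train, 'NumNeighbors', 5, 'Standardize', true);

% Typing the variable name without a semicolon prints the model summary.
model

% Predict the class of every row in the test set.
Y_predict = predict(model, x_test);

% The same call works for a single new observation (hypothetical values).
newObs   = [5.9 3.0 5.1 1.8];
newLabel = predict(model, newObs);
```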
3:29
A confusion matrix is a table that is used to describe the performance of a classification
Confusion Matrix
3:35
model. Here you can see the structure of a confusion matrix for two classes. Each
3:39
row in the confusion matrix represents an actual class, while each column represents
3:44
a predicted class. If the actual label is positive and the predicted label is also positive, then
3:49
we call such entries true positives. If the actual label is negative and the predicted
3:54
label is also negative, then we call such entries true negatives, and if
3:59
the actual label is positive and the predicted label is negative, then we call such entries
4:04
false negatives, and if the actual label is negative and the predicted label is positive,
4:09
then we call such entries false positives. In general, the diagonal entries
4:14
show the correct classifications and the off-diagonal entries show the misclassifications. Over
4:19
here you can see the mathematical formulas for accuracy, error, precision and recall. Let
4:24
us discuss a simple example of a confusion matrix. Here you can see a confusion matrix for
4:28
two labels, male and female. Looking at the confusion matrix, we can say that 40 male labels
4:34
are correctly classified as male, 50 female labels are correctly classified
4:38
as female, and 10 female labels are misclassified. That is all about the confusion
4:43
matrix. Now run the code and check the confusion matrix and the confusion chart. Here you
4:51
can see the confusion chart, and to get the confusion matrix, type result in the Command Window.
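As a rough illustration of the confusion matrix and the metrics derived from it, the sketch below uses eight made-up test labels rather than the video's data. The variable name result follows the video; treating M (malignant) as the positive class is an assumption.

```matlab
% Hypothetical labels for eight test observations (not the video's data).
y_test    = categorical({'M'; 'M'; 'M'; 'B'; 'B'; 'B'; 'B'; 'B'});
Y_predict = categorical({'M'; 'M'; 'B'; 'B'; 'B'; 'B'; 'B'; 'M'});

% Rows are actual classes, columns are predicted classes.
[result, classOrder] = confusionmat(y_test, Y_predict);

% Accuracy is the share of observations that lie on the diagonal.
accuracy  = sum(diag(result)) / sum(result(:));
errorRate = 1 - accuracy;

% Treat 'M' as the positive class for precision and recall (assumption).
pos = find(classOrder == 'M');
TP = result(pos, pos);
FN = sum(result(pos, :)) - TP;   % actual positive, predicted negative
FP = sum(result(:, pos)) - TP;   % actual negative, predicted positive
precision = TP / (TP + FP);
recall    = TP / (TP + FN);

% confusionchart(y_test, Y_predict) would draw the same counts as a chart.
```

With these labels, result is [4 1; 1 2] in the class order B, M, which gives an accuracy of 6/8 = 0.75 and a precision and recall of 2/3 each.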
4:58
To calculate the accuracy I have written out the accuracy formula. Now check the accuracy. Over
5:12
here you can see the accuracy. Now our model is ready, and to predict a new observation, we
5:17
just call this model and pass in the values of the observation. This concludes the MATLAB
5:22
coding implementation of the KNN algorithm. Now let us solve the same problem using the Classification
Classification learner app for K-Nearest Neighbors
5:27
Learner app. The Classification Learner app is part of the Statistics and Machine
5:30
Learning Toolbox. We can import the data into the app and analyze different
5:35
algorithms at the same time. To open the Classification Learner app, type classificationLearner in
5:41
the Command Window, or look under the Apps tab, where you will find the Classification Learner entry.
5:48
Here you can see the interface of the Classification Learner app. Here you
6:02
can see the steps that we use to perform classification in the Learner app. The very first step is to
6:08
import the data into the Classification Learner app. Make sure the data which we are going to import
6:12
has no missing values or outliers. In the next step we will choose the algorithm that we
6:17
are going to use for the classification, and after that we will train the model and analyze its
6:22
performance. To predict new entries, we will export the model to the MATLAB workspace. Now
6:27
let us implement the steps we discussed. To import data, click on the New Session tab. Under the New
6:34
Session tab you can see two ways of importing data. I am going to choose the data from the workspace
6:41
that I have already imported into the workspace from the system. Here you can see the
6:53
data. The Learner app will automatically detect the predictors and the response. Also,
6:59
we can exclude unnecessary features while importing the data. In our case we are going to exclude the id
7:04
feature. Now start the session. The window that opens shows the default settings of the Learner
7:17
app. Under the Features section you can see the PCA and feature selection options. We can
7:22
use those options if the model needs them. Now, under the Model Type section, select the KNN algorithm. Over
7:36
here you can see multiple KNN algorithms. To get more information about an algorithm,
7:41
hover the cursor over it. I could select all the KNN algorithms, but for simplicity I am going
7:47
to select the Fine KNN algorithm. Now click on Train. The model will take some time to train,
8:00
and finally the model is trained. Now you can see the accuracy of the model and some other
8:20
details such as the prediction speed and training time of the model. The model is trained with the
8:26
default settings. Under the Advanced option you can check the default settings for the
8:30
model which is trained. The first point we can note from the Advanced option is that the
8:36
model is trained with 1 nearest neighbor. The second point we can note is that, to find the distance between
8:41
data points, the model uses the Euclidean metric, and the model is trained on scaled data.
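For comparison with the script approach, the Fine KNN settings just listed (1 neighbor, Euclidean distance, scaled data) could be reproduced in code along these lines. This is a sketch under assumptions: fisheriris is a built-in stand-in data set rather than the tumour data, and the 5-fold cross-validation used for the accuracy estimate is assumed, not taken from the video.

```matlab
% Stand-in data: Fisher's iris set ships with the Statistics and Machine
% Learning Toolbox; it is not the tumour data used in the video.
load fisheriris                  % loads meas (150x4) and species (150x1 cell)

% Settings reported for Fine KNN in the video: 1 neighbor, Euclidean
% distance, scaled (standardized) predictors.
fineKnn = fitcknn(meas, species, ...
    'NumNeighbors', 1, ...
    'Distance', 'euclidean', ...
    'Standardize', true);

% Assumption: estimate the accuracy with 5-fold cross-validation.
rng(1);                                   % reproducible fold assignment
cvModel  = crossval(fineKnn, 'KFold', 5);
accuracy = 1 - kfoldLoss(cvModel);        % fraction classified correctly
fprintf('Cross-validated accuracy: %.3f\n', accuracy);
```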
8:46
Under the Plots section you can see different plots. First let us discuss the scatter
8:50
plot. The Data option shows the original data set, and under Model predictions a dot
8:57
shows a correct classification and a cross shows an incorrect classification. The confusion
9:02
matrix shows the counts of correctly classified and misclassified entries. The diagonal elements
9:08
show the correctly classified entries and the off-diagonal elements show the misclassified
9:12
entries. The ROC curve shows a graph for each label. We will cover the ROC
9:21
curve and the parallel coordinates plot in detail in the upcoming classification
9:25
problems. Now we can export this model to the workspace and predict the label for
9:30
new entries. Here you can see that I am going to export the model under the trainedModel
9:35
name. Now check the Workspace window: the model is saved under the name trainedModel, and you can double-click
9:44
on the model to predict new entries. That is all about the KNN algorithm in MATLAB.
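A model exported from the Classification Learner app is a structure with a predictFcn field, so predicting from the command line might look like the sketch below. Both trainedModel and newData are assumed to exist already: newData is a hypothetical table holding the same predictor columns that were used for training. The guard only keeps the snippet from failing in a fresh session.

```matlab
% Assumes trainedModel was exported from the Classification Learner app and
% newData is a table with the same predictor columns used during training.
% Neither variable is created here; both names are placeholders.
if exist('trainedModel', 'var') && exist('newData', 'var')
    % predictFcn applies the stored preprocessing and the trained classifier.
    predictedLabels = trainedModel.predictFcn(newData);
    disp(predictedLabels);
else
    disp('Export the model from the app and define newData first.');
end
```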
Conclusion from MATLAB Helper
9:51
This brings us to the end of this video. In the next video we are going to start the naive Bayes classification
9:56
problem. Thank you for watching this video. Do like
9:59
this video if you found it helpful. If you have any queries, post them in the comments
10:03
and get in touch with us. Follow us on LinkedIn and Facebook, and subscribe to our YouTube channel.
10:07
Education is our future. MATLAB is our feature. Happy MATLABing!