BSC ML CH1

Supervised learning is a machine learning technique where models are trained using labeled data to classify new observations into predefined categories, such as binary or multiclass classification. Key algorithms for classification include Logistic Regression, k-Nearest Neighbors, and Decision Trees, with evaluation metrics like accuracy, precision, recall, and F1 score used to assess model performance. The document also discusses the K-Nearest Neighbor algorithm, explaining its operation and application in classification tasks.


Unit 1: What is supervised learning? Binary and multiclass classification, Evaluation measures for supervised learning, k-Nearest Neighbor algorithm
Machine learning is the field of study that gives computers the capability to learn without being explicitly programmed.

[Diagram: a traditional program combines input data with a hand-coded algorithm to produce output, whereas a machine learning program learns from input data to produce output.]
Relationship Between AI, ML, DL and DS

[Diagram omitted: relationship between Artificial Intelligence, Machine Learning, Deep Learning and Data Science.]
Types
Supervised Learning
• Supervised learning is when we train the machine using data that is well labeled.
• The machine is then given a new set of examples (data) so that the supervised learning algorithm analyses the training data (the set of training examples) and produces a correct outcome from the labeled data.
Classification
• The Classification algorithm is a Supervised Learning technique used to identify the category of new observations on the basis of training data.
• In Classification, a program learns from the given dataset or observations and then classifies new observations into one of a number of classes or groups, such as Yes or No, 0 or 1, Spam or Not Spam, cat or dog, etc. Classes are also called targets, labels or categories.
• Types:
➢ Binary Classifier: If the classification problem has only two possible outcomes, it is called a Binary Classifier.
Examples: YES or NO, MALE or FEMALE, SPAM or NOT SPAM, CAT or DOG, etc.
➢ Multi-class Classifier: If a classification problem has more than two outcomes, it is called a Multi-class Classifier.
Examples: classification of types of crops, classification of types of music.
Binary Classification
• Binary classification is the task of classifying given data into one of two classes. It is essentially a prediction of which of two groups a thing belongs to.
• Because it categorizes data into two distinct classes, it provides a clear decision boundary. This method is essential for tasks like email spam detection and medical diagnostics.
• The most popular algorithms used for binary classification are:

• Logistic Regression
• k-Nearest Neighbors
• Decision Trees
• Support Vector Machine
• Naive Bayes
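
A minimal sketch of binary classification, not from the slides: it assumes scikit-learn and uses an illustrative synthetic dataset with Logistic Regression.

# Binary classification sketch: two classes, e.g. Spam (1) vs. Not Spam (0)
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=1000, n_classes=2, random_state=1)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=1)

clf = LogisticRegression(max_iter=1000)
clf.fit(X_train, y_train)           # learn a decision boundary from labeled data
print(clf.predict(X_test[:5]))      # predicted class (0 or 1) for new observations
print(clf.score(X_test, y_test))    # accuracy on the held-out test data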
Multiclass Classification
Multi-class classification is the task of classifying elements into one of several classes. Unlike binary classification, the number of classes is not restricted to two.

Examples of multi-class classification are


• classification of news in different categories,
• classifying books according to the subject,
• classifying students according to their streams etc.

Popular algorithms that can be used for multi-class classification include:


• k-Nearest Neighbors
• Decision Trees
• Naive Bayes
• Random Forest
• Gradient Boosting
• There are several methods for training multiclass models:
• One-vs-rest strategy: Trains a separate classifier for each class against all
others
• One-vs-one approach: Creates binary classifiers for every pair of classes
• Softmax activation: Often used in neural networks to output probability
distributions across classes
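
A hedged sketch of the first two strategies (assuming scikit-learn; the 3-class dataset is synthetic and illustrative):

# One-vs-rest vs. one-vs-one multiclass training strategies
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.multiclass import OneVsOneClassifier, OneVsRestClassifier

X, y = make_classification(n_samples=600, n_classes=3, n_informative=4, random_state=1)

ovr = OneVsRestClassifier(LogisticRegression(max_iter=1000)).fit(X, y)  # one binary classifier per class (3 here)
ovo = OneVsOneClassifier(LogisticRegression(max_iter=1000)).fit(X, y)   # one binary classifier per pair (3 pairs here)

print(ovr.predict(X[:5]))
print(ovo.predict(X[:5]))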
Classification Type   Output Structure            Example
Binary                Single probability          Spam (1) or Not Spam (0)
Multiclass            Probability distribution    Fruit: Apple (0.7), Orange (0.2), Pear (0.1)
Multi-label           Set of binary indicators    Emotions: Happy (1), Sad (0), Excited (1)
Types of ML Classification Algorithms:
• Linear Models
  • Logistic Regression
  • Support Vector Machines
• Non-linear Models
  • K-Nearest Neighbours
  • Kernel SVM
  • Naïve Bayes
  • Decision Tree Classification
  • Random Forest Classification
Supervised Learning
Examples
• Predicting House Prices:
• Given features like area, number of rooms, location, etc.,
predict the price of a house
• The labeled data would consist of houses with their
corresponding prices

• Email Spam Classification:
• Given the content and features of an email, classify it as
either spam or non-spam
• The labeled data would consist of emails marked as
spam or non-spam
Unsupervised Learning
• Unsupervised learning is the training of a machine using information that is neither
classified nor labeled and allowing the algorithm to act on that information without
guidance.
• Here the task of the machine is to group unsorted information according to similarities,
patterns, and differences without any prior training on labeled data.

Comparison

[Diagram: Machine Learning Models]
• Supervised Learning — task driven, uses pre-categorized (labeled) data; produces predictions and predictive models.
  • Regression (e.g., divide the ties by length): Linear Regression, Decision Tree, Random Forest, Neural Networks
  • Classification (e.g., divide the socks by color): Logistic Regression, Support Vector Machine, Naïve Bayes
• Unsupervised Learning — data driven, uses unlabeled data; performs pattern/structure recognition.
  • Clustering: divide by similarity
  • Association: identify sequences
  • Dimensionality Reduction: compress data based on features
Model Evaluation
⮚ Train/Test is a method to measure the accuracy of your model.
⮚ It is called Train/Test because you split the data set into two sets: a training set and a testing
set.
⮚ Example: 80% for training, and 20% for testing.
⮚ You train the model using the training set.
⮚ You test the model using the testing set.
⮚ Train the model means create the model.
⮚ Test the model means test the accuracy of the model.
⮚ We can measure model performance by two methods. Accuracy simply means the proportion of values correctly predicted.
1. Confusion Matrix
2. Classification Measure
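
A minimal sketch of the Train/Test workflow, assuming scikit-learn; the Iris dataset and the Decision Tree model are illustrative choices, not from the slides.

# Split the data 80/20, train on one part, test on the other
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

model = DecisionTreeClassifier().fit(X_train, y_train)  # "train the model" = create the model
print(model.score(X_test, y_test))                      # "test the model" = measure its accuracy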
Confusion Matrix
• The confusion matrix, also known as the error matrix, is a table that describes the performance of a classification model on a set of test data.
• It is a two-dimensional matrix where each row represents the instances in a predicted class and each column represents the instances in an actual class (or the other way round, depending on convention):

                     Actual Positive   Actual Negative
Predicted Positive         TP                FP
Predicted Negative         FN                TN

Here, TP (True Positive) means the observation is positive and is predicted as positive,
FP (False Positive) means the observation is negative but is predicted as positive,
TN (True Negative) means the observation is negative and is predicted as negative,
and FN (False Negative) means the observation is positive but is predicted as negative.
• Actual values =
['dog', 'cat', 'dog', 'cat', 'dog',
 'dog', 'cat', 'dog', 'cat', 'dog',
 'dog', 'dog', 'dog', 'cat', 'dog',
 'dog', 'cat', 'dog', 'dog', 'cat']

• Predicted values =
['dog', 'dog', 'dog', 'cat', 'dog',
 'dog', 'cat', 'cat', 'cat', 'cat',
 'dog', 'dog', 'dog', 'cat', 'dog',
 'dog', 'cat', 'dog', 'dog', 'cat']
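
Computing the confusion matrix for these lists, in a sketch that assumes scikit-learn and treats 'dog' as the positive class:

from sklearn.metrics import confusion_matrix

actual    = ['dog', 'cat', 'dog', 'cat', 'dog', 'dog', 'cat', 'dog', 'cat', 'dog',
             'dog', 'dog', 'dog', 'cat', 'dog', 'dog', 'cat', 'dog', 'dog', 'cat']
predicted = ['dog', 'dog', 'dog', 'cat', 'dog', 'dog', 'cat', 'cat', 'cat', 'cat',
             'dog', 'dog', 'dog', 'cat', 'dog', 'dog', 'cat', 'dog', 'dog', 'cat']

# rows = actual class, columns = predicted class, in the order given by `labels`
print(confusion_matrix(actual, predicted, labels=['dog', 'cat']))
# [[11  2]   -> TP = 11, FN = 2 (with 'dog' as the positive class)
#  [ 1  6]]  -> FP = 1,  TN = 6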
• A good model is one which has high TP and TN rates and low FP and FN rates.
• If you have an imbalanced dataset to work with, it is better to use the confusion matrix as the evaluation criterion for your machine learning model.
2. Classification Measure
• Basically, it is an extended version of the confusion matrix.
• There are measures other than the confusion matrix which can
help achieve better understanding and analysis of our model and
its performance.
a. Accuracy
b. Precision
c. Recall (TPR, Sensitivity)
d. F1-Score
e. FPR (Type I Error)
f. FNR (Type II Error)
Accuracy

➢ Accuracy is the ratio of the total number of correct predictions to the total number of predictions.
➢ Accuracy is, simply put, the total proportion of observations that have been correctly predicted.
➢ We can use accuracy when we are interested in predicting both 0 and 1 correctly and our dataset is balanced enough.
➢ The formula for calculating accuracy is:

   Accuracy = (TP + TN) / (TP + TN + FP + FN)
A common complaint about accuracy is that it fails when the classes are imbalanced.

For example, if the data contains only 10% positive instances, a majority-baseline classifier that always assigns the negative label would reach 90% accuracy, since it would correctly predict 90% of instances. But of course such a classifier is useless: it never detects a single positive instance.
Precision
• Precision is the ratio between the True Positives and all predicted Positives:

   Precision = TP / (TP + FP)

• Precision measures how many of the positive predictions made are correct (true positives).
• Precision is a good measure to use when the cost of a False Positive is high.
Recall
• Recall measures how well our model identifies True Positives:

   Recall = TP / (TP + FN)

• Thus, for all the patients who actually have heart disease, recall tells us how many we correctly identified as having heart disease.
• Recall measures how completely our model identifies the relevant data. It is also called Sensitivity or the True Positive Rate.
• In most cases we want both precision and recall to be high, but there is a trade-off: raising one usually lowers the other.
• To balance the two, we have another metric called the F1 Score.
F1 Score
The F1 score is an evaluation metric that measures a model's accuracy by combining its precision and recall scores:

   F1 = 2 × (Precision × Recall) / (Precision + Recall)

The F1 score is a popular performance measure for classification and is often preferred over accuracy when the data is unbalanced, such as when the number of examples belonging to one class significantly outnumbers the other.
The F1 Score may be the better measure when we need a balance between Precision and Recall.
Advantages:
• A very small precision or recall will result in a low overall score, so the F1 score helps balance the two metrics.
• If you choose the class with fewer samples as the positive class, the F1 score can help balance the metric across positive/negative samples.
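
Continuing the dog/cat example above, a sketch (assuming scikit-learn) that computes all three metrics for the positive class 'dog':

from sklearn.metrics import precision_score, recall_score, f1_score

actual    = ['dog', 'cat', 'dog', 'cat', 'dog', 'dog', 'cat', 'dog', 'cat', 'dog',
             'dog', 'dog', 'dog', 'cat', 'dog', 'dog', 'cat', 'dog', 'dog', 'cat']
predicted = ['dog', 'dog', 'dog', 'cat', 'dog', 'dog', 'cat', 'cat', 'cat', 'cat',
             'dog', 'dog', 'dog', 'cat', 'dog', 'dog', 'cat', 'dog', 'dog', 'cat']

p  = precision_score(actual, predicted, pos_label='dog')  # TP/(TP+FP) = 11/12
r  = recall_score(actual, predicted, pos_label='dog')     # TP/(TP+FN) = 11/13
f1 = f1_score(actual, predicted, pos_label='dog')         # 2pr/(p+r)
print('Precision: %.3f  Recall: %.3f  F1: %.3f' % (p, r, f1))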

AUC-ROC

• ROC: Receiver Operating Characteristic
• AUC: Area Under the Curve
• The AUC-ROC curve helps us visualize how well our machine learning classifier performs.
• The ROC curve is the graphical representation of the effectiveness of a binary classification model: it plots the True Positive Rate (TPR) against the False Positive Rate (FPR) at different classification thresholds.
• The area under the ROC curve is used as a measure of the quality of a classification model. Hence, the AUC-ROC curve is a performance measurement for the classification problem at various threshold settings.

• The True Positive Rate (sensitivity), or Recall, is defined as

   TPR = TP / (TP + FN)

  It represents the benefit of using the model.

• The False Positive Rate (1 − specificity) is defined as

   FPR = FP / (FP + TN)

  It represents the loss due to the model.
• AUC measures the overall performance of the binary classification model.
• Since both TPR and FPR range between 0 and 1, the area always lies between 0 and 1, and a greater AUC denotes better model performance.
• Our main goal is to maximize this area in order to have the highest TPR and lowest FPR at the given threshold.
• AUC represents the probability that the model can distinguish between the two classes present in our target.
• A higher X-axis value indicates a higher number of False Positives than True Negatives.
• A higher Y-axis value indicates a higher number of True Positives than False Negatives.
• An excellent model has an AUC near 1, which means it has a good measure of separability.
• A poor model has an AUC near 0, which means it has the worst measure of separability.
• When the AUC is 0.5, the model has no class-separation capacity whatsoever.
AUC value (x)        Interpretation
x = 0.5              The ROC is random; the classifier cannot differentiate the positive and negative classes.
0.5 < x <= 0.7       The classifier's performance is poor and limited, but better than random.
0.7 < x <= 0.8       The classifier's performance is decent, but there is still room for improvement.
0.8 < x <= 0.9       The classifier is significantly good and can visibly differentiate the two classes to provide reliable results.
x = 1.0              The ROC is perfect; the classifier provides highly accurate results with reliable performance.
AUC-ROC
• The ROC (Receiver Operating Characteristic) curve tells us how well the model can distinguish between two things (e.g., whether a patient has a disease or not).
• Better models can accurately distinguish between the two classes, whereas a poor model has difficulty distinguishing between them.
• ROC Curves and AUC in Python

# calculate the ROC curve: the function returns the false positive rate and
# true positive rate for each threshold, plus the thresholds themselves
fpr, tpr, thresholds = roc_curve(y, probs)

# calculate the AUC
auc = roc_auc_score(y, probs)
print('AUC: %.3f' % auc)
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_curve
from sklearn.metrics import roc_auc_score
import plotly.express as px
import pandas as pd

# Random classification dataset
X, y = make_classification(n_samples=1000, n_classes=2, random_state=1)

model = LogisticRegression()
model.fit(X, y)

Now we want to evaluate how good our model is using ROC curves. To do this, we need to find the FPR and TPR for various threshold values:

# scores for the positive class (predicted probabilities), required by roc_curve
preds = model.predict_proba(X)[:, 1]

fpr, tpr, thresh = roc_curve(y, preds)

roc_df = pd.DataFrame(zip(fpr, tpr, thresh),
                      columns=["FPR", "TPR", "Threshold"])

# plot the ROC curve from the dataframe
px.line(roc_df, x="FPR", y="TPR").show()
K-Nearest Neighbor (KNN) Algorithm
Introduction
• K-Nearest Neighbor is one of the simplest Machine Learning algorithms, based on the Supervised Learning technique.
• The K-NN algorithm assumes similarity between the new case/data and the available cases, and puts the new case into the category that is most similar to the available categories.
• The K-NN algorithm stores all the available data and classifies a new data point based on similarity. This means that when new data appears, it can easily be classified into a well-suited category using the K-NN algorithm.
• The K-NN algorithm can be used for Regression as well as Classification, but it is mostly used for Classification problems.
• It is also called a lazy learner algorithm because it does not learn from the training set immediately; instead it stores the dataset, and at the time of classification it performs an action on the dataset.
• At the training phase, the KNN algorithm just stores the dataset; when it gets new data, it classifies that data into the category most similar to the new data.
• Example:
Suppose we have an image of a creature that looks similar to both a cat and a dog, and we want to know whether it is a cat or a dog. A KNN model will find the features of the new image that are similar to the cat and dog images and, based on the most similar features, will put it in either the cat or the dog category.

Why do we need the K-NN Algorithm?

• Suppose there are two categories, Category A and Category B, and we have a new data point x1. Which of these categories will this data point lie in?
• To solve this type of problem, we need the K-NN algorithm.
• With the help of K-NN, we can easily identify the category or class of a particular data point.
How does K-NN work?

• Step 1: Select the number K of neighbors.
• Step 2: Calculate the Euclidean distance from the new point to the data points.
• Step 3: Take the K nearest neighbors as per the calculated Euclidean distance.
• Step 4: Among these K neighbors, count the number of data points in each category.
• Step 5: Assign the new data point to the category for which the number of neighbors is maximum.

(The Euclidean distance between points p and q is d(p, q) = √Σᵢ (pᵢ − qᵢ)².)
• In the illustrated example (with K = 3), all 3 nearest neighbors are from Category A, hence the new data point must belong to Category A.
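
A minimal KNN classification sketch (assuming scikit-learn; the Iris dataset is an illustrative choice):

from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

# K = 5: each new point gets the majority class of its 5 nearest neighbors
knn = KNeighborsClassifier(n_neighbors=5)
knn.fit(X_train, y_train)         # a lazy learner: "training" just stores the data
print(knn.predict(X_test[:3]))    # classify new observations by neighbor vote
print(knn.score(X_test, y_test))  # accuracy on the held-out test data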
How to select the value of K in the K-NN Algorithm?
• There is no particular way to determine the best value for K, so we need to try several values to find the best of them. The most commonly used value for K is 5.
• A very low value for K, such as K = 1 or K = 2, can be noisy and subject to the effects of outliers.
• Larger values for K smooth out noise, but a K that is too large may cause difficulties (the neighborhood starts to include points from other classes).
• Optimal K: usually determined using cross-validation to balance bias and variance.
• For a very low value of k (say k = 1), the model overfits the training data, which leads to a high error rate on the validation set. On the other hand, for a high value of k, the model performs poorly on both the training and validation sets. In the example plot, the validation error curve reaches a minimum at k = 9; this is the optimum value for that model (it will vary for different datasets). Researchers typically use the elbow curve, named for its resemblance to an elbow, to determine the k value.
• In a KNN regression example, taking k = 1 gives a very high RMSE value. The RMSE decreases as we increase k: at k = 7 the RMSE is approximately 1219.06, and it shoots up on further increasing k. We can safely say that k = 7 gives the best result in that case.
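
A sketch of selecting K by cross-validation (assuming scikit-learn; the dataset is an illustrative choice, and the best K will differ per dataset):

from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

X, y = load_breast_cancer(return_X_y=True)

# Score each candidate K with 5-fold cross-validation, then pick the elbow/best value
for k in range(1, 12, 2):
    knn = KNeighborsClassifier(n_neighbors=k)
    print('k=%2d  mean CV accuracy=%.3f' % (k, cross_val_score(knn, X, y, cv=5).mean()))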
Advantages/Disadvantages
Advantages of the KNN Algorithm:
• It is simple to implement.
• It is robust to noisy training data.
• It can be more effective if the training data is large.
Disadvantages of the KNN Algorithm:
• It always needs a value of K, which may be complex to determine at times.
• The computation cost is high, because the distance to all the training samples must be calculated for each prediction.
• Preprocessing Steps
• Scaling Features: Distance-based algorithms like KNN are sensitive to varying ranges in feature values. Standardize or normalize your features to ensure fair comparisons (see the sketch after this list).
• Handling Missing Data: Impute or remove missing values, as KNN relies heavily on complete data for distance calculations.
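
A sketch of the effect of feature scaling on KNN (assuming scikit-learn; the Wine dataset is an illustrative choice whose features have very different ranges):

from sklearn.datasets import load_wine
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_wine(return_X_y=True)

raw    = KNeighborsClassifier(n_neighbors=5)  # distances dominated by large-range features
scaled = make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=5))

print('unscaled: %.3f' % cross_val_score(raw, X, y, cv=5).mean())     # typically noticeably lower
print('scaled:   %.3f' % cross_val_score(scaled, X, y, cv=5).mean())  # typically higher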
• Real-World Applications
• Recommendation Systems: Matching users with similar preferences.
• Image Recognition: Identifying objects by comparing pixel patterns.
• Medical Diagnostics: Classifying diseases based on patient records.
• Customer Segmentation: Grouping customers based on purchasing
behavior.
Example KNN
Find the class label for a given instance using KNN with K = 5:
Step 1: Find the distance from the instance to every training point.
Step 2: Rank the training points by distance.
Step 3: Take the K nearest neighbours and assign the majority class.
[Worked tables of distances and ranks omitted.]
• https://github.com/codebasics/py/blob/master/ML/17_knn_classification/knn_classification_tutorial.ipynb