Classifying in Machine Learning

Uploaded by

Bích Ngọc

THYNK UNLIMITED

WE LEARN FOR THE FUTURE

CLASSIFYING IN
MACHINE LEARNING
PRESENTATION
PRESENTED BY:

PHẠM TUẤN DŨNG

WHAT IS ARTIFICIAL
INTELLIGENCE?
Artificial intelligence, or AI, is technology that enables
computers and machines to simulate human intelligence
and problem-solving capabilities.

On its own or combined with other technologies (e.g.,
sensors, geolocation, robotics), AI can perform tasks that
would otherwise require human intelligence or
intervention.

Digital assistants, GPS guidance, autonomous vehicles,
and generative AI tools (like OpenAI's ChatGPT) are just
a few examples of AI in the daily news and our daily lives.
WHAT IS MACHINE LEARNING?
Machine learning (ML) is a branch of
artificial intelligence (AI) and computer
science that focuses on using data
and algorithms to enable AI to imitate the
way that humans learn, gradually
improving its accuracy.

It's a key driver of AI applications,
including natural language processing,
image recognition, and recommendation
systems.
TYPES OF MACHINE LEARNING

Supervised Learning

Unsupervised Learning

Semi-Supervised Learning

Reinforcement Learning

SUPERVISED LEARNING:
Supervised learning is a group of algorithms that predict the
output for a new input based on previously known
(input, outcome) pairs. Such a pair is also called
(data, label). Supervised learning is the most popular
group of Machine Learning algorithms.

Supervised learning algorithms are further divided
into two main types:

Classification (Vietnamese: phân loại)

Regression (Vietnamese: hồi quy)

CLASSIFICATION:
Classification is a supervised machine learning method where the model tries to
predict the correct label of a given input data. In classification, the model is fully
trained using the training data, and then it is evaluated on test data before being
used to perform prediction on new unseen data.

For instance, an algorithm can learn to
predict whether a given email is spam or
ham (not spam).
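The train, evaluate, then predict workflow described above can be sketched with a tiny spam/ham classifier. This is only an illustration: the technique (a word-count Naive Bayes model with add-one smoothing), the function names `train` and `predict`, and the four training emails are all assumptions made for the example, not something taken from the presentation.

```python
import math
from collections import Counter, defaultdict

def train(examples):
    """Count words per label from (text, label) pairs."""
    word_counts = defaultdict(Counter)
    label_counts = Counter()
    for text, label in examples:
        label_counts[label] += 1
        word_counts[label].update(text.lower().split())
    return word_counts, label_counts

def predict(model, text):
    """Return the label with the highest Naive Bayes log-score."""
    word_counts, label_counts = model
    vocab = set()
    for counts in word_counts.values():
        vocab.update(counts)
    total_docs = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label, counts in word_counts.items():
        n_words = sum(counts.values())
        score = math.log(label_counts[label] / total_docs)
        for word in text.lower().split():
            # Add-one smoothing so unseen words do not zero out the score.
            score += math.log((counts[word] + 1) / (n_words + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Made-up data, split into a training set and a held-out test set.
train_data = [
    ("win money now", "spam"),
    ("free prize win", "spam"),
    ("meeting at noon", "ham"),
    ("lunch at noon tomorrow", "ham"),
]
test_data = [("win free money", "spam"), ("meeting tomorrow", "ham")]

model = train(train_data)                      # 1. train
correct = sum(predict(model, text) == label for text, label in test_data)
accuracy = correct / len(test_data)            # 2. evaluate on test data
new_label = predict(model, "free money")       # 3. predict on unseen input
```

On this toy data both test emails are classified correctly, but a realistic evaluation would need far more examples.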
REGRESSION:
Regression is a statistical method used
to analyze the relationship between a
dependent variable (target variable) and
one or more independent variables
(predictor variables). The goal is to
determine the most suitable function
that describes the relationship between
these variables.

It seeks to find the best-fitting model,

which can be used to make predictions
or draw conclusions.
UNSUPERVISED LEARNING:
In unsupervised learning, we do not know the outcome or label but only the input data. The
unsupervised learning algorithm relies on the structure of the data to perform
certain tasks, such as clustering or dimensionality reduction for convenient storage
and calculation.
Mathematically, unsupervised learning is when we only have the input data X without
knowing the corresponding labels Y.
Unsupervised learning algorithms are further divided into two main types:

Clustering (Vietnamese: phân nhóm)

Association
CLUSTERING:
Clustering is the process of arranging a group of objects in such a manner that the
objects in the same group (which is referred to as a cluster) are more similar to
each other than to the objects in any other group. Data professionals often use
clustering in the Exploratory Data Analysis phase to discover new information and
patterns in the data. As clustering is unsupervised machine learning, it doesn’t
require a labeled dataset.
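As a rough illustration of grouping unlabeled data, here is a minimal sketch of k-means, one common clustering algorithm (the presentation does not name a specific one). The function name `kmeans_1d`, the one-dimensional data, and the starting centroids are assumptions chosen to keep the example small.

```python
def kmeans_1d(points, centroids, max_iters=100):
    """Cluster 1-D points around the given starting centroids."""
    centroids = list(centroids)
    clusters = [[] for _ in centroids]
    for _ in range(max_iters):
        # Assignment step: each point joins its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda i: abs(p - centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster
        # (an empty cluster keeps its old centroid).
        new_centroids = [
            sum(c) / len(c) if c else centroids[i]
            for i, c in enumerate(clusters)
        ]
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return centroids, clusters

# Made-up, unlabeled measurements with two visible groups.
points = [1.0, 1.5, 2.0, 8.0, 9.0, 10.0]
centroids, clusters = kmeans_1d(points, centroids=[1.0, 10.0])
# centroids -> [1.5, 9.0]
# clusters  -> [[1.0, 1.5, 2.0], [8.0, 9.0, 10.0]]
```

No labels are used anywhere: the groups come only from the distances between the points.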
ASSOCIATION:
Association learning, often referred to in the context of association rule learning, is
a rule-based machine learning method for discovering interesting relations
between variables in large databases. It is intended to identify strong rules
discovered in databases using some measures of interestingness.

This method is widely used for market basket
analysis, where it is used to find relationships
between items that are frequently bought
together.
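Two widely used measures of interestingness are support and confidence. The sketch below computes both for a made-up set of shopping baskets; the item names and the helper names `support` and `confidence` are assumptions for illustration only.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in itemset."""
    itemset = set(itemset)
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)

def confidence(transactions, antecedent, consequent):
    """Estimated P(consequent | antecedent) for the rule antecedent -> consequent."""
    both = support(transactions, set(antecedent) | set(consequent))
    return both / support(transactions, antecedent)

# Made-up market baskets.
baskets = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]

s = support(baskets, {"bread", "milk"})        # 2 of 4 baskets -> 0.5
c = confidence(baskets, {"bread"}, {"milk"})   # 0.5 / 0.75, about 0.667
```

A rule such as "bread -> milk" is usually kept only if both its support and its confidence exceed thresholds chosen by the analyst.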
SEMI-SUPERVISED LEARNING:
Semi-supervised learning is a branch of machine learning that combines
supervised and unsupervised learning by using both labeled and unlabeled data to
train artificial intelligence (AI) models for classification and regression tasks.

In practice, many Machine Learning problems fall into this group because collecting
labeled data is time-consuming and costly. Some types of data even require
experts to label them (medical images, for example). In contrast, unlabeled data can
often be collected at low cost from the internet.
REINFORCEMENT LEARNING:
Reinforcement learning (RL) is a machine learning (ML) technique that trains
software to make decisions that achieve optimal results. It mimics the
trial-and-error learning process that humans use to achieve their goals.

An example of reinforcement learning is
teaching a computer program to play a video
game. The program learns by trying different
actions, receiving points for good moves and
losing points for mistakes. Over time, it learns
the best strategies to maximize its score and
improve its performance in the game.
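The trial-and-error loop can be sketched with tabular Q-learning, one common reinforcement learning method (the presentation does not name a specific one). Everything here is an assumption made for illustration: the four-cell corridor "game", the reward values, the purely random exploration, and the hyperparameters.

```python
import random

N_STATES = 4                   # cells 0..3; reaching cell 3 wins the game
ACTIONS = ("left", "right")

def step(state, action):
    """Hypothetical game rules: +1 for reaching the goal, -0.1 otherwise."""
    next_state = max(state - 1, 0) if action == "left" else state + 1
    if next_state == N_STATES - 1:
        return next_state, 1.0, True
    return next_state, -0.1, False

def train(episodes=500, alpha=0.5, gamma=0.9, seed=0):
    rng = random.Random(seed)
    # Q-table: estimated long-term score of each (state, action) pair.
    q = {(s, a): 0.0 for s in range(N_STATES - 1) for a in ACTIONS}
    for _ in range(episodes):
        state = 0
        for _ in range(100):                  # cap on episode length
            action = rng.choice(ACTIONS)      # explore by trying random actions
            next_state, reward, done = step(state, action)
            best_next = 0.0 if done else max(q[(next_state, a)] for a in ACTIONS)
            # Move the estimate toward reward + discounted future value.
            q[(state, action)] += alpha * (reward + gamma * best_next - q[(state, action)])
            state = next_state
            if done:
                break
    return q

q = train()
# Greedy policy: in each cell, pick the action with the highest learned value.
policy = {s: max(ACTIONS, key=lambda a: q[(s, a)]) for s in range(N_STATES - 1)}
```

After training, the greedy policy should choose "right" in every cell, which is the shortest route to the goal in this toy game.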
SOME BASIC MACHINE LEARNING ALGORITHMS

LINEAR REGRESSION
An algorithm used to predict the value of a variable based on
the value of another variable.

DECISION TREE
A graph of decisions and their possible consequences.

RANDOM FOREST
An ensemble learning method for classification, regression, and other
tasks that works by building a large number of decision trees at
training time.
LINEAR REGRESSION:
Linear Regression is one of the most important algorithms in Machine Learning,
especially in the Supervised Learning category. This algorithm predicts
continuous values based on input data. Linear Regression finds a linear relationship
between the input variable (X) and the output variable (Y) by fitting a straight line
of the form Y = mX + b, where:
m is the slope of the line, also known as the weight.
b is the y-intercept, also known as the bias.
LINEAR REGRESSION:
The goal of the algorithm is to adjust the parameters m and b so that the distance
between the data points and the line is minimized, usually measured by
the sum of squared errors. Linear Regression is used, for example, to predict sales
based on advertising costs or to predict house prices based on location and area.
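For a single input variable, the m and b that minimize the sum of squared errors have a closed-form solution. The sketch below applies it to made-up data; the function name `fit_line` and the numbers are assumptions for illustration.

```python
def fit_line(xs, ys):
    """Least-squares fit of Y = m*X + b for one input variable."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope: covariance of X and Y divided by the variance of X.
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    m = s_xy / s_xx
    b = mean_y - m * mean_x
    return m, b

# Made-up data that lies exactly on Y = 2X + 1.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]

m, b = fit_line(xs, ys)      # m = 2.0, b = 1.0
prediction = m * 5.0 + b     # predicted Y for a new input X = 5 -> 11.0
```

Real data rarely lies exactly on a line, in which case the same formulas return the line with the smallest total squared error.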
DECISION TREE:
A decision tree is a flowchart-like structure in which each internal node represents a
"test" on an attribute (e.g. whether a coin flip comes up heads or tails), each branch
represents the outcome of the test, and each leaf node represents a class label
(decision taken after computing all attributes).
Each leaf node is labeled as the most
common class in the corresponding sub-
dataset.

Once built, the decision tree can be used
to classify new data by following rules
from root to leaf.

Decision trees are applied to classification
and prediction problems in machine learning
and data mining.
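Following the rules from root to leaf can be sketched as a walk over a nested dictionary. The tree below is hand-built and hypothetical (its attributes, values, and labels are invented for the example); a real tree would be learned from training data.

```python
# Hypothetical tree: internal nodes test one attribute, leaves hold a class label.
tree = {
    "attribute": "outlook",
    "branches": {
        "sunny": {
            "attribute": "windy",
            "branches": {
                True: {"label": "stay in"},
                False: {"label": "play"},
            },
        },
        "rainy": {"label": "stay in"},
        "overcast": {"label": "play"},
    },
}

def classify(node, sample):
    """Follow the test outcomes from the root until a leaf is reached."""
    while "label" not in node:
        value = sample[node["attribute"]]   # outcome of this node's test
        node = node["branches"][value]      # take the matching branch
    return node["label"]

label = classify(tree, {"outlook": "sunny", "windy": False})   # "play"
```

Each prediction touches only the nodes on one root-to-leaf path, which is why classifying with a trained tree is fast.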
RANDOM FOREST:
The Random Forest algorithm combines
many decision trees to create a
more stable and accurate Machine
Learning model. Each decision tree in
the Random Forest is trained on a randomly
selected subset of the data: a random sample
is drawn, a decision tree is built on it, and
that tree produces its own predictions.

When there is a new data point to predict, Random Forest makes a prediction by
combining the predictions of all its trees. For classification, the algorithm chooses the
class with the most votes; for regression, it averages the trees' outputs.
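The two ingredients described above, random sampling of the training data and majority voting, can be sketched as follows. To keep the example short, the three "trees" are hypothetical hand-written rules rather than trees actually trained on the samples, and all names (`bootstrap_sample`, `majority_vote`, `forest_predict`) are assumptions.

```python
import random
from collections import Counter

def bootstrap_sample(data, rng):
    """Draw len(data) items with replacement, as done for each tree."""
    return [rng.choice(data) for _ in data]

def majority_vote(predictions):
    """Return the most common prediction (ties go to the first one seen)."""
    return Counter(predictions).most_common(1)[0][0]

# Hypothetical stand-ins for three trained decision trees.
def tree_a(email):
    return "spam" if "free" in email else "ham"

def tree_b(email):
    return "spam" if "win" in email else "ham"

def tree_c(email):
    return "spam" if "!" in email else "ham"

def forest_predict(trees, x):
    """Ask every tree for a prediction and combine the votes."""
    return majority_vote([tree(x) for tree in trees])

trees = [tree_a, tree_b, tree_c]
rng = random.Random(0)
sample = bootstrap_sample(["email 1", "email 2", "email 3", "email 4"], rng)

vote_1 = forest_predict(trees, "win a free phone")    # spam, spam, ham -> "spam"
vote_2 = forest_predict(trees, "see you at lunch!")   # ham, ham, spam -> "ham"
```

Because each tree sees a different sample, individual mistakes tend to differ from tree to tree and are outvoted by the rest.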
ENTROPY AND GINI INDEX IN DECISION TREE
Both entropy and Gini index are
impurity measures used in decision
trees to guide the process of splitting
data points.

They essentially tell you how mixed up
the data is at a particular node in the
tree, and the goal is to make the data
purer (more homogeneous) as you move
down the tree.
ENTROPY:
Entropy, in the context of decision
trees, is a measure of impurity or
disorder within a dataset at a
specific node. It essentially tells you
how mixed up the data is in terms of
class labels.

Entropy is calculated using a formula that involves the probabilities of
each class being present in the data. For a two-class problem (with base-2
logarithms), the result is a value between 0 and 1, where:
0 indicates perfect purity: All data points belong to the same class
(e.g., all emails are spam).
1 indicates complete mix-up: Both classes are equally likely
(completely random).
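HOW TO CALCULATE ENTROPY:

For a node whose classes occur with proportions p_i, the entropy is:

$$H = -\sum_{i} p_i \log_2 p_i$$

Example:
If we had a total of 10 data points in our dataset, with 3 belonging to the positive
class and 7 belonging to the negative class, then we use the formula:

$$H = -\frac{3}{10}\log_2\frac{3}{10} - \frac{7}{10}\log_2\frac{7}{10} \approx 0.88$$

The entropy is approximately 0.88.

The higher the entropy, the more disorder or impurity.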
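The 3-positive, 7-negative example can be checked with a few lines of code. This sketch assumes the standard base-2 entropy, the sum of -p * log2(p) over the classes; the function name `entropy` is chosen for illustration.

```python
import math

def entropy(class_counts):
    """Base-2 entropy of a node, given the number of points in each class."""
    total = sum(class_counts)
    result = 0.0
    for count in class_counts:
        if count == 0:
            continue                  # an absent class contributes nothing
        p = count / total
        result -= p * math.log2(p)
    return result

example = entropy([3, 7])     # about 0.8813, the 0.88 from the example
mixed = entropy([5, 5])       # 1.0: both classes equally likely
pure = entropy([10, 0])       # 0.0: every point in one class
```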
ENTROPY IN DECISION TREE
INFORMATION GAIN:
Information gain, directly related to entropy in decision trees, tells you how
much more organized your data becomes after splitting it based on a
particular feature. In simpler terms, it measures the reduction in uncertainty
about the class labels achieved by learning the value of that feature.

Mathematically, information gain can be expressed with the formula below:

information gain = (entropy of parent node) - (weighted average entropy of
child nodes)

where each child node's entropy is weighted by the fraction of the parent's
data points that it receives.
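As a worked sketch, the code below computes the information gain of one hypothetical split of the earlier 3-positive, 7-negative dataset. It weights each child's entropy by the share of the parent's data points it receives, which is the usual definition when a split produces several child nodes. The split itself and the function names are assumptions made for illustration.

```python
import math

def entropy(class_counts):
    """Base-2 entropy of a node, given the number of points in each class."""
    total = sum(class_counts)
    return -sum((c / total) * math.log2(c / total) for c in class_counts if c > 0)

def information_gain(parent_counts, children_counts):
    """Parent entropy minus the size-weighted average entropy of the children."""
    n_parent = sum(parent_counts)
    weighted = sum(
        (sum(child) / n_parent) * entropy(child) for child in children_counts
    )
    return entropy(parent_counts) - weighted

parent = [3, 7]                # 3 positive, 7 negative
children = [[3, 1], [0, 6]]    # hypothetical split: mostly positive / all negative

gain = information_gain(parent, children)   # about 0.557
```

When building a tree, the split with the highest information gain among the candidates is typically chosen.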
GINI INDEX:
In decision trees, the Gini index, also
known as Gini impurity, is another
measure of impurity used alongside
entropy. It essentially tells you how likely
you are to misclassify a data point if you
were to randomly pick one from a set.

Gini specifically looks at the probability of making a mistake. For a two-class
problem, it is a value between 0 and 0.5, where:
0 represents perfect purity: All data points belong to the same class (no
chance of misclassification).
0.5 represents complete mix-up: Both classes are equally likely
(completely random, high chance of misclassification).
GINI INDEX FORMULA:

For a node whose classes occur with proportions p_i, the Gini index is:

$$Gini = 1 - \sum_{i} p_i^2$$
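The slide's own worked example is not recoverable, so the sketch below instead reuses the 3-positive, 7-negative dataset from the entropy section. It assumes the standard definition of Gini impurity, one minus the sum of the squared class proportions; the function name `gini` is chosen for illustration.

```python
def gini(class_counts):
    """Gini impurity of a node, given the number of points in each class."""
    total = sum(class_counts)
    return 1.0 - sum((count / total) ** 2 for count in class_counts)

example = gini([3, 7])    # 1 - (0.09 + 0.49) = 0.42
mixed = gini([5, 5])      # 0.5: the two-class maximum
pure = gini([10, 0])      # 0.0: no chance of misclassification
```

Gini avoids the logarithm used by entropy, which makes it slightly cheaper to compute, and the two measures usually rank candidate splits similarly.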
THANK YOU VERY MUCH
FOR LISTENING.
