UNIT3 Machine Learning

The document provides an overview of various machine learning algorithms, focusing on supervised and unsupervised learning techniques such as Linear Regression, Support Vector Machines (SVM), Decision Trees, and Random Forests. It explains key concepts, equations, and methodologies associated with these algorithms, including their advantages, disadvantages, and applications. Additionally, it discusses unsupervised learning methods like clustering and association rule learning, highlighting their characteristics and evaluation metrics.


Supervised Learning Algorithms

Unsupervised Learning Algorithms


Machine Learning

Prof. Purvi Patel


[email protected]
What is Linear Regression?
• Definition:
• Linear regression is a type of supervised machine learning algorithm that computes the
linear relationship between the dependent variable and one or more independent
features by fitting a linear equation to observed data.

• Types:
• - Simple Linear Regression: One independent variable
• - Multiple Linear Regression: More than one independent variable
• - Univariate Linear Regression: One dependent variable
• - Multivariate Regression: More than one dependent variable
Types of Linear Regression
• Simple Linear Regression:
  • Equation: y = β₀ + β₁X
  • Variables:
    o y: Dependent variable
    o X: Independent variable
    o β₀: Intercept
    o β₁: Slope
• Multiple Linear Regression:
  • Equation: y = β₀ + β₁X₁ + β₂X₂ + ... + βₙXₙ
  • Variables:
    o y: Dependent variable
    o X₁, X₂, ..., Xₙ: Independent variables
    o β₀: Intercept
    o β₁, β₂, ..., βₙ: Slopes
• Objective:
  • To locate the best-fit line minimizing the error between predicted and actual values

What is the Best Fit Line?
• Best Fit Line: Provides a straight line representing the relationship between the dependent and independent variables
• Slope: Indicates the change in the dependent variable for a unit change in the independent variable(s)
• Equation: y = β₀ + β₁X
• Assumption:
  • Linear relationship between X (experience) and Y (salary)
Hypothesis Function in Linear Regression
• Equation: Ŷ = θ₁ + θ₂X
• For each training example: ŷᵢ = θ₁ + θ₂xᵢ
• Variables:
  o yᵢ: True values (dependent variable)
  o xᵢ: Input independent training data (independent variable)
  o ŷᵢ: Predicted values
  o θ₁: Intercept
  o θ₂: Coefficient of x
Cost Function for Linear Regression
• Definition:
  • The error or difference between the predicted value Ŷ and the true value Y
• Mean Squared Error (MSE):
  • Equation: J(θ) = (1/n) Σᵢ₌₁ⁿ (ŷᵢ − yᵢ)²
• Variables:
  o J(θ): Cost function
  o n: Number of data points
  o ŷᵢ: Predicted values
  o yᵢ: Actual values
Minimizing the Cost Function
• Objective:
  • Update θ₁ and θ₂ to minimize the error between predicted and true values
• Gradient Descent:
  • Iterative process to update θ₁ and θ₂ based on gradients calculated from the MSE
  • Ensures the MSE value converges to the global minimum
Gradient Descent for Linear Regression
• Process:
  • Calculate gradients: ∂J(θ)/∂θ₁ and ∂J(θ)/∂θ₂
  • Update parameters:
    o θ₁ ← θ₁ − α ∂J(θ)/∂θ₁
    o θ₂ ← θ₂ − α ∂J(θ)/∂θ₂
• Learning Rate (α):
  • Controls the step size in gradient descent
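
The update rule above can be illustrated with a minimal NumPy sketch of gradient descent on the MSE cost, using the hypothesis ŷᵢ = θ₁ + θ₂xᵢ. The experience/salary values, learning rate, and iteration count are assumed for illustration only.

import numpy as np

# Illustrative data: years of experience (x) vs. salary in thousands (y); values are assumed.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([30.0, 35.0, 41.0, 44.0, 50.0])

theta1, theta2 = 0.0, 0.0   # intercept and slope, initialised to zero
alpha = 0.01                # learning rate (assumed value)
n = len(x)

for _ in range(5000):
    y_hat = theta1 + theta2 * x               # hypothesis: ŷᵢ = θ₁ + θ₂xᵢ
    error = y_hat - y
    cost = np.mean(error ** 2)                # MSE: J(θ) = (1/n) Σ (ŷᵢ − yᵢ)²
    grad1 = (2.0 / n) * np.sum(error)         # ∂J/∂θ₁ (factor 2 from differentiating the square)
    grad2 = (2.0 / n) * np.sum(error * x)     # ∂J/∂θ₂
    theta1 -= alpha * grad1                   # θ₁ ← θ₁ − α ∂J/∂θ₁
    theta2 -= alpha * grad2                   # θ₂ ← θ₂ − α ∂J/∂θ₂

print(f"intercept={theta1:.3f}, slope={theta2:.3f}, final MSE={cost:.3f}")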
Data Relations
What is Polynomial Regression?
• Definition:
o Polynomial regression is a type of regression analysis used in statistics and machine learning when the
relationship between the independent variable (input) and the dependent variable (output) is not
linear.

• Non-linear Relationship:
o Allows for more flexibility by fitting a polynomial equation to the data.
Why Polynomial Regression?
• Curvilinear Relationships:
  o Suitable for relationships that are better represented by a curve rather than a straight line.
• Capture Non-linear Patterns
• Feature Engineering:
  • Add higher-order terms of the independent features to the feature space.
How Does Polynomial Regression Work?
• General Form:
  • Equation: y = β₀ + β₁x + β₂x² + ... + βₙxⁿ + ε
• Variables:
  • y: Dependent variable
  • x: Independent variable
  • β₀, β₁, ..., βₙ: Coefficients of the polynomial terms
  • n: Degree of the polynomial
  • ε: Error term
Choosing the Polynomial Degree
• Degree (n):
  • A crucial aspect of polynomial regression.
• Higher Degree:
  • Allows the model to fit the training data more closely but may lead to overfitting.
• Complexity:
  • The degree should be chosen based on the complexity of the underlying relationship in the data.
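
A short scikit-learn sketch of the general form y = β₀ + β₁x + ... + βₙxⁿ, comparing a few degrees to show the overfitting risk described above. The synthetic data and the degrees 1, 2, and 10 are assumptions for illustration.

import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

# Illustrative curvilinear data: y depends on x through a quadratic plus noise (assumed).
rng = np.random.RandomState(0)
x = np.linspace(-3, 3, 40).reshape(-1, 1)
y = 0.5 * x.ravel() ** 2 - x.ravel() + rng.normal(scale=0.5, size=40)

for degree in (1, 2, 10):                        # candidate degrees (assumed)
    model = make_pipeline(PolynomialFeatures(degree), LinearRegression())
    model.fit(x, y)                              # fits y = β₀ + β₁x + ... + βₙxⁿ
    r2 = model.score(x, y)                       # training R²; rises as the degree grows
    print(f"degree={degree:2d}  training R^2={r2:.3f}")
# A very high degree fits the training data almost perfectly but tends to overfit;
# the appropriate degree depends on the complexity of the underlying relationship.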
What is Support Vector Machine (SVM)?
• Definition:
  • SVM is a supervised machine learning algorithm used for both classification and regression, though it is best suited for classification.
• Objective:
  • Find the optimal hyperplane in an N-dimensional space that separates the data points into different classes.
How SVM Works
• Hyperplane:
  • The decision boundary that separates data points of different classes.
• Margin:
  • The distance between the hyperplane and the nearest data points from each class.
• Maximum Margin Hyperplane:
  • The hyperplane with the largest margin, providing the best separation.
SVM with Outliers
• Outliers:
  • Data points that do not fit the general pattern.
• Soft Margin:
  • Allows some misclassifications to handle outliers.
• Hinge Loss:
  • Penalty for misclassified points, proportional to the distance from the margin.
Non-Linearly Separable Data
• Kernel Trick:
  • SVM uses kernel functions to map data to a higher-dimensional space where it can be linearly separable.
• Common Kernels:
  • Linear
  • Polynomial
  • Radial Basis Function (RBF)
  • Sigmoid
Support Vector Machine Terminology
• Hyperplane:
  • The decision boundary that separates data points of different classes in a feature space.
• Support Vectors:
  • The closest data points to the hyperplane, playing a critical role in defining the hyperplane and margin.
• Margin:
  • The distance between the support vectors and the hyperplane, which SVM aims to maximize.
SVM Kernel Functions
• Kernel Functions:
  • Linear: K(w, x) = wᵀx + b
  • Polynomial: K(w, x) = (γwᵀx + b)ⁿ
  • Gaussian RBF: K(xᵢ, xⱼ) = exp(−γ‖xᵢ − xⱼ‖²)
  • Sigmoid: K(xᵢ, xⱼ) = tanh(αxᵢᵀxⱼ + b)
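
The kernels listed above can be written as plain NumPy functions, as in the minimal sketch below. The default values of γ, b, α, and the polynomial degree n are assumptions chosen only to make the example runnable.

import numpy as np

def linear_kernel(w, x, b=0.0):
    # Linear: K(w, x) = wᵀx + b
    return np.dot(w, x) + b

def polynomial_kernel(w, x, gamma=1.0, b=1.0, n=3):
    # Polynomial: K(w, x) = (γ wᵀx + b)ⁿ
    return (gamma * np.dot(w, x) + b) ** n

def rbf_kernel(xi, xj, gamma=0.5):
    # Gaussian RBF: K(xᵢ, xⱼ) = exp(−γ ‖xᵢ − xⱼ‖²)
    return np.exp(-gamma * np.linalg.norm(xi - xj) ** 2)

def sigmoid_kernel(xi, xj, alpha=0.1, b=0.0):
    # Sigmoid: K(xᵢ, xⱼ) = tanh(α xᵢᵀxⱼ + b)
    return np.tanh(alpha * np.dot(xi, xj) + b)

a = np.array([1.0, 2.0])
c = np.array([2.0, 0.5])
print(rbf_kernel(a, c), polynomial_kernel(a, c), sigmoid_kernel(a, c))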
Advantages of SVM
• Effective in high-dimensional spaces.
• Memory efficient as it uses a subset of training points (the support vectors).
• Different kernel functions can be specified for the decision function, with the option to define custom kernels.
Mathematical Intuition of SVM
• Binary Classification:
  • Consider a binary classification problem with two classes, labeled as +1 and −1.
• Hyperplane Equation:
  • wᵀx + b = 0
• Distance Calculation:
  • dᵢ = (wᵀxᵢ + b) / ‖w‖
Linear SVM Classifier
• Decision Rule:
  • ŷ = +1 if wᵀx + b ≥ 0, else −1 (consistent with the +1/−1 class labels)
• Optimization:
  • Hard Margin: Minimize (1/2)‖w‖² subject to yᵢ(wᵀxᵢ + b) ≥ 1
  • Soft Margin: Minimize (1/2)‖w‖² + C Σᵢ ζᵢ subject to yᵢ(wᵀxᵢ + b) ≥ 1 − ζᵢ and ζᵢ ≥ 0
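
A from-scratch sketch of the soft-margin objective, minimized with a simple sub-gradient step on the hinge loss. The toy data, labels, learning rate, C, and epoch count are all assumptions; this is not a production solver, only an illustration of the optimization above.

import numpy as np

# Toy 2-D data with labels in {+1, -1} (values assumed for illustration).
X = np.array([[2.0, 3.0], [3.0, 3.5], [1.0, 1.0], [0.5, 0.2], [3.5, 2.5], [0.2, 1.2]])
y = np.array([1, 1, -1, -1, 1, -1])

w = np.zeros(X.shape[1])
b = 0.0
C = 1.0      # penalty for margin violations (assumed)
lr = 0.01    # learning rate (assumed)

for _ in range(1000):
    for xi, yi in zip(X, y):
        margin = yi * (np.dot(w, xi) + b)
        if margin >= 1:
            # Point is outside the margin: only the regulariser (1/2)||w||^2 contributes.
            w -= lr * w
        else:
            # Margin violation: the hinge-loss sub-gradient pushes w toward the correct side.
            w -= lr * (w - C * yi * xi)
            b += lr * C * yi

pred = np.sign(X @ w + b)        # decision rule: ŷ = sign(wᵀx + b)
print("weights:", w, "bias:", b, "training accuracy:", np.mean(pred == y))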
Types of Support Vector Machines
• Linear SVM:
  • Uses a linear decision boundary to separate data points of different classes.
• Non-Linear SVM:
  • Uses kernel functions to handle non-linearly separable data by transforming it into a higher-dimensional space.
What is a Decision Tree?
• A versatile, interpretable algorithm used for predictive modeling.
• Suitable for both classification and regression tasks.
• A visual representation of decisions and their possible consequences.
Decision Tree Structure
• Root Node: Represents the initial feature or decision.
• Internal Nodes: Test on attributes, leading to further branching.
• Leaf Nodes: Represent the final decision or prediction.
• Branches: Indicate the outcomes of decisions.
• Splitting: The process of dividing nodes based on decision criteria.
• Pruning: Removing unnecessary branches to improve accuracy.
Decision Tree Approach
Attribute Selection Measures
• Information Gain: Measures the change in entropy after a split.
• Gini Index: Measures the impurity of a node.
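
A small sketch of both measures: entropy-based information gain for a candidate split, and the Gini index of a node. The example class labels and the particular split are assumed for illustration.

import numpy as np

def entropy(labels):
    # H = -Σ p·log2(p) over the class proportions in a node
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return -np.sum(p * np.log2(p))

def gini(labels):
    # Gini = 1 - Σ p²; 0 means a perfectly pure node
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return 1.0 - np.sum(p ** 2)

def information_gain(parent, left, right):
    # Change in entropy after splitting the parent node into two children
    n = len(parent)
    weighted = (len(left) / n) * entropy(left) + (len(right) / n) * entropy(right)
    return entropy(parent) - weighted

# Assumed example: a binary-labelled parent node split into two child nodes.
parent = np.array([1, 1, 1, 0, 0, 0, 1, 0])
left, right = parent[:4], parent[4:]
print("Gini(parent):", gini(parent))
print("Information gain of this split:", information_gain(parent, left, right))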
How Decision Trees Are Formed
• Recursive Partitioning: Splitting data based on attributes.
• Selecting Attributes: Use criteria like Information Gain or Gini Index.
• Stopping Criterion: Maximum depth or minimum instances in a leaf node.
A scikit-learn sketch of these steps follows below.
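
The sketch below ties the steps above to concrete scikit-learn parameters: the attribute-selection measure and the stopping criteria. The synthetic dataset and the parameter values are assumptions for illustration.

from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Assumed synthetic dataset; any labelled tabular data would work here.
X, y = make_classification(n_samples=300, n_features=6, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

tree = DecisionTreeClassifier(
    criterion="gini",      # attribute-selection measure ("entropy" uses information gain)
    max_depth=4,           # stopping criterion: maximum depth (assumed value)
    min_samples_leaf=5,    # stopping criterion: minimum instances in a leaf (assumed value)
    random_state=0,
)
tree.fit(X_train, y_train)                 # recursive partitioning on the training data
print("test accuracy:", tree.score(X_test, y_test))
print("feature importances:", tree.feature_importances_)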
Advantages
• Interpretability: Easy to understand and visualize.
• Versatility: Handles both numerical and categorical data.
• Feature Importance: Provides insights into which features are most important.
• Handling Missing Data: Decision trees can manage missing values effectively.
Disadvantages
• Overfitting: Decision trees can be prone to overfitting, especially with small datasets.
• Data Sensitivity: Small changes in the data can lead to a completely different tree.
• Bias: Potential bias in the presence of imbalanced data.
What is Random Forest?
• A powerful ensemble learning technique.
• Combines multiple decision trees to enhance predictive accuracy.
• Introduced in 2001 by Leo Breiman.
• Widely used for both classification and regression tasks.
Fundamental Concepts
• Ensemble of Decision Trees: Multiple trees work together toward a common output.
• Randomness in Training: Random subsets of data and features reduce overfitting.
• Final Prediction: Aggregation of individual tree predictions (voting for classification, averaging for regression).

Random Forest Algorithm
• Training Phase: Builds multiple decision trees using random subsets of data and features.
• Prediction Phase: Aggregates the results from all trees for the final prediction.
• Advantages: Reduces overfitting, improves accuracy, handles complex data.
Ensemble Learning Models
• Concept: Combining multiple models to improve performance.
• Analogy: Like a team of experts collaborating on a problem.
• Examples: Random Forest, XGBoost, AdaBoost, LightGBM, Bagging
Bagging and Boosting
• Bagging: Training multiple weak models on different data subsets and averaging the results.
• Boosting: Sequential training where each model corrects the errors of the previous one, with weighted voting for the final prediction.
A short comparison of the two ideas follows below.
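
A brief sketch contrasting the two strategies with scikit-learn: BaggingClassifier trains trees independently on bootstrap subsets, while AdaBoostClassifier trains them sequentially, reweighting the errors of earlier models. The synthetic dataset, estimator counts, and cross-validation setup are assumed.

from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier, BaggingClassifier
from sklearn.model_selection import cross_val_score

X, y = make_classification(n_samples=500, n_features=10, random_state=1)  # assumed data

# Bagging: many trees trained independently on bootstrap samples, predictions averaged.
bagging = BaggingClassifier(n_estimators=50, random_state=1)

# Boosting: trees trained sequentially, each focusing on the errors of the previous one.
boosting = AdaBoostClassifier(n_estimators=50, random_state=1)

for name, model in [("bagging", bagging), ("boosting", boosting)]:
    scores = cross_val_score(model, X, y, cv=5)
    print(f"{name}: mean cross-validation accuracy = {scores.mean():.3f}")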
How Random Forest Works
• Step 1: Select K random data points from the training set.
• Step 2: Build decision trees for the selected subsets.
• Step 3: Choose the number N of decision trees.
• Step 4: Repeat Steps 1 and 2 to build the forest.
• Step 5: For new data, aggregate the predictions from all trees (majority vote for classification, average for regression).
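
These steps map onto scikit-learn's RandomForestClassifier, as in the hedged sketch below; the dataset and parameter values are assumptions, and out-of-bag validation illustrates the built-in cross-validation mentioned later.

from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

X, y = make_classification(n_samples=600, n_features=8, random_state=2)  # assumed data

forest = RandomForestClassifier(
    n_estimators=100,      # Step 3: the number N of decision trees (assumed value)
    max_features="sqrt",   # random subset of features considered at each split
    bootstrap=True,        # Steps 1-2: each tree sees a random bootstrap sample
    oob_score=True,        # internal validation on out-of-bag samples
    random_state=2,
)
forest.fit(X, y)           # Step 4: builds the whole forest
print("out-of-bag accuracy:", forest.oob_score_)
print("predicted class (majority vote):", forest.predict(X[:1]))   # Step 5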
Random Forest Approach
Key Features of Random Forest
• High Predictive Accuracy: Collaborative decision-making leads to better predictions.
• Resistance to Overfitting: Randomness in training helps the model generalize better.
• Handling Large Datasets: Efficiently manages large and complex datasets.
• Variable Importance: Identifies and ranks the most important features.
• Built-in Cross-Validation: Out-of-bag samples are used for internal validation.
• Handling Missing Values: Robust against incomplete data.
• Parallelization: Trees can be trained simultaneously, speeding up the process.
Potential Drawbacks
• Complexity: More computationally intensive than single models.
• Interpretability: Less transparent than individual decision trees.
• Memory Usage: Requires more memory to store multiple trees.
Unsupervised Learning
• Learning from unlabeled data without predefined categories.
• Focus on discovering patterns and relationships autonomously.
How Unsupervised Learning Works
• Process Overview:
  – No explicit guidance or labeled data.
  – The model identifies hidden structures in the data.
• Example:
  – Distinguishing between different species of animals based on traits without prior labeling.
Key Characteristics of Unsupervised Learning
• Pattern Discovery: Models find patterns in data without labels.
• Clustering: Grouping similar data points together.
• Feature Extraction: Capturing essential information to differentiate data.
• Label Association: Assigning categories based on discovered patterns.
Example of Unsupervised Learning
• Scenario: A model is trained on unlabeled images of cows, elephants, and camels.
• It identifies and groups the images based on similarities, even without prior knowledge of what a cow, elephant, or camel looks like.
Unsupervised Learning
Types of Unsupervised Learning
• Clustering: Grouping similar data points together.
• Association: Identifying patterns and relationships between items in a dataset.
Clustering
• Types of Clustering (see the K-means sketch below):
  – Hierarchical Clustering
  – K-means Clustering
  – Principal Component Analysis (PCA)
  – Singular Value Decomposition (SVD)
  – Independent Component Analysis (ICA)
  – Gaussian Mixture Models (GMMs)
  – Density-Based Spatial Clustering of Applications with Noise (DBSCAN)
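
A minimal K-means sketch on synthetic unlabeled data: similar points are grouped together without any labels being shown to the model. The number of clusters and the blob-generation parameters are assumptions.

from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

# Unlabeled data drawn from three assumed groups; the true labels are discarded.
X, _ = make_blobs(n_samples=300, centers=3, random_state=3)

kmeans = KMeans(n_clusters=3, n_init=10, random_state=3)   # k = 3 is an assumption
labels = kmeans.fit_predict(X)                             # groups similar points together

print("cluster sizes:", [int((labels == k).sum()) for k in range(3)])
print("cluster centres:\n", kmeans.cluster_centers_)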
Association Rule Learning
• Definition:
  – Identifying patterns in data using association rules.
• Algorithms:
  – Apriori Algorithm
  – Eclat Algorithm
  – FP-Growth Algorithm
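
The algorithms listed above all reason in terms of itemset support and rule confidence; the minimal sketch below computes both for one candidate rule. The market-basket transactions and item names are assumed for illustration.

# Assumed market-basket transactions; item names are illustrative only.
transactions = [
    {"bread", "milk"},
    {"bread", "butter", "milk"},
    {"butter", "milk"},
    {"bread", "butter"},
    {"bread", "butter", "milk", "eggs"},
]

def support(itemset, transactions):
    # Fraction of transactions containing every item in the itemset
    return sum(itemset <= t for t in transactions) / len(transactions)

def confidence(antecedent, consequent, transactions):
    # Confidence of the rule antecedent -> consequent
    return support(antecedent | consequent, transactions) / support(antecedent, transactions)

rule_from, rule_to = {"bread"}, {"butter"}
print("support({bread, butter}) =", support(rule_from | rule_to, transactions))
print("confidence(bread -> butter) =", confidence(rule_from, rule_to, transactions))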
Evaluating Unsupervised Learning Models
• Evaluation Metrics:
  – Silhouette Score
  – Calinski-Harabasz Score
  – Adjusted Rand Index
  – Davies-Bouldin Index
  – F1 Score (adapted for clustering)
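
A hedged sketch using two of these metrics: the Silhouette Score, which needs no ground truth, and the Adjusted Rand Index, which compares against known labels when they happen to be available. The data and candidate cluster counts are assumptions.

from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import adjusted_rand_score, silhouette_score

X, true_labels = make_blobs(n_samples=400, centers=4, random_state=4)  # assumed data

for k in (2, 3, 4, 5):                                   # candidate cluster counts (assumed)
    pred = KMeans(n_clusters=k, n_init=10, random_state=4).fit_predict(X)
    sil = silhouette_score(X, pred)                      # internal metric: no labels needed
    ari = adjusted_rand_score(true_labels, pred)         # external metric: uses true labels
    print(f"k={k}: silhouette={sil:.3f}, adjusted Rand index={ari:.3f}")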
Applications of Unsupervised Learning
• Areas of Application:
  – Anomaly Detection
  – Scientific Discovery
  – Recommendation Systems
  – Customer Segmentation
  – Image Analysis
Advantages of Unsupervised Learning
• No need for labeled training data.
• Effective for dimensionality reduction.
• Capable of finding unknown patterns.
• Provides insights from unlabeled data.
Disadvantages of Unsupervised Learning
• Hard to measure accuracy due to the lack of predefined answers.
• Typically lower accuracy compared to supervised learning.
• Requires manual interpretation and labeling post-classification.
• Sensitive to data quality, and performance is challenging to evaluate.
Thank You !!
