
UNIT - III

Machine learning (ML) is defined as a discipline of artificial intelligence (AI) that provides
machines the ability to automatically learn from data and past experiences to identify patterns
and make predictions with minimal human intervention.
Supervised Learning
• Supervised learning is a type of machine learning that uses labeled data to train machine
learning models. In labeled data, the output is already known. The model just needs to
map the inputs to the respective outputs.
• The algorithm uses this knowledge to generalize to new examples that it has never
seen before.
• Using labeled inputs and outputs, the model can measure its accuracy and improve
over time.
• An example of supervised learning is to train a system that identifies the image of an
animal.
• Supervised learning is classified into two types:
• Classification:
• The output is a discrete class label, e.g., spam vs. not spam.
• Common algorithms: linear classifiers, support vector machines, decision trees, random forests.
• Regression:
• The output is a continuous value, such as a price or a probability.
• Linear regression is the most common regression algorithm. (Despite its name, logistic regression is a classification algorithm; see below.)
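As a minimal sketch of both task types, the snippet below (assuming scikit-learn and NumPy, which the notes do not prescribe) trains a decision-tree classifier on discrete labels and a linear regressor on a continuous target:

from sklearn.datasets import load_iris
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
import numpy as np

# Classification: discrete class labels (iris species).
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
clf = DecisionTreeClassifier(random_state=0).fit(X_train, y_train)
print("classification accuracy:", clf.score(X_test, y_test))

# Regression: a continuous target with a known linear relationship.
rng = np.random.default_rng(0)
X_r = rng.uniform(0, 10, size=(100, 1))
y_r = 3.0 * X_r.ravel() + 2.0 + rng.normal(0, 1, 100)
reg = LinearRegression().fit(X_r, y_r)
print("learned slope, intercept:", reg.coef_[0], reg.intercept_)  # near 3.0 and 2.0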
Unsupervised Learning
• The algorithm is not given any labels; it discovers hidden patterns in the data and
groups similar items together without the need for human intervention.
• These algorithms don't make predictions; they only group the data.
• Clustering: the algorithm groups similar examples together, e.g., a business might
group customers based on similarities such as age, location, or spending habits.
• Association: the algorithm looks for relationships between variables in the data,
e.g., a business might want to know which items are often bought together.
• Dimensionality Reduction: the algorithm reduces the number of variables in the data
while preserving as much of the information as possible. This technique is normally
used in the data pre-processing stage, e.g., to remove noise from images.
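A minimal clustering sketch, assuming scikit-learn's KMeans (an illustrative choice; the notes do not name an algorithm), grouping synthetic "customers" by age and spending:

import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
# Two synthetic customer groups: (age, monthly spend).
younger = rng.normal([25, 200], [3, 30], size=(50, 2))
older = rng.normal([55, 800], [5, 60], size=(50, 2))
X = np.vstack([younger, older])

# Group the unlabeled points into 2 clusters; no labels are ever provided.
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)
print("cluster centers (age, spend):")
print(kmeans.cluster_centers_)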
Semi-Supervised Learning
• The training data set contains both labeled and unlabeled data.
• Semi-supervised learning is a branch of machine learning that
combines supervised and unsupervised learning by using both labeled and unlabeled
data to train models for classification and regression tasks.
• A common approach (self-training) is to train an initial model on the few labeled
samples and then iteratively apply it to the much larger pool of unlabeled data, as
sketched below.
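A short self-training sketch, assuming scikit-learn's SelfTrainingClassifier (one of several possible semi-supervised approaches):

import numpy as np
from sklearn.datasets import load_iris
from sklearn.semi_supervised import SelfTrainingClassifier
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)
rng = np.random.default_rng(0)

# Pretend 80% of the labels are unknown: scikit-learn marks unlabeled
# samples with -1.
y_partial = y.copy()
y_partial[rng.random(len(y)) < 0.8] = -1

# Train on the few labeled samples, then iteratively pseudo-label the rest.
base = SVC(probability=True, gamma="auto")
model = SelfTrainingClassifier(base).fit(X, y_partial)
print("accuracy against all true labels:", model.score(X, y))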
Reinforcement Learning
• Reinforcement learning is a type of machine learning algorithm that learns to solve a
multi-step decision problem by trial and error.
• The machine is trained on realistic scenarios to make a sequence of decisions. It
receives rewards or penalties for the actions it performs, and its goal is to maximize
the total reward.
• Example applications: natural language processing, image processing.
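A toy sketch of trial-and-error learning: tabular Q-learning on a five-cell corridor world of our own invention (the environment, rewards, and hyperparameters are illustrative assumptions, not from the notes):

import numpy as np

n_states, n_actions = 5, 2            # actions: 0 = left, 1 = right
Q = np.zeros((n_states, n_actions))   # learned value of each (state, action)
alpha, gamma, epsilon = 0.5, 0.9, 0.3
rng = np.random.default_rng(0)

for episode in range(200):
    s = 0
    while s != n_states - 1:          # the episode ends at the goal cell
        # Epsilon-greedy: mostly exploit, sometimes explore (trial and error).
        a = rng.integers(n_actions) if rng.random() < epsilon else int(Q[s].argmax())
        s_next = max(0, s - 1) if a == 0 else s + 1
        r = 1.0 if s_next == n_states - 1 else 0.0   # reward only at the goal
        # Q-learning update: nudge Q(s, a) toward reward + discounted future value.
        Q[s, a] += alpha * (r + gamma * Q[s_next].max() - Q[s, a])
        s = s_next

print("greedy action per state (1 = right):", Q.argmax(axis=1))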

Linear Regression
• Linear regression analysis is used to predict the value of a variable based
on the value of another variable.
• The variable you want to predict is called the dependent variable.
• The variable you are using to predict the other variable's value is called the
independent variable.
• Linear regression fits a straight line or surface that minimizes the
discrepancies between predicted and actual output values.
• There are simple linear regression calculators that use a “least squares”
method to discover the best-fit line for a set of paired data.
• Estimate the value of Y (the dependent variable) from X (the independent
variable).
Least Squares Method

For paired data (xᵢ, yᵢ), the least squares method finds the best-fit line y = a + bx by minimizing the sum of squared residuals Σ(yᵢ - a - bxᵢ)². The slope and intercept are:

b = Σ(xᵢ - x̄)(yᵢ - ȳ) / Σ(xᵢ - x̄)²
a = ȳ - b·x̄

where x̄ and ȳ are the means of the x and y values.

Example
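A small worked example of the least squares formulas above, with synthetic numbers of our own choosing:

import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 8.0, 9.8])

x_bar, y_bar = x.mean(), y.mean()
b = ((x - x_bar) * (y - y_bar)).sum() / ((x - x_bar) ** 2).sum()  # slope
a = y_bar - b * x_bar                                             # intercept
print(f"best-fit line: y = {a:.2f} + {b:.2f} x")                  # y = 0.15 + 1.95 x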

Multiple Linear Regression

Multiple linear regression extends simple linear regression to two or more independent variables. The model is:

y = b₀ + b₁x₁ + b₂x₂ + … + bₖxₖ

where y is the dependent variable, x₁ … xₖ are the independent variables, b₀ is the intercept, and b₁ … bₖ are the coefficients estimated by least squares.

Example
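A minimal multiple regression sketch, fitted with scikit-learn on synthetic data (our own illustrative numbers):

import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=(200, 2))       # two predictors x1, x2
y = 1.0 + 2.0 * X[:, 0] - 3.0 * X[:, 1] + rng.normal(0, 0.5, 200)

model = LinearRegression().fit(X, y)
print("intercept b0:", model.intercept_)    # near 1.0
print("coefficients b1, b2:", model.coef_)  # near [2.0, -3.0]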
Logistic Regression
• Supervised learning algorithms can be grouped under two main categories:
• Regression: Predicting continuous target variables. For example,
predicting the price of a house is a regression task.
• Classification: Predicting discrete target variables. For example, predicting
whether an email is spam is a classification task.
• Logistic regression is a supervised learning algorithm that is mostly used
to solve binary classification tasks, although it contains the word
"regression".
• "Logistic" refers to the logistic (sigmoid) function σ(z) = 1 / (1 + e⁻ᶻ), which
maps any real value into (0, 1) and performs the actual classification in the
algorithm.
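A minimal logistic regression sketch, assuming scikit-learn and a synthetic one-feature data set of our own choosing:

import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
# One feature; class 1 becomes more likely as x grows.
x = rng.uniform(-4, 4, size=(200, 1))
p = 1.0 / (1.0 + np.exp(-2.0 * x.ravel()))  # true sigmoid probabilities
y = (rng.random(200) < p).astype(int)       # sampled binary labels

clf = LogisticRegression().fit(x, y)
print("P(class 1 | x = 1.5):", clf.predict_proba([[1.5]])[0, 1])
print("predicted class at x = 1.5:", clf.predict([[1.5]])[0])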

Refer to the problem on the last page.


Bayesian Linear Regression
Bayesian Regression
• Bayesian regression is a type of linear regression that uses Bayesian statistics
to estimate the unknown parameters of a model.
• It uses Bayes’ theorem to estimate the posterior probability of a set of parameters
given observed data.
• The goal of Bayesian regression is to find the best estimate of the parameters
of a linear model that describes the relationship between the independent and
the dependent variables.
Bayesian Linear Regression
• Bayesian linear regression considers various plausible explanations for how the data
were generated.
• It makes predictions using all possible regression weights, weighted by their posterior
probability.
• In the Bayesian viewpoint, we formulate linear regression using probability distributions
rather than point estimates.
• The response, y, is not estimated as a single value but is assumed to be drawn from a
probability distribution. The model for Bayesian linear regression, with the response
sampled from a normal distribution, is:

y ~ N(βᵀX, σ²)

• The output y is generated from a normal (Gaussian) distribution characterized by a
mean and a variance.
• The mean for linear regression is the transpose of the weight matrix multiplied by the
predictor matrix: μ = βᵀX.
• The variance is the square of the standard deviation, σ².
• The aim of Bayesian Linear Regression is not to find the single “best” value of the
model parameters, but rather to determine the posterior distribution for the model
parameters.
• The posterior probability of the model parameters is conditional upon the training inputs
and outputs:

P(β | y, X) = P(y | β, X) · P(β | X) / P(y | X)

that is, posterior = (likelihood × prior) / normalization.
The two primary benefits of Bayesian linear regression are:


1. Priors: If we have domain knowledge, or a guess for what the model parameters
should be, we can include them in our model. If we don’t have any estimates ahead
of time, we can use non-informative priors for the parameters such as a normal
distribution.
2. Posterior: The result of performing Bayesian Linear Regression is a distribution of
possible model parameters based on the data and the prior. This allows us to
quantify our uncertainty about the model: if we have fewer data points, the posterior
distribution will be more spread out.

As the amount of data points increases, the likelihood washes out the prior, and in the
case of infinite data, the outputs for the parameters converge to the values obtained from
Ordinary Least Squares.
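A sketch of the ideas above under simplifying assumptions of our own: a conjugate Gaussian prior on the weights and a known noise variance, for which the posterior has a closed form:

import numpy as np

rng = np.random.default_rng(0)
n = 20
X = np.column_stack([np.ones(n), rng.uniform(0, 5, n)])  # design matrix [1, x]
true_w = np.array([1.0, 2.0])
sigma2 = 0.25                                            # known noise variance
y = X @ true_w + rng.normal(0, np.sqrt(sigma2), n)

tau2 = 10.0                                              # prior: w ~ N(0, tau2 * I)
# Closed-form posterior: N(mu, S) with
#   S  = (X^T X / sigma2 + I / tau2)^(-1)
#   mu = S X^T y / sigma2
S = np.linalg.inv(X.T @ X / sigma2 + np.eye(2) / tau2)
mu = S @ X.T @ y / sigma2

print("posterior mean of weights:", mu)                  # near [1.0, 2.0]
print("posterior std of weights:", np.sqrt(np.diag(S)))  # shrinks as n grows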

Discriminative model:
Linear Discriminant Analysis (LDA) is a dimensionality reduction and classification technique
commonly used in machine learning and pattern recognition.
In the context of classification, it aims to find a linear combination of features that best separates
different classes or categories of data.
It seeks to reduce the dimensionality of the feature space while preserving as much of the class-
separability information as possible.

Steps:
1. Data Preparation: Let’s say we have 150 iris samples with four features each,
and the samples are evenly distributed among the three species.
2. Compute Class Statistics: Calculate the mean vector and covariance matrix for
each class. This gives us three mean vectors and three covariance matrices
(one for each class).
3. Compute Between-Class and Within-Class Scatter Matrices: Calculate the
between-class scatter matrix by computing the differences between the mean
vectors of each class and the overall mean, and then summing these outer
products. Calculate the within-class scatter matrix by summing the covariance
matrices of each class, weighted by the number of samples in each class.
4. Compute Eigenvectors and Eigenvalues: Solve the generalized eigenvalue
problem using the between-class scatter matrix and the within-class scatter matrix.
This gives us a set of eigenvectors and their corresponding eigenvalues.
5. Select Discriminant Directions: Sort the eigenvectors by their eigenvalues in
descending order. Let’s say we want to reduce the dimensionality to 2, so we
select the top two eigenvectors.
6. Transform Data: Project the original iris data onto the two selected eigenvectors.
This gives us a new two-dimensional representation of the data.
7. Classification: In the reduced-dimensional space, we can use a classifier (e.g., k-
nearest neighbors) to classify the iris flowers into one of the three species based
on their positions in the reduced space.
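A short sketch of this pipeline, assuming scikit-learn's LDA implementation on the iris data: reduce the four features to two discriminant directions, then classify with k-nearest neighbors (step 7):

from sklearn.datasets import load_iris
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Steps 2-6: fit LDA and project the 4 features onto 2 discriminant directions.
lda = LinearDiscriminantAnalysis(n_components=2).fit(X_train, y_train)
X_train_2d = lda.transform(X_train)
X_test_2d = lda.transform(X_test)

# Step 7: classify in the reduced space with k-nearest neighbors.
knn = KNeighborsClassifier(n_neighbors=3).fit(X_train_2d, y_train)
print("accuracy in the reduced space:", knn.score(X_test_2d, y_test))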

Probabilistic discriminative model

Naïve Bayes classifiers


• The Naïve Bayes classifier is a supervised machine learning algorithm that is
used for classification tasks, such as text classification.
• Naive Bayes is a classification technique based on Bayes’ theorem, with the
assumption that all the features that predict the target value are independent of
each other.
• It calculates each class’s probability and then picks the one with the highest
probability.
• It is called "naive" because of this assumption of independence among the predictors.
• It scales to large data sets and is mostly used for text data.
• Examples: email classification, Twitter sentiment analysis, etc.

Bayes’ Theorem:
• Bayes’ theorem is an indispensable law of probability, allowing you to
quantify unknown probabilities from known ones.
• Bayes’ Theorem allows you to update the predicted probabilities of an event by
incorporating new information.
• Bayes’ Theorem was named after 18th-century mathematician Thomas Bayes.
• It often is employed in finance in calculating or updating risk evaluation.
• The theorem has become a useful element in the implementation of machine
learning.
• Bayes’ theorem is stated mathematically as the following equation:

P(A | B) = P(B | A) · P(A) / P(B)

where P(A | B) is the posterior probability of A given B, P(B | A) is the likelihood,
P(A) is the prior probability, and P(B) is the evidence.
How the Naive Bayes algorithm works:


• We have a training data set of weather conditions and the corresponding target
variable ‘Play’ (indicating whether the game is played).
• We need to classify whether players will play or not based on the weather
conditions.
Steps:
1. Convert the data set into a frequency table.
2. Create a Likelihood table by finding the probabilities like Overcast probability = 0.29 and
probability of playing is 0.64.
3. Now, use the Naive Bayesian equation to calculate the posterior probability for each class.
4. The class with the highest posterior probability is the outcome of the prediction.
Problem Statement:
Players will play if the weather is sunny. Is this statement correct?
P(Yes | Sunny) = P(Sunny | Yes) · P(Yes) / P(Sunny)
Here we have:
P(Sunny | Yes) = 3/9 = 0.33
P(Sunny) = 5/14 = 0.36
P(Yes) = 9/14 = 0.64
Now, P(Yes | Sunny) = 0.33 · 0.64 / 0.36 = 0.60 (in exact fractions, (3/9 · 9/14) / (5/14) = 3/5 = 0.60), the higher probability.

P(No | Sunny) = P(Sunny | No) · P(No) / P(Sunny)

Here we have:
P(Sunny | No) = 2/5 = 0.40
P(Sunny) = 5/14 = 0.36
P(No) = 5/14 = 0.36
Now, P(No | Sunny) = 0.40 · 0.36 / 0.36 = 0.40, the lower probability.
Since P(Yes | Sunny) > P(No | Sunny), the statement "Players will play if the weather is sunny" is correct.
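The same worked example as code; the frequency counts (9 "yes" days, 5 "no" days, 3 sunny-yes, 2 sunny-no out of 14) are those assumed by the calculation above:

n_total = 14
n_yes, n_no = 9, 5
n_sunny_yes, n_sunny_no = 3, 2

p_yes = n_yes / n_total                          # P(Yes) = 0.64
p_no = n_no / n_total                            # P(No) = 0.36
p_sunny = (n_sunny_yes + n_sunny_no) / n_total   # P(Sunny) = 0.36

# Bayes' theorem for each class, then pick the larger posterior.
p_yes_given_sunny = (n_sunny_yes / n_yes) * p_yes / p_sunny
p_no_given_sunny = (n_sunny_no / n_no) * p_no / p_sunny

print(f"P(Yes | Sunny) = {p_yes_given_sunny:.2f}")  # 0.60
print(f"P(No | Sunny)  = {p_no_given_sunny:.2f}")   # 0.40
print("prediction:", "Play" if p_yes_given_sunny > p_no_given_sunny else "Don't play")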

Support Vector Machine (SVM) Algorithm


➢ SVM is a supervised machine learning algorithm used to find the hyperplane that best
separates two classes.
➢ The support vector machine is based on statistical learning approaches.
➢ An infinite number of hyperplanes can separate the two classes perfectly; SVM
chooses the one with the maximum margin, i.e., the maximum distance between the
hyperplane and the two classes.
• Support Vectors: These are the points that are closest to the hyperplane. A
separating line will be defined with the help of these data points.
• Margin: the distance between the hyperplane and the observations closest to
the hyperplane (the support vectors).

Step 1: SVM algorithm predicts the classes. One of the classes is identified as 1 while the other
is identified as -1.
Step 2: Optimization problems aim at maximizing or minimizing an objective while
tweaking the unknowns; in the case of the SVM classifier, a loss function known as the
hinge loss, L(y, f(x)) = max(0, 1 - y·f(x)), is minimized to find the maximum margin.
Step 3: This loss function can also be viewed as a cost function whose cost is 0 when
no class is incorrectly predicted. To balance margin maximization against
misclassification, a regularization parameter is added.

Step 4: As with most optimization problems, the weights are optimized by computing
gradients using partial derivatives.
Step 5: When a point is classified correctly, the gradients are updated using only the
regularization parameter; when misclassification happens, the loss function also
contributes to the update.

Hard Margin
Hard Margin refers to that kind of decision boundary that makes sure that all the data points
are classified correctly. While this leads to the SVM classifier not causing any error, it can also
cause the margins to shrink thus making the whole purpose of running an SVM algorithm futile.
Soft Margin
To allow some misclassification, a regularization parameter is added to the loss function
in the SVM classification algorithm. This combination of the loss function with the
regularization parameter allows the user to maximize the margins at the cost of some
misclassification, as sketched below.
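A from-scratch sketch of the steps above: a linear soft-margin SVM trained by sub-gradient descent on the hinge loss with L2 regularization (the data and hyperparameters are illustrative choices of ours):

import numpy as np

rng = np.random.default_rng(0)
# Two separable blobs; classes are labeled +1 and -1 as in Step 1.
X = np.vstack([rng.normal([2, 2], 0.8, (50, 2)),
               rng.normal([-2, -2], 0.8, (50, 2))])
y = np.hstack([np.ones(50), -np.ones(50)])

w, b = np.zeros(2), 0.0
lam, lr = 0.01, 0.1        # regularization strength and learning rate

for epoch in range(100):
    for xi, yi in zip(X, y):
        if yi * (w @ xi + b) >= 1:
            # Correctly classified outside the margin: regularizer only (Step 5).
            w -= lr * (2 * lam * w)
        else:
            # Inside the margin or misclassified: hinge-loss term contributes too.
            w -= lr * (2 * lam * w - yi * xi)
            b += lr * yi

pred = np.sign(X @ w + b)
print("training accuracy:", (pred == y).mean())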

Decision Tree
Decision Tree is a supervised (labeled data) machine learning algorithm that can be used
for both classification and regression problems.
A decision tree is a tree-like structure that is used as a model for classifying data.

Step by Step Procedure

• Step 1: Determine the root of the tree.
• Step 2: Calculate the entropy of the classes: Entropy(S) = -Σᵢ pᵢ log₂ pᵢ, where pᵢ is
the proportion of class i in the set S.
• Step 3: Calculate the entropy after the split for each attribute.
• Step 4: Calculate the information gain for each split:
Gain(S, A) = Entropy(S) - Σᵥ (|Sᵥ| / |S|) · Entropy(Sᵥ), where Sᵥ is the subset of S
taking value v on attribute A. The attribute with the highest gain is chosen for the split.
• Step 5: Perform the split.
• Step 6: Perform further splits.
• Step 7: Complete the decision tree.
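A small sketch of Steps 2-4, computing the entropy of the classes and the information gain of one candidate attribute; the counts reuse the 14-day weather/play data assumed in the Naive Bayes example:

import numpy as np

def entropy(counts):
    """Entropy(S) = -sum(p_i * log2(p_i)) over the non-empty classes."""
    p = np.array(counts, dtype=float)
    p = p[p > 0] / p.sum()
    return float(-(p * np.log2(p)).sum())

# Whole data set: 9 "play" days vs 5 "don't play" days (Step 2).
H_S = entropy([9, 5])

# Split on Outlook: sunny -> (3 yes, 2 no), overcast -> (4, 0), rain -> (2, 3).
subsets = [(5, [3, 2]), (4, [4, 0]), (5, [2, 3])]
H_after = sum(n / 14 * entropy(c) for n, c in subsets)   # Step 3

print(f"Entropy(S)       = {H_S:.3f}")            # about 0.940
print(f"Gain(S, Outlook) = {H_S - H_after:.3f}")  # about 0.247 (Step 4)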

Refer to the problem.
Random forest
Random forest, a popular machine learning algorithm developed by Leo Breiman and
Adele Cutler, merges the outputs of numerous decision trees to produce a single
outcome.
One of the most important features of the random forest algorithm is that it can handle
data sets containing continuous variables (as in regression) as well as categorical
variables (as in classification).
The random forest classifier works on the bagging principle.
Bagging, also known as Bootstrap Aggregation, serves as the ensemble technique in the
Random Forest algorithm. Here are the steps involved in Bagging:
1. Selection of Subset: Bagging starts by choosing a random sample, or subset, from
the entire dataset.
2. Bootstrap Sampling: Each model is then created from these samples, called
Bootstrap Samples, which are taken from the original data with replacement. This
process is known as row sampling.
3. Bootstrapping: The step of row sampling with replacement is referred to as
bootstrapping.
4. Independent Model Training: Each model is trained independently on its
corresponding Bootstrap Sample. This training process generates results for each
model.
5. Majority Voting: The final output is determined by combining the results of all
models through majority voting. The most commonly predicted outcome among
the models is selected.
6. Aggregation: This step, which involves combining all the results and generating
the final output based on majority voting, is known as aggregation.
Steps Involved in Random Forest Algorithm
• Step 1: In the Random forest model, a subset of data points and a subset of
features is selected for constructing each decision tree. Simply put, n random
records and m features are taken from the data set having k number of records.
• Step 2: Individual decision trees are constructed for each sample.
• Step 3: Each decision tree will generate an output.
• Step 4: Final output is considered based on Majority Voting or Averaging for
Classification and regression, respectively.
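A minimal random forest sketch, assuming scikit-learn (the data set and hyperparameters are illustrative choices, not from the notes):

from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# 100 trees, each grown on a bootstrap sample (bagging) with a random
# subset of features; the final class is chosen by majority vote.
forest = RandomForestClassifier(n_estimators=100, max_features="sqrt", random_state=0)
forest.fit(X_train, y_train)
print("test accuracy:", forest.score(X_test, y_test))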
