
Unit II

Supervised Learning - I

Dr. Y. Krishna Bhargavi


Associate Professor
Department of CSE
GRIET
Supervised Learning
• Classification and Regression algorithms are Supervised Learning algorithms.
• Both types of algorithms are used for prediction in machine learning and operate on labelled
datasets.
• Regression algorithms are used to predict continuous values such as price, income, age, etc.
• Classification algorithms are used to predict or classify discrete values such as True or False, Male
or Female, Spam or Not Spam, etc.
• Types of Regression Algorithms:
• Simple Linear Regression
• Multiple Linear Regression
• Types of Classification Algorithms:
• Decision Trees
• Naïve Bayes
• K Nearest Neighbors
• Logistic Regression
• Multinomial Logistic Regression
• SVM
Regression Models

Dr. Y. Krishna Bhargavi


Associate Professor
Department of CSE
GRIET
Regression
• Regression in machine learning consists of mathematical methods that allow data scientists to
predict a continuous outcome (y) based on the value of one or more predictor variables (X).
• Regression analysis is a form of predictive modeling technique which investigates the relationship
between a dependent and independent variable.
• The aim of Regression is to find a function of X to predict y.
y=f(X)
• X: independent variable, also known as the predictor variable, regressor, explanatory
variable, or input variable
• y: dependent variable, also known as the response variable, regressand, predicted
variable, or output variable

• Types of Regression Models:


• Simple Linear Regression: one dependent variable and one independent variable
• Multiple Linear Regression: one dependent variable and multiple independent variables
Regression
• Using regression, a function is fit to the available data and used to predict the outcome for future
or hold-out data points.
• Fitting regression functions helps in interpolation and extrapolation.
• Interpolation - estimate missing data within the data range
• Extrapolation - estimate future data out of the data range

(Figures: interpolation within the data range; extrapolation beyond the data range)
Simple Linear Regression - Least Square Regression Line
• Least Squares method is used to determine the best-fitting line for the given data by reducing the
sum of the squares of the vertical deviations from each data point to the line.
• If a point lies exactly on the fitted line, its vertical deviation is 0.
• Linear regression determines the straight line, known as the least-squares regression line or LSRL.
• Suppose y is a dependent variable and X is an independent variable; then the population
regression line is given by the equation:

y = β0 + β1X

• When a random sample of observations is given, the fitted regression line is expressed as:

ŷ = B0 + B1X

• Where B0 is a constant (the intercept)
• B1 is the regression coefficient (the slope)
• X is the independent variable
• ŷ is known as the predicted value of the dependent variable.
Simple Linear Regression - Implementation
• Obtain the input dataset and identify the independent and dependent features.
• Fit the regression line of the form:

ŷ = B0 + B1X

• Compute B0 and B1:

B1 = Σ(Xi − X̄)(yi − ȳ) / Σ(Xi − X̄)²,   B0 = ȳ − B1X̄

• Using the B0 and B1 values, compute ŷ for each data point.
• Calculate the evaluation metrics to find whether the line obtained is best fit or not.
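
A minimal sketch of these steps in Python (the small dataset and variable names are made up for illustration):

import numpy as np

# Illustrative data (hypothetical): hours studied (X) vs. exam score (y)
X = np.array([1, 2, 3, 4, 5], dtype=float)
y = np.array([52, 58, 65, 70, 78], dtype=float)

# Compute B1 and B0 from the least-squares formulas
x_mean, y_mean = X.mean(), y.mean()
B1 = np.sum((X - x_mean) * (y - y_mean)) / np.sum((X - x_mean) ** 2)
B0 = y_mean - B1 * x_mean

# Predicted values on the fitted line
y_hat = B0 + B1 * X
print(f"B0 = {B0:.3f}, B1 = {B1:.3f}")
print("Predictions:", np.round(y_hat, 2))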
Regression - Performance Metrics
• The performance of a Regression model is reported as errors in the prediction.
• Following are the popular metrics that are used to evaluate the performance of Regression models.
• Mean Absolute Error: MAE is one of the simplest metrics; it measures the absolute difference between
actual and predicted values, where absolute means the difference is taken as positive.

MAE = (1/N) Σ |y − y'|

• y is the actual outcome, y' is the predicted outcome, and N is the total number of data points.
• Mean Squared Error: MSE measures the average of the squared differences between the predicted values and the actual
values given by the model.

MSE = (1/N) Σ (y − y')²
• R² Score: R², also known as the Coefficient of Determination, is another popular metric used for
Regression model evaluation.
• The R-squared metric enables us to compare the model with a constant baseline to determine the performance of the
model.
• To select the constant baseline, we take the mean of the data and draw the line at the mean:

R² = 1 − (MSE of the model / MSE of the baseline) = 1 − Σ(y − y')² / Σ(y − ȳ)²
Regression - Performance Metrics

• Adjusted R2:
• Adjusted R2, as the name suggests, is the improved version of R2 error.
• R² has the limitation that its score improves as more terms are added, even when the model is not actually
improving, which can mislead data scientists.
• To overcome this issue, adjusted R² is used, which is always lower than or equal to R². It adjusts for the
number of predictors and increases only when there is a real improvement.
• We can calculate the adjusted R² as follows:

Ra² = 1 − [(1 − R²)(n − 1) / (n − k − 1)]

• n is the number of observations
• k denotes the number of independent variables
• Ra² denotes the adjusted R²
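
A small sketch of the four metrics above in Python; the helper name regression_metrics and the example values are illustrative, and y_hat is assumed to come from a fitted model such as the one in the earlier example:

import numpy as np

def regression_metrics(y, y_hat, k):
    """Compute MAE, MSE, R2 and adjusted R2 for a model with k independent variables."""
    n = len(y)
    mae = np.mean(np.abs(y - y_hat))
    mse = np.mean((y - y_hat) ** 2)
    ss_res = np.sum((y - y_hat) ** 2)       # residual sum of squares
    ss_tot = np.sum((y - np.mean(y)) ** 2)  # total sum of squares (constant-mean baseline)
    r2 = 1 - ss_res / ss_tot
    adj_r2 = 1 - (1 - r2) * (n - 1) / (n - k - 1)
    return mae, mse, r2, adj_r2

y = np.array([52, 58, 65, 70, 78], dtype=float)
y_hat = np.array([51.8, 58.2, 64.6, 71.0, 77.4])
print(regression_metrics(y, y_hat, k=1))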
Multiple Linear Regression
• Multiple regression, also known as multiple linear regression (MLR), is a statistical technique that
uses two or more explanatory variables to predict the outcome of a response variable.
• In other words, it can explain the relationship between multiple independent variables against one
dependent variable.
• Multiple linear regression formula
• Here’s the formula for multiple linear regression, which produces a more specific calculation:
y = β0 + β1 X1 + β2 X2 + ... + βp Xp
• The variables in this equation are:
• y is the predicted or expected value of the dependent variable.
• X1, X2, ..., Xp are the independent or predictor variables.
• β0 is the value of y when all the independent variables are equal to zero.
• β1 , β2, βp are the estimated regression coefficients.
• Each regression coefficient represents the change in y relative to a one-unit change in the
respective independent variable.
Multiple Linear Regression - Implementation
• Obtain the input dataset and identify independent and dependent features.
• Fit the regression line of the form (shown here for two predictors):

ŷ = B0 + B1X1 + B2X2

• Calculate the means X̄1, X̄2, ȳ and the regression sums:

Σx1² = ΣX1² − (ΣX1)²/n,   Σx2² = ΣX2² − (ΣX2)²/n,   Σx1x2 = ΣX1X2 − (ΣX1)(ΣX2)/n,
Σx1y = ΣX1y − (ΣX1)(Σy)/n,   Σx2y = ΣX2y − (ΣX2)(Σy)/n

• Compute the coefficients from the regression sums:

B1 = (Σx2² · Σx1y − Σx1x2 · Σx2y) / (Σx1² · Σx2² − (Σx1x2)²)
B2 = (Σx1² · Σx2y − Σx1x2 · Σx1y) / (Σx1² · Σx2² − (Σx1x2)²)
B0 = ȳ − B1X̄1 − B2X̄2
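
A sketch of multiple linear regression with two illustrative predictors, solving the least-squares problem directly with NumPy rather than with the hand-calculation sums above; the data are hypothetical:

import numpy as np

# Hypothetical data: two predictors X1, X2 and one response y
X1 = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
X2 = np.array([2.0, 1.0, 4.0, 3.0, 5.0])
y  = np.array([6.1, 7.9, 12.2, 13.8, 18.0])

# Design matrix with a column of ones for the intercept B0
X = np.column_stack([np.ones_like(X1), X1, X2])

# Solve the least-squares problem X @ beta ~= y
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
B0, B1, B2 = beta
print(f"y_hat = {B0:.3f} + {B1:.3f}*X1 + {B2:.3f}*X2")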
Gradient Descent
• The basic principle of gradient descent is to choose the step size (also called the learning rate)
appropriately so that we can get close to the exact solution.
• Gradient descent stops when the step size becomes very close to zero, or when the maximum number of steps we
want to perform has been reached.
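
A minimal sketch of gradient descent applied to the MSE cost of simple linear regression, illustrating the role of the learning rate and the two stopping conditions described above; the data, learning rate and tolerance are illustrative:

import numpy as np

X = np.array([1, 2, 3, 4, 5], dtype=float)
y = np.array([52, 58, 65, 70, 78], dtype=float)

B0, B1 = 0.0, 0.0      # initial parameters
lr = 0.01              # learning rate (step size)
max_steps = 10000      # maximum number of steps
tol = 1e-8             # stop when the update becomes very close to zero

for step in range(max_steps):
    y_hat = B0 + B1 * X
    # Gradients of the MSE cost with respect to B0 and B1
    grad_B0 = -2 * np.mean(y - y_hat)
    grad_B1 = -2 * np.mean((y - y_hat) * X)
    step_B0, step_B1 = lr * grad_B0, lr * grad_B1
    B0, B1 = B0 - step_B0, B1 - step_B1
    if max(abs(step_B0), abs(step_B1)) < tol:   # step size close to zero
        break

print(f"B0 = {B0:.3f}, B1 = {B1:.3f} after {step + 1} steps")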
Classification

Dr. Y. Krishna Bhargavi


Associate Professor
Department of CSE
GRIET
Classification
• Classification is the process of predicting the class of given data points. Classes are sometimes
called targets, labels or categories.
• Classification predictive modeling is the task of approximating a mapping function (f) from input
variables (X) to discrete output variables (y).
• There are two types of learners in classification: lazy learners and eager learners.
• Lazy learners store the training data and wait until testing data appears. When it does,
classification is conducted based on the most related stored training data. Compared to eager
learners, lazy learners spend less training time but more time in predicting.
• Eager learners construct a classification model based on the given training data before receiving
data for classification. The model must commit to a single hypothesis that covers the entire
instance space. Because of this, eager learners take a long time for training and less time for
predicting.
Decision Trees – ID3, CART

Dr. Y. Krishna Bhargavi


Associate Professor
Department of CSE
GRIET
Decision Trees
• A decision tree is a non-parametric supervised learning algorithm for classification and regression
tasks.
• It breaks down a dataset into smaller and smaller subsets while at the same time, an associated
decision tree is incrementally developed.
• It has a hierarchical tree structure consisting of a root node, branches, internal nodes, and leaf
nodes.
• The final result is a tree with decision nodes and leaf nodes.
• Decision Trees are the foundation for many classical machine learning algorithms like Random
Forests, Bagging, and Boosting.
• Types of Decision Trees:
• CART (Classification and Regression Trees) → uses the Gini Index (classification) as its metric.
• ID3 (Iterative Dichotomiser 3) → uses the Entropy function and Information Gain as its metrics.
Decision Trees
• Decision Tree Terminologies:
• Root Node
• Decision Nodes
• Leaf Nodes
• Sub-Tree
• Pruning
• Parent and Child node
Decision Tree – ID3
• ID3 stands for Iterative Dichotomiser 3 and is named such because the algorithm iteratively
(repeatedly) dichotomizes(divides) features into two or more groups at each step.
• The ID3 algorithm builds decision trees using a top-down greedy search approach through the
space of possible branches with no backtracking.
• A greedy algorithm, as the name suggests, always makes the choice that seems to be the best at
that moment.
• Steps of ID3 Algorithm:
1. Calculate the Information Gain of each feature.
2. Considering that all rows don’t belong to the same class, split the dataset S into subsets
using the feature for which the Information Gain is maximum.
3. Make a decision node using the feature with the maximum Information gain.
4. If all rows belong to the same class, make the current node as a leaf node with the class as
its label.
5. Repeat for the remaining features until we run out of all features, or the decision tree has all
leaf nodes.
Decision Tree – ID3
• Entropy is a measure of disorder in a dataset.
• A dataset with high entropy is a dataset where the data points are evenly distributed across the
different categories.
• A dataset with low entropy is a dataset where the data points are concentrated in one or a few
categories.

Entropy(S) = − Σ pi log2(pi), where pi is the proportion of data points belonging to class i
• Information Gain calculates the reduction in entropy and measures how well a given feature
separates or classifies the target classes:

IG(S, A) = Entropy(S) − Σv (|Sv| / |S|) · Entropy(Sv)

• The feature with the highest Information Gain is selected as the best one.
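
A small sketch of both quantities in Python; the helper names and the tiny weather-style dataset are made up for illustration:

import numpy as np
from collections import Counter

def entropy(labels):
    """Entropy of a list of class labels: -sum(p_i * log2(p_i))."""
    counts = np.array(list(Counter(labels).values()), dtype=float)
    p = counts / counts.sum()
    return -np.sum(p * np.log2(p))

def information_gain(feature_values, labels):
    """Reduction in entropy obtained by splitting on one feature."""
    total, n, remainder = entropy(labels), len(labels), 0.0
    for v in set(feature_values):
        subset = [l for f, l in zip(feature_values, labels) if f == v]
        remainder += len(subset) / n * entropy(subset)
    return total - remainder

# Hypothetical example: Outlook feature vs. Play label
outlook = ["Sunny", "Sunny", "Overcast", "Rain", "Rain", "Overcast"]
play    = ["No",    "No",    "Yes",      "Yes",  "No",   "Yes"]
print(f"Entropy(Play)      = {entropy(play):.3f}")
print(f"IG(Play | Outlook) = {information_gain(outlook, play):.3f}")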
Decision Tree – ID3
• Pros:
• Builds the tree quickly
• Builds short trees
• Uses the whole dataset to build the decision tree
• Generates multi-branch trees

• Cons:
• Handles only categorical data
• Cannot perform pruning
• Favors attributes with many values
• Does not handle imbalanced or missing data values
Decision Tree – CART
• CART is an alternative decision tree building algorithm.
• It can handle both classification and regression tasks.
• It generates binary trees. This algorithm uses a metric called the Gini index to create decision
points for classification tasks.
• The Gini index is a metric for classification tasks in CART. It is based on the sum of squared probabilities of each
class. We can formulate it as illustrated below.
Gini Index(Class) = 1 − Σ (Pi)² for i = 1 to number of classes
• Gini Index of a feature = weighted sum of the Gini Index of the classes in that feature:

Gini Index(feature) = Σv (|Sv| / |S|) · Gini Index(Sv)
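
The same calculation with the Gini index, as a small sketch; the helper names and data are illustrative (a full CART implementation would additionally restrict itself to binary partitions):

import numpy as np
from collections import Counter

def gini(labels):
    """Gini index of a set of class labels: 1 - sum(p_i^2)."""
    counts = np.array(list(Counter(labels).values()), dtype=float)
    p = counts / counts.sum()
    return 1.0 - np.sum(p ** 2)

def gini_for_feature(feature_values, labels):
    """Weighted sum of the Gini index of the subsets induced by a feature."""
    n, weighted = len(labels), 0.0
    for v in set(feature_values):
        subset = [l for f, l in zip(feature_values, labels) if f == v]
        weighted += len(subset) / n * gini(subset)
    return weighted

outlook = ["Sunny", "Sunny", "Overcast", "Rain", "Rain", "Overcast"]
play    = ["No",    "No",    "Yes",      "Yes",  "No",   "Yes"]
print(f"Gini(Play)           = {gini(play):.3f}")
print(f"Gini(Play | Outlook) = {gini_for_feature(outlook, play):.3f}")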
Decision Tree – CART
• Process:
1. Start: Begin with the full dataset.
2. Split: Find the best feature and split point to divide the data into two subsets.
3. Create Node: Make a node in the tree based on the chosen split, i.e., the split with the lowest Gini index.
4. Partition Data: Split the dataset into two subsets based on the chosen split.
• There are 2ⁿ − 2 possible ways to form two (ordered, non-empty) partitions of the data,
• where n is the number of categories in the attribute.
5. Recursive Splitting: Repeat steps 2-4 for each subset until stopping criteria are met.
6. Stopping Criteria: Stop growing the tree when certain conditions are met, e.g., when all instances belong
to the same class.
7. Pruning (optional): Trim the tree to avoid overfitting.
8. Output: Use the resulting tree for making predictions on new data.
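
As a usage sketch (assuming scikit-learn is available), a CART-style tree can be grown with DecisionTreeClassifier and the Gini criterion; the Iris dataset and the max_depth pre-pruning value are illustrative:

from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)

# Gini-based binary tree; max_depth acts as a simple pre-pruning rule
tree = DecisionTreeClassifier(criterion="gini", max_depth=3)
tree.fit(X_train, y_train)
print("Test accuracy:", tree.score(X_test, y_test))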
Decision Tree – CART
• Pros:
• Handles both categorical and numerical values
• Handles Outliers
• Generates only binary trees

• Cons:
• Produces unstable decision trees
• Has a preference towards multivalued attributes
Naïve Bayes Algorithm

Dr. Y. Krishna Bhargavi


Associate Professor
Department of CSE
GRIET
Naïve Bayes Algorithm
• Naïve Bayes algorithm is a supervised learning algorithm, which is based on Bayes theorem and used for
solving classification problems.
• Naïve Bayes Classifier is one of the simplest and most effective classification algorithms; it helps in
building fast machine learning models that can make quick predictions.
• Some popular applications of the Naïve Bayes algorithm are spam filtering, sentiment analysis, and
classifying articles.
• Bayes' theorem is also known as Bayes' Rule or Bayes' law, which is used to determine the probability of a
hypothesis with prior knowledge. It depends on the conditional probability.
• The formula for Bayes' theorem is given as:

P(A|B) = P(B|A) · P(A) / P(B)

P(A|B) is Posterior probability: Probability of hypothesis A on the observed event B.


P(B|A) is Likelihood probability: Probability of the evidence B given that hypothesis A is true.
P(A) is Prior Probability: Probability of hypothesis before observing the evidence.
P(B) is Marginal Probability: Probability of Evidence.
Naïve Bayes Algorithm
• Process:
1. Convert the given dataset into frequency tables.
2. Generate Likelihood table by finding the probabilities of given features.
3. Use Bayes theorem to calculate the posterior probability for a new instance.

4. Normalize the obtained Naïve Bayes probability values.


5. The value with highest normalized probability will be the result of classifier.
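
A small sketch of this process for a single categorical feature; the training data and the new instance are made up for illustration:

from collections import Counter

# Hypothetical training data: (Weather, Play)
data = [("Sunny", "No"), ("Sunny", "No"), ("Overcast", "Yes"),
        ("Rain", "Yes"), ("Rain", "No"), ("Overcast", "Yes"),
        ("Sunny", "Yes"), ("Rain", "Yes")]

weather = "Sunny"                        # new instance to classify
classes = {label for _, label in data}
n = len(data)

posteriors = {}
for c in classes:
    rows_c = [w for w, label in data if label == c]
    prior = len(rows_c) / n                            # P(class)
    likelihood = rows_c.count(weather) / len(rows_c)   # P(weather | class)
    posteriors[c] = prior * likelihood                 # proportional to P(class | weather)

# Normalize so the values sum to 1, then pick the class with the highest probability
total = sum(posteriors.values())
posteriors = {c: p / total for c, p in posteriors.items()}
print(posteriors, "->", max(posteriors, key=posteriors.get))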
Naïve Bayes Algorithm
• Pros:
• Naïve Bayes is one of the fastest and easiest ML algorithms for predicting the class of a dataset.
• It can be used for Binary as well as Multi-class Classifications.
• Efficient with Large Datasets

• Cons:
• Assumption of Independence
• Limited Expressiveness
• Requires Sufficient Training Data
• Cannot capture the correlation among features
KNN Algorithm

Dr. Y. Krishna Bhargavi


Associate Professor
Department of CSE
GRIET
KNN Algorithm
• The abbreviation KNN stands for “K-Nearest Neighbor”. It is a supervised machine learning algorithm.
• The algorithm can be used to solve both classification and regression problem statements.
• It works as a voting system: the majority class label among a new data point's 'k' (where k is an integer) nearest
neighbours in the feature space determines its class label.
• K-NN is a non-parametric algorithm, which means it does not make any assumption on underlying data.
• It is also called a lazy learner algorithm because it does not learn from the training set immediately;
instead, it stores the dataset and, at the time of classification, performs an action on the dataset.
• At the training phase the KNN algorithm just stores the dataset, and when it gets new data, it classifies that
data into the category that is most similar to the new data.
KNN Algorithm - Process
1. Select the number K of the neighbors
2. Choose an appropriate distance measure and compute the distances from the new data point to the training points.
3. Take the K nearest neighbors as per the calculated distance.
4. Among these k neighbors, count the number of the data points in each category.
5. Assign the new data point to the category for which the number of neighbours is maximum.
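
A minimal sketch of this voting procedure using Euclidean distance; the helper name knn_predict, the points and the value of k are illustrative:

import numpy as np
from collections import Counter

def knn_predict(X_train, y_train, x_new, k=3):
    """Classify x_new by majority vote among its k nearest neighbours."""
    distances = np.sqrt(np.sum((X_train - x_new) ** 2, axis=1))   # Euclidean distances
    nearest = np.argsort(distances)[:k]                           # indices of the k closest points
    votes = Counter(y_train[i] for i in nearest)
    return votes.most_common(1)[0][0]

X_train = np.array([[1, 1], [1, 2], [2, 2], [6, 6], [7, 7], [6, 5]], dtype=float)
y_train = np.array(["A", "A", "A", "B", "B", "B"])
print(knn_predict(X_train, y_train, np.array([2.0, 1.5]), k=3))   # expected: A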
Types of Distance Metrics
a. Euclidean distance: the square root of the sum of the squared differences between two points.

d(x, y) = √( Σ (xi − yi)² )

b. Manhattan distance: the sum of the absolute values of the differences between two points.

d(x, y) = Σ |xi − yi|
Types of Distance Metrics

c. Minkowski distance: a generalised distance measure between two points. Based on the formula below, it reduces
to the Manhattan distance (when p = 1) and the Euclidean distance (when p = 2).

d(x, y) = ( Σ |xi − yi|^p )^(1/p)

d. Hamming Distance: used for categorical variables. This metric counts the positions at which two categorical
vectors differ (i.e., whether the values are the same or not).
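
The four metrics as small NumPy functions (a sketch; p is the Minkowski order parameter mentioned above):

import numpy as np

def euclidean(a, b):
    return np.sqrt(np.sum((a - b) ** 2))

def manhattan(a, b):
    return np.sum(np.abs(a - b))

def minkowski(a, b, p):
    return np.sum(np.abs(a - b) ** p) ** (1.0 / p)

def hamming(a, b):
    return np.sum(a != b)   # number of positions where the categorical values differ

a, b = np.array([1.0, 2.0, 3.0]), np.array([4.0, 0.0, 3.0])
print(euclidean(a, b), manhattan(a, b), minkowski(a, b, p=3))
print(hamming(np.array(["red", "small"]), np.array(["red", "large"])))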
K-NN Algorithm
• Pros:
• k-NN is easy to understand and implement.
• No Training Phase is needed.
• Can handle categorical and numerical data.
• It has versatile distance measures.
• Incremental learning

• Cons:
• Slow and inefficient for large datasets, especially as the size of the training set grows.
• Choosing right value of ‘k’
• Imbalanced Data
Logistic Regression

Dr. Y. Krishna Bhargavi


Associate Professor
Department of CSE
GRIET
Logistic Regression
• It is used for predicting the categorical dependent variable using a given set of independent variables.
• Instead of giving a categorical or discrete value, it gives the probabilistic values that lie between 0 and 1.
• In Logistic regression, instead of fitting a regression line, we fit an "S" shaped logistic function, which
predicts two maximum values (0 or 1).
Logistic Regression
• The sigmoid function is a mathematical function used to map the predicted values to probabilities.
• It maps any real value into another value within a range of 0 and 1.
• The value of the logistic regression must be between 0 and 1 and cannot go beyond this limit, so it forms
a curve like the "S" form.
• The S-form curve is called the Sigmoid function or the logistic function:

σ(z) = 1 / (1 + e^(−z))

Assumptions:
• The dependent variable must be categorical in nature.
• The independent variable should not have multi-collinearity.

Cost Function:
• The cost function for logistic regression is called log loss; it is derived from the maximum likelihood
estimation method:

J = −(1/N) Σ [ y·log(ŷ) + (1 − y)·log(1 − ŷ) ]
Logistic Regression
Process using Gradient Descent:
1. Initialize parameters - Intercept and coefficient(s)
2. Substitute the values into the sigmoid function and obtain the predictions:

ŷ = σ(B0 + B1X)

3. Compute the cost function (log loss):

J = −(1/N) Σ [ y·log(ŷ) + (1 − y)·log(1 − ŷ) ]

4. Compute the gradients:

∂J/∂B0 = (1/N) Σ (ŷ − y),   ∂J/∂B1 = (1/N) Σ (ŷ − y)·X

5. Update the intercept and coefficients using the partial derivatives:

B0 = B0 − α·∂J/∂B0,   B1 = B1 − α·∂J/∂B1

6. Repeat steps 2 through 5 until optimal values of the parameters are obtained.
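
A minimal sketch of this loop for a single feature, using the sigmoid and log-loss definitions from the previous slides; the data, learning rate and number of iterations are illustrative:

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Hypothetical data: one feature, binary labels
X = np.array([0.5, 1.0, 1.5, 3.0, 3.5, 4.0])
y = np.array([0,   0,   0,   1,   1,   1  ])

b0, b1 = 0.0, 0.0          # step 1: initialize intercept and coefficient
lr = 0.1

for _ in range(5000):
    y_hat = sigmoid(b0 + b1 * X)                                       # step 2: predictions
    loss = -np.mean(y * np.log(y_hat) + (1 - y) * np.log(1 - y_hat))   # step 3: log loss
    grad_b0 = np.mean(y_hat - y)                                       # step 4: gradients
    grad_b1 = np.mean((y_hat - y) * X)
    b0 -= lr * grad_b0                                                 # step 5: update
    b1 -= lr * grad_b1

print(f"b0 = {b0:.3f}, b1 = {b1:.3f}, final loss = {loss:.4f}")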
Types of Logistic Regression
• Binomial: the dependent variable has only two possible categories (e.g., 0 or 1, Spam or Not Spam).
• Multinomial: the dependent variable has three or more unordered categories.
• Ordinal: the dependent variable has three or more ordered categories.
Multinomial Logistic Regression

Assumptions:
• The Dependent variable should be either nominal or ordinal variable.
• Set of one or more Independent variables can be continuous, ordinal or nominal.
• The observations must be independent, and the categories of the dependent variable must be mutually exclusive and exhaustive.
• No Multicollinearity between Independent variables.
• There should be no Outliers in the data points.
Multinomial Logistic Regression
Process using Gradient Descent:
1. Initialize parameters - Intercept and coefficient(s)
2. Compute Logits: Calculate the logits for all samples in the batch for each class using the current model parameters (weights
and biases).

z_k = W_k · X + b_k   (one logit per class k)

3. Compute Softmax Probabilities: Apply the softmax function to the logits to obtain the predicted probabilities for each class
for all samples in the batch.

P(class k) = e^(z_k) / Σ_j e^(z_j)

4. Compute Loss: Calculate the cross-entropy loss between the predicted probabilities and the true class labels for all samples in
the batch.

Loss = −(1/N) Σ_samples Σ_k y_k · log(P(class k))

5. Compute Gradients: Compute the gradients of the loss function with respect to the model parameters (weights and biases)
using the entire batch of samples. This involves taking the derivative of the loss function with respect to each parameter.

∂Loss/∂W_k = (1/N) Σ_samples (P(class k) − y_k) · X,   ∂Loss/∂b_k = (1/N) Σ_samples (P(class k) − y_k)

6. Update Parameters: Update the model parameters (weights and biases) using the gradients and the chosen optimization
algorithm (e.g., gradient descent, SGD, Adam).
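
A small sketch of the batch update above: logits, softmax probabilities, cross-entropy loss and one gradient step per iteration; the shapes, data and learning rate are illustrative:

import numpy as np

rng = np.random.default_rng(0)

# Hypothetical batch: 6 samples, 2 features, 3 classes
X = rng.normal(size=(6, 2))
y = np.array([0, 1, 2, 0, 1, 2])       # true class labels
Y = np.eye(3)[y]                       # one-hot encoding

W = np.zeros((2, 3))                   # weights (one column per class)
b = np.zeros(3)                        # biases
lr = 0.5

for _ in range(200):
    logits = X @ W + b                                        # step 2: logits
    exp = np.exp(logits - logits.max(axis=1, keepdims=True))  # numerically stable softmax
    probs = exp / exp.sum(axis=1, keepdims=True)              # step 3: probabilities
    loss = -np.mean(np.sum(Y * np.log(probs), axis=1))        # step 4: cross-entropy
    grad_W = X.T @ (probs - Y) / len(X)                       # step 5: gradients
    grad_b = np.mean(probs - Y, axis=0)
    W -= lr * grad_W                                          # step 6: update
    b -= lr * grad_b

print(f"final loss = {loss:.4f}")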
Support Vector Machine
• Step 1: SVM algorithm predicts the classes. One of the classes is identified as 1 while the other is identified as -1.
• Step 2: SVM classifier uses a loss function known as the hinge loss function to find the maximum margin.
• Step 3: There is a trade-off between maximizing margin and the loss generated if the margin is maximized to a very
large extent. A regularization parameter is used to handle this.
• Step 4: weights are optimized by calculating the gradients.
• Step 5: When there is no classification error, the gradients are computed from the regularization term alone; when a
misclassification happens, the loss function term also contributes to the gradients.
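
A minimal sketch of the update rule in steps 4-5, using the hinge loss with an L2 regularization term; labels are coded as +1/-1 and the data, learning rate and regularization parameter are illustrative:

import numpy as np

# Hypothetical linearly separable data; labels are +1 / -1
X = np.array([[2, 3], [3, 3], [1, 1], [6, 6], [7, 8], [8, 6]], dtype=float)
y = np.array([-1, -1, -1, 1, 1, 1])

w = np.zeros(2)
b = 0.0
lr, lam = 0.01, 0.01        # learning rate and regularization parameter

for _ in range(1000):
    for xi, yi in zip(X, y):
        margin = yi * (np.dot(w, xi) + b)
        if margin >= 1:
            # correctly classified outside the margin: only the regularization term
            w -= lr * (2 * lam * w)
        else:
            # misclassified or inside the margin: the hinge loss term also contributes
            w -= lr * (2 * lam * w - yi * xi)
            b -= lr * (-yi)

print("w =", np.round(w, 3), "b =", round(b, 3))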
Support Vector Machine

Dr. Y. Krishna Bhargavi


Associate Professor
Department of CSE
GRIET
Support Vector Machine
• Support Vector Machines use the concept of margins to make predictions.
• The goal of the SVM algorithm is to create a hyperplane in an N-dimensional space that divides the data
points belonging to different classes.
• The hyperplane that provides the maximum margin between the two classes is chosen.
• These margins are calculated using data points known as Support Vectors.
• Support Vectors are those data points that are nearest to the hyperplane and help in orienting it.
Support Vector Machine
Types of Margins:
Hard Margin: Hard Margin refers to that kind of decision boundary that makes sure that all the data points
are classified correctly.
While this leads to the SVM classifier not causing any error, it can also cause the margins to
shrink thus making the whole purpose of running an SVM algorithm futile.
Soft Margin: A regularization parameter is also added to the loss function in the SVM classification
algorithm.
This combination of the loss function with the regularization parameter allows the user to
maximize the margins at the cost of misclassification.
However, this misclassification needs to be kept in check, which introduces another
hyperparameter that needs to be tuned.
Support Vector Machine
• The data points or vectors that are closest to the hyperplane and that affect its position are termed Support
Vectors. Since these vectors support the hyperplane, they are called support vectors.
Types of SVM:
• Linear SVM: Linear SVM is used for linearly separable data: if a dataset can be classified into
two classes by using a single straight line, then such data is termed linearly separable data, and the classifier
used is called a Linear SVM classifier.
• Non-linear SVM: Non-linear SVM is used for non-linearly separable data: if a dataset cannot
be classified by using a straight line, then such data is termed non-linear data, and the classifier used is called
a Non-linear SVM classifier.
Support Vector Machine
Linear SVM
• To compute the decision boundary, the Lagrangian function is used to obtain the w and b values:

L(w, b, α) = ½ ||w||² − Σ αi [ yi (w · xi + b) − 1 ]

Steps:
• Compute the partial derivatives of L with respect to w and b and set them to zero:

∂L/∂w = 0  ⇒  w = Σ αi yi xi,      ∂L/∂b = 0  ⇒  Σ αi yi = 0

• Solve the dual maximization problem:

max over α of  Σ αi − ½ Σi Σj αi αj yi yj (xi · xj),   subject to αi ≥ 0 and Σ αi yi = 0

• Compute the partial derivatives of the dual problem Lagrangian to obtain the αi values, and from them w.
• Compute the margin:

margin = 2 / ||w||

• Compute the bias using any of the support vectors:

b = yk − w · xk   for a support vector (xk, yk)
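
In practice the dual problem is solved numerically by a library optimizer; a short usage sketch with scikit-learn's SVC (assuming scikit-learn is available), which exposes the learned w, b and the support vectors. The data and the large C value (to approximate a hard margin) are illustrative:

import numpy as np
from sklearn.svm import SVC

X = np.array([[2, 3], [3, 3], [1, 1], [6, 6], [7, 8], [8, 6]], dtype=float)
y = np.array([-1, -1, -1, 1, 1, 1])

clf = SVC(kernel="linear", C=1e6)   # very large C approximates a hard margin
clf.fit(X, y)

w = clf.coef_[0]
b = clf.intercept_[0]
print("w =", w, "b =", b)
print("support vectors:", clf.support_vectors_)
print("margin width =", 2 / np.linalg.norm(w))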


Non Linear SVM
• Nonlinear SVM (Support Vector Machine) is necessary when the data cannot be effectively separated by a
linear decision boundary in the original feature space.
• Nonlinear SVM addresses this limitation by utilizing kernel functions to map the data into a
higher-dimensional space where linear separation becomes possible.
• Kernel Function transforms the training set of data so that a non-linear decision surface is able to transform
to a linear equation in a higher number of dimension spaces.
Support Vector Machine
Non Linear SVM

• ϕ(X) is the transformed/mapped feature space.


• Issues in feature transformation:
• It is difficult to specify the explicit form of ϕ(X).
• If the mapped feature space is a quadratic or higher-degree polynomial space, then computing the dot
product in that space is time consuming and computationally expensive.
• To solve this, the Kernel trick is used. Instead of computing ϕ(Xi) · ϕ(Xj) explicitly, the kernel function gives the
value of the similarity directly.
• For example, if the kernel is to be used as a quadratic polynomial of degree 2, then

K(Xi, Xj) = (Xi · Xj)²

• Then the feature transformation with the kernel, for a 2-dimensional input X = (x1, x2), results in:

ϕ(X) = (x1², √2·x1·x2, x2²),  so that  ϕ(Xi) · ϕ(Xj) = (Xi · Xj)²
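
A quick numeric check of the kernel trick for the degree-2 example above: evaluating K(Xi, Xj) = (Xi · Xj)² directly gives the same value as explicitly mapping both points with ϕ and taking the dot product in the higher-dimensional space (the two points are made up):

import numpy as np

def phi(v):
    """Explicit degree-2 feature map for a 2-D input: (x1^2, sqrt(2)*x1*x2, x2^2)."""
    x1, x2 = v
    return np.array([x1 ** 2, np.sqrt(2) * x1 * x2, x2 ** 2])

def quad_kernel(x, z):
    """Degree-2 polynomial kernel (x . z)^2, computed without any explicit mapping."""
    return np.dot(x, z) ** 2

x = np.array([1.0, 2.0])
z = np.array([3.0, 4.0])
print(quad_kernel(x, z))         # 121.0
print(np.dot(phi(x), phi(z)))    # 121.0 as well: same similarity, no explicit mapping needed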


Non Linear SVM – Types of Kernels

Linear Kernel: K(x, y) = x · y

Polynomial Kernel: K(x, y) = (x · y + c)^d

Gaussian Kernel: K(x, y) = exp(−||x − y||² / (2σ²))

Exponential Kernel: K(x, y) = exp(−||x − y|| / (2σ²))

Laplacian Kernel: K(x, y) = exp(−||x − y|| / σ)

Hyperbolic (Sigmoid) Kernel: K(x, y) = tanh(κ·(x · y) + c)
Support Vector Machine
Advantages of SVM:
• Effective in high dimensional cases
• It is memory efficient, as it uses a subset of the training points in the decision function, called support vectors
• Different kernel functions can be specified for the decision function, and it is possible to specify custom
kernels
THANK YOU
