ML - Introduction - Linear Regression - Regularization

The document provides an introduction to machine learning (ML), defining it as a process where a computer program improves its performance on a task through experience. It outlines various types of ML, including supervised, unsupervised, and reinforcement learning, along with key concepts such as regression, classification, and decision-making. Additionally, it highlights the importance of metrics for evaluating model performance in both regression and classification tasks.

Uploaded by

raj.prakhar26

Introduction to Machine Learning

Dr. Saketh Athkuri

What we have seen so far
Statistics
• Mean
• Median
• Mode
• IQR
• CI
• Hypothesis testing
• 𝑡-test
• 𝑧-test
• 𝜒²-test
• 𝐹-test

10-09-2024 10:35 AM 2
ML definition
A computer program is said to learn from Experience E with respect to
task T and performance measure P, if its performance at task T as
measured by P improves with experience E.

Example: Alan Turing, Loan approval example

1. Mitchell T M (1997), Machine Learning. McGraw-Hill, New York

2. https://www.wordstream.com/blog/ws/2017/07/28/machine-learning-applications

Artificial Intelligence

Enigma code

Data RULES Output

Machine Learning

Features
Black box Rules
Output

Deep Learning

Raw data
Black box Rules
Output

AI and its fields

Artificial Intelligence

Machine Learning

Deep Learning

ML overview

Machine Learning

Supervised
• Regression: Linear regression, Forecasting, Decision trees, SVM, kNN, Ensemble techniques
• Classification: Logistic regression, Naive-Bayes, Decision trees, SVM, kNN, Ensemble techniques

Unsupervised
• Dimensionality reduction (PCA)
• Clustering: k-means, Hierarchical, DBSCAN
• Association rules

Reinforcement

Optimization

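Machine Learning – Experience, E
Input: $\mathbf{x} \in \mathbb{R}^p$, where $p$ is the number of features in the data.

Each observation is a p-tuple with a label:
$\mathbf{x} = (x_1, x_2, x_3, \dots, x_p)$, label $y$

For $n$ observations:
$\mathbf{x}_1 = (x_{11}, x_{12}, x_{13}, \dots, x_{1p})$, label $y_1$
$\vdots$
$\mathbf{x}_n = (x_{n1}, x_{n2}, x_{n3}, \dots, x_{np})$, label $y_n$

Arranged as a table, each observation is one row and each feature is one column:
$$
\begin{array}{c|ccccc}
 & x_1 & x_2 & x_3 & \cdots & x_p \\
\hline
\mathbf{x}_1 & x_{11} & x_{12} & x_{13} & \cdots & x_{1p} \\
\mathbf{x}_2 & x_{21} & x_{22} & x_{23} & \cdots & x_{2p} \\
\vdots & \vdots & \vdots & \vdots & \cdots & \vdots \\
\mathbf{x}_n & x_{n1} & x_{n2} & x_{n3} & \cdots & x_{np}
\end{array}
$$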
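The notation above maps directly onto how data is usually stored in code. The sketch below is only an illustration: the numbers are made up, and the names `X`, `y`, `x_2` and `x_22` are assumptions chosen to mirror the slide's symbols.

```python
# Hypothetical toy data: n = 3 observations, p = 2 features (values are made up).
X = [
    [3504.0, 130.0],   # x_1 = (x_11, x_12)
    [3693.0, 165.0],   # x_2 = (x_21, x_22)
    [2372.0, 95.0],    # x_3 = (x_31, x_32)
]
y = [18.0, 15.0, 24.0]  # one label per observation

n = len(X)       # number of observations (rows)
p = len(X[0])    # number of features (columns)
x_2 = X[1]       # the second observation, a p-tuple
x_22 = X[1][1]   # the second feature of the second observation

print(n, p, x_2, x_22, y[1])
```

Python indexes from 0, so the slide's $x_{22}$ is `X[1][1]` in code.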

Machine Learning – Task, T
• Predict or forecast a value
• Classify an object into one of ‘n’ given categories
• Anomaly detection
• Transcription
• Translation
• Synthesis of a new exemplar
• Determination of missing value – imputation
• Data cleaning – denoising
• Estimation of probability mass function or density
• Group objects
• Identify areas of interest in an image – segmentation
• Fastest route between two cities
• Combination of stocks with maximum ROI
• Predict the next product the customer will buy

What is decision making?
• Decision making is the process of identifying and selecting a
course of action among several alternatives to achieve a
desired outcome.

• Decision making is essential for navigating uncertainties and
achieving organizational goals.

Types of decision making
1. Certainty
2. Risk
3. Uncertainty

[Slide figure: stock-market illustration with the labels NIFTY50, Mid cap and Small cap]

Supervised Learning
Linear regression and logistic regression
Supervised learning
• Labelled data – the data includes a target column (dependent variable)

• Labels can be numerical or categorical

• Numerical data – e.g., Age
• Categorical data – e.g., Type

• Generally, the model assumes some form of relationship, e.g., linear and
logistic regression

• Applications (identify the right applications)

• Image Classification, Spam Detection, Customer Segmentation, Network
Anomaly, Fraud Detection, House Price Prediction, Handwriting Recognition

Metrics
Regression Classification
MAE Accuracy
MSE Recall
RMSE Precision
MAPE F1-score
R-square
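The metrics listed above can be computed from their standard definitions in a few lines. The sketch below uses only the standard library; the function names and the toy values are assumptions made for illustration and are not taken from the slides.

```python
import math

def regression_metrics(y_true, y_pred):
    """MAE, MSE, RMSE, MAPE (in percent) and R-square from their definitions."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    mae = sum(abs(e) for e in errors) / n
    mse = sum(e * e for e in errors) / n
    rmse = math.sqrt(mse)
    # MAPE divides by the true value, so it is undefined when any y_true is 0.
    mape = 100 * sum(abs(e / t) for e, t in zip(errors, y_true)) / n
    mean_y = sum(y_true) / n
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    r2 = 1 - sum(e * e for e in errors) / ss_tot
    return {"MAE": mae, "MSE": mse, "RMSE": rmse, "MAPE": mape, "R2": r2}

def classification_metrics(y_true, y_pred, positive=1):
    """Accuracy, recall, precision and F1-score for a binary problem."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = len(pairs) - tp - fp - fn
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"Accuracy": (tp + tn) / len(pairs), "Recall": recall,
            "Precision": precision, "F1": f1}

# Made-up values, only to exercise the functions.
reg = regression_metrics([3.0, 5.0, 2.0, 7.0], [2.5, 5.0, 3.0, 8.0])
clf = classification_metrics([1, 0, 1, 1, 0, 0], [1, 0, 0, 1, 1, 0])
print(reg)
print(clf)
```

RMSE is in the same units as the target, which often makes it easier to interpret than MSE.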

Linear regression
Simple and Multiple

MPG – application in automobile sector
• Suppose you want to launch a new car model and want to find its
mileage (miles per gallon, MPG).

• How would you find it?

Dataset
[Slide figure: dataset table, with the target column marked y and the feature column marked X]
$y = \beta_1 X + \beta_0$

Model summary (R)

$mpg = -0.0077\,(weight) + 46.31$

What is p-value?
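The fitted line mpg = −0.0077(weight) + 46.31 from the model summary can be used directly for prediction. The sketch below takes only the two coefficients from the slide; the function name `predict_mpg` and the example weights are assumptions, and weight is taken to be in whatever unit the dataset uses.

```python
SLOPE = -0.0077     # change in mpg per one unit of weight (coefficient from the slide)
INTERCEPT = 46.31   # predicted mpg at weight = 0 (an extrapolation, not a real car)

def predict_mpg(weight):
    """Predicted mileage for a given weight using the fitted simple regression line."""
    return SLOPE * weight + INTERCEPT

# Hypothetical weights, chosen only to show the downward trend.
for w in (2000, 3000, 4000):
    print(w, round(predict_mpg(w), 2))
```

The negative slope means each additional unit of weight lowers the predicted mileage by 0.0077 mpg.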

Multiple linear regression
$\hat{y} = \sum_{i=1}^{p} \beta_i X_i + \beta_0$

Model summary (MLR)

Mutual fund manager skill

Application

𝜶, 𝜷 values

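How to find 𝜷𝒊?
$\hat{y} = \beta_1 x_1 + \beta_2 x_2 + \beta_0$

SSE: $\sum_{j=1}^{n} (y_j - \hat{y}_j)^2$
Minimize SSE to get the $\beta$s.

Obj function: $\min_{\beta_0, \beta_1, \beta_2} \sum_{j=1}^{n} \left( y_j - (\beta_1 x_{j1} + \beta_2 x_{j2} + \beta_0) \right)^2$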
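One common way to minimize the SSE is to solve the normal equations $(X^T X)\beta = X^T y$. The sketch below does this for the two-feature model using only the standard library. The helper names (`sse`, `solve`, `fit_ols`) and the data are assumptions made for illustration; the data is noise-free and generated from known coefficients so the result can be checked by eye.

```python
def sse(beta, rows, y):
    """Sum of squared errors for y_hat = b0 + b1*x1 + b2*x2."""
    b0, b1, b2 = beta
    return sum((yi - (b0 + b1 * x1 + b2 * x2)) ** 2
               for (x1, x2), yi in zip(rows, y))

def solve(A, b):
    """Solve A z = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]   # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    z = [0.0] * n
    for i in reversed(range(n)):                   # back-substitution
        z[i] = (M[i][n] - sum(M[i][j] * z[j] for j in range(i + 1, n))) / M[i][i]
    return z

def fit_ols(rows, y):
    """Minimize SSE by solving the normal equations (X^T X) beta = X^T y."""
    X = [[1.0, x1, x2] for x1, x2 in rows]         # leading 1 carries the intercept
    XtX = [[sum(r[i] * r[j] for r in X) for j in range(3)] for i in range(3)]
    Xty = [sum(r[i] * yi for r, yi in zip(X, y)) for i in range(3)]
    return solve(XtX, Xty)

# Made-up, noise-free data generated from b0 = 1, b1 = 2, b2 = -0.5.
rows = [(1.0, 2.0), (2.0, 1.0), (3.0, 4.0), (4.0, 3.0), (5.0, 6.0)]
y = [1.0 + 2.0 * x1 - 0.5 * x2 for x1, x2 in rows]

beta = fit_ols(rows, y)            # [b0, b1, b2]
print(beta, sse(beta, rows, y))
```

Because the data has no noise, the recovered coefficients match the generating ones and the SSE is essentially zero; any other choice of coefficients gives a larger SSE.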

Visualization

Assumptions
1. Linearity: The relationship between the independent and dependent
variables is linear. This can be checked using scatter plots or
residual plots.
2. Independence: Observations are independent of each other. This
assumption is often verified through knowledge of the data collection
and experiment design.
3. Homoscedasticity: The variance of the residuals (or "errors")
should be constant across all levels of the independent variables. A
plot of residuals vs. predicted values can help check this.
4. Normality of Errors: The residuals (or "errors") should be
approximately normally distributed. This can be checked using
histograms or QQ-plots of residuals.
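All of these checks start from the fitted values and the residuals. The sketch below computes both for a single-feature model using only the standard library. The helper `fit_line` and the data are assumptions made for illustration, and the plotting step itself is left out.

```python
from statistics import mean

def fit_line(x, y):
    """Least-squares slope and intercept for y = intercept + slope * x."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar

# Made-up data that is roughly linear.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1, 11.9]

slope, intercept = fit_line(x, y)
fitted = [intercept + slope * xi for xi in x]
residuals = [yi - fi for yi, fi in zip(y, fitted)]

# Linearity / homoscedasticity: plot residuals against fitted values and look
# for curvature or a funnel shape.
for f, r in zip(fitted, residuals):
    print(round(f, 3), round(r, 3))

# Normality: the sorted residuals are what a QQ-plot compares with normal quantiles.
print(sorted(round(r, 3) for r in residuals))
```

With an intercept in the model, least-squares residuals always sum to zero, so the diagnostic information is in their pattern rather than their average.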

Residual plots

[Slide figure: residual plots with panels titled Linearity, Normality and Homoscedasticity]

Beware of Influential points
• Leverage – measure of how much the independent variable values
of an observation differ from the mean of those independent
variables.

• High residual points – points having high residuals can also be
influential points.

• Cook's Distance – a measure that combines leverage and residual to
identify influential points. It measures the effect of deleting a
given observation.
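For a single-feature regression, leverage and Cook's Distance have simple closed forms: $h_i = 1/n + (x_i - \bar{x})^2 / S_{xx}$ and $D_i = \frac{e_i^2}{p \cdot MSE} \cdot \frac{h_i}{(1 - h_i)^2}$, where $p$ is the number of fitted parameters. The sketch below applies them with only the standard library. The function name and the data are assumptions made for illustration; the last point is deliberately placed far from the others.

```python
from statistics import mean

def influence_measures(x, y):
    """Leverage, residuals and Cook's distance for y = b0 + b1*x."""
    n, p = len(x), 2                       # p = number of fitted parameters (b0, b1)
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    b0 = y_bar - b1 * x_bar
    residuals = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]
    mse = sum(e * e for e in residuals) / (n - p)
    # Leverage grows with the distance of x_i from the mean of x.
    leverage = [1 / n + (xi - x_bar) ** 2 / sxx for xi in x]
    # Cook's distance combines the squared residual with the leverage.
    cooks = [(e * e / (p * mse)) * (h / (1 - h) ** 2)
             for e, h in zip(residuals, leverage)]
    return leverage, residuals, cooks

# Made-up data: the last observation is far from the rest in x and off the trend in y.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 10.0]
y = [1.1, 1.9, 3.2, 3.9, 5.1, 2.0]

leverage, residuals, cooks = influence_measures(x, y)
for xi, h, d in zip(x, leverage, cooks):
    print(xi, round(h, 3), round(d, 3))
```

The last observation has both the highest leverage and by far the largest Cook's Distance, so it is the one to examine before trusting the fit.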

Python code using statsmodels
import statsmodels.api as sm

# df is assumed to be a pandas DataFrame that already holds the
# 'mpg', 'horsepower' and 'weight' columns

# Define independent variables (X) and dependent variable (y)
X = df[['horsepower', 'weight']]
X = sm.add_constant(X)  # Add a constant column for the intercept
y = df['mpg']

# Fit the linear regression model
model_statsmodels = sm.OLS(y, X).fit()

# Print the summary of the regression
print(model_statsmodels.summary())


Python code using sklearn
from sklearn.linear_model import LinearRegression

# Define independent variables (X) and dependent variable (y)

X = df[['horsepower', 'weight']]
y = df['mpg']

# Initialize and fit the linear regression model

model_sklearn = LinearRegression().fit(X, y)

# Print the coefficients and intercept

print("Intercept:", model_sklearn.intercept_)
print("Coefficients:", model_sklearn.coef_)


Multicollinearity
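What is multi-collinearity?

Variance Inflation Factor (VIF):

$VIF(X_i) = \frac{1}{1 - R_i^2}$

Here $R_i^2$ is the R-square obtained by regressing $X_i$ on the remaining
independent variables.

In practice, a VIF value exceeding 5 or 10 suggests that
multicollinearity may be a problem and should be further investigated.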
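With only two independent variables, the VIF is easy to compute by hand, because the R-square from regressing one variable on the other equals their squared correlation. The sketch below covers only that two-variable case with the standard library. The function names and the data are assumptions made for illustration.

```python
from statistics import mean

def r_squared_simple(target, predictor):
    """R-square from regressing target on one predictor (the squared correlation)."""
    t_bar, p_bar = mean(target), mean(predictor)
    stp = sum((t - t_bar) * (p - p_bar) for t, p in zip(target, predictor))
    stt = sum((t - t_bar) ** 2 for t in target)
    spp = sum((p - p_bar) ** 2 for p in predictor)
    return stp ** 2 / (stt * spp)

def vif_two_predictors(x_i, x_other):
    """VIF of x_i when the model has exactly two independent variables."""
    # Perfectly collinear inputs give R-square = 1 and a division by zero.
    return 1.0 / (1.0 - r_squared_simple(x_i, x_other))

# Made-up columns: one nearly proportional to x1, one uncorrelated with it.
x1 = [1.0, 2.0, 3.0, 4.0, 5.0]
x2_collinear = [2.1, 3.9, 6.2, 7.8, 10.1]
x2_unrelated = [1.0, -2.0, 0.0, 2.0, -1.0]

vif_high = vif_two_predictors(x1, x2_collinear)
vif_low = vif_two_predictors(x1, x2_unrelated)
print(round(vif_high, 1), round(vif_low, 3))
```

A VIF of 1 means the variable shares no linear information with the other one; the nearly proportional pair lands far above the 5 to 10 rule of thumb.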


Numerical attributes

Handling Categorical Attributes
Qualification  Qualification_btech  Qualification_mtech  Qualification_phd
Btech          1                    0                    0
Mtech          0                    1                    0
Phd            0                    0                    1
Mtech          0                    1                    0
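The Qualification example turns one categorical column into one 0/1 indicator column per category. The sketch below reproduces that encoding with only the standard library. The function `one_hot` is a hypothetical helper written for this illustration; in practice a library routine such as `pandas.get_dummies` is typically used.

```python
def one_hot(values, prefix):
    """Return one dict per row with a 0/1 indicator for every category."""
    categories = sorted(set(values))
    return [
        {f"{prefix}_{c.lower()}": int(v == c) for c in categories}
        for v in values
    ]

# The four rows from the Qualification example.
qualification = ["Btech", "Mtech", "Phd", "Mtech"]
encoded = one_hot(qualification, "Qualification")

for value, row in zip(qualification, encoded):
    print(value, row)
```

Each row has exactly one indicator set to 1, matching the category in the original column.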

Transformations – handle non-linear data
