
1. U1 ML Intro and Applications

The document outlines the syllabus for a course on the fundamentals of machine learning, covering topics such as types of learning, machine learning life cycle, evaluation metrics, and applications of machine learning. It explains key concepts like supervised, unsupervised, and reinforcement learning, as well as the importance of feature representation and evaluation metrics like precision, recall, and F1 score. Additionally, it discusses the role of machine learning in real-world applications, particularly in image classification for autonomous vehicles.

Uploaded by

q7ak26tja0

MIT School of Computing

Department of Information Technology

Third Year Engineering

21BTDA503 Fundamental of Machine Learning


Class - T.Y. (SEM-V)
Unit - I

Introduction to Machine Learning

AY 2024-2025 SEM-V


Unit-I Syllabus
Motivation and role of machine learning (ML) in computer science and problem solving; representation (features); types of learning; machine learning development life cycle; applications of machine learning; linear transformations: appropriate linear transformations and matrix-vector operations in the context of data and representation; evaluation metrics (confusion matrix): accuracy, precision, recall, and F1 score; MSE, MAE; bias-variance trade-off.
Machine learning

Machine learning is a subfield of artificial intelligence (AI) that focuses on the development of algorithms and models that enable computers to learn and make predictions or decisions without explicit programming.

It is concerned with designing systems that can automatically analyze and interpret data, identify patterns, and make informed predictions or take actions based on the patterns and insights learned from the data.
At its core, machine learning involves training models on a large amount of data, known as a training set. The models learn from this data by identifying patterns, relationships, and correlations.

Through iterative processes, the models refine their understanding and improve their performance over time.

Once trained, the models can make predictions or decisions on new, unseen data by applying the knowledge acquired during training.

Machine learning algorithms can be categorized into several types, including

• supervised learning,
• unsupervised learning, and
• reinforcement learning.
What is Machine Learning

• Arthur Samuel (1959): Machine learning is a subset of Artificial Intelligence (AI) which provides machines the ability to learn automatically and improve from experience without being explicitly programmed to do so.

• Tom M. Mitchell (1998): A computer program is said to learn from experience E with respect to some class of tasks T and performance measure P if its performance at tasks in T, as measured by P, improves with experience E.
Cont.…
• For the checkers playing example:
• E: the experience of having the program play 10,000 games against itself
• T: the task of playing checkers
• P: the probability of winning the game

• Suppose your email program watches which emails you do or do not mark as spam, and based on that learns how to better filter spam. What is the task T in this setting?
a) Classifying emails as spam or not spam
b) Watching you label emails as spam or not spam
c) The number of emails correctly classified as spam/not spam
d) None of the above - this is not a machine learning problem

Answer: (a). Classifying emails as spam or not spam is the task T. Watching you label emails (b) is the experience E, and the number of emails correctly classified (c) is the performance measure P.
Machine Learning Terminology
• Algorithm: A Machine Learning algorithm is a set of rules and statistical
techniques used to learn patterns from data and draw significant
information from it.
• Model: A model is trained by using a Machine Learning Algorithm.
• Predictor Variable: It is a feature(s) of the data that can be used to predict
the output.
• Response Variable: It is the feature or the output variable that needs to be
predicted by using the predictor variable(s).
• Training Data: The Machine Learning model is built using the training
data. The training data helps the model to identify key trends and patterns
essential to predict the output.
• Testing Data: After the model is trained, it must be tested to evaluate how
accurately it can predict an outcome. This is done by the testing data set.
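
The terms above can be tied together in a small sketch. The data (hours studied as the predictor, exam score as the response) and the least-squares line fit are illustrative assumptions, not taken from the slides; any algorithm and dataset would follow the same train-then-test pattern.

```python
# Hypothetical toy data: hours studied (predictor) -> exam score (response).
hours = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
scores = [52.0, 54.0, 56.0, 58.0, 60.0, 62.0, 64.0, 66.0]

# Training data is used to build the model; testing data is held back to evaluate it.
train_x, test_x = hours[:6], hours[6:]
train_y, test_y = scores[:6], scores[6:]


def fit_line(xs, ys):
    """Algorithm: least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    return slope, mean_y - slope * mean_x


# Model: the parameters produced by running the algorithm on the training data.
slope, intercept = fit_line(train_x, train_y)

# Testing: compare predictions on unseen inputs with the known responses.
predictions = [slope * x + intercept for x in test_x]
test_errors = [abs(p - y) for p, y in zip(predictions, test_y)]
print(predictions, test_errors)
```

Here the model never sees the last two points during training, so the test errors indicate how well it generalizes rather than how well it memorized.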
10
Types of Machine Learning Algorithm
(3 Types)
• Supervised Learning: Supervised learning is a technique in which we teach or train the machine using well-labeled data.

• Unsupervised Learning: Unsupervised learning involves training by using unlabeled data and allowing the model to act on that information without guidance.

• Reinforcement Learning: Reinforcement learning is a part of machine learning where an agent is put in an environment and learns to behave in this environment by performing certain actions and observing the rewards it gets from those actions.
Machine learning

• In supervised learning, the models learn from labeled examples where each data point is associated with a known outcome or label.
• Unsupervised learning, on the other hand, involves learning from unlabeled data to identify hidden patterns or structures.
• Reinforcement learning focuses on learning through interaction with an environment, where the model receives feedback or rewards for its actions.
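
The difference between learning with and without labels can be illustrated with a minimal sketch. The nearest-centroid classifier and the largest-gap split used here are simple stand-ins chosen for illustration, and the data and names (`labeled`, `fit_centroids`, `predict`) are hypothetical; the slides do not prescribe a particular algorithm.

```python
# Supervised: every example carries a label, and the model learns one centroid per label.
labeled = [(1.0, "small"), (1.2, "small"), (0.8, "small"),
           (5.0, "large"), (5.5, "large"), (4.5, "large")]


def fit_centroids(examples):
    """Average the feature values observed for each label."""
    sums, counts = {}, {}
    for x, label in examples:
        sums[label] = sums.get(label, 0.0) + x
        counts[label] = counts.get(label, 0) + 1
    return {label: sums[label] / counts[label] for label in sums}


def predict(centroids, x):
    """Assign the label whose centroid is closest to x."""
    return min(centroids, key=lambda label: abs(x - centroids[label]))


centroids = fit_centroids(labeled)
print(predict(centroids, 1.1), predict(centroids, 4.9))

# Unsupervised: the same feature values without labels. Structure is found
# from the data alone, here by splitting the sorted values at the largest gap.
unlabeled = sorted(x for x, _ in labeled)
gaps = [(unlabeled[i + 1] - unlabeled[i], i) for i in range(len(unlabeled) - 1)]
_, split = max(gaps)
group_a, group_b = unlabeled[:split + 1], unlabeled[split + 1:]
print(group_a, group_b)
```

The unsupervised half recovers two groups but cannot name them; attaching the names "small" and "large" requires labels, which is exactly what supervised learning is given.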
Motivation and Role of ML

15
Problem Solving Representation (features)

• In machine learning, the choice and design of features are crucial as they directly impact the performance and effectiveness of the models. The goal is to extract relevant information that helps the models understand the underlying patterns or relationships in the data.

• Problem-solving representation refers to the process of transforming raw data into a format that is suitable for machine learning algorithms to solve a specific problem.

• It involves selecting and creating relevant features or variables that capture meaningful information from the data.
Problem Solving Representation (features)

Feature representation can include various techniques such as:

• Feature Selection: Selecting a subset of existing features that are most informative and relevant to the problem at hand, thereby reducing dimensionality and improving model efficiency.
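
One simple selection criterion, used here purely as an illustration, is to drop features whose values barely vary, since a near-constant column cannot help distinguish samples. The dataset, feature names, and the `select_by_variance` helper below are hypothetical.

```python
from statistics import pvariance

# Hypothetical dataset: each row is one sample, each column one feature.
rows = [
    [1.0, 0.0, 10.0],
    [2.0, 0.0, 12.0],
    [3.0, 0.0, 9.0],
    [4.0, 0.0, 11.0],
]
feature_names = ["age", "constant_flag", "income"]


def select_by_variance(rows, names, threshold=0.0):
    """Keep only the columns whose population variance exceeds the threshold."""
    columns = list(zip(*rows))
    keep = [i for i, col in enumerate(columns) if pvariance(col) > threshold]
    selected_rows = [[row[i] for i in keep] for row in rows]
    return selected_rows, [names[i] for i in keep]


selected, kept_names = select_by_variance(rows, feature_names)
print(kept_names)
print(selected)
```

Other criteria, such as correlation with the response variable, follow the same shape: score each existing feature, then keep a subset.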
Problem Solving Representation (features)

• Feature Extraction: Transforming the data into a new set of features by applying mathematical or statistical methods. This can involve techniques like Principal Component Analysis (PCA), which identifies the most important components or dimensions of the data.
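
As a rough sketch of what PCA does, the following code finds the first principal component of two-dimensional data using the closed-form eigen-decomposition of the 2×2 covariance matrix. The data points and the function name `pca_first_component` are assumptions made for illustration; practical work would normally rely on a numerical library.

```python
import math

# Hypothetical 2-D data lying close to the line y = x.
points = [(2.0, 1.9), (4.0, 4.1), (6.0, 6.2), (8.0, 7.8)]


def pca_first_component(points):
    """Return (direction, scores, explained_ratio) for the first principal component."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    centered = [(x - mean_x, y - mean_y) for x, y in points]

    # Entries of the covariance matrix [[a, b], [b, c]].
    a = sum(dx * dx for dx, _ in centered) / n
    b = sum(dx * dy for dx, dy in centered) / n
    c = sum(dy * dy for _, dy in centered) / n

    # Largest eigenvalue of a symmetric 2x2 matrix, in closed form.
    largest = (a + c) / 2 + math.sqrt(((a - c) / 2) ** 2 + b ** 2)

    # Matching eigenvector; fall back to an axis when the features are uncorrelated.
    if b != 0:
        vx, vy = b, largest - a
    else:
        vx, vy = (1.0, 0.0) if a >= c else (0.0, 1.0)
    norm = math.hypot(vx, vy)
    direction = (vx / norm, vy / norm)

    # New one-dimensional feature: projection of each centered point on the direction.
    scores = [dx * direction[0] + dy * direction[1] for dx, dy in centered]
    explained = largest / (a + c) if (a + c) > 0 else 0.0
    return direction, scores, explained


direction, scores, explained = pca_first_component(points)
print(direction, explained)
```

Because the points lie almost on a line, a single extracted feature (the score along `direction`) retains nearly all of the variance of the two original features.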
Problem Solving Representation (features)

• Feature Engineering: Creating new features based on domain knowledge or expert insights. This can involve combining or transforming existing features, creating interaction terms, or encoding categorical variables.
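
A minimal sketch of two of these ideas follows: an interaction term (area as length times width) and one-hot encoding of a categorical variable. The records, column names, and the `engineer` helper are hypothetical examples, not part of the course material.

```python
# Hypothetical raw records with two numeric features and one categorical feature.
records = [
    {"length_m": 2.0, "width_m": 3.0, "color": "red"},
    {"length_m": 4.0, "width_m": 1.5, "color": "blue"},
    {"length_m": 1.0, "width_m": 1.0, "color": "red"},
]


def engineer(records):
    """Add an interaction term and one-hot encode the 'color' column."""
    categories = sorted({r["color"] for r in records})
    rows = []
    for r in records:
        row = {
            "length_m": r["length_m"],
            "width_m": r["width_m"],
            # Interaction term: domain knowledge says area may matter more than either side.
            "area_m2": r["length_m"] * r["width_m"],
        }
        # One-hot encoding: one 0/1 column per category value.
        for cat in categories:
            row[f"color_{cat}"] = 1 if r["color"] == cat else 0
        rows.append(row)
    return rows


features = engineer(records)
print(features[0])
```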
Machine learning development life cycle


Applications :https://www.javatpoint.com/applications-of-machine-learning
Applications of Machine Learning: Image Classification for Autonomous Vehicles

Motivation: Autonomous vehicles rely on advanced computer vision systems to understand and interpret the surrounding environment.

One crucial task is image classification, where the vehicle's sensors capture images of the road, traffic signs, pedestrians, and other objects, and the ML algorithm determines the class or category of each object.

Role of ML: Machine learning plays a pivotal role in enabling image classification for autonomous vehicles. The ML algorithm is trained on a vast dataset of labeled images, where humans have manually annotated each image with its respective class (e.g., car, pedestrian, traffic sign).

The algorithm learns to recognize patterns, features, and characteristics that distinguish different classes. This trained ML model is then deployed in the vehicle's system to classify objects in real time based on the incoming camera feed.

Benefits: By leveraging ML for image classification, autonomous vehicles can make critical decisions, such as identifying and reacting to traffic signs, detecting pedestrians, or recognizing obstacles on the road.

ML algorithms can handle complex and diverse visual inputs, adapt to various environmental conditions, and continuously improve their accuracy through iterative learning.

Challenges: However, there are challenges in achieving accurate image classification for autonomous vehicles. ML models need to be trained on extensive and diverse datasets to account for variations in lighting conditions, weather, and object appearances.

Balancing accuracy and computational efficiency is also crucial for real-time processing. Additionally, ensuring the robustness and reliability of the ML algorithms against adversarial attacks or unexpected scenarios is an ongoing challenge.
Linear Transformations
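
The slide figures for this topic did not survive extraction, so the following is only a generic sketch of the idea named in the syllabus: a linear transformation applied to a data point as a matrix-vector product. The matrices, the point, and the helper `matvec` are illustrative assumptions.

```python
import math


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


# Scaling: stretch the first feature by 2 and shrink the second by half.
scale = [[2.0, 0.0],
         [0.0, 0.5]]

# Rotation by 90 degrees counter-clockwise.
angle = math.pi / 2
rotate = [[math.cos(angle), -math.sin(angle)],
          [math.sin(angle), math.cos(angle)]]

point = [1.0, 2.0]
scaled = matvec(scale, point)
rotated = matvec(rotate, point)
print(scaled, rotated)

# Linearity: transforming a sum equals the sum of the transforms.
u, v = [1.0, 2.0], [3.0, -1.0]
lhs = matvec(scale, [a + b for a, b in zip(u, v)])
rhs = [a + b for a, b in zip(matvec(scale, u), matvec(scale, v))]
print(lhs == rhs)
```

In a data setting, each sample is a feature vector, and applying the same matrix to every sample re-expresses the whole dataset in a new representation.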

Matrices Application

Convolution

Convolution is known as the feature detector of a CNN. The input to a convolution can be raw data or a feature map output from another convolution.
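
To make the "feature detector" idea concrete, here is a minimal sketch that slides a small kernel over an image and records the response at each position. The image, the edge-detecting kernel, and the function name `convolve2d` are illustrative assumptions. Like most CNN libraries, this sketch does not flip the kernel, so strictly speaking it computes a cross-correlation.

```python
def convolve2d(image, kernel):
    """Slide the kernel over the image (no padding, stride 1) and sum the products."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [[sum(image[i + di][j + dj] * kernel[di][dj]
                 for di in range(kh) for dj in range(kw))
             for j in range(out_w)]
            for i in range(out_h)]


# Hypothetical raw input: dark on the left, bright on the right.
image = [[0, 0, 1, 1],
         [0, 0, 1, 1],
         [0, 0, 1, 1]]

# A kernel that responds where brightness increases from left to right.
edge_kernel = [[-1, 1],
               [-1, 1]]

feature_map = convolve2d(image, edge_kernel)
print(feature_map)  # [[0, 2, 0], [0, 2, 0]]
```

The feature map is nonzero only where the vertical edge sits, and it could itself be passed to another convolution, as the text above describes.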
Evaluation metrics (confusion matrix)

Precision, Recall, F1 Score
Confusion Matrix

Performance metrics can be decisive when dealing with imbalanced data. We will learn about the confusion matrix and its associated terms, which look confusing but are simple.

The confusion matrix, precision, recall, and F1 score give better intuition about prediction results than accuracy alone.
What is a confusion matrix?

A confusion matrix is a very popular measure used while solving classification problems. It can be applied to binary classification as well as to multiclass classification problems.

For binary classification it is a 2×2 matrix with actual values on one axis and predicted values on the other.
Let’s understand the confusing terms in the confusion matrix (true positive, true negative, false negative, and false positive) with an example.

EXAMPLE

A machine learning model is trained to predict whether an email is spam (Positive) or not spam (Negative). The test dataset consists of 100 mails.
True Positive (TP): the model correctly predicts the positive class (prediction and actual are both positive). In the above example, 45 mails that were spam are predicted as spam by the model.
True Negative (TN): the model correctly predicts the negative class (prediction and actual are both negative). In the above example, 30 mails that were not spam are correctly predicted as not spam by the model.
False Positive (FP): the model wrongly predicts the positive class (predicted positive, actual negative). In the above example, 5 mails are predicted as spam although they were not spam. FP is also called a Type I error.
False Negative (FN): the model wrongly predicts the negative class (predicted negative, actual positive). In the above example, 20 mails are predicted as not spam although they were spam. FN is also called a Type II error.
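
The four counts can be obtained by comparing actual and predicted labels one pair at a time. The sketch below rebuilds the 100-mail example from its stated counts (45, 30, 5, 20); the label strings and the helper `confusion_counts` are assumptions made for illustration.

```python
def confusion_counts(actual, predicted, positive="spam"):
    """Count TP, TN, FP and FN for a binary problem."""
    tp = tn = fp = fn = 0
    for a, p in zip(actual, predicted):
        if p == positive and a == positive:
            tp += 1          # predicted positive, actually positive
        elif p != positive and a != positive:
            tn += 1          # predicted negative, actually negative
        elif p == positive:
            fp += 1          # predicted positive, actually negative (Type I error)
        else:
            fn += 1          # predicted negative, actually positive (Type II error)
    return {"TP": tp, "TN": tn, "FP": fp, "FN": fn}


# Reconstruct the example: 45 TP, 30 TN, 5 FP, 20 FN (100 mails in total).
actual = ["spam"] * 45 + ["not spam"] * 30 + ["not spam"] * 5 + ["spam"] * 20
predicted = ["spam"] * 45 + ["not spam"] * 30 + ["spam"] * 5 + ["not spam"] * 20

counts = confusion_counts(actual, predicted)
print(counts)
```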
Confusion Matrix for Binary
Classification

Confusion Matrix

With the help of these four values, we can calculate the True Positive Rate (TPR), False Positive Rate (FPR), True Negative Rate (TNR), and False Negative Rate (FNR).

Recall (sensitivity): TPR = TP / (TP + FN)

Specificity: TNR = TN / (TN + FP)

FPR = FP / (FP + TN) and FNR = FN / (FN + TP)

Even if the data is imbalanced, we can tell whether the model is working well: the values of TPR and TNR should be high, and FPR and FNR should be as low as possible.
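
As a quick check on the spam example (TP = 45, TN = 30, FP = 5, FN = 20), the four rates can be computed as in this sketch. The function name `rates` is a hypothetical helper.

```python
def rates(tp, tn, fp, fn):
    """Compute the four rates derived from a binary confusion matrix."""
    return {
        "TPR": tp / (tp + fn),  # recall / sensitivity
        "TNR": tn / (tn + fp),  # specificity
        "FPR": fp / (fp + tn),
        "FNR": fn / (fn + tp),
    }


r = rates(tp=45, tn=30, fp=5, fn=20)
for name, value in r.items():
    print(f"{name} = {value:.3f}")
```

TPR and FNR always sum to 1, as do TNR and FPR, so reporting sensitivity and specificity is enough to recover all four.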
With the help of TP, TN, FN, and FP, other performance metrics can be calculated.

Precision, Recall

Both precision and recall are crucial for information retrieval, where the positive class matters more than the negative class. Why?

When searching for something on the web, the model does not care about items that are irrelevant and not retrieved (this is the true negative case). Therefore only TP, FP, and FN are used in precision and recall.

Precision

Out of all the predicted positives, what percentage is truly positive?

Precision = TP / (TP + FP)

The precision value lies between 0 and 1.

Recall

Out of all the actual positives, what percentage is predicted positive? It is the same as TPR (true positive rate).

Recall = TP / (TP + FN)
How are precision and recall useful? Spam detection

In spam detection, it is acceptable if a spam mail remains undetected (false negative), but missing a critical mail because it was classified as spam (false positive) is costly. In this situation, false positives should be as low as possible, so precision matters more than recall.

When comparing different models, it can be difficult to decide which is better (high precision and low recall, or vice versa). Therefore, there should be a metric that combines both of these. One such metric is the F1 score.
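
The F1 score is the harmonic mean of precision and recall, F1 = 2 · precision · recall / (precision + recall). The sketch below compares two hypothetical models; their precision and recall values are made up for illustration.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical models: A has high precision but low recall, B is more balanced.
model_a = f1_score(precision=0.90, recall=0.60)
model_b = f1_score(precision=0.70, recall=0.75)
print(f"model A: {model_a:.3f}, model B: {model_b:.3f}")
```

Because the harmonic mean is pulled toward the smaller of the two values, a model cannot obtain a high F1 score by excelling at only one of precision or recall.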
Recall can also be called sensitivity or true positive rate. The term "sensitivity" is more commonly used in medical and biological research than in machine learning.
Classification Model

• The most common metrics used for classification models include accuracy, precision, recall, F1-score, and area under the receiver operating characteristic (ROC) curve.
Accuracy, Precision, Recall, F1- Score


• Precision is the ratio of correct positive predictions to the total number of positive predictions.
• Recall is the ratio of correct positive predictions to the total number of actual positives.
Contd..

Accuracy = (1984 + 107)/(1984 + 107 + 336 + 447) = 72.76%
Precision = (1984)/(1984 + 336) = 85.52%
Recall = (1984)/(1984 + 447) = 81.61%

Contd..

With the other class treated as positive:

Accuracy = (107 + 1984)/(107 + 1984 + 447 + 336) = 72.76%
Precision = (107)/(107 + 447) = 19.31%
Recall = (107)/(107 + 336) = 24.15%
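
The arithmetic of the two worked examples can be reproduced with a short sketch. The mapping of the numbers onto TP, TN, FP and FN is read off the formulas above (the original confusion-matrix figures are not available), and `summarize` is a hypothetical helper.

```python
def summarize(tp, tn, fp, fn):
    """Accuracy, precision and recall from the four confusion-matrix counts."""
    total = tp + tn + fp + fn
    return {
        "accuracy": (tp + tn) / total,
        "precision": tp / (tp + fp),
        "recall": tp / (tp + fn),
    }


# First example: the majority class is treated as positive.
first = summarize(tp=1984, tn=107, fp=336, fn=447)
# Second example: the roles of the two classes are swapped.
second = summarize(tp=107, tn=1984, fp=447, fn=336)

for name, metrics in (("first", first), ("second", second)):
    print(name, {k: f"{v:.2%}" for k, v in metrics.items()})
```

Accuracy is identical in both views, while precision and recall change sharply, which shows why accuracy alone can hide poor performance on the minority class.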
Example 2 – Spam Detection

117
Sensitivity Vs Specificity

Good read: https://www.bitsathy.ac.in/matrices-in-data-science/#:~:text=Linear%20Transformation&text=In%20data%20science%2C%20linear%20transformations,leading%20to%20effective%20dimensionality%20reduction.

Regression Metrics

• Before we get into the top evaluation metrics, you need to understand what "residual" means when evaluating a regression model. In a regression problem, a model will rarely predict the value of a continuous variable exactly.
• Its predictions typically fall somewhat below or above the actual value, so the model's accuracy is assessed through residuals.
• Residuals are the differences between the actual and predicted values. You can think of a residual as a distance: the closer the residuals are to zero, the better the model performs in making its predictions.
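
A residual is computed per data point, as in this minimal sketch with made-up actual and predicted values:

```python
# Hypothetical actual and predicted values for three data points.
actual = [3.0, 5.0, 7.5]
predicted = [2.5, 5.0, 8.0]

# Residual = actual - predicted; the sign shows under- or over-prediction.
residuals = [a - p for a, p in zip(actual, predicted)]
print(residuals)  # [0.5, 0.0, -0.5]
```

A positive residual means the model predicted too low, a negative one too high, and zero means an exact prediction.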
Residual

124
Mean Absolute Error

• The MAE is defined as the sum of all the absolute distances/residuals (the absolute differences between the actual and predicted values) divided by the total number of points in the dataset.
• Low MAE values indicate that the model is predicting well. Larger MAE values indicate that the model is poor at prediction.
• You can calculate the MAE using the following formula:

MAE = (1/n) Σ |y_i − ŷ_i|, where y_i is the actual value, ŷ_i is the predicted value, and n is the number of points.
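
A minimal sketch of the MAE calculation, using hypothetical values:

```python
def mean_absolute_error(actual, predicted):
    """Average of the absolute residuals."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


# Absolute residuals are 0.5, 0.0, 0.5 and 2.0, so the MAE is 3.0 / 4.
mae = mean_absolute_error([3.0, 5.0, 7.5, 10.0], [2.5, 5.0, 8.0, 12.0])
print(mae)  # 0.75
```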
Mean Absolute Error

Advantages of MAE
• The MAE you get is in the same unit as the output variable.
• It is more robust to outliers than MSE.

Disadvantages of MAE
• MAE is not differentiable where the residual is zero, so gradient-based optimizers such as gradient descent need extra care at that point (for example, by using a subgradient).
Mean Squared Error (MSE)

• MSE is one of the most widely used metrics and is a small variation on mean absolute error: it is the mean of the squared differences between the actual and predicted values, MSE = (1/n) Σ (y_i − ŷ_i)², where y_i is the actual value, ŷ_i is the predicted value, and n is the number of points.
Mean Squared Error (MSE)

Advantages of MSE
• The graph of MSE is differentiable, so you can easily use it as a loss function.

Disadvantages of MSE
• The value you get after calculating MSE is in squared units of the output. For example, if the output variable is in meters (m), then the MSE is in square meters.
• If the dataset contains outliers, MSE penalizes them heavily and the calculated MSE becomes much larger. In short, it is not robust to outliers, whereas robustness was an advantage of MAE.
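
The sensitivity of MSE to outliers can be seen by comparing it with MAE on the same predictions, with and without one large error. The numbers below are hypothetical.

```python
def mean_absolute_error(actual, predicted):
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def mean_squared_error(actual, predicted):
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


actual = [10.0, 12.0, 14.0, 16.0]
typical = [11.0, 11.0, 15.0, 15.0]       # every prediction is off by 1
with_outlier = [11.0, 11.0, 15.0, 26.0]  # the last prediction is off by 10

print(mean_absolute_error(actual, typical), mean_squared_error(actual, typical))
print(mean_absolute_error(actual, with_outlier), mean_squared_error(actual, with_outlier))
```

With one outlier the MAE grows from 1.0 to 3.25, while the MSE grows from 1.0 to 25.75, because the single error of 10 contributes 100 after squaring.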
Root Mean Squared Error (RMSE)

RMSE is the square root of the MSE.

Disadvantages of RMSE
• It is less robust to outliers than MAE.

Advantages of RMSE
• The value you get is in the same unit as the output variable, which makes the loss easy to interpret.
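
A minimal sketch of RMSE with hypothetical values, showing that taking the square root brings the error back to the unit of the output variable:

```python
import math


def root_mean_squared_error(actual, predicted):
    """Square root of the mean of the squared residuals."""
    mse = sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)
    return math.sqrt(mse)


# Residuals are 1, -1, 3 and -3 (say, in meters): MSE = 5 m^2, RMSE = sqrt(5) m.
actual = [2.0, 4.0, 6.0, 8.0]
predicted = [1.0, 5.0, 3.0, 11.0]
rmse = root_mean_squared_error(actual, predicted)
mae = sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)
print(f"RMSE = {rmse:.3f}, MAE = {mae:.3f}")
```

RMSE is never smaller than MAE on the same data, and the gap widens as the errors become more uneven, which reflects its greater sensitivity to large errors.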
• Mean Absolute Error (MAE) is simple to calculate and handles outliers well but is not differentiable at zero.
• Mean Squared Error (MSE) is sensitive to outliers and penalizes larger errors more due to squaring.
• Root Mean Squared Error (RMSE) provides an intuitive measure of model accuracy and is easy to interpret.
Bias Vs Variance

➢ There are two main types of errors present in any machine learning model: reducible errors and irreducible errors.
• Irreducible errors are errors which will always be present in a machine learning model because of unknown variables, and whose values cannot be reduced.
• Reducible errors are errors whose values can be further reduced to improve a model. They arise because the model's output function does not match the desired output function, and they can be reduced through optimization.
➢ We can further divide reducible errors into two: bias and variance.
Bias Vs Variance

➢ Bias is the difference between the average prediction of our model and the correct value which we are trying to predict. A model with high bias pays very little attention to the training data and oversimplifies the problem. This leads to high error on both training and test data.

➢ Variance is the variability of the model's prediction for a given data point, which tells us how much the predictions spread when the training data changes. A model with high variance pays a lot of attention to the training data and does not generalize to data it hasn't seen before. As a result, such models perform very well on training data but have high error rates on test data.
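
Bias and variance can be estimated by retraining a model on many noisy training sets and looking at its predictions for one fixed input. The sketch below is a simulation under stated assumptions: the true function x², the noise level, the evaluation point 0.9, and the two deliberately extreme models (a constant predictor and a 1-nearest-neighbor predictor) are all chosen for illustration.

```python
import random
from statistics import mean, pvariance


def true_function(x):
    return x ** 2


def simulate(trials=2000, noise_sd=0.3, x0=0.9, seed=0):
    """Estimate squared bias and variance at x0 for a simple and a flexible model."""
    rng = random.Random(seed)
    xs = [i / 10 for i in range(11)]
    nearest = min(range(len(xs)), key=lambda i: abs(xs[i] - x0))
    simple_preds, flexible_preds = [], []
    for _ in range(trials):
        # A fresh noisy training set drawn from the same underlying function.
        ys = [true_function(x) + rng.gauss(0.0, noise_sd) for x in xs]
        # Simple model: ignores x and always predicts the training mean.
        simple_preds.append(mean(ys))
        # Flexible model: copies the label of the nearest training point.
        flexible_preds.append(ys[nearest])
    target = true_function(x0)

    def summary(preds):
        return {"bias_sq": (mean(preds) - target) ** 2, "variance": pvariance(preds)}

    return summary(simple_preds), summary(flexible_preds)


simple, flexible = simulate()
print("simple model:  ", simple)
print("flexible model:", flexible)
```

The constant model shows high squared bias and low variance (it underfits), while the nearest-neighbor model shows almost no bias but a variance close to the noise variance (it follows the noise in each training set).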
Mohit Kumar 137


End of Machine Learning Basics…
Keep Learning ☺

Mohit Kumar 159
