Machine Learning: Upendra Verma
Upendra Verma
01/21/2025
Machine Learning
SSI
2
Features of Machine learning
• Machine learning is a data-driven technology. Organizations
generate large amounts of data on a daily basis, and by finding
notable relationships in that data they can make better decisions.
Types of machine learning problems
• Supervised learning: The model or algorithm is presented
with example inputs and their desired outputs and then finds
patterns and connections between the input and the output.
The goal is to learn a general rule that maps inputs to outputs.
The training process continues until the model achieves the
desired level of accuracy on the training data.
Supervised learning
• Regression: Regression is a type of supervised learning where
the algorithm learns to predict continuous values based on
input features. The output labels in regression are continuous
values, such as stock prices and housing prices.
• Classification: Classification is a type of supervised learning
where the algorithm learns to assign input data to a specific
category or class based on input features. The output labels in
classification are discrete values. Classification algorithms can
be binary, where the output is one of two possible classes, or
multiclass, where the output can be one of several classes.
Common classification algorithms in machine learning include
Logistic Regression, Naive Bayes, Decision Tree, Support
Vector Machine (SVM), and K-Nearest Neighbors (KNN).
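
As an illustration of classification, here is a minimal K-Nearest Neighbors sketch in plain Python. The data points, the labels "small" and "large", and the function name knn_predict are made up for this example; a real project would more likely use a library such as Scikit-learn.

```python
from collections import Counter
import math


def knn_predict(train_points, train_labels, query, k=3):
    """Predict the class of `query` by majority vote among its k nearest neighbors."""
    # Pair every training point's Euclidean distance to the query with its label.
    distances = sorted(
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    )
    # Keep the labels of the k closest points and return the most common one.
    nearest = [label for _, label in distances[:k]]
    return Counter(nearest).most_common(1)[0][0]


# Hypothetical two-feature training data with two classes (binary classification).
points = [(1.0, 1.0), (1.5, 2.0), (2.0, 1.0), (6.0, 6.0), (7.0, 7.5), (6.5, 8.0)]
labels = ["small", "small", "small", "large", "large", "large"]

prediction = knn_predict(points, labels, (1.8, 1.4))
print(prediction)
```

The query point sits next to the three "small" examples, so the vote among its three nearest neighbors is unanimous.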
Unsupervised learning
• Unsupervised learning: No labels are given to the learning
algorithm, leaving it on its own to find structure in its input. A
typical use is clustering a population into different groups.
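
A small sketch of clustering without labels, using one-dimensional k-means. The slides do not name a specific clustering algorithm, so k-means is an assumption here, as are the sample values, the starting centers, and the function name kmeans_1d.

```python
def kmeans_1d(values, centers, iterations=10):
    """Group 1-D values around the given starting centers (basic k-means)."""
    for _ in range(iterations):
        # Assignment step: put each value in the cluster of its nearest center.
        clusters = [[] for _ in centers]
        for v in values:
            nearest = min(range(len(centers)), key=lambda i: abs(v - centers[i]))
            clusters[nearest].append(v)
        # Update step: move each center to the mean of its cluster
        # (an empty cluster keeps its previous center).
        centers = [
            sum(c) / len(c) if c else centers[i]
            for i, c in enumerate(clusters)
        ]
    return centers, clusters


# Hypothetical unlabeled measurements that form two visible groups.
values = [1.0, 1.2, 0.8, 9.0, 9.5, 10.1]
centers, clusters = kmeans_1d(values, centers=[0.0, 5.0])
print(centers, clusters)
```

No labels are passed in; the two groups emerge only from the distances between the values.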
Semi-supervised learning
Reinforcement learning
Need for machine learning
• Predictive modeling: Machine learning can be used to build
predictive models that can help businesses make better decisions.
For example, machine learning can be used to predict which
customers are most likely to buy a particular product, or which
patients are most likely to develop a certain disease.
• Natural language processing: Machine learning is used to build
systems that can understand and interpret human language. This is
important for applications such as voice recognition, chatbots, and
language translation.
• Computer vision: Machine learning is used to build systems that
can recognize and interpret images and videos. This is important for
applications such as self-driving cars, surveillance systems, and
medical imaging.
• Fraud detection: Machine learning can be used to detect fraudulent
behavior in financial transactions, online advertising, and other
areas.
• Recommendation systems: Machine learning can be used to build
recommendation systems that suggest products, services, or content
to users based on their past behavior and preferences.
Get started with machine learning
Applications of Machine Learning
Python libraries for Machine Learning
• Numpy
• Scipy
• Scikit-learn
• Theano
• TensorFlow
• Keras
• PyTorch
• Pandas
• Matplotlib
Ordinary Least Squares (OLS) method of linear regression
Linear Regression
The model obtains the best-fit regression line by finding the best
values of a and b.
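
A minimal sketch of finding a and b in closed form with ordinary least squares. It assumes the line is written as y = a + b * x, with a as the intercept and b as the slope; the slides do not state which role each letter plays, and the data and the function name fit_ols are illustrative.

```python
def fit_ols(xs, ys):
    """Return (a, b) for the line y = a + b * x that minimizes the squared error."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope: co-variation of x and y divided by the variation of x.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b = sxy / sxx
    # Intercept: the fitted line passes through the point (mean_x, mean_y).
    a = mean_y - b * mean_x
    return a, b


# Hypothetical data lying exactly on y = 1 + 2x.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]
a, b = fit_ols(xs, ys)
print(a, b)
```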
How to update a and b values to get the best-fit line
• Cost function
• In Linear Regression, the Mean Squared Error (MSE) cost
function is employed, which calculates the average of the
squared errors between the predicted values and the actual
values.
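
The MSE cost can be sketched in a few lines of Python. As an assumption, the prediction is taken to be a + b * x; the sample data and the function name mse are illustrative.

```python
def mse(a, b, xs, ys):
    """Average of the squared differences between predictions a + b*x and actual y."""
    n = len(xs)
    return sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys)) / n


# Hypothetical data lying exactly on y = 2x.
xs = [1.0, 2.0, 3.0]
ys = [2.0, 4.0, 6.0]

perfect = mse(0.0, 2.0, xs, ys)  # the line y = 2x passes through every point
poor = mse(0.0, 1.0, xs, ys)     # the line y = x underestimates every point
print(perfect, poor)
```

A lower cost means the line is closer to the data, so comparing the cost of two candidate (a, b) pairs tells us which line fits better.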
Gradient Descent for Linear Regression
• A linear regression model can be trained with the gradient
descent optimization algorithm, which iteratively modifies the
model’s parameters to reduce the mean squared error (MSE) of
the model on a training dataset. The model uses gradient
descent to update the a and b values so that the cost function
(the MSE) decreases and the line approaches the best fit.
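
The update loop can be sketched as follows. This is a minimal illustration, assuming the line is y = a + b * x and the cost is the MSE; the learning rate, the number of epochs, the data, and the function name gradient_descent are arbitrary choices for this example.

```python
def gradient_descent(xs, ys, learning_rate=0.05, epochs=2000):
    """Fit y = a + b * x by repeatedly stepping against the gradient of the MSE."""
    a, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        # Prediction error for every training point with the current a and b.
        errors = [(a + b * x) - y for x, y in zip(xs, ys)]
        # Partial derivatives of the MSE with respect to a and b.
        grad_a = (2.0 / n) * sum(errors)
        grad_b = (2.0 / n) * sum(e * x for e, x in zip(errors, xs))
        # Move both parameters a small step in the direction that lowers the cost.
        a -= learning_rate * grad_a
        b -= learning_rate * grad_b
    return a, b


# Hypothetical data lying exactly on y = 1 + 2x.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]
a, b = gradient_descent(xs, ys)
print(round(a, 4), round(b, 4))
```

If the learning rate is too large the updates overshoot and the cost grows instead of shrinking; if it is too small, many more epochs are needed.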
Evaluation Metrics for Linear Regression
• A variety of evaluation measures can be used to determine the
strength of a linear regression model. These metrics indicate
how well the model reproduces the observed outputs.
• Coefficient of Determination (R-squared)
• R-squared is a statistic that indicates how much of the variation
in the target the model can explain or capture. For a least
squares fit with an intercept, evaluated on its training data, it
lies in the range of 0 to 1; on other data it can be negative. In
general, the better the model matches the data, the greater the
R-squared value.
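
R-squared can be computed as 1 minus the ratio of the residual sum of squares to the total sum of squares. The sketch below uses made-up values and a hypothetical helper named r_squared. It also shows that predictions worse than simply guessing the mean give a value below 0.

```python
def r_squared(ys, predictions):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(ys) / len(ys)
    # Variation left unexplained by the predictions.
    ss_res = sum((y - p) ** 2 for y, p in zip(ys, predictions))
    # Total variation of the observed values around their mean.
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot


ys = [3.0, 5.0, 7.0, 9.0]

r2_perfect = r_squared(ys, [3.0, 5.0, 7.0, 9.0])  # predictions equal the data
r2_mean = r_squared(ys, [6.0, 6.0, 6.0, 6.0])     # always predict the mean
r2_bad = r_squared(ys, [9.0, 7.0, 5.0, 3.0])      # predictions in reverse order
print(r2_perfect, r2_mean, r2_bad)
```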
Decision Tree
• A decision tree is a type of supervised learning algorithm that is
commonly used in machine learning to model and predict
outcomes based on input data. It is a tree-like structure where
each internal node tests an attribute, each branch corresponds
to an attribute value, and each leaf node represents the final
decision or prediction.
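
The structure described above can be sketched with nested dictionaries. The tree below is hand-built rather than learned from data, and the attributes ("outlook", "windy"), the labels, and the function name predict are hypothetical.

```python
# Internal nodes are dicts that name the attribute to test and hold one
# branch per attribute value; leaves are plain labels.
tree = {
    "attribute": "outlook",
    "branches": {
        "sunny": {
            "attribute": "windy",
            "branches": {True: "no", False: "yes"},
        },
        "rainy": "no",
        "overcast": "yes",
    },
}


def predict(node, sample):
    """Walk from the root to a leaf, following the branch that matches the sample."""
    while isinstance(node, dict):
        value = sample[node["attribute"]]  # test the attribute at this node
        node = node["branches"][value]     # follow the matching branch
    return node                            # a leaf holds the final decision


label = predict(tree, {"outlook": "sunny", "windy": False})
print(label)
```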
Decision Tree Terminologies
• Root Node: The topmost node of a decision tree. It represents the first
choice or feature from which the tree branches.
• Internal Nodes (Decision Nodes): Nodes in the tree where a choice is made
based on the value of a particular attribute. These nodes have branches
that lead to other nodes.
• Leaf Nodes (Terminal Nodes): The ends of the branches, where the final
decision or prediction is made. Leaf nodes have no further branches.
• Branches (Edges): Links between nodes that show which path is taken
for a particular outcome of a test.
• Splitting: The process of dividing a node into two or more sub-nodes based
on a decision criterion. It involves selecting a feature and a threshold to
create subsets of data.
• Parent Node: A node that is split into child nodes; the node from
which a split originates.
• Child Node: A node created as a result of a split from a parent node.
• Decision Criterion: The rule or condition used to determine how the data
should be split at a decision node. It involves comparing feature values
against a threshold.
• Pruning: The process of removing branches or nodes from a decision tree to
improve its generalization and prevent overfitting.
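
Splitting on a threshold can be sketched as a search for the cut that leaves the purest child nodes. The slides do not name a purity measure, so Gini impurity is an assumption here; the ages, the labels, and the function names gini and best_threshold are made up for illustration.

```python
def gini(labels):
    """Gini impurity of a list of class labels (0.0 means the node is pure)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((labels.count(c) / n) ** 2 for c in set(labels))


def best_threshold(values, labels):
    """Try midpoints between distinct feature values and keep the purest split."""
    best = (None, float("inf"))
    n = len(values)
    distinct = sorted(set(values))
    for lo, hi in zip(distinct, distinct[1:]):
        t = (lo + hi) / 2
        # Child nodes: samples at or below the threshold, and samples above it.
        left = [lab for v, lab in zip(values, labels) if v <= t]
        right = [lab for v, lab in zip(values, labels) if v > t]
        # Size-weighted average impurity of the two children.
        score = (len(left) * gini(left) + len(right) * gini(right)) / n
        if score < best[1]:
            best = (t, score)
    return best


# Hypothetical parent node: customer ages and whether they bought a product.
ages = [22, 25, 30, 35, 40, 52]
bought = ["no", "no", "no", "yes", "yes", "yes"]
threshold, impurity = best_threshold(ages, bought)
print(threshold, impurity)
```

Here the parent node holds all six samples, and the chosen threshold produces two child nodes that each contain a single class.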
Why Decision Tree
• Decision trees are versatile at modeling intricate decision-making
processes.
• Their hierarchical structure lets them represent complex choice
scenarios that take into account a variety of causes and
outcomes.
• Because they provide comprehensible insight into the decision
logic, decision trees are especially helpful for classification
and regression tasks.
• They handle both numerical and categorical data, and they
adapt easily to a variety of datasets because they select
features automatically while the tree is built.
• Decision trees are also simple to visualize, which helps to
understand and explain the underlying decision processes in
a model.