Introduction To Machine Learning Notes

Uploaded by

haqulfathima
Copyright
© © All Rights Reserved

Introduction to Machine Learning

Arthur Samuel, a pioneer in the field of artificial intelligence and computer
gaming, coined the term “Machine Learning”. He defined machine learning
as a “field of study that gives computers the capability to learn without
being explicitly programmed”. In layman’s terms, Machine Learning (ML) can
be explained as automating and improving the learning process of computers
based on their experience, without hand-coding the rules for each task. The
process starts with feeding good-quality data and then training our machines
(computers) by building machine learning models using the data and different
algorithms. The choice of algorithm depends on what type of data we have and
what kind of task we are trying to automate.

What is Machine Learning?


Machine Learning is a branch of artificial intelligence that develops algorithms
which learn the hidden patterns in datasets and use them to make predictions
on new data of a similar type, without being explicitly programmed for each
task. Traditional machine learning combines data with statistical tools to
predict an output that can be used to derive actionable insights.
Machine learning is used in many different applications, from image and
speech recognition to natural language processing, recommendation systems,
fraud detection, portfolio optimization, task automation, and so on. Machine
learning models are also used to power autonomous vehicles, drones, and
robots, making them more intelligent and adaptable to changing environments.

A typical machine learning task is to provide a recommendation.


Recommender systems are a common application of machine learning, and
they use historical data to provide personalized recommendations to users. In
the case of Netflix, the system uses a combination of collaborative filtering
and content-based filtering to recommend movies and TV shows to users
based on their viewing history, ratings, and other factors such as genre
preferences.

Reinforcement learning is another type of machine learning that can be used to
improve recommendation-based systems. In reinforcement learning, an agent
learns to make decisions based on feedback from its environment, and this
feedback can be used to improve the recommendations provided to users. For
example, the system could track how often a user watches a recommended
movie and use this feedback to adjust the recommendations in the future.

Personalized recommendations based on machine learning have become
increasingly popular in many industries, including e-commerce, social media,
and online advertising, as they can provide a better user experience and
increase engagement with the platform or service.

The breakthrough comes with the idea that a machine can learn on its own
from data (i.e., from examples) to produce accurate results. Machine learning
is closely related to data mining and data science. The machine receives data
as input and uses an algorithm to formulate answers.

Difference between Machine Learning and Traditional Programming


| Machine Learning | Traditional Programming | Artificial Intelligence |
|---|---|---|
| Machine Learning is a subset of artificial intelligence (AI) that focuses on learning from data to develop an algorithm that can be used to make predictions. | In traditional programming, rule-based code is written by developers according to the problem statement. | Artificial Intelligence involves making machines as capable as possible, so that they can perform tasks that typically require human intelligence. |
| Machine Learning uses a data-driven approach. A model is typically trained on historical data and then used to make predictions on new data. | Traditional programming is typically rule-based and deterministic. It has no self-learning features, unlike Machine Learning and AI. | AI can involve many different techniques, including Machine Learning and Deep Learning, as well as traditional rule-based programming. |
| ML can find patterns and insights in large datasets that might be difficult for humans to discover. | Traditional programming is totally dependent on the intelligence of the developers, so its capability is limited. | AI sometimes uses a combination of data and pre-defined rules, which gives it an edge in solving complex tasks with good accuracy that seem impossible to humans. |
| Machine Learning is a subset of AI and is now used in various AI-based tasks such as chatbot question answering and self-driving cars. | Traditional programming is often used to build applications and software systems that have specific functionality. | AI is a broad field that includes many different applications, including natural language processing, computer vision, and robotics. |

The Difference between Machine Learning and Traditional Programming is as


How machine learning algorithms work
Machine Learning works in the following manner.

 Forward Pass: In the forward pass, the machine learning algorithm
takes in input data and produces an output. How the predictions are
computed depends on the model and the algorithm.
 Loss Function: The loss function, also known as the error or cost
function, is used to evaluate the accuracy of the predictions made by
the model. The function compares the predicted output of the model
to the actual output and calculates the difference between them. This
difference is known as error or loss. The goal of the model is to
minimize the error or loss function by adjusting its internal
parameters.
 Model Optimization Process: The model optimization process is
the iterative process of adjusting the internal parameters of the model
to minimize the error or loss function. This is done using an
optimization algorithm, such as gradient descent. The optimization
algorithm calculates the gradient of the error function with respect to
the model’s parameters and uses this information to adjust the
parameters to reduce the error. The algorithm repeats this process
until the error is minimized to a satisfactory level.
Once the model has been trained and optimized on the training data, it can be
used to make predictions on new, unseen data. The accuracy of the model’s
predictions can be evaluated using various performance metrics, such as
accuracy, precision, recall, and F1-score.
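The three steps above can be sketched in a few lines of plain Python. This is a minimal illustration, assuming a one-feature linear model, a mean squared error loss, and hand-picked toy data; the function names, learning rate, and step count are illustrative choices, not part of the notes.

```python
# Toy data generated from y = 2x + 1 (illustrative values only).
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]


def forward(w, b, x):
    """Forward pass: compute the prediction for one input."""
    return w * x + b


def mse_loss(w, b):
    """Loss function: mean squared error between predictions and targets."""
    return sum((forward(w, b, x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def gradients(w, b):
    """Gradient of the MSE loss with respect to the parameters w and b."""
    n = len(xs)
    dw = sum(2 * (forward(w, b, x) - y) * x for x, y in zip(xs, ys)) / n
    db = sum(2 * (forward(w, b, x) - y) for x, y in zip(xs, ys)) / n
    return dw, db


def train(learning_rate=0.05, steps=2000):
    """Model optimization: repeat gradient descent updates from w = b = 0."""
    w, b = 0.0, 0.0
    for _ in range(steps):
        dw, db = gradients(w, b)
        w -= learning_rate * dw
        b -= learning_rate * db
    return w, b


w, b = train()
print(f"w={w:.3f}, b={b:.3f}, loss={mse_loss(w, b):.6f}")
```

With these settings the parameters should approach the values used to generate the data (w close to 2, b close to 1); a learning rate that is too large would make the updates diverge instead.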

Machine Learning lifecycle:


The lifecycle of a machine learning project involves a series of steps that
include:

1. Study the Problem: The first step is to study the problem. This step
involves understanding the business problem and defining the
objectives of the model.
2. Data Collection: When the problem is well-defined, we can collect
the relevant data required for the model. The data could come from
various sources such as databases, APIs, or web scraping.
3. Data Preparation: Once the problem-related data is collected,
it is a good idea to check the data properly and convert it into the desired
format so that the model can use it to find the hidden patterns.
This can be done in the following steps:
 Data cleaning
 Data transformation
 Exploratory data analysis and feature engineering
 Splitting the dataset for training and testing

4. Model Selection: The next step is to select the machine learning
algorithm that is suitable for our problem. This step requires
knowledge of the strengths and weaknesses of different algorithms.
Sometimes we use multiple models, compare their results, and
select the best model as per our requirements.
5. Model Building and Training: After selecting the algorithm, we
have to build the model.
1. In the case of traditional machine learning, building the model
is easy: it usually means setting only a few hyperparameters.
2. In the case of deep learning, we have to define the layer-wise
architecture along with input and output size, number of
nodes in each layer, loss function, gradient descent
optimizer, etc.
3. After that, the model is trained using the preprocessed dataset.
6. Model Evaluation: Once the model is trained, it can be evaluated on
the test dataset to determine its accuracy and performance using
metrics such as a classification report, F1 score, precision,
recall, ROC curve, mean squared error, mean absolute error, etc.
7. Model Tuning: Based on the evaluation results, the model may need
to be tuned or optimized to improve its performance. This involves
tweaking the hyperparameters of the model.
8. Deployment: Once the model is trained and tuned, it can be
deployed in a production environment to make predictions on new
data. This step requires integrating the model into an existing
software system or creating a new system for the model.
9. Monitoring and Maintenance: Finally, it is essential to monitor the
model’s performance in the production environment and perform
maintenance tasks as required. This involves monitoring for data
drift, retraining the model as needed, and updating the model as new
data becomes available.
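A compressed sketch of steps 2 to 6 of the lifecycle (data, split, training, evaluation) is shown below. It assumes synthetic data and a one-feature least-squares line as the model; the helper names `train_test_split`, `fit_line`, and `mean_squared_error` are defined here for illustration and are not calls into any library.

```python
import random


def train_test_split(rows, test_fraction=0.25, seed=0):
    """Shuffle a copy of the rows and hold out a fraction for testing."""
    rows = rows[:]
    random.Random(seed).shuffle(rows)
    n_test = max(1, int(len(rows) * test_fraction))
    return rows[n_test:], rows[:n_test]


def fit_line(rows):
    """Model training: closed-form least squares for y = slope * x + intercept."""
    n = len(rows)
    mean_x = sum(x for x, _ in rows) / n
    mean_y = sum(y for _, y in rows) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in rows)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in rows)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def mean_squared_error(rows, slope, intercept):
    """Model evaluation: average squared prediction error."""
    return sum((slope * x + intercept - y) ** 2 for x, y in rows) / len(rows)


# Data collection and preparation: synthetic (x, y) pairs with small noise.
rng = random.Random(42)
data = [(float(x), 3.0 * x + 2.0 + rng.uniform(-0.5, 0.5)) for x in range(40)]

train_rows, test_rows = train_test_split(data)
slope, intercept = fit_line(train_rows)
test_mse = mean_squared_error(test_rows, slope, intercept)
print(f"slope={slope:.2f}, intercept={intercept:.2f}, test MSE={test_mse:.3f}")
```

Evaluating on the held-out rows rather than the training rows is what makes the error estimate meaningful for new data; tuning, deployment, and monitoring are outside the scope of this sketch.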
Types of Machine Learning
 Supervised Machine Learning
 Unsupervised Machine Learning
 Reinforcement Machine Learning

1. Supervised Machine Learning:


Supervised learning is a type of machine learning in which the algorithm is
trained on a labeled dataset. It learns to map input features to targets based
on labeled training data. In supervised learning, the algorithm is provided with
input features and corresponding output labels, and it learns to generalize from
this data to make predictions on new, unseen data.

There are two main types of supervised learning:

 Regression: Regression is a type of supervised learning where the
algorithm learns to predict continuous values based on input features.
The output labels in regression are continuous values, such as stock
prices and housing prices. Regression algorithms in machine learning
include Linear Regression, Polynomial Regression, Ridge Regression,
Decision Tree Regression, Random Forest Regression, Support Vector
Regression, etc.
 Classification: Classification is a type of supervised learning where
the algorithm learns to assign input data to a specific category or
class based on input features. The output labels in classification are
discrete values. Classification problems can be binary, where the
output is one of two possible classes, or multiclass, where the output
can be one of several classes. Classification algorithms in machine
learning include Logistic Regression, Naive Bayes, Decision Tree,
Support Vector Machine (SVM), K-Nearest Neighbors (KNN),
etc.
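As a small illustration of classification from labeled data, here is a sketch of K-Nearest Neighbors written in plain Python. The training points, the labels "small" and "large", and the function name `knn_predict` are made up for the example.

```python
import math
from collections import Counter


def knn_predict(train, query, k=3):
    """Return the majority label among the k training points nearest to query."""
    neighbors = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]


# Labeled training data as (features, label) pairs. Values are illustrative.
train = [
    ((1.0, 1.2), "small"), ((0.8, 1.0), "small"), ((1.3, 0.9), "small"),
    ((4.0, 4.2), "large"), ((4.5, 3.8), "large"), ((3.9, 4.4), "large"),
]

print(knn_predict(train, (1.1, 1.0)))
print(knn_predict(train, (4.2, 4.0)))
```

The model here is just the stored labeled data; prediction for a new, unseen point is a vote among its closest labeled neighbors.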

2. Unsupervised Machine Learning:

Unsupervised learning is a type of machine learning where the algorithm
learns to recognize patterns in data without being explicitly trained using
labeled examples. The goal of unsupervised learning is to discover the
underlying structure or distribution in the data.

There are two main types of unsupervised learning:

 Clustering: Clustering algorithms group similar data points together
based on their characteristics. The goal is to identify groups, or
clusters, of data points that are similar to each other, while being
distinct from other groups. Some popular clustering algorithms
include K-means, Hierarchical clustering, and DBSCAN.
 Dimensionality reduction: Dimensionality reduction algorithms
reduce the number of input variables in a dataset while preserving as
much of the original information as possible. This is useful for
reducing the complexity of a dataset and making it easier to visualize
and analyze. Some popular dimensionality reduction algorithms
include Principal Component Analysis (PCA), t-SNE, and
Autoencoders.
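To make the clustering idea concrete, the following is a minimal sketch of K-means in plain Python. It assumes two clusters, fixed starting centroids, and a handful of made-up 2-D points; practical implementations usually choose the starting centroids randomly and stop when assignments no longer change.

```python
import math


def kmeans(points, centroids, iterations=10):
    """Plain k-means: alternate an assignment step and a centroid-update step."""
    clusters = []
    for _ in range(iterations):
        # Assignment step: put each point in the cluster of its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(
                range(len(centroids)),
                key=lambda i: math.dist(p, centroids[i]),
            )
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its assigned points.
        centroids = [
            tuple(sum(coord) / len(cluster) for coord in zip(*cluster))
            if cluster else old
            for cluster, old in zip(clusters, centroids)
        ]
    return centroids, clusters


# Unlabeled 2-D points forming two visible groups (illustrative values).
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5), (8.0, 8.0), (9.0, 8.5), (8.5, 9.0)]
centroids, clusters = kmeans(points, centroids=[(0.0, 0.0), (10.0, 10.0)])
print(centroids)
print(clusters)
```

No labels are used anywhere: the grouping comes only from distances between the points.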

3. Reinforcement Machine Learning

Reinforcement learning is a type of machine learning where an agent learns to
interact with an environment by performing actions and receiving rewards or
penalties based on its actions. The goal of reinforcement learning is to learn a
policy, which is a mapping from states to actions, that maximizes the expected
cumulative reward over time.

There are two main types of reinforcement learning:

 Model-based reinforcement learning: In model-based
reinforcement learning, the agent learns a model of the environment,
including the transition probabilities between states and the rewards
associated with each state-action pair. The agent then uses this model
to plan its actions in order to maximize its expected reward. Planning
methods that operate on such a model include Value Iteration and
Policy Iteration.
 Model-free reinforcement learning: In model-free reinforcement
learning, the agent learns a policy directly from experience without
explicitly building a model of the environment. The agent interacts
with the environment and updates its policy based on the rewards it
receives. Some popular model-free reinforcement learning
algorithms include Q-Learning, SARSA, and deep reinforcement
learning methods such as Deep Q-Networks (DQN).
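The planning step of the model-based approach can be sketched with Value Iteration on a tiny, made-up environment. The sketch assumes the model is already known and deterministic: five states in a row, actions "left" and "right", a reward of 1 for reaching the last state, and a discount factor of 0.9. All of these are illustrative assumptions.

```python
GAMMA = 0.9                            # discount factor (assumed value)
STATES = range(5)                      # states 0..4; state 4 is the terminal goal
ACTIONS = {"left": -1, "right": 1}


def step(state, action):
    """Known environment model: return (next_state, reward) for a move."""
    next_state = min(max(state + ACTIONS[action], 0), 4)
    reward = 1.0 if next_state == 4 else 0.0
    return next_state, reward


def value_iteration(sweeps=50):
    """Repeatedly back up each state's value from its best action."""
    values = [0.0] * 5
    for _ in range(sweeps):
        for s in STATES:
            if s == 4:
                continue  # the terminal state keeps value 0
            values[s] = max(
                reward + GAMMA * values[next_state]
                for next_state, reward in (step(s, a) for a in ACTIONS)
            )
    return values


def greedy_policy(values):
    """Policy: a mapping from each non-terminal state to its best action."""
    policy = {}
    for s in STATES:
        if s == 4:
            continue
        policy[s] = max(
            ACTIONS,
            key=lambda a: step(s, a)[1] + GAMMA * values[step(s, a)[0]],
        )
    return policy


values = value_iteration()
policy = greedy_policy(values)
print(values)
print(policy)
```

The resulting policy moves right in every state, and the state values shrink by the discount factor with each extra step needed to reach the goal. A model-free method such as Q-Learning would have to estimate the same quantities from sampled interaction instead of calling `step` as a known model.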

4. Semi-Supervised Learning

Semi-supervised learning is a type of machine learning that
represents the intermediate ground between supervised and unsupervised
learning. It uses a combination of labeled and unlabeled data
during training.

The basic difference between supervised and unsupervised learning is
that supervised datasets carry an output label for each training tuple,
while unsupervised datasets do not. Semi-supervised learning lies between
the two: it operates on data that contains a few labels but consists mostly
of unlabeled data. Labels are costly to obtain, so in practice an
organization may have only a few of them.

The basic disadvantage of supervised learning is that it requires hand-labeling
by ML specialists or data scientists, which is expensive.
Unsupervised learning, in turn, has a limited range of
applications. To overcome these drawbacks of supervised and
unsupervised learning, the concept of semi-supervised learning
was introduced. Here the training data is a combination of labeled
and unlabeled data: a very small amount of labeled data and a large
amount of unlabeled data. Initially, similar data is clustered using an
unsupervised learning algorithm, and the clusters then help to assign
labels to the unlabeled data. This is worthwhile because labeled data is
comparatively more expensive to acquire than unlabeled data.

We can picture these algorithms with an example. Supervised learning is
like a student who is under the supervision of an instructor both at home and
at college. If that student analyzes the same concept alone, without any help
from the instructor, it is like unsupervised learning. Under semi-supervised
learning, the student revises the concept independently after first studying
it under the guidance of an instructor at college.

Assumptions followed by Semi-Supervised Learning

To work with the unlabeled dataset, there must be a relationship between the
objects. To understand this, semi-supervised learning uses any of the following
assumptions:

o Continuity assumption- As per the continuity assumption, objects near
each other tend to share the same group or label. This assumption is also
used in supervised learning, where the classes are separated by decision
boundaries. In semi-supervised learning, it is combined with the smoothness
assumption that decision boundaries lie in low-density regions.
o Cluster assumption- In this assumption, data are divided into different
discrete clusters, and the points in the same cluster share the output
label.
o Manifold assumption- The data lie on a manifold of fewer dimensions than
the input space, which makes it possible to use distances and densities
defined on that manifold. High-dimensional data are often created by a
process that has fewer degrees of freedom and may be hard to model directly.
(This assumption becomes practical when the dimensionality is high.)

Working of Semi-Supervised Learning

Semi-supervised learning uses pseudo labeling to train the model with fewer
labeled training examples than supervised learning. The process can combine
various neural network models and training methods. The working of
semi-supervised learning is explained in the points below:

o First, the model is trained on the small labeled dataset, as in
supervised learning. The training continues until the model gives
accurate results.
o Next, the trained model predicts pseudo labels for the unlabeled
dataset. At this stage the labels may not all be accurate.
o The labels from the labeled training data and the pseudo labels are
then linked together.
o The input data in the labeled training data and the unlabeled training
data are also linked.
o Finally, the model is trained again on the new combined input, as in
the first step. This can reduce errors and improve the accuracy of the model.
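The pseudo-labeling steps above can be sketched as follows. The example assumes a very simple stand-in model (a one-dimensional nearest-centroid classifier) instead of a neural network, and the data values and the names `fit_centroids` and `predict` are made up for illustration.

```python
def fit_centroids(samples):
    """Train a nearest-centroid classifier: one mean value per label."""
    sums, counts = {}, {}
    for x, label in samples:
        sums[label] = sums.get(label, 0.0) + x
        counts[label] = counts.get(label, 0) + 1
    return {label: sums[label] / counts[label] for label in sums}


def predict(centroids, x):
    """Predict the label whose centroid is closest to x."""
    return min(centroids, key=lambda label: abs(x - centroids[label]))


labeled = [(1.0, "low"), (2.0, "low"), (9.0, "high"), (10.0, "high")]
unlabeled = [1.5, 2.5, 3.0, 8.0, 8.5, 9.5]

# Step 1: train on the small labeled set.
model = fit_centroids(labeled)

# Step 2: predict pseudo labels for the unlabeled inputs.
pseudo_labeled = [(x, predict(model, x)) for x in unlabeled]

# Steps 3 and 4: link the labeled data with the pseudo-labeled data.
combined = labeled + pseudo_labeled

# Step 5: retrain on the combined set.
model = fit_centroids(combined)
print(pseudo_labeled)
print(model)
```

In practice, pseudo labels are often kept only when the model is confident about them, since wrong pseudo labels would be reinforced by the retraining step; that filter is omitted here for brevity.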

Difference between Semi-supervised and Reinforcement Learning.


Reinforcement learning is different from semi-supervised learning, as it works
with rewards and feedback. A reinforcement learning agent aims to maximize its
rewards through trial-and-error actions, whereas in semi-supervised learning
we train the model on a dataset in which only a few examples are labeled.

Real-world applications of Semi-supervised Learning

Semi-supervised learning models are becoming more popular in industry.
Some of the main applications are as follows.

o Speech Analysis- This is the classic example of a semi-supervised
learning application. Labeling audio data is a laborious task that
requires many human resources, and semi-supervised learning (SSL)
reduces the amount of labeling that is needed.
o Web content classification- It is practically impossible to label
every page on the internet because that would need too much human
intervention. Semi-supervised learning algorithms reduce this problem.
Further, Google also uses semi-supervised learning algorithms to rank a
webpage for a given query.
o Protein sequence classification- DNA strands are large, and annotating
them requires active human intervention, so semi-supervised models have
gained ground in this field.
o Text document classifier- It is often infeasible to obtain a large
amount of labeled text data, so semi-supervised learning is a good fit
for this problem.

Need for machine learning:


Machine learning is important because it allows computers to learn from data
and improve their performance on specific tasks without being explicitly
programmed. This ability to learn from data and adapt to new situations makes
machine learning particularly useful for tasks that involve large amounts of
data, complex decision-making, and dynamic environments.

Here are some specific areas where machine learning is being used:

 Predictive modeling: Machine learning can be used to build
predictive models that can help businesses make better decisions. For
example, machine learning can be used to predict which customers
are most likely to buy a particular product, or which patients are most
likely to develop a certain disease.
 Natural language processing: Machine learning is used to build
systems that can understand and interpret human language. This is
important for applications such as voice recognition, chatbots, and
language translation.
 Computer vision: Machine learning is used to build systems that can
recognize and interpret images and videos. This is important for
applications such as self-driving cars, surveillance systems, and
medical imaging.
 Fraud detection: Machine learning can be used to detect fraudulent
behavior in financial transactions, online advertising, and other areas.
 Recommendation systems: Machine learning can be used to build
recommendation systems that suggest products, services, or content
to users based on their past behavior and preferences.
Overall, machine learning has become an essential tool for many businesses
and industries, as it enables them to make better use of data, improve their
decision-making processes, and deliver more personalized experiences to their
customers.
Various Applications of Machine Learning

Now in this Machine learning tutorial, let’s learn the applications of Machine
Learning:

 Automation: Machine learning can work autonomously in many fields
with little need for human intervention. For
example, robots perform the essential process steps in manufacturing
plants.
 Finance Industry: Machine learning is growing in popularity in the
finance industry. Banks mainly use ML to find patterns in their
data and also to prevent fraud.
 Government organization: The government makes use of ML to
manage public safety and utilities. Take the example of China with
its massive face recognition program. The government uses artificial
intelligence to prevent jaywalking.
 Healthcare industry: Healthcare was one of the first industries to
use machine learning with image detection.
 Marketing: AI is used broadly in marketing thanks to abundant
access to data. Before the age of mass data, researchers developed
advanced mathematical tools like Bayesian analysis to estimate the
value of a customer. With the boom of data, the marketing
department relies on AI to optimize customer relationships and
marketing campaigns.
 Retail industry: Machine learning is used in the retail industry to
analyze customer behavior, predict demand, and manage inventory.
It also helps retailers to personalize the shopping experience for each
customer by recommending products based on their past purchases
and preferences.
 Transportation: Machine learning is used in the transportation
industry to optimize routes, reduce fuel consumption, and improve
the overall efficiency of transportation systems. It also plays a role in
autonomous vehicles, where ML algorithms are used to make
decisions about navigation and safety.
Challenges and Limitations of Machine Learning
Limitations of Machine Learning:
1. The primary challenge of machine learning is a lack of data or a lack
of diversity in the dataset.
2. A machine cannot learn if there is no data available. Besides, a
dataset that lacks diversity gives the machine a hard time.
3. A machine needs heterogeneous data to learn meaningful insights.
4. It is rare that an algorithm can extract information when there are no
or few variations.
5. As a rule of thumb, it is recommended to have at least 20 observations
per group to help the machine learn. Falling short of this leads to poor
evaluation and prediction.
Types of Regression Techniques in ML

A regression problem is when the output variable is a real or continuous value,
such as “salary” or “weight”. Many different models can be used; the simplest is
linear regression, which tries to fit the data with the hyperplane that best
matches the points.
What is Regression Analysis?
Regression Analysis is a statistical process for estimating the relationships
between a dependent (criterion) variable and one or more
independent variables (predictors). Regression analysis is generally used when
we deal with a dataset whose target variable is continuous.
Regression analysis explains how the criterion changes in response to changes
in selected predictors. It models the conditional expectation of the criterion
given the predictors, that is, the average value of the dependent variable
when the independent variables take given values. Three major uses for
regression analysis are determining the strength of predictors, forecasting
an effect, and trend forecasting.
What is the purpose of using Regression Analysis?
There are times when we would like to analyze the effect of different
independent features on the target, also called the dependent feature. This
helps us make decisions that can move the target variable in the desired
direction. Regression analysis is grounded in statistics and therefore gives
quite reliable results. For this reason, regression models are used to find
linear as well as non-linear relationships between the independent variables
and the dependent (target) variable.
Types of Regression Techniques
Along with the development of the machine learning domain, regression analysis
techniques have gained popularity and developed far beyond just y =
mx + c. There are several types of regression techniques, each suited for
different types of data and different types of relationships. The main types of
regression techniques are:
Linear Regression
Linear regression is used for predictive analysis. It is a linear
approach for modeling the relationship between the criterion (the scalar
response) and one or more predictors (explanatory variables). Linear
regression focuses on the conditional probability distribution of the response
given the values of the predictors. With many predictors and little data,
there is a danger of overfitting. The formula for linear regression is:
y = θx + b
where,
 θ – the model weights or parameters
 b – the bias

This is the most basic form of regression analysis and is used to model a linear
relationship between a single dependent variable and one or more independent
variables.
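As a stand-in illustration of fitting y = θx + b, the sketch below uses a small class written for this example. `SimpleLinearRegression` is a hypothetical helper, not a library API, it handles a single feature only, and the data values are made up.

```python
class SimpleLinearRegression:
    """Hypothetical helper: ordinary least squares for a single feature."""

    def fit(self, X, y):
        n = len(X)
        mean_x = sum(X) / n
        mean_y = sum(y) / n
        sxx = sum((x - mean_x) ** 2 for x in X)
        sxy = sum((x - mean_x) * (t - mean_y) for x, t in zip(X, y))
        self.theta = sxy / sxx                   # weight (slope)
        self.b = mean_y - self.theta * mean_x    # bias (intercept)
        return self

    def predict(self, X):
        return [self.theta * x + self.b for x in X]


X = [1.0, 2.0, 3.0, 4.0, 5.0]         # input feature (illustrative values)
y = [35.0, 40.0, 45.0, 50.0, 55.0]    # target values (illustrative values)

model = SimpleLinearRegression().fit(X, y)
print(model.theta, model.b)
print(model.predict([6.0]))
```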
Here, a linear regression model is instantiated to fit a linear relationship
between input features (X) and target values (y). This code is used for simple
demonstration of the approach.
Types of Linear Regression
There are two main types of linear regression:
Simple Linear Regression
This is the simplest form of linear regression, and it involves only one
independent variable and one dependent variable. The equation for simple
linear regression is:
Y = β0 + β1X
where:
Y is the dependent variable
X is the independent variable
β0 is the intercept
β1 is the slope
Multiple Linear Regression
This involves more than one independent variable and one dependent variable.
The equation for multiple linear regression is:
Y = β0 + β1X1 + β2X2 + … + βpXp
where:
Y is the dependent variable
X1, X2, …, Xp are the independent variables
β0 is the intercept
β1, β2, …, βp are the slopes
What is the best Fit Line?
Our primary objective while using linear regression is to locate the best-fit
line, which implies that the error between the predicted and actual values
should be kept to a minimum. There will be the least error in the best-fit line.
The best Fit Line equation provides a straight line that represents the
relationship between the dependent and independent variables. The slope of
the line indicates how much the dependent variable changes for a unit change
in the independent variable(s).

Linear Regression

Here Y is called a dependent or target variable and X is called an independent
variable, also known as the predictor of Y. There are many types of functions
or models that can be used for regression. A linear function is the simplest
type of function. Here, X may be a single feature or multiple features
representing the problem.
Linear regression performs the task of predicting a dependent variable value (y)
based on a given independent variable (x). Because the relationship is modeled
as linear, the method is called Linear Regression. In the figure above, X
(input) is the work experience and Y (output) is the salary of a person. The
regression line is the best-fit line for our model.
Different values for the weights, or coefficients, of the line result in
different regression lines, so we use a cost function to compare them and
find the values that give the best-fit line.
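A short sketch of this idea: compute a cost for a few candidate lines and keep the one with the lowest value. Mean squared error is assumed as the cost function here, and the data and candidate (weight, bias) pairs are made up for illustration.

```python
def cost(weight, bias, xs, ys):
    """Mean squared error of the line y = weight * x + bias on the data."""
    return sum((weight * x + bias - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 6.0, 8.0]

# Candidate (weight, bias) pairs, each describing a different regression line.
candidates = [(1.0, 0.0), (2.0, 0.0), (3.0, -1.0)]
costs = {line: cost(line[0], line[1], xs, ys) for line in candidates}
best = min(costs, key=costs.get)

print(costs)
print(best)
```

In practice the candidates are not enumerated by hand; an optimizer such as gradient descent, or a closed-form solution, searches for the weights that minimize the cost.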
Assumptions of Simple Linear Regression
Linear regression is a powerful tool for understanding and predicting the
behavior of a variable. However, it needs to meet a few conditions in order to
give accurate and dependable results.

1. Linearity: The independent and dependent variables have a linear
relationship with one another. This implies that changes in the
dependent variable follow those in the independent variable(s) in a
linear fashion. This means that there should be a straight line that can
be drawn through the data points. If the relationship is not linear,
then linear regression will not be an accurate model.

2. Independence: The observations in the dataset are independent of
each other. This means that the value of the dependent variable for
one observation does not depend on the value of the dependent
variable for another observation. If the observations are not
independent, then linear regression will not be an accurate model.
3. Homoscedasticity: Across all levels of the independent variable(s),
the variance of the errors is constant. This indicates that the value
of the independent variable(s) has no impact on the variance of the
errors. If the variance of the residuals is not constant, then linear
regression will not be an accurate model.
Homoscedasticity in Linear Regression

4. Normality: The residuals should be normally distributed. This means
that the residuals should follow a bell-shaped curve. If the residuals
are not normally distributed, then linear regression will not be an
accurate model.
Assumptions of Multiple Linear Regression
For Multiple Linear Regression, all four of the assumptions from Simple
Linear Regression apply. In addition, there are a few more points to consider:
1. No multicollinearity: There is little or no correlation between the
independent variables. Multicollinearity
occurs when two or more independent variables are highly correlated
with each other, which can make it difficult to determine the
individual effect of each variable on the dependent variable. If there
is multicollinearity, then multiple linear regression will not be an
accurate model.
2. Additivity: The model assumes that the effect of changes in a
predictor variable on the response variable is consistent regardless of
the values of the other variables. This assumption implies that there
is no interaction between variables in their effects on the dependent
variable.
3. Feature Selection: In multiple linear regression, it is essential to
carefully select the independent variables that will be included in the
model. Including irrelevant or redundant variables may lead to
overfitting and complicate the interpretation of the model.
4. Overfitting: Overfitting occurs when the model fits the training data
too closely, capturing noise or random fluctuations that do not
represent the true underlying relationship between variables. This can
lead to poor generalization performance on new, unseen data.
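One simple way to look for multicollinearity is to compute the correlation between pairs of independent variables. The sketch below assumes the Pearson correlation coefficient as the measure; the variable names and values are made up, and the cut-off at which a correlation counts as "high" is a judgment call rather than a fixed rule.

```python
import math


def pearson(a, b):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


# Three candidate predictors (illustrative values).
height_cm = [150.0, 160.0, 170.0, 180.0, 190.0]
height_in = [59.1, 63.0, 66.9, 70.9, 74.8]   # nearly the same information
age = [30.0, 22.0, 41.0, 25.0, 36.0]

r_heights = pearson(height_cm, height_in)
r_age = pearson(height_cm, age)
print(f"height_cm vs height_in: {r_heights:.3f}")
print(f"height_cm vs age: {r_age:.3f}")
```

Here the two height columns are almost perfectly correlated, so including both would make their individual effects hard to separate; dropping one of them is a typical remedy.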
Polynomial Regression is a form of linear regression in which the
relationship between the independent variable x and dependent variable y is
modeled as an nth-degree polynomial. Polynomial regression fits a nonlinear
relationship between the value of x and the corresponding conditional mean of
y, denoted E(y | x).
What is a Polynomial Regression?
 Some relationships are hypothesized by the researcher to be
curvilinear. Such cases call for a polynomial
term.
 Inspection of residuals. If we try to fit a linear model to curved data,
a scatter plot of residuals (Y-axis) against the predictor (X-axis) will have
patches of many positive residuals in the middle. In such a
situation, a linear model is not appropriate.
 An assumption in the usual multiple linear regression analysis is that
all the independent variables are independent of each other. In the
polynomial regression model, this assumption is not satisfied, because the
higher-order terms are functions of the same variable.
How does a Polynomial Regression work?
If we observe closely, we will realize that to evolve from linear regression
to polynomial regression we only need to add the higher-order terms of the
independent features to the feature space. The model remains linear in its
coefficients, which is why it can still be fitted like a linear regression.
This step is sometimes loosely described as feature engineering.
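The idea of adding higher-order terms can be sketched in a few lines. The example below is a minimal illustration, not a production implementation: the function names are hypothetical, the data is synthetic, and the fit uses the normal equations solved by Gaussian elimination, which is only one of several ways to do least squares.

```python
def polynomial_features(x, degree):
    """Expand one input value into [1, x, x^2, ..., x^degree]."""
    return [x ** p for p in range(degree + 1)]


def solve_linear_system(a, b):
    """Solve a * w = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Work on an augmented copy so the inputs are not modified.
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution.
    w = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * w[j] for j in range(i + 1, n))
        w[i] = (m[i][n] - s) / m[i][i]
    return w


def fit_polynomial(xs, ys, degree):
    """Ordinary least squares on the expanded features (normal equations)."""
    rows = [polynomial_features(x, degree) for x in xs]
    k = degree + 1
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, ys)) for i in range(k)]
    return solve_linear_system(xtx, xty)


# Synthetic data generated from y = 1 + 2x + 3x^2 (no noise).
xs = [-2.0, -1.0, 0.0, 1.0, 2.0, 3.0]
ys = [1 + 2 * x + 3 * x ** 2 for x in xs]

weights = fit_polynomial(xs, ys, degree=2)
print(weights)  # coefficients for the terms 1, x, x^2
```

Because the only change from linear regression is the extra columns produced by polynomial_features, the fitting step is the same least squares procedure.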
Application of Polynomial Regression
Polynomial regression has many use cases because a great deal of real-world
data is non-linear in nature. When we fit a curvilinear regression line to such
data, the results are often better than what we can achieve with standard
linear regression. Some of the use cases of polynomial regression are
stated below:
 The growth rate of tissues
 Progression of disease epidemics
 Distribution of carbon isotopes in lake sediments
Logistic Regression in Machine Learning
Logistic regression is a supervised machine learning algorithm mainly used for
classification tasks, where the goal is to predict the probability that an
instance belongs to a given class. It is a statistical algorithm that
analyzes the relationship between a set of independent variables and a
binary dependent variable. It is a powerful tool for decision-making, for
example deciding whether an email is spam or not.
Logistic Regression
Logistic regression is a supervised machine learning algorithm mainly used for
binary classification. It uses a logistic function, also known as a sigmoid
function, that takes the independent variables as input and produces a
probability value between 0 and 1. For example, with two classes, Class 0 and
Class 1, if the value of the logistic function for an input is greater than 0.5
(the threshold value) then the input belongs to Class 1; otherwise it belongs
to Class 0. It is referred to as regression because it is an extension of
linear regression, but it is mainly used for classification problems. The
difference between the two is that linear regression outputs a continuous value
that can be anything, while logistic regression predicts the probability that
an instance belongs to a given class.
Understanding Logistic Regression
It is used for predicting the categorical dependent variable using a given set of
independent variables.
 Logistic regression predicts the output of a categorical dependent
variable. Therefore the outcome must be a categorical or discrete
value.
 It can be either Yes or No, 0 or 1, True or False, etc., but instead of
giving the exact value as 0 or 1, it gives probabilistic values
which lie between 0 and 1.
 Logistic Regression is very similar to Linear Regression except in
how they are used. Linear Regression is used for solving
regression problems, whereas Logistic Regression is used for solving
classification problems.
 In Logistic regression, instead of fitting a regression line, we fit an
“S” shaped logistic function, whose output is bounded by two limiting
values (0 and 1).
 The curve from the logistic function indicates the likelihood of
something such as whether the cells are cancerous or not, a mouse is
obese or not based on its weight, etc.
 Logistic Regression is a significant machine learning algorithm
because it has the ability to provide probabilities and classify new
data using continuous and discrete datasets.
 Logistic Regression can be used to classify the observations using
different types of data and can easily determine the most effective
variables used for the classification.
Logistic Function (Sigmoid Function):
 The sigmoid function is a mathematical function used to map the
predicted values to probabilities.
 It maps any real value into another value within the range of 0 to 1.
The output of logistic regression must lie between 0 and 1 and
cannot go beyond these limits, so it forms a curve shaped like an “S”.
 The S-form curve is called the Sigmoid function or the logistic
function.
 In logistic regression, we use the concept of a threshold value to
turn the probability into a class label of 0 or 1. Values above the
threshold are assigned to class 1, and values below the threshold are
assigned to class 0.
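The sigmoid mapping and the threshold rule described above can be sketched as follows. This is a minimal illustration under stated assumptions: the weights, bias, and feature values are made up rather than learned from data, and predict_class is a hypothetical helper name. The sigmoid itself is the standard formula 1 / (1 + e^(-z)).

```python
import math


def sigmoid(z):
    """Logistic function 1 / (1 + e^(-z)), written to avoid overflow."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    exp_z = math.exp(z)
    return exp_z / (1.0 + exp_z)


def predict_class(features, weights, bias, threshold=0.5):
    """Return (probability of class 1, predicted label) for one input."""
    # Linear combination of the inputs, as in linear regression.
    z = bias + sum(w * x for w, x in zip(weights, features))
    probability = sigmoid(z)
    # Probabilities above the threshold are labelled 1, the rest 0.
    label = 1 if probability > threshold else 0
    return probability, label


# Hypothetical one-feature model: weight 1.5, bias -1.0.
for x in (0.0, 2.0):
    probability, label = predict_class([x], weights=[1.5], bias=-1.0)
    print(f"x = {x}: probability = {probability:.3f}, class = {label}")
```

The threshold of 0.5 is the usual default; it can be raised or lowered when one kind of misclassification is more costly than the other.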
Differences b/w Linear and Logistic Regression

Sr.No | Linear Regression | Logistic Regression
1 | Linear regression is used to predict a continuous dependent variable using a given set of independent variables. | Logistic regression is used to predict a categorical dependent variable using a given set of independent variables.
2 | It is used for solving regression problems. | It is used for solving classification problems.
3 | We predict the value of continuous variables. | We predict the values of categorical variables.
4 | We find the best-fit line. | We find the S-curve.
5 | The least squares estimation method is used to estimate the model parameters. | The maximum likelihood estimation method is used to estimate the model parameters.
6 | The output must be a continuous value, such as price, age, etc. | The output must be a categorical value, such as 0 or 1, Yes or No, etc.
7 | It requires a linear relationship between the dependent and independent variables. | It does not require a linear relationship between the dependent and independent variables, only between the independent variables and the log odds.
8 | Collinearity between the independent variables should be low, since it makes individual effects hard to determine. | There should be little or no collinearity between the independent variables.

Terminologies involved in Logistic Regression


 Independent variables: The input characteristics or predictor factors
used to predict the dependent variable.
 Dependent variable: The target variable in a logistic regression
model, which we are trying to predict.
 Logistic function: The formula used to represent how the
independent and dependent variables relate to one another. The
logistic function transforms the input variables into a probability
value between 0 and 1, which represents the likelihood of the
dependent variable being 1 or 0.
 Odds: The ratio of the probability of something occurring to the
probability of it not occurring. It is different from probability, which is
the ratio of something occurring to everything that could possibly occur.
 Log-odds: The log-odds, also known as the logit function, is the
natural logarithm of the odds. In logistic regression, the log odds of
the dependent variable are modeled as a linear combination of the
independent variables and the intercept.
 Coefficient: The logistic regression model’s estimated parameters,
which show how the independent and dependent variables relate to one
another.
 Intercept: A constant term in the logistic regression model, which
represents the log odds when all independent variables are equal to
zero.
 Maximum likelihood estimation: The method used to estimate the
coefficients of the logistic regression model, which maximizes the
likelihood of observing the data given the model.
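The terms odds, log-odds, and maximum likelihood can be made concrete with a short sketch. The function names and the sample probabilities below are illustrative assumptions, not part of any particular library. The log-likelihood shown is the quantity that maximum likelihood estimation tries to make as large as possible.

```python
import math


def odds(p):
    """Odds of an event with probability p: p / (1 - p)."""
    return p / (1.0 - p)


def log_odds(p):
    """Logit function: natural logarithm of the odds."""
    return math.log(odds(p))


def probability_from_log_odds(z):
    """Inverse of the logit: the logistic (sigmoid) function."""
    return 1.0 / (1.0 + math.exp(-z))


def log_likelihood(probabilities, labels):
    """Sum of log P(observed label) over all observations."""
    return sum(
        math.log(p) if y == 1 else math.log(1.0 - p)
        for p, y in zip(probabilities, labels)
    )


# A probability of 0.8 corresponds to odds of about 4 to 1.
print(odds(0.8), log_odds(0.8))

# Two hypothetical models scored on the same four observations.
labels = [1, 0, 1, 1]
confident_model = [0.9, 0.2, 0.8, 0.7]
uninformed_model = [0.5, 0.5, 0.5, 0.5]
print(log_likelihood(confident_model, labels))
print(log_likelihood(uninformed_model, labels))
```

The model whose predicted probabilities agree better with the observed labels gets the higher (less negative) log-likelihood, which is why the coefficients are chosen to maximize it.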

Assumptions for Logistic Regression


The assumptions for Logistic regression are as follows:
 Independent observations: Each observation is independent of the
others, meaning the outcome of one observation does not depend on another.
 Binary dependent variables: It takes the assumption that the
dependent variable must be binary or dichotomous, meaning it can
take only two values. For more than two categories softmax
functions are used.
 Linearity relationship between independent variables and log
odds: The relationship between the independent variables and the log
odds of the dependent variable should be linear.
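 No outliers: There should be no extreme outliers in the dataset, since
they can distort the estimated coefficients.
 Large sample size: The sample size should be sufficiently large for the
coefficient estimates to be reliable.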
Types of Logistic Regression
On the basis of the categories, Logistic Regression can be classified into three
types:
1. Binomial: In binomial Logistic regression, the dependent variable can
take only two possible values, such as 0 or 1, Pass or Fail,
etc.
2. Multinomial: In multinomial Logistic regression, the dependent
variable can take 3 or more possible unordered values, such as
“cat”, “dog”, or “sheep”.
3. Ordinal: In ordinal Logistic regression, the dependent variable can
take 3 or more possible ordered values, such as “Low”,
“Medium”, or “High”.
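For the multinomial case, the softmax function mentioned in the assumptions plays the role that the sigmoid plays in the binomial case. The sketch below is only an illustration: the class scores are made-up numbers standing in for the linear combinations a fitted model would produce, one per class.

```python
import math


def softmax(scores):
    """Turn one score per class into probabilities that sum to 1."""
    # Subtracting the maximum keeps the exponentials from overflowing.
    highest = max(scores)
    exps = [math.exp(s - highest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical scores for one input, one per unordered class.
classes = ["cat", "dog", "sheep"]
scores = [2.0, 1.0, 0.1]

probabilities = softmax(scores)
predicted = classes[probabilities.index(max(probabilities))]
for name, p in zip(classes, probabilities):
    print(f"{name}: {p:.3f}")
print("predicted class:", predicted)
```

With only two classes, softmax reduces to the sigmoid applied to the difference of the two scores, so the binomial type is a special case of this one.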
