
Machine Learning:

Regression

Instructor: Sabina Mammadova


Agenda
• Types of Machine Learning

• Supervised and Unsupervised Learning

• Regression vs Classification

• Simple and Multiple Linear Regression

• Assumptions of Linear Regression

• Evaluation
What is Machine Learning (ML)?
• Machine learning is the process of extracting knowledge from
data, combining elements of statistics, AI, and computer science.
It is widely used in daily life, from personalized recommendations
(Netflix, Amazon) to scientific research (DNA analysis, cancer
treatment).
• Earlier intelligent systems relied on manually coded rules ("if-
else" conditions), but these were limited in flexibility and
required expert knowledge. Machine learning, however, allows
models to learn patterns from data without explicit programming.
A key example is face detection, which was once unsolvable with
rule-based methods but is now achieved through ML algorithms
trained on large datasets.
What is Machine Learning (ML)?
Machine learning (ML) is a branch of artificial intelligence that
enables computers to learn patterns from data and make
predictions without explicit programming.
Application of Machine Learning
• Healthcare – Disease diagnosis, personalized treatments, drug discovery, patient outcome
prediction.
• Finance – Fraud detection, stock predictions, credit scoring, algorithmic trading.
• E-Commerce – Product recommendations, targeted ads, customer sentiment analysis, chatbots.
• Transportation – Self-driving cars, traffic prediction, predictive vehicle maintenance.
• Manufacturing – Quality control, supply chain optimization, automation, predictive maintenance.
• Cybersecurity – Threat detection, spam filtering, fraud prevention, malware analysis.
• Education – AI tutors, automated grading, student performance tracking.
• Agriculture – Crop monitoring, disease detection, smart irrigation, automated harvesting.
• Entertainment – Video/music recommendations, AI-generated content, face/speech recognition.
• Government – Smart cities, disaster prediction, surveillance, traffic flow optimization.
Machine Learning Algorithms
• Supervised Learning
  – Regression: Linear Regression, Polynomial Regression, Support Vector Regression, Decision Tree Regression, Random Forest Regression
  – Classification: Logistic Regression, K-Nearest Neighbors, Support Vector Machines, Decision Tree, Random Forest, Naïve Bayes
• Unsupervised Learning
  – Clustering: K-Means, Hierarchical, DBSCAN
  – Association Analysis: Apriori, FP-Growth
  – Dimensionality Reduction: PCA, LDA
• Reinforcement Learning: Q-Learning, Deep Q-Networks…
Difference between Supervised and Unsupervised Learning

Supervised Learning:
• Input data is labelled
• There is a training phase
• Data is modelled based on the training dataset
• Known number of classes (for classification)

Unsupervised Learning:
• Input data is unlabeled
• There is no training phase
• Uses properties of the given data for clustering
• Unknown number of classes
Supervised Learning:
Regression
Regression vs Classification
• Classification and Regression are both types of supervised machine learning tasks,
but they serve different purposes. Classification is used when the goal is to predict
discrete labels or categories. For example, you might want to predict whether an email is
"spam" or "not spam." The key here is that the output is categorical; it's about classifying
the input into one of several predefined classes. Common examples of classification
tasks include disease detection (predicting whether someone is healthy or sick), or
image recognition (e.g., classifying an image as a cat, dog, or bird). Some common
algorithms used for classification are Logistic Regression, Decision Trees, K-Nearest
Neighbors (KNN), and Support Vector Machines (SVM).
• On the other hand, Regression is used when the goal is to predict a continuous value. In
this case, you're predicting a real number rather than a category. For example, you might
want to predict the price of a house based on features like its size, location, and number
of bedrooms. The output here is a continuous value, such as a price or temperature.
Other examples of regression tasks include predicting stock prices or forecasting sales
figures. Algorithms typically used for regression include Linear Regression, Polynomial
Regression, and Decision Trees for Regression.
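To make the contrast concrete, here is a minimal sketch (not from the slides) that fits one model of each type on synthetic scikit-learn data; the dataset sizes and model choices are illustrative assumptions:

```python
# Hypothetical sketch: classification vs regression with scikit-learn.
from sklearn.datasets import make_classification, make_regression
from sklearn.linear_model import LinearRegression, LogisticRegression

# Classification: predict a discrete label (e.g., spam / not spam).
Xc, yc = make_classification(n_samples=200, n_features=4, random_state=0)
clf = LogisticRegression().fit(Xc, yc)
print(clf.predict(Xc[:3]))   # discrete class labels, e.g. [0 1 0]

# Regression: predict a continuous value (e.g., a price).
Xr, yr = make_regression(n_samples=200, n_features=4, random_state=0)
reg = LinearRegression().fit(Xr, yr)
print(reg.predict(Xr[:3]))   # real-valued predictions
```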
Linear Regression
• Linear Regression is a type of supervised learning algorithm
used for regression tasks, where the goal is to predict a
continuous value. It models the relationship between one or
more input features (independent variables) and a continuous
output (dependent variable) by fitting a linear equation to
observed data.
• Simple Linear Regression: one dependent and one independent variable
Y = β₀ + β₁X + ε
• Multiple Linear Regression: one dependent and many independent variables
Y = β₀ + β₁X₁ + β₂X₂ + … + βₙXₙ + ε
Simple Linear Regression
Simple Linear Regression is a statistical method used to model the relationship
between two variables: one independent variable (X) and one dependent variable (Y). The
goal is to find the best-fitting straight line (linear relationship) that predicts the value of Y
based on X.
The equation for the line is: Y = β₀ + β₁X + ε
Where:
• Y is the dependent variable (what we're predicting).
• X is the independent variable (the feature we use to predict Y).
• β₀ is the intercept (value of Y when X = 0).
• β₁ is the slope (how much Y changes for a one-unit change in X).
• ε is the error term (captures the difference between the predicted and actual values).
The model aims to find the best β₀ and β₁ values that minimize the error (using the least
squares method).
In short, Simple Linear Regression is about fitting a straight line to data to predict one
variable based on another.
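As an illustration, a minimal sketch (assuming synthetic data with a true intercept of 2 and slope of 3) that computes the least-squares estimates of β₀ and β₁ directly with NumPy:

```python
import numpy as np

# Synthetic data: y = 2 + 3x + noise (true β₀ = 2, β₁ = 3).
rng = np.random.default_rng(0)
x = rng.uniform(0, 10, 100)
y = 2 + 3 * x + rng.normal(0, 1, 100)

# Closed-form least-squares estimates of slope and intercept.
b1 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
b0 = y.mean() - b1 * x.mean()
print(f"intercept β₀ ≈ {b0:.2f}, slope β₁ ≈ {b1:.2f}")

y_pred = b0 + b1 * x   # predictions from the fitted line
```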
Multiple Linear Regression
Multiple Linear Regression is an extension of simple linear regression, where the goal
is to model the relationship between two or more independent variables (features) and a
single dependent variable (outcome). Instead of fitting a line to data, this approach fits a
hyperplane in a multi-dimensional space, accounting for the combined effect of several
predictors.
The equation for Multiple Linear Regression is: Y = β₀ + β₁X₁ + β₂X₂ + … + βₙXₙ + ε
Where:
• Y is the dependent variable (the value we're trying to predict).
• X₁, X₂, ..., Xₙ are the independent variables (features or predictors).
• β₀ is the intercept (the value of Y when all X variables are 0).
• β₁, β₂, ..., βₙ are the coefficients (weights that represent the impact of each
independent variable on Y).
• ε is the error term (captures the difference between the predicted and actual values).
• Example: when predicting house prices, the independent variables could include the number of bedrooms, square footage, and location. Multiple linear regression models the combined effect of all these factors to predict the price (Y), as sketched below.
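A minimal sketch of this house-price example with scikit-learn; the feature values and prices below are invented for illustration:

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Hypothetical features: [bedrooms, square footage, distance to center (km)].
X = np.array([[2, 80, 5.0],
              [3, 120, 3.5],
              [4, 150, 8.0],
              [3, 100, 2.0],
              [5, 200, 6.5]])
y = np.array([150_000, 220_000, 260_000, 210_000, 340_000])  # prices (made up)

model = LinearRegression().fit(X, y)
print("β₀ (intercept):", model.intercept_)
print("β₁..βₙ (coefficients):", model.coef_)
print("predicted price:", model.predict([[3, 110, 4.0]]))
```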
Train and Test Split
In machine learning, we need to evaluate how well a model generalizes to unseen data.
• Training Set: This subset of data is used to train the model. The model learns the
patterns in the data and adjusts its parameters to fit the training data.
• Test Set: This subset is used to evaluate the model’s performance. The model makes
predictions on the test set, and these predictions are compared with the actual
outcomes to assess accuracy, precision, and other performance metrics.
• Typical Split Ratios: A common split ratio is 70/30 or 80/20, meaning 70-80% of the data is used for training and the remaining 20-30% for testing. Sometimes a 60/40 or other split is used, depending on the amount of data available.
• Random Split: The data is usually shuffled randomly before splitting, so that both
training and test sets represent the overall distribution of the data.
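A minimal sketch of an 80/20 random split using scikit-learn's train_test_split; the synthetic dataset is an assumption for illustration:

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

X, y = make_regression(n_samples=500, n_features=3, noise=10, random_state=42)

# 80% for training, 20% for testing; the data is shuffled by default.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42)

model = LinearRegression().fit(X_train, y_train)      # learn on the training set
print("R² on unseen test data:", model.score(X_test, y_test))  # evaluate on the test set
```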
Assumptions of Linear Regression
Linearity: The independent and dependent variables have a linear relationship with one
another. This implies that changes in the dependent variable follow those in the independent
variable(s) in a linear fashion. This means that there should be a straight line that can be drawn
through the data points. If the relationship is not linear, then linear regression will not be an
accurate model.
Independence: The observations in the dataset are independent of each other. This means
that the value of the dependent variable for one observation does not depend on the value of
the dependent variable for another observation. If the observations are not independent, then
linear regression will not be an accurate model.
Homoscedasticity: Across all levels of the independent variable(s), the variance of the errors is constant. This means that the value of the independent variable(s) has no impact on the variance of the errors. If the variance of the residuals is not constant, then linear regression will
not be an accurate model.
Normality: The residuals should be normally distributed. This means that the residuals should
follow a bell-shaped curve. If the residuals are not normally distributed, then linear regression
will not be an accurate model.
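These assumptions are typically checked on the residuals of a fitted model. Below is a minimal sketch of two common visual checks on synthetic data: residuals vs. fitted values (for linearity and homoscedasticity) and a residual histogram (for normality):

```python
import matplotlib.pyplot as plt
import numpy as np
from sklearn.linear_model import LinearRegression

# Synthetic data that satisfies the assumptions by construction.
rng = np.random.default_rng(1)
X = rng.uniform(0, 10, (200, 1))
y = 4 + 2.5 * X[:, 0] + rng.normal(0, 1, 200)

model = LinearRegression().fit(X, y)
residuals = y - model.predict(X)

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(10, 4))
# Residuals vs. fitted: a random, even band suggests linearity and homoscedasticity.
ax1.scatter(model.predict(X), residuals, s=10)
ax1.axhline(0, color="red")
ax1.set(xlabel="Fitted values", ylabel="Residuals")
# Histogram: a roughly bell-shaped histogram suggests normally distributed residuals.
ax2.hist(residuals, bins=20)
ax2.set(xlabel="Residual", ylabel="Count")
plt.show()
```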
Multicollinearity
Multicollinearity is a statistical phenomenon where two or more independent variables in a
multiple regression model are highly correlated, making it difficult to assess the individual
effects of each variable on the dependent variable.

There are two common techniques for detecting multicollinearity:

Correlation Matrix: Examining the correlation matrix among the independent variables is a
common way to detect multicollinearity. High correlations (close to 1 or -1) indicate potential
multicollinearity.
VIF (Variance Inflation Factor): VIF is a measure that quantifies how much the variance of an
estimated regression coefficient increases if your predictors are correlated. A high VIF (typically
above 10) suggests multicollinearity.
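A minimal sketch of both detection techniques on synthetic data, using pandas for the correlation matrix and statsmodels' variance_inflation_factor for VIF; the deliberately collinear variables are an illustrative assumption:

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm
from statsmodels.stats.outliers_influence import variance_inflation_factor

# Synthetic predictors where x3 is nearly a copy of x1 (deliberate collinearity).
rng = np.random.default_rng(0)
x1 = rng.normal(size=200)
x2 = rng.normal(size=200)
x3 = x1 + rng.normal(scale=0.05, size=200)
X = pd.DataFrame({"x1": x1, "x2": x2, "x3": x3})

# 1) Correlation matrix: values close to 1 or -1 flag potential multicollinearity.
print(X.corr().round(2))

# 2) VIF per predictor (a constant column is added so VIFs are not artificially inflated).
Xc = sm.add_constant(X)
for i, col in enumerate(Xc.columns):
    if col != "const":
        print(col, round(variance_inflation_factor(Xc.values, i), 1))
```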
Model Evaluation:
Regression
Evaluation: Mean Squared Error (MSE)
Mean squared error (MSE) measures error in statistical models as the average squared difference between observed and predicted values:

MSE = (1/n) Σᵢ₌₁ⁿ (yᵢ − ŷᵢ)²

Where:
• ŷᵢ is the predicted value from the model.
• yᵢ is the actual value of the target variable.
• n is the number of data points.

A lower MSE indicates better model performance. However, since MSE is in squared units of the target variable, it can be difficult to interpret directly.
Evaluation: Root Mean Square Error (RMSE)

• RMSE is the square root of MSE: RMSE = √((1/n) Σᵢ (yᵢ − ŷᵢ)²). By taking the square root, we bring the units of error back to the original units of the target variable, making RMSE easier to interpret.
• A lower RMSE indicates better model performance, and it’s directly
comparable to the original units of the target. RMSE is sensitive to large
errors (outliers), so it might penalize models with big mistakes more
heavily than other metrics.
Evaluation: Mean Absolute Error (MAE)

• MAE measures the average of the absolute differences between the actual and predicted values: MAE = (1/n) Σᵢ |yᵢ − ŷᵢ|. Unlike MSE, it doesn't square the errors, so it's less sensitive to large errors.
• MAE provides a straightforward and interpretable measure of prediction error in the original units of the target variable. It's a good metric when you care equally about all errors, regardless of their size. A lower MAE indicates better model performance.
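A minimal sketch computing the three error metrics above, both directly with NumPy and via scikit-learn's metrics; the actual and predicted values are made up for illustration:

```python
import numpy as np
from sklearn.metrics import mean_absolute_error, mean_squared_error

y_true = np.array([3.0, 5.0, 7.5, 10.0])   # actual values (made up)
y_pred = np.array([2.5, 5.5, 7.0, 11.0])   # model predictions (made up)

mse = np.mean((y_true - y_pred) ** 2)      # average squared error
rmse = np.sqrt(mse)                        # back in the target's original units
mae = np.mean(np.abs(y_true - y_pred))     # average absolute error
print(mse, rmse, mae)

# Cross-check against scikit-learn's implementations:
print(mean_squared_error(y_true, y_pred), mean_absolute_error(y_true, y_pred))
```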
Evaluation: R-Squared

• R² tells us how well the model explains the variance in the target variable. It
represents the proportion of the variance in the target variable that’s
explained by the model.
• R² typically ranges from 0 to 1. A higher R² indicates that the model explains a larger proportion of the variance, which usually means a better model.
• An R² of 0 means the model explains none of the variance, and R² = 1 means
the model explains all the variance perfectly.
• Negative R² values can occur when the model performs worse than a simple
mean-based model.
Evaluation: Adjusted R-Squared

• Adjusted R² adjusts the R² value to account for the number of predictors in the model: Adjusted R² = 1 − (1 − R²)(n − 1)/(n − p − 1), where n is the number of observations and p is the number of predictors. It penalizes the addition of irrelevant predictors.
• Higher Adjusted R² values indicate that the model better explains the variance in the target, while also accounting for the number of predictors.
• A drop in Adjusted R² when a predictor is added signals that the predictor is unnecessary and the model may be overfitting.
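A minimal sketch computing R² and Adjusted R² on synthetic data, using the standard formula Adjusted R² = 1 − (1 − R²)(n − 1)/(n − p − 1); the data and true coefficients are illustrative assumptions:

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score

# Synthetic data: n = 100 samples, p = 3 predictors with known true weights.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))
y = 1 + X @ np.array([2.0, -1.0, 0.5]) + rng.normal(size=100)

model = LinearRegression().fit(X, y)
r2 = r2_score(y, model.predict(X))

n, p = X.shape
adj_r2 = 1 - (1 - r2) * (n - 1) / (n - p - 1)   # penalizes extra predictors
print(f"R² = {r2:.3f}, Adjusted R² = {adj_r2:.3f}")
```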
Thank you!
