Mod 3

The document discusses various regression techniques including Ridge and Lasso Regression, highlighting their differences in regularization type, feature selection, and use cases. It also explains Least-Squares Regression for classification, its methodology, advantages, and disadvantages. Additionally, the document covers Support Vector Machines, detailing their functionality, advantages, disadvantages, and applications.


Differentiate between Ridge and Lasso Regression

Characteristic | Ridge Regression | Lasso Regression
---|---|---
Regularization Type | Applies L2 regularization, adding a penalty term proportional to the square of the coefficients. | Applies L1 regularization, adding a penalty term proportional to the absolute value of the coefficients.
Feature Selection | Does not perform feature selection. All predictors are retained, although their coefficients are reduced in size to minimize overfitting. | Performs automatic feature selection. Less important predictors are completely excluded by setting their coefficients to zero.
When to Use | Best suited for situations where all predictors are potentially relevant and the goal is to reduce overfitting rather than eliminate features. | Ideal when you suspect that only a subset of predictors is important, and the model should focus on those while ignoring the irrelevant ones.
Output Model | Produces a model that includes all features, but their coefficients are smaller in magnitude to prevent overfitting. | Produces a simpler model, retaining only the most significant features and ignoring the rest by setting their coefficients to zero.
Impact on Prediction | Reduces the magnitude of coefficients, shrinking them towards zero, but never exactly to zero; all predictors remain in the model. | Shrinks some coefficients to exactly zero, effectively removing their influence from the model. This leads to a simpler model with fewer features.
Computation | Generally faster, as it doesn't involve feature selection. | May be slower due to the feature selection process.
Example Use Case | Use when you have many predictors that all contribute to the outcome (e.g., predicting house prices, where features like size, location, etc., all matter). | Use when you believe only some predictors are truly important (e.g., genetic studies where only a few genes out of thousands are relevant).
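To make the contrast concrete, here is a minimal scikit-learn sketch (the library, toy data, and alpha values are illustrative choices, not from these notes) showing that Lasso zeroes out coefficients while Ridge only shrinks them:

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import Ridge, Lasso

# Toy data: only 10 of the 50 features actually matter
X, y = make_regression(n_samples=100, n_features=50,
                       n_informative=10, noise=5.0, random_state=0)

ridge = Ridge(alpha=1.0).fit(X, y)  # L2 penalty: shrinks all coefficients
lasso = Lasso(alpha=1.0).fit(X, y)  # L1 penalty: sets some to exactly zero

print("Ridge zero coefficients:", (ridge.coef_ == 0).sum())  # typically 0
print("Lasso zero coefficients:", (lasso.coef_ == 0).sum())  # typically > 0
```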

Find a linear regression equation for the following two sets of data:

x y
3 12
5 18
7 24
9 30
We fit a line of the form:

y = mx + c

where:

• m is the slope (regression coefficient)

• c is the y-intercept

Here x̄ = 6 and ȳ = 21.

x | y | x − x̄ | y − ȳ | (x − x̄)(y − ȳ) | (x − x̄)²
---|---|---|---|---|---
3 | 12 | −3 | −9 | 27 | 9
5 | 18 | −1 | −3 | 3 | 1
7 | 24 | 1 | 3 | 3 | 1
9 | 30 | 3 | 9 | 27 | 9

Summing the last two columns gives Σ(x − x̄)(y − ȳ) = 60 and Σ(x − x̄)² = 20, so

m = 60 / 20 = 3 and c = ȳ − m·x̄ = 21 − 3(6) = 3

giving the regression line y = 3x + 3.
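A minimal NumPy sketch of the same computation (the library choice is illustrative, not part of the original worked example):

```python
import numpy as np

x = np.array([3, 5, 7, 9])
y = np.array([12, 18, 24, 30])

# Normal-equation formulas for simple linear regression
m = ((x - x.mean()) * (y - y.mean())).sum() / ((x - x.mean()) ** 2).sum()
c = y.mean() - m * x.mean()
print(m, c)  # 3.0 3.0, i.e. y = 3x + 3
```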

Explain Least-Squares Regression for classification.

Least-squares regression is a method that finds a straight line (or a surface in higher
dimensions) that best fits the data by minimizing the total squared difference between the
predicted and actual values.

While it's designed for predicting numbers, it can also be adapted for classification, especially
in binary classification problems.

How it’s used for classification

Even though classification is about predicting categories (like yes/no, spam/not spam), we can
use least-squares regression by:

1. Converting class labels into numbers:
For example, assign 0 to one class and 1 to the other.

2. Training the model:
The algorithm finds a line (or boundary) that tries to fit these numerical labels as if they
were continuous values.

3. Making predictions:
For a new input, the model gives a number—possibly between 0 and 1, or even outside
that range.

4. Classifying using a threshold:
If the predicted number is greater than a certain cutoff (usually 0.5), it is classified as one class (say 1); otherwise, the other (say 0).

Example

Let’s say you want to classify emails as spam or not spam. You label spam as 1, not spam as 0.
The regression model is trained to predict values close to these numbers. When a new email
comes in, if the model gives a score like 0.8, you label it as spam. If it gives 0.2, it's not spam.
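A minimal sketch of this approach using NumPy's least-squares solver (the toy data and the 0.5 cutoff are illustrative assumptions):

```python
import numpy as np

# Toy binary data: one feature, labels encoded as 0/1
X = np.array([[1.0], [2.0], [3.0], [6.0], [7.0], [8.0]])
y = np.array([0, 0, 0, 1, 1, 1])

# Add a bias column and fit the labels by least squares
X_b = np.hstack([X, np.ones((len(X), 1))])
w, *_ = np.linalg.lstsq(X_b, y, rcond=None)

# Continuous scores, then threshold at 0.5 to get class labels
scores = X_b @ w
print((scores > 0.5).astype(int))  # predicted classes
```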

Pros

• Simple and quick to implement.

• Easy to understand.

• Works well when the data is simple and well separated.

Cons

• Not built specifically for classification.

• Can give results outside the expected range (like less than 0 or more than 1).

• Not good at handling more complex or non-linear patterns in data.

• Doesn't perform as well as classification-specific models like logistic regression.

Find the least-squares regression line Y = aX + b. Estimate Y when X = 10.

x y
0 2
1 3
2 5
3 4
4 6
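Solution sketch: here x̄ = 2 and ȳ = 4, so a = Σ(x − x̄)(y − ȳ) / Σ(x − x̄)² = 9/10 = 0.9 and b = ȳ − a·x̄ = 4 − 0.9(2) = 2.2, giving Y = 0.9X + 2.2. At X = 10, the estimate is Y = 0.9(10) + 2.2 = 11.2.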

Write a short note on (a) Multivariate Regression

(b) Regularized Regression.

Write a short note on:

A. Least-Squares Regression for classification

B. Differentiate between Ridge and Lasso Regression

Discuss Support Vector Machines.

Support Vector Machine (SVM) is a powerful supervised machine learning algorithm used for
both classification and regression tasks, though it is more commonly used for classification. The
key idea behind SVM is to find the optimal separating boundary (called a hyperplane) that best
divides the dataset into different classes.

In a two-dimensional space, this boundary is simply a line. In higher dimensions, it becomes a hyperplane. The goal of SVM is to choose this hyperplane in such a way that it maximizes the margin, which is the distance between the hyperplane and the nearest data points of each class. These nearest points are called support vectors, and they are crucial in defining the position and orientation of the hyperplane.

If the data is not linearly separable, SVM uses a method called the kernel trick, which
transforms the input features into a higher-dimensional space where a linear separation is
possible. This makes SVM highly flexible and capable of handling complex, non-linear data as
well.

SVM is especially effective in high-dimensional spaces and is often used when the number of
features exceeds the number of samples.

How It Works

• For classification: SVM identifies the boundary (hyperplane) that best separates
different classes. The closest data points to this boundary are called support vectors.

• For regression: SVM tries to fit the best possible line (or surface) within a certain margin,
allowing some flexibility for prediction.

• For non-linear data: SVM uses the kernel trick (like RBF or polynomial kernels) to transform data into a higher dimension where it becomes linearly separable, as in the sketch below.
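A minimal scikit-learn sketch of an RBF-kernel SVM on non-linearly separable toy data (the library, dataset, and parameter choices are illustrative assumptions):

```python
from sklearn.datasets import make_moons
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Two interleaving half-moons: not separable by a straight line
X, y = make_moons(n_samples=200, noise=0.2, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# The RBF kernel implicitly maps the data to a higher-dimensional space
clf = SVC(kernel="rbf", C=1.0, gamma="scale").fit(X_train, y_train)
print("Test accuracy:", clf.score(X_test, y_test))
print("Support vectors found:", len(clf.support_vectors_))
```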

Advantages

• Works well with high-dimensional and small datasets

• Can handle non-linear decision boundaries

• Robust to noise

• Good generalization and avoids overfitting

• Applicable in classification and regression

Disadvantages

• Slow and memory-intensive for large datasets

• Choice of kernel and parameters is critical and can be tricky

• No probabilistic output

• Doesn’t handle missing values well

• Not naturally suited for multi-class classification (requires special techniques)

Applications

• Face recognition

• Text classification

• Bioinformatics (e.g., DNA analysis, protein classification)

• Handwriting recognition

• Speech recognition

• Facial expression detection

• Predictive control systems

Find a linear regression equation for the following two sets of data:

x y
5 40
7 120
12 180
16 210
20 240
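Solution sketch: x̄ = 12 and ȳ = 158, so m = Σ(x − x̄)(y − ȳ) / Σ(x − x̄)² = 1880/154 ≈ 12.21 and c = ȳ − m·x̄ = 158 − 12.21(12) ≈ 11.5, giving y ≈ 12.21x + 11.5.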
Write a short note on (a) Multivariate Regression and (b) Regularized Regression.

Write a short note on:

A. Least-Squares Regression for classification

B. Ridge and Lasso Regression
