Machine Learning Fundamentals
Feature Selection Techniques
Goal: To find the set of features that allows us to build the best-performing models of the
studied phenomenon.
Task: Classification problem where we aim to predict whether a given email is spam.
Dataset features:
Length of the email
Number of exclamation marks
Presence of certain keywords (e.g., "free", "offer", "discount")
Number of spelling errors
Presence of attachments
Use of capital letters
Which of these should be selected as good features for this particular task?
Simple Ways to Find Good Features
Correlation Analysis: We can analyze the correlation between each feature and the target
variable (spam or not spam). Features with high correlation are likely to be more
informative. For example, if emails containing certain keywords are more likely to be spam,
then the presence of those keywords would be a relevant feature.
Feature Importance: Train a model (such as a decision tree or a random forest) and examine
the feature importances provided by the model. Features with higher importance scores
contribute more to the predictive performance of the model and are thus more relevant.
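Both approaches can be sketched for the correlation case with plain NumPy. The feature names and values below are made up purely for illustration; the point is that a feature which tracks the spam label (here, a hypothetical "free" keyword flag) shows a much stronger correlation than one that does not:

```python
import numpy as np

# Hypothetical toy dataset: each row of a column is one email.
# Feature names and values are invented for illustration only.
features = {
    "num_exclamations": np.array([5, 0, 7, 1, 9, 0, 6, 0]),
    "has_free_keyword": np.array([1, 0, 1, 0, 1, 0, 1, 0]),
    "email_length":     np.array([120, 340, 90, 410, 75, 300, 110, 380]),
}
is_spam = np.array([1, 0, 1, 0, 1, 0, 1, 0])  # target: 1 = spam, 0 = not spam

# Pearson correlation of each candidate feature with the target
for name, values in features.items():
    r = np.corrcoef(values, is_spam)[0, 1]
    print(f"{name}: r = {r:+.2f}")
```

In this toy data the keyword flag correlates perfectly with the label, the exclamation count correlates strongly, and email length correlates negatively, which is exactly the kind of ranking correlation analysis is meant to surface.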
Hinge Loss: Correct Classification with Low Confidence
Here, although the model has classified the data point correctly, we still penalize it
because the classification was not made with enough confidence: the classification score
satisfies |f(x)| < 1, so y·f(x) < 1 and the loss is positive.
Hinge Loss: Incorrect Classification
In this case, exactly one of y or f(x) will be negative, so the product y·f(x) will always
be negative.
The loss function max(0, 1 − y·f(x)) will therefore always take the value given by
1 − y·f(x).
The loss increases linearly as y·f(x) becomes more negative, i.e., the more confidently
wrong the prediction is.
Hinge Loss: Worked Examples
Margin = 0.22
Case 2: f(x) = 0.150, y = +1: max(0, 0.22 − 0.150) = max(0, 0.07); L = 0.07
Case 3: f(x) = −0.24, y = +1: max(0, 0.22 − (+1 × (−0.24))) = max(0, 0.22 + 0.24) = max(0, 0.46); L = 0.46
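These worked examples can be checked with a small hinge-loss function. The standard hinge loss uses a margin of 1; the margin of 0.22 used in the cases above is passed in explicitly:

```python
def hinge_loss(y, fx, margin=1.0):
    """Hinge loss max(0, margin - y*f(x)).

    The standard formulation uses margin = 1.0; the worked examples
    above use margin = 0.22.
    """
    return max(0.0, margin - y * fx)

# Case 2: correct but low-confidence classification -> small positive loss
print(hinge_loss(+1, 0.150, margin=0.22))
# Case 3: incorrect classification -> larger loss
print(hinge_loss(+1, -0.24, margin=0.22))
# Confident correct classification beyond the margin -> zero loss
print(hinge_loss(+1, 2.0))
```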
Mini-Batch GD with Momentum
Momentum is an optimization technique that accelerates the
optimization process by adding a fraction of the previous update to the
current update.
MBGD with Momentum: Steps
Initialize the model parameters and the momentum term to zero.
Divide the training dataset into mini-batches.
For each mini-batch:
Perform a forward pass to compute predictions.
Calculate the loss and its gradients with respect to the mini-batch.
Update the momentum term using the current gradients and the momentum hyperparameter.
Update the model's parameters using the momentum-adjusted gradient updates.
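The steps above can be sketched as a minimal NumPy implementation. The toy linear-regression data, learning rate, and momentum coefficient are illustrative assumptions, not values from the slides:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy regression data: y = 3x + noise (true parameter w* = 3.0)
X = rng.normal(size=(256, 1))
y = 3.0 * X[:, 0] + 0.1 * rng.normal(size=256)

w = np.zeros(1)       # model parameter, initialized to zero
v = np.zeros(1)       # momentum term, initialized to zero
lr, beta = 0.1, 0.9   # learning rate and momentum hyperparameter
batch_size = 32       # dataset is divided into mini-batches of this size

for epoch in range(20):
    idx = rng.permutation(len(X))
    for start in range(0, len(X), batch_size):
        b = idx[start:start + batch_size]
        pred = X[b] @ w                              # forward pass
        grad = 2 * X[b].T @ (pred - y[b]) / len(b)   # MSE gradient on the mini-batch
        v = beta * v + grad                          # update the momentum term
        w = w - lr * v                               # momentum-adjusted parameter update

print(w)  # should be close to the true slope 3.0
```

Each update blends the current gradient with an exponentially decaying sum of past gradients, which is what accelerates progress along directions where gradients consistently agree.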
Linear Regression
Exploration of the linear relation between two variables:
Y = mX + B
A slope of 2 means that every 1-unit change in X yields a 2-unit change in Y.
There are two types of variables:
Independent (predictor) variable X
Dependent (outcome) variable Y
Regression Equation
Expected value of Y at a given x: E(Y | x) = β0 + β1x
Predicted value for an individual: yi = β0 + β1xi + εi, where εi is a random error term
Assumptions for Linear Regression
Linear regression assumes that…
1. The relationship between X and Y is linear
2. Y is distributed normally at each value of X
3. The variance of Y at every value of X is the same (homogeneity of
variances/ homoscedasticity)
4. The observations are independent
Homoscedasticity
The standard error of Y given X is the average variability around the
regression line at any given value of X. It is assumed to be equal at
all values of X.
Types of Linear Regression
Simple Linear Regression: a single independent variable is used to predict the value of
a numerical dependent variable.
Multiple Linear Regression: more than one independent variable is used to predict the
value of a numerical dependent variable.
Small or no multicollinearity between the features: multicollinearity means high
correlation between the independent variables.
Types of Regression Line
A straight line showing the relationship between the dependent and independent variables is
called a regression line.
Positive Linear Relationship: if the dependent variable increases on the Y-axis as the
independent variable increases on the X-axis, then such a relationship is termed a positive
linear relationship.
Negative Linear Relationship: if the dependent variable decreases on the Y-axis as the
independent variable increases on the X-axis, then such a relationship is called a negative
linear relationship.
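A regression line of either type can be fitted by ordinary least squares. The sketch below uses toy data with an assumed true slope of 2 and intercept of 5 (a positive linear relationship); the fitted coefficients recover them closely:

```python
import numpy as np

# Toy data with a known positive linear relationship: Y = 2X + 5 + noise
rng = np.random.default_rng(1)
X = rng.uniform(0, 10, size=50)
Y = 2.0 * X + 5.0 + rng.normal(scale=0.5, size=50)

# Ordinary least squares fit of Y = mX + B
m, b = np.polyfit(X, Y, deg=1)
print(f"slope = {m:.2f}, intercept = {b:.2f}")
```

A positive fitted slope indicates a positive linear relationship; negating the slope of the generating process would produce a negative one.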