ML Unit 3
Overfitting:- It has very low training error. Occurs when the model is too complex. Leads to high variance and low bias. Might occur when there are too many features. Perform more regularisation. For overfitting, the training error is much lower than the test error.
Underfitting:- It has a high training error. Occurs when the model is too simple. Leads to low variance and high bias. Might occur when there are too few features. Perform less regularisation. For underfitting, the training error is usually similar to the test error (both are high).
Q2) What is Underfitting & Overfitting? Techniques to reduce underfitting and overfitting
1) Underfitting- happens when a model is too simple to capture the underlying patterns in the data, leading to poor performance on both training and test data. 2) Overfitting- occurs when a model learns the training data too well, including its noise and details, resulting in poor generalization to unseen data. The model performs well on training data but poorly on test data.
Techniques to reduce Underfitting- 1) Increase model complexity- use a more complex algorithm (e.g. switching from linear regression to a decision tree). 2) Add more features- include relevant features that help the model learn better. 3) Train longer- allow the model to train for more epochs or iterations. 4) Tune hyperparameters- adjust the learning rate or other parameters to improve the model's capacity to learn.
Techniques to reduce Overfitting- 1) Use regularization- techniques like L1 (Lasso) or L2 (Ridge) regularization add a penalty for overly complex models (see the sketch after this list). 2) Simplify the model- use a less complex model so that it does not capture noise in the data. 3) Increase training data- provide more diverse examples so the model learns general patterns. 4) Apply dropout- in neural networks, randomly drop some neurons during training to prevent over-reliance on specific paths.
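A minimal sketch of the regularization remedy, assuming scikit-learn and NumPy are available (the synthetic data, the degree-9 model and the alpha value are illustrative choices, not from the notes): a very flexible polynomial model is fit with and without an L2 (Ridge) penalty, and the train/test errors are compared.

```python
import numpy as np
from sklearn.linear_model import LinearRegression, Ridge
from sklearn.metrics import mean_squared_error
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

rng = np.random.RandomState(0)
X = np.sort(rng.uniform(0, 1, 30)).reshape(-1, 1)
y = np.sin(2 * np.pi * X).ravel() + rng.normal(0, 0.2, 30)   # noisy non-linear data
X_tr, y_tr, X_te, y_te = X[::2], y[::2], X[1::2], y[1::2]    # simple train/test split

for name, reg in [("no regularization", LinearRegression()),
                  ("L2 (Ridge) penalty", Ridge(alpha=0.01))]:
    model = make_pipeline(PolynomialFeatures(degree=9), reg)  # deliberately complex model
    model.fit(X_tr, y_tr)
    train_mse = mean_squared_error(y_tr, model.predict(X_tr))
    test_mse = mean_squared_error(y_te, model.predict(X_te))
    print(f"{name}: train MSE={train_mse:.3f}  test MSE={test_mse:.3f}")
```

With these illustrative settings the unregularized fit typically shows a much lower training error than test error (overfitting), while the Ridge penalty narrows the gap.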
Q3) Bias and Variance trade-off
Bias- refers to the error caused by a model being too simple and not capturing the underlying pattern in the data. E.g. a linear model trying to fit complex non-linear data will have high bias. Variance- refers to the error caused by a model being too complex and overly sensitive to small fluctuations in the training data. In this case the model performs well on training data but poorly on new data.
Trade-off- the trade-off between bias and variance occurs because reducing bias typically increases variance, and reducing variance typically increases bias. High bias, low variance: the model is simple, but it might not capture the complexity of the data. Low bias, high variance: the model is complex and may overfit the training data. Low bias, low variance: the ideal ML model, which is not practically possible. High bias, high variance: predictions are inconsistent and also inaccurate on average. The goal is to find a model that performs well on both the training data and new data.
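A small sketch of the trade-off, again assuming scikit-learn/NumPy (the data and the chosen degrees are illustrative): sweeping the polynomial degree moves the model from high bias (both errors high) towards high variance (training error small, test error large).

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

rng = np.random.RandomState(1)
X = np.sort(rng.uniform(0, 1, 40)).reshape(-1, 1)
y = np.sin(2 * np.pi * X).ravel() + rng.normal(0, 0.2, 40)
X_tr, y_tr, X_te, y_te = X[::2], y[::2], X[1::2], y[1::2]

for degree in (1, 4, 15):                    # too simple, balanced, too complex
    model = make_pipeline(PolynomialFeatures(degree), LinearRegression())
    model.fit(X_tr, y_tr)
    tr = mean_squared_error(y_tr, model.predict(X_tr))
    te = mean_squared_error(y_te, model.predict(X_te))
    print(f"degree={degree:2d}  train MSE={tr:.3f}  test MSE={te:.3f}")
```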
Q4) Explain Lasso and Ridge Regression
Lasso and Ridge regression are techniques used in ML to improve the performance of linear regression models, especially when there are too many input features.
1) Lasso (Least Absolute Shrinkage & Selection Operator)- Lasso adds a penalty equal to the absolute value of the coefficients to the cost function. The penalty term, called L1 regularisation, not only reduces the size of the coefficients but can also shrink some coefficients to zero. This means Lasso can effectively perform feature selection by keeping only the most important features in the model while discarding irrelevant ones. Formula- Cost Function = MSE + λ Σ |B|. E.g. if a dataset has many features but only a few are relevant, Lasso will identify and keep only the key features.
2) Ridge Regression- Ridge adds a penalty proportional to the squared value of the coefficients. This penalty term, called L2 regularization, helps reduce the size of the coefficients, making the model less sensitive to small fluctuations in the training data. Ridge is useful when all features are important but need to be scaled down to avoid overfitting. However, it does not shrink any coefficient exactly to zero, so it keeps all features in the model. Formula- Cost Function = MSE + λ Σ B².
Q5) Difference between Lasso and Ridge Regression
1) Lasso Regression- 1) Adds an L1 penalty (sum of absolute values of coefficients). 2) Shrinks some coefficients to exactly zero (feature selection). 3) Produces sparse models (useful for interpretability). 4) Works well when only a few features are important. 5) Sensitive to small data changes, which can affect feature selection. 6) May underperform when features are highly correlated.
2) Ridge Regression- 1) Adds an L2 penalty (sum of squared values of coefficients). 2) Shrinks coefficients towards zero but never exactly zero. 3) Retains all features, avoiding feature elimination. 4) Works well when features are highly correlated. 5) Produces stable models, less sensitive to data changes. 6) Prefers models with all features contributing to the prediction.
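A short sketch of the key difference in the comparison above, assuming scikit-learn/NumPy (the synthetic data and the alpha values are illustrative): on data where only two of eight features matter, Lasso drives the irrelevant coefficients to exactly zero while Ridge only shrinks them.

```python
import numpy as np
from sklearn.linear_model import Lasso, Ridge

rng = np.random.RandomState(0)
X = rng.normal(size=(200, 8))
true_coef = np.array([5.0, -3.0, 0, 0, 0, 0, 0, 0])   # only the first two features matter
y = X @ true_coef + rng.normal(0, 0.5, 200)

lasso = Lasso(alpha=0.5).fit(X, y)   # L1 penalty -> sparse coefficients (feature selection)
ridge = Ridge(alpha=0.5).fit(X, y)   # L2 penalty -> all coefficients kept, just shrunk

print("Lasso:", np.round(lasso.coef_, 2))
print("Ridge:", np.round(ridge.coef_, 2))
```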
Q6) Explain Regression- Regression is a type of supervised machine learning used to predict a continuous output or numerical value based on input data. It involves finding the relationship between input variables (independent variables) and an output variable (dependent variable). Regression models estimate the value of the output variable based on patterns learned from training data. Example- suppose you want to predict the price of a house based on factors like its size, location, number of bedrooms, etc. Using regression, you can train a model on historical data of houses, prices, and other associated features. The model will learn the patterns and relationships in the data and can predict the price of a new house based on similar inputs. Regression is widely used in fields such as predicting sales, stock prices, or even the weather. The two basic types of regression are linear regression and multiple linear regression.
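A minimal sketch of the house-price example, assuming scikit-learn (the tiny dataset of sizes, bedroom counts and prices is made up purely for illustration):

```python
from sklearn.linear_model import LinearRegression

# toy historical data: [size in sq. ft, number of bedrooms] -> price (illustrative units)
X = [[600, 1], [850, 2], [1000, 2], [1200, 3], [1500, 3], [1800, 4]]
y = [30, 45, 52, 65, 80, 95]

model = LinearRegression().fit(X, y)      # learn the pattern from historical houses
print(model.predict([[1100, 2]]))         # predict the price of a new, unseen house
```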
Q7) Linear Regression- Linear regression is a simple and widely used ML algorithm for predicting a continuous output based on one or more input variables. It assumes the relationship between the input variable and the output variable can be represented as a straight line. In simple linear regression there is one input variable and one output variable, and the relationship is expressed as y = mx + c. Here, y is the predicted value, x is the input variable, m is the slope of the line and c is the intercept (the value of y when x = 0).
Q8) Evaluation Metrics-
1) Mean Absolute Error (MAE)- MAE measures the average absolute difference between the predicted values and the actual values of a regression model. It tells us how much, on average, the predictions deviate from the true values. A lower MAE indicates better model performance.
2) Root Mean Squared Error (RMSE)- RMSE is a metric used to evaluate the accuracy of a regression model. It is the square root of the average of the squared differences between predicted and actual values. RMSE gives more weight to larger errors because it squares the errors before averaging, making it more sensitive to outliers compared to MAE. A lower RMSE indicates better model performance.
3) R² (coefficient of determination)- a metric that indicates how well the regression model explains the variability of the target variable. It represents the proportion of variance in the target explained by the features. The value ranges from 0 to 1 (or negative if the model performs worse than a simple mean prediction). 1 indicates a perfect fit; 0 means no explanatory power.
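A quick sketch of all three metrics on a handful of made-up predictions (NumPy and scikit-learn assumed; the y_true/y_pred values are illustrative only):

```python
import numpy as np
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score

y_true = np.array([3.0, 5.0, 7.0, 9.0])
y_pred = np.array([2.5, 5.5, 6.0, 9.5])

mae = mean_absolute_error(y_true, y_pred)             # average of |actual - predicted|
rmse = np.sqrt(mean_squared_error(y_true, y_pred))    # square root of the average squared error
r2 = r2_score(y_true, y_pred)                         # proportion of variance explained

print(f"MAE={mae:.3f}  RMSE={rmse:.3f}  R2={r2:.3f}")
```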
Q9) Gradient Descent Algorithm-
It is an optimisation algorithm used in ML to minimise the error of a model by finding optimal parameters (like the weights in a linear regression model). It works by adjusting the parameters in the direction that reduces the error. Working:- 1) Begin with random values for the model parameters (like the slope and intercept for linear regression). 2) Compute the cost function- the cost function measures how well the model predicts the data. 3) Calculate gradients- gradients are the partial derivatives of the cost function with respect to each parameter. They tell us the direction and rate of change of the cost. 4) Update parameters- adjust the parameters in the opposite direction of the gradient to reduce the cost. 5) Repeat steps 2-4 until the cost stops decreasing significantly or a predefined number of iterations is reached.
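A minimal NumPy sketch of these five steps for simple linear regression, using the small dataset from Q14 below (the learning rate and iteration count are illustrative; every update uses the whole dataset, i.e. this is the batch version):

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([3.0, 4.0, 5.0, 7.0])

m, c = 0.0, 0.0            # step 1: start from arbitrary initial parameters
lr, n = 0.05, len(x)       # learning rate and number of data points

for _ in range(2000):
    y_pred = m * x + c                              # step 2: predictions under current parameters
    grad_m = (-2 / n) * np.sum(x * (y - y_pred))    # step 3: dMSE/dm
    grad_c = (-2 / n) * np.sum(y - y_pred)          # step 3: dMSE/dc
    m -= lr * grad_m                                # step 4: move against the gradient
    c -= lr * grad_c                                # step 5 is the loop itself
print(round(m, 2), round(c, 2))    # approaches the least-squares line (about 1.3 and 1.5)
```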
Q10) Batch Gradient Descent vs Stochastic Gradient Descent
Batch Gradient Descent- It uses the entire dataset to compute the gradient at each step. It is slow because it processes the whole dataset at once. It requires more memory to hold the dataset. It is more accurate as it considers the full dataset for each update. It updates the parameters less frequently. It is better for small or medium-sized datasets.
Stochastic Gradient Descent- It uses a single data point to compute the gradient at each step. It is faster because it updates after each data point. It requires less memory as it processes one data point at a time. It is less accurate because each update is based on a single data point. It updates the parameters frequently. It is better for large datasets.
Q11) Define different regression models-
1) Linear Regression- models the relationship between features and the target as a straight line. Suitable for simple, linear relationships.
2) Logistic Regression- used for binary classification problems. Outputs probabilities using the sigmoid function.
3) Polynomial Regression- extends linear regression by adding polynomial terms of the features. Captures non-linear relationships.
4) Decision Tree Regression- splits the data into regions using decision rules. Captures complex non-linear patterns.
5) Ridge Regression- linear regression with L2 regularization to reduce overfitting. Penalizes large coefficients but retains all features.
6) Lasso Regression- linear regression with L1 regularization, shrinking some coefficients to zero.
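A compact sketch, assuming scikit-learn/NumPy (the synthetic quadratic data and hyperparameters are illustrative), fitting a few of these model types on the same non-linear data: the straight-line models score poorly while the decision tree captures the non-linear pattern.

```python
import numpy as np
from sklearn.linear_model import Lasso, LinearRegression, Ridge
from sklearn.metrics import r2_score
from sklearn.tree import DecisionTreeRegressor

rng = np.random.RandomState(0)
X = rng.uniform(-3, 3, size=(300, 1))
y = X.ravel() ** 2 + rng.normal(0, 0.5, 300)      # quadratic (non-linear) relationship
X_tr, y_tr, X_te, y_te = X[:200], y[:200], X[200:], y[200:]

models = {
    "Linear": LinearRegression(),
    "Ridge": Ridge(alpha=1.0),
    "Lasso": Lasso(alpha=0.1),
    "Decision tree": DecisionTreeRegressor(max_depth=4),
}
for name, model in models.items():
    model.fit(X_tr, y_tr)
    print(name, "test R2:", round(r2_score(y_te, model.predict(X_te)), 2))
```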
Q12) Least squares method- It is a technique used to find the best-fitting line in linear regression. The goal is to minimize the sum of the squared differences between the observed values and the predicted values. This ensures that the line is as close as possible to all the data points. In the context of linear regression, the least squares method helps determine the values of the slope (m) and intercept (c) for the equation y = mx + c. For each data point (xi, yi) the predicted value is Ypredicted = m·xi + c. The difference between the actual value yi and the predicted value is called the error. To measure how well the line fits all the data points we calculate the sum of squared errors: Error = Σ (Yi - Ypredicted)². Example- if we have a dataset of house prices based on their sizes, the least squares method will find the line that best predicts house prices based on size.
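A small NumPy sketch of the "minimize the sum of squared errors" idea, using the same X and Y values as the worked example in Q14 below: the least-squares line has a lower SSE than a nearby alternative line.

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([3.0, 4.0, 5.0, 7.0])

def sse(m, c):
    """Sum of squared errors for the line y = m*x + c."""
    return float(np.sum((y - (m * x + c)) ** 2))

m_best, c_best = np.polyfit(x, y, 1)   # least-squares fit of a straight line
print("least-squares line:", round(m_best, 2), round(c_best, 2), "SSE:", round(sse(m_best, c_best), 3))
print("nearby line       :", 1.5, 1.0, "SSE:", round(sse(1.5, 1.0), 3))
```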
Q13) Stochastic Gradient Descent algorithm- It is an optimization algorithm used to minimize the cost function in machine learning, especially for large datasets. It is widely used for training models like neural networks and linear regression. Key points: 1) Gradient descent: involves updating model parameters in the direction of the negative gradient of the cost function to reduce error. 2) Stochastic: unlike traditional gradient descent, which uses the entire dataset for each update, SGD updates the parameters using only one data point at a time. 3) Faster: because it updates after each data point, it converges faster, making it suitable for large datasets. 4) Noisy updates: the updates can be noisy and fluctuate because they are based on a single data point, but over time they can still converge to the optimal solution. 5) Learning rate: the step size (learning rate) controls how big each update is. 6) Advantages: more efficient for large datasets, faster than batch gradient descent.
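A minimal NumPy sketch of the per-sample update, in contrast to the batch version shown under Q9 (same illustrative data and hyperparameters; a realistic implementation would also shuffle the data each epoch and decay the learning rate):

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([3.0, 4.0, 5.0, 7.0])

m, c, lr = 0.0, 0.0, 0.01
for epoch in range(500):
    for xi, yi in zip(x, y):            # one (noisy) update per data point
        error = yi - (m * xi + c)
        m += lr * 2 * error * xi        # negative gradient of the squared error for this point
        c += lr * 2 * error
print(round(m, 2), round(c, 2))         # fluctuates around the least-squares values (~1.3, ~1.5)
```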
Q14) Find the equation of the linear regression line for X = 1, 2, 3, 4 and Y = 3, 4, 5, 7.
Answer- The equation of the linear regression line is represented as Y = mX + c, where m is the slope of the line and c is the intercept. They are calculated as:
m = (n ΣXY - ΣX ΣY) / (n ΣX² - (ΣX)²)
c = (ΣY - m ΣX) / n,   where n = number of values.
Therefore we need to calculate ΣX, ΣY, ΣXY and ΣX². We prepare the table:
X    Y    XY    X²
1    3    3     1
2    4    8     4
3    5    15    9
4    7    28    16
ΣX = 10, ΣY = 19, ΣXY = 54, ΣX² = 30.
Substituting the values:
m = (4 × 54 - 10 × 19) / (4 × 30 - (10)²) = (216 - 190) / (120 - 100) = 26/20
m = 1.3
c = (19 - 1.3 × 10) / 4 = 6/4
c = 1.5
Equation of the linear regression line: y = 1.3x + 1.5
Q15) Linear regression: for X = 8, 9.5, 10, 6, 7, 4 and Y = 12, 138, 147, 88, 108, 62, find the value of Y for X = 12.
Answer- The table is prepared in the same way as above, giving:
ΣX = 44.5, ΣY = 555, ΣXY = 4409, ΣX² = 355.25, n = 6.
Substituting the values:
m = (n ΣXY - ΣX ΣY) / (n ΣX² - (ΣX)²) = (6 × 4409 - 44.5 × 555) / (6 × 355.25 - 1980.25) = 1756.5 / 151.25
m = 11.61
c = (ΣY - m ΣX) / n = (555 - 11.61 × 44.5) / 6
c = 6.39
Equation of the linear regression line: y = 11.61x + 6.39
To find the value of y for X = 12: y = 11.61 × 12 + 6.39 ≈ 145.7 calories.
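A quick NumPy check of both worked examples, implementing the same m and c formulas directly (small differences in the last decimal place come from rounding m before computing c in the hand calculation):

```python
import numpy as np

def fit_line(x, y):
    """Slope and intercept from the least-squares formulas used above."""
    x, y, n = np.asarray(x, float), np.asarray(y, float), len(x)
    m = (n * np.sum(x * y) - np.sum(x) * np.sum(y)) / (n * np.sum(x ** 2) - np.sum(x) ** 2)
    c = (np.sum(y) - m * np.sum(x)) / n
    return m, c

m14, c14 = fit_line([1, 2, 3, 4], [3, 4, 5, 7])
print(round(m14, 2), round(c14, 2))                              # 1.3 1.5 (Q14)

m15, c15 = fit_line([8, 9.5, 10, 6, 7, 4], [12, 138, 147, 88, 108, 62])
print(round(m15, 2), round(c15, 2), round(m15 * 12 + c15, 1))    # about 11.61, 6.37 and 145.7 (Q15)
```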
Unit 6- Q) Obtain the output of neuron Y for the network shown in the figure, using the activation functions: i) binary sigmoidal, ii) bipolar sigmoidal.
Answer-
Inputs: X1 = 0.8, X2 = 0.6, X3 = 0.4
Weights: W1 = 0.1, W2 = 0.3, W3 = -0.2
Bias: b = 0.35 (its input is always 1)
Step 1) Net input to the output neuron:
yin = b + Σ xi wi = b + X1 W1 + X2 W2 + X3 W3
    = 0.35 + (0.8)(0.1) + (0.6)(0.3) + (0.4)(-0.2)
    = 0.53
Step 2) Apply the activation functions:
1) Binary sigmoidal: y = f(yin) = 1 / (1 + e^(-yin)) = 1 / (1 + e^(-0.53)) ≈ 0.629
2) Bipolar sigmoidal: y = f(yin) = 2 / (1 + e^(-yin)) - 1 = 2 / (1 + e^(-0.53)) - 1 ≈ 0.259
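A short NumPy check of this calculation (the inputs, weights and bias are taken from the problem above; W3 = -0.2 is what makes the net input come out to 0.53):

```python
import numpy as np

x = np.array([0.8, 0.6, 0.4])      # inputs X1, X2, X3
w = np.array([0.1, 0.3, -0.2])     # weights W1, W2, W3
b = 0.35                           # bias

y_in = b + np.dot(x, w)                     # net input to the output neuron
binary = 1 / (1 + np.exp(-y_in))            # binary sigmoidal activation
bipolar = 2 / (1 + np.exp(-y_in)) - 1       # bipolar sigmoidal activation

print(round(y_in, 2), round(binary, 3), round(bipolar, 3))   # 0.53 0.629 0.259
```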