
Regression Analysis in Machine Learning - Unit 2 Notes

1. Introduction to Regression
What is Regression?
Regression is a statistical method used in Machine Learning to estimate the relationships between variables. It predicts
a continuous target variable (dependent variable) based on one or more predictor variables (independent variables).
Terminologies in Regression:
• Dependent Variable (Target): The variable we aim to predict.
• Independent Variables (Features): The variables used to predict the target.
• Regression Coefficients: Parameters that represent the relationship between predictors and the target.
• Residuals: The differences between observed and predicted values.
Applications of Regression:
• Predicting house prices.
• Estimating sales revenue.
• Analyzing the impact of advertising on sales.
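The terminology above can be illustrated with a toy house-price example (all numbers here are made up for illustration):

```python
# Predict house price (dependent variable) from size (independent variable).
sizes = [50, 80, 120]          # independent variable (feature), in m^2
prices = [150, 240, 355]       # dependent variable (target), in $1000s

# Suppose a fitted model is price = 3 * size
# (regression coefficients: slope b1 = 3, intercept b0 = 0)
predicted = [3 * s for s in sizes]

# Residuals: observed minus predicted values
residuals = [obs - pred for obs, pred in zip(prices, predicted)]
```

Here the predictions are 150, 240, and 360, so the residuals are 0, 0, and −5: the model fits the first two houses exactly and over-predicts the third by 5.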

2. Types of Regression
1. Linear Regression:
o Predicts the target variable by fitting a linear relationship.
o Equation: y = β0 + β1x + ε.
2. Logistic Regression:
o Used for classification problems.
o Predicts the probability of a binary outcome.
o Uses the sigmoid function to map outputs to probabilities.
3. Polynomial Regression:
o Extends linear regression by adding polynomial terms.
o Captures non-linear relationships.
4. Ridge and Lasso Regression:
o Add regularization to prevent overfitting.
5. Multiple Linear Regression:
o Extends linear regression to multiple predictors.
o Equation: y = β0 + β1x1 + β2x2 + ... + ε.
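The difference between linear and polynomial regression can be sketched with NumPy's `polyfit` (the data below is an assumed toy example with a known quadratic trend):

```python
import numpy as np

# Toy data generated from a quadratic trend: y = 2 + 3x + 0.5x^2
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0, 5.0])
y = 2 + 3 * x + 0.5 * x**2

# Linear regression: fit y = b0 + b1*x (a degree-1 polynomial)
b1, b0 = np.polyfit(x, y, 1)

# Polynomial regression: add an x^2 term (a degree-2 polynomial)
c2, c1, c0 = np.polyfit(x, y, 2)
```

Because the data is exactly quadratic, the degree-2 fit recovers the true coefficients (c0 ≈ 2, c1 ≈ 3, c2 ≈ 0.5), while the straight-line fit can only approximate the curve.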

3. Logistic Regression
Overview:
Logistic Regression models the probability of a binary outcome.
Key Concepts:
• Sigmoid Function: Converts predictions into probabilities.
o P(y=1|X) = 1 / (1 + e^(−z)), where z = β0 + β1x.
• Threshold: Determines the predicted class (e.g., 0.5).
Applications:
• Spam detection.
• Disease diagnosis.
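The sigmoid-plus-threshold idea can be sketched in a few lines (the coefficients below are assumed for illustration, not fitted to real data):

```python
import math

def sigmoid(z):
    """Map a real-valued score z to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

# Illustrative (assumed) coefficients: z = b0 + b1*x
b0, b1 = -4.0, 2.0

def predict(x, threshold=0.5):
    p = sigmoid(b0 + b1 * x)              # P(y=1 | x)
    return (1 if p >= threshold else 0), p

label_hi, p_hi = predict(3.0)   # z = 2.0, probability above the threshold
label_lo, p_lo = predict(1.0)   # z = -2.0, probability below the threshold
```

For x = 3 the score is z = 2, giving a probability of about 0.88 (class 1); for x = 1 the score is z = −2, giving about 0.12 (class 0).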

4. Simple Linear Regression


Overview:
• Models the relationship between one independent variable and one dependent variable.
• Equation: y = β0 + β1x.
Assumptions:
1. Linear relationship between variables.
2. Homoscedasticity (constant variance of residuals).
3. Independence of observations.
4. Normally distributed residuals.
Model Building:
1. Define the dependent and independent variables.
2. Estimate the parameters β0 and β1.
3. Fit the regression line to the data.
Ordinary Least Squares (OLS):
• Minimizes the sum of squared residuals.
• Estimates the coefficients β by minimizing:
o SSE = Σ(yi − ŷi)².
Properties:
• Unbiased estimators.
• Minimum variance among linear unbiased estimators (Gauss–Markov theorem).
Interval Estimation:
• Confidence intervals for predictions.
• Provides uncertainty bounds.
Residuals:
• Differences between observed and predicted values.
• A residual plot checks model fit.
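The OLS estimates have a closed form that follows directly from minimizing SSE: b1 = Σ(xi − x̄)(yi − ȳ) / Σ(xi − x̄)² and b0 = ȳ − b1·x̄. A minimal sketch on assumed toy data:

```python
import numpy as np

# Toy data roughly following y = 1 + 2x (assumed for illustration)
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([3.1, 4.9, 7.2, 9.0, 10.8])

# Closed-form OLS estimates
x_bar, y_bar = x.mean(), y.mean()
b1 = np.sum((x - x_bar) * (y - y_bar)) / np.sum((x - x_bar) ** 2)
b0 = y_bar - b1 * x_bar

# Residuals and the quantity OLS minimizes
y_hat = b0 + b1 * x
sse = np.sum((y - y_hat) ** 2)
```

For this data the fit is b0 = 1.15 and b1 = 1.95, with SSE = 0.075; any other line through the same points would give a larger SSE.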

5. Multiple Linear Regression


Overview:
• Extends simple linear regression to multiple predictors.
• Equation: y = β0 + β1x1 + β2x2 + ... + βkxk + ε.
Assumptions:
1. Linear relationship between predictors and target.
2. No multicollinearity (predictors are not highly correlated with each other).
3. Homoscedasticity.
4. Independence of errors.
Model Evaluation:
• R-Squared (R²): Proportion of variance explained by the model.
• Adjusted R-Squared: R² adjusted for the number of predictors.
• Standard Error: Measures prediction accuracy.
• F-Statistic: Tests overall model significance.
• P-Values: Test the significance of individual predictors.
Interpretation:
• Each coefficient indicates the change in the target for a one-unit change in its predictor, holding the other predictors constant.
Assessing Fit:
1. R²: Higher values indicate better fit.
2. Residual analysis: Check for patterns.
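R² and adjusted R² can be computed directly from the residuals: R² = 1 − SSres/SStot, and adjusted R² = 1 − (1 − R²)(n − 1)/(n − k − 1). A sketch on assumed toy data with two predictors:

```python
import numpy as np

# Toy data with two predictors (n = 6 observations, k = 2), assumed for illustration
X = np.array([[1.0, 2.0], [2.0, 1.0], [3.0, 4.0],
              [4.0, 3.0], [5.0, 6.0], [6.0, 5.0]])
y = np.array([5.2, 4.1, 9.8, 8.9, 14.3, 13.2])

n, k = X.shape
X1 = np.column_stack([np.ones(n), X])     # add intercept column

# Least-squares fit: y ≈ b0 + b1*x1 + b2*x2
beta, *_ = np.linalg.lstsq(X1, y, rcond=None)
y_hat = X1 @ beta

ss_res = np.sum((y - y_hat) ** 2)         # residual sum of squares
ss_tot = np.sum((y - y.mean()) ** 2)      # total sum of squares
r2 = 1 - ss_res / ss_tot
adj_r2 = 1 - (1 - r2) * (n - 1) / (n - k - 1)
```

Adjusted R² is always at most R², and the gap widens as more predictors are added relative to the number of observations, which is why it is preferred for comparing models with different numbers of predictors.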

6. Feature Selection and Dimensionality Reduction


Importance:
• Improves model performance.
• Reduces computational complexity.
Techniques:
1. Principal Component Analysis (PCA):
o Converts correlated features into uncorrelated principal components.
o Retains the components that explain most of the variance.
o Applications: Data visualization, noise reduction.
2. Linear Discriminant Analysis (LDA):
o Finds a linear combination of features that best separates classes.
o Commonly used in classification.
3. Independent Component Analysis (ICA):
o Decomposes data into statistically independent components.
o Applications: Blind signal separation.
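PCA can be sketched via the SVD of the centered data matrix: the right singular vectors are the principal components, and the squared singular values give the variance each one explains. The synthetic data below is assumed for illustration (two strongly correlated features):

```python
import numpy as np

# Synthetic 2-D data where the two features are strongly correlated
rng = np.random.default_rng(0)
t = rng.normal(size=(100, 1))
X = np.hstack([t, 2 * t + 0.1 * rng.normal(size=(100, 1))])

# PCA via SVD of the centered data
Xc = X - X.mean(axis=0)
U, S, Vt = np.linalg.svd(Xc, full_matrices=False)

# Fraction of the total variance explained by each principal component
explained = S**2 / np.sum(S**2)

# Project onto the first component: 2 features reduced to 1 with little loss
Z = Xc @ Vt[0]
```

Because the second feature is nearly a multiple of the first, the first principal component captures almost all of the variance, so keeping only `Z` halves the dimensionality while discarding mostly noise.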

End of Notes
These notes cover regression analysis, including logistic regression, simple and
multiple linear regression, and techniques for feature selection and dimensionality reduction.
