
Chapter 2

Regression

Assumptions in Linear regression:


In linear regression, several key assumptions are made about the data and the relationship
between the independent and dependent variables. These assumptions ensure that the model is
appropriate and that the results of the regression analysis are reliable. The main assumptions in
linear regression are:
1. Linearity:
• The relationship between the dependent variable (Y) and the independent variables (X) is assumed to be linear. This means that the change in Y is proportional to a change in X. Mathematically, this is represented as:

Y = β0 + β1X1 + β2X2 + … + βnXn + ε

2. Independence of Errors (No Autocorrelation):
• The residuals (errors) are assumed to be independent of each other. In other words, the error for one observation should not provide any information about the error for another observation. This assumption is important for ensuring that the model does not overestimate the goodness of fit due to correlated residuals.
• For example, in stock market data, sales data, or any scenario where previous values can influence future values, the error terms for successive observations can become correlated.
3. Homoscedasticity:
• The variance of the residuals (errors) should be constant across all levels of the independent variable(s). This is known as homoscedasticity. If the variance of the residuals changes as the value of X changes, it's called heteroscedasticity, which can lead to inefficiency in the model.
• Homoscedasticity means that the spread of the residuals is uniform across all predicted values.

4. Normality of Errors:
• The residuals (errors) of the model should be approximately normally distributed. This assumption is important for hypothesis testing (e.g., t-tests for the regression coefficients) and for constructing confidence intervals.
5. No Perfect Multicollinearity:
• The independent variables should not be perfectly correlated with each other. If two or more predictors are highly correlated, the model may have difficulty estimating their individual effects, which can lead to unstable coefficient estimates (high variance). This is known as multicollinearity.
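
These assumptions are usually checked with residual diagnostics. The sketch below is only a minimal illustration on synthetic data generated on the spot: it uses the Durbin-Watson statistic for independence of errors, a crude comparison of residual spread for homoscedasticity, a Shapiro-Wilk test for normality, and variance inflation factors (VIF) for multicollinearity.

import numpy as np
import statsmodels.api as sm
from scipy.stats import shapiro
from statsmodels.stats.stattools import durbin_watson
from statsmodels.stats.outliers_influence import variance_inflation_factor

# Hypothetical data for illustration only
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 2))                      # two independent variables
y = 3 + 2 * X[:, 0] - 1 * X[:, 1] + rng.normal(size=100)

X_const = sm.add_constant(X)                       # add intercept column
model = sm.OLS(y, X_const).fit()
resid = model.resid

# 2. Independence of errors: Durbin-Watson near 2 suggests no autocorrelation
print("Durbin-Watson:", durbin_watson(resid))

# 3. Homoscedasticity: compare residual spread for low vs. high fitted values
low = model.fittedvalues < np.median(model.fittedvalues)
print("Residual std (low / high fitted):", resid[low].std(), resid[~low].std())

# 4. Normality of errors: Shapiro-Wilk test (large p-value is consistent with normality)
print("Shapiro-Wilk p-value:", shapiro(resid).pvalue)

# 5. No perfect multicollinearity: VIF well below ~10 is usually acceptable
for i in range(1, X_const.shape[1]):
    print(f"VIF for X{i}:", variance_inflation_factor(X_const, i))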

R squared (Coefficient of determination)

R-squared is a statistical measure that represents the goodness of fit of a regression model. Its value lies between 0 and 1. R-squared equals 1 when the model fits the data perfectly and there is no difference between the predicted values and the actual values. R-squared equals 0 when the model explains none of the variability in the dependent variable, i.e., it has learned no relationship between the dependent and independent variables. It is computed as

R² = 1 − SSE / SST

SSE is the sum of the squared differences between the actual dependent variable values and the values predicted by the regression model.

SST is the total variation in the dependent variable and is calculated by summing the squared differences between each actual dependent variable value and the mean of all dependent variable values.
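
As a small illustration, R² can be computed directly from these two quantities; the numbers below are made up:

import numpy as np

# Hypothetical actual and predicted values for illustration
y_actual = np.array([3.0, 5.0, 7.0, 9.0])
y_pred   = np.array([2.8, 5.1, 7.3, 8.9])

sse = np.sum((y_actual - y_pred) ** 2)           # sum of squared errors
sst = np.sum((y_actual - y_actual.mean()) ** 2)  # total sum of squares
r_squared = 1 - sse / sst
print(r_squared)   # close to 1, since the predictions track the data closely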
Gauss Markov Theorem:

The Gauss-Markov theorem states that if your linear regression model satisfies the classical
assumptions, then ordinary least squares (OLS) regression produces unbiased estimates that
have the smallest variance of all possible linear estimators.

The Gauss-Markov theorem famously states that OLS is BLUE: the Best Linear Unbiased Estimator.

What Does OLS Estimate?

In regression analysis, the goal is to draw a random sample from a population and use it to estimate the properties of that population. The coefficients in the regression equation are estimates of the actual population parameters.

The notation for the model of a population is the following:

Y = β0 + β1X1 + β2X2 + … + βkXk + ε

The betas (β) represent the population parameter for each term in the model.

Epsilon (ε) represents the random error that the model doesn’t explain.

Unfortunately, we’ll never know these population values because it is generally impossible to measure the entire population. Instead, we’ll obtain estimates of them using our random sample.

The notation for an estimated model from a random sample is the following:

ŷ = b0 + b1X1 + b2X2 + … + bkXk

Sampling Distributions of the Parameter Estimates

Imagine that we repeat the same study many times. We collect random samples of the same size,
from the same population, and fit the same OLS regression model repeatedly. Each random
sample produces different estimates for the parameters in the regression equation. After this
process, we can graph the distribution of estimates for each parameter. Statisticians refer to this
type of distribution as a sampling distribution, which is a type of probability distribution.

1. Unbiased Estimates: Sampling Distributions Centered on the True Population Parameter

In the graph below, beta represents the true population value. The curve on the right centers on a value that is too high. This model tends to produce estimates that are too high, which is a positive bias; it is not correct on average. However, the curve on the left centers on the actual value of beta. That model produces parameter estimates that are correct on average: the expected value is the actual value of the population parameter.

2. Minimum Variance: Sampling Distributions are Tight Around the Population Parameter

In the graph below, both curves center on beta. However, one curve is wider than the other because the variances are different. Broader curves indicate that there is a higher probability that the estimates will be further away from the correct value.

The “Best” in BLUE refers to the sampling distribution with the minimum variance. That’s the tightest possible distribution of all unbiased linear estimation methods!
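
A short simulation can make the idea of a sampling distribution concrete. The sketch below assumes a population with known parameters (β0 = 1, β1 = 2, chosen only for illustration), repeatedly draws random samples, fits OLS to each, and inspects the spread of the slope estimates:

import numpy as np

rng = np.random.default_rng(42)
true_beta0, true_beta1 = 1.0, 2.0        # assumed population parameters

estimates = []
for _ in range(5000):                    # repeat the "study" many times
    x = rng.uniform(0, 10, size=50)
    y = true_beta0 + true_beta1 * x + rng.normal(0, 1, size=50)
    b1 = np.cov(x, y, bias=True)[0, 1] / np.var(x)   # OLS slope estimate
    estimates.append(b1)

estimates = np.array(estimates)
print("mean of estimates:", estimates.mean())   # ~2.0, i.e. unbiased
print("std of estimates :", estimates.std())    # spread of the sampling distribution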

Simple Linear Regression Evaluation

This lesson presents two alternative methods for testing whether a linear association exists
between the predictor x and the response y in a simple linear regression model:
H0: β1 = 0 versus HA: β1 ≠ 0.
One is the t-test for the slope while the other is an analysis of variance (ANOVA) F-test.
1. Inference for the Population Intercept and Slope

Let's visit the example concerning the relationship between skin cancer mortality and state
latitude. The response variable y is the mortality rate (number of deaths per 10 million people) of
white males due to malignant skin melanoma from 1950-1959. The predictor variable x is the
latitude (degrees North) at the center of each of 49 states in the United States. A subset of the
data looks like this:

#    State        Latitude   Mortality
1    Alabama      33.0       219
2    Arizona      34.5       160
3    Arkansas     35.0       170
4    California   37.5       182
5    Colorado     39.0       149
...  ...          ...        ...
49   Wyoming      43.0       134

A plot of the data with the estimated regression line shows skin cancer mortality decreasing as state latitude increases.

Is there a relationship between state latitude and skin cancer mortality? Certainly, since the
estimated slope of the line, b1, is -5.98, not 0, there is a relationship between state latitude and
skin cancer mortality in the sample of 49 data points. But, we want to know if there is a
relationship between the population of all of the latitudes and skin cancer mortality rates. That is,
we want to know if the population slope β1 is unlikely to be 0.

An α-level hypothesis test for the slope parameter β1


We follow standard hypothesis test procedures in conducting a hypothesis test for the
slope β1. First, we specify the null and alternative hypotheses:
Null hypothesis H0 : β1 = 0
Alternative hypothesis HA : β1 ≠ 0
Second, we calculate the value of the test statistic using the following formula:

t* = b1 / se(b1)

where b1 is the estimated slope and se(b1) is its standard error. Third, we use the resulting test statistic to calculate the P-value. The P-value is determined by referring to a t-distribution with n − 2 degrees of freedom.

Finally, we make a decision:

• If the P-value is smaller than the significance level α, we reject the null hypothesis in favor of the alternative. We conclude "there is sufficient evidence at the α level to conclude that there is a linear relationship in the population between the predictor x and response y."
• If the P-value is larger than the significance level α, we fail to reject the null hypothesis. We conclude "there is not enough evidence at the α level to conclude that there is a linear relationship in the population between the predictor x and response y."
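
As an illustration, the sketch below carries out this t-test on the six rows of the skin-cancer data shown earlier (the full dataset has 49 states, so the resulting numbers are illustrative only):

import numpy as np
from scipy import stats

# Six rows of the latitude/mortality data from the table above
x = np.array([33.0, 34.5, 35.0, 37.5, 39.0, 43.0])
y = np.array([219., 160., 170., 182., 149., 134.])
n = len(x)

# OLS estimates of the slope and intercept
b1 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
b0 = y.mean() - b1 * x.mean()

# Standard error of the slope
resid = y - (b0 + b1 * x)
mse = np.sum(resid ** 2) / (n - 2)
se_b1 = np.sqrt(mse / np.sum((x - x.mean()) ** 2))

# Test statistic and two-sided P-value with n - 2 degrees of freedom
t_star = b1 / se_b1
p_value = 2 * stats.t.sf(abs(t_star), df=n - 2)
print(t_star, p_value)   # reject H0 if p_value < alpha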
Logistic Regression

1. Logistic regression is a supervised machine learning algorithm used for classification tasks, where the goal is to predict the probability that an instance belongs to a given class or not.
2. It uses a sigmoid function, which takes the independent variables as input and produces a probability value between 0 and 1.

• Logistic regression predicts the output of a categorical dependent variable. Therefore, the outcome must be a categorical or discrete value.
• It can be Yes or No, 0 or 1, True or False, etc., but instead of giving the exact values 0 and 1, it gives probabilistic values which lie between 0 and 1.
• In logistic regression, instead of fitting a regression line, we fit an “S”-shaped logistic function, which predicts two maximum values (0 or 1).

Logistic Function – Sigmoid Function

• The sigmoid function is a mathematical function used to map the predicted values to probabilities.
• It maps any real value into another value within the range of 0 and 1. The output of logistic regression must lie between 0 and 1 and cannot go beyond this limit, so it forms an “S”-shaped curve.
• The S-shaped curve is called the sigmoid function or the logistic function.
• In logistic regression, we use the concept of a threshold value, which separates the predictions into 0 and 1: values above the threshold tend toward 1, and values below the threshold tend toward 0.

Terminologies involved in Logistic Regression


Here are some common terms involved in logistic regression:
• Independent variables: The input characteristics or predictor factors applied to the dependent variable’s predictions.
• Dependent variable: The target variable in a logistic regression model, which we are trying to predict.
• Logistic function: The formula used to represent how the independent and dependent variables relate to one another. The logistic function transforms the input variables into a probability value between 0 and 1, which represents the likelihood of the dependent variable being 1 or 0.
• Odds: The ratio of something occurring to something not occurring. It is different from probability, as probability is the ratio of something occurring to everything that could possibly occur.
• Log-odds: The log-odds, also known as the logit function, is the natural logarithm of the odds. In logistic regression, the log odds of the dependent variable are modeled as a linear combination of the independent variables and the intercept.
• Coefficient: The logistic regression model’s estimated parameters, which show how the independent and dependent variables relate to one another.
• Intercept: A constant term in the logistic regression model, which represents the log odds when all independent variables are equal to zero.
• Maximum likelihood estimation: The method used to estimate the coefficients of the logistic regression model, which maximizes the likelihood of observing the data given the model.
The formula of the logit function is:

logit(P) = log(P / (1 − P))

The equation of the best-fit line in linear regression is

y = β0 + β1x

Let’s say that instead of y we are taking probabilities (P). But there is an issue here: the value of (P) could exceed 1 or go below 0, and we know that the range of a probability is (0, 1). To overcome this issue we take the “odds” of P:

odds = P / (1 − P)

Odds are always positive, which means the range will always be (0, +∞). Odds are nothing but the ratio of the probability of success to the probability of failure.

It is difficult to model a variable that has a restricted range. To control this we take the log of odds, which has a range of (−∞, +∞):

log(P / (1 − P)) = β0 + β1x

Solving for P gives our logistic function, also called the sigmoid function:

P = 1 / (1 + e^−(β0 + β1x))

The sigmoid squeezes a straight line into an S-curve.
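
A minimal sketch of the sigmoid mapping and the threshold rule described above, with coefficients β0 and β1 assumed purely for illustration:

import numpy as np

def sigmoid(z):
    """Map any real value z to a probability between 0 and 1."""
    return 1.0 / (1.0 + np.exp(-z))

# Assumed coefficients for illustration
beta0, beta1 = -4.0, 1.5
x = np.array([0.5, 2.0, 3.5, 5.0])

log_odds = beta0 + beta1 * x          # linear combination on the logit scale
probs = sigmoid(log_odds)             # squeezed into (0, 1)
labels = (probs >= 0.5).astype(int)   # threshold at 0.5

print(probs)    # probabilities between 0 and 1
print(labels)   # 0/1 class predictions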

Pearson Correlation Coefficient (r):

The Pearson correlation coefficient (r) is the most common way of measuring a linear
correlation. It is a number between –1 and 1 that measures the strength and direction of the
relationship between two variables.
Pearson correlation coefficient (r)   Correlation type       Interpretation
Between 0 and 1                       Positive correlation   When one variable changes, the other variable changes in the same direction.
0                                     No correlation         There is no relationship between the variables.
Between 0 and –1                      Negative correlation   When one variable changes, the other variable changes in the opposite direction.

Visualizing the Pearson correlation coefficient


The Pearson correlation coefficient also tells you whether the slope of the line of best fit is
negative or positive. When the slope is negative, r is negative. When the slope is positive, r is
positive.
When r is 1 or –1, all the points fall exactly on the line of best fit:

Calculating the Pearson correlation coefficient

r = cov(X, Y) / (σX · σY)

or, equivalently, in terms of the raw observations,

r = Σ(xi − x̄)(yi − ȳ) / √( Σ(xi − x̄)² · Σ(yi − ȳ)² )
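
As a quick illustration on made-up data, r can be computed either directly from the formula above or with NumPy’s built-in helper:

import numpy as np

# Hypothetical paired observations
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])

# Pearson r from the definitional formula
r_manual = np.sum((x - x.mean()) * (y - y.mean())) / np.sqrt(
    np.sum((x - x.mean()) ** 2) * np.sum((y - y.mean()) ** 2))

r_builtin = np.corrcoef(x, y)[0, 1]
print(r_manual, r_builtin)   # both close to +1: strong positive correlation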

Multiple Linear Regression:


Multiple linear regression is used to estimate the relationship between two or more
independent variables and one dependent variable.

Assumptions of multiple linear regression:


Multiple linear regression makes all of the same assumptions as simple linear regression:

Homogeneity of variance (homoscedasticity): the size of the error in our prediction doesn’t change significantly across the values of the independent variable.

Independence of observations: the observations in the dataset were collected using statistically valid sampling methods, and there are no hidden relationships among variables. In multiple linear regression, it is possible that some of the independent variables are actually correlated with one another, so it is important to check this before developing the regression model. If two independent variables are too highly correlated (r² > ~0.6), then only one of them should be used in the regression model.

Normality: the data follows a normal distribution.

Linearity: the line of best fit through the data points is a straight line, rather than a curve or some sort of grouping factor.
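
One simple way to screen for highly correlated predictors before fitting, as suggested above, is to inspect the pairwise correlation matrix. The sketch below uses synthetic predictors, one of which is deliberately constructed to be correlated with another:

import numpy as np

# Hypothetical predictors: three independent variables, one row per variable
rng = np.random.default_rng(1)
x1 = rng.normal(size=50)
x2 = 0.9 * x1 + 0.1 * rng.normal(size=50)   # deliberately correlated with x1
x3 = rng.normal(size=50)

corr = np.corrcoef(np.vstack([x1, x2, x3]))
print(np.round(corr, 2))
# If r**2 between two predictors exceeds ~0.6, consider keeping only one of them.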

How to perform a multiple linear regression

Multiple linear regression formula

The formula for a multiple linear regression is:

ŷ = β0 + β1X1 + β2X2 + … + βnXn + ε

• ŷ = the predicted value of the dependent variable
• β0 = the y-intercept (value of y when all other parameters are set to 0)
• β1X1 = the regression coefficient (β1) of the first independent variable (X1) (a.k.a. the effect that increasing the value of the independent variable has on the predicted y value)
• … = do the same for however many independent variables you are testing
• βnXn = the regression coefficient of the last independent variable
• ε = model error (a.k.a. how much variation there is in our estimate of ŷ)
To find the best-fit line for each independent variable, multiple linear regression calculates three things:
• The regression coefficients that lead to the smallest overall model error.
• The t statistic of the overall model.
• The associated p value (how likely it is that the t statistic would have occurred by chance if the null hypothesis of no relationship between the independent and dependent variables was true).
It then calculates the t statistic and p value for each regression coefficient in the model.
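
A sketch of how this might look in practice with statsmodels, using synthetic data with two independent variables (the numbers are assumptions for illustration only):

import numpy as np
import statsmodels.api as sm

# Hypothetical data: two independent variables and one dependent variable
rng = np.random.default_rng(7)
X = rng.normal(size=(80, 2))
y = 1.0 + 2.0 * X[:, 0] - 0.5 * X[:, 1] + rng.normal(scale=0.8, size=80)

X_const = sm.add_constant(X)        # adds the intercept term β0
model = sm.OLS(y, X_const).fit()

print(model.params)                 # β0, β1, β2 estimates
print(model.tvalues)                # t statistic for each coefficient
print(model.pvalues)                # p value for each coefficient
print(model.summary())              # full table, including the overall fit statistics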

Principal Component Analysis:

PCA stands for Principal Component Analysis. It is a dimensionality reduction technique


commonly used in data analysis and machine learning. The primary goal of PCA is to reduce the
dimensionality of a dataset while preserving as much of the variance or information present in
the data as possible.
PCA achieves this by transforming the original variables into a new set of variables, called
principal components. These principal components are linear combinations of the original
variables and are orthogonal to each other, meaning they are uncorrelated. The first principal
component accounts for the largest possible variance in the data, the second principal component
for the second largest variance, and so on.
In essence, PCA helps in simplifying the complexity of high-dimensional data by capturing the
most important patterns or directions of variation in the data, thereby enabling easier
visualization, exploration, and analysis of the dataset. It is widely used in various fields such as
image processing, signal processing, finance, and bioinformatics, among others.
The principal components (PCs) in PCA are derived through linear algebra techniques, primarily
involving eigenvalue decomposition or singular value decomposition (SVD) of the covariance
matrix of the original data. Here's a brief overview of the mathematics behind PCA:
1. Centering the data: First, the mean of each feature (variable) is subtracted from the dataset.
This step ensures that the data is centered around the origin.
2. Covariance matrix: The covariance matrix is calculated for the centered data. This matrix
represents the pairwise covariances between all pairs of features.
3. Eigenvalue decomposition (EVD): The covariance matrix is decomposed into its eigenvectors and eigenvalues. The eigenvectors represent the directions (principal components) of maximum variance in the data, and the corresponding eigenvalues represent the magnitude of variance along those directions. The eigenvectors are usually sorted in descending order based on their corresponding eigenvalues, so the first principal component (PC1) captures the most variance, the second principal component (PC2) captures the second most variance, and so on.
4. Selecting principal components: After obtaining the eigenvectors or singular vectors, the
desired number of principal components is selected based on the explained variance or the
application's requirements. Typically, one can select a subset of the principal components that
capture most of the variance in the data.
5. Projection: Finally, the original data is projected onto the selected principal components to
obtain the reduced-dimensional representation of the data. This is achieved by taking the dot
product of the centered data matrix with the matrix of selected principal components.
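
The five steps above can be written out directly in NumPy. The sketch below uses a small made-up data matrix and keeps the top two components:

import numpy as np

# Hypothetical data: 6 observations, 3 features
X = np.array([[2.5, 2.4, 0.5],
              [0.5, 0.7, 1.2],
              [2.2, 2.9, 0.4],
              [1.9, 2.2, 0.8],
              [3.1, 3.0, 0.3],
              [2.3, 2.7, 0.6]])

# 1. Center the data
X_centered = X - X.mean(axis=0)

# 2. Covariance matrix of the centered data
cov = np.cov(X_centered, rowvar=False)

# 3. Eigenvalue decomposition (eigh is used because cov is symmetric)
eigvals, eigvecs = np.linalg.eigh(cov)
order = np.argsort(eigvals)[::-1]          # sort by descending eigenvalue
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

# 4. Select the top k principal components
k = 2
components = eigvecs[:, :k]
explained = eigvals[:k] / eigvals.sum()    # fraction of variance explained

# 5. Project the centered data onto the selected components
X_reduced = X_centered @ components
print(explained)
print(X_reduced)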

Numerical:

To compute PCA on a dataset, apply the steps above in order: (1) center the data, (2) compute the covariance matrix, (3) perform the eigenvalue decomposition, (4) select the leading principal components, and (5) project the centered data onto them.

Linear Discriminant Analysis:

Linear Discriminant Analysis (LDA) is a supervised dimensionality reduction and classification
technique commonly used in pattern recognition, machine learning, and statistics. It is
particularly useful when dealing with high-dimensional data and aims to maximize the
separation between multiple classes.
LDA projects high-dimensional data onto a lower-dimensional space while preserving class separability. It finds a linear combination of features that best separates two or more classes. Unlike Principal Component Analysis (PCA), which is unsupervised and focuses on variance, LDA is supervised and optimizes for class separability.
1. Mathematical Foundation of LDA
LDA works by computing discriminant axes that maximize the ratio of between-class variance
to within-class variance.
Step 1: Compute Class Means and the Overall Mean

For a dataset with c classes, compute the mean vector μi of each class and the overall mean vector μ of all samples.
Step 2: Compute Scatter Matrices

• Within-Class Scatter Matrix (Sw): measures the spread of data points within each class:

Sw = Σi Σ(x in class i) (x − μi)(x − μi)ᵀ

• Between-Class Scatter Matrix (Sb): measures the spread between the different class means:

Sb = Σi Ni (μi − μ)(μi − μ)ᵀ

where Ni is the number of samples in class i.
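
A small NumPy sketch of these two scatter matrices, using a made-up two-class dataset; the discriminant directions are then the leading eigenvectors of Sw⁻¹Sb:

import numpy as np

# Hypothetical two-class data: 4 samples per class, 2 features
X = np.array([[1.0, 2.0], [1.5, 1.8], [1.2, 2.2], [0.8, 1.9],
              [4.0, 4.5], [4.2, 4.8], [3.8, 4.4], [4.5, 5.0]])
y = np.array([0, 0, 0, 0, 1, 1, 1, 1])

overall_mean = X.mean(axis=0)
n_features = X.shape[1]
S_w = np.zeros((n_features, n_features))   # within-class scatter
S_b = np.zeros((n_features, n_features))   # between-class scatter

for c in np.unique(y):
    X_c = X[y == c]
    mean_c = X_c.mean(axis=0)
    # within-class: spread of samples around their own class mean
    S_w += (X_c - mean_c).T @ (X_c - mean_c)
    # between-class: spread of class means around the overall mean, weighted by class size
    diff = (mean_c - overall_mean).reshape(-1, 1)
    S_b += len(X_c) * (diff @ diff.T)

# Leading eigenvector of inv(Sw) @ Sb gives the main discriminant direction
eigvals, eigvecs = np.linalg.eig(np.linalg.inv(S_w) @ S_b)
w = eigvecs[:, np.argsort(eigvals.real)[::-1][0]].real
print(np.round(S_w, 2))
print(np.round(S_b, 2))
print(np.round(w, 3))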
Key Properties of LDA
• Maximizes Class Separation – Unlike PCA, which maximizes variance, LDA optimizes class discrimination.
• Handles Multi-Class Problems – Can extend beyond binary classification to multiple classes.
• Feature Reduction – Projects data onto a lower-dimensional space, reducing computational complexity.

LDA vs. PCA: Key Differences

Feature                    LDA                                  PCA
Type                       Supervised                           Unsupervised
Goal                       Maximizes class separability         Maximizes variance
Uses Class Labels?         Yes                                  No
Dimensionality Reduction   Yes, while preserving class info     Yes, but may not preserve class info
Best Use Case              Classification problems              Feature extraction
