
Sampling Distribution and SE

The standard error is nothing but the standard deviation of the sampling distribution of the estimator, and the sampling distribution of an estimator is simply a probability or frequency distribution of the estimator, that is, a distribution of the set of values of the estimator obtained from all possible samples of the same size from a given population. Sampling distributions are used to draw inferences about the values of the population parameters on the basis of the values of the estimators calculated from one or more samples. Explain this explicitly.

Let’s break this down step by step to explain sampling distributions and the role of the standard error
explicitly.

1. Population vs. Sample

A population includes all individuals or items of interest in a study. For example, all people in a
city.
A sample is a subset of the population, selected for study. For example, 100 people randomly
chosen from the city.

2. What is an Estimator?

An estimator is a statistic (a function of the sample data) used to infer a population parameter.
Examples include:

The sample mean (x̄) as an estimator of the population mean (μ).
The sample proportion (p) as an estimator of the population proportion.

3. Sampling Distribution of an Estimator

If you repeatedly take many samples of the same size from a population and compute an estimator
(e.g., sample mean or proportion) for each sample, these values will vary. The distribution of these
values is called the sampling distribution of the estimator.
For instance:

Suppose the population mean (μ) is 50, and you take samples of size 10. Each sample will have a different sample mean (x̄), but if you calculate x̄ for all possible samples, their distribution forms the sampling distribution of x̄.
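
To make this concrete, here is a minimal simulation sketch (Python with NumPy assumed; the normal population, its standard deviation, and the seed are illustrative choices, not part of the original example) that draws many samples of size 10 from a population with mean 50 and inspects the distribution of x̄:

```python
import numpy as np

rng = np.random.default_rng(0)
pop_mean, pop_sd, n, n_samples = 50, 10, 10, 100_000

# Each row is one sample of size n; the row mean is one realization of x-bar.
samples = rng.normal(pop_mean, pop_sd, size=(n_samples, n))
sample_means = samples.mean(axis=1)

print(sample_means.mean())        # close to 50, the population mean
print(sample_means.std(ddof=0))   # close to 10 / sqrt(10), the standard error
```

The spread of these sample means is exactly what the standard error, discussed next, measures.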

4. Standard Error

The standard error is the standard deviation of the sampling distribution of an estimator. It quantifies
how much an estimator (e.g., the sample mean) is expected to vary from sample to sample.
Mathematically:
Standard Error of the Mean (SE) = σ / √n

Where:

σ: Population standard deviation
n: Sample size

The formula shows that the standard error decreases as the sample size (n) increases, meaning larger
samples give more stable estimators.
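
As a quick check of this relationship, the following small sketch (the value σ = 10 is an assumed illustration) evaluates σ / √n for increasing sample sizes:

```python
import math

sigma = 10.0
for n in (10, 100, 1000, 10000):
    # Standard error of the mean shrinks as the square root of the sample size.
    print(n, sigma / math.sqrt(n))
# Output falls from about 3.16 at n = 10 to 0.10 at n = 10,000.
```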

5. Explicit Inference from One or More Samples

When we take one or more samples from a population:

One sample: Based on one sample, we compute an estimator (e.g., the sample mean) and compare it to the expected behavior of the sampling distribution. For example, if x̄ is close to μ, we infer the sample is representative.

More samples: By comparing multiple sample means (x̄₁, x̄₂, …), we can assess variability and make stronger inferences about μ using the sampling distribution.

In both cases, the standard error provides insight into the reliability of our estimators by reflecting how
much we expect them to deviate from the true population parameter.

Standard error of estimate or the standard error of the regression (se).

The Standard Error of Estimate (SE), also known as the Standard Error of the Regression (sₑ), measures the average distance that the observed data points fall from the regression line in a regression analysis. It provides a sense of how well the regression line represents the data.

1. Formula for Standard Error of Estimate

For a simple linear regression, the formula is:

sₑ = √[ Σᵢ₌₁ⁿ (yᵢ − ŷᵢ)² / (n − 2) ]

Where:

yᵢ: Observed values of the dependent variable.
ŷᵢ: Predicted values of the dependent variable from the regression equation.
n: Number of observations (data points).
n − 2: Degrees of freedom in simple linear regression (due to the estimation of two parameters:
slope and intercept).

2. Components of the Formula

The term (yᵢ − ŷᵢ)² represents the squared residuals, which are the squared differences between observed values (yᵢ) and predicted values (ŷᵢ).

Σᵢ₌₁ⁿ (yᵢ − ŷᵢ)² is the Residual Sum of Squares (RSS), which quantifies the total deviation of observed values from the predicted values.


Dividing RSS by n − 2 adjusts for the loss of degrees of freedom due to estimating parameters (the
slope and intercept).

3. Interpretation of sₑ

A smaller sₑ indicates that the data points are closer to the regression line, meaning the model has better predictive accuracy.
A larger sₑ suggests that the observed values deviate more from the regression line, indicating the model has poorer predictive accuracy.

For example:

If sₑ = 2, on average, the observed data points deviate by about 2 units from the regression line.
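
For illustration, here is a minimal Python sketch (NumPy assumed; the six data points are made up) that fits a simple linear regression and computes sₑ from the residuals exactly as in the formula above:

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
y = np.array([2.1, 4.3, 5.9, 8.2, 9.8, 12.1])

# np.polyfit with degree 1 returns (slope, intercept).
slope, intercept = np.polyfit(x, y, 1)
y_hat = intercept + slope * x

rss = np.sum((y - y_hat) ** 2)   # residual sum of squares
n = len(y)
s_e = np.sqrt(rss / (n - 2))     # divide by n - 2 degrees of freedom

print(f"s_e = {s_e:.3f}")        # typical deviation of points from the fitted line
```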

4. Relationship to R² (Coefficient of Determination)

The R² value in regression quantifies the proportion of variation in the dependent variable explained by the independent variable(s). While R² shows how well the model explains variability, sₑ gives the absolute measure of prediction error.

If R² is high, sₑ tends to be low, indicating a good fit. Conversely, a low R² often correlates with a higher sₑ.

5. Use of Standard Error of Regression

It helps in constructing prediction intervals, indicating the range within which the dependent
variable is likely to fall for a given independent variable.
It assists in hypothesis testing, such as testing the significance of regression coefficients.

In summary, sₑ is a key metric in regression analysis for assessing how well the regression model fits the data by measuring the average prediction error.

Earlier we noted that, given Xᵢ, σ² represents the (conditional) variance of both uᵢ and Yᵢ. Therefore, the standard error of the estimate can also be called the (conditional) standard deviation of uᵢ and Yᵢ. Of course, as usual, σY² and σY represent, respectively, the unconditional variance and unconditional standard deviation of Y.

Conditional and Unconditional Variance in Regression

In the context of regression analysis:

1. σ² is the conditional variance of the error term (uᵢ) and the dependent variable (Yᵢ) given the independent variable (Xᵢ).

2. σY² is the unconditional variance of the dependent variable (Y), which is not conditional on any specific values of X.

Let’s elaborate on these points:

1. Conditional Variance and Standard Deviation (σ² and σ)

When we say σ² represents the conditional variance of uᵢ and Yᵢ, it refers to the variance of these variables after considering the effect of the independent variable (Xᵢ).

In a linear regression model:

Yᵢ = β₀ + β₁Xᵢ + uᵢ

Here:

uᵢ is the error term (random deviation of actual Y from the regression line).
Yᵢ is the dependent variable.

Conditional variance is:

Var(Yᵢ | Xᵢ) = Var(uᵢ | Xᵢ) = σ²

This means that, for a fixed Xᵢ, the variability of Yᵢ around the regression line (or equivalently the variability of uᵢ) is captured by σ², and its square root σ is the conditional standard deviation.

2. Unconditional Variance and Standard Deviation (σY² and σY)

The unconditional variance (σY²) is the variance of Y across all possible values of X, without considering any specific X.

Unconditionally:

Var(Y) = σY²

In regression, the unconditional variance of Y can be decomposed into:

Var(Y) = β₁² · Var(X) + σ²

Here:

The term β₁² · Var(X) is the portion of variance in Y explained by X.
The σ² represents the portion of variance in Y not explained by X (i.e., the residual variance).

Thus, σ² captures the unexplained variability in Y, while σY² is the total variability of Y.
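
A short simulation sketch (the values of β₀, β₁, σ and the distribution of X are assumed purely for illustration) can verify this decomposition numerically:

```python
import numpy as np

rng = np.random.default_rng(1)
beta0, beta1, sigma = 1.0, 2.0, 3.0

x = rng.normal(0.0, 5.0, size=1_000_000)   # Var(X) = 25
u = rng.normal(0.0, sigma, size=x.size)    # conditional variance sigma^2 = 9
y = beta0 + beta1 * x + u

print(y.var())                         # unconditional Var(Y), about 4*25 + 9 = 109
print(beta1**2 * x.var() + sigma**2)   # explained portion plus residual portion
```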

3. Standard Error of the Estimate as Conditional Standard Deviation

The standard error of the estimate (sₑ) provides an estimate of σ, the conditional standard deviation of uᵢ and Yᵢ.

In practice, sₑ is used because the true σ is typically unknown and must be estimated from sample data.

4. Relationship Between Conditional and Unconditional Measures

The conditional variance σ² is typically smaller than the unconditional variance σY², as it reflects the variability in Y after accounting for the effect of X.

σY² includes both the variability explained by the regression model and the residual variability (i.e., σ²).

In summary:

Conditional variance (σ²) focuses on variability around the regression line (or conditional on X).
Unconditional variance (σY²) captures the overall variability of Y in the population, regardless of X.

As n increases, the precision with which β2 can be estimated also increases. (Why?)

The precision with which β2 (or any regression coefficient) can be estimated increases as the sample size
(n) increases because of the following reasons:

1. Variance of the Estimator Decreases with n

In regression analysis, the variance of an estimator (e.g., β̂₂) is inversely proportional to the sample size n. Specifically, for simple linear regression:

Var(β̂₂) = σ² / Σᵢ₌₁ⁿ (Xᵢ − X̄)²

Where:

σ²: Variance of the error term (uᵢ).
Xᵢ: Values of the independent variable.
X̄: Mean of X.

As n increases:

1. The sum Σᵢ₌₁ⁿ (Xᵢ − X̄)² grows larger because more data points provide more information about the variability of X.
2. The denominator of the variance formula becomes larger, reducing Var(β̂₂).

Lower variance means the estimator β̂₂ is more precise.

2. Law of Large Numbers

The Law of Large Numbers ensures that as n increases, sample statistics (e.g., sample mean and
sample variance) converge to their population counterparts. This reduces the variability in the estimates
of regression coefficients, leading to greater precision.

3. Central Limit Theorem

The Central Limit Theorem implies that as n increases, the sampling distribution of β̂₂ becomes more tightly concentrated around the true value β₂. This makes the estimate more reliable and reduces the standard error.

The standard error of β̂₂ is given by:

SE(β̂₂) = √Var(β̂₂)

As n increases, SE(β̂₂) decreases, indicating increased precision.

4. More Data Reduces Noise

A larger sample size provides more information about the relationship between X and Y , reducing the
influence of random noise or outliers in the data. With more observations, the regression line becomes a
better representation of the underlying true relationship.

5. Confidence Intervals Narrow


As n increases, the confidence intervals for β₂ become narrower due to the reduction in SE(β̂₂). Narrower confidence intervals indicate more precise estimates of β₂.

Summary

The precision of β2 increases with n because:


1. The variance of the estimator decreases.


2. Sample statistics converge to population parameters.
3. The sampling distribution becomes more concentrated around the true value.
4. Random noise has less influence with more data points.
5. Confidence intervals become narrower, reflecting reduced uncertainty.

In practical terms, larger samples provide more robust and reliable estimates, minimizing the impact of
variability and randomness.
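
The following simulation sketch (the model parameters, normal design, and repetition count are assumed for illustration) shows the empirical spread of β̂₂ shrinking roughly as 1/√n:

```python
import numpy as np

rng = np.random.default_rng(2)
beta1, beta2, sigma, reps = 1.0, 0.5, 2.0, 2000

for n in (20, 200, 2000):
    estimates = []
    for _ in range(reps):
        x = rng.normal(0.0, 1.0, size=n)
        y = beta1 + beta2 * x + rng.normal(0.0, sigma, size=n)
        slope, _ = np.polyfit(x, y, 1)   # OLS slope estimate for this sample
        estimates.append(slope)
    # Empirical standard error of beta2-hat across repeated samples.
    print(n, np.std(estimates))
```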

how are the OLS estimators BLUE?

The Ordinary Least Squares (OLS) estimators are considered BLUE, which stands for Best Linear
Unbiased Estimators, under the assumptions of the Classical Linear Regression Model (CLRM). Let's
break this down step by step to understand why OLS estimators have this property.

Key Points in BLUE

1. Best: The OLS estimators have the smallest variance among all linear and unbiased estimators
(they are efficient).
2. Linear: The OLS estimators are linear functions of the dependent variable (Yᵢ).

3. Unbiased: The expected value of the OLS estimators equals the true population parameters,
meaning they are correct on average.

OLS Assumptions Ensuring BLUE Property

The Gauss-Markov Theorem guarantees that OLS estimators are BLUE if the following assumptions
hold:

1. Linear Model: The relationship between the dependent variable (Y) and independent variables (X₁, X₂, …, Xₖ) is linear:

Yᵢ = β₀ + β₁X₁ᵢ + β₂X₂ᵢ + ⋯ + βₖXₖᵢ + uᵢ

2. Exogeneity (Unbiasedness Assumption): The expected value of the error term (uᵢ) is zero, and it is uncorrelated with the independent variables:

E(uᵢ) = 0,  E(uᵢXⱼᵢ) = 0 for all j

3. Homoscedasticity: The variance of the error term is constant across all observations:

Var(uᵢ) = σ² for all i

4. No Autocorrelation: The error terms are uncorrelated across observations:

Cov(uᵢ, uⱼ) = 0 for i ≠ j

5. Full Rank (No Perfect Multicollinearity): The independent variables are not perfectly correlated,
ensuring the regression coefficients can be uniquely estimated.

6. Normality (optional for small samples): While not required for the Gauss-Markov Theorem,
normality of uᵢ ensures hypothesis testing and inference are valid.

Why OLS Estimators are BLUE

1. Linearity

The OLS estimators (β̂ⱼ) are linear combinations of the dependent variable Y, derived by minimizing the sum of squared residuals:

β̂ = (X′X)⁻¹X′Y

This is a linear function of Y, ensuring the linearity condition.

2. Unbiasedness

Under the assumption E(uᵢ) = 0 and X being exogenous:

E(β̂) = β

This means the OLS estimators correctly estimate the true population parameters on average.

3. Efficiency (Best)

Among all linear and unbiased estimators, OLS estimators have the smallest variance. The variance of β̂ is given by:

Var(β̂) = σ²(X′X)⁻¹

If the OLS assumptions hold, no other linear and unbiased estimator can achieve a smaller
variance than this.
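
As an illustration, here is a minimal Python sketch (NumPy assumed; the data-generating values are made up) that computes β̂ = (X′X)⁻¹X′Y and the estimated variance-covariance matrix σ̂²(X′X)⁻¹ directly from the matrix formulas:

```python
import numpy as np

rng = np.random.default_rng(3)
n = 100
x1 = rng.normal(size=n)
y = 2.0 + 1.5 * x1 + rng.normal(scale=0.5, size=n)

# Design matrix with a column of ones for the intercept.
X = np.column_stack([np.ones(n), x1])

# Solve the normal equations (np.linalg.solve is numerically preferable
# to forming the explicit inverse when only beta-hat is needed).
beta_hat = np.linalg.solve(X.T @ X, X.T @ y)
print(beta_hat)   # roughly [2.0, 1.5]

# Estimate sigma^2 from the residuals with n - k degrees of freedom,
# then form the variance-covariance matrix sigma^2 (X'X)^(-1).
residuals = y - X @ beta_hat
sigma2_hat = residuals @ residuals / (n - X.shape[1])
print(sigma2_hat * np.linalg.inv(X.T @ X))
```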

Gauss-Markov Theorem

The Gauss-Markov Theorem states:

Under assumptions 1–5 of the CLRM, the OLS estimators are the Best Linear Unbiased
Estimators (BLUE).

This means:

1. The OLS estimators are unbiased (E(β̂) = β).


2. Among all linear unbiased estimators, they have the minimum variance.

Limitations

OLS estimators are BLUE under the Gauss-Markov assumptions. However:

1. If the assumptions are violated (e.g., heteroscedasticity, autocorrelation, or endogeneity), the OLS
estimators may no longer be BLUE.
2. For small samples, normality of errors may be needed to ensure valid statistical inference.

Summary

OLS estimators are BLUE because they are linear, unbiased, and efficient (minimum variance) under the
Gauss-Markov assumptions. This makes OLS a robust and widely-used method for estimating regression
coefficients in linear models.
