
Department of Accounting and Finance


INDIVIDUAL ASSIGNMENT
Name: Tizazu Getachew
ID/No: 053/12
Section: One
1. Discuss the causes of heteroscedasticity and explain the tests used to detect it?

Causes of Heteroscedasticity
 Omitted Variables: If an important variable that is
correlated with the independent variable is omitted from the
regression model, it can lead to heteroscedasticity.
 Measurement Errors: Errors in measuring the dependent
variable can result in heteroscedasticity, as the variance of
the error term may depend on the values of the independent
variables.
 Non-linear Relationships: If the true relationship between
the dependent and independent variables is non-linear, the
assumption of constant variance may be violated, leading to
heteroscedasticity.
 Outliers: The presence of outliers in the data can also
contribute to heteroscedasticity, as the variance of the error
term may be larger for observations with extreme values.
 Heterogeneous Populations: If the data is drawn from a
population that is not homogeneous, the variance of the
error term may depend on the values of the independent
variables.
Tests for Heteroscedasticity
 Breusch-Pagan Test
This test examines the relationship between the squared residuals and
the independent variables.
The null hypothesis is that the error variance is constant
(homoscedasticity), and the alternative hypothesis is that the error
variance is not constant (heteroscedasticity).

If the test statistic is significant, it suggests the presence of heteroscedasticity.
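For illustration, here is a minimal sketch of the Breusch-Pagan test using statsmodels' het_breuschpagan function; the data are simulated placeholders whose error spread grows with the regressor, and the variable names are arbitrary.

```python
import numpy as np
import statsmodels.api as sm
from statsmodels.stats.diagnostic import het_breuschpagan

# Simulated placeholder data in which the error variance grows with x
rng = np.random.default_rng(0)
x = rng.uniform(1, 10, 200)
y = 2 + 3 * x + rng.normal(scale=x)   # error spread increases with x

X = sm.add_constant(x)                # design matrix with an intercept
ols_results = sm.OLS(y, X).fit()

# Breusch-Pagan: squared residuals are related to the explanatory variables
lm_stat, lm_pvalue, f_stat, f_pvalue = het_breuschpagan(ols_results.resid, X)
print(f"LM statistic = {lm_stat:.2f}, p-value = {lm_pvalue:.4f}")
# A small p-value leads to rejecting the null of constant error variance.
```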
 White Test
This test is similar to the Breusch-Pagan test, but it examines the
relationship between the squared residuals and the independent
variables, their squares, and their cross-products.

The null hypothesis is that the error variance is constant (homoscedasticity), and the alternative hypothesis is that the error variance is not constant (heteroscedasticity).

If the test statistic is significant, it suggests the presence of heteroscedasticity.
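A minimal sketch of the White test using statsmodels' het_white, again on simulated placeholder data, this time with two regressors so that squares and cross-products enter the auxiliary regression.

```python
import numpy as np
import statsmodels.api as sm
from statsmodels.stats.diagnostic import het_white

# Placeholder data with two regressors and a non-constant error variance
rng = np.random.default_rng(1)
n = 200
x1 = rng.uniform(1, 10, n)
x2 = rng.uniform(1, 10, n)
y = 1 + 2 * x1 - 0.5 * x2 + rng.normal(scale=x1)

X = sm.add_constant(np.column_stack([x1, x2]))
resid = sm.OLS(y, X).fit().resid

# White test: squared residuals regressed on the regressors,
# their squares and their cross-products
lm_stat, lm_pvalue, f_stat, f_pvalue = het_white(resid, X)
print(f"White LM statistic = {lm_stat:.2f}, p-value = {lm_pvalue:.4f}")
```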
 Goldfeld-Quandt Test
This test divides the data into two groups based on the values of an
independent variable and compares the variances of the error terms in
the two groups.

The null hypothesis is that the error variance is constant (homoscedasticity), and the alternative hypothesis is that the error variance is not constant (heteroscedasticity).

If the test statistic is significant, it suggests the presence of heteroscedasticity.
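A minimal sketch of the Goldfeld-Quandt test using statsmodels' het_goldfeldquandt, with the placeholder data sorted by the regressor suspected of driving the error variance.

```python
import numpy as np
import statsmodels.api as sm
from statsmodels.stats.diagnostic import het_goldfeldquandt

# Placeholder data sorted by the regressor thought to drive the variance
rng = np.random.default_rng(2)
x = np.sort(rng.uniform(1, 10, 200))
y = 3 + 1.5 * x + rng.normal(scale=x)

X = sm.add_constant(x)

# Goldfeld-Quandt: fit separate regressions on the low-x and high-x halves
# and compare their residual variances with an F test
f_stat, p_value, ordering = het_goldfeldquandt(y, X)
print(f"F statistic = {f_stat:.2f}, p-value = {p_value:.4f}")
```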

2. State the consequences of autocorrelation and explain the tests used to detect it?

Consequences of Autocorrelation
 Biased Standard Errors
In the presence of autocorrelation, the OLS coefficient estimates remain unbiased (provided the regressors are exogenous), but the conventional standard errors of the regression coefficients will be biased, leading to incorrect inferences about the significance of the independent variables.
 Inefficient Parameter Estimates
Autocorrelation violates the assumption of independent error terms,
which is required for the ordinary least squares (OLS) estimator to be
efficient. As a result, the parameter estimates will not be the most
efficient (i.e., have the smallest possible variance).
 Incorrect Confidence Intervals and Hypothesis Tests
The presence of autocorrelation can lead to incorrect confidence
intervals and invalid hypothesis tests, as the standard errors used to
construct these statistics will be biased.
 Spurious Regression
Autocorrelation can lead to the appearance of a significant relationship
between variables when, in fact, there is no true relationship (i.e., a
spurious regression).

Tests for Autocorrelation


 Durbin-Watson Test
This is a commonly used test for detecting first-order autocorrelation
(i.e., correlation between adjacent error terms).

The test statistic ranges from 0 to 4, with a value of 2 indicating no autocorrelation; values well below 2 point to positive autocorrelation and values well above 2 point to negative autocorrelation.

If the test statistic is significantly different from 2, it suggests the presence of autocorrelation.
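A minimal sketch of the Durbin-Watson statistic using statsmodels' durbin_watson, with placeholder data whose errors follow an AR(1) process.

```python
import numpy as np
import statsmodels.api as sm
from statsmodels.stats.stattools import durbin_watson

# Placeholder data with AR(1) errors (positive autocorrelation)
rng = np.random.default_rng(3)
n = 200
x = rng.normal(size=n)
e = np.zeros(n)
for t in range(1, n):
    e[t] = 0.7 * e[t - 1] + rng.normal()   # autocorrelated errors
y = 1 + 2 * x + e

results = sm.OLS(y, sm.add_constant(x)).fit()

dw = durbin_watson(results.resid)
print(f"Durbin-Watson statistic = {dw:.2f}")
# Values well below 2 suggest positive autocorrelation,
# values well above 2 suggest negative autocorrelation.
```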
 Breusch-Godfrey Test
This test is more general than the Durbin-Watson test, as it can detect
higher-order autocorrelation (i.e., correlation between error terms
separated by more than one time period).

The null hypothesis is that there is no autocorrelation, and the alternative hypothesis is that there is autocorrelation.

If the test statistic is significant, it suggests the presence of autocorrelation.
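A minimal sketch of the Breusch-Godfrey test using statsmodels' acorr_breusch_godfrey applied to fitted OLS results on placeholder data.

```python
import numpy as np
import statsmodels.api as sm
from statsmodels.stats.diagnostic import acorr_breusch_godfrey

# Placeholder regression with autocorrelated errors
rng = np.random.default_rng(4)
n = 200
x = rng.normal(size=n)
e = np.zeros(n)
for t in range(1, n):
    e[t] = 0.6 * e[t - 1] + rng.normal()
y = 0.5 + 1.5 * x + e

results = sm.OLS(y, sm.add_constant(x)).fit()

# Test for autocorrelation up to lag 4
lm_stat, lm_pvalue, f_stat, f_pvalue = acorr_breusch_godfrey(results, nlags=4)
print(f"LM statistic = {lm_stat:.2f}, p-value = {lm_pvalue:.4f}")
```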
 Ljung-Box Test
This test examines the joint significance of a set of autocorrelation coefficients of the residuals, rather than just the first-order autocorrelation.
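A minimal sketch of the Ljung-Box test using statsmodels' acorr_ljungbox on a placeholder residual series.

```python
import numpy as np
from statsmodels.stats.diagnostic import acorr_ljungbox

# Placeholder residual series with some autocorrelation
rng = np.random.default_rng(5)
n = 200
resid = np.zeros(n)
for t in range(1, n):
    resid[t] = 0.5 * resid[t - 1] + rng.normal()

# Joint test that the first 10 autocorrelations are zero
lb = acorr_ljungbox(resid, lags=[10])
print(lb)  # in recent statsmodels versions this is a DataFrame
           # with columns lb_stat and lb_pvalue
```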

3. Explain three remedial measures suggested to overcome multicollinearity and explain the tests used to detect it?

1. Dropping one of the highly correlated variables

If two or more independent variables in a regression model are highly correlated, one solution is to drop one of the variables from the model.

This can be done by examining the correlation matrix of the independent variables and identifying the variables with high correlation coefficients.

Tests used to detect multicollinearity


 Variance Inflation Factor (VIF): VIF measures the degree to which the variance of an estimated regression coefficient is inflated due to multicollinearity. A VIF value greater than 10 is generally considered an indication of severe multicollinearity.
 Condition Number: The condition index is usually computed as the square root of the ratio of the largest to the smallest eigenvalue of the correlation matrix of the independent variables. A condition index greater than about 30 suggests the presence of multicollinearity (a small computational sketch follows this list).
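A minimal sketch of both diagnostics, using statsmodels' variance_inflation_factor together with an eigenvalue-based condition index computed by hand on placeholder data; the square-root convention used below is one common choice.

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm
from statsmodels.stats.outliers_influence import variance_inflation_factor

# Placeholder data with two deliberately correlated regressors
rng = np.random.default_rng(6)
n = 200
x1 = rng.normal(size=n)
x2 = 0.9 * x1 + 0.1 * rng.normal(size=n)   # nearly collinear with x1
x3 = rng.normal(size=n)
X = sm.add_constant(pd.DataFrame({"x1": x1, "x2": x2, "x3": x3}))

# Variance inflation factor for each regressor (skip the constant)
for i, name in enumerate(X.columns):
    if name != "const":
        print(f"VIF({name}) = {variance_inflation_factor(X.values, i):.1f}")

# Condition index: square root of the largest-to-smallest eigenvalue ratio
# of the correlation matrix of the regressors (excluding the constant)
eigvals = np.linalg.eigvalsh(X.drop(columns="const").corr().values)
print(f"Condition index = {np.sqrt(eigvals.max() / eigvals.min()):.1f}")
```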
2. Combining the correlated variables into a single variable

Instead of dropping one of the highly correlated variables, another approach is to combine the correlated variables into a single variable.

This can be done by creating a new variable that is a linear combination of the original correlated variables, such as the average or the first principal component.
Tests used to detect multicollinearity

 Correlation Matrix: Examining the correlation matrix of the independent variables can help identify the variables that are highly correlated and should be combined.
 Principal Component Analysis (PCA): PCA can be used to identify the principal components of the independent variables, which can then be used to create new, uncorrelated variables.
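A minimal sketch of this approach on placeholder data, combining two correlated variables either by averaging their standardized values or by taking their first principal component.

```python
import numpy as np
import pandas as pd

# Placeholder data: x1 and x2 are highly correlated measures of the same thing
rng = np.random.default_rng(7)
n = 200
x1 = rng.normal(size=n)
x2 = x1 + 0.2 * rng.normal(size=n)
df = pd.DataFrame({"x1": x1, "x2": x2})

# Inspect the correlation matrix to spot highly correlated pairs
print(df.corr())

# Option 1: replace the pair with the average of the standardized variables
z = (df - df.mean()) / df.std()
df["combined_avg"] = z[["x1", "x2"]].mean(axis=1)

# Option 2: replace the pair with the first principal component
cov = np.cov(z[["x1", "x2"]].values, rowvar=False)
eigvals, eigvecs = np.linalg.eigh(cov)          # eigenvalues sorted ascending
df["combined_pc1"] = z[["x1", "x2"]].values @ eigvecs[:, -1]
print(df[["combined_avg", "combined_pc1"]].head())
```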

3. Using principal component regression (PCR)

Principal component regression is a technique that involves using the principal components of the independent variables as the new independent variables in the regression model.

This approach can be useful when there is a high degree of multicollinearity among the original independent variables.

Tests used to detect multicollinearity


 Eigenvalues: The eigenvalues of the correlation matrix of the
independent variables can be used to determine the number
of principal components to include in the regression model.
 Proportion of Variance Explained: The proportion of
variance in the dependent variable that is explained by the
principal components can be used to assess the effectiveness
of the PCR approach.
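A minimal sketch of principal component regression using a scikit-learn pipeline on placeholder data; scikit-learn is assumed available, and the number of retained components here is arbitrary.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Placeholder data with strongly collinear regressors
rng = np.random.default_rng(8)
n = 200
x1 = rng.normal(size=n)
x2 = x1 + 0.1 * rng.normal(size=n)
x3 = rng.normal(size=n)
X = np.column_stack([x1, x2, x3])
y = 1 + 2 * x1 + 0.5 * x3 + rng.normal(size=n)

# PCR: standardize, keep the leading components,
# then regress y on the retained component scores
pcr = make_pipeline(StandardScaler(), PCA(n_components=2), LinearRegression())
pcr.fit(X, y)

pca = pcr.named_steps["pca"]
print("Proportion of predictor variance explained:", pca.explained_variance_ratio_)
print("R^2 of the PCR fit:", round(pcr.score(X, y), 3))
```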
4. What is a p-value?

The p-value is a fundamental concept in statistical hypothesis testing and inference. It is a measure of the strength of the evidence against the null hypothesis in a statistical test.

Specifically, the p-value represents the probability of obtaining a test statistic that is at least as extreme as the one observed, given that the null hypothesis is true. In other words, the p-value is the probability of getting the observed (or more extreme) results if the null hypothesis is actually true.

The p-value is used to determine the statistical significance of the results of a hypothesis test. The smaller the p-value, the stronger the evidence against the null hypothesis. Typically, researchers use a predetermined significance level, often denoted as α (alpha), to make a decision about the null hypothesis.

The interpretation of the p-value is as follows:


 If the p-value is less than the significance level (α), the null
hypothesis is rejected, and the result is considered
statistically significant.
 If the p-value is greater than or equal to the significance level
(α), the null hypothesis is not rejected, and the result is
considered not statistically significant.
 The most commonly used significance levels are 0.05 (5%)
and 0.01 (1%), but other levels can be used depending on the
context and the researcher's preferences.
 It's important to note that the p-value does not directly tell
you the probability that the null hypothesis is true or the
magnitude of the effect. It only provides information about
the strength of the evidence against the null hypothesis
based on the observed data.
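A minimal sketch of how a p-value is obtained and compared with a significance level, using a one-sample t-test from scipy on simulated placeholder data.

```python
import numpy as np
from scipy import stats

# Placeholder sample: test H0 that the population mean equals 5
rng = np.random.default_rng(9)
sample = rng.normal(loc=5.4, scale=1.0, size=30)

t_stat, p_value = stats.ttest_1samp(sample, popmean=5.0)
print(f"t statistic = {t_stat:.2f}, p-value = {p_value:.4f}")

# Decision rule at a 5% significance level
alpha = 0.05
if p_value < alpha:
    print("Reject the null hypothesis (statistically significant).")
else:
    print("Fail to reject the null hypothesis (not statistically significant).")
```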
5. Define the nature of dummy variables and give an example?

Dummy variables are binary variables used in regression analysis to represent discrete or qualitative characteristics. They are used to capture the effect of a categorical variable on the dependent variable in a regression model.

The nature of dummy variables is as follows:


 Binary Representation
Dummy variables are typically coded as either 0 or 1, representing the
presence or absence of a particular characteristic or category.

For example, if you have a variable representing gender, you can create
a dummy variable where 0 represents "Female" and 1 represents
"Male".
 Mutually Exclusive Categories
Dummy variables are used to represent mutually exclusive categories,
meaning that an observation can only belong to one category at a time.
For example, if you have a variable representing marital status with
categories "Single", "Married", and "Divorced", you would need to
create two dummy variables to represent these three categories.
 Reference Category
When using multiple dummy variables, one category is typically
designated as the reference category, which is represented by the value
0 for all dummy variables.

The regression coefficients for the other dummy variables are then
interpreted relative to the reference category.

Example

Let's consider a regression model that aims to predict the salary of employees based on their education level. The education level variable has three categories: "High School", "Bachelor's Degree", and "Master's Degree".

To incorporate this categorical variable into the regression model, you would create two dummy variables:

1. Bachelor's Degree Dummy

Coded as 1 if the employee has a Bachelor's Degree, 0 otherwise.

2. Master's Degree Dummy

Coded as 1 if the employee has a Master's Degree, 0 otherwise.

The reference category in this case would be "High School", which would be represented by 0 for both dummy variables.
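A minimal sketch of this education/salary example using pandas to build the two dummy variables, with "High School" as the reference category; the salary figures are placeholders.

```python
import pandas as pd
import statsmodels.api as sm

# Placeholder data mirroring the education/salary example above
df = pd.DataFrame({
    "education": ["High School", "Bachelor's Degree", "Master's Degree",
                  "Bachelor's Degree", "High School", "Master's Degree"],
    "salary": [30000, 45000, 61000, 47000, 32000, 58000],
})

# One dummy column per category, then keep two of them so that
# "High School" becomes the reference category (both dummies equal 0)
dummies = pd.get_dummies(df["education"]).astype(float)
X = sm.add_constant(dummies[["Bachelor's Degree", "Master's Degree"]])

model = sm.OLS(df["salary"], X).fit()
print(model.params)
# The two dummy coefficients are interpreted as the expected salary
# difference relative to the High School reference group.
```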

6. Define Time Series Analysis and explain what stationarity and unit roots mean?

Time series analysis is the study of data observed sequentially over time, with the aim of describing the structure of the series (such as trend, seasonality, and autocorrelation) and forecasting its future values. Two important concepts in time series analysis are stationarity and unit roots:
 Stationarity
Stationarity refers to the statistical properties of a time series, such as
the mean, variance, and autocorrelation structure, remaining constant
over time.

A stationary time series has a constant mean and variance, and the
covariance between any two time points depends only on the time
difference, not on the actual time.

Stationarity is important because many time series models, such as ARMA (autoregressive moving average) models, assume that the series is stationary; ARIMA (Autoregressive Integrated Moving Average) models handle non-stationary series by differencing them until they become stationary.
 Unit Roots
A unit root is a statistical property of a time series that indicates the
presence of a stochastic trend, which means that the series is non-
stationary.

A time series with a single unit root is said to be integrated of order 1, denoted as I(1), meaning that the series needs to be differenced once to become stationary.

The presence of a unit root can have important implications for the
analysis and modeling of the time series, as it can lead to spurious
regression results and invalid statistical inferences.
 Mean Stationarity
A time series is said to be mean stationary if its mean does not change
over time.
This means that the expected value of the time series is constant and
does not depend on the time index.
 Unit Roots
A time series has a unit root if it is non-stationary, meaning that its
statistical properties (such as the mean and variance) change over time.
The presence of a unit root indicates that the time series has a
stochastic trend, which means that the series is not mean-reverting and
can wander arbitrarily far from its starting value.
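A minimal sketch contrasting a unit-root (random-walk) series with its first difference. The augmented Dickey-Fuller (ADF) test from statsmodels is used here as one common way to check for a unit root, although no specific unit-root test is named above.

```python
import numpy as np
from statsmodels.tsa.stattools import adfuller

# Placeholder series: a random walk has a unit root and is non-stationary;
# its first difference is stationary white noise
rng = np.random.default_rng(10)
shocks = rng.normal(size=500)
random_walk = np.cumsum(shocks)          # I(1): needs one difference
first_difference = np.diff(random_walk)  # I(0): stationary

for name, series in [("level", random_walk), ("first difference", first_difference)]:
    adf_stat, p_value = adfuller(series)[:2]
    print(f"ADF test on the {name}: statistic = {adf_stat:.2f}, p-value = {p_value:.4f}")
# The null hypothesis of the ADF test is that the series has a unit root,
# so a large p-value for the level and a small one for the first difference
# is consistent with an I(1) series.
```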
