
Chapter3 - Learning To Use Regression Analysis

1) The document outlines the 6 key steps in performing regression analysis: 1) review literature and develop a theoretical model, 2) specify the model by selecting variables and functional form, 3) hypothesize expected coefficient signs, 4) collect and clean the data, 5) estimate and evaluate the model, and 6) document the results. 2) An example is provided of using regression to determine the best locations for a new restaurant chain. Independent variables like the number of competitors, local population, and average income are selected based on literature review. The model is estimated using existing restaurant data and provides expected results. 3) Regression analysis allows researchers to systematically test theoretical models using data. Close attention to specification, assumptions

Uploaded by

ZiaNaPiramLi

LEARNING TO USE REGRESSION ANALYSIS

Steps in Applied Regression Analysis
Step 1: Review literature and develop theoretical model.
Step 2: Specify model: Select independent variables and functional form.
Step 3: Hypothesize expected signs of coefficients.
Step 4: Collect data. Inspect and clean data.
Step 5: Estimate and evaluate equation.
Step 6: Document results.


Step 1: Review the Literature and
Develop the Theoretical Model

• Best data analysts start with theory!
• It’s smart to review scholarly literature before doing anything else.
• Many approaches, but searching EconLit is helpful.
• When topic has not been studied, two strategies:
1. Transfer theory from a similar topic to your topic.
2. Consult someone who works in the area.

Step 2: Specify the Model: Select the Independent
Variables and Functional Form

• Most important step in applied regression analysis: specification of theoretical model.
• Specifying a model involves choosing:
1. Independent variables and how they should be measured.
2. Functional (mathematical) form of variables.
3. Properties of stochastic error term.

Step 2: Specify the Model (continued)
• Any mistake in these three components leads to a specification error, which seriously damages the validity of the estimated equation.
• Choose independent variables based on theory.
• Judgment must often be used, and researchers impose priors.

Example: Estimate demand equation for a good.
Theory suggests including prices of complements and substitutes.
Which ones do you choose?

Step 3: Hypothesize the Expected
Signs of Coefficients

• Once variables selected, hypothesize expected signs of coefficients.
• Often, basic theory is general knowledge and expected coefficient signs need no explanation.
• If there’s uncertainty, opposing theories should be documented and your hypothesized sign explained.

Step 3: Hypothesize the Expected
Signs of Coefficients (continued)
Example: Impact of class size on student learning.
dependent variable:
Y= student score on grammar test
independent variables:
X1 = income level of student’s family
X2 = students per teacher

Notation with hypothesized signs above coefficients:

$$Y_i = \beta_0 + \overset{+}{\beta_1} X_{1i} + \overset{-}{\beta_2} X_{2i} + \epsilon_i \qquad (3.1)$$

Step 4: Collect the Data.
Inspect and Clean the Data

• Obtaining and preparing an original data set is difficult.
• General rule: the more observations the better.
• Reason there should be as many observations as possible concerns the concept of degrees of freedom (first mentioned in Section 2.4).
• With large number of degrees of freedom, every positive error is likely balanced by a negative error.
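As a rough illustration of why more observations help, the sketch below simulates a one-variable regression many times at two sample sizes and compares how much the estimated slope varies from sample to sample. The true model (Y = 2 + 3X + error), the sample sizes, and the function name `slope_spread` are assumptions chosen for the demonstration; they are not taken from the chapter.

```python
import numpy as np

def slope_spread(n_obs, n_trials=2000, seed=0):
    """Standard deviation of the estimated slope across simulated samples."""
    rng = np.random.default_rng(seed)
    slopes = np.empty(n_trials)
    for trial in range(n_trials):
        x = rng.uniform(0, 10, n_obs)
        # Assumed true model: Y = 2 + 3X + error, with error ~ Normal(0, 5)
        y = 2.0 + 3.0 * x + rng.normal(0, 5, n_obs)
        slope, _intercept = np.polyfit(x, y, 1)  # degree-1 fit returns [slope, intercept]
        slopes[trial] = slope
    return slopes.std()

# Degrees of freedom are N - K - 1, with K = 1 slope coefficient here.
spread_small = slope_spread(n_obs=5)     # 3 degrees of freedom
spread_large = slope_spread(n_obs=100)   # 98 degrees of freedom
print(f"slope spread with N=5:   {spread_small:.3f}")
print(f"slope spread with N=100: {spread_large:.3f}")
```

With more degrees of freedom the positive and negative errors tend to offset each other, so the slope estimates cluster more tightly around the true value.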

Step 4: Collect the Data.
Inspect and Clean the Data (continued)
• Another question: does the unit of measurement of the variables matter?
• Short answer: no, except when interpreting the scale of the coefficients.
Example: An independent variable is measured in dollars or in thousands of dollars.
• Constant term and measures of fit are unchanged.
• Slope coefficient of the variable changes by exactly the amount needed to compensate for the change in units (here, a factor of 1,000).
• Variable measured in “thousands of $”: coefficient is 50
• Variable measured in “$”: coefficient is 0.05
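The effect of changing units can be checked numerically. The sketch below fits the same made-up data twice, once with the independent variable in thousands of dollars and once in dollars. The data, the helper `fit_line`, and the noise level are all hypothetical; only the rescaling logic reflects the point made above.

```python
import numpy as np

def fit_line(x, y):
    """Return (intercept, slope, R^2) for a one-variable OLS fit."""
    slope, intercept = np.polyfit(x, y, 1)
    residuals = y - (intercept + slope * x)
    r2 = 1 - np.sum(residuals**2) / np.sum((y - y.mean())**2)
    return intercept, slope, r2

rng = np.random.default_rng(1)
x_thousands = rng.uniform(20, 100, 50)               # hypothetical values, in thousands of $
y = 10 + 50 * x_thousands + rng.normal(0, 100, 50)   # made-up relationship
x_dollars = x_thousands * 1000                       # the same variable, in $

b0_k, b1_k, r2_k = fit_line(x_thousands, y)
b0_d, b1_d, r2_d = fit_line(x_dollars, y)

print("slope ratio:", b1_k / b1_d)   # approximately 1000
print("intercepts match:", np.isclose(b0_k, b0_d))
print("R^2 matches:", np.isclose(r2_k, r2_d))
```

Only the slope moves, and it moves by the same factor as the units, so the fitted values are identical in both versions.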
Step 4: Collect the Data.
Inspect and Clean the Data (continued)
• Always review data set for errors.
• Approaches:
• Plot the data and look for outliers.
• Look at mean, maximum and minimum of each variable.
• Typically, data can be “cleaned” by replacing an incorrect value with the correct value.
• In extremely rare circumstances, drop an observation.
• BE CAREFUL! Mere existence of an outlier is not a justification for dropping that observation.
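One possible way to run these checks is sketched below: it reports the minimum, mean, and maximum of a variable and flags values that sit far from the median. The income figures are invented, and the flagging rule (distance from the median greater than 5 times the median absolute deviation) is an assumed rule of thumb, not something prescribed by the chapter.

```python
from statistics import mean, median

def summarize(values):
    """Minimum, mean, and maximum of one variable."""
    return min(values), mean(values), max(values)

def flag_suspects(values, threshold=5.0):
    """Indices of values far from the median, measured in units of the
    median absolute deviation (MAD). The threshold of 5 is an assumption."""
    center = median(values)
    mad = median(abs(v - center) for v in values)
    return [i for i, v in enumerate(values) if abs(v - center) > threshold * mad]

# Hypothetical household incomes; one entry looks like a keying error.
income = [19800, 20500, 21000, 18900, 205000, 22300, 20100, 19500]

print(summarize(income))      # the maximum stands out against the mean
print(flag_suspects(income))  # indices to look up in the original source
```

A flagged value is only a prompt to go back to the source. If the source confirms the value, the observation stays in the data set.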
Step 5: Estimate and Evaluate the Equation
• It can take months to complete steps 1–4!

• Once your equation is estimated, your work is not over.

• Rather, you need to evaluate.

• For example:
• How well did the equation fit the data?
• Were signs and magnitudes of coefficients as expected?

• If evaluation indicates a problem, go back to step 1.
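A minimal sketch of the estimate-then-evaluate loop follows, using the class-size example from Step 3. The data are simulated, so the variable names, units, and "true" coefficients are assumptions made for the illustration. The sketch estimates the equation by ordinary least squares with NumPy, computes R², and compares each estimated sign with its hypothesized sign.

```python
import numpy as np

rng = np.random.default_rng(42)
n = 200

# Simulated (hypothetical) data for the class-size example.
income = rng.uniform(20, 120, n)       # X1: family income, assumed in thousands of $
class_size = rng.uniform(15, 35, n)    # X2: students per teacher
score = 60 + 0.2 * income - 0.8 * class_size + rng.normal(0, 5, n)  # Y: test score

# Estimate Y = b0 + b1*X1 + b2*X2 by ordinary least squares.
X = np.column_stack([np.ones(n), income, class_size])
beta, *_ = np.linalg.lstsq(X, score, rcond=None)

# Evaluate fit: R^2 = 1 - (residual sum of squares / total sum of squares).
fitted = X @ beta
r2 = 1 - np.sum((score - fitted) ** 2) / np.sum((score - score.mean()) ** 2)

# Evaluate signs against the hypotheses written down before estimation.
expected_signs = {"income": +1, "class_size": -1}
estimated = {"income": beta[1], "class_size": beta[2]}
signs_ok = {name: np.sign(estimated[name]) == expected_signs[name]
            for name in expected_signs}

print("coefficients:", beta)
print("R^2:", round(r2, 3))
print("signs as hypothesized:", signs_ok)
```

Writing the expected signs down in code before looking at the estimates mirrors the order of the steps: the hypothesis comes first and the estimate is judged against it.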

Step 6: Document the Results
• A standard format is usually used to present results:

$$\hat{Y}_i = 103.40 + \underset{(0.88)}{6.38}\,X_i \qquad (3.2)$$
$$t = 7.22 \qquad N = 20 \qquad R^2 = 0.73$$

• Number in parentheses is the standard error of the coefficient.
• The t-statistic is the one used to test the hypothesis that the true value of the coefficient is different from zero.
• It is also important to explain the model and assumptions and to document data manipulations in the written report.
Example: Using Regression
Analysis to Pick Restaurant Locations
• You’re hired to determine the best location for the next Woody’s restaurant (a moderately priced, 24-hour, family restaurant chain).
• You decide to build a regression model to explain the gross sales volume of each of the restaurants.

Step 1: Review the literature and develop theoretical model.
• Read about restaurant industry.
• Talk to experts within the firm.

Example: Using Regression Analysis
to Pick Restaurant Locations (continued)
Step 2: Specify the model: Select independent
variables and the functional form.
• You decide there are three major determinants of sales:

N = Competition: the number of direct market competitors within a two-mile radius of the Woody’s location
P = Population: the number of people living within a three-mile radius of the Woody’s location
I = Income: the average household income of the population measured in variable P
Example: Using Regression Analysis
to Pick Restaurant Locations (continued)
Step 3: Hypothesize the expected signs of the
coefficients.
• N: The more competition in the area, the fewer customers the location will have (negative).
• P: The more people in the area, the more customers the location will have (positive).
• I: Unclear; probably positive but could be negative.
• Thus:

$$Y_i = \beta_0 + \overset{-}{\beta_N} N_i + \overset{+}{\beta_P} P_i + \overset{+?}{\beta_I} I_i + \epsilon_i \qquad (3.3)$$

Example: Using Regression Analysis
to Pick Restaurant Locations (continued)
Step 4: Collect the data. Inspect and clean the data.
Table 3.1: Data for Woody’s Restaurant Example

Example: Using Regression Analysis
to Pick Restaurant Locations (continued)
Step 5: Estimate and evaluate the equation.
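• With software and the data set, you estimate:

$$\hat{Y}_i = 102{,}192 - \underset{(2053)}{9075}\,N_i + \underset{(0.073)}{0.355}\,P_i + \underset{(0.543)}{1.288}\,I_i \qquad (3.4)$$
$$t = -4.42, \quad 4.88, \quad 2.37 \qquad N = 33 \text{ (observations)} \qquad R^2 = 0.579$$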
• Estimated coefficients have expected signs.
• Overall fit seems reasonable.
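As a quick consistency check on Equation 3.4, each t-statistic can be recomputed as the coefficient divided by its standard error. The numbers below are copied from the equation; the dictionary names are just labels for this sketch.

```python
# Coefficients and standard errors as reported in Equation 3.4.
coefs = {"N": -9075.0, "P": 0.355, "I": 1.288}
std_errors = {"N": 2053.0, "P": 0.073, "I": 0.543}

# t-statistic for the null hypothesis that the true coefficient is zero.
t_stats = {name: coefs[name] / std_errors[name] for name in coefs}

for name, t in t_stats.items():
    print(f"{name}: t = {t:.2f}")
```

The values for N and I reproduce the reported −4.42 and 2.37. The value for P comes out near 4.86 rather than the reported 4.88, which is most likely because the displayed coefficient and standard error are themselves rounded.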

Example: Using Regression Analysis
to Pick Restaurant Locations (continued)
Step 6: Document the results.
• Equation 3.4 from Step 5 documents the results, which are pulled from statistical software output (like Table 3.2).
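Since the goal was to pick the next location, one natural use of the documented equation is to plug in the characteristics of a candidate site. The sketch below does this with the coefficients of Equation 3.4. The function name and the candidate site's values (4 competitors, 100,000 people, average income of 20,000) are hypothetical and are not drawn from Table 3.1.

```python
def predicted_sales(competitors, population, income):
    """Predicted gross sales volume from the estimated Equation 3.4."""
    return 102_192 - 9075 * competitors + 0.355 * population + 1.288 * income

# Hypothetical candidate site: values invented for illustration.
site_sales = predicted_sales(competitors=4, population=100_000, income=20_000)
print(f"predicted gross sales: {site_sales:,.0f}")  # about 127,152
```

Comparing this prediction across several candidate sites gives a ranking, although the R² of 0.579 is a reminder that a good share of the variation in sales is left unexplained by the three variables.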


CHAPTER 3: the end
