Econometrics_Hồ Thị Mến_K234080901


CHAP 14:

Ex6/609/641: The National Football League (NFL) records a variety of performance
data for individuals and teams. To investigate the importance of passing on the
percentage of games won by a team, the following data show the average number of
passing yards per attempt (Yds/Att) and the percentage of games won (winPct) for a
random sample of 10 NFL teams for the 2011 season (NFL website, February 12, 2012).

a. Develop a scatter diagram with the number of passing yards per attempt on the
horizontal axis and the percentage of games won on the vertical axis.
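b. What does the scatter diagram developed in part (a) indicate about the relationship
between the two variables?
The scatter diagram indicates a positive relationship between the two variables: teams with
more passing yards per attempt tend to win a higher percentage of their games.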

c. Develop the estimated regression equation that could be used to predict the
percentage of games won given the average number of passing yards per attempt.
$\hat{y} = -70.39095 + 17.17514x$
WinPct = -70.39095 + 17.17514*Yds/Att
d. Provide an interpretation for the slope of the estimated regression equation.
From the Stata result, the slope of the estimated regression equation is 17.17514. This means
that for each one-yard increase in the average number of passing yards per attempt, the
predicted percentage of games won increases by about 17.18 percentage points.

e. For the 2011 season, the average number of passing yards per attempt for the Kansas
City Chiefs was 6.2. Use the estimated regression equation developed in part (c) to
predict the percentage of games won by the Kansas City Chiefs. (Note: For the 2011
season the Kansas City Chiefs record was 7 wins and 9 losses.) Compare your
prediction to the actual percentage of games won by the Kansas City Chiefs.
$\hat{y}^* = -70.39095 + 17.17514 \times 6.2 = 36.094918 \approx 36.1\%$
The actual percentage of games won is:
$\text{WinPct} = \frac{7}{7 + 9} \times 100 = 43.75\%$
The model predicts a winning percentage of about 36.1%, whereas the Chiefs actually won
43.75% of their games in the 2011 season. The prediction therefore underestimates the actual
result by about 7.7 percentage points, which suggests that passing yards per attempt alone do
not fully explain the percentage of games won.
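
The comparison in part (e) can be sketched in a few lines of Python. This is only an illustration: the coefficients are the ones reported in part (c), and the function and variable names are made up for the example.

```python
# Coefficients of the estimated regression equation reported in part (c).
B0 = -70.39095
B1 = 17.17514


def predict_win_pct(yds_per_att: float) -> float:
    """Predicted percentage of games won for a given passing yards per attempt."""
    return B0 + B1 * yds_per_att


# Kansas City Chiefs, 2011 season: 6.2 yards per attempt, 7 wins and 9 losses.
predicted = predict_win_pct(6.2)
actual = 7 / (7 + 9) * 100
gap = actual - predicted  # positive gap: the model underestimates

print(f"predicted = {predicted:.2f}%, actual = {actual:.2f}%, gap = {gap:.2f} points")
```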
Ex18/619/651: The following data show the brand, price ($), and the overall score for six
stereo headphones that were tested by Consumer Reports (Consumer Reports website,
March 5, 2012). The overall score is based on sound quality and effectiveness of ambient
noise reduction. Scores range from 0 (lowest) to 100 (highest). The estimated regression
equation for these data is yˆ = 23.194 + .318x, where x = price ($) and y = overall score.

$\bar{x} = 100$, $\bar{y} = 55$
$\hat{y}_i$: 80.434, 70.894, 53.404, 45.454, 45.454, 34.324
a. Compute SST, SSR, and SSE.

$SST = \sum (y_i - \bar{y})^2 = 1800$

$SSR = \sum (\hat{y}_i - \bar{y})^2 = 1511.804$

$SSE = \sum (y_i - \hat{y}_i)^2 = 287.624$

b. Compute the coefficient of determination $r^2$. Comment on the goodness of fit.
$r^2 = \frac{SSR}{SST} = \frac{1511.804}{1800} = 0.8399$
The coefficient of determination is 0.8399. This indicates that approximately 83.99% of the
variability in the dependent variable (Overall Score) can be explained by its linear
relationship with Price. The remaining 16.01% is due to factors that are not included in this
model. Hence, the estimated regression equation is a good fit.
c. What is the value of the sample correlation coefficient?
$r_{xy} = (\text{sign of } b_1)\sqrt{r^2} = +\sqrt{0.8399} = 0.9165$
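
As a quick check of parts (b) and (c), the following sketch recomputes $r^2$ and the sample correlation coefficient from the sums of squares used above. The variable names are illustrative; the slope 0.318 is taken from the regression equation given in the exercise.

```python
import math

# Sums of squares from part (a) and the slope from the given equation.
SST = 1800.0
SSR = 1511.804
b1 = 0.318

# Coefficient of determination: share of total variation explained by the regression.
r_squared = SSR / SST

# The sample correlation coefficient takes the sign of the slope b1.
r_xy = math.copysign(math.sqrt(r_squared), b1)

print(f"r^2 = {r_squared:.4f}, r_xy = {r_xy:.4f}")
```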
Ex26/631/662: In exercise 18 the data on price ($) and the overall score for six stereo
headphones tested by Consumer Reports were as follows (Consumer Reports website,
March 5, 2012).
$\bar{x} = 100$, $\bar{y} = 55$

a. Does the t test indicate a significant relationship between price and the overall score?
What is your conclusion? Use α = .05.

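$b_1 = \frac{\sum (x_i - \bar{x})(y_i - \bar{y})}{\sum (x_i - \bar{x})^2} = \frac{4820}{14950} = 0.3224$

$b_0 = \bar{y} - b_1\bar{x} = 55 - 0.32241 \times 100 = 22.7592$
$\hat{y} = 22.7592 + 0.3224x$

$SSE = \sum (y_i - \hat{y}_i)^2 = 287.624$
$s = \sqrt{\frac{SSE}{n - 2}} = \sqrt{\frac{287.624}{4}} = 8.480$
$s_{b_1} = \frac{s}{\sqrt{\sum (x_i - \bar{x})^2}} = \frac{8.480}{\sqrt{14950}} = 0.06935$
Hypotheses: $H_0: \beta_1 = 0$
$H_a: \beta_1 \neq 0$
Test statistic: $t = \frac{b_1}{s_{b_1}} = \frac{0.32241}{0.06935} = 4.649$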

Rejection rule: reject $H_0$ if $|t| > t_{0.025;4}$. Since 4.649 > 2.77645, we reject $H_0$.
There is a significant relationship between price and the overall score.
b. Test for a significant relationship using the F test. What is your conclusion? Use
α = .05.

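$SSR = \sum (\hat{y}_i - \bar{y})^2 = 1511.804$
$MSR = \frac{SSR}{p} = \frac{1511.804}{1} = 1511.804$
$MSE = \frac{SSE}{n - p - 1} = \frac{287.624}{6 - 1 - 1} = 71.906$
Hypotheses: $H_0: \beta_1 = 0$
$H_a: \beta_1 \neq 0$
F test: $F = \frac{MSR}{MSE} = \frac{1511.804}{71.906} = 21.025$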
Rejection rule: reject $H_0$ if $F > F_{0.05;1;4}$. Since 21.025 > 7.7086, we reject $H_0$
and conclude that there is a significant relationship between the two variables.
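
The F test above can be sketched as follows. The sums of squares and the critical value 7.7086 are the figures quoted in this solution; the variable names are illustrative only.

```python
# Figures quoted in the solution for the headphone data.
SSR = 1511.804   # regression sum of squares
SSE = 287.624    # error sum of squares
n = 6            # number of observations
p = 1            # number of independent variables
F_CRIT = 7.7086  # F(0.05; 1, 4) as quoted in the rejection rule

MSR = SSR / p            # mean square due to regression
MSE = SSE / (n - p - 1)  # mean square error
F = MSR / MSE

reject_h0 = F > F_CRIT
print(f"MSR = {MSR:.3f}, MSE = {MSE:.3f}, F = {F:.3f}, reject H0: {reject_h0}")
```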
c. Show the ANOVA table for these data.

Source of Variation   Sum of Squares     Degrees of Freedom          Mean Square        F                p-value
Model                 SSR = 1512.37625   p = 1                       MSR = 1512.37625   F(1,4) = 21.03   Prob > F = 0.0101
Residual              SSE = 287.623746   n - p - 1 = 6 - 1 - 1 = 4   MSE = 71.9059365
Total                 SST = 1800         n - 1 = 6 - 1 = 5

Here p is the number of independent variables (numerator degrees of freedom) and n - p - 1 is
the denominator degrees of freedom.

Ex27/631/662: To identify high-paying jobs for people who do not like stress, the
following data were collected showing the average annual salary ($1000s) and the stress
tolerance for a variety of occupations (Business Insider, November 8, 2013).

The stress tolerance for each job is rated on a scale from 0 to 100, where a lower rating
indicates less stress.
a. Develop a scatter diagram for these data with average annual salary as the
independent variable. What does the scatter diagram indicate about the relationship
between the two variables?
b. Use these data to develop an estimated regression equation that can be used to
predict stress tolerance given the average annual salary.

StressTolerance = 84.25042 - 0.2107439*AverageAnnualSalary


c. At the .05 level of significance, does there appear to be a significant statistical
relationship between the two variables?
Hypotheses: $H_0: \beta_1 = 0$
$H_a: \beta_1 \neq 0$
Test statistic: $t = b_1 / s_{b_1} = -0.2107439 / 0.0609571 = -3.457$
Rejection rule: $t = -3.457 < -t_{0.025;8} = -2.306$, or equivalently p-value = 0.009 < α = 0.05.
Hence, we reject $H_0$. There is a significant relationship between the two variables
(StressTolerance and AverageAnnualSalary).
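
A minimal sketch of this two-tailed t test is shown below. The slope, its standard error, and the critical value 2.306 are taken from the solution above; the helper function `two_tailed_t_test` is a hypothetical name introduced only for this example.

```python
def two_tailed_t_test(b1: float, se_b1: float, t_crit: float) -> tuple[float, bool]:
    """Return the t statistic and whether H0: beta1 = 0 is rejected."""
    t_stat = b1 / se_b1
    return t_stat, abs(t_stat) > t_crit


# Values reported for the stress tolerance regression.
B1 = -0.2107439
SE_B1 = 0.0609571
T_CRIT = 2.306  # t(0.025; 8) as quoted in the rejection rule

t_stat, reject_h0 = two_tailed_t_test(B1, SE_B1, T_CRIT)
print(f"t = {t_stat:.3f}, reject H0: {reject_h0}")
```

Comparing the absolute value of t with the critical value handles negative slopes such as this one without a separate lower-tail rule.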

d. Would you feel comfortable in predicting the stress tolerance for a different
occupation given the average annual salary for the occupation? Explain.
Since R-square = 0.599 is greater than 0.55, the estimated regression equation provided a
good fit. Therefore, we should feel comfortable using the estimated regression equation to
estimate the stress tolerance given the average annual salary if the value of the average
annual salary is within the range of the current data.

e. Does the relationship between average annual salary and stress tolerance for these
data seem reasonable to you? Explain.
The relationship between average annual salary and stress tolerance in this dataset does not
seem reasonable or consistent. While one might expect higher salaries to correlate with lower
stress tolerance (due to higher responsibilities or pressure), the data shows a mixed pattern.
This could suggest that factors influencing stress tolerance are not solely related to salary,
and other variables (such as job nature, work environment, or individual differences) may
play a significant role.

Ex35/637/669: The following data are the monthly salaries y and the grade point
averages x for students who obtained a bachelor’s degree in business administration.
The estimated regression equation for these data is $\hat{y} = 2090.5 + 581.1x$ and MSE =
21,284.
Fitted values $\hat{y}_i$: 3601.36, 4066.24, 4182.46, 3950.02, 4124.35, 3775.69

a. Develop a point estimate of the starting salary for a student with a GPA of 3.0.
$\hat{y}^* = 2090.5 + 581.1 \times 3 = 3833.8$
b. Develop a 95% confidence interval for the mean starting salary for all students with a
3.0 GPA.
$s = \sqrt{MSE} = \sqrt{\frac{\sum (y_i - \hat{y}_i)^2}{n - 2}} = \sqrt{\frac{85135.1378}{4}} = 145.889$
$s_{\hat{y}^*} = s\sqrt{\frac{1}{n} + \frac{(x^* - \bar{x})^2}{\sum (x_i - \bar{x})^2}} = 145.889\sqrt{\frac{1}{6} + \frac{(3 - 3.2)^2}{0.74}} = 68.54$
$t_{\alpha/2}\, s_{\hat{y}^*} = 2.77645 \times 68.54 = 190.298$
$\hat{y}^* \pm t_{\alpha/2}\, s_{\hat{y}^*} = 3833.8 \pm 190.298$, that is, 3643.502 to 4024.098

c. Develop a 95% prediction interval for Ryan Dailey, a student with a GPA of 3.0.
$s_{pred} = s\sqrt{1 + \frac{1}{n} + \frac{(x^* - \bar{x})^2}{\sum (x_i - \bar{x})^2}} = 145.889\sqrt{1 + \frac{1}{6} + \frac{(3 - 3.2)^2}{0.74}} = 161.187$
$t_{\alpha/2}\, s_{pred} = 2.77645 \times 161.187 = 447.528$
$\hat{y}^* \pm t_{\alpha/2}\, s_{pred} = 3833.8 \pm 447.528$, that is, 3386.272 to 4281.328

d. Discuss the differences in your answers to parts (b) and (c).


As expected, the prediction interval is much wider than the confidence interval. This is due
to the fact that it is more difficult to predict the starting salary for one new student with a
GPA of 3.0 than it is to estimate the mean for all students with a GPA of 3.0.
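
The difference between the two intervals can be seen directly in a short sketch. It assumes the quantities used in parts (b) and (c): MSE = 21,284, n = 6, a mean GPA of 3.2, a sum of squared GPA deviations of 0.74, and the t value 2.77645. The function name `intervals` is illustrative.

```python
import math

# Quantities taken from the exercise and the solution above.
B0, B1 = 2090.5, 581.1
MSE = 21284.0
N = 6
X_BAR = 3.2
SXX = 0.74          # sum of (x_i - x_bar)^2
T_CRIT = 2.77645    # t(0.025; 4)


def intervals(x_star: float):
    """Return (point estimate, confidence interval, prediction interval)."""
    y_hat = B0 + B1 * x_star
    s = math.sqrt(MSE)
    core = 1 / N + (x_star - X_BAR) ** 2 / SXX
    se_mean = s * math.sqrt(core)       # standard error for the mean of y
    se_pred = s * math.sqrt(1 + core)   # adds the variance of one new observation
    ci = (y_hat - T_CRIT * se_mean, y_hat + T_CRIT * se_mean)
    pi = (y_hat - T_CRIT * se_pred, y_hat + T_CRIT * se_pred)
    return y_hat, ci, pi


y_hat, ci, pi = intervals(3.0)
print(f"point estimate: {y_hat:.1f}")
print(f"95% confidence interval: {ci[0]:.2f} to {ci[1]:.2f}")
print(f"95% prediction interval: {pi[0]:.2f} to {pi[1]:.2f}")
```

The only difference between the two standard errors is the extra 1 under the square root, which is why the prediction interval is always the wider of the two.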

Ex41/641/673: Following is a portion of the computer output for a regression analysis
relating y = maintenance expense (dollars per month) to x = usage (hours per week) of a
particular brand of computer terminal.
a. Write the estimated regression equation.

$\hat{y} = 6.1092 + 0.8951x$
MaintenanceExpense = 6.1092 + 0.8951*Usage

b. Use a t test to determine whether monthly maintenance expense is related to usage at
the .05 level of significance.
Hypotheses: $H_0: \beta_1 = 0$
$H_a: \beta_1 \neq 0$
Test statistic: $t = b_1 / s_{b_1} = 0.8951 / 0.149 = 6.0074$
Rejection rule: $t = 6.0074 > t_{0.025;8} = 2.306$. Hence, we reject $H_0$. There is a
significant relationship between monthly maintenance expense and usage.

c. Use the estimated regression equation to predict monthly maintenance expenses for
any terminal that is used 25 hours per week.
$\hat{y} = 6.1092 + 0.8951x$
$\hat{y}^* = 6.1092 + 0.8951 \times 25 = 28.4867$

Ex45/651/683. Given are data for two variables, x and y.


xi 6 11 15 18 20
yi 6 8 12 20 30
a. Develop an estimated regression equation for these data.
$\bar{x} = (6 + 11 + 15 + 18 + 20)/5 = 14$
$\bar{y} = (6 + 8 + 12 + 20 + 30)/5 = 15.2$

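$b_1 = \frac{\sum (x_i - \bar{x})(y_i - \bar{y})}{\sum (x_i - \bar{x})^2} = \frac{200}{126} = 1.5873$

$b_0 = \bar{y} - b_1\bar{x} = 15.2 - 1.5873 \times 14 = -7.0222$
The estimated regression equation:
$\hat{y} = b_0 + b_1 x = -7.0222 + 1.5873x$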
b. Compute the residuals.
$x_i$                  6        11        15        18        20
$y_i$                  6         8        12        20        30
$\hat{y}_i$            2.502    10.4381   16.7873   21.5492   24.7238
$y_i - \hat{y}_i$      3.498    -2.4381   -4.7873   -1.5492    5.2762
c. Develop a plot of the residuals against the independent variable x. Do the assumptions
about the error terms seem to be satisfied?

With only five observations, it is difficult to determine whether the assumptions are satisfied;
however, the plot does suggest curvature in the residuals, which would indicate that the error
term assumptions are not satisfied; the scatter diagram for these data also indicates that the
underlying relationship between x and y may be curvilinear.
d. Compute the standardized residuals.
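$s = \sqrt{\frac{SSE}{n - 2}} = 4.877$
$h_i = \frac{1}{n} + \frac{(x_i - \bar{x})^2}{\sum (x_i - \bar{x})^2}$ (predict leverage, hat)
$s_{y_i - \hat{y}_i} = s\sqrt{1 - h_i}$

$h_1 = \frac{1}{5} + \frac{(6 - 14)^2}{126} = 0.7079$, $s_{y_1 - \hat{y}_1} = 4.877\sqrt{1 - 0.7079} = 2.636$

$h_2 = \frac{1}{5} + \frac{(11 - 14)^2}{126} = 0.2714$, $s_{y_2 - \hat{y}_2} = 4.877\sqrt{1 - 0.2714} = 4.1628$

$h_3 = \frac{1}{5} + \frac{(15 - 14)^2}{126} = 0.2079$, $s_{y_3 - \hat{y}_3} = 4.877\sqrt{1 - 0.2079} = 4.3404$

$h_4 = \frac{1}{5} + \frac{(18 - 14)^2}{126} = 0.3270$, $s_{y_4 - \hat{y}_4} = 4.877\sqrt{1 - 0.3270} = 4.001$

$h_5 = \frac{1}{5} + \frac{(20 - 14)^2}{126} = 0.4857$, $s_{y_5 - \hat{y}_5} = 4.877\sqrt{1 - 0.4857} = 3.4975$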

The standardized residuals:

$y_i$                                             6        8         12        20        30
$y_i - \hat{y}_i$                                 3.498   -2.4381   -4.7873   -1.5492    5.2762
$s_{y_i - \hat{y}_i}$                             2.636    4.1628    4.3404    4.001     3.4975
$(y_i - \hat{y}_i) / s_{y_i - \hat{y}_i}$         1.327   -0.586    -1.103    -0.3872    1.5086

e. Develop a plot of the standardized residuals against $\hat{y}$. What conclusions can you
draw from this plot?
The plot of the standardized residuals against $\hat{y}$ has the same shape as the original
residual plot; as stated in part (c), the curvature observed indicates that the assumptions
regarding the error term may not be satisfied.
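
The whole of this exercise (least squares fit, residuals, leverage, and standardized residuals) can be reproduced with a short standard-library sketch. The data are the five (x, y) pairs given in the exercise; the variable names are illustrative.

```python
import math

# Data from the exercise.
x = [6, 11, 15, 18, 20]
y = [6, 8, 12, 20, 30]
n = len(x)

x_bar = sum(x) / n
y_bar = sum(y) / n
sxx = sum((xi - x_bar) ** 2 for xi in x)
sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

# Least squares estimates.
b1 = sxy / sxx
b0 = y_bar - b1 * x_bar

# Fitted values and residuals.
fitted = [b0 + b1 * xi for xi in x]
residuals = [yi - fi for yi, fi in zip(y, fitted)]

# Standard error of the estimate, with n - 2 degrees of freedom.
s = math.sqrt(sum(e ** 2 for e in residuals) / (n - 2))

# Leverage of each observation and standard deviation of each residual.
leverage = [1 / n + (xi - x_bar) ** 2 / sxx for xi in x]
resid_sd = [s * math.sqrt(1 - h) for h in leverage]
std_resid = [e / sd for e, sd in zip(residuals, resid_sd)]

for xi, e, h, z in zip(x, residuals, leverage, std_resid):
    print(f"x = {xi:2d}  residual = {e:8.4f}  leverage = {h:.4f}  standardized = {z:7.4f}")
```

The positive, negative, negative, negative, positive pattern of the residuals across increasing x is the curvature discussed in parts (c) and (e).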

Ex54/659/691: The following data show the annual revenue ($ millions) and the
estimated team value ($ millions) for the 30 Major League Baseball teams (Forbes
website, January 16, 2014).
a. Develop a scatter diagram with Revenue on the horizontal axis and Value on the
vertical axis. Looking at the scatter diagram, does it appear that there are any outliers
and/ or influential observations in the data?

The scatter diagram does indicate potential outliers and/or influential observations. For
example, the New York Yankees have both the highest revenue (471) and the highest value
(2300), and appear to be an influential observation. The Los Angeles Dodgers have the second
highest value (1615) and appear to be an outlier.
b. Develop the estimated regression equation that can be used to predict team value
given the annual revenue.
$\hat{y} = -601.4814 + 5.927063x$
Value = -601.4814 + 5.927063*Revenue
c. Use residual analysis to determine whether any outliers and/or influential
observations are present. Briefly summarize your findings and conclusions.

The residual analysis for the value of the New York Yankees is:
$h_i = \frac{1}{n} + \frac{(x_i - \bar{x})^2}{\sum (x_i - \bar{x})^2}$, with $n = 30$
$s_{y_i - \hat{y}_i} = s\sqrt{1 - h_i} = 101.2345$

Residual: $y_i - \hat{y}_i = 2300 - (-601.4814 + 5.927063 \times 471) = 109.834727$
Standardized residual: $\frac{y_i - \hat{y}_i}{s_{y_i - \hat{y}_i}} = \frac{109.834727}{101.2345} = 1.085$, so this observation is not an outlier.

The residual analysis for the value of the Los Angeles Dodgers is:
$h_i = \frac{1}{n} + \frac{(x_i - \bar{x})^2}{\sum (x_i - \bar{x})^2}$, with $n = 30$
$s_{y_i - \hat{y}_i} = s\sqrt{1 - h_i} = 162.6015245$

Residual: $y_i - \hat{y}_i = 1615 - (-601.4814 + 5.927063 \times 245) = 764.350965$
Standardized residual: $\frac{y_i - \hat{y}_i}{s_{y_i - \hat{y}_i}} = \frac{764.350965}{162.6015245} = 4.700761368$

Since 4.7 is greater than 2, the value of the Los Angeles Dodgers is considered as an outlier.
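
The outlier check for the two teams can be sketched as below. The regression coefficients, revenues, values, and residual standard deviations are the figures quoted in this solution; the cutoff of 2 is the rule applied above, and the dictionary layout and function name are assumptions made for the example.

```python
# Estimated regression equation from part (b).
B0, B1 = -601.4814, 5.927063

# (revenue, value, standard deviation of the residual) as quoted in the solution.
teams = {
    "New York Yankees": (471, 2300, 101.2345),
    "Los Angeles Dodgers": (245, 1615, 162.6015245),
}


def standardized_residual(revenue: float, value: float, resid_sd: float) -> float:
    """Residual divided by its estimated standard deviation."""
    return (value - (B0 + B1 * revenue)) / resid_sd


z = {name: standardized_residual(*data) for name, data in teams.items()}

# Flag observations whose standardized residual exceeds 2 in absolute value.
outliers = [name for name, value in z.items() if abs(value) > 2]

for name, value in z.items():
    print(f"{name}: standardized residual = {value:.3f}")
print("flagged as outliers:", outliers)
```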
Ex55/664/696: Does a high value of r2 imply that two variables are causally related?
Explain.
No, a high r2 value does not necessarily imply that two variables are causally related. It
indicates correlation, not causation. Two variables may move together due to coincidence, a
common cause, or an indirect relationship.

Ex56/664/696: In your own words, explain the difference between an interval estimate
of the mean value of y for a given x and an interval estimate for an individual value of y
for a given x

An interval estimate of the mean value of y for a given x provides a range within which we
expect the average of all possible y values (for that specific x) to fall. This type of estimate
reflects the uncertainty about the population mean and is typically narrower than an interval
for individual values because it accounts for the variability of the sample mean rather than the
variability of individual observations.
On the other hand, an interval estimate for an individual value of y for a given
x offers a range within which we expect a single observation of y (for that specific
x) to fall. This interval is wider than the mean interval because it incorporates both the
uncertainty about the mean and the inherent variability of individual data points around that
mean.
Ex57/664/696: What is the purpose of testing whether b1 = 0? If we reject b1 = 0, does it
imply a good fit?
Testing whether β₁ = 0 helps determine if the independent variable has a statistically
significant relationship with the dependent variable. Rejecting β₁ = 0 implies a statistically
significant relationship, but it doesn't guarantee a good fit. A good fit also requires a high R-
squared value and other considerations.
Ex58/664/696: The Dow Jones Industrial Average (DJIA) and the Standard & Poor’s
500 (S&P 500) indexes are used as measures of overall movement in the stock market.
The DJIA is based on the price movements of 30 large companies; the S&P 500 is an
index composed of 500 stocks. Some say the S&P 500 is a better measure of stock
market performance because it is broader based. The closing price for the DJIA and the
S&P 500 for 15 weeks, beginning with January 6, 2012, follow (Barron’s website, April
17, 2012).

a. Develop a scatter diagram with DJIA as the independent variable.


b. Develop the estimated regression equation.

SP = -669.0212 + 0.1573*DJIA
c. Test for a significant relationship. Use α = .05.
Hypotheses: $H_0: \beta_1 = 0$
$H_a: \beta_1 \neq 0$

Since the p-value of 0.000 is less than the significance level α = 0.05, we reject $H_0$. This
indicates that there is a significant relationship between the two variables (DJIA and SP).
d. Did the estimated regression equation provide a good fit? Explain.
$R^2 = 0.9486$, which means that approximately 94.86% of the variability in the dependent
variable (SP) can be explained by its linear relationship with DJIA. The remaining 5.14% is
due to factors that are not included in this model. Hence, the estimated regression equation
is a good fit.
e. Suppose that the closing price for the DJIA is 13,500. Predict the closing price for the
S&P 500.
SP = -669.0212 + 0.1573*13500 = 1454.5288
f. Should we be concerned that the DJIA value of 13,500 used to predict the S&P 500
value in part (e) is beyond the range of the data used to develop the estimated regression
equation?
Yes, we should be cautious because 13,500 is beyond the range of the DJIA values in the
dataset (the maximum DJIA value in the provided data is 13,233). Predicting outside the
range of the data (extrapolation) can lead to inaccurate results since the relationship between
DJIA and S&P 500 may not hold beyond the observed data range.
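
The prediction and the extrapolation concern can be sketched together. The coefficients come from part (b) and the maximum DJIA value of 13,233 is the one stated in part (f); only this upper bound is checked, because the minimum of the sample is not stated here. The names are illustrative.

```python
# Estimated regression equation from part (b).
B0, B1 = -669.0212, 0.1573

# Largest DJIA value in the sample, as stated in part (f).
MAX_DJIA_IN_SAMPLE = 13233


def predict_sp500(djia: float) -> float:
    """Predicted S&P 500 close for a given DJIA close."""
    return B0 + B1 * djia


djia_new = 13500
prediction = predict_sp500(djia_new)
is_extrapolation = djia_new > MAX_DJIA_IN_SAMPLE

print(f"predicted S&P 500 = {prediction:.2f}")
if is_extrapolation:
    print("Warning: the DJIA value is above the observed range, so this is an extrapolation.")
```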
Ex59/665/697: Is the number of square feet of living space a good predictor of a house’s
selling price? The following data show the square footage and selling price for fifteen
houses in Winston-Salem, North Carolina (Zillow.com, April 5, 2015).

a. Develop a scatter diagram with square feet of living space as the independent variable
and selling price as the dependent variable. What does the scatter diagram indicate
about the relationship between the size of a house and the selling price?

b. Develop the estimated regression equation that could be used to predict the selling
price given the number of square feet of living space.
SellingPrice = -59.01557 + 115.0915*Size
c. At the .05 level, is there a significant relationship between the two variables?
Hypotheses: $H_0: \beta_1 = 0$
$H_a: \beta_1 \neq 0$

Since the p-value of 0.000 is less than the significance level α = 0.05, we reject $H_0$. This
indicates that there is a significant relationship between the two variables (selling
price and square feet of living space).
d. Use the estimated regression equation to predict the selling price of a 2000 square
foot house in Winston-Salem, North Carolina.
SellingPrice = -59.01557 + 115.0915*2000 = 230123.98
e. Do you believe the estimated regression equation developed in part (b) will provide
a good prediction of selling price of a particular house in Winston-Salem, North
Carolina? Explain.
$R^2 = 0.8896$, which means that approximately 88.96% of the variability in the dependent
variable (SellingPrice) can be explained by its linear relationship with Size. The remaining
11.04% is due to factors that are not included in this model. Hence, the estimated regression
equation is a good fit.
f. Would you be comfortable using the estimated regression equation developed in part
(b) to predict the selling price of a particular house in Seattle, Washington? Why or
why not?
I would not be comfortable using the estimated regression equation developed for Winston-
Salem to predict the selling price of a house in Seattle without further adjustments or
additional data.
Ex60/665/697: One of the biggest changes in higher education in recent years has been
the growth of online universities. The Online Education Database is an independent
organization whose mission is to build a comprehensive list of the top accredited online
colleges. The following table shows the retention rate (%) and the graduation rate (%)
for 29 online colleges.

a. Develop a scatter diagram with retention rate as the independent variable. What does
the scatter diagram indicate about the relationship between the two variables?

b. Develop the estimated regression equation.


$\hat{y} = 25.4229 + 0.284526x$
c. Test for a significant relationship. Use α = .05.

Hypotheses: $H_0: \beta_1 = 0$
$H_a: \beta_1 \neq 0$

Since the p-value of 0.000 is less than the significance level α = 0.05, we reject $H_0$. This
indicates that there is a significant relationship between the two variables (Retention Rate
and Graduation Rate).
d. Did the estimated regression equation provide a good fit?
$R^2 = 0.4492$, so the regression relationship is moderate. Approximately 44.92% of the
variability in the graduation rate can be explained by its linear relationship with the
retention rate; the remaining 55.08% is due to factors that are not included in this model.
Hence, the estimated regression equation is not a good fit.
e. Suppose you were the president of South University. After reviewing the results,
would you have any concerns about the performance of your university as compared to
other online universities?
Since about 55.08% of the variability remains unexplained, the R-squared value may raise
concerns about the effectiveness of the university’s strategies for improving graduation rates.
f. Suppose you were the president of the University of Phoenix. After reviewing the
results, would you have any concerns about the performance of your university as
compared to other online universities?
Since about 55.08% of the variability remains unexplained, the R-squared value may raise
concerns about the effectiveness of the university’s strategies for improving graduation rates.

Ex61/666/698: Jensen Tire & Auto is in the process of deciding whether to purchase a
maintenance contract for its new computer wheel alignment and balancing machine.
Managers feel that maintenance expense should be related to usage, and they collected
the following information on weekly usage (hours) and annual maintenance expense (in
hundreds of dollars).

a. Develop the estimated regression equation that relates annual maintenance expense
to weekly usage.

AnnualMaintenanceExpense = 10.52796 + 0.9534404*Usage
b. Test the significance of the relationship in part (a) at a .05 level of significance.
Hypotheses: $H_0: \beta_1 = 0$
$H_a: \beta_1 \neq 0$
Since p-value = 0.000 < α = 0.05, we reject $H_0$.
Hence, we conclude that there is a significant relationship between the two variables (annual
maintenance expense and weekly usage).
c. Jensen expects to use the new machine 30 hours per week. Develop a 95% prediction
interval for the company’s annual maintenance expense.
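$\hat{y}^* = 10.52796 + 0.9534404 \times 30 = 39.131172$
$s_{pred} = s\sqrt{1 + \frac{1}{n} + \frac{(x^* - \bar{x})^2}{\sum (x_i - \bar{x})^2}} = 4.5041$
$\hat{y}^* \pm t_{\alpha/2}\, s_{pred} = 39.131172 \pm 2.306 \times 4.5041$, that is, 28.745 to 49.518 (hundreds of dollars)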
d. If the maintenance contract costs $3000 per year, would you recommend purchasing
it? Why or why not?
Because annual maintenance expense is measured in hundreds of dollars, the prediction
interval for 30 hours of weekly usage, 28.745 to 49.518, corresponds to about $2,874 to
$4,952, with a point prediction of about $3,913. The $3,000 annual contract cost lies near the
lower end of this interval and below the point prediction, so the contract is likely to cost
less than the maintenance expense the company would otherwise incur.
Yes, I would recommend purchasing the maintenance contract, as it is likely more
cost-effective than paying the predicted expenses.
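
Because the expense variable is recorded in hundreds of dollars, it helps to convert the interval to dollars before comparing it with the contract price. The sketch below does this with the coefficients, t value, and $s_{pred}$ quoted above; the variable names are illustrative.

```python
# Estimated regression equation (expense in hundreds of dollars).
B0, B1 = 10.52796, 0.9534404
T_CRIT = 2.306    # t(0.025; 8) used in the solution
S_PRED = 4.5041   # standard error of prediction quoted in the solution
DOLLARS_PER_UNIT = 100
CONTRACT_COST = 3000  # dollars per year

y_hat = B0 + B1 * 30  # predicted expense at 30 hours per week
margin = T_CRIT * S_PRED

point_dollars = y_hat * DOLLARS_PER_UNIT
low = (y_hat - margin) * DOLLARS_PER_UNIT
high = (y_hat + margin) * DOLLARS_PER_UNIT

within_interval = low <= CONTRACT_COST <= high
below_point_prediction = CONTRACT_COST < point_dollars

print(f"point prediction: ${point_dollars:,.2f}")
print(f"95% prediction interval: ${low:,.2f} to ${high:,.2f}")
print(f"contract within interval: {within_interval}, below point prediction: {below_point_prediction}")
```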
Ex62/667/699: In a manufacturing process the assembly line speed (feet per minute) was
thought to affect the number of defective parts found during the inspection process. To
test this theory, managers devised a situation in which the same batch of parts was
inspected visually at a variety of line speeds. They collected the following data.

a. Develop the estimated regression equation that relates line speed to the number of
defective parts found.

NumberofDefectivePartsFound = 22.17391 - 0.1478261*LineSpeed
b. At a .05 level of significance, determine whether line speed and number of defective
parts found are related.
Hypotheses: $H_0: \beta_1 = 0$
$H_a: \beta_1 \neq 0$
Since p-value = 0.028 < α = 0.05, we reject $H_0$.
Hence, we conclude that there is a significant relationship between the two variables (line
speed and number of defective parts found).
c. Did the estimated regression equation provide a good fit to the data?
$R^2 = 0.6739$, which means that approximately 67.39% of the variability in the dependent
variable (number of defective parts found) can be explained by its linear relationship with
line speed. The remaining 32.61% is due to factors that are not included in this model. Hence,
the estimated regression equation is not a good fit.
d. Develop a 95% confidence interval to predict the mean number of defective parts for
a line speed of 50 feet per minute.
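$\hat{y}^* = 22.17391 - 0.1478261 \times 50 = 14.782605$
$s_{\hat{y}^*} = s\sqrt{\frac{1}{n} + \frac{(x^* - \bar{x})^2}{\sum (x_i - \bar{x})^2}} = 0.8963$
$\hat{y}^* \pm t_{\alpha/2}\, s_{\hat{y}^*} = 14.782605 \pm 2.77645 \times 0.8963$, that is, 12.294 to 17.271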

Ex63/667/699: A sociologist was hired by a large city hospital to investigate the
relationship between the number of unauthorized days that employees are absent per
year and the distance (miles) between home and work for the employees. A sample of 10
employees was chosen, and the following data were collected.

a. Develop a scatter diagram for these data. Does a linear relationship appear
reasonable? Explain.

The scatter diagram indicates a negative relationship between the two variables.


b. Develop the least squares estimated regression equation.
$\hat{y} = 8.097826 - 0.3442029x$
NumberofDaysAbsent = 8.097826 - 0.3442029*DistancetoWork
c. Is there a significant relationship between the two variables? Use α = .05.
Hypotheses: $H_0: \beta_1 = 0$
$H_a: \beta_1 \neq 0$
Test statistic: $t = \frac{b_1}{s_{b_1}} = \frac{-0.3442029}{0.0776137} = -4.435$
Rejection rule: $|t| > t_{0.025;8}$ (4.435 > 2.306), so we reject $H_0$.
Hence, we conclude that there is a significant relationship between the two
variables.
d. Did the estimated regression equation provide a good fit? Explain.
$R^2 = 0.7109$, which means that approximately 71.09% of the variability in the dependent
variable (Daysofabsent) can be explained by its linear relationship with Distancetowork. The
remaining 28.91% is due to factors that are not included in this model. Hence, the estimated
regression equation is a good fit.

e. Use the estimated regression equation developed in part (b) to develop a 95%
confidence interval for the expected number of days absent for employees living 5 miles
from the company.
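$s = \sqrt{\frac{SSE}{n - 2}}$
$\hat{y}^* = 8.097826 - 0.3442029 \times 5 = 6.377$
$s_{\hat{y}^*} = s\sqrt{\frac{1}{n} + \frac{(x^* - \bar{x})^2}{\sum (x_i - \bar{x})^2}} = 0.5125$
$\hat{y}^* \pm t_{\alpha/2}\, s_{\hat{y}^*} = 6.377 \pm 2.306 \times 0.5125$, that is, 5.195 to 7.5588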
Ex64/668/700: The regional transit authority for a major metropolitan area wants to
determine whether there is any relationship between the age of a bus and the annual
maintenance cost. A sample of 10 buses resulted in the following data.

a. Develop the least squares estimated regression equation.


MaintenanceCost = 220 + 131.6667*AgeofBus
$\hat{y} = 220 + 131.6667x$
b. Test to see whether the two variables are significantly related with α = .05.
Hypotheses: $H_0: \beta_1 = 0$
$H_a: \beta_1 \neq 0$
Test statistic: $t = \frac{b_1}{s_{b_1}} = \frac{131.6667}{17.79513} = 7.399$
Rejection rule: $t > t_{0.025;8}$ (7.399 > 2.306), so we reject $H_0$.
Hence, we conclude that there is a significant relationship between the two
variables (Maintenance Cost and Age of Bus).

c. Did the least squares line provide a good fit to the observed data? Explain.
$R^2 = 0.8725$, which means that approximately 87.25% of the variability in the dependent
variable (Maintenance Cost) can be explained by its linear relationship with Age of Bus. The
remaining 12.75% is due to factors that are not included in this model. Hence, the estimated
regression equation is a good fit.

d. Develop a 95% prediction interval for the maintenance cost for a specific bus that is
4 years old.
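$\hat{y}^* = 220 + 131.6667 \times 4 = 746.667$
$s_{pred} = s\sqrt{1 + \frac{1}{n} + \frac{(x^* - \bar{x})^2}{\sum (x_i - \bar{x})^2}} = 81.158$
$\hat{y}^* \pm t_{\alpha/2}\, s_{pred} = 746.667 \pm 2.306 \times 81.158$, that is, 559.517 to 933.817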

Ex65/668/700: A marketing professor at Givens College is interested in the relationship
between hours spent studying and total points earned in a course. Data collected on 10
students who took the course last quarter follow.

a. Develop an estimated regression equation showing how total points earned is related
to hours spent studying.

Points = 5.847009 + 0.8295394*Hours
b. Test the significance of the model with α = .05.
Hypotheses: $H_0: \beta_1 = 0$
$H_a: \beta_1 \neq 0$
Since p-value = 0.000 < α = 0.05, we reject $H_0$.
Hence, we conclude that there is a significant relationship between the two variables (Points
and Hours).
c. Predict the total points earned by Mark Sweeney. He spent 95 hours studying.
Points = 5.847009 + 0.8295394*95 = 84.6533
Therefore, Mark Sweeney is predicted to earn 84.6533 points.
d. Develop a 95% prediction interval for the total points earned by Mark Sweeney.
$\hat{y}^* = 84.6533$
$s_{pred} = s\sqrt{1 + \frac{1}{n} + \frac{(x^* - \bar{x})^2}{\sum (x_i - \bar{x})^2}} = 0.1218$
$\hat{y}^* \pm t_{\alpha/2}\, s_{pred} = 84.6533 \pm 2.306 \times 0.1218$, that is, 84.3724 to 84.9342

Ex66/668/700: Market betas for individual stocks are determined by simple linear
regression. For each stock, the dependent variable is its quarterly percentage return
(capital appreciation plus dividends) minus the percentage return that could be
obtained from a risk-free investment (the Treasury Bill rate is used as the risk-free
rate). The independent variable is the quarterly percentage return (capital appreciation
plus dividends) for the stock market (S&P 500) minus the percentage return from a
risk-free investment. An estimated regression equation is developed with quarterly
data; the market beta for the stock is the slope of the estimated regression equation (b1).
The value of the market beta is often interpreted as a measure of the risk associated
with the stock. Market betas greater than 1 indicate that the stock is more volatile than
the market average; market betas less than 1 indicate that the stock is less volatile than
the market average. Suppose that the following figures are the differences between the
percentage return and the risk-free return for 10 quarters for the S&P 500 and Horizon
Technology.

a. Develop an estimated regression equation that can be used to predict the market beta
for Horizon Technology. What is Horizon Technology’s market beta?

HorizonTechnology = 0.275 + 0.95*(S&P500)
The market beta for Horizon Technology is the slope of the estimated regression equation,
$b_1 = 0.95$.
b. Test for a significant relationship at the .05 level of significance.
Hypotheses: $H_0: \beta_1 = 0$
$H_a: \beta_1 \neq 0$
Since p-value = 0.029 < α = 0.05, we reject $H_0$.
Hence, we conclude that there is a significant relationship between the two variables
(HorizonTechnology and S&P500).
c. Did the estimated regression equation provide a good fit? Explain.
$R^2 = 0.4695$, which means that approximately 46.95% of the variability in the dependent
variable (HorizonTechnology) can be explained by its linear relationship with S&P500. The
remaining 53.05% is due to factors that are not included in this model. Hence, the estimated
regression equation is not a good fit.
d. Use the market betas of Xerox and Horizon Technology to compare the risk associated
with the two stocks.
Xerox has a higher risk.
Ex67/669/701: The Transactional Records Access Clearinghouse at Syracuse University
reported data showing the odds of an Internal Revenue Service audit. The following
table shows the average adjusted gross income reported and the percent of the returns
that were audited for 20 selected IRS districts.

a. Develop the estimated regression equation that could be used to predict the percent
audited given the average adjusted gross income reported.

Audits = -0.4709536 + 0.0000387*GrossIncome
b. At the .05 level of significance, determine whether the adjusted gross income and the
percent audited are related.
Hypotheses: $H_0: \beta_1 = 0$
$H_a: \beta_1 \neq 0$
Since p-value = 0.038 < α = 0.05, we reject $H_0$.
Hence, we conclude that there is a significant relationship between the two
variables (GrossIncome and Audits).
c. Did the estimated regression equation provide a good fit? Explain.
$R^2 = 0.2171$, which means that approximately 21.71% of the variability in the dependent
variable (Audits) can be explained by its linear relationship with GrossIncome. The remaining
78.29% is due to factors that are not included in this model. Hence, the estimated regression
equation is not a good fit.
d. Use the estimated regression equation developed in part (a) to calculate a 95%
confidence interval for the expected percent audited for districts with an average
adjusted gross income of $35,000.
Audits = −0.4709536 + 0.0000387∗35000 = 0.8835464
s_ŷ* = s∗√(1/n + (x* − x̄)² / Σ(xi − x̄)²) = 0.185
ŷ* ± t α/2 ∗ s_ŷ* = 0.8835464 ± 2.101∗0.185 = 0.8835 ± 0.3887, i.e., from 0.4949 to 1.2722
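The interval arithmetic in part (d) can be checked with a short sketch. It assumes the values quoted in the solution (t = 2.101 for 18 degrees of freedom and a standard error of 0.185 for the estimated mean); the function name is illustrative only.

```python
def confidence_interval(point_estimate, t_critical, std_error):
    """Return (lower, upper) for point_estimate ± t_critical * std_error."""
    margin = t_critical * std_error
    return point_estimate - margin, point_estimate + margin

# Point estimate from the estimated regression equation in part (a).
y_hat = -0.4709536 + 0.0000387 * 35000

# 2.101 and 0.185 are taken from the solution text, not recomputed here.
lower, upper = confidence_interval(y_hat, t_critical=2.101, std_error=0.185)
print(round(y_hat, 4), round(lower, 4), round(upper, 4))
```

The margin of error is the product of the critical value and the standard error, so the interval is symmetric around the point estimate.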

Ex68/670/702: The Toyota Camry is one of the best-selling cars in North America. The
cost of a previously owned Camry depends upon many factors, including the model
year, mileage, and condition. To investigate the relationship between the car’s mileage
and the sales price for a 2007 model year Camry, the following data show the mileage
and sale price for 19 sales (Pricehub website, February 24, 2012).
a. Develop a scatter diagram with the car mileage on the horizontal axis and the price
on the vertical axis.

b. what does the scatter diagram developed in part (a) indicate about the relationship
between the two variables?
The scatter diagram developed in part (a) indicates a negative relationship between the two
variables: price tends to fall as mileage rises.
c. Develop the estimated regression equation that could be used to predict the price
($1000s) given the miles (1000s).
Price=16.46976−0.0587739∗Miles
^y =16.46976−0.0587739 x

d. Test for a significant relationship at the .05 level of significance.


Hypothesis: H 0 : β 1=0
H a : β1≠ 0
Since p−value=0.0000<α =0.05
=> Reject Ho
Hence, we conclude that there is a significant relationship between the two
variables ( Miles and Price )

e. Did the estimated regression equation provide a good fit? Explain.


Since R2 = 0.5387, the regression relationship is a reasonably good fit. Approximately
53.87% of the variability in the dependent variable (Price) can be explained by its linear
relationship with Miles, and about 46.13% is due to other factors not included in this
model. Hence, the estimated regression equation is a reasonably good fit.
f. Provide an interpretation for the slope of the estimated regression equation.
From the Stata result, we can see that the slope of the estimated regression equation is equal
to −0.0587739. Because Miles is measured in 1000s and Price in $1000s, this means that each
additional 1000 miles is associated with a decrease of 0.0587739 in Price, about $58.77.
g. Suppose that you are considering purchasing a previously owned 2007 Camry that
has been driven 60,000 miles. Using the estimated regression equation developed in
part (c), predict the price for this car. Is this the price you would offer the seller?
^y =16.46976−0.0587739∗60=12.943326(1000 s)
This result indicates that, based on the negative relationship between mileage and price
shown by the regression, a 2007 Camry with 60,000 miles is expected to sell for around
$12,943. However, whether this is the price you would actually offer the seller may depend
on additional factors not accounted for in the model, such as the car's condition, location, and
market demand at the time of purchase.

CHAP 15:

Ex2/689/721: Consider the following data for a dependent variable y and two
independent variables, x1 and x2.

a. Develop an estimated regression equation relating y to x1. Predict y if x1 = 45.


^y =45.05937+1.943571 x 1
Predict y:
^y =45.05937+1.943571∗45=132.52
b. Develop an estimated regression equation relating y to x2. Predict y if x2 = 15.

^y =85.2171+ 4.321488 x 2
Predict y: ^y =85.2171+ 4.321488∗15=150.03942
c. Develop an estimated regression equation relating y to x1 and x2. Predict y if x1 = 45
and x2 = 15.

^y =−18.36827+2.010185 x 1+ 4.737812 x 2
Predict y: ^y =−18.36827+2.010185∗45+ 4.737812∗15=143.157235
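The three predictions in this exercise all follow the same pattern, intercept plus the sum of coefficient times value. A minimal sketch, using the coefficients reported above (the helper name `predict` is an assumption for illustration):

```python
def predict(intercept, coefficients, values):
    """Evaluate an estimated regression equation at the given x values."""
    return intercept + sum(b * x for b, x in zip(coefficients, values))

# Part (a): y on x1 only, evaluated at x1 = 45.
y_a = predict(45.05937, [1.943571], [45])
# Part (b): y on x2 only, evaluated at x2 = 15.
y_b = predict(85.2171, [4.321488], [15])
# Part (c): y on x1 and x2, evaluated at x1 = 45 and x2 = 15.
y_c = predict(-18.36827, [2.010185, 4.737812], [45, 15])

print(round(y_a, 2), round(y_b, 2), round(y_c, 2))
```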
Ex3/689/721: In a regression analysis involving 30 observations, the following estimated
regression equation was obtained.
^y =17.6 +3.8 x1−2.3 x 2 +7.6 x 3+ 2.7 x 4
a. Interpret b1, b2, b3, and b4 in this estimated regression equation.
The coefficient b1 = 3.8. This means that for each one-unit increase in x1, y is expected to
increase by 3.8, holding x2, x3, and x4 constant.
The coefficient b2 = −2.3. This means that for each one-unit increase in x2, y is expected to
decrease by 2.3, holding x1, x3, and x4 constant.
The coefficient b3 = 7.6. This means that for each one-unit increase in x3, y is expected to
increase by 7.6, holding x1, x2, and x4 constant.
The coefficient b4 = 2.7. This means that for each one-unit increase in x4, y is expected to
increase by 2.7, holding x1, x2, and x3 constant.

b. Predict y when x1 = 10, x2 = 5, x3 = 1, and x4 = 2.


^y =17.6 +3.8∗10−2.3∗5+7.6∗1+2.7∗2=57.1

Ex5/690/722: The owner of Showtime Movie Theaters, Inc., would like to predict weekly
gross revenue as a function of advertising expenditures. historical data for a sample of
eight weeks follow.

a. Develop an estimated regression equation with the amount of television advertising
as the independent variable.
WeeklyGrossRevenue=88.63768+1.603865∗TelevisionAdvertising
b. Develop an estimated regression equation with both television advertising and
newspaper advertising as the independent variables.

WeeklyGrossRevenue = 83.23009 + 1.300989∗NewspaperAdvertising + 2.290184∗TelevisionAdvertising

c. Is the estimated regression equation coefficient for television advertising expenditures
the same in part (a) and in part (b)? Interpret the coefficient in each case.
No. The coefficient is 1.603865 in part (a) and 2.290184 in part (b).
In part (a), b1 = 1.603865. This means that for each additional unit increase in television
advertising, weekly gross revenue is expected to increase by 1.603865 units.
In part (b), b1 = 2.290184. This means that for each additional unit increase in television
advertising, weekly gross revenue is expected to increase by 2.290184 units, holding
newspaper advertising constant.
d. Predict weekly gross revenue for a week when $3500 is spent on television advertising
and $1800 is spent on newspaper advertising.
WeeklyGrossRevenue = 83.23009 + 1.300989∗1.8 + 2.290184∗3.5 = 93.5875142 (advertising entered in $1000s)
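The step that is easiest to get wrong in part (d) is the unit conversion: $3500 and $1800 enter the equation as 3.5 and 1.8. The sketch below makes that explicit. It assumes that both advertising variables and weekly gross revenue are measured in $1000s; the function name is hypothetical.

```python
def predict_weekly_gross_revenue(tv_dollars, newspaper_dollars):
    """Predicted weekly gross revenue in dollars.

    Assumes the fitted model from part (b) works in $1000s on both sides.
    """
    tv = tv_dollars / 1000                # dollars -> $1000s
    newspaper = newspaper_dollars / 1000  # dollars -> $1000s
    revenue_thousands = 83.23009 + 1.300989 * newspaper + 2.290184 * tv
    return revenue_thousands * 1000       # $1000s -> dollars

prediction = predict_weekly_gross_revenue(tv_dollars=3500, newspaper_dollars=1800)
print(f"${prediction:,.2f}")
```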
Ex6/690/722: The National Football League (NFL) records a variety of performance
data for individuals and teams. To investigate the importance of passing on the
percentage of games won by a team, the following data show the conference (Conf),
average number of passing yards per attempt (Yds/Att), the number of interceptions
thrown per attempt (Int/Att), and the percentage of games won (win%) for a random
sample of 16 NFL teams for one full season.

a. Develop the estimated regression equation that could be used to predict the
percentage of games won given the average number of passing yards per attempt.

Win %=−58.77031+16.39063∗Yds/ Att


b. Develop the estimated regression equation that could be used to predict the
percentage of games won given the number of interceptions thrown per attempt.

Win % = 97.53826 − 1600.491∗Int/Att


c. Develop the estimated regression equation that could be used to predict the
percentage of games won given the average number of passing yards per attempt and
the number of interceptions thrown per attempt.

Win % = −5.763278 + 12.94936∗Yds/Att − 1083.788∗Int/Att


d. The average number of passing yards per attempt for the Kansas City Chiefs was
6.2 and the number of interceptions thrown per attempt was .036. Use the estimated
regression equation developed in part (c) to predict the percentage of games won by
the Kansas City Chiefs. (Note: For this season the Kansas City Chiefs’ record was 7
wins and 9 losses.) Compare your prediction to the actual percentage of games won
by the Kansas City Chiefs.
Win %=−5.763278+12.94936∗6.2−1083.788∗0.036=35.506386
Actual % = 7/(7 + 9)∗100% = 43.75%
The predicted win percentage (35.51%) is lower than the actual win percentage (43.75%).
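The comparison in part (d) amounts to computing a residual, actual minus predicted. A small sketch using the coefficients from part (c); the variable names are illustrative:

```python
# Inputs for the Kansas City Chiefs as given in the exercise.
yds_per_att = 6.2
int_per_att = 0.036
wins, losses = 7, 9

# Predicted win percentage from the two-variable equation in part (c).
predicted = -5.763278 + 12.94936 * yds_per_att - 1083.788 * int_per_att
# Actual win percentage from the season record.
actual = wins / (wins + losses) * 100
# A positive residual means the team won more often than the model predicts.
residual = actual - predicted

print(round(predicted, 2), actual, round(residual, 2))
```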

Ex12/696/728: In exercise 2, 10 observations were provided for a dependent variable y
and two independent variables x1 and x2; for these data SST = 15,182.9, and SSR =
14,052.2.
a. Compute R2.
R2 = SSR/SST = 14052.2/15182.9 = 0.9255
b. Compute R2 a.
R2a = 1 − (1 − R2)∗(n − 1)/(n − p − 1) = 1 − (1 − 0.9255)∗(10 − 1)/(10 − 2 − 1) = 0.904
c. Does the estimated regression equation explain a large amount of the variability in
the data? Explain.
Yes. R2 = 0.9255 means that about 92.55% of the variability in y is explained by the
estimated regression equation. After adjusting for the number of independent variables in
the model, R2a is still about 0.904, so the equation explains a large amount of the
variability in the data.
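Both goodness-of-fit measures used in exercises 12 and 15 can be reproduced from SSR, SST, n, and p. A minimal sketch with illustrative function names:

```python
def r_squared(ssr, sst):
    """Coefficient of determination: share of total variation explained."""
    return ssr / sst

def adjusted_r_squared(r2, n, p):
    """Adjust R-squared for n observations and p independent variables."""
    return 1 - (1 - r2) * (n - 1) / (n - p - 1)

# Exercise 12: n = 10 observations, p = 2 independent variables.
r2_ex12 = r_squared(14052.2, 15182.9)
r2a_ex12 = adjusted_r_squared(r2_ex12, n=10, p=2)

# Exercise 15: n = 8 weeks, p = 2 independent variables.
r2_ex15 = r_squared(23.435, 25.5)
r2a_ex15 = adjusted_r_squared(r2_ex15, n=8, p=2)

print(round(r2_ex12, 4), round(r2a_ex12, 4))
print(round(r2_ex15, 4), round(r2a_ex15, 4))
```

The adjusted value is always at or below R2 when p ≥ 1, and the gap widens as the sample gets small relative to the number of predictors.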
Ex15/696/728: In exercise 5, the owner of Showtime Movie Theaters, Inc., used multiple
regression analysis to predict gross revenue ( y) as a function of television advertising
(x1) and newspaper advertising (x2). The estimated regression equation was
^y =83.2+ 2.29 x 1 +1.3 x 2
The computer solution provided SST = 25.5 and SSR = 23.435.
a. Compute and interpret R2 and R2 a.
R2 = SSR/SST = 23.435/25.5 = 0.919
Since R2 = 0.919, the regression relationship is a very good fit. Approximately 91.9% of
the variability in the dependent variable (gross revenue) can be explained by its linear
relationship with television and newspaper advertising, and only about 8.1% is due to other
factors not included in this model. Hence, the estimated regression equation is a good fit.
R2a = 1 − (1 − R2)∗(n − 1)/(n − p − 1) = 1 − (1 − 0.919)∗(8 − 1)/(8 − 2 − 1) = 0.8866
After adjusting for the number of predictors, the model still explains 88.66% of the variability
in the dependent variable, showing that the model remains quite effective.
b. When television advertising was the only independent variable, R2 = .653 and
R2 a = .595. Do you prefer the multiple regression results? Explain.
Multiple regression analysis is preferred because both R2 = 0.919 and R2a = 0.8866 are
higher than the values for the simple regression (0.653 and 0.595), showing an increased
percentage of the variability of y explained when both independent variables are used.
Ex19/704/736: In exercise 1, the following estimated regression equation based on 10
observations was presented.
^y =29.1270+0.5906 x 1+ 0.4980 x 2
here SST = 6724.125, SSR = 6216.375, sb1= .0813, and sb2= .0567.
a. Compute MSR and MSE.
MSR = SSR/p = 6216.375/2 = 3108.1875
MSE = SSE/(n − p − 1) = (6724.125 − 6216.375)/(10 − 2 − 1) = 72.5357
b. Compute F and perform the appropriate F test. Use a = .05.
Hypothesis: H 0 : β 1=β 2=0
H a : at least one parameter is not equal to 0
F-test: F = MSR/MSE = 3108.1875/72.5357 = 42.85
From the F-table, we can see that F 0.05 ;2; 7=4.7374
Since F > Fa (42.85 > 4.7374) => Reject H 0.
We conclude that the overall model is significant.
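The F test above can be retraced step by step from SSR, SST, n, and p. The sketch below takes the critical value 4.7374 from the F table quoted in the solution rather than computing it; the function name is an assumption.

```python
def f_test_components(ssr, sst, n, p):
    """Return (MSR, MSE, F) for a regression with p independent variables."""
    sse = sst - ssr            # error sum of squares
    msr = ssr / p              # mean square due to regression
    mse = sse / (n - p - 1)    # mean square error
    return msr, mse, msr / mse

msr, mse, f_value = f_test_components(ssr=6216.375, sst=6724.125, n=10, p=2)

F_CRITICAL = 4.7374  # F(0.05; 2, 7), read from the F table as in the solution
reject_h0 = f_value > F_CRITICAL

print(round(msr, 4), round(mse, 4), round(f_value, 2), reject_h0)
```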
c. Perform a t test for the significance of b1. Use a = .05.
Hypothesis: H 0 : β 1=0
H a : β1≠ 0
T-test: t = b1/sb1 = 0.5906/0.0813 = 7.2644
From the t-table, we can see that t 0.025; 7 = 2.365.
Since t > t α/2 (7.2644 > 2.365) => Reject H 0. We conclude that β 1 is significant.
d. Perform a t test for the significance of b2. Use a = .05.
Hypothesis: H 0 : β 2=0
H a : β2≠ 0
T-test: t = b2/sb2 = 0.4980/0.0567 = 8.783
From the t-table, we can see that t 0.025; 7 = 2.365.
Since t > t α/2 (8.783 > 2.365) => Reject H 0. We conclude that β 2 is significant.
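The two t tests differ only in the coefficient and its standard error. The sketch below assumes the two-tailed critical value for α = .05 with n − p − 1 = 7 degrees of freedom, 2.365, as read from a standard t table; the names are illustrative.

```python
T_CRITICAL = 2.365  # t(0.025; 7) from a standard t table (assumed, not computed)

def t_statistic(coefficient, std_error):
    """Test statistic for H0: beta = 0."""
    return coefficient / std_error

def is_significant(t_value, t_critical=T_CRITICAL):
    """Two-tailed rejection rule: reject H0 when |t| exceeds the critical value."""
    return abs(t_value) > t_critical

t_b1 = t_statistic(0.5906, 0.0813)
t_b2 = t_statistic(0.4980, 0.0567)

print(round(t_b1, 4), is_significant(t_b1))
print(round(t_b2, 4), is_significant(t_b2))
```

Using the absolute value keeps the rule correct for negative coefficients as well.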

Ex50/741/773: The personnel director for Electronics Associates developed the following
estimated regression equation relating an employee’s score on a job satisfaction test to
his or her length of service and wage rate.
^y =14.4−8.69 x 1 +13.5 x2
where
x1 = length of service (years)
x2 = wage rate (dollars)
y = job satisfaction test score (higher scores indicate greater job satisfaction)

a. Interpret the coefficients in this estimated regression equation.


The coefficient b1 = −8.69. This means that for each additional year of service, the job
satisfaction test score is expected to decrease by 8.69, holding the wage rate constant.
The coefficient b2 = 13.5. This means that for each additional dollar in the wage rate, the
job satisfaction test score is expected to increase by 13.5, holding length of service
constant.

b. Predict the job satisfaction test score for an employee who has four years of service
and makes $13.00 per hour.
^y =14.4−8.69∗4+13.5∗13=155.14
Hence, the predicted job satisfaction test score for an employee who has four years of
service and makes $13.00 per hour is 155.14.

Ex49/741/773: The admissions officer for Clearwater College developed the following
estimated regression equation relating the final college GPA to the student’s SAT
mathematics score and highschool GPA.
^y =−1.41+ 0.0235 x 1 +0.00486 x 2
where
x1 = high-school grade point average
x2 = SAT mathematics score
y = final college grade point average
a. Interpret the coefficients in this estimated regression equation.
The coefficient b1 = 0.0235. This means for each one-unit increase in the high-school GPA,
the final college GPA is expected to increase by 0.0235, holding the SAT mathematics score
constant.
The coefficient b2 = 0.00486. This means for each one-point increase in the SAT mathematics
score, the final college GPA is expected to increase by 0.00486, holding the high-school GPA
constant.

b. Predict the final college GPA for a student who has a high-school average of 84 and a
score of 540 on the SAT mathematics test.
^y =−1.41+ 0.0235∗84+0.00486∗540=3.1884
Hence, the predicted final college GPA for a student with a high-school average of 84 and a
score of 540 on the SAT mathematics test is approximately 3.19.

Ex51/742/774: A partial computer output from a regression analysis follows


a. Compute the missing entries in this output.

Computed entries: Regression DF = 2, Regression Adj MS = 806, F-Value = 71.82, Error Adj SS = 134.676, Total DF = 14.

Source      DF   Adj SS     Adj MS    F-Value   P-Value
Regression   2   1612       806       71.82     0.000
x1           1   146.366    146.366   13.042    0.004
x2           1   289.047    289.047   25.756    0.000
Error       12   134.676    11.223
Total       14   1746.676

S      R-sq     R-sq(adj)   R-sq(pred)
3.35   92.30%   91.02%      85.12%

Term       Coef    SE Coef   T-Value   P-Value   VIF
Constant   8.103   2.667     3.04      0.010
x1         7.602   2.105     3.61      0.004     1.62
x2         3.111   0.613     5.08      0.000     1.62
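The missing ANOVA entries follow from a few identities: SSE = SST − SSR, each mean square is a sum of squares divided by its degrees of freedom, and F = MSR/MSE. The sketch below assumes that the regression and total sums of squares and the error degrees of freedom are the given entries; the variable names are illustrative.

```python
# Entries assumed to be given in the partial output.
ss_regression = 1612.0
ss_total = 1746.676
df_error = 12
p = 2  # number of independent variables (x1 and x2)

# Entries derived from the identities above.
df_regression = p
df_total = df_regression + df_error
ss_error = ss_total - ss_regression
ms_regression = ss_regression / df_regression
ms_error = ss_error / df_error
f_value = ms_regression / ms_error
s = ms_error ** 0.5                 # standard error of the estimate
r_sq = ss_regression / ss_total

print(df_regression, df_total, round(ss_error, 3), round(ms_regression, 3))
print(round(ms_error, 3), round(f_value, 2), round(s, 2), round(r_sq, 4))
```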
b. Use the F test and a = .05 to see whether a significant relationship is present.
Hypothesis: H 0 : β 1=β 2=0
H a : at least one parameter is not equal to 0
Since p−value=0.0000<α =0.05
=> Reject Ho
Hence, we conclude that there is a significant relationship between the (x1, x2 and y)
c. Use the t test and a = .05 to test H0: b1 = 0 and H0: b2 = 0.
Hypothesis: H 0 : β 1=0
H a : β1≠ 0
Since p−value=0.004< α=0.05
=> Reject Ho
Hence, we conclude that β 1 is significant.

Hypothesis: H 0 : β 2=0
H a : β2≠ 0
Since p−value=0.000<α =0.05
=> Reject Ho
Hence, we conclude that β 2 is significant.
Ex52/742/774: Recall that in exercise 49, the admissions officer for Clearwater College
developed the following estimated regression equation relating final college GPA to the
student’s SAT mathematics score and high-school GPA.
^y = −1.41 + 0.0235 x1 + 0.00486 x2
where
x1 = high-school grade point average
x2 = SAT mathematics score
y = final college grade point average
a. Complete the missing entries in this output.
Source      DF   Adj SS    Adj MS     F-Value   P-Value
Regression   2   1.76209   0.881045   52.29     0.000
X1           1   0.12389   0.12389    7.35      0.030
X2           1   0.34308   0.34308    20.36     0.003
Error        7   0.11794   0.01685
Total        9   1.88003

S        R-sq     R-sq(adj)   R-sq(pred)
0.1298   93.72%   91.93%      85.12%

Term       Coef      SE Coef   T-Value   P-Value   VIF
Constant   -1.41     0.4848    -2.91     0.023
X1         0.0235    0.0087    2.69      0.030     1.54
X2         0.00486   0.0011    4.47      0.003     1.54
b. Use the F test and a .05 level of significance to see whether a significant relationship
is present.
Hypothesis: H 0 : β 1=β 2=0
H a : at least one parameter is not equal to 0
Since p−value=0.0000<α =0.05
=> Reject Ho
Hence, we conclude that there is a significant relationship between the (x1, x2 and y)
c. Use the t test and a = .05 to test H0: b1 = 0 and H0: b2 = 0.
Hypothesis: H 0 : β 1=0
H a : β1≠ 0
Since p−value=0.03<α =0.05
=> Reject Ho
Hence, we conclude that β 1 is significant.

Hypothesis: H 0 : β 2=0
H a : β2≠ 0
Since p−value=0.003<α =0.05
=> Reject Ho
Hence, we conclude that β 2 is significant.
d. Did the estimated regression equation provide a good fit to the data? Explain.
Since R2 = 0.9372, the regression relationship is a very good fit. Approximately 93.72% of
the variability in the dependent variable can be explained by the linear relationship, and
only about 6.28% is due to other factors not included in this model. Hence, the estimated
regression equation is a good fit.
Ex53/743/775: Recall that in exercise 50 the personnel director for Electronics
Associates developed the following estimated regression equation relating an employee’s
score on a job satisfaction test to length of service and wage rate.
^y =14.41−8.69 x 1+13.52 x 2
where
x1 = length of service (years)
x2 = wage rate (dollars)
y = job satisfaction test score (higher scores
indicate greater job satisfaction)

a. Complete the missing entries in this output.


Source      DF   Adj SS   Adj MS   F-Value   P-Value
Regression   2   648.82   324.41   22.788    0.003
x1           1   444.58   444.58   31.23     0.003
x2           1   598.57   598.57   42.05     0.001
Error        5   71.18    14.236
Total        7   720.00

S       R-sq     R-sq(adj)   R-sq(pred)
3.773   90.11%   86.15%      85.12%

Term       Coef    SE Coef   T-Value   P-Value   VIF
Constant   14.41   8.191     1.76      0.139
x1         -8.69   1.555     -5.588    0.003     1.95
x2         13.52   2.085     6.4844    0.001     1.95

b. Compute F and test using a = .05 to see whether a significant relationship is
present.
Hypothesis: H 0 : β 1=β 2=0
H a : at least one parameter is not equal to 0
F-test: F = MSR/MSE = 324.41/14.236 = 22.788
From the F-table, we can see that F 0.05 ;2; 5=5.7861
Since F > Fa (22.788 > 5.7861) => Reject H 0.
Hence, we conclude that there is a significant relationship between the (x1, x2 and y).

c. Did the estimated regression equation provide a good fit to the data? Explain.
Since R2=0.9011. The regression relationship is highly good fit. This means approximately
90.11% of the variability in the number of the dependent variable can be explained by the
linear relationship, and just about 9.89% other factors are not mentioned in this model.
Hence, the estimated regression equation is a good fit.

d. Use the t test and a = .05 to test H0: b1 = 0 and H0: b2 = 0.


Hypothesis: H 0 : β 1=0
H a : β1≠ 0
Since p−value=0.003<α =0.05
=> Reject Ho
Hence, we conclude that β 1 is significant.

Hypothesis: H 0 : β 2=0
H a : β2≠ 0
Since p−value=0.001< α =0.05
=> Reject Ho
Hence, we conclude that β 2 is significant.
Ex54/746/774: The Tire Rack, America’s leading online distributor of tires and wheels,
conducts extensive testing to provide customers with products that are right for their
vehicle, driving style, and driving conditions. In addition, the Tire Rack maintains an
independent consumer survey to help drivers help each other by sharing their long-
term tire experiences. The following data show survey ratings (1 to 10 scale with 10 the
highest rating) for 18 maximum performance summer tires. The variable Steering rates
the tire’s steering responsiveness, Tread wear rates quickness of wear based on the
driver’s expectations, and buy Again rates the driver’s overall tire satisfaction and
desire to purchase the same tire again.

a. Develop an estimated regression equation that can be used to predict the buy Again
rating given based on the Steering rating. At the .05 level of significance, test for a
significant relationship.
An estimated regression equation that can be used to predict the buy Again
rating given based on the Steering rating is:
^y =−7.521829+ 1.815067 x1
Hypothesis: H 0 : β 1=0
H a : β1≠ 0
Since p−value=0.000<α =0.05
=> Reject Ho
Hence, we conclude that there is a significant relationship between the two variable
(BuyAgain and SteeringRating )

b. Did the estimated regression equation developed in part (a) provide a good fit to the
data? Explain.
Since R2 = 0.843, the regression relationship is a good fit. Approximately 84.3% of the
variability in the dependent variable (BuyAgain) can be explained by its linear relationship
with Steering, and only about 15.7% is due to other factors not included in this model.
Hence, the estimated regression equation is a good fit.

c. Develop an estimated regression equation that can be used to predict the buy Again
rating given the Steering rating and the Tread wear rating.

BuyAgain=−5.387673+ 0.9113301∗Treadwear+0.6898711∗Steering
^y =−5.387673+0.9113301 x 2+ 0.6898711 x1

d. Is the addition of the Tread wear independent variable significant? Use a = .05.
Hypothesis: H 0 : β 2=0
H a : β2≠ 0
Since p−value=0.001< α =0.05
=> Reject Ho
Hence, we conclude that β 2 is significant, so the addition of Tread wear is significant.
Ex55/745/777: The Department of Energy and the U.S. Environmental Protection
Agency’s 2012 Fuel Economy Guide provides fuel efficiency data for 2012 model year
cars and trucks (Department of Energy website, April 16, 2012). The file named
2012FuelEcon provides a portion of the data for 309 cars. The column labeled
Manufacturer shows the name of the company that manufactured the car; the column
labeled Displacement shows the engine’s displacement in liters; the column labeled Fuel
shows the required or recommended type of fuel (regular or premium gasoline); the
column labeled Drive identifies the type of drive (F for front wheel, R for rear wheel,
and A for all wheel); and the column labeled hwy MPG shows the fuel efficiency rating
for highway driving in terms of miles per gallon.

a. Develop an estimated regression equation that can be used to predict the fuel
efficiency for highway driving given the engine’s displacement. Test for significance
using a = .05.

Hwy MPG=41.0534−3.7232∗Displacement
Hypothesis: H 0 : β 1=0
H a : β1≠ 0
Since p−value=0.000<α =0.05
=> Reject Ho
Hence, we conclude that there is a significant relationship between the two variables.

b. Consider the addition of the dummy variable FuelPremium, where the value of
FuelPremium is 1 if the required or recommended type of fuel is premium gasoline
and 0 if the type of fuel is regular gasoline. Develop the estimated regression equation
that can be used to predict the fuel efficiency for highway driving given the engine’s
displacement and the dummy variable FuelPremium.
Hwy MPG=40.5946−3.1944∗Displacement −2.723∗FuelPremium
c. Use a = .05 to determine whether the dummy variable added in part (b) is
significant.
Hypothesis: H 0 : β 2=0
H a : β2≠ 0
Since p−value=0.000<α =0.05
=> Reject Ho
Hence, we conclude that β 2 is significant, so the dummy variable FuelPremium is significant.
d. Consider the addition of the dummy variables Frontwheel and Rearwheel. The value
of Frontwheel is 1 if the car has front wheel drive and 0 otherwise; the value of
Rearwheel is 1 if the car has rear wheel drive and 0 otherwise. Thus, for a car that has
all-wheel drive, the value of Frontwheel and the value of Rearwheel is 0. Develop
the estimated regression equation that can be used to predict the fuel efficiency for
highway driving given the engine’s displacement, the dummy variable FuelPremium,
and the dummy variables Frontwheel and Rearwheel.

Hwy MPG = 37.76908 − 3.199242∗Displacement − 2.057244∗FuelPremium + 3.192388∗FrontRearWheel
e. For the estimated regression equation developed in part (d), test for overall
significance and individual significance using a = .05.
Hypothesis: H 0 : β 1=β 2=β 3=0
H a : at least one parameter is not equal to 0
Since p−value=0.000<α =0.05
=> Reject H 0.
Hence, we conclude that there is a significant relationship between the (x1, x2, x3 and y).

Ex56/746/778: A portion of a data set containing information for 45 mutual funds that
are part of the Morningstar Funds 500 follows. The complete data set is available in the
file named MutualFunds. The data set includes the following five variables:
Fund Type: The type of fund, labeled DE (Domestic Equity), IE (International Equity),
and FI (Fixed Income). Net Asset Value ($): The closing price per share on December
31, 2007. 5-Year Average Return (%): The average annual return for the fund over the
past five years. Expense Ratio (%): The percentage of assets deducted each fiscal year
for fund expenses. Morningstar Rank: The risk adjusted star rating for each fund;
Morningstar ranks go from a low of 1-Star to a high of 5-Stars.
a. Develop an estimated regression equation that can be used to predict the 5-year average return
given the type of fund. At the .05 level of significance, test for a significant relationship.

^y =4.909+10.46581∗Fund DE+ 21.68225∗Fund IE


Hypothesis: H 0 : β 1=β 2=0
H a : at least one parameter is not equal to 0
Since p−value=0.000<α =0.05
=> Reject H 0.
Hence, we conclude that there is a significant relationship between the (x1, x2 and y)

b. Did the estimated regression equation developed in part (a) provide a good fit to the
data? Explain.
Since R2 = 0.6144, the fit is moderate. Approximately 61.44% of the variability in the
dependent variable (5-year average return) can be explained by the type of fund, and about
38.56% is due to other factors not included in this model. Hence, the estimated regression
equation does not provide a particularly good fit.

c. Develop the estimated regression equation that can be used to predict the 5-year
average return given the type of fund, the net asset value, and the expense ratio. At the
.05 level of significance, test for a significant relationship. Do you think any variables
should be deleted from the estimated regression equation? Explain.
^y =1.189876 +6.896857∗Fund DE+ 17.67996∗Fund IE +0.026462∗NetAssetValue
+6.456421∗ExpenseRatio
T-test
For Fund DE:
Hypothesis: H 0 : β 1=0
H a : β1≠ 0
Since p−value=0.017<α =0.05
=> Reject Ho. Hence, we conclude that β 1 is significant.
For Fund IE:
Hypothesis: H 0 : β 2=0
H a : β2≠ 0
Since p−value=0.000<α =0.05
=> Reject Ho. Hence, we conclude that β 2 is significant.
For Net Asset Value:
Hypothesis: H 0 : β 3=0
H a : β3≠ 0
Since p−value=0.695>α =0.05
=> Do not reject Ho. Hence, we conclude that β 3 is not significant.
For Expense Ratio:
Hypothesis: H 0 : β 4=0
H a : β4 ≠ 0
Since p−value=0.024< α=0.05
=> Reject Ho. Hence, we conclude that β 4 is significant.
Net Asset value is not significant and can be deleted.
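The decision rule applied to each coefficient is the same: reject H0 only when the p-value is below α. A minimal sketch using the p-values reported above (the dictionary and function names are illustrative):

```python
ALPHA = 0.05

# p-values reported for the t tests in part (c).
p_values = {
    "Fund DE": 0.017,
    "Fund IE": 0.000,
    "Net Asset Value": 0.695,
    "Expense Ratio": 0.024,
}

def decision(p_value, alpha=ALPHA):
    """Apply the p-value rejection rule for H0: beta = 0."""
    return "reject H0" if p_value < alpha else "do not reject H0"

decisions = {name: decision(p) for name, p in p_values.items()}
# Variables whose coefficients are not significant are candidates for deletion.
candidates_to_delete = [name for name, p in p_values.items() if p >= ALPHA]

for name, result in decisions.items():
    print(f"{name}: {result}")
print(candidates_to_delete)
```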
d. Morningstar Rank is a categorical variable. because the data set contains only funds
with four ranks (2-Star through 5-Star), use the following dummy variables: 3StarRank
= 1 for a 3-Star fund, 0 otherwise; 4StarRank = 1 for a 4-Star fund, 0 otherwise;
and 5StarRank = 1 for a 5-Star fund, 0 otherwise. Develop an estimated regression
equation that can be used to predict the 5-year average return given the type of fund,
the expense ratio, and the Morningstar Rank. Using a = .05, remove any independent
variables that are not significant.
^y =−4.0674+ 8.1713∗Fund DE+19.5194∗Fund IE +5.5197∗ExpenseRatio
+5.9237∗Rank 3+ 8.82367∗Rank 4+6.6241∗Rank 5
e. Use the estimated regression equation developed in part (d) to predict the 5-year
average return for a domestic equity fund with an expense ratio of 1.05% and a 3-Star
Morningstar Rank.
The 5-year average return for a domestic equity fund with an expense ratio of 1.05% and a 3-
Star Morningstar Rank
^y =−4.0674+ 8.1713+5.5197∗1.05+5.9237=15.823 %
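With dummy variables, a prediction only adds the coefficients of the categories that apply; the baseline categories (here assumed to be Fixed Income and a 2-Star rank, since they have no dummy) contribute nothing. A sketch using the coefficients from part (d), with hypothetical key and function names:

```python
COEFFICIENTS = {
    "intercept": -4.0674,
    "fund_DE": 8.1713,
    "fund_IE": 19.5194,
    "expense_ratio": 5.5197,
    "rank_3": 5.9237,
    "rank_4": 8.82367,
    "rank_5": 6.6241,
}

def predict_five_year_return(fund_type, expense_ratio, stars):
    """Predicted 5-year average return (%) for one fund."""
    y = COEFFICIENTS["intercept"] + COEFFICIENTS["expense_ratio"] * expense_ratio
    if fund_type in ("DE", "IE"):   # "FI" is the baseline fund type
        y += COEFFICIENTS["fund_" + fund_type]
    if stars in (3, 4, 5):          # 2-Star is the baseline rank
        y += COEFFICIENTS["rank_" + str(stars)]
    return y

prediction = predict_five_year_return("DE", expense_ratio=1.05, stars=3)
print(round(prediction, 3))
```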
Ex57/746/778: Fortune magazine publishes an annual list of the 100 best companies to
work for. The data in the file named Fortunebest shows a portion of the data for a
random sample of 30 of the companies that made the top 100 list for 2012 (Fortune,
February 6, 2012). The column labeled Rank shows the rank of the company in the
Fortune 100 list; the column labeled Size indicates whether the company is a small,
midsize, or large company; the column labeled Salaried ($1000s) shows the average
annual salary for salaried employees rounded to the nearest $1000; and the column
labeled hourly ($1000s) shows the average annual salary for hourly employees rounded
to the nearest $1000. Fortune defines large companies as having more than 10,000
employees, midsize companies as having between 2500 and 10,000 employees, and small
companies as having fewer than 2500 employees.

a. Use these data to develop an estimated regression equation that could be used to
predict the average annual salary for salaried employees given the average annual
salary for hourly employees.
Salaries=40.34852+1.194687∗Hourly
b. Use a = .05 to test for overall significance.
Hypothesis: H 0 : β 1=0
H a : β1≠ 0
Since p−value=0.0005<α =0.05
=> Reject H 0.
Hence, we conclude that there is a significant relationship between the average annual salary
for salaried employees and the average annual salary for hourly employees.
c. To incorporate the effect of size, a categorical variable with three levels, we used two
dummy variables: Size-Midsize and Size-Small. The value of Size-Midsize = 1 if the
company is a midsize company and 0 otherwise. And, the value of Size-Small = 1 if
the company is a small company and 0 otherwise. Develop an estimated regression
equation that could be used to predict the average annual salary for salaried employees
given the average annual salary for hourly employees and the size of the company.

Salaries=26.9659+1.224045∗Hourly−3.208216∗Midsize+34.40215∗Small
d. For the estimated regression equation developed in part (c), use the t test to
determine the significance of the independent variables. Use a = .05.
For Hourly:
Hypothesis: H 0 : β 1=0
H a : β1≠ 0
Since p−value=0.000<α =0.05
=> Reject Ho. Hence, we conclude that β 1 is significant.
For Midsize:
Hypothesis: H 0 : β 2=0
H a : β2≠ 0
Since p−value=0.802> α =0.05
=> Do not reject Ho. Hence, we conclude that β 2 is not significant.
For Small:
Hypothesis: H 0 : β 3=0
H a : β3≠ 0
Since p−value=0.003<α =0.05
=> Reject Ho. Hence, we conclude that β 3 is significant.
e. based upon your findings in part (d), develop an estimated regression equation that
can be used to predict the average annual salary for salaried employees given the
average annual salary for hourly employees and the size of the company.
Salary=26.9659+1.2240∗Hourly + 34.4021∗Small

CHAP 16
Ex29/798/830: A sample containing years to maturity and yield (%) for 40 corporate
bonds is contained in the data file named CorporateBonds (Barron’s, April 2, 2012).

a. Develop a scatter diagram of the data using x = years to maturity as the independent
variable. Does a simple linear regression model appear to be appropriate?

There appears to be a curvilinear relationship between years and yield.


b. Develop an estimated regression equation with x = years to maturity and x2 as the
independent variables.
Yield = 1.016952 + 0.4606362∗Years − 0.0102532∗Years²
c. As an alternative to fitting a second-order model, fit a model using the natural
logarithm of price as the independent variable; that is, ŷ = b0 + b1 ln(x). Does the
estimated regression using the natural logarithm of x provide a better fit than the
estimated regression developed in part (b)? Explain.

Yield=0.8278764 +1.562609∗ln( years)


We have R2 (c) > R2 (b) (0.6695 > 0.6678), so the model using the natural logarithm of x in
part (c) provides a slightly better fit than the estimated regression developed in part (b).
Ex30/799/831: Consumer reports tested 19 different brands and models of road, fitness,
and comfort bikes. Road bikes are designed for long road trips; fitness bikes are
designed for regular workouts or daily commutes; and comfort bikes are designed for
leisure rides on typically flat roads. The following data show the type, weight (lb.), and
price ($) for the 19 bicycles tested (Consumer reports website, February 2009).

a. Develop a scatter diagram with weight as the independent variable and price as the
dependent variable. Does a simple linear regression model appear to be appropriate?

There appears to be a curvilinear relationship between weight and price.


b. Develop an estimated multiple regression equation with x = weight and x2 as the two
independent variables.

Price = 11376 − 728.3345∗Weight + 11.97374∗Weight²
^y = 11376 − 728.3345 x + 11.97374 x²

c. Use the following dummy variables to develop an estimated regression equation that
can be used to predict the price given the type of bike: Type_Fitness = 1 if the bike
is a fitness bike, 0 otherwise; and Type_Comfort = 1 if the bike is a comfort bike; 0
otherwise. Compare the results obtained to the results obtained in part (b).

Price = 1283.75 − 571.75∗Type_Fitness − 907.0833∗Type_Comfort
^y = 1283.75 − 571.75 x1 − 907.0833 x2
Type of bike appears to be a significant factor in predicting price, but the estimated
regression equation developed in part (b) appears to provide a slightly better fit.

d. To account for possible interaction between the type of bike and the weight of the
bike, develop a new estimated regression equation that can be used to predict the price
of the bike given the type, the weight of the bike, and any interaction between weight
and each of the dummy variables defined in part (c). What estimated regression
equation appears to be the best predictor of price? Explain.
Price = 5923.544 − 6343.25∗Type_Fitness − 7232.063∗Type_Comfort
− 214.557∗Weight + 266.4088∗WeightComfort + 261.3217∗WeightFitness
^y = 5923.544 − 6343.25 x1 − 7232.063 x2 − 214.557 x3 + 266.4088 x4 + 261.3217 x5
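
To see what the interaction terms imply, the part (d) equation can be collapsed into a separate price line for each bike type. This is a minimal sketch that assumes WeightComfort and WeightFitness are the products Weight × Type_Comfort and Weight × Type_Fitness, as the exercise describes; the function names are illustrative.

```python
# Coefficients copied from the part (d) estimated regression equation.
B0 = 5923.544
B_FITNESS = -6343.25
B_COMFORT = -7232.063
B_WEIGHT = -214.557
B_WEIGHT_COMFORT = 266.4088
B_WEIGHT_FITNESS = 261.3217

def predict_price(weight: float, bike_type: str) -> float:
    """Predicted price for a bike of the given weight (lb.) and type."""
    if bike_type not in ("road", "fitness", "comfort"):
        raise ValueError(f"unknown bike type: {bike_type}")
    fitness = 1 if bike_type == "fitness" else 0
    comfort = 1 if bike_type == "comfort" else 0
    return (B0 + B_FITNESS * fitness + B_COMFORT * comfort
            + B_WEIGHT * weight
            + B_WEIGHT_COMFORT * weight * comfort
            + B_WEIGHT_FITNESS * weight * fitness)

def line_for(bike_type: str) -> tuple[float, float]:
    """Intercept and slope of the price-versus-weight line for one type."""
    intercept = predict_price(0.0, bike_type)
    slope = predict_price(1.0, bike_type) - intercept
    return intercept, slope

for bike_type in ("road", "fitness", "comfort"):
    intercept, slope = line_for(bike_type)
    print(f"{bike_type:>7}: Price = {intercept:.3f} + ({slope:.4f}) * Weight")
```

Road bikes are the baseline category (both dummy variables equal 0), so their line uses only the intercept and the Weight coefficient; the other two types shift both the intercept and the slope.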

Ex31/799/831: A study investigated the relationship between audit delay (Delay), the
length of time from a company’s fiscal year-end to the date of the auditor’s report, and
variables that describe the client and the auditor. Some of the independent variables
that were included in this study follow.
Industry: A dummy variable coded 1 if the firm was an industrial company or 0 if
the firm was a bank, savings and loan, or insurance company.
Public: A dummy variable coded 1 if the company was traded on an organized
exchange or over the counter; otherwise coded 0.
Quality: A measure of overall quality of internal controls, as judged by the
auditor, on a five-point scale ranging from “virtually none” (1) to
“excellent” (5).
Finished: A measure ranging from 1 to 4, as judged by the auditor, where 1
indicates “all work performed subsequent to year-end” and 4 indicates
“most work performed prior to year-end.”
A sample of 40 companies provided the following data.
a. Develop the estimated regression equation using all of the independent variables.
Delay=80.42857+11.94419∗Industry−4.816257∗Public
−2.623635∗Quality−4.072511∗Finished
b. Did the estimated regression equation developed in part (a) provide a good fit?
Explain.
R2 = 0.3826, so approximately 38.26% of the variability in Delay is explained by the
estimated regression equation with the four independent variables; the remaining 61.74% is
due to factors that are not included in this model. Hence, the estimated regression equation
does not provide a good fit.
c. Develop a scatter diagram showing Delay as a function of Finished. What does this
scatter diagram indicate about the relationship between Delay and Finished?

Delay=80.22588−4.382416∗Finished
d. On the basis of your observations about the relationship between Delay and Finished,
develop an alternative estimated regression equation to the one developed in (a) to
explain as much of the variability in Delay as possible.
Delay=80.42857 −4.072511∗Finished
Ex32/801/833: Refer to the data in exercise 31. Consider a model in which only Industry
is used to predict Delay. At a .01 level of significance, test for any positive
autocorrelation in the data.

Delay=63+ 11.07407∗Industry
^y =63+11.07407 x

Autocorrelation test

Hypotheses:

H0: ρ = 0 (there is no autocorrelation)

Ha: ρ > 0 (positive autocorrelation), the alternative tested here; ρ < 0 would test for
negative autocorrelation and ρ ≠ 0 for either negative or positive autocorrelation.

Test statistic:

p-value = 0.011

Since p-value = 0.011 > α = 0.01, do not reject H0.

Hence, we conclude that there is no significant positive autocorrelation at the .01 level.
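
A common statistic for first-order autocorrelation is the Durbin-Watson statistic, d = Σ(e_t − e_{t−1})^2 / Σ e_t^2, computed from the residuals in their original order. Whether the p-value above was obtained from this statistic is not stated, so the sketch below is only an illustration; the residuals in it are hypothetical and are not the residuals of the Delay regression.

```python
def durbin_watson(residuals: list[float]) -> float:
    """Durbin-Watson statistic for residuals listed in observation order."""
    if len(residuals) < 2:
        raise ValueError("need at least two residuals")
    numerator = sum((residuals[t] - residuals[t - 1]) ** 2
                    for t in range(1, len(residuals)))
    denominator = sum(e ** 2 for e in residuals)
    if denominator == 0:
        raise ValueError("all residuals are zero")
    return numerator / denominator

# Hypothetical residuals (NOT from the exercise data), for illustration only.
hypothetical_residuals = [1.0, 2.0, 1.5, -0.5, -1.0, -2.0, 0.5, 1.0]
d = durbin_watson(hypothetical_residuals)
print(f"Durbin-Watson d = {d:.4f}")
```

Values of d well below 2 point toward positive autocorrelation, values near 2 toward none, and values well above 2 toward negative autocorrelation; the formal decision compares d with the tabulated lower and upper bounds for the sample size and number of independent variables.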

Ex33/801/833: Refer to the data in exercise 31.


a. Develop an estimated regression equation that can be used to predict Delay by using
Industry and Quality.
Delay=70.63361+12.73717∗Industry −2.918732∗Quality
b. Plot the residuals obtained from the estimated regression equation developed in part
(a) as a function of the order in which the data are presented. Does any autocorrelation
appear to be present in the data? Explain.

The plot shows the residuals from the part (a) regression against the order in which the data
are presented. This type of plot is commonly used to check for autocorrelation in the
residuals.
Autocorrelation refers to the correlation of a series with its own past and future values.
In a regression analysis, autocorrelation in the residuals indicates that an assumption of
the regression model is violated, namely the independence of the error terms.
The residuals are scattered randomly without any discernible pattern, so no autocorrelation
appears to be present in the data.

c. At the .05 level of significance, test for any positive autocorrelation in the data.
Autocorrelation test

Hypotheses:
H0: ρ = 0 (there is no autocorrelation)

Ha: ρ > 0 (positive autocorrelation), the alternative tested here; ρ < 0 would test for
negative autocorrelation and ρ ≠ 0 for either negative or positive autocorrelation.

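Test statistic: the p-values of Industry and Quality are both below 0.05, but they test
whether each coefficient differs from zero, not whether the residuals are autocorrelated.

Testing H0: ρ = 0 requires a statistic computed from the residuals of the part (a)
regression in the order in which the data are presented, such as the Durbin-Watson statistic.

Therefore, the coefficient p-values alone do not establish positive autocorrelation; the
residual plot in part (b) shows no pattern that would suggest it.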

Ex34/801/833: A study was conducted to investigate browsing activity by shoppers.
Shoppers were classified as nonbrowsers, light browsers, and heavy browsers. For each
shopper in the study, a measure was obtained to determine how comfortable the
shopper was in the store. Higher scores indicated greater comfort. Assume that the
following data are from this study. Use a .05 level of significance to test for differences
in comfort levels among the three types of browsers.

Hypotheses:
H0: β1 = β2 = 0
Ha: at least one of the parameters is not equal to 0

Here β1 and β2 are the coefficients of the two dummy variables (x1, x2) that identify the
type of browser.
Since p-value = 0.0337 < α = 0.05, reject H0.
Hence, we conclude that there is a statistically significant difference in comfort levels
among the three types of browsers.
Ex35/801/833: The Department of Energy and the U.S. Environmental Protection
Agency’s 2012 Fuel Economy Guide provides fuel efficiency data for 2012 model year
cars and trucks (Department of Energy website, April 16, 2012). The file named
CarMileage provides a portion of the data for 316 cars. The column labeled Size
identifies the size of the car (Compact, Midsize, and Large) and the column labeled Hwy
MPG shows the fuel efficiency rating for highway driving in terms of miles per gallon.
Use α = .05 and test for any significant difference in the mean fuel efficiency rating for
highway driving among the three sizes of cars.

Hypotheses:
H0: β1 = β2 = 0
Ha: at least one of the parameters is not equal to 0

Since p-value = 0.0000 < α = 0.05, reject H0.
Hence, we conclude that there is a significant difference in the mean fuel efficiency
rating for highway driving among the three sizes of cars.
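
The overall F test of H0: β1 = β2 = 0 in a regression on two dummy variables is equivalent to the one-way comparison of three group means. The sketch below computes that F statistic from scratch; the three groups are hypothetical numbers chosen for the example and are not the browsing or CarMileage data.

```python
def one_way_f(groups: list[list[float]]) -> float:
    """F statistic for equality of group means.

    Equals the overall F statistic of a regression of the response on
    (number of groups - 1) dummy variables that identify the groups.
    """
    n = sum(len(g) for g in groups)
    k = len(groups)
    grand_mean = sum(sum(g) for g in groups) / n
    means = [sum(g) / len(g) for g in groups]
    # Variation explained by group membership (regression sum of squares).
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, means))
    # Variation left within groups (error sum of squares).
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))

# Hypothetical scores for three groups (NOT the exercise data).
hypothetical_groups = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [6.0, 7.0, 8.0]]
f_stat = one_way_f(hypothetical_groups)
print(f"F = {f_stat:.2f} with {len(hypothetical_groups) - 1} and "
      f"{sum(len(g) for g in hypothetical_groups) - len(hypothetical_groups)} "
      "degrees of freedom")
```

The p-value reported in the answers is the upper-tail area of the F distribution at this statistic; computing it requires an F distribution function, which is left out to keep the sketch dependency-free.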

CHAP 17
Ex44/861/893: The data contained in the data file named CrudeCost show the U.S.
refiner acquisition cost of crude oil in dollars per barrel (Energy Information
Administration website, February 3, 2014).
a. Construct a time series plot. What type of pattern exists in the data?
The data exhibit a trend pattern, showing a gradual upward shift in the crude cost over the
time period.
b. Compute the linear trend equation for the time series. Use the linear trend equation to
forecast the crude cost for January 2014.

The linear trend equation: T = 81.2945 + 0.5835747∗t

From the given data, January 2010 corresponds to t = 1, and each subsequent month
increases t by 1. January 2014 is 48 months after January 2010:

t = 48 + 1 = 49
T = 81.2945 + 0.5835747∗49 = 109.890
The forecasted crude cost for January 2014 is approximately $109.89 per barrel.

c. Compute the quadratic trend equation for the time series. Use the quadratic trend
equation to forecast the crude cost for January 2014.
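d. Using MSE, which approach provides the most accurate forecasts for the historical
data?
Linear trend equation: MSE = 69.6
Quadratic trend equation: MSE = 41.9
The quadratic trend equation has the smaller MSE (41.9 < 69.6), so it provides the more
accurate forecasts for the historical data.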
Ex45/862/894: Annual retail store revenue for Apple from 2007 to 2014 is shown below
(Source: ifoapple.com).

a. Construct a time series plot. What type of pattern exists in the data?

The time series plot above shows a positive linear trend with some variability in the growth
rate over the years.
b. Using Minitab or Excel, develop a linear trend equation for this time series.

y = 2.771655 x + 0.1184284
c. Use the trend equation developed in part (b) to forecast retail store revenue for
2015.
With 2007 as x = 1, the year 2015 corresponds to x = 9:
y = 2.771655∗9 + 0.1184284 = 25.063
Ex43/861/893: United Dairies, Inc., supplies milk to several independent grocers
throughout Dade County, Florida. Managers at United Dairies want to develop a forecast
of the number of half-gallons of milk sold per week. Sales data for the past 12 weeks
follow.

a. Construct a time series plot. What type of pattern exists in the data?

The time series plot shows a horizontal pattern. Weekly sales rise from 2750 to 3250
half-gallons over the first three weeks, drop to 2800 in week 4, and then rise and fall
again, staying between 2750 and 3300 throughout. The values fluctuate around a roughly
constant level, with no sustained upward or downward trend.

b. Use exponential smoothing with α = .4 to develop a forecast of demand for week 13.

The forecast for week 2 is set equal to the actual sales in week 1; each later forecast is
0.4 times the previous week's actual sales plus 0.6 times the previous week's forecast.

For Week 2: ^y = 2750
For Week 3: ^y = 0.4∗3100 + 0.6∗2750 = 2890
For Week 4: ^y = 0.4∗3250 + 0.6∗2890 = 3034
For Week 5: ^y = 0.4∗2800 + 0.6∗3034 = 2940.4
For Week 6: ^y = 0.4∗2900 + 0.6∗2940.4 = 2924.24
For Week 7: ^y = 0.4∗3050 + 0.6∗2924.24 = 2974.544
For Week 8: ^y = 0.4∗3300 + 0.6∗2974.544 = 3104.7264
For Week 9: ^y = 0.4∗3100 + 0.6∗3104.7264 = 3102.8358
For Week 10: ^y = 0.4∗2950 + 0.6∗3102.8358 = 3041.7015
For Week 11: ^y = 0.4∗3000 + 0.6∗3041.7015 = 3025.0209
For Week 12: ^y = 0.4∗3200 + 0.6∗3025.0209 = 3095.0125
For Week 13: ^y = 0.4∗3150 + 0.6∗3095.0125 = 3117.0075

The forecasted sales for Week 13 using exponential smoothing with α = 0.4 are
approximately 3117 half-gallons.
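
The recursion can be written as a short function. The sketch below assumes the usual textbook convention that the forecast for week 2 equals the actual value for week 1 and that F(t+1) = α∗Y(t) + (1 − α)∗F(t); the weekly sales are the twelve actual values that appear in the computations above.

```python
def exponential_smoothing_forecasts(actuals: list[float], alpha: float) -> list[float]:
    """Return forecasts for periods 2 through n + 1.

    Convention assumed here: F2 = Y1, then F(t+1) = alpha*Y(t) + (1-alpha)*F(t).
    """
    if not actuals:
        raise ValueError("need at least one observation")
    forecasts = [float(actuals[0])]          # forecast for period 2
    for y in actuals[1:]:                    # Y2 ... Yn, each used exactly once
        forecasts.append(alpha * y + (1 - alpha) * forecasts[-1])
    return forecasts

# Actual weekly sales (half-gallons) for weeks 1 to 12.
weekly_sales = [2750, 3100, 3250, 2800, 2900, 3050,
                3300, 3100, 2950, 3000, 3200, 3150]

forecasts = exponential_smoothing_forecasts(weekly_sales, alpha=0.4)
for week, forecast in enumerate(forecasts, start=2):
    print(f"Week {week:>2}: forecast {forecast:.2f}")
```

Each actual value enters the recursion once, so twelve observations yield forecasts for weeks 2 through 13; the last element of the list is the week 13 forecast.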

Ex46/862/894: The Mayfair Department Store in Davenport, Iowa, is trying to determine
the amount of sales lost while it was shut down during July and August because of
damage caused by the Mississippi River flood. Sales data for January through June
follow.

a. Use exponential smoothing, with α = .4, to develop a forecast for July and August.
(Hint: Use the forecast for July as the actual sales in July in developing the August
forecast.) Comment on the use of exponential smoothing for forecasts more than one
period into the future.
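F_January = A_January = 185.72
F_February = 0.4×A_January + 0.6×F_January = 0.4×185.72 + 0.6×185.72 = 185.72
F_March = 0.4×167.84 + 0.6×185.72 = 178.568
F_April = 0.4×205.11 + 0.6×178.568 = 189.1848
F_May = 0.4×210.36 + 0.6×189.1848 = 197.65488
F_June = 0.4×255.57 + 0.6×197.65488 = 220.8209
F_July = 0.4×261.19 + 0.6×220.8209 = 236.97
F_August = 0.4×236.97 + 0.6×236.97 = 236.97
Because the July forecast stands in for the July actual value, the August forecast is the
same as the July forecast. Exponential smoothing therefore cannot reflect an upward trend
when it is used to forecast more than one period into the future.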
b. Use trend projection to forecast sales for July and August.

Sales = 149.7193 + 18.45114∗Time
Forecast for July is Sales = 149.7193 + 18.45114∗7 = 278.87728
Forecast for August is Sales = 149.7193 + 18.45114∗8 = 297.32842
c. Mayfair’s insurance company proposed a settlement based on lost sales of $240,000
in July and August. Is this amount fair? If not, what amount would you recommend
as a counteroffer?
The proposed settlement is not fair since it does not account for the upward trend in sales.
Based upon trend projection, the settlement should be based on forecasted lost sales of
$278,880 in July and $297,330 in August, a total of about $576,210.
Ex48/863/895: The Costello Music Company has been in business for five years. During
that time, sales of pianos increased from 12 units in the first year to 76 units in the most
recent year. Fred Costello, the firm’s owner, wants to develop a forecast of piano sales
for the coming year. The historical data follow.
Year 1 2 3 4 5
Sales 12 28 34 50 76
a. Construct a time series plot. What type of pattern exists in the data?

The time series plot shows a linear trend.


b. Develop the linear trend equation for the time series. What is the average increase in
sales that the firm has been realizing per year?

^y = −5 + 15∗Year
The slope of 15 indicates that the average increase in sales is 15 pianos per year.
c. Forecast sales for years 6 and 7.
Year 6: ^y = −5 + 15∗6 = 85
Year 7: ^y = −5 + 15∗7 = 100
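
The trend equation can be reproduced directly from the five observations with the least squares formulas b1 = Σ(t − t̄)(y − ȳ) / Σ(t − t̄)^2 and b0 = ȳ − b1∗t̄. The sketch below is a plain-Python illustration of that calculation (the function name is only illustrative), using the Costello sales data from the table.

```python
def linear_trend(values: list[float]) -> tuple[float, float]:
    """Least squares intercept b0 and slope b1 for periods t = 1, 2, ..., n."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two observations")
    periods = range(1, n + 1)
    t_bar = sum(periods) / n
    y_bar = sum(values) / n
    sxy = sum((t - t_bar) * (y - y_bar) for t, y in zip(periods, values))
    sxx = sum((t - t_bar) ** 2 for t in periods)
    b1 = sxy / sxx
    b0 = y_bar - b1 * t_bar
    return b0, b1

piano_sales = [12, 28, 34, 50, 76]   # years 1 to 5, from the table above
b0, b1 = linear_trend(piano_sales)
print(f"trend: y = {b0:.4f} + {b1:.4f} * Year")
for year in (6, 7):
    print(f"year {year}: forecast {b0 + b1 * year:.1f}")
```

The same function applies unchanged to the other short series in this chapter that are fitted with a linear trend, since it only assumes equally spaced periods numbered from 1.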

Ex47/862/894: Canton Supplies, Inc., is a service firm that employs approximately 100
individuals. Managers of Canton Supplies are concerned about meeting monthly cash
obligations and want to develop a forecast of monthly cash requirements. Because of a
recent change in operating policy, only the past seven months of data that follow are
considered to be relevant.
Month 1 2 3 4 5 6 7
Cash Required ($1000s) 205 212 218 224 230 240 246
a. Construct a time series plot. What type of pattern exists in the data?
The time series plot shows an upward linear trend. Cash requirements rise steadily from 205
to 246 ($1000s) over the seven months, increasing by 6 to 10 each month, with no sign of
accelerating growth.
b. Using Minitab or Excel, develop a linear trend equation to forecast cash requirements
for each of the next two months.

Cash = 197.7143 + 6.821429∗Month
The next two months are months 8 and 9:
Month 8: Cash = 197.7143 + 6.821429∗8 = 252.2857, or about $252,286
Month 9: Cash = 197.7143 + 6.821429∗9 = 259.1072, or about $259,107
Ex52/863/895: Hudson Marine has been an authorized dealer for C&D marine radios for
the past seven years. The following table reports the number of radios sold each year.
Year 1 2 3 4 5 6 7
Number Sold 35 50 75 90 105 110 130
a. Construct a time series plot. Does a linear trend appear to be present?

The time series plot shows a linear trend.


b. Using Minitab or Excel, develop a linear trend equation for this time series.

Number Sold = 22.85714 + 15.53571∗Year

c. Use the linear trend equation developed in part (b) to develop a forecast for annual
sales in year 8.
Number Sold = 22.85714 + 15.53571∗8 = 147.14282
Forecast for annual sales in year 8 is 147 units.

Ex51/863/895: Refer to the Costello Music Company time series in exercise 49.
a. Deseasonalize the data and use the deseasonalized time series to identify the trend.
b. Use the results of part (a) to develop a quarterly forecast for next year based on
trend.
c. Use the seasonal indexes developed in exercise 50 to adjust the forecasts developed
in part (b) to account for the effect of season.

Ex42/861/893: The following table reports the percentage of stocks in a portfolio for nine
quarters from 2010 to 2012.

a. Construct a time series plot. What type of pattern exists in the data?
The time series plot indicates a horizontal pattern.
b. Use exponential smoothing to forecast this time series. Consider smoothing constants
of α = .2, .3, and .4. What value of the smoothing constant provides the most accurate
forecasts?
c. What is the forecast of the percentage of stocks in a typical portfolio for the second
quarter of 2009?
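
Part (b) amounts to computing the mean squared forecast error for each smoothing constant and choosing the smallest. The table for this exercise is not reproduced here, so the sketch below uses a hypothetical nine-quarter series purely to show the procedure; it assumes the convention that the first forecast equals the first actual value.

```python
def smoothing_mse(actuals: list[float], alpha: float) -> tuple[float, float]:
    """Return (MSE of the one-step forecasts, forecast for the next period).

    Convention assumed here: F2 = Y1, then F(t+1) = alpha*Y(t) + (1-alpha)*F(t).
    """
    if len(actuals) < 2:
        raise ValueError("need at least two observations")
    forecast = float(actuals[0])             # forecast for period 2
    squared_errors = []
    for y in actuals[1:]:
        squared_errors.append((y - forecast) ** 2)
        forecast = alpha * y + (1 - alpha) * forecast
    return sum(squared_errors) / len(squared_errors), forecast

# Hypothetical percentages for nine quarters (NOT the exercise data).
hypothetical_pct = [30.0, 32.0, 31.0, 29.5, 30.5, 31.5, 30.0, 29.0, 30.5]

results = {alpha: smoothing_mse(hypothetical_pct, alpha) for alpha in (0.2, 0.3, 0.4)}
best_alpha = min(results, key=lambda a: results[a][0])
for alpha, (mse, next_forecast) in results.items():
    print(f"alpha = {alpha}: MSE = {mse:.4f}, next-period forecast = {next_forecast:.2f}")
print(f"smallest MSE at alpha = {best_alpha}")
```

With the actual table in place of the hypothetical list, the next-period forecast returned for the best smoothing constant is the answer to part (c).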

Ex38/718
