Worksheet 1

The document describes a dataset from 420 California school districts that includes test scores, student-teacher ratios, and other variables. It provides a series of questions and answers about analyzing the data: 1) There is a weak negative relationship between student-teacher ratio and average test scores. Scores decrease by around 0.19 points for each additional student. 2) The regression model finds student-teacher ratio, expenditures per student, computers per student, percent on reduced lunch, district income, and percent English learners are statistically significant predictors of test scores. 3) The model is valid as it contains statistically significant coefficients, but may not reliably predict scores outside the original student-teacher ratio range in the data

Uploaded by

Rupok Chowdhury

ECO 515 (Summer 2020)

Worksheet -1

Name-Sarah Nuzhat Khan

ID-20175008

THE CALIFORNIA TEST SCORE DATA SET

The data used here are from all 420 K-6 and K-8 districts in California with data available for 1998 and
1999. Test scores are the average of the reading and math scores on the Stanford 9 standardized test
administered to 5th grade students. The student-teacher ratio used here is the number of students in
the district, divided by the number of full-time equivalent teachers. All of these data were obtained
from the California Department of Education (www.cde.ca.gov).

Series in Data Set:
• DIST_CODE: district code
• READ_SCR: average reading score
• MATH_SCR: average math score
• COUNTY: county
• DISTRICT: district
• GR_SPAN: grade span of district
• ENRL_TOT: total enrollment
• TEACHERS: number of teachers
• COMPUTER: number of computers
• TESTSCR: average test score (= (READ_SCR + MATH_SCR) / 2)
• COMP_STU: computers per student (= COMPUTER / ENRL_TOT)
• EXPN_STU: expenditures per student (in dollars)
• STR: student-teacher ratio (= ENRL_TOT / TEACHERS)
• EL_PCT: percent of English learners
• MEAL_PCT: percent qualifying for reduced-price lunch
• CALW_PCT: percent qualifying for CalWORKs
• AVGINC: district average income (in $1000's)
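
The three derived series in the list above can be computed directly from the raw series. The following is a minimal Python sketch of those definitions; the function name and the district values are hypothetical and chosen only for illustration, they are not taken from the data set.

```python
def derived_series(read_scr, math_scr, computer, enrl_tot, teachers):
    """Return (TESTSCR, COMP_STU, STR) as defined in the data description."""
    testscr = (read_scr + math_scr) / 2   # TESTSCR: average of reading and math
    comp_stu = computer / enrl_tot        # COMP_STU: computers per student
    str_ratio = enrl_tot / teachers       # STR: students per teacher
    return testscr, comp_stu, str_ratio


# Hypothetical district, invented for illustration only.
testscr, comp_stu, str_ratio = derived_series(
    read_scr=660.0, math_scr=650.0, computer=100, enrl_tot=1000, teachers=50
)
print(testscr, comp_stu, str_ratio)
```

Note that STR divides enrollment by teachers, so a larger value means more students per teacher.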

a) Download the data from the student companion website of Stock and Watson. Calculate summary
statistics of STR and TESTSCR. What can you tell from these statistics?

Ans:-

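The summary statistics of STR and TESTSCR show that:

• The student-teacher ratio ranges from a minimum of 14 to a maximum of 25.8, with a standard deviation of 1.89.

• The average test score ranges from a minimum of 605.55 to a maximum of 706.75, with a standard deviation of 19.05.

Each minimum and maximum is computed separately for its own variable, so these statistics do not say that the district with the lowest student-teacher ratio also has the lowest test score. They describe only the range and spread of each variable; the relationship between the two has to be examined with a scatter plot or a regression.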
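
As a rough illustration of what the summary statistics measure, the sketch below computes the count, mean, sample standard deviation, minimum and maximum of one variable using Python's standard library. The five values are hypothetical and are not the California data; the helper name `summarize` is also an assumption made for this example.

```python
import statistics


def summarize(values):
    """Return basic summary statistics for one variable."""
    return {
        "n": len(values),
        "mean": statistics.mean(values),
        "sd": statistics.stdev(values),  # sample SD (n - 1 in the denominator)
        "min": min(values),
        "max": max(values),
    }


# Hypothetical student-teacher ratios, invented for illustration only.
str_sample = [14.0, 18.0, 20.0, 22.0, 26.0]
stats = summarize(str_sample)
print(stats)
```

Running the same helper on each variable separately mirrors the point that each statistic is computed one variable at a time.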

b) Draw a scatter plot of average test scores (testscr) and student-teacher ratio (str). What does the
scatter plot indicate regarding the relationship between test scores and class size?

Ans:-

The data from the 420 California school districts show a weak negative relationship between the
student-teacher ratio and test scores: districts with more students per teacher tend to have slightly lower average test scores.

c) Run a regression of testscr on str, expn_stu, comp_stu, meal_pct, calw_pct, avginc, el_pct and copy
the STATA output below.

Ans:-
d) Write down the estimated regression line from the STATA output.

Ans:- Predicted testscr = 659.59 – 0.189 str + 0.00152 expn_stu + 11.89 comp_stu – 0.375 meal_pct
– 0.077 calw_pct + 0.62 avginc – 0.198 el_pct
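
To show how the estimated regression line turns a district's characteristics into a predicted score, here is a small Python sketch. The coefficients are the rounded values from the answer above; the district values and the function name `predict_testscr` are hypothetical and serve only as an illustration.

```python
# Rounded coefficients from the estimated regression line above.
COEF = {
    "_cons": 659.59,
    "str": -0.189,
    "expn_stu": 0.00152,
    "comp_stu": 11.89,
    "meal_pct": -0.375,
    "calw_pct": -0.077,
    "avginc": 0.62,
    "el_pct": -0.198,
}


def predict_testscr(district):
    """Predicted average test score: intercept plus coefficient times value."""
    return COEF["_cons"] + sum(COEF[name] * value for name, value in district.items())


# Hypothetical district, invented for illustration only.
district = {
    "str": 20.0, "expn_stu": 5000.0, "comp_stu": 0.1, "meal_pct": 40.0,
    "calw_pct": 10.0, "avginc": 15.0, "el_pct": 15.0,
}
prediction = predict_testscr(district)
print(round(prediction, 2))
```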

e) Explain what the coefficient of str means.

Ans:- The coefficient of str is –0.18991, which means that when the student-teacher ratio rises by one
(one additional student per teacher), the average test score is expected to fall by 0.18991 points, holding the other variables constant.

f) Report the standard error of regression (SER). What are the units of measurement for the SER
(dollars? years? scores? or is it unit free)?

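Ans:- MSE = SS (residual) / df (residual) = 29011.1128 / 412 = 70.415

SER = Root MSE = √MSE = √70.415 = 8.39. The SER is measured in the units of the dependent variable, so here it is in test score points; it is not unit free.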
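
The calculation can be checked with a few lines of Python. This sketch uses only the residual sum of squares and degrees of freedom quoted in the answer; the variable names are chosen for this example.

```python
import math

ss_residual = 29011.1128  # residual sum of squares from the answer above
df_residual = 412         # residual degrees of freedom: 420 - 7 - 1

mse = ss_residual / df_residual  # mean squared error
ser = math.sqrt(mse)             # standard error of regression (Root MSE)

print(round(mse, 3), round(ser, 2))
```

Because the SER is the square root of a variance of test scores, it carries the same units as the test score itself.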


g) Report the regression adjusted coefficient of determination. What are the units of measurement
for this coefficient (dollars? years? scores? or is it unit free)? What does it mean? Why should we use
this instead of simple coefficient of determination?

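Ans:- Adjusted R-squared = 1 – {(SS(residual)/df(residual)) / (SS(total)/df(total))} = 1 – (70.415/363.03) = 0.806.

Adjusted R-squared is unit free, because it is built from a ratio of two variances measured in the same units. It means that about 80.6% of the variance of the average test score is explained by the regressors, after adjusting for the number of regressors used. We use it instead of the simple R-squared because R-squared never decreases when a variable is added to the model, even a useless one, so it overstates the fit of larger models. Adjusted R-squared penalizes each additional regressor through the degrees of freedom and rises only if the new variable reduces the residual variance enough.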
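
The sketch below reproduces the adjusted R-squared from the figures quoted in the answer and compares it with the simple R-squared. The total sum of squares is not quoted directly in the worksheet, so it is reconstructed here as 363.03 × 419, which is an assumption based on the total mean square and n – 1 = 419 degrees of freedom.

```python
n = 420                   # number of districts
k = 7                     # number of regressors
ss_residual = 29011.1128  # residual sum of squares from part f
ms_total = 363.03         # SS(total) / df(total) from the answer above

ms_residual = ss_residual / (n - k - 1)
adj_r2 = 1 - ms_residual / ms_total

# Assumption: total sum of squares rebuilt from the rounded total mean square.
ss_total = ms_total * (n - 1)
r2 = 1 - ss_residual / ss_total

print(round(adj_r2, 4), round(r2, 4))
```

The simple R-squared comes out slightly higher than the adjusted value, which reflects the degrees-of-freedom penalty.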

h) Last year a classroom had 19 students and this year it has 23 students. What is the regression’s
prediction for the change in the classroom average test score?

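Ans:- The coefficient of str is –0.189. The student-teacher ratio rises by 23 – 19 = 4, so the predicted change in the
average test score is (–0.189 × 4) = –0.756, that is, a decrease of about 0.76 points, holding other variables constant.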
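
The same arithmetic as a short Python sketch: the predicted change depends only on the str coefficient and the change in str, because the other variables are held constant. The function name is an assumption made for this example.

```python
BETA_STR = -0.189  # rounded coefficient of str from the estimated regression line


def predicted_change(str_before, str_after, beta=BETA_STR):
    """Predicted change in average test score when only str changes."""
    return beta * (str_after - str_before)


change = predicted_change(19, 23)
print(round(change, 3))
```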

i) The student-teacher ratio has a minimum value of 14 and a maximum value of 25.8 in the 420
classrooms. Will the regression give reliable predictions for a class with 35 students? Why or why
not?

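Ans:- No. A class with 35 students lies far outside the range of student-teacher ratios in the sample (14 to 25.8), so the
prediction would be an extrapolation. The data contain no information on how test scores behave at such a class size, and the estimated linear relationship may not hold there.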
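
One simple safeguard is to check whether a value falls inside the sample range before trusting a prediction. The sketch below uses the minimum and maximum stated in the question; the function name is hypothetical.

```python
STR_MIN, STR_MAX = 14.0, 25.8  # sample range of str stated in the question


def is_extrapolation(str_value, low=STR_MIN, high=STR_MAX):
    """True if str_value lies outside the range observed in the sample."""
    return not (low <= str_value <= high)


print(is_extrapolation(35))  # class size asked about in the question
print(is_extrapolation(20))  # a value inside the sample range
```

A value inside the range does not guarantee a good prediction, but a value outside it means the regression line is being used where no data were observed.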

j) Based on your results, can you argue that a smaller class size will increase student test scores on
average?

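Ans:- Only weakly. The estimated coefficient of str is negative (–0.189), so in this sample a smaller student-teacher
ratio is associated with slightly higher average test scores, holding other variables constant. However, str is not among the variables found to be statistically significant at the 5% level in part k, so we cannot reject the hypothesis that its effect is zero. In addition, the data contain no information on how districts with extremely small classes perform, so they are not a reliable basis for predicting the effect of a radical move to a very low student-teacher ratio. These results alone therefore do not support a confident claim that smaller class sizes raise test scores.
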
k) Which explanatory variables are statistically significant? Use the p-value approach and explain their
signs.

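Ans:- Variables whose p-value is less than 5% are statistically significant.

In this model they are:
• Percent of English learners (-ve): a one percentage point increase is associated with a test score about 0.198 points lower.
• Percent qualifying for reduced-price lunch (-ve): a one percentage point increase is associated with a test score about 0.375 points lower.
• District average income (+ve): an increase of $1000 in average income is associated with a test score about 0.62 points higher.

Each effect holds the other variables constant. The signs are as expected: districts with more English learners and more low-income students tend to score lower, and richer districts tend to score higher.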
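
The p-value approach can be sketched as follows. This example converts a t-statistic into a two-sided p-value using a standard normal approximation, which is an assumption made here (STATA reports p-values from the t distribution, though with 412 degrees of freedom the two are close). The t-statistics passed in are hypothetical and are not taken from the STATA output.

```python
import math


def two_sided_p(t_stat):
    """Two-sided p-value for a t-statistic under a standard normal approximation."""
    return math.erfc(abs(t_stat) / math.sqrt(2))


def is_significant(t_stat, alpha=0.05):
    """True if the coefficient is statistically significant at level alpha."""
    return two_sided_p(t_stat) < alpha


# Hypothetical t-statistics, invented for illustration only.
for t in (1.0, 1.96, 2.5):
    print(t, round(two_sided_p(t), 4), is_significant(t))
```

The sign of a significant coefficient is then read directly from the estimated regression line.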

l) Is the model valid? Conduct the test of validity and comment.

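Ans:- The overall validity of the model is tested with the F-test, whose null hypothesis is that all seven slope coefficients are jointly zero. The model is valid if the p-value of the F-statistic in the STATA output (Prob > F) is below 5%. Given that several coefficients are individually significant and the adjusted R-squared is 0.806, the null hypothesis is rejected and the model is valid.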
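
The F-statistic for this test can be approximated from figures already quoted in the worksheet. In the sketch below the total sum of squares is reconstructed as 363.03 × 419, which is an assumption based on the rounded total mean square, so the result is approximate and may differ slightly from the value in the STATA output.

```python
n = 420                   # number of districts
k = 7                     # number of slope coefficients tested
ss_residual = 29011.1128  # residual sum of squares from part f
ms_total = 363.03         # SS(total) / df(total) from part g

# Assumption: total sum of squares rebuilt from the rounded total mean square.
ss_total = ms_total * (n - 1)
ss_model = ss_total - ss_residual

ms_model = ss_model / k
ms_residual = ss_residual / (n - k - 1)
f_stat = ms_model / ms_residual  # F-statistic on (k, n - k - 1) degrees of freedom

print(round(f_stat, 1))
```

An F-statistic this large on (7, 412) degrees of freedom is far above any conventional critical value, which agrees with rejecting the null hypothesis.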
