Worksheet 1
ID-20175008
The data used here are from all 420 K-6 and K-8 districts in California with data available for 1998 and
1999. Test scores are the average of the reading and math scores on the Stanford 9 standardized test
administered to 5th grade students. The student-teacher ratio used here is the number of students in the
district divided by the number of full-time equivalent teachers. All of these data were obtained
from the California Department of Education (www.cde.ca.gov).
Series in Data Set:
• DIST_CODE: DISTRICT CODE
• READ_SCR: AVG READING SCORE
• MATH_SCR: AVG MATH SCORE
• COUNTY: COUNTY
• DISTRICT: DISTRICT
• GR_SPAN: GRADE SPAN OF DISTRICT
• ENRL_TOT: TOTAL ENROLLMENT
• TEACHERS: NUMBER OF TEACHERS
• COMPUTER: NUMBER OF COMPUTERS
• TESTSCR: AVG TEST SCORE (= (READ_SCR + MATH_SCR)/2)
• COMP_STU: COMPUTERS PER STUDENT (= COMPUTER/ENRL_TOT)
• EXPN_STU: EXPENDITURES PER STUDENT (IN $)
• STR: STUDENT-TEACHER RATIO (= ENRL_TOT/TEACHERS)
• EL_PCT: PERCENT OF ENGLISH LEARNERS
• MEAL_PCT: PERCENT QUALIFYING FOR REDUCED-PRICE LUNCH
• CALW_PCT: PERCENT QUALIFYING FOR CALWORKS
• AVGINC: DISTRICT AVERAGE INCOME (IN $1000'S)
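The derived series in the list above follow directly from the raw columns. A minimal Python sketch, assuming one district's raw values are passed in as plain numbers; the function name and the example figures are illustrative only and are not taken from the data set:

```python
def derived_series(read_scr, math_scr, enrl_tot, teachers, computer):
    """Compute TESTSCR, STR and COMP_STU from the raw series as defined above."""
    return {
        "TESTSCR": (read_scr + math_scr) / 2,  # average of reading and math scores
        "STR": enrl_tot / teachers,            # students per teacher
        "COMP_STU": computer / enrl_tot,       # computers per student
    }


# Hypothetical district, for illustration only (not a row of the data set)
example = derived_series(read_scr=650.0, math_scr=660.0,
                         enrl_tot=2000, teachers=100, computer=300)
print(example)
```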
a) Download the data from the student companion website of Stock and Watson. Calculate summary
statistics of STR and TESTSCR. What can you tell from these statistics?
Ans:-
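The summary statistics of STR and TESTSCR show that:
• The student-teacher ratio ranges from a minimum of 14 to a maximum of 25.8.
• The average test score ranges from a minimum of 605.55 to a maximum of 706.75.
• Each minimum and maximum describes one variable on its own, so the district with the lowest
student-teacher ratio is not necessarily the district with the lowest test score.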
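Summary statistics are computed one variable at a time. The sketch below returns the same quantities that Stata's summarize command reports (observations, mean, standard deviation, minimum, maximum), using only the Python standard library. The helper name and the three placeholder values are assumptions for illustration; the real STR and TESTSCR columns would be passed in instead.

```python
import statistics


def summarize(values):
    """Return count, mean, sample standard deviation, minimum and maximum."""
    return {
        "n": len(values),
        "mean": statistics.mean(values),
        "sd": statistics.stdev(values),  # sample standard deviation (n - 1 divisor)
        "min": min(values),
        "max": max(values),
    }


# Placeholder values only; replace with the STR or TESTSCR column.
str_placeholder = [18.0, 20.0, 22.0]
print(summarize(str_placeholder))
```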
b) Draw a scatter plot of average test scores (testscr) and student-teacher ratio (str). What does the
scatter plot indicate regarding relationship between test scores and class size?
Ans:-
The scatter plot of the data from the 420 California school districts shows a weak negative relationship
between the student-teacher ratio and test scores.
c) Run a regression of testscr on str, expn_stu, comp_stu, meal_pct, calw_pct, avginc, el_pct and copy
the STATA output below.
Ans:-
d) Write down the estimated regression line from the STATA output.
Ans:- testscr = 659.59 - 0.189 str + 0.00152 expn_stu + 11.89 comp_stu - 0.375 meal_pct
- 0.077 calw_pct + 0.62 avginc - 0.198 el_pct
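The estimated line can be evaluated for any district once values for all seven regressors are supplied. A minimal Python sketch using the coefficients written above; the function name and the example district are hypothetical and serve only to show the mechanics:

```python
# Coefficients copied from the estimated regression line above
COEF = {
    "const": 659.59,
    "str": -0.189,
    "expn_stu": 0.00152,
    "comp_stu": 11.89,
    "meal_pct": -0.375,
    "calw_pct": -0.077,
    "avginc": 0.62,
    "el_pct": -0.198,
}


def predict_testscr(x):
    """Fitted test score for one district; x maps every regressor name to its value."""
    return COEF["const"] + sum(
        b * x[name] for name, b in COEF.items() if name != "const"
    )


# Hypothetical district, for illustration only (not a row of the data set)
district = {"str": 20.0, "expn_stu": 5000.0, "comp_stu": 0.1, "meal_pct": 40.0,
            "calw_pct": 10.0, "avginc": 15.0, "el_pct": 10.0}
print(round(predict_testscr(district), 2))
```

Raising str by one while holding the other entries fixed changes the fitted value by exactly the coefficient on str.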
Ans:- The coefficient on str is -0.18991, which means that a one-unit increase in the student-teacher
ratio (one additional student per teacher) is associated with a decrease of 0.18991 points in the average
test score, holding the other variables constant.
f) Report the standard error of regression (SER). What are the units of measurement for the SER
(dollars? years? scores? or is it unit free)?
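For reference, the SER is the square root of the sum of squared residuals divided by the degrees of freedom, n - k - 1, so it is measured in the units of the dependent variable (here, test score points). Stata reports it as "Root MSE". A minimal Python sketch of the definition, assuming the residuals are available as a list; the function name is illustrative and no SER value for this regression is implied:

```python
import math


def standard_error_of_regression(residuals, k):
    """SER = sqrt(SSR / (n - k - 1)), where k counts regressors excluding the constant."""
    n = len(residuals)
    ssr = sum(e ** 2 for e in residuals)  # sum of squared residuals
    return math.sqrt(ssr / (n - k - 1))
```

For the regression in part c), k would be 7 and n would be 420.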
h) Last year a classroom had 19 students and this year it has 23 students. What is the regression’s
prediction for the change in the classroom average test score?
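Ans:- The coefficient on str is -0.189. Going from 19 to 23 students is an increase of 4, so the predicted
change in the average test score is (-0.189 * 4) = -0.756, that is, a decrease of about 0.756 points, holding
the other variables constant.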
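The arithmetic can be checked with a short Python sketch. It assumes, as the question does, that the class size is treated as the student-teacher ratio; the function name is illustrative:

```python
B_STR = -0.189  # coefficient on str from the estimated regression line


def predicted_change(str_before, str_after, b_str=B_STR):
    """Predicted change in average test score when only str changes."""
    return b_str * (str_after - str_before)


change = predicted_change(19, 23)
print(round(change, 3))  # -0.756
```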
i) The student-teacher ratio has a minimum value of 14 and a maximum value of 25.8 in the 420
classrooms. Will the regression give reliable predictions for a class with 35 students? Why or why
not?
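Ans:- The regression will not give reliable predictions for a class with 35 students. A value of 35 lies far
outside the range of the student-teacher ratio observed in the sample (14 to 25.8), so the prediction
would be an extrapolation: the data contain no information on whether the estimated linear relationship
still holds at that class size.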
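One simple way to flag this kind of prediction is to compare the value against the range of the regressor observed in the sample. A minimal Python sketch using the minimum and maximum quoted in the question; the function name is illustrative:

```python
STR_MIN, STR_MAX = 14.0, 25.8  # sample range of str quoted in the question


def within_sample_range(value, low=STR_MIN, high=STR_MAX):
    """True if the value lies inside the observed range, False if predicting there extrapolates."""
    return low <= value <= high


print(within_sample_range(35))  # False
print(within_sample_range(20))  # True
```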
j) Based on your results, can you argue that a smaller class size will increase student test scores on
average?
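Ans:- Not with confidence. The estimated coefficient on str is negative (about -0.19), so smaller classes
are associated with slightly higher test scores, but the effect is small and str is not among the
statistically significant variables in this regression. The data are also observational, so the estimate
may reflect omitted factors rather than a causal effect of class size. Finally, the sample contains no
information on how districts with extremely small classes perform, so these data alone are not a reliable
basis for predicting the effect of a radical move to an extremely low student-teacher ratio.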
k) Which explanatory variables are statistically significant? Use the p-value approach and explain their
signs.
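Ans:- In this model they are:
• Percent of English learners (el_pct): negative sign (-0.198). Districts with a higher share of English
learners have lower average test scores, holding the other variables constant.
• Percent qualifying for reduced-price lunch (meal_pct): negative sign (-0.375). Districts with a higher
share of students qualifying for reduced-price lunch have lower average test scores, holding the other
variables constant.
• District average income (avginc): positive sign (0.62). Districts with higher average income have
higher average test scores, holding the other variables constant.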
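The p-value approach compares each coefficient's p-value with the chosen significance level (commonly 0.05). The sketch below computes a two-sided p-value from a coefficient and its standard error using the standard normal approximation, which is an assumption made here for simplicity: Stata's regress output uses the t distribution, which is nearly identical with 420 observations. The function names are illustrative, and the standard errors must be read from the STATA output, which is not reproduced above.

```python
import math


def two_sided_p_value(coef, se):
    """Two-sided p-value for H0: coefficient = 0, standard normal approximation."""
    t = coef / se
    return math.erfc(abs(t) / math.sqrt(2))  # equals 2 * (1 - Phi(|t|))


def is_significant(coef, se, alpha=0.05):
    """True if the coefficient is statistically significant at level alpha."""
    return two_sided_p_value(coef, se) < alpha
```

A coefficient is significant at the 5% level when the absolute value of coef / se exceeds roughly 1.96.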