

Practice Midterm

Questions 1 and 2

Consider the histogram and the associated box plot of the dataset called “Crime Data” below,
and answer questions 1 and 2.

1. The data was normalized (by subtracting the mean and dividing by the standard deviation of
the data). Which of the following graphs best represents the normalized data?

(A)$ (B)

(C) (D)
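
The normalization described in question 1 can be sketched in a few lines of Python. This is only an illustration: the sample values below are hypothetical and are not the Crime Data values, and the population standard deviation is an assumed choice. The sketch shows that z-scores have mean 0 and standard deviation 1 while the skewness, and hence the shape of the histogram, is unchanged.

```python
import statistics


def standardize(values):
    """Return z-scores: (x - mean) / standard deviation."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)  # population SD (an assumed choice)
    return [(x - mean) / sd for x in values]


def skewness(values):
    """Population skewness: the mean of the cubed z-scores."""
    return statistics.fmean([z ** 3 for z in standardize(values)])


# Hypothetical right-skewed sample, NOT the Crime Data values.
sample = [1, 2, 2, 3, 3, 3, 4, 5, 9, 20]
z_scores = standardize(sample)

print(round(statistics.fmean(z_scores), 6))   # approximately 0
print(round(statistics.pstdev(z_scores), 6))  # approximately 1
print(skewness(sample) > 0, abs(skewness(sample) - skewness(z_scores)) < 1e-9)
```

Because standardization only shifts and rescales the horizontal axis, the normalized histogram keeps the same shape as the original one.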

2. Based on the histogram of the original data, which of the following statements is most
likely?
A. If we use this variable as a predictor (independent) variable, it will definitely be an
influential variable
B. If we use this variable as a predictor (independent) variable, it will definitely be a
leverage point
C. The distribution is left skewed
D. The distribution is right skewed$

Questions 3 and 4

Consider the boxplot below, which gives the distribution of work experience (in months) of
employees in the company. The employees are categorized into two groups based on
performance: “Low” performers (with an annual rating of 3.15 or less out of 4.00), denoted
by “0”, and “High” performers (with an annual rating above 3.15 out of 4.00), denoted
by “1”.

3. Which of the following statements is true?
A. Employees with shorter work experience perform better$
B. Employees with longer work experience perform better
C. The performance category is independent of the work experience
D. Cannot say unless a combined box plot is drawn

4. We would like to build a prediction model for predicting the category of employees (Low
or High performers). Then, ________
A. Work experience is not a good predictor because there is a large amount of overlap
B. Work experience is a good predictor even though there is a large amount of overlap$
C. Work experience cannot be used as a predictor because there is a large difference
between the two medians
D. Work experience cannot be used because the two distributions are not similar

Questions 5 to 10

Yajvin is trying to estimate the relationship between compensation and other factors. He
collected data on 150 employees in the company from the HR department. The data cover
Compensation (Comp) (Rs. lakhs), Months of work experience (WorkEx), Age in
months (Age), Number of years of post-high school education (EDU) and Number of
Promotions (Promo) in the company. The correlation matrix across all the variables is given
below:

              Comp (lakhs)  WorkEx  Age     EDU     Promo
Comp (lakhs)  1.0000
WorkEx        0.6968        1.0000
Age           0.4852        0.1063  1.0000
EDU           0.5765        0.8296  0.0504  1.0000
Promo         0.6605        0.7710  0.1339  0.6572  1.0000

5. He has estimated a simple linear regression with EDU as the dependent variable and
Comp as the independent variable. What percentage of the variation of the dependent
variable is explained by the regression equation (i.e., what is the R²)?
A. 57.65% B. 33.24%$ C. 75.93% D. 82.96%

6. He decided to interchange the dependent and independent variables of the above
regression, i.e., he decided to estimate a simple linear regression with Comp as the
dependent variable and EDU as the independent variable. What percentage of the
variation of the dependent variable is explained by the regression equation?
A. 57.65% B. 33.24%$ C. 75.93% D. 82.96%
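
Questions 5 and 6 both rest on the fact that, in a simple linear regression, R² equals the square of the correlation between the two variables, so it does not matter which one is treated as dependent. A minimal sketch follows. The correlation 0.5765 is taken from the matrix above, while the small x and y lists are hypothetical data used only to demonstrate the symmetry.

```python
def r_squared(x, y):
    """R^2 of the least-squares line that predicts y from x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((b - (intercept + slope * a)) ** 2 for a, b in zip(x, y))
    return 1 - ss_res / syy


# Correlation between Comp and EDU, read from the correlation matrix.
r_comp_edu = 0.5765
r2_comp_edu = r_comp_edu ** 2
print(f"{r2_comp_edu:.2%}")  # prints 33.24%

# Hypothetical data (not from the exam): swapping the roles leaves R^2 unchanged.
x = [1, 2, 3, 4, 5]
y = [2, 1, 4, 3, 6]
print(abs(r_squared(x, y) - r_squared(y, x)) < 1e-9)  # prints True
```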

           Coefficients  Standard Error  t Stat  P-value  Lower 95%  Upper 95%
Intercept  55.2050       10.1448                 0.0000
WorkEx                   0.0070                  0.0000   0.0170     0.0448
Age                      0.8836                  0.0000   5.7886     9.2815
EDU        0.8232                        0.3533  0.7244   -3.7823    5.4287
Promo      14.7222       4.3345                  0.0009   6.1553

7. What is the value of the t statistic with respect to the regression coefficient of the variable
“WorkEx”?
A. 0.0070 B. 0.0309 C. 0.0000 D. 4.4143$
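
One way to arrive at the t statistic in question 7 is sketched below. It assumes that, in the WorkEx row, 0.0070 is the standard error and that 0.0170 and 0.0448 are the 95% confidence limits. Since the interval is symmetric around the estimate, the coefficient is its midpoint, and the t statistic is the coefficient divided by the standard error. The function names are illustrative only.

```python
def coefficient_from_interval(lower, upper):
    """The point estimate is the midpoint of a symmetric confidence interval."""
    return (lower + upper) / 2


def t_statistic(coefficient, standard_error):
    """t statistic for the null hypothesis that the coefficient is zero."""
    return coefficient / standard_error


# WorkEx row, under the assumed reading of the table.
workex_coef = coefficient_from_interval(0.0170, 0.0448)
workex_t = t_statistic(workex_coef, 0.0070)

print(round(workex_coef, 4))  # prints 0.0309
print(round(workex_t, 4))     # prints 4.4143
```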

8. If we reject the null hypothesis that the variable Promo has no impact on the
compensation, what is the probability of Type I error?
A. 0.09%$ B. 99.91% C. 14.72% D. cannot be calculated

9. Based on the regression equation estimated above and appropriate hypothesis tests, which
of the variables need to be dropped from the regression equation?
A. WorkEx
B. EDU$
C. Age
D. Promo
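
The decision in question 9 can be sketched as a comparison of each p-value with a significance level. The p-values are read from the regression table above. The 5% level is an assumption, chosen to match the 95% confidence intervals in the table, and the helper name is illustrative only.

```python
# P-values of the slope coefficients, read from the regression table.
P_VALUES = {"WorkEx": 0.0000, "Age": 0.0000, "EDU": 0.7244, "Promo": 0.0009}


def insignificant_predictors(p_values, alpha=0.05):
    """Return the predictors whose p-value exceeds alpha (assumed 5% level)."""
    return [name for name, p in p_values.items() if p > alpha]


to_drop = insignificant_predictors(P_VALUES)
print(to_drop)  # prints ['EDU']
```

The same conclusion follows from the confidence interval for EDU (-3.7823 to 5.4287), which contains zero.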

10. What are the degrees of freedom with respect to “residuals”?
A. 150 B. 149 C. 148 D. 145$
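
For question 10, the residual degrees of freedom in a regression with an intercept are n - k - 1, where n is the number of observations and k is the number of predictors. A minimal sketch, using the 150 employees and the four predictors (WorkEx, Age, EDU, Promo) from the problem statement:

```python
def residual_degrees_of_freedom(n_observations, n_predictors):
    """Residual df for a regression that includes an intercept: n - k - 1."""
    return n_observations - n_predictors - 1


residual_df = residual_degrees_of_freedom(n_observations=150, n_predictors=4)
print(residual_df)  # prints 145
```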
