Statistics Quiz
1 Based on the ANOVA table below, what would be the value of the F statistic?

             Df   Sum of squares   Mean of the squares
Regression    1   1258.55 (SSR)    1258.55 (MSR)
Residual      6   773.451 (SSE)    128.909 (MSE)
Total         7   2032 (SST)

0.61
9.76
The F value is the ratio of the mean regression sum of squares (MSR) to the mean error sum of squares (MSE). So the F statistic is 1258.548639 / 128.9085602 = 9.76 (rounded to 2 decimal places).
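As a quick check of this arithmetic, here is a minimal Python sketch using the table values (variable names are illustrative):

```python
# Minimal sketch: recompute the F statistic from the ANOVA table above.
ssr, df_reg = 1258.548639, 1   # regression sum of squares and its degrees of freedom
sse, df_res = 773.451, 6       # residual (error) sum of squares and its degrees of freedom

msr = ssr / df_reg             # mean regression sum of squares
mse = sse / df_res             # mean error sum of squares
f_stat = msr / mse             # F = MSR / MSE

print(round(f_stat, 2))        # 9.76
```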
Correlation coefficient formulas are used to find how strong a relationship is between data. The formulas return a value between -1 and 1, where 1 indicates a strong positive relationship, -1 indicates a strong negative relationship, and 0 indicates no relationship at all.
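For illustration, a minimal sketch (made-up data) using NumPy's corrcoef:

```python
import numpy as np

# Hypothetical paired data for illustration only.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])

r = np.corrcoef(x, y)[0, 1]  # Pearson correlation coefficient, always in [-1, 1]
print(round(r, 3))           # close to 1: strong positive linear relationship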
3 In linear regression, the error terms should be:
Heteroscedastic
Homoscedastic
The error terms should be homoscedastic, i.e. the error terms should have constant variance across different values of the independent variables.
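One common diagnostic for this is the Breusch-Pagan test; below is a sketch assuming statsmodels is available, with synthetic data for illustration:

```python
import numpy as np
import statsmodels.api as sm
from statsmodels.stats.diagnostic import het_breuschpagan

rng = np.random.default_rng(0)
x = rng.uniform(0, 10, 200)
y = 2 + 3 * x + rng.normal(0, 1, 200)   # homoscedastic errors by construction

X = sm.add_constant(x)
resid = sm.OLS(y, X).fit().resid

# Breusch-Pagan: the null hypothesis is constant variance (homoscedasticity).
lm_stat, lm_pvalue, f_stat, f_pvalue = het_breuschpagan(resid, X)
print(lm_pvalue)  # a large p-value means no evidence against homoscedasticity
```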
4
             Df   Sum of squares   Mean of the squares
Regression    1   1258.55 (SSR)    1258.55 (MSR)
Residual      6   773.451 (SSE)    128.909 (MSE)
Total         7   2032 (SST)
No. of Independent Variables (I.V.) = 1. Based on the above table, what would be the value of R Squared?
0.74
0.62
0.5
0.82
The R Squared is the ratio of the Sum of Squares Regression (SSR) to the Total Sum of Squares (SST) = 1258.548639 / 2032 = 0.62 (rounded to 2 decimal places).
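The same ratio in a quick Python sketch:

```python
# Minimal sketch: R squared from the ANOVA table above.
ssr = 1258.548639   # sum of squares regression
sst = 2032.0        # total sum of squares

r_squared = ssr / sst
print(round(r_squared, 2))  # 0.62
```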
5 A Type II error is a:
False Positive
False Negative
A Type II error is the failure to reject a false null hypothesis, i.e. a false negative.
6 The VIF is a relative measure of the increase in the variance because of collinearity.
VIF is the variance inflation factor, which quantifies the severity of multicollinearity in an ordinary least squares regression analysis.
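A sketch of computing VIF, assuming statsmodels is available; the two nearly collinear features here are synthetic:

```python
import numpy as np
import statsmodels.api as sm
from statsmodels.stats.outliers_influence import variance_inflation_factor

rng = np.random.default_rng(1)
x1 = rng.normal(size=100)
x2 = x1 + rng.normal(scale=0.1, size=100)  # nearly collinear with x1 on purpose
X = sm.add_constant(np.column_stack([x1, x2]))

# VIF for each predictor (skipping the constant at index 0);
# values well above 5-10 usually signal problematic multicollinearity.
for i in (1, 2):
    print(variance_inflation_factor(X, i))
```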
7 Is Linear Regression sensitive to outliers?
True
False
Can't Say
None of these
The slope of the regression line will change due to outliers in most of the cases. So Linear Regression is sensitive to outliers.
8 As a Quick Rule of Thumb: AIC or p-value; which one to choose for model selection?
AIC
The Akaike information criterion (AIC) is an estimator of the relative quality of statistical models for a given set of data; among a collection of candidate models, the one with the lowest AIC is preferred.
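As an illustration (synthetic data, statsmodels assumed available), a fitted OLS model exposes its AIC, and the candidate with the lower value wins:

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(2)
x = rng.uniform(0, 10, 100)
y = 1 + 2 * x + rng.normal(0, 1, 100)

m1 = sm.OLS(y, sm.add_constant(x)).fit()                           # y ~ x
m2 = sm.OLS(y, sm.add_constant(np.column_stack([x, x**2]))).fit()  # y ~ x + x^2

# Model selection by AIC: prefer the model with the lower value.
print(m1.aic, m2.aic)
```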
9 In the regression equation Y = a + bX, what do a and b represent respectively?
(X-Intercept, Slope)
(Slope, X-Intercept)
(Y-Intercept, Slope)
(Slope, Y-Intercept)
10 In a simple linear regression model (one independent variable), if we change the input variable by 1 unit, how much will the output variable change?
by 1
no change
by intercept
by its slope
For linear regression we know that Y = a + bX + error. If we neglect the error term, the equation is Y = a + bX. Therefore, if X increases by 1, then Y = a + b(X + 1) = a + bX + b. So Y increases by its slope.
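The same point numerically, with an arbitrary intercept and slope:

```python
# Minimal sketch: bumping x by 1 changes the prediction by exactly the slope b.
a, b = 5.0, 2.5          # hypothetical intercept and slope

def predict(x):
    return a + b * x     # Y = a + bX, ignoring the error term

x = 10.0
print(predict(x + 1) - predict(x))  # 2.5, i.e. the slope b
```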
Logistic Regression Quiz
1 Calculate the odds ratio (rounded off to 2 decimal points) for the given probability, P = 0.45
0.69
0.73
0.77
0.81
Explanation: p = 0.45, q = 1 - p = 0.55; odds ratio = p/q = 0.45/0.55 = 0.818, so the closest listed option is 0.81.
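In code:

```python
# Minimal sketch: odds ratio from a probability.
p = 0.45
odds = p / (1 - p)       # p / q
print(round(odds, 2))    # 0.82; the quiz's 0.81 comes from truncating 0.818
```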
2 Based on the confusion matrix below, what is the specificity of the classifier?

                     Actual Positive   Actual Negative   Total
Predicted Positive   2300 (TP)         180 (FP)          2480
Predicted Negative   200 (FN)          820 (TN)          1020
Total                2500              1000

0.78
0.82
0.92
0.96
Explanation: Specificity = TN / (TN + FP) = 820 / (820 + 180) = 0.82.
3 Based on the same confusion matrix, what is the accuracy of the classifier?
0.59
0.69
0.79
0.89
Explanation: Accuracy = (TP + TN) / Total = (2300 + 820) / 3500 = 0.89.
4 Based on the same confusion matrix, what is the sensitivity (recall) of the classifier?
0.78
0.82
0.92
0.96
Explanation: Sensitivity = TP / (TP + FN) = 2300 / (2300 + 200) = 0.92.
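A quick sketch recomputing all three answers from the matrix:

```python
# Confusion matrix values from the table above.
tp, fp = 2300, 180   # predicted positive row
fn, tn = 200, 820    # predicted negative row

sensitivity = tp / (tp + fn)                 # recall: 2300 / 2500
specificity = tn / (tn + fp)                 # 820 / 1000
accuracy = (tp + tn) / (tp + fp + fn + tn)   # 3120 / 3500

print(round(sensitivity, 2), round(specificity, 2), round(accuracy, 2))  # 0.92 0.82 0.89
```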
There should be a very low correlation between the dependent and the independent variables.
Which of the following is true about the distribution of error values?
Linear Regression error values have to be normally distributed, but in the case of Logistic Regression this is not required
Logistic Regression error values have to be normally distributed, but in the case of Linear Regression this is not required
Both Linear Regression and Logistic Regression error values have to be normally distributed
Neither Linear Regression nor Logistic Regression error values have to be normally distributed
8 Logistic regression assumes a linear relationship between the log odds ratio of an event and the independent predictors.
9 Which of the following methods do we use to best fit the data in Logistic Regression?
Maximum Likelihood
Jaccard distance
Both A and B
Explanation: Logistic Regression uses Maximum Likelihood to best fit the data.
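scikit-learn's LogisticRegression, for example, fits its coefficients by maximizing a (regularized) likelihood; a sketch on toy data:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(3)
X = rng.normal(size=(200, 2))
y = (X[:, 0] + X[:, 1] > 0).astype(int)   # toy binary labels

# Coefficients are estimated by maximizing the (penalized) log-likelihood.
clf = LogisticRegression().fit(X, y)
print(clf.coef_, clf.intercept_)
```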
10 There can be more than 2 categories in the dependent variable for prediction (multinomial logistic regression).
KNN and Naive Bayes Quiz
1 KNN can be used for both Classification and Regression problems.
True: KNN can be used for both classification and regression problems.
2 Given the following 2 statements, which option is/are true in the case of k-NN?
1. In case of a very large value of k, we may include points from other classes in the neighborhood.
2. In case of too small a value of k, the algorithm is very sensitive to noise.
Only 1
Only 2
Both 1 & 2
Both statements are true: a very large k pulls in points from other classes, while a very small k makes the prediction sensitive to noise, as the sketch below shows.
3 What is K in KNN?
1 - Number of nearest neighbors to the new data point considered to decide the class of the new data point
2 - Number of nearest neighbors to the new data point, the majority class out of which is assigned to the new data point
Only 1
Only 2
Both 1 & 2
KNN algorithm has a high prediction cost for large data sets
KNN is lazy learning algorithm and therefore requires no training prior to making real time predictions
The KNN algorithm does work very well with categorical features.
This algorithm segregates unlabeled data points into well-defined groups
Explanation: The KNN algorithm doesn't work well with high-dimensional data because, with a large number of dimensions, it becomes difficult for the algorithm to calculate the distance in each dimension.
Which of the following is a type of Naive Bayes classifier?
Gaussian
Bernoulli
Multinomial
Explanation: Naive Bayes can be used for binary and multiclass classification. It provides different types of Naive Bayes algorithms, such as GaussianNB, BernoulliNB, and MultinomialNB.
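All three variants share the same fit/predict interface in scikit-learn; a sketch on synthetic data:

```python
import numpy as np
from sklearn.naive_bayes import GaussianNB, BernoulliNB, MultinomialNB

rng = np.random.default_rng(4)
y = rng.integers(0, 2, 100)

X_cont = rng.normal(size=(100, 3))      # continuous features -> GaussianNB
X_bin = rng.integers(0, 2, (100, 3))    # binary features     -> BernoulliNB
X_cnt = rng.integers(0, 10, (100, 3))   # count features      -> MultinomialNB

for model, X in [(GaussianNB(), X_cont), (BernoulliNB(), X_bin), (MultinomialNB(), X_cnt)]:
    print(type(model).__name__, model.fit(X, y).score(X, y))
```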
8 Naive Bayes assumes all the features to be related, so it can learn the relationship between features.
Explanation: False. It assumes all the features to be unrelated, so it cannot learn the relationship between features.
10 Naive Bayes is called Naive due to the assumption that the features in the dataset are mutually independent.
Explanation: True. It is called Naïve due to the assumption that the features in the dataset are mutually independent.