
Possible Viva Questions about Linear Models

Prepared by: Pietro Colombo

Questions about Linear Models


1. What is a linear model in statistics?
2. What are the key assumptions of a linear regression model?

3. Explain the difference between simple linear regression and multiple linear
regression.
4. How do you interpret the slope coefficient in a linear regression model?
5. What is the role of the intercept term in a linear regression equation?

6. What is the ordinary least squares (OLS) method, and how is it used in
linear regression?
7. What is the purpose of residual analysis in linear regression?
8. How do you check for multicollinearity in multiple linear regression?

9. What are the advantages and disadvantages of using a linear model?


10. Explain the concept of homoscedasticity in the context of linear regression.
11. How do you assess the overall goodness of fit of a linear regression model?

12. What are influential points, and how do they affect linear regression analysis?
13. Describe the process of model selection in linear regression.
14. What are the assumptions of independence of errors in linear regression?

15. Can categorical variables be included in a linear regression model? If so, how?
16. What is the difference between correlation and regression analysis?
17. What is the purpose of transforming variables in linear regression?

18. How do you interpret the coefficient of determination (R-squared) in linear regression?

19. Explain the difference between explanatory and response variables in the
context of linear regression.
20. How do you handle outliers in linear regression analysis?

21. Explain the role of the p-values for model selection and hypothesis testing
in the context of regression modelling and also the potential limitations.

Typical Answers
1. A linear model is a statistical approach used to describe the relationship
between a dependent variable and one or more independent variables,
assuming a linear relationship.
2. The key assumptions of a linear regression model include linearity, independence of errors, homoscedasticity, and normality of errors.

3. Simple linear regression involves predicting a dependent variable using one independent variable, while multiple linear regression involves using two or more independent variables.
4. The slope coefficient represents the change in the dependent variable for a one-unit change in the independent variable, holding other variables constant. For example, a slope of 2 means the predicted response increases by 2 units for each one-unit increase in that predictor.
5. The intercept term represents the predicted value of the dependent variable when all independent variables are zero.
6. The ordinary least squares (OLS) method is used to estimate the parameters of a linear regression model by minimizing the sum of squared differences between observed and predicted values, as sketched below.
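
A minimal numpy sketch of the OLS computation, on hypothetical made-up data:

import numpy as np

# Hypothetical design matrix: intercept column plus one predictor
X = np.column_stack([np.ones(5), np.array([1.0, 2.0, 3.0, 4.0, 5.0])])
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])

# OLS estimate beta = (X'X)^(-1) X'y, solved as a linear system for stability
beta = np.linalg.solve(X.T @ X, X.T @ y)
print(beta)  # [intercept, slope], roughly [0.14, 1.96]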
7. Residual analysis is used to assess how well a linear regression model fits
the data by examining the differences between observed and predicted
values.
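
A quick residual-plot sketch (assuming X and y as in the OLS example above, and that statsmodels and matplotlib are available):

import statsmodels.api as sm
import matplotlib.pyplot as plt

model = sm.OLS(y, X).fit()

# Plot residuals against fitted values; a patternless scatter around zero
# supports the linearity and constant-variance (homoscedasticity) assumptions
plt.scatter(model.fittedvalues, model.resid)
plt.axhline(0, linestyle="--")
plt.xlabel("Fitted values")
plt.ylabel("Residuals")
plt.show()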

8. Multicollinearity in multiple linear regression is checked by examining pairwise correlations between independent variables or by computing variance inflation factors (VIFs), as in the sketch below.
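
A minimal VIF sketch with statsmodels, on hypothetical data built to be nearly collinear:

import numpy as np
import statsmodels.api as sm
from statsmodels.stats.outliers_influence import variance_inflation_factor

rng = np.random.default_rng(0)
x1 = rng.normal(size=100)
x2 = x1 + rng.normal(scale=0.1, size=100)  # nearly collinear with x1
x3 = rng.normal(size=100)
X = sm.add_constant(np.column_stack([x1, x2, x3]))

# VIF above roughly 5-10 is a common rule of thumb for problematic collinearity
for i in range(1, X.shape[1]):  # skip the constant column
    print(f"VIF for predictor {i}: {variance_inflation_factor(X, i):.1f}")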
9. Advantages of linear models include simplicity and ease of interpretation,
while disadvantages include sensitivity to outliers and the assumption of
linearity.

10. Homoscedasticity refers to the assumption that the variance of errors in a regression model is constant across all levels of independent variables.
11. The overall goodness of fit of a linear regression model is assessed using
measures such as R-squared and diagnostic plots.
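
A short sketch of reading fit statistics from statsmodels, on hypothetical data:

import numpy as np
import statsmodels.api as sm

x = np.arange(20.0)
y = 1.5 * x + np.random.default_rng(1).normal(size=20)
X = sm.add_constant(x)

results = sm.OLS(y, X).fit()
print(results.rsquared)   # proportion of variance explained
print(results.summary())  # full table: R-squared, F-statistic, coefficients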

12. Influential points are data points that have a significant impact on regression model parameters.
13. Model selection in linear regression involves choosing the most appropriate
set of independent variables using techniques such as stepwise regression
or information criteria.
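
A minimal sketch comparing two candidate models by AIC (lower is better), on hypothetical data where the second predictor is irrelevant by construction:

import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(2)
x1, x2 = rng.normal(size=50), rng.normal(size=50)
y = 2.0 * x1 + rng.normal(size=50)  # x2 has no true effect

m1 = sm.OLS(y, sm.add_constant(x1)).fit()
m2 = sm.OLS(y, sm.add_constant(np.column_stack([x1, x2]))).fit()
print(m1.aic, m2.aic)  # the simpler model should usually win here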
14. The assumption of independence of errors in linear regression means that
errors are not correlated with each other.
15. Categorical variables can be included in a linear regression model by converting them into dummy variables, as in the pandas sketch below.
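
A minimal dummy-coding sketch, with a hypothetical data frame:

import pandas as pd

df = pd.DataFrame({"income": [30, 45, 52, 38],
                   "region": ["north", "south", "south", "west"]})

# One dummy column per level, dropping the first to avoid perfect collinearity
dummies = pd.get_dummies(df, columns=["region"], drop_first=True)
print(dummies)  # columns: income, region_south, region_west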
16. Correlation analysis measures the strength and direction of the linear relationship between two continuous variables, while regression analysis models how a response variable depends on one or more explanatory variables so that it can be predicted.
17. Transforming variables in linear regression can help meet model assumptions such as linearity and normality.
18. R-squared represents the proportion of variance in the dependent variable
explained by independent variables.
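
In symbols, the standard definition is

R^2 = 1 - SS_res / SS_tot = 1 - [ sum_i (y_i - yhat_i)^2 ] / [ sum_i (y_i - ybar)^2 ],

where yhat_i are the fitted values and ybar is the sample mean of the response.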
19. Explanatory variables are independent variables that explain variation in the dependent variable, while the response variable is the variable of interest being predicted.
20. Outliers in linear regression analysis can be handled by removing them,
transforming variables, or using robust regression techniques.
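
A minimal sketch of robust regression with Huber's M-estimator in statsmodels, on hypothetical data with an injected outlier:

import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(3)
x = rng.normal(size=30)
y = 3.0 * x + rng.normal(size=30)
y[0] += 15.0  # inject a gross outlier
X = sm.add_constant(x)

# Huber's M-estimator downweights large residuals instead of deleting points
robust = sm.RLM(y, X, M=sm.robust.norms.HuberT()).fit()
print(robust.params)  # slope should stay close to 3 despite the outlier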
21. p-values play two main roles in regression modelling, but they also have limitations:
(a) Variable Selection: In stepwise regression, for instance, variables
are added or removed based on their p-values. Variables with p-values
below a certain threshold (e.g., 0.05) are typically included in the
model, while those with higher p-values are excluded. This process
continues iteratively until no further variables meet the inclusion or
exclusion criteria.
(b) Hypothesis Testing: Each coefficient in a linear regression model comes with an associated p-value, indicating the probability of observing an estimate at least as extreme as the one obtained if the true coefficient were zero (i.e., if there were no relationship between the predictor and the response variable). If a predictor's coefficient has a p-value below a significance level (e.g., 0.05), it suggests that the predictor is statistically significant and contributes to explaining the variation in the response variable.
(c) Limitations:
Multiplicity problem: If you perform multiple hypothesis tests (e.g., testing the significance of many coefficients), the likelihood of making a Type I error (false positive) increases. This is known as the multiplicity problem, and adjustments (such as the Bonferroni correction) may be necessary.
Overfitting: Relying solely on p-values for variable selection can lead to overfitting, where the model performs well on the training data but poorly on new data. Overfitting occurs when the model is too complex relative to the amount of available data, and it often results from including variables that are not truly related to the response variable but happen to have low p-values by chance.
Context dependence: The interpretation of p-values depends on various factors, including sample size, effect size, and the quality of the data. A small p-value does not necessarily imply a strong practical or meaningful relationship between the predictor and the response variable.
Therefore, while p-values can be a useful tool for model selection in linear regression, it's essential to consider them alongside other criteria, such as effect size, theoretical relevance, and model performance metrics like adjusted R-squared or AIC (Akaike Information Criterion), to ensure that the selected model is both statistically significant and practically meaningful.
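
A minimal sketch of a Bonferroni-adjusted significance screen with statsmodels, on hypothetical data where only the first of ten predictors truly matters:

import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(4)
X = rng.normal(size=(200, 10))
y = 1.0 * X[:, 0] + rng.normal(size=200)

results = sm.OLS(y, sm.add_constant(X)).fit()
pvals = results.pvalues[1:]  # skip the intercept

# Bonferroni: test each of the m coefficients at level alpha / m
alpha, m = 0.05, len(pvals)
significant = pvals < alpha / m
print(np.where(significant)[0])  # ideally just predictor 0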
