Experiment No.8 - Fit Simple Linear Regression Models Using Built-In Functions.
R - Multiple Regression:-
Multiple regression extends linear regression to relationships among more than two
variables. In simple linear regression we have one predictor and one response variable, whereas in
multiple regression we have more than one predictor variable and one response variable.
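As a minimal sketch of this difference, assuming the built-in mtcars dataset and an illustrative choice of wt and hp as predictors of mpg, the two kinds of model differ only in the right-hand side of the formula:

```r
# Simple linear regression: one predictor (wt) and one response (mpg).
simple_model <- lm(mpg ~ wt, data = mtcars)

# Multiple regression: more than one predictor (wt and hp), still one response.
multiple_model <- lm(mpg ~ wt + hp, data = mtcars)

# Each model has an intercept plus one coefficient per predictor.
print(coef(simple_model))
print(coef(multiple_model))
```

The object names simple_model and multiple_model are chosen here only for illustration.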
2. Check the Model Summary: The summary() function provides detailed information
on the model, including coefficients, R-squared, and p-values.
summary(model)
• Coefficients: The estimated values for the intercept and the slope.
• R-squared: A measure of how well the model explains the variance in the data.
• p-values: To test the significance of the coefficients.
• Residual standard error: A measure of the typical size of the residuals.
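The quantities listed above can also be read directly from the summary object. The sketch below assumes a model of mpg on wt fitted to the built-in mtcars data; the names model, model_summary and coef_table are illustrative.

```r
# Assumed example model: mpg explained by wt in the built-in mtcars data.
model <- lm(mpg ~ wt, data = mtcars)
model_summary <- summary(model)

# Coefficient table: estimates, standard errors, t values and p-values.
coef_table <- coef(model_summary)
print(coef_table)

# R-squared and adjusted R-squared.
r_squared <- model_summary$r.squared
adj_r_squared <- model_summary$adj.r.squared

# Residual standard error (also available as sigma(model)).
rse <- model_summary$sigma

# The p-values are the column named "Pr(>|t|)" in the coefficient table.
p_values <- coef_table[, "Pr(>|t|)"]
print(p_values)
```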
3. Plot the Model: A quick visualization of the model fit can be achieved by drawing the data
with plot() and overlaying the fitted line with abline().
plot(dataset$X, dataset$Y)
abline(model)
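A self-contained sketch of this step, assuming a hypothetical data frame named dataset with simulated columns X and Y (the original data is not shown here). It draws the scatter plot and then overlays the fitted regression line:

```r
# Hypothetical data frame standing in for `dataset`; X and Y are simulated.
set.seed(1)
dataset <- data.frame(X = 1:20)
dataset$Y <- 3 + 2 * dataset$X + rnorm(20)

# Fit the simple linear regression of Y on X.
model <- lm(Y ~ X, data = dataset)

# Scatter plot of the raw data, then the fitted regression line on top.
plot(dataset$X, dataset$Y, xlab = "X", ylab = "Y")
abline(model, col = "red")
```

Calling plot(model) on the fitted object instead produces R's standard diagnostic plots, such as residuals against fitted values.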
Using a built-in dataset such as mtcars in R is straightforward. Here is a step-by-step guide on
how to use a built-in dataset in R. (Instead of a pre-loaded dataset, we can also use our own
data, such as a CSV file or a data frame.)
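For the CSV case, a minimal sketch follows. To stay self-contained it first writes mtcars to a temporary file; the path csv_path and the name my_data are assumptions for illustration, and in practice the path would point at your own file.

```r
# Write mtcars to a temporary CSV file so the example is self-contained;
# in practice, csv_path would point at your own file.
csv_path <- tempfile(fileext = ".csv")
write.csv(mtcars, csv_path, row.names = FALSE)

# Read the CSV into a data frame and fit a model on it.
my_data <- read.csv(csv_path)
model <- lm(mpg ~ wt, data = my_data)
print(coef(model))
```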
Step 1: Load the Dataset
For most in-built datasets, you don’t need to explicitly load them; they are pre-loaded with the
datasets package, which comes with base R. Simply type the dataset name to view it:
data(mtcars)
View(mtcars)
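View() opens a spreadsheet-style viewer, which is most useful in an interactive session such as RStudio. As a sketch of console-friendly alternatives for inspecting the same dataset:

```r
data(mtcars)

# Console-friendly alternatives to View():
head(mtcars)          # first six rows
str(mtcars)           # column names and types
dim(mtcars)           # number of rows and columns
summary(mtcars$mpg)   # minimum, quartiles, mean and maximum of one column
```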
Step 3: Visualize the Data: Basic plots are useful for understanding the relationships in the
dataset. For example, with mtcars
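A minimal sketch of such a plot, followed by a model fit. The choice of mpg against wt matches the call lm(formula = mpg ~ wt, data = mtcars) shown in the output that follows; the axis labels and the object name model are assumptions.

```r
# Scatter plot of fuel efficiency against weight.
plot(mtcars$wt, mtcars$mpg,
     xlab = "Weight (1000 lbs)", ylab = "Miles per gallon",
     main = "mpg vs wt")

# Fit the simple linear regression and print its summary.
model <- lm(mpg ~ wt, data = mtcars)
print(summary(model))
```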
Call:
lm(formula = mpg ~ wt, data = mtcars)
Residuals:
    Min      1Q  Median      3Q     Max
-4.5432 -2.3647 -0.1252  1.4096  6.8727
Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)  37.2851     1.8776  19.858  < 2e-16 ***
wt           -5.3445     0.5591  -9.559 1.29e-10 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 3.046 on 30 degrees of freedom
Multiple R-squared: 0.7528, Adjusted R-squared: 0.7446
F-statistic: 91.38 on 1 and 30 DF, p-value: 1.294e-10
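Read as an equation, this output says that the fitted line is approximately mpg = 37.2851 - 5.3445 * wt. The sketch below uses that line to predict the mileage of a hypothetical car weighing 3000 lbs (wt = 3, since wt is recorded in units of 1000 lbs), once with predict() and once by hand from the rounded coefficients. Both give about 21.25 mpg.

```r
model <- lm(mpg ~ wt, data = mtcars)

# Hypothetical new car weighing 3000 lbs (wt is in units of 1000 lbs).
new_car <- data.frame(wt = 3)

# Prediction from the fitted model.
predicted_mpg <- predict(model, newdata = new_car)

# The same prediction by hand from the rounded coefficients in the output.
by_hand <- 37.2851 - 5.3445 * 3

print(predicted_mpg)
print(by_hand)
```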
Coefficients:
(Intercept)            y
     61.380        1.415
> print(summary(relation))
Call:
lm(formula = x ~ y)
Residuals:
    Min      1Q  Median      3Q     Max
-6.0529 -2.4833 -0.0912  1.3774 10.0562
Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)  61.3803     7.2653   8.448 2.94e-05 ***
y             1.4153     0.1089  12.997 1.16e-06 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Note:-
• Standard Error: Measures the precision of the coefficient estimates. Smaller values suggest more
precise estimates.
• t value: The estimated coefficient divided by its standard error, i.e. how many standard errors the
estimate is away from 0. Larger absolute values give stronger evidence that the coefficient is not zero.
• Pr(>|t|): The p-value for testing the null hypothesis that the coefficient is zero. A small p-value
(typically < 0.05) indicates that the predictor is statistically significant.
• R-squared: A measure of how well the model fits the data. It indicates the proportion of variance in
the dependent variable explained by the independent variable(s).
• Residual Standard Error: An estimate of the standard deviation of the residuals, in the units of the
response. Smaller values indicate a better fit.
• F-statistic and p-value: Tests the overall significance of the model. A significant F-statistic (p-value <
0.05) indicates that at least one of the predictors is significantly related to the dependent variable.
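The definitions in this note can be checked by recomputing each quantity by hand and comparing it with what summary() reports. The sketch below assumes the mtcars model of mpg on wt; all object names are illustrative.

```r
model <- lm(mpg ~ wt, data = mtcars)
s <- summary(model)
coefs <- coef(s)

# t value = estimate / standard error.
t_by_hand <- coefs[, "Estimate"] / coefs[, "Std. Error"]

# Two-sided p-value from the t distribution with residual degrees of freedom.
df_resid <- df.residual(model)
p_by_hand <- 2 * pt(abs(t_by_hand), df = df_resid, lower.tail = FALSE)

# R-squared = 1 - (residual sum of squares / total sum of squares).
rss <- sum(residuals(model)^2)
tss <- sum((mtcars$mpg - mean(mtcars$mpg))^2)
r2_by_hand <- 1 - rss / tss

# Residual standard error = sqrt(RSS / residual degrees of freedom).
rse_by_hand <- sqrt(rss / df_resid)

# With a single predictor, the F-statistic equals the squared t value of the slope.
f_by_hand <- t_by_hand[["wt"]]^2

# Compare the hand-computed values with those stored in the summary object.
print(all.equal(r2_by_hand, s$r.squared))
print(all.equal(rse_by_hand, s$sigma))
print(all.equal(f_by_hand, unname(s$fstatistic["value"])))
```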