Lab 2

Uploaded by

thulasi.v
Copyright © All Rights Reserved

Analyzing the Impact of Caloric Intake on Weight Gain: A Simple Linear Regression Approach


Thulasi-2348152

2023-11-10

Introduction
This report aims to analyze the relationship between the number of calories consumed and
the corresponding weight gained in grams. The dataset provided contains information on
calories consumed and weight gained over a period. We will employ simple linear
regression to understand and model this relationship.

Analysis
a. Steps Involved in Building a Simple Linear Regression Model
Data Collection: The dataset includes two variables, “Weight_gained_grams” and
“Calories_Consumed.”
Data Exploration: Examine the dataset for any anomalies, missing values, or patterns.
Data Visualization: Create visualizations such as scatter plots to visualize the relationship
between calories consumed and weight gained.
Correlation Analysis: Calculate the correlation coefficient to quantify the strength and
direction of the relationship.
Model Training: Split the dataset into training and testing sets. Train the model on the
training set.
Model Evaluation: Evaluate the model’s performance on the testing set using
appropriate metrics.
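The training and evaluation steps listed above can be sketched as follows. This is only an illustration: the data frame `sim` is simulated (it does not contain the report's observations), and the 10/4 split and the RMSE metric are assumptions chosen for the example.

```r
# Simulated data for illustration only; these are NOT the report's observations.
set.seed(42)
n <- 14
calories <- round(runif(n, 1500, 4000))
weight <- -600 + 0.4 * calories + rnorm(n, sd = 100)
sim <- data.frame(Calories_Consumed = calories, Weight_gained_grams = weight)

# Hold out 4 of the 14 rows for testing (an assumed split).
test_idx <- sample(seq_len(n), size = 4)
train <- sim[-test_idx, ]
test <- sim[test_idx, ]

# Fit on the training rows only, then predict the held-out rows.
fit_train <- lm(Weight_gained_grams ~ Calories_Consumed, data = train)
pred <- predict(fit_train, newdata = test)

# Root mean squared error on the test rows.
rmse <- sqrt(mean((test$Weight_gained_grams - pred)^2))
cat("Test RMSE:", rmse, "\n")
```

With only 14 observations, a held-out set of this size gives a noisy error estimate, so the result should be read with caution.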
library(readr)
data <- read_csv("C:/Users/Admin/Downloads/calories_consumed.csv")

## Rows: 14 Columns: 2
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## dbl (2): Weight_gained_grams, Calories_Consumed
##
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
# a. Scatter Diagram and Coefficient of Correlation
plot(data$Calories_Consumed, data$Weight_gained_grams, main="Scatter Plot",
xlab="Calories Consumed", ylab="Weight Gained (grams)")

correlation_coefficient <- cor(data$Calories_Consumed,
                               data$Weight_gained_grams)
cat("Correlation Coefficient:", correlation_coefficient, "\n")

## Correlation Coefficient: 0.946991

The correlation coefficient is 0.946991, which indicates a strong positive linear
relationship between calories consumed and weight gained.
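For reference, the value returned by `cor()` with its default settings is the Pearson correlation coefficient. Writing x_i for calories consumed, y_i for weight gained, and n = 14 for the number of rows (this notation is introduced here and is not part of the original report), it is:

```latex
r = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}
         {\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2}\,\sqrt{\sum_{i=1}^{n} (y_i - \bar{y})^2}}
```

A value close to +1, such as 0.946991, means the points lie close to an upward-sloping straight line.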
# b. Parameter Estimation and Regression Line
model <- lm(Weight_gained_grams ~ Calories_Consumed, data=data)
summary(model)

##
## Call:
## lm(formula = Weight_gained_grams ~ Calories_Consumed, data = data)
##
## Residuals:
##     Min      1Q  Median      3Q     Max
## -158.67 -107.56   36.70   81.68  165.53
##
## Coefficients:
##                     Estimate Std. Error t value Pr(>|t|)
## (Intercept)       -625.75236  100.82293  -6.206 4.54e-05 ***
## Calories_Consumed    0.42016    0.04115  10.211 2.86e-07 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 111.6 on 12 degrees of freedom
## Multiple R-squared: 0.8968, Adjusted R-squared: 0.8882
## F-statistic: 104.3 on 1 and 12 DF, p-value: 2.856e-07

The intercept is -625.75. This is the estimated value of “Weight_gained_grams” when
“Calories_Consumed” is zero. The coefficient for “Calories_Consumed” is 0.42016. This
implies that, on average, each additional calorie consumed is associated with an expected
increase of 0.42016 grams in weight gained.
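As a small worked example of how these two estimates combine into a prediction, the sketch below plugs them into the fitted line. The coefficients are copied from the summary output above; the intake of 2000 calories is a hypothetical value chosen for illustration and is not taken from the dataset.

```r
# Coefficients copied from the summary(model) output in this report.
intercept <- -625.75236
slope <- 0.42016

# Hypothetical calorie intake, chosen only for illustration.
calories <- 2000

# Fitted line: predicted weight gained (grams) = intercept + slope * calories
predicted <- intercept + slope * calories
cat("Predicted weight gained (grams):", predicted, "\n")
```

With the fitted model object available, `predict(model, newdata = data.frame(Calories_Consumed = 2000))` would give the same kind of prediction without copying the coefficients by hand.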
fit=fitted.values(model)
# Plot calories (x) against weight gained (y) so the axes match the model,
# then overlay the fitted regression line.
scatter.smooth(data$Calories_Consumed,data$Weight_gained_grams,col="darkgreen")
abline(model)

residuals <- residuals(model)


cat("Residuals:\n", residuals, "\n\n")

## Residuals:
## 103.5174 -140.6079 97.21979 -98.59224 -124.6392 63.50174 165.5331 -110.5453 49.31377 87.14147 24.09077 -22.54525 -158.6706 65.28245
# R-squared value
r_squared <- summary(model)$r.squared
cat("R-squared Value:", r_squared, "\n\n")

## R-squared Value: 0.896792
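The residuals printed above can be used to cross-check the residual standard error reported by `summary(model)`. The sketch below copies the residuals from the output by hand; the names `res`, `sse` and `rse` are introduced here for illustration only.

```r
# Residuals copied from the output printed in this report.
res <- c(103.5174, -140.6079, 97.21979, -98.59224, -124.6392, 63.50174,
         165.5331, -110.5453, 49.31377, 87.14147, 24.09077, -22.54525,
         -158.6706, 65.28245)

n <- length(res)  # number of observations
p <- 2            # estimated parameters: intercept and slope

# With an intercept in the model, least-squares residuals sum to (about) zero.
cat("Sum of residuals:", round(sum(res), 4), "\n")

# Residual standard error = sqrt(SSE / (n - p)).
sse <- sum(res^2)
rse <- sqrt(sse / (n - p))
cat("Residual standard error:", round(rse, 1), "on", n - p, "degrees of freedom\n")
```

The result should agree, up to rounding of the copied values, with the "Residual standard error" line in the model summary.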

Other ways to check how well the values estimated by the model match the observed
values.
# Scatter plot of y against the fitted values, to check how close y and
# the estimated y are.
scatter.smooth(data$Weight_gained_grams,fit,col="red")

R=cor(data$Weight_gained_grams,fit)
R

## [1] 0.946991

R^2

## [1] 0.896792

The correlation between the observed and fitted values, 0.946991, matches the correlation
between calories consumed and weight gained, as expected in simple linear regression with a
positive slope, and its square reproduces the R-squared value. The R-squared value is
approximately 0.8968, meaning that around 89.68% of the variability in weight gained is
explained by the model. This is a relatively high R-squared value, indicating a strong
linear relationship.
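The link between these quantities can be written out explicitly. Using y_i for the observed weight gained, \hat{y}_i for the fitted values and x_i for calories consumed (notation introduced here for illustration), the standard identities for simple linear regression with an intercept are:

```latex
R^2 = 1 - \frac{\sum_{i=1}^{n} (y_i - \hat{y}_i)^2}{\sum_{i=1}^{n} (y_i - \bar{y})^2}
    = r_{xy}^2 = r_{y\hat{y}}^2,
\qquad
0.946991^2 \approx 0.896792
```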
summary(model)

##
## Call:
## lm(formula = Weight_gained_grams ~ Calories_Consumed, data = data)
##
## Residuals:
##     Min      1Q  Median      3Q     Max
## -158.67 -107.56   36.70   81.68  165.53
##
## Coefficients:
##                     Estimate Std. Error t value Pr(>|t|)
## (Intercept)       -625.75236  100.82293  -6.206 4.54e-05 ***
## Calories_Consumed    0.42016    0.04115  10.211 2.86e-07 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 111.6 on 12 degrees of freedom
## Multiple R-squared: 0.8968, Adjusted R-squared: 0.8882
## F-statistic: 104.3 on 1 and 12 DF, p-value: 2.856e-07

The critical t-value for a two-tailed test with 12 degrees of freedom at the 0.05 significance
level (0.025 in each tail) is approximately 2.179. Since the t-value of 10.211 is much larger
than 2.179, we reject the null hypothesis that the coefficient of “Calories_Consumed” is
zero. The p-value of 2.86e-07 leads to the same conclusion: the effect of
“Calories_Consumed” on weight gained is statistically significant.
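The comparison above can be reproduced with base R's t-distribution functions. The sketch below copies the estimate and standard error from the summary output; because those printed values are rounded, the recomputed t statistic may differ slightly from 10.211 in the last digit. The variable names are introduced here for illustration.

```r
# Values copied from the summary(model) output in this report.
estimate  <- 0.42016   # slope for Calories_Consumed
std_error <- 0.04115   # its standard error
df        <- 12        # residual degrees of freedom
alpha     <- 0.05      # two-tailed significance level

# t statistic for H0: slope = 0
t_stat <- estimate / std_error

# Two-tailed critical value: upper alpha/2 quantile of the t distribution
t_crit <- qt(1 - alpha / 2, df)

# Two-tailed p-value
p_value <- 2 * pt(-abs(t_stat), df)

cat("t statistic:", round(t_stat, 3), "\n")
cat("critical value:", round(t_crit, 3), "\n")
cat("p-value:", signif(p_value, 3), "\n")
cat("reject H0:", abs(t_stat) > t_crit, "\n")
```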

Conclusion
This report has demonstrated the application of simple linear regression to understand the
relationship between calories consumed and weight gained. The analysis provides insights
into the predictive power of the model and assesses its quality of fit based on the given
dataset.
