Multiple Linear Regression
House price model (predictors: area, bedrooms, age)

Coefficients:
             Estimate Std. Error t value Pr(>|t|)
(Intercept)     50000      10000    5.00   <2e-16 ***
area              300         50    6.00   <2e-16 ***
bedrooms        20000       5000    4.00   0.0002 ***
age              -500        100   -5.00   <2e-16 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 20000 on 96 degrees of freedom
Multiple R-squared: 0.85, Adjusted R-squared: 0.843
F-statistic: 180 on 3 and 96 DF, p-value: < 2.2e-16

Employee satisfaction model (predictors: salary, benefits, workload)

Coefficients:
             Estimate Std. Error t value Pr(>|t|)
(Intercept)     3.0        0.5      6.00   <2e-16 ***
salary          0.01       0.002    5.00   <2e-16 ***
benefits        0.5        0.1      5.00   <2e-16 ***
workload       -0.001      0.001   -1.00   0.12
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 0.2 on 96 degrees of freedom
Multiple R-squared: 0.90, Adjusted R-squared: 0.898
F-statistic: 290 on 3 and 96 DF, p-value: < 2.2e-16
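The "t value" column in each table is the estimate divided by its standard error. The short R sketch below checks this for the house price output; the vector names are chosen here for illustration and the numbers are copied by hand from the table.

```r
# Estimates and standard errors copied from the house price output.
estimate  <- c(intercept = 50000, area = 300, bedrooms = 20000, age = -500)
std_error <- c(intercept = 10000, area = 50,  bedrooms = 5000,  age = 100)

# Each t value is the coefficient estimate divided by its standard error.
t_value <- estimate / std_error
print(t_value)
```

The results (5, 6, 4, -5) match the printed t values for the intercept, area, bedrooms, and age.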
House price model

a)
Intercept (50,000): This is the estimated price of a house when all the independent variables (area, bedrooms, age) are zero. In practical terms, this might represent a base price for a house in the area being studied.
Area (300): For every additional square unit of area, the price of the house is expected to increase by $300, holding the number of bedrooms and age constant. This coefficient is statistically significant (p-value < 0.05), indicating that the area of the house is a significant predictor of its price.
Bedrooms (20,000): For each additional bedroom, the price of the house is expected to increase by $20,000, holding the area and age constant. This coefficient is also statistically significant (p-value = 0.0002 < 0.05), suggesting that the number of bedrooms is an important factor in determining the price of a house.
Age (-500): For each additional year of age, the price of the house is expected to decrease by $500, holding the area and number of bedrooms constant. The negative coefficient indicates that older houses are generally cheaper than newer ones. This coefficient is statistically significant (p-value < 0.05), implying that the age of the house is a significant factor in its price.

b)
Multiple R-squared (0.85): 85% of the variability in house prices can be explained by the linear relationship between the price and the independent variables (area, bedrooms, age) in the model.

Employee satisfaction model

a)
Intercept (3.0): This is the estimated satisfaction score when all the independent variables (salary, benefits, workload) are zero. This might represent a base level of satisfaction for employees in the absence of these factors.
Salary (0.01): For every additional unit of salary, the satisfaction score is expected to increase by 0.01 units, holding benefits and workload constant.
Benefits (0.5): For every additional unit of benefits, the satisfaction score is expected to increase by 0.5 units, holding salary and workload constant.
Workload (-0.001): For every additional unit of workload, the satisfaction score is expected to decrease by 0.001 units, holding salary and benefits constant.
Salary and benefits are significant predictors of the employee satisfaction level (p-values < 0.05), while workload is not a significant predictor (p-value = 0.12 > 0.05).

b)
Multiple R-squared (0.90): 90% of the variability in employee satisfaction can be explained by the linear relationship between satisfaction and the independent variables (salary, benefits, workload) in the model.
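To make "holding the other variables constant" concrete, the following R sketch plugs the house price coefficients into the fitted equation. The house used here (1500 square units, 3 bedrooms, 10 years old) is a hypothetical example, not data from the worksheet, and `predict_price` is a helper name introduced only for this illustration.

```r
# Coefficients copied from the house price output.
coefs <- c(intercept = 50000, area = 300, bedrooms = 20000, age = -500)

# Fitted equation: price = intercept + 300*area + 20000*bedrooms - 500*age
predict_price <- function(area, bedrooms, age) {
  unname(coefs["intercept"] + coefs["area"] * area +
    coefs["bedrooms"] * bedrooms + coefs["age"] * age)
}

# Hypothetical house (assumed values, for illustration only).
base   <- predict_price(area = 1500, bedrooms = 3, age = 10)
bigger <- predict_price(area = 1501, bedrooms = 3, age = 10)

print(base)           # 555000
print(bigger - base)  # 300: one extra unit of area, bedrooms and age unchanged
```

The difference between the two predictions equals the area coefficient, which is exactly what the interpretation of "Area (300)" states.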
3. Predicting product sales

lm(formula = sales ~ TV + radio + online, data = advertising_data)

Coefficients:
             Estimate Std. Error t value Pr(>|t|)
(Intercept)     10000       2000    5.00   <2e-16 ***
TV                  2        0.5    4.00   0.0002 ***
radio               3          1    3.00   0.003  **
online           -0.5        0.5   -1.00   0.08
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 500 on 96 degrees of freedom
Multiple R-squared: 0.85, Adjusted R-squared: 0.843
F-statistic: 180 on 3 and 96 DF, p-value: < 2.2e-16

4. Predicting exam score

lm(formula = score ~ study_hours + sleep_hours + stress_level, data = student_data)

Coefficients:
              Estimate Std. Error t value Pr(>|t|)
(Intercept)         50          5   10.00   <2e-16 ***
study_hours          2        0.5    4.00   0.0002 ***
sleep_hours          1        0.2    5.00   <2e-16 ***
stress_level      -0.1        0.1   -1.00   0.1
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 5 on 96 degrees of freedom
Multiple R-squared: 0.85, Adjusted R-squared: 0.843
F-statistic: 180 on 3 and 96 DF, p-value: < 2.2e-16
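The estimate and standard error columns can also be turned into approximate 95% confidence intervals, which give another view of which slopes are distinguishable from zero. The R sketch below does this for the exam score output, assuming the usual t-based interval (estimate plus or minus the critical t value times the standard error) with the 96 residual degrees of freedom reported above. The variable names are chosen here for illustration.

```r
# Values copied from the exam score output.
estimate  <- c(study_hours = 2, sleep_hours = 1, stress_level = -0.1)
std_error <- c(study_hours = 0.5, sleep_hours = 0.2, stress_level = 0.1)
df_resid  <- 96  # residual degrees of freedom from the output

# Two-sided 95% critical value of the t distribution.
t_crit <- qt(0.975, df = df_resid)

# Interval: estimate +/- t_crit * standard error.
ci <- cbind(lower = estimate - t_crit * std_error,
            upper = estimate + t_crit * std_error)
print(round(ci, 3))
```

Under these assumptions the intervals for study_hours and sleep_hours lie entirely above zero, while the interval for stress_level contains zero, which agrees with its p-value being above 0.05.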
Intercept (10,000): This is the estimated sales when all the
independent variables (TV, radio, online) are zero. This might
represent a base level of sales in the absence of advertising
expenditures.
TV (2): For every additional $1 of TV advertising expenditure,
sales are expected to increase by 2 units, holding radio and online
advertising constant.
Radio (3): For every additional $1 of radio advertising
expenditure, sales are expected to increase by 3 units, holding TV
and online advertising constant.
Online (-0.5): For every additional $1 of online advertising
expenditure, sales are expected to decrease by 0.5 units, holding
TV and radio advertising constant.
TV and radio advertising are significant predictors of product sales
(p-values < 0.05), while online advertising is not a significant
predictor of product sales (p-value = 0.08 > 0.05).
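The significance decision for the sales model amounts to comparing each p-value with the chosen level of 0.05. A minimal R sketch, with the p-values copied from the product sales output and variable names chosen for illustration:

```r
# p-values for the slopes, copied from the product sales output.
p_value <- c(TV = 0.0002, radio = 0.003, online = 0.08)
alpha   <- 0.05  # significance level used in the interpretation

# TRUE where the predictor is significant at the 5% level.
significant <- p_value < alpha

print(names(p_value)[significant])   # "TV" "radio"
print(names(p_value)[!significant])  # "online"
```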
Multiple R-squared (0.85): 85% of the variability in product
sales can be explained by the linear relationship between sales
and the independent variables (TV, radio, online ad
expenditures) in the model.
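The R-squared value is linked to the other summary lines of the output. The R sketch below uses the standard formulas for the overall F-statistic and the adjusted R-squared, assuming 3 predictors and the 96 residual degrees of freedom printed in the output. Because the printed R-squared is rounded to two decimals, the results are only expected to be close to the reported 180 and 0.843, not identical.

```r
r2       <- 0.85  # Multiple R-squared as printed (rounded)
k        <- 3     # number of predictors: TV, radio, online
df_resid <- 96    # residual degrees of freedom
n        <- df_resid + k + 1  # implied number of observations

# Overall F-statistic: explained variance per predictor
# over unexplained variance per residual degree of freedom.
f_stat <- (r2 / k) / ((1 - r2) / df_resid)

# Adjusted R-squared penalises R-squared for the number of predictors.
adj_r2 <- 1 - (1 - r2) * (n - 1) / df_resid

print(f_stat)
print(adj_r2)
```

With the rounded input this gives an F-statistic of about 181.3 and an adjusted R-squared of about 0.845, both near the values in the output.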
Output for the model with predictors acidity, sugar, and temperature:

Coefficients:
             Estimate Std. Error t value Pr(>|t|)
(Intercept)      3.5        0.5     7.00   <2e-16 ***
acidity          0.2        0.05    4.00   0.0002 ***
sugar            0.1        0.02    5.00   <2e-16 ***
temperature     -0.01       0.01   -1.00   0.14
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 0.3 on 96 degrees of freedom
Multiple R-squared: 0.85, Adjusted R-squared: 0.843
F-statistic: 180 on 3 and 96 DF, p-value: < 2.2e-16

Output for the model with predictors location, size, and reviews:

Coefficients:
             Estimate Std. Error t value Pr(>|t|)
(Intercept)      5000       1000    5.00   <2e-16 ***
location         1000        200    5.00   <2e-16 ***
size              500        100    5.00   <2e-16 ***
reviews            -5         10   -0.50   0.62
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 200 on 96 degrees of freedom
Multiple R-squared: 0.85, Adjusted R-squared: 0.843
F-statistic: 180 on 3 and 96 DF, p-value: < 2.2e-16