Analytics Assignment

BUSINESS ANALYTICS

Uploaded by

sameekshamane29

PREDICTIVE DECISION-MAKING ASSIGNMENT

UNIVERSITY CANADA WEST


BUSI 650: BUSINESS ANALYTICS
PROFESSOR: ROBIN, TEOTIA
NOVEMBER 20, 2023

PART - A
LINK:

A. Perform simple linear regression, with the target variable as “Overall” which is the rating of
the hotels by using the independent variable “Amenities”.
Sol: Please refer to the attached Python Colab file link.
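As a rough illustration of what such a fit involves, the following sketch fits a least-squares line with NumPy. The five score pairs are hypothetical placeholders, not the assignment's hotel data, so the printed coefficients will not match the ones reported in this assignment.

```python
import numpy as np

# Hypothetical placeholder scores, NOT the assignment's hotel data.
amenities = np.array([60.0, 70.0, 80.0, 90.0, 100.0])
overall = np.array([82.0, 86.0, 88.0, 90.0, 94.0])

# With deg=1, np.polyfit returns [slope, intercept] of the least-squares line.
slope, intercept = np.polyfit(amenities, overall, deg=1)

print(f"Overall = {intercept:.3f} + {slope:.3f} * Amenities")
```

Replacing the two arrays with the real "Amenities" and "Overall" columns would give the fitted line for the actual data.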

B. Determine the equation of the simple linear regression performed in question ‘A’. Also,
mention the slope and the intercept.
Sol: The equation for simple linear regression is

y = β0 + β1x

where y is the dependent variable, β0 is the y-intercept, β1 is the slope of the line, and
x is the independent variable.
- In this case, the dependent variable (y) is the Overall Score, the independent
variable (x) is Amenities, the y-intercept (β0) is 69.249, and the slope (β1) is 0.234.
- The slope means that each one-point increase in the Amenities score is associated with a
0.234-point increase in the predicted Overall Score.

Therefore, the estimated simple linear regression equation is:

Overall Score = 69.249 + 0.234 × Amenities
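The fitted equation can be evaluated for any Amenities score. The sketch below uses the intercept and slope reported above; the function name `predict_overall` and the input score of 80 are illustrative assumptions, not part of the assignment.

```python
INTERCEPT = 69.249  # β0 reported above
SLOPE = 0.234       # β1 reported above


def predict_overall(amenities: float) -> float:
    """Return the Overall Score predicted by the simple regression line."""
    return INTERCEPT + SLOPE * amenities


# Hypothetical hotel with an Amenities score of 80:
# 69.249 + 0.234 * 80 = 87.969
print(round(predict_overall(80.0), 3))
```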

C. What is the R2 value and what do you understand by it?

Sol: The R2 value is 0.49 (49%). R2, the coefficient of determination, is the proportion of the
variance in the dependent variable (Overall Score) around its mean that the model explains.

- Here, Amenities accounts for 49% of the variance in the Overall Score; the remaining 51% is
left unexplained by this model.
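To make the definition concrete, R2 can be computed as 1 minus the ratio of the residual sum of squares to the total sum of squares around the mean. The sketch below is a minimal version of that formula; the helper name `r_squared` and the four sample values are hypothetical and unrelated to the hotel data.

```python
import numpy as np


def r_squared(y_true, y_pred) -> float:
    """R2 = 1 - SS_res / SS_tot for observed and predicted values."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y_true - y_pred) ** 2)         # variation the model misses
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)  # variation around the mean
    return float(1.0 - ss_res / ss_tot)


# Hypothetical values: SS_res = 1.0, SS_tot = 5.0, so R2 = 0.8
example_r2 = r_squared([1, 2, 3, 4], [1.5, 2.5, 2.5, 3.5])
print(example_r2)
```

Predicting the mean for every observation gives R2 = 0, and a perfect fit gives R2 = 1.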

PART - B
D. Perform multiple linear regression, with the target variable as “Overall” which is the rating
of the hotels, predicted by using the given independent variables- “comfort”, “amenities”,
and “nearby dining”.
Sol: Please refer to the attached Python Colab file link.
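As a rough sketch of how a multiple linear regression can be fitted, the example below solves the least-squares problem with NumPy. All scores are hypothetical placeholders, and the "Overall" values are generated without noise from made-up coefficients (10, 0.5, 0.2, 0.3) so that the fit can be checked; none of this is the assignment's data or result.

```python
import numpy as np

# Hypothetical placeholder scores, NOT the assignment's hotel data.
comfort = np.array([70.0, 80.0, 90.0, 60.0, 85.0, 75.0])
amenities = np.array([60.0, 90.0, 70.0, 80.0, 65.0, 95.0])
dining = np.array([50.0, 60.0, 90.0, 70.0, 80.0, 55.0])

# Synthetic, noise-free target built from made-up coefficients.
overall = 10.0 + 0.5 * comfort + 0.2 * amenities + 0.3 * dining

# Design matrix: a column of ones for the intercept, then one column per predictor.
X = np.column_stack([np.ones(len(overall)), comfort, amenities, dining])

# Least-squares solution: [intercept, b_comfort, b_amenities, b_dining].
coef, *_ = np.linalg.lstsq(X, overall, rcond=None)
print(np.round(coef, 3))
```

Because the synthetic target has no noise, the fit recovers the generating coefficients; real data would not fit exactly.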

E. Determine the estimated multiple linear regression equation (done in part D) that can be
used to predict the overall score given the scores for comfort, amenities, and nearby dining.

Sol: The equation for multiple linear regression is

y = β0 + β1x1 + β2x2 + . . . + βPxP

where y is the dependent variable, β0 is the y-intercept, and β1, β2, ..., βP are the
slopes for the corresponding independent variables x1, x2, ..., xP.

- In this case, the dependent variable (y) is the Overall Score, and the independent
variables (x1, x2, x3) are Comfort, Amenities, and Nearby Dining.
- The intercept (β0) is 35.697, and the slopes (β1, β2, β3) are 0.109, 0.244, and
0.247, respectively.
Hence, the Multiple Linear Regression equation is expressed as:

Overall Score = 35.697 + 0.109 × Comfort + 0.244 × Amenities + 0.247 × Nearby Dining
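The equation above can be applied to a single hotel as follows. The intercept and slopes are the ones reported in this answer; the function name `predict_overall_multi` and the hotel's three scores are hypothetical, chosen only to show the arithmetic.

```python
MULTI_INTERCEPT = 35.697  # β0 reported above
MULTI_SLOPES = {          # β1, β2, β3 reported above
    "Comfort": 0.109,
    "Amenities": 0.244,
    "Nearby Dining": 0.247,
}


def predict_overall_multi(scores: dict) -> float:
    """Return the Overall Score predicted by the three-variable model."""
    return MULTI_INTERCEPT + sum(
        slope * scores[name] for name, slope in MULTI_SLOPES.items()
    )


# Hypothetical hotel scores:
# 35.697 + 0.109*90 + 0.244*80 + 0.247*70 = 82.317
hotel = {"Comfort": 90.0, "Amenities": 80.0, "Nearby Dining": 70.0}
print(round(predict_overall_multi(hotel), 3))
```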

F. Use the t-statistics to determine the significance of each independent variable. What is the
conclusion for each test at the 0.05 level of significance?

Sol: For each independent variable, the t-test checks the null hypothesis that its coefficient
is zero (H0: βi = 0) against the alternative that it is not (Ha: βi ≠ 0), with the other
variables kept in the model. At the 0.05 level of significance:
Comfort: The p-value is 0.412, which exceeds 0.05, so we fail to reject the null hypothesis.
Comfort is not statistically significant in this model once Amenities and Nearby Dining are
included.
Amenities: The p-value is reported as 0.000 to three decimal places, which is less than 0.05,
so we reject the null hypothesis. Amenities is significant in the model.
Nearby Dining: The p-value is 0.001, which is less than 0.05, so we reject the null
hypothesis. Nearby Dining is significant in the model.
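The decision rule used in these tests (reject H0 when the p-value is below the significance level) can be sketched as follows. The p-values are the ones reported above; the helper name `is_significant` is an illustrative assumption.

```python
ALPHA = 0.05  # significance level stated in the question

# p-values reported above (0.000 is the rounded value).
p_values = {"Comfort": 0.412, "Amenities": 0.000, "Nearby Dining": 0.001}


def is_significant(p_value: float, alpha: float = ALPHA) -> bool:
    """Return True when the null hypothesis (coefficient = 0) is rejected."""
    return p_value < alpha


decisions = {
    name: "reject H0" if is_significant(p) else "fail to reject H0"
    for name, p in p_values.items()
}

for name, decision in decisions.items():
    print(f"{name}: p = {p_values[name]:.3f} -> {decision}")
```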

PART - C
G. Remove all independent variables that are not significant at the 0.05 level of significance
from the estimated regression equation. What is your recommended estimated regression
equation?

Sol: Comfort is the only independent variable that is not significant at the 0.05 level, so it
is removed. The remaining multiple linear regression model is

y = β0 + β1 × Amenities + β2 × Nearby Dining

where y is the dependent variable (Overall Score), β0 is the y-intercept, and β1 and β2 are the
slopes for the independent variables Amenities and Nearby Dining.
Hence, the recommended estimated regression equation is:

Overall score = 45.146 + 0.252 × Amenities + 0.248 × Nearby Dining

H. What is the impact of this change on the coefficient of determination?

Sol:
