Omelchenko Oksana & Khankeldiev Sanjar Homework #2: Comments

This document summarizes the results of analyzing sales data from three ice cream shops to determine if a promotion campaign was successful. Descriptive statistics are provided for key variables like daily sales, price, and temperature. Correlation and regression analyses were conducted to identify relationships between variables. The full regression model found sales were significantly impacted by location, price, competitors' prices, economic conditions, temperature, day of week, and promotions. Diagnostic tests confirmed the model had high explanatory power and met assumptions of normality, lack of multicollinearity and homoscedasticity.

Uploaded by

Zafar Okhunov
Omelchenko Oksana & Khankeldiev Sanjar

Homework #2

This work aims to determine whether the promotion campaign held in three different ice cream shops was successful.

First, descriptive statistics were computed for the scale variables.

Comments:
On average, approximately 31 million cups were sold per day in the observed chain of shops. The standard deviation was a bit less than 14.7 million cups, roughly half the mean.

The average price in the observed chain was almost 1 unit, whereas in the main competitor's store it was 1.25 units. The standard deviations of these two variables are similar to each other and not remarkably different from the means.

The three observed cities, NYC, Chicago, and LA, had somewhat different mean temperatures in the observed year: about 18, 16, and 25 degrees, respectively. The standard deviation ranged from 25 to 36, depending on the city.
Interpretation for Correlation Analysis:
The Pearson correlation was estimated for each pair of scale variables. Significance levels of 1%, 5%, and 10% were considered.

• Total sales of cups of ice cream, correlated with:
Price of the ice cream: statistically insignificant at every considered level, since the P-value (0.126) exceeds 10%.
Main competitor's average price: statistically significant. It is classed as a negligible correlation, with a value of 0.60.
Overall economic state: statistically significant. It is classed as a negligible correlation, with a value of -0.58.
Average temperatures in NYC, Chicago, and LA: all statistically significant (P-value = 0.000 < 1%). The correlation is very high and positive in NYC (0.98), low and negative in Chicago (-0.353), and moderate and negative in LA (-0.56).
• Overall economic state shows a negligible correlation with the average temperatures in NYC and Chicago (statistically significant at the 1% level).
• Average temperatures in NYC and Chicago show a low negative correlation with each other, with a magnitude of 0.43 and P-value = 0.
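The correlation figures above come from statistical software. As a minimal sketch of what a Pearson coefficient measures, the function below computes it from two equal-length series using only the Python standard library. The function names and the sample numbers in the usage note are hypothetical and are not taken from the homework data.

```python
import math


def pearson_r(x, y):
    """Sample Pearson correlation coefficient of two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sums of cross-products and squared deviations from the means.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / math.sqrt(var_x * var_y)


def t_statistic(r, n):
    """t statistic for testing zero correlation, with n - 2 degrees of freedom."""
    if n < 3 or abs(r) >= 1:
        raise ValueError("need n >= 3 and |r| < 1")
    return r * math.sqrt((n - 2) / (1 - r ** 2))
```

For example, `pearson_r([10, 15, 20, 25], [22, 27, 35, 41])` would measure how closely a made-up temperature series tracks a made-up sales series. The P-value reported next to each coefficient is derived from the t statistic and the Student t distribution, which this sketch does not implement.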
2. As required by the task, the model with all variables is presented at this stage.
The Model Summary shows a high adjusted R square (0.959). According to the ANOVA table, the regression is statistically significant at every considered level of significance.
• Sales in LA are on average 29.84 million units lower than in New York City.
• Sales in Chicago are on average 25.69 million units lower than in New York City.
• If the average price of ice cream increases by one unit, sales are expected to fall by 3.17 million units.
• With a one-unit increase in the average competitors' price, sales are expected to increase by 3.46 million units.
• With a one-unit increase in the overall national economic sentiment, sales grow by 0.069 million units (about 69,000 units) on average.
• With each new day from the beginning of the year, sales grow by 6,000 units on average.
• Sales on Saturday and Sunday are on average 1.542 million units higher than on working days.
• With a one-unit increase in temperature in New York City, sales increase by 0.044 million units on average.
• Sales are on average 2.427 million units higher on days with a promotion.
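The coefficients listed above can be combined to estimate how predicted sales change when several predictors change at once, with all other variables held constant. The sketch below assumes a linear model without interaction terms. The dictionary keys and the function name are hypothetical labels chosen for illustration. The intercept is not reported in the text, so only differences are computed, not sales levels.

```python
# Coefficients in millions of units, copied from the interpretation list.
# The key names are illustrative labels, not the original variable names.
COEFFICIENTS = {
    "la_store": -29.84,         # dummy: LA versus New York City
    "chicago_store": -25.69,    # dummy: Chicago versus New York City
    "price": -3.17,
    "competitor_price": 3.46,
    "economic_sentiment": 0.069,
    "day_of_year": 0.006,       # 6,000 units per day
    "weekend": 1.542,           # dummy: Saturday or Sunday
    "temp_nyc": 0.044,
    "promotion": 2.427,         # dummy: promotion on that day
}


def predicted_change(**deltas):
    """Predicted change in daily sales (million units) for given predictor changes."""
    unknown = set(deltas) - set(COEFFICIENTS)
    if unknown:
        raise KeyError(f"unknown predictors: {sorted(unknown)}")
    return sum(COEFFICIENTS[name] * delta for name, delta in deltas.items())
```

For instance, `predicted_change(promotion=1, weekend=1)` adds the promotion and weekend effects, while `predicted_change(price=0.5)` gives the expected change for a half-unit price increase.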
At this stage, multicollinearity is checked using the Variance Inflation Factor (VIF). Following the rule of thumb applied here, variables with VIF > 9 are considered to be affected by multicollinearity. These variables are:

Dummies: LA store, Chicago store.

Scale variables: overall national economic sentiment, Day_Year, and average daily temperature in NYC and LA.
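For reference, the VIF of a predictor equals 1 / (1 - R²), where R² comes from regressing that predictor on all the other predictors. The sketch below only converts such an R² into a VIF and applies the threshold of 9 used in the text. The function names are hypothetical, and the auxiliary regressions themselves are not shown.

```python
VIF_THRESHOLD = 9.0  # rule of thumb stated in the text


def vif_from_r_squared(r_squared):
    """VIF of a predictor, given R^2 of its regression on the other predictors."""
    if not 0 <= r_squared < 1:
        raise ValueError("r_squared must be in [0, 1)")
    return 1.0 / (1.0 - r_squared)


def is_multicollinear(r_squared, threshold=VIF_THRESHOLD):
    """True when the VIF implied by r_squared exceeds the threshold."""
    return vif_from_r_squared(r_squared) > threshold
```

Under this rule, a predictor is flagged once the other predictors explain more than 8/9 (about 89%) of its variance, because 1 / (1 - 8/9) = 9.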
Autocorrelation
If d = 2: no autocorrelation
If d = 0: perfect positive autocorrelation
If d = 4: perfect negative autocorrelation

The Durbin-Watson statistic in this case equals 2.061, which is close to 2, so there is no evidence of autocorrelation.
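The Durbin-Watson statistic d is the sum of squared differences between consecutive residuals divided by the sum of squared residuals. A minimal sketch, assuming the residuals are available as a list in time order (the function name is illustrative):

```python
def durbin_watson(residuals):
    """Durbin-Watson statistic for a sequence of regression residuals in time order."""
    if len(residuals) < 2:
        raise ValueError("need at least two residuals")
    # Numerator: squared differences between consecutive residuals.
    numerator = sum(
        (residuals[t] - residuals[t - 1]) ** 2 for t in range(1, len(residuals))
    )
    # Denominator: sum of squared residuals.
    denominator = sum(e ** 2 for e in residuals)
    if denominator == 0:
        raise ValueError("all residuals are zero")
    return numerator / denominator
```

Residuals that stay constant give d = 0, and residuals that flip sign at every step push d toward 4 as the series gets longer, which matches the reference values listed above.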


The histogram of the sales distribution, the Kolmogorov-Smirnov test, the P-P plot, and the Q-Q plot all indicate a normal distribution.
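As a rough illustration of the Kolmogorov-Smirnov idea, the sketch below computes the largest gap between the empirical distribution of a sample and a normal distribution fitted to it. It returns only the distance D, not a P-value. Because the mean and standard deviation are estimated from the same sample, the standard Kolmogorov-Smirnov critical values do not strictly apply to this D. The function names are hypothetical, and the software output referenced in the text may be computed differently.

```python
import math
import statistics


def normal_cdf(x, mu, sigma):
    """Cumulative distribution function of a normal distribution."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


def ks_distance_to_normal(sample):
    """Largest gap between the empirical CDF and a normal CDF fitted to the sample."""
    n = len(sample)
    if n < 2:
        raise ValueError("need at least two observations")
    mu = statistics.fmean(sample)
    sigma = statistics.stdev(sample)
    if sigma == 0:
        raise ValueError("sample has zero variance")
    distance = 0.0
    for i, x in enumerate(sorted(sample), start=1):
        cdf = normal_cdf(x, mu, sigma)
        # Compare against the empirical CDF just after and just before x.
        distance = max(distance, i / n - cdf, cdf - (i - 1) / n)
    return distance
```

A small D means the sample is close to the fitted normal curve. Judging significance requires the corresponding critical values or P-value from statistical software.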
Homoscedasticity analysis:
One of the main methods of investigating heteroscedasticity is the analysis of residual plots. The purpose of this analysis is to find factors associated with a change in the residual variance, such as the observation number or the value of one of the variables.
Visual analysis reveals no signs of non-constant variance, either overall or for individual variables. There is no basis to conclude that the variance changes or to relate such a change to the observation number.

Therefore, heteroscedasticity has not been identified for any variable (most of the graphs are illustrated below).
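The conclusion above rests on visual inspection of residual plots. A crude numeric complement, which is not the method used in this homework and not a formal test, is to order the residuals by observation number, split them into two halves, and compare the sample variances. The function name below is hypothetical.

```python
import statistics


def variance_ratio(residuals):
    """Ratio of the larger to the smaller sample variance of the two halves."""
    n = len(residuals)
    half = n // 2
    if half < 2:
        raise ValueError("need at least four residuals")
    first = residuals[:half]
    second = residuals[n - half:]  # skips the middle value when n is odd
    var_first = statistics.variance(first)
    var_second = statistics.variance(second)
    if var_first == 0 or var_second == 0:
        raise ValueError("one half has zero variance")
    return max(var_first, var_second) / min(var_first, var_second)
```

A ratio near 1 is consistent with constant variance across the observation order, while a ratio far above 1 suggests the residual spread changes and that the plots deserve a closer look.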
