Information Technology
Feb-03-2024
A. Hypothesis Testing
Background: A fast food restaurant claims that the average wait time for a drive-thru order is 3
minutes. The restaurant manager wants to test this claim with a sample of customers.
Task: You have been hired as a data analyst to help the restaurant manager with hypothesis
testing and confidence intervals.
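Null hypothesis: The null hypothesis ($H_0$) is the fast-food restaurant's claim that the
average wait time for a drive-thru order is 3 minutes.
$H_0: \mu = 3$
Alternative hypothesis: The alternative hypothesis ($H_a$) states that the average wait time
for a drive-thru order is not equal to 3 minutes.
$H_a: \mu \neq 3$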
2. If a random sample of 50 customers produces a sample mean wait time of 2.8 minutes
with a sample standard deviation of 0.4 minutes, calculate the test statistic using a z-
value.
To calculate the Z-statistic:
$$Z = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}}$$
where σ is the sample standard deviation, 0.4 minutes.
Therefore, the Z-statistic is:
$$Z = \frac{2.8 - 3}{0.4 / \sqrt{50}} = -3.536$$
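The calculation above can be checked with a short Python sketch. The function name `z_statistic` and its parameter names are illustrative choices, not part of the original assignment; the inputs are the values given in the question.

```python
import math

def z_statistic(sample_mean, hypothesized_mean, std_dev, n):
    """Return (sample_mean - hypothesized_mean) / (std_dev / sqrt(n))."""
    standard_error = std_dev / math.sqrt(n)
    return (sample_mean - hypothesized_mean) / standard_error

# Values from the question: mean 2.8, claimed mean 3, s = 0.4, n = 50
z = z_statistic(2.8, 3.0, 0.4, 50)
print(round(z, 3))  # -3.536
```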
3. Calculate the p-value for this test and interpret the results at a significance level of 0.05.
Using the Z-value of -3.536, the one-tailed p-value is 0.000203 (from an online
calculator).
Since the alternative hypothesis is that the population mean wait time is not equal to
3 minutes, a two-tailed test is required. Doubling the p-value gives 0.000406.
We therefore reject the null hypothesis, since the p-value is less than the
significance level α = 0.05. There is evidence that the average wait time is not 3
minutes.
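As an alternative to an online calculator, the p-value can be reproduced with Python's standard library. This is a minimal sketch that assumes the rounded Z-value of -3.536 reported above; the variable names are illustrative.

```python
from statistics import NormalDist

z = -3.536    # Z-statistic reported above
alpha = 0.05  # significance level from the question

# Area under the standard normal curve to the left of z
one_tail_p = NormalDist().cdf(z)
# Two-tailed test: double the one-tailed area
two_tail_p = 2 * one_tail_p
reject_null = two_tail_p < alpha

print(f"one-tailed p = {one_tail_p:.6f}")
print(f"two-tailed p = {two_tail_p:.6f}")
print("reject H0:", reject_null)
```

The two printed p-values should agree with the figures quoted above to within rounding.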
The critical value for a one-tailed (left-tailed) test at a significance level (α) of 0.05
is -1.6449.
Comparing the Z-value of -3.536 with the critical value of -1.6449, the calculated
Z-value is less than the critical value and falls in the rejection region.
Hence, we reject the null hypothesis in the one-tailed test at a significance
level of 0.05. The sample provides evidence that the average wait time for the
drive-thru is less than 3 minutes, so the data do not support the fast food
restaurant's claim.
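The critical-value comparison can be sketched the same way. The snippet assumes a left-tailed test and uses the Z-value reported earlier; `critical_value` is an illustrative name.

```python
from statistics import NormalDist

alpha = 0.05
z = -3.536  # Z-statistic reported earlier

# Left-tail critical value: the point with 5% of the area below it
critical_value = NormalDist().inv_cdf(alpha)
reject_null = z < critical_value

print(round(critical_value, 4), reject_null)  # -1.6449 True
```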
5. How would decreasing the sample size from 50 to 25 affect your conclusion? Explain.
Decreasing the sample size would make it more difficult to reach a clear conclusion,
because smaller samples have more variable means and are less likely to reflect the
overall population mean. For example, if a smaller sample happens to consist
mostly of a minority of larger values and misses the majority of other
values, its mean will lie further from the population mean, hence a larger sampling error.
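To make the effect concrete, the sketch below repeats the two-tailed test for n = 50 and n = 25. It assumes, purely for illustration, that the sample mean (2.8) and standard deviation (0.4) stay the same in the smaller sample, which the question does not state. The helper name `two_tailed_test` is hypothetical.

```python
import math
from statistics import NormalDist

def two_tailed_test(sample_mean, mu0, std_dev, n):
    """Return the z-statistic and two-tailed p-value for a z-test."""
    z = (sample_mean - mu0) / (std_dev / math.sqrt(n))
    p = 2 * NormalDist().cdf(-abs(z))
    return z, p

# Assumption: same sample mean and standard deviation for both sizes
for n in (50, 25):
    z, p = two_tailed_test(2.8, 3.0, 0.4, n)
    print(f"n={n}: z={z:.3f}, p={p:.4f}, reject H0: {p < 0.05}")
```

Under that assumption the Z-value shrinks to about -2.5 and the p-value rises to roughly 0.012. The null hypothesis would still be rejected at the 0.05 level, but the evidence is weaker because the standard error is larger.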
The following are some of the possible errors which might affect the sample.
The sample data might not have been recorded at a representative time.
The sample data presented might not cover the entire sample.
The observed wait times might be affected by weather conditions or traffic in the
area.
There might have been a shortage of staff at the serving point while the data
were being observed.
The possible errors can be mitigated by thoroughly analysing the restaurant's pattern before
collecting the data.
7. Provide two other examples of situations or industries where hypothesis testing and
confidence intervals could be applied.
Sports Sector:
In the sports sector, coaches, team administrators and sports
analysts use hypothesis testing to assess the performance of their players, to
measure the effectiveness of the training techniques employed, or to compare
strategies used by different teams.
For performance measures, confidence intervals may be
applied to show how reliable and consistent the athletes are in
their games.
Educational Sector:
In the educational sector, hypothesis testing is a key component of evaluating
the effectiveness of a teaching strategy or analysing an instructional approach.
By applying confidence intervals to student results, an institute
can judge how reliable an observed improvement is and update or improve its
teaching techniques and strategies to achieve better results.
B. Moving Average
Imagine you are a consultant hired by a retail store looking to enhance customer satisfaction. The
management wants to employ data-driven strategies to understand customer preferences,
improve service, and ultimately boost sales. Your task is to design a comprehensive data
collection and analysis plan.
a) Identify both numerical and non-numerical data relevant to understanding customer
satisfaction. Explain how each data type contributes to the overall analysis.
Numerical Data:
Sales Figures: numerical data giving the sales value per day. This data
helps to identify sales patterns and customer behaviour.
Customer counts: numerical data giving the number of
customers visiting the business each day. It can also be used to plan a survey of
customer ratings.
Non-Numerical Data:
Product types: data about the products which customers mostly
prefer to buy or do not like to buy, which helps the business to expand or
improve a specific product or service.
Customer Reviews: data describing the customers' satisfaction level,
indicating whether the service the business provides is good or bad.
c) Consider the following dataset representing the daily sales figures (in dollars) of the store
over a period of one month. The dataset contains 30 data points, each representing the
sales amount for a specific day.
To smooth out the daily sales fluctuations, the store owner decides to calculate a moving
average. A moving average is a statistical technique that involves selecting a specific time
period, summing up the data points within that period, and dividing by the number of
points to obtain an average. This process is then repeated by shifting the time window
one data point at a time until the entire dataset is covered. The resulting moving average
series helps smooth out short-term fluctuations, providing a clearer representation of
underlying trends or patterns in the data. Calculate the 5-day moving average of the
dataset. Show your calculations and your figure.
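The sliding-window procedure described above can be sketched in Python as follows. The function name `moving_average` and the short `sales` list are illustrative only; the list is hypothetical and is not the store's 30-day dataset.

```python
def moving_average(values, window=5):
    """Average each run of `window` consecutive values, sliding one step at a time."""
    if window <= 0 or window > len(values):
        raise ValueError("window must be between 1 and len(values)")
    return [
        sum(values[i:i + window]) / window
        for i in range(len(values) - window + 1)
    ]

# Hypothetical daily sales, for illustration only
sales = [100, 110, 120, 130, 140, 150, 160]
print(moving_average(sales))  # [120.0, 130.0, 140.0]
```

With 30 daily values and a 5-day window, this produces 26 averages, one for each complete window.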
Mean Absolute Error: $\text{MAE} = \frac{337}{25} = 13.48$
Mean Squared Error: $\text{MSE} = \frac{7913}{25} = 316.52$
Mean Absolute Percentage Error: $\text{MAPE} = \frac{331.001}{25} = 13.24\%$
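The three error measures can be computed with a sketch like the one below. It assumes the forecast for each day is the average of the previous 5 days, which would yield 25 forecasts from 30 data points and so matches the divisor of 25 used above; the source does not state this explicitly. The function name `forecast_errors` and the sample data are hypothetical.

```python
def forecast_errors(actual, window=5):
    """Return MAE, MSE and MAPE for a moving-average forecast.

    Assumption: the forecast for day t is the mean of the previous
    `window` actual values, so the first `window` days have no forecast.
    """
    abs_errors, sq_errors, pct_errors = [], [], []
    for t in range(window, len(actual)):
        forecast = sum(actual[t - window:t]) / window
        error = actual[t] - forecast
        abs_errors.append(abs(error))
        sq_errors.append(error ** 2)
        pct_errors.append(abs(error) / actual[t] * 100)
    n = len(abs_errors)
    return {
        "MAE": sum(abs_errors) / n,
        "MSE": sum(sq_errors) / n,
        "MAPE": sum(pct_errors) / n,
    }

# Hypothetical data, not the store's dataset
result = forecast_errors([100, 100, 100, 100, 100, 110, 90])
print({name: round(value, 2) for name, value in result.items()})
# {'MAE': 11.0, 'MSE': 122.0, 'MAPE': 11.21}
```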
[Figure: "Sales Figures & Customer Counts". Line chart comparing Actual Daily Data with 5 Days
Moving Data. X-axis: Day (1 to 30). Y-axis: Value (90.00 to 160.00).]
The actual daily sales value of the shop over the 30 days is between $90 and $150.
Compared with the actual sales performance, the 5-day moving average (5MA)
forecast shows that:
Over the first 5 days, the 5MA shows a flat sales value matching the actual
sales value, and from day 6 it starts increasing.
From day 6 to day 12, the 5MA forecast shows an upward trend in sales.
Days 13, 14, 15 and 16 show the highest values in the 5MA forecast.
Days 18 and 25 reflect the lowest values of the 5MA forecast.
Towards the last week of the month, the 5MA forecast shows a
consistent sales value.
Even when the sales trend is analysed with a 5-day moving average forecast, a
few upward and downward fluctuations remain. Recording data for additional months (a larger
sample) and interpreting it with the 5MA method may give more accurate results over time.