502 Week 6 Summary
Video Summary (W-6)
Saksham (2111579)
Shirley Tang
Z-test Sample
In this video, we learn how to run a z-test on a single sample. The speaker takes 50 observations
of total child nutrition per-pupil expenditure (PPE), with a population variance of 371.102.
He sets up a second column to use for the test and enters $545 in it, in order to test whether
the true mean of the population is equal to $545. This amount is filled into column B for all
observations. A z-test is appropriate because the sample is larger than 30 and the population
variance, which is essential in the analysis, is known. To carry out the test, the speaker opens
the Data ribbon, chooses the Data Analysis ToolPak, selects z-Test: Two Sample for Means, clears
out the existing values, and obtains the z-test result, which was not equal to the PPE, so we are
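The calculation behind a single-sample z-test can be sketched in a few lines of Python. This is only an illustration, not the speaker's spreadsheet: the hypothesized mean (545), population variance (371.102) and sample size (50) come from the summary, while the sample mean is a made-up placeholder and the function name one_sample_z is hypothetical.

```python
import math
from statistics import NormalDist


def one_sample_z(sample_mean, mu0, pop_variance, n):
    """Return (z, two-tailed p-value) for H0: population mean == mu0."""
    # Standard error of the mean when the population variance is known.
    standard_error = math.sqrt(pop_variance / n)
    z = (sample_mean - mu0) / standard_error
    # Two-tailed p-value from the standard normal distribution.
    p_two_tailed = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_two_tailed


# 545, 371.102 and 50 are taken from the summary.
# The sample mean below is an assumed placeholder, NOT a value from the video.
HYPOTHETICAL_SAMPLE_MEAN = 550.0

z, p = one_sample_z(HYPOTHETICAL_SAMPLE_MEAN, mu0=545.0, pop_variance=371.102, n=50)
print(f"z = {z:.3f}, two-tailed p = {p:.4f}")
```

If the p-value falls below the chosen alpha, the hypothesis that the true mean equals $545 would be rejected.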
T-test Sample
In this video, we learn how to run a t-test of a hypothesis. The speaker tests whether the
population mean age is significantly greater than 20; this must be the alternative hypothesis
because it contains no equality. The speaker uses the built-in Data Analysis command to
generate the statistics, with a trick to fool Excel. First, he creates a dummy second variable;
then he selects Data Analysis, t-Test: Two-Sample Assuming Unequal Variances, enters a
hypothesized mean difference of 20, and selects an output range. In the result, the sample mean
is 23.5 and the test is right-tailed. Since t = 2.577 is greater than the critical value 1.740,
H0 is rejected.
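The same right-tailed decision can be sketched directly, without the dummy-variable trick. This is a minimal illustration under assumptions: the ages below are invented, not the data from the video, and the function name one_sample_t is hypothetical. The critical value 1.740 is the one quoted in the summary; it is the one-tailed 5% value for 17 degrees of freedom, so the made-up sample has 18 values.

```python
import math
import statistics


def one_sample_t(sample, mu0):
    """Return the t statistic for H0: population mean == mu0."""
    n = len(sample)
    mean = statistics.mean(sample)
    sd = statistics.stdev(sample)  # sample standard deviation (n - 1 denominator)
    return (mean - mu0) / (sd / math.sqrt(n))


# Hypothetical ages (18 values, so 17 degrees of freedom). NOT the video's data.
ages = [21, 24, 19, 27, 23, 30, 22, 18, 25, 26, 20, 28, 24, 22, 29, 21, 23, 26]

T_CRITICAL = 1.740  # critical value quoted in the summary (right-tailed test)

t_stat = one_sample_t(ages, mu0=20)
# Right-tailed test: reject H0 only when t is larger than the critical value.
print(f"t = {t_stat:.3f}, reject H0: {t_stat > T_CRITICAL}")
```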
In this video, the speaker takes two samples of stock prices, Apple and IBM. The only thing
that changes is the actual stock, matched by its dates. It is a two-tailed test, and there are
In the first example in the spreadsheet, the speaker uses the t-test function on the difference
between the daily price changes of the two stocks. He calculates the average daily price change
for IBM and Apple with AVERAGE over the first 30 observations and then looks at the difference
between them. He states that a higher p-value means a higher probability that the difference we
are seeing comes from random chance. He compares the alpha value (the accepted probability of a
type 1 error) with the p-value. We want a low alpha, but the trade-off should not be weighted
too heavily toward the null hypothesis (alpha is generally not set higher). We do not know which
stock's change is bigger than the other, so the test is two-tailed: the function arguments are
(array1, array2, 2, 1), that is, two tails and test type 1 (paired). The resulting p-value was
0.2122, which means that if we rejected the null hypothesis, there would be about a 21% chance
of committing a type 1 error.
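A paired test (type 1 in the function arguments) works on the date-by-date differences between the two series. The sketch below shows only the test statistic, under assumptions: the price changes are invented values, not the Apple and IBM data from the video, and the function name paired_t is hypothetical. Turning the statistic into a p-value requires the t distribution, which this stdlib-only sketch leaves out.

```python
import math
import statistics


def paired_t(first, second):
    """Return the paired t statistic for H0: mean difference == 0."""
    if len(first) != len(second):
        raise ValueError("paired samples must have the same length")
    # Pair the observations by position (same date) and take the differences.
    diffs = [a - b for a, b in zip(first, second)]
    n = len(diffs)
    # Note: fails with ZeroDivisionError if every difference is identical.
    return statistics.mean(diffs) / (statistics.stdev(diffs) / math.sqrt(n))


# Hypothetical daily price changes for two stocks on the same dates.
stock_a_changes = [0.8, -1.2, 0.5, 1.1, -0.3]
stock_b_changes = [0.4, -0.9, 0.7, 0.2, -0.6]

print(f"paired t = {paired_t(stock_a_changes, stock_b_changes):.3f}")
```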
Independent sample
In the second example in the spreadsheet, the speaker uses the same t-test function on the
difference between the daily price changes of the two stocks. He again calculates the average
daily price change for IBM and Apple, but instead of taking a continuous run, he randomly picks
samples from among those dates and then looks at the difference between the changes. He counts
the total observations with COUNT(IBM), which gives 231, and then uses the INDEX function to
pick a random row number: INDEX(IBM, RANDBETWEEN(1, 231)). A third argument is not needed for
this usage of the formula. He applies the same formula to Apple and, after getting results,
copies the formula down to 30 observations. He then follows the same procedure of calculating
the difference, the t-test, and the p-value. In this case, there is about an 82% chance of a
type 1 error.
If we run this test one more time with different samples by pressing F9 on Windows, we get a
new sample and new results. After trying again and again, we will eventually find a sample for
which we can reject the null hypothesis (these types of tests need to be run repeatedly).
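The random selection done with INDEX and RANDBETWEEN can be mimicked as follows. This is a rough sketch under assumptions: the population is a stand-in list of 231 made-up values rather than the IBM series, and the function name draw_sample is hypothetical. Like RANDBETWEEN, random.randint includes both endpoints, and rows are drawn with replacement.

```python
import random


def draw_sample(values, size=30, rng=random):
    """Pick `size` values at random, with replacement, from `values`."""
    n = len(values)  # plays the role of COUNT(...)
    # randint(1, n) mirrors RANDBETWEEN(1, n); subtract 1 for 0-based indexing.
    return [values[rng.randint(1, n) - 1] for _ in range(size)]


# Stand-in population of 231 observations (made-up values, not stock data).
population = [round(0.01 * i - 1.15, 2) for i in range(231)]

sample = draw_sample(population, size=30)
print(len(sample), sample[:5])
```

Calling draw_sample again gives a fresh sample, which corresponds to pressing F9 to recalculate the sheet.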
F-test
In this video, the speaker starts by proposing the null (H0) and alternative (Ha) hypotheses.
The speaker then provides the sales variation data and asks a few questions, which are
answered below:
a) Justify your choice of statistical method: Since the hypothesis compares the
variation between two data sets, the best-suited statistical analysis in such a
circumstance is the F-test.
b) One-tail or two-tail test: The hypothesis asks whether the variation is equal for
both years or not. The variation in sales of 2019 could be either greater than or
less than that of 2020. Hence, a two-tail test needs to be applied in this case.
c) Select level of significance (critical value)
As per the video, the speaker proposes a critical value of 2.168 for the given
condition.
d) Find the test value
The F-test gives a value of 1.200.
e) Decision making
Since the obtained F value (1.200) is less than the critical value (2.168), we
fail to reject the null hypothesis. Therefore, there is no significant difference in the
variation of the sales data of 2019 and 2020.
f) Compute and interpret p-value
Based on the statistical analysis of the data using the F-test, the following results were obtained:
1. F-test value = 1.200
2. p-value = 0.3474
3. Multiplying the p-value by 100 to express it as a percentage gives:
0.3474 × 100 = 34.74%
Since the obtained value (34.74%) is higher than the level of significance (5%),
there is only weak evidence against the null hypothesis, and we fail to reject it.
Put differently, the chance of error if we decided to reject H0 would be 34.74%,
which is greater than the tolerance limit of 5%.
For (d) and (e), we need Microsoft Excel (MS Excel) to analyze the sales data for 2019
and 2020.
Variable 1 Range – A range is a group of cells in MS Excel. The range B2:B21
contains the data for the first group, i.e., the sales data for the year 2019. Once the range
is defined, the next task is to compute the standard deviation of each group, which
decides the order of the two samples in the F-test:
=STDEV(B2:B21) – select the 2019 sales data and press Enter
The result is 21102.06 (the standard deviation).
Variable 2 Range – The range of cells containing the data for the second group.
For the 2020 sales data: =STDEV(C2:C21) – select the 2020 sales data and press Enter
The result is 23118.83.
The standard deviation of the 2020 sales is higher than that of the 2019 sales, so:
First Sample: Sale Data 2020
Second Sample: Sale Data 2019
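The F value reported in the summary can be checked from the two standard deviations above: the test statistic is the ratio of the larger variance to the smaller one. The sketch below uses only figures quoted in this summary (21102.06, 23118.83 and the critical value 2.168); the function name f_statistic is hypothetical.

```python
SD_2019 = 21102.06   # standard deviation of 2019 sales, from the summary
SD_2020 = 23118.83   # standard deviation of 2020 sales, from the summary
F_CRITICAL = 2.168   # critical value quoted in the summary


def f_statistic(sd_a, sd_b):
    """Return the F ratio with the larger variance in the numerator."""
    var_a, var_b = sd_a ** 2, sd_b ** 2
    return max(var_a, var_b) / min(var_a, var_b)


f_value = f_statistic(SD_2019, SD_2020)
reject_h0 = f_value > F_CRITICAL
print(round(f_value, 3), reject_h0)  # prints: 1.2 False
```

The ratio rounds to 1.200, matching the test value in (d), and because it is below 2.168 the null hypothesis is not rejected, as in (e).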
After that, we calculate the critical value
Step 1) On the Data tab, in the Analysis group, click Data Analysis.
Step 2) Select F-Test Two-Sample for Variances and click OK.
Step 3) Click in the Variable 1 Range box and select the range C1:C22
Step 4) Click in the Variable 2 Range box and select the range B1:B22.
Step 5) Labels – Select this if the highlighted ranges include the group labels. Here, tick
Labels.
Step 6) Alpha – This is the significance threshold, which is set to 0.05 by default. If p < 0.05,
we can reject the null hypothesis and conclude that the result is statistically significant.
Enter 0.05 for Alpha.
Step 7) Output Range – This option lets you select an area in the current sheet where you want
the result to be placed, e.g. $P$6.
Step 8) Click OK.