Lesson - 4.1 - Hypothesis Testing - Analyze - Phase
Certification Course
Hypothesis Testing
Learning Objectives
Java Coffee House implemented an improved process at facility B, which would help reduce defects. The
desire is to compare the impact of the improved process at facility B to the standard process being
performed at facility A.
How could you confidently say that the new process is significantly better than the standard process?
Yield A Yield B
80 85
80 81
80 95
93 95
93 89
87 87
82 92
81 81
93 82
Basics
Hypothesis Testing
The average cycle time for processing similar products is the same
between two different facilities
Hypothesis Testing
Minimize subjectivity
Question assumptions
● The null hypothesis is represented as H0
● The alternate hypothesis is represented as Ha
H0 : µa = µb
Ha : µa ≠ µb
The null hypothesis states that there is no difference in means.
The alternate hypothesis states that there is a difference in means.
Hypothesis Testing
[Table: court analogy comparing the Court’s Decision with The Truth]
‘α’ is set at 0.05, which means that when the null hypothesis is true, the risk of committing a Type I error is 1 out of 20 experiments.
It is important to decide which type of error should be kept smaller and set ‘α’ and ‘β’ accordingly.
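The "1 out of 20" reading of α can be checked with a small simulation. This is a minimal sketch, not part of the course material: it assumes a two-tailed Z test with a known standard deviation of 1, a sample size of 30, and a fixed random seed, all chosen only for illustration.

```python
import random
from statistics import NormalDist, mean

ALPHA = 0.05
Z_CRIT = NormalDist().inv_cdf(1 - ALPHA / 2)  # about 1.96 for a two-tailed test


def false_rejection_rate(trials=2000, n=30, seed=1):
    """Fraction of experiments that reject a TRUE null hypothesis (mu = 0)."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(trials):
        # Draw a sample from a population where H0 really holds (mean 0, sigma 1).
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        z = mean(sample) / (1.0 / n ** 0.5)  # z statistic with known sigma = 1
        if abs(z) > Z_CRIT:
            rejections += 1  # a Type I error: H0 is true but was rejected
    return rejections / trials


rate = false_rejection_rate()
print(f"False rejection rate: {rate:.3f}")
```

The printed rate should land close to 0.05, that is, roughly one false rejection in every 20 experiments.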
Type I and Type II Errors
Power of a Test
• The probability of correctly rejecting the null hypothesis when it is false
Confidence Level
• The probability of correctly failing to reject the null hypothesis when it is true
• The complement to Type I error and represented as 1-α
• The probability of not committing a Type I error
Sample Size
To calculate the standard sample size for continuous data, the value of α is taken
as 5%.
Q The population standard deviation for the time to resolve customer problems is 30
hours. What should be the size of a sample that can estimate the average problem
resolution time within ± 5 hours tolerance with 99% confidence?
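The question above can be worked through in a few lines. This is a sketch that assumes the standard sample size formula for a mean with known standard deviation, n = (z × σ / tolerance)², since the formula on the slide itself is not reproduced in the text; the function name is hypothetical.

```python
import math
from statistics import NormalDist


def sample_size_continuous(sigma, tolerance, confidence):
    """Sample size for estimating a mean within +/- tolerance (sigma known)."""
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided critical value
    return math.ceil((z * sigma / tolerance) ** 2)  # round up to a whole unit


n = sample_size_continuous(sigma=30, tolerance=5, confidence=0.99)
print(n)  # 239
```

The result is rounded up because a fractional sample cannot be collected and rounding down would miss the stated tolerance.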
To calculate the standard sample size for discrete data, the average
population proportion of non-defective is ‘p’ and value of α is taken as 5%.
DISCRETE DATA
Where = Tolerance allowed on either side of the population proportion average in percentage.
Standard Sample Size Formula
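The formulas on this slide did not survive in the text. The standard textbook forms are given below as an assumption, not as the course's exact notation. Here Δ (an assumed symbol) is the tolerance allowed on either side, σ is the population standard deviation, p is the average population proportion non-defective, and z is about 1.96 when α is 5%.

```latex
% Continuous data (sigma known)
n = \left( \frac{z_{\alpha/2}\,\sigma}{\Delta} \right)^{2}

% Discrete data (proportion p, with Delta expressed as a proportion)
n = \left( \frac{z_{\alpha/2}}{\Delta} \right)^{2} p\,(1 - p)
```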
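The P-value is the probability of observing a difference at least as large as the one in the sample when the null hypothesis is true, that is, when only random chance or common cause variation is at work.
If the P-value is lower than the alpha risk, the null is rejected in favor of the alternative.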
If it is not likely for the test statistic to come from the population, then the null is rejected.
[Figure: Directional, Right-tail Test]
If the alternate hypothesis tests more than one direction, either less or more, use a 2-tailed probability value from the test.
Example: If Mean of A is not equal to Mean of B, then it is 2-tailed probability.
If the alternate hypothesis tests one direction, use a 1-tailed probability value from the test.
Example: If Mean of A is greater than Mean of B, then it is 1-tailed probability.
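The difference between the 1-tailed and 2-tailed probability can be sketched for a Z statistic. The test statistic of 1.80 below is hypothetical and chosen only to show that the same result can be significant in a 1-tailed test and not in a 2-tailed test at α = 0.05.

```python
from statistics import NormalDist


def p_values_from_z(z):
    """Return (right-tailed, two-tailed) p-values for a z statistic."""
    std_normal = NormalDist()
    right_tail = 1 - std_normal.cdf(z)            # Ha: Mean of A > Mean of B
    two_tail = 2 * (1 - std_normal.cdf(abs(z)))   # Ha: Mean of A != Mean of B
    return right_tail, two_tail


ALPHA = 0.05
one_tailed, two_tailed = p_values_from_z(1.80)  # hypothetical test statistic
print(f"1-tailed p = {one_tailed:.4f}, reject H0: {one_tailed < ALPHA}")
print(f"2-tailed p = {two_tailed:.4f}, reject H0: {two_tailed < ALPHA}")
```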
Hypothesis Test Conclusions
[Flowchart: choosing a hypothesis test]
Hypothesis Test
Data type: Continuous, Normal Data or Discrete Data
Branches under each data type: Mean, Variation
Mean, 1 Group: 1 Sample Z Test or 1 Sample t Test
Mean, 2 Groups: 2 Sample Z test (n1, n2 ≥ 30; σ1, σ2 known), 2 Sample t Test (n1, n2 < 30 and/or σ1, σ2 unknown), or Paired t Test
Mean, >2 Groups: 1-Way ANOVA
Hypothesis Testing Formulas
1 Sample Z test
1 Sample t test
1 Proportion test
CALCULATE TEST STATISTICS
2 Sample Z test
2 Sample t test
Paired t Test
2 Proportion test
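The formulas for the tests named on these slides are not reproduced in the text. The standard textbook forms are listed below as an assumption about what the slides showed. The 2 sample t statistic is written in its pooled form, which matches the degrees of freedom (n1 – 1) + (n2 – 1) used later in this lesson.

```latex
% 1 Sample Z test and 1 Sample t test (t has n - 1 degrees of freedom)
z = \frac{\bar{x} - \mu_0}{\sigma / \sqrt{n}}, \qquad
t = \frac{\bar{x} - \mu_0}{s / \sqrt{n}}

% 1 Proportion test
z = \frac{\hat{p} - p_0}{\sqrt{p_0 (1 - p_0) / n}}

% 2 Sample Z test
z = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\sigma_1^2 / n_1 + \sigma_2^2 / n_2}}

% 2 Sample t test (pooled variance)
t = \frac{\bar{x}_1 - \bar{x}_2}{s_p \sqrt{1/n_1 + 1/n_2}}, \qquad
s_p^2 = \frac{(n_1 - 1) s_1^2 + (n_2 - 1) s_2^2}{n_1 + n_2 - 2}

% Paired t test (d = paired differences, n pairs)
t = \frac{\bar{d}}{s_d / \sqrt{n}}

% 2 Proportion test (pooled proportion)
z = \frac{\hat{p}_1 - \hat{p}_2}{\sqrt{\hat{p} (1 - \hat{p}) (1/n_1 + 1/n_2)}}, \qquad
\hat{p} = \frac{x_1 + x_2}{n_1 + n_2}
```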
1 Sample Test
Z-Test t-test
H0: Proportion of wins in Australia or abroad is independent of the country played against
Ha: Proportion of wins in Australia or abroad is dependent on the country played against
χ2 Critical = 6.251
χ2 Calculated = 1.36
Result:
Since the calculated value is less than the critical value, the null hypothesis is not rejected: there is no evidence that the proportion of wins of the Australian hockey team depends on the country played against or the place.
1 Sample Test
Since z0.975 = 1.96, the null hypothesis is not rejected at the 5% level of significance.
Result: Based on the sample, there is no evidence that the proportion of smokers in R differs from 0.10.
Means of Two Groups
Reject H0 at level of significance α if |computed t| > t(DF, α/2); DF = (n1 – 1) + (n2 – 1) = 124 + 109 = 233
Since t(233, 0.025) = 1.97 [Excel: “=T.INV.2T(.05,233)”], the null hypothesis is rejected at the 5% level of significance.
2 Sample Test
A
H0 : σA² = σB² ; the variance of Company A’s earnings is equal to the
variance of Company B’s earnings.
Ha : σA² ≠ σB² ; the variance of Company A’s earnings is different from the
variance of Company B’s earnings.
σA² = variance of Company A’s earnings.
σB² = variance of Company B’s earnings.
F-Test Example
σA > σB. In calculating the F-test statistic, always put the greater
variance in the numerator.
F-Test Example
4.2 4
4.5 4.5
7.2 5
6.1 5.2
8.9 5.3
5.2 6.1
Conducting F-Test In MS Excel
F-Test Example
Null Hypothesis
• There is no significant statistical difference between the variances of the two groups
Alternate Hypothesis
• There is a significant statistical difference between the variances of the two groups
F 6.177076626
P(F<=f) one-tail 0.033652302
• There could be Assignable Causes of Variation or Special Causes of Variation.
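The F value in the Excel output can be reproduced from the two data columns of the F-Test Example. This is a minimal sketch that assumes the first column is Group A and the second is Group B; it computes only the F statistic, since the Python standard library has no F distribution for the p-value.

```python
from statistics import variance

group_a = [4.2, 4.5, 7.2, 6.1, 8.9, 5.2]  # assumed: first data column
group_b = [4.0, 4.5, 5.0, 5.2, 5.3, 6.1]  # assumed: second data column

var_a = variance(group_a)  # sample variance (n - 1 in the denominator)
var_b = variance(group_b)

# Put the greater variance in the numerator so that F >= 1.
f_stat = max(var_a, var_b) / min(var_a, var_b)
print(f"F = {f_stat:.6f}")
```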
Open MS Excel, click Data, and click Data Analysis.
Select 2-Sample Independent t-test assuming unequal variances.
In Variable 1 range, select the data set for Group A.
Click OK.
t-Test
4.2 4
4.5 4.5
7.2 5
6.1 5.2
8.9 5.3
5.2 6.1
t-Test
Null Hypothesis
• There is no significant statistical difference between the means of the two groups
Alternate Hypothesis
• There is a significant statistical difference between the means of the two groups
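The t statistic for the two groups listed on the t-Test slide can be sketched as follows. The sketch assumes the unequal-variance (Welch) form, in line with the Excel option "assuming unequal variances", and assumes the first data column is Group A. The resulting statistic still has to be compared with a critical t value or converted to a p-value by a statistics package.

```python
from math import sqrt
from statistics import mean, variance

group_a = [4.2, 4.5, 7.2, 6.1, 8.9, 5.2]  # assumed: first data column
group_b = [4.0, 4.5, 5.0, 5.2, 5.3, 6.1]  # assumed: second data column


def welch_t(x, y):
    """Return (t statistic, Welch-Satterthwaite degrees of freedom)."""
    vx = variance(x) / len(x)  # squared standard error of mean(x)
    vy = variance(y) / len(y)  # squared standard error of mean(y)
    t = (mean(x) - mean(y)) / sqrt(vx + vy)
    df = (vx + vy) ** 2 / (vx ** 2 / (len(x) - 1) + vy ** 2 / (len(y) - 1))
    return t, df


t_stat, df = welch_t(group_a, group_b)
print(f"t = {t_stat:.3f}, df = {df:.1f}")
```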
The paired t-test is conducted before and after the process to measure:
Customer satisfaction before and after improvements
Employee performance before and after training
Jan Feb
360 365
324 325
377 359
336 352
383 397
361 351
369 367
349 397
301 335
354 338
344 349
329 393
337 370
387 400
378 411
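The paired t statistic for the Jan and Feb columns can be sketched as below. It assumes each row is one matched pair (the same unit measured in both months) and uses the standard form t = mean(d) / (s_d / √n) on the differences.

```python
from math import sqrt
from statistics import mean, stdev

jan = [360, 324, 377, 336, 383, 361, 369, 349, 301, 354, 344, 329, 337, 387, 378]
feb = [365, 325, 359, 352, 397, 351, 367, 397, 335, 338, 349, 393, 370, 400, 411]

# Work on the per-pair differences, not on the two columns separately.
diffs = [after - before for before, after in zip(jan, feb)]
n = len(diffs)

t_stat = mean(diffs) / (stdev(diffs) / sqrt(n))
df = n - 1  # degrees of freedom for a paired t-test
print(f"mean difference = {mean(diffs):.2f}, t = {t_stat:.3f}, df = {df}")
```

Pairing removes the unit-to-unit variation, which is why the test uses the standard deviation of the differences and not the standard deviations of the two months.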
ANOVA
Q
The table shows the takeaway food delivery time of three different outlets.
Is there any evidence that the averages for the three outlets are not the same?
A
The null hypothesis will assume that the three means are equal. If the null hypothesis is rejected, it would mean that there are at least two outlets that are different in their average delivery time.
Outlet 1 Outlet 2 Outlet 3
48 50 49
49 48 48
48 36 39
53 50 49
58 50 34
50 62 33
46 45 57
50 47 48
49 51 47
47 44 39
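The one-way ANOVA F statistic for the three outlets can be computed directly from the delivery times. This is a sketch of the standard between-group and within-group sums of squares; the function name is hypothetical, and the F value still has to be compared with a critical value for the reported degrees of freedom.

```python
from statistics import mean

outlets = {
    "Outlet 1": [48, 49, 48, 53, 58, 50, 46, 50, 49, 47],
    "Outlet 2": [50, 48, 36, 50, 50, 62, 45, 47, 51, 44],
    "Outlet 3": [49, 48, 39, 49, 34, 33, 57, 48, 47, 39],
}


def one_way_anova_f(groups):
    """Return (F statistic, df between, df within) for a one-way ANOVA."""
    all_values = [v for g in groups for v in g]
    grand_mean = mean(all_values)
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of each observation around its own group mean.
    ss_within = sum((v - mean(g)) ** 2 for g in groups for v in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


f_stat, df_between, df_within = one_way_anova_f(list(outlets.values()))
print(f"F = {f_stat:.3f} with ({df_between}, {df_within}) degrees of freedom")
```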
ANOVA
Enter the data in an Excel spreadsheet.
Select the ANOVA – single factor test from the Data Analysis “ToolPak”.
Select all the cells for analysis.
Chi-Square Distribution
There is a different Chi-square distribution for each of the different numbers of degrees of freedom.
For Chi-square distribution, degrees of freedom are calculated according to the number of rows and
columns in the contingency table.
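The degrees of freedom and the Chi-square statistic for a contingency table can be sketched as follows. The counts below are hypothetical and are not taken from the course examples; the degrees of freedom follow the usual rule (rows – 1) × (columns – 1).

```python
def chi_square_independence(table):
    """Return (chi-square statistic, degrees of freedom) for a contingency table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if rows and columns were independent.
            expected = row_totals[i] * col_totals[j] / grand_total
            stat += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return stat, df


# Hypothetical 2 x 4 table of counts (NOT from the course material).
observed = [
    [12, 9, 14, 10],
    [8, 11, 9, 7],
]
stat, df = chi_square_independence(observed)
print(f"chi-square = {stat:.3f}, df = {df}")
```

The calculated value is then compared with the critical Chi-square value for that number of degrees of freedom.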
Hypothesis Testing with Non-normal Data
Non-Parametric Test
Non-parametric tests do not make any assumptions about a distribution model where the
data could fit.
Non-parametric tests compare groups of medians using the relative ranks of the data
within the groups.
Non-Parametric Test
Non-parametric Test | Main Characteristics | Corresponding Parametric Tests
1 sample Sign test | Test on the median, for non-symmetric distribution | 1 sample t or Z test
Mann and Whitney test | Test on ranks to compare center of 2 groups | 2 samples t or Z test
Friedman Test | Test on ranks, based on the Chi Squared distribution | Two-way randomized ANOVA
Mann-Whitney test is a non-parametric test used to compare the center between two
unpaired groups
MANN-WHITNEY TEST—DEFINITION
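The Mann-Whitney U statistic can be sketched by counting, over every pair of observations from the two groups, how often a value from one group exceeds a value from the other. The data below reuse the two columns from the F-Test Example purely for illustration; the course does not apply the Mann-Whitney test to them, and the group labels are assumptions.

```python
def mann_whitney_u(x, y):
    """Return (U_x, U_y); a tie between the groups counts as half a win."""
    u_x = 0.0
    for a in x:
        for b in y:
            if a > b:
                u_x += 1.0
            elif a == b:
                u_x += 0.5
    u_y = len(x) * len(y) - u_x  # the two U values always sum to n_x * n_y
    return u_x, u_y


group_a = [4.2, 4.5, 7.2, 6.1, 8.9, 5.2]  # illustrative data only
group_b = [4.0, 4.5, 5.0, 5.2, 5.3, 6.1]  # illustrative data only

u_a, u_b = mann_whitney_u(group_a, group_b)
print(f"U_A = {u_a}, U_B = {u_b}, test statistic = {min(u_a, u_b)}")
```

The smaller of the two U values is the one usually compared with the critical value from a Mann-Whitney table.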
Non-Parametric Test
The calculated value is not equal to or less than 2. Therefore, there is no statistical difference between
the means of the two groups.
Non-Parametric Test
The Kruskal-Wallis test is used for testing whether the samples originate from the same distribution.
Example: Testing the ratings of a product from three different groups to see if the ratings are the
same or different
Non-Parametric Test
The Mood’s Median is a non-parametric test used to test the equality of medians from two
or more different populations. The test works if:
• The output (Y) variable is continuous, discrete-ordinal, or discrete count.
• The input (X) variable is discrete with two or more attributes.
Find Chi-square value
Compare Chi-square value to critical Chi-Square value
Example:
To determine whether temperature changes in the ocean water near a nuclear power plant will have a
significant effect on the animal life in the region, an environmental group places groups of fish in four bowls that
are identical except for water temperature. Six months later, they measure the weights of the fish.
Non-Parametric Test
The Friedman Test is a form of non-parametric test that does not make any
assumptions on the origin of the sample.
A marketing company wants to compare the relative effectiveness of three different modes of advertising:
The company conducts a randomized block design experiment. For 14 customers, the marketing company
used all three modes during a 1-year period and recorded the percentage response to each type of
advertising.
Non-Parametric Test
HR of a large company analyzes its payroll to determine whether the company's median
salary differs from the industry average.
Non-Parametric Test
It is equivalent to 1 Sample t-Test and is more powerful than 1 Sample sign test.
It is used to estimate the population median and compare it to a target or reference value.
The median customer satisfaction score of an organization has always been 3.7.
Management wants to see if it has changed. They conducted a survey and got the results
grouped by the customer type
Conclusion:
⮚ If median = 3.7 then Fail to Reject Null
⮚ If median ≠ 3.7 then Reject Null
Key Takeaways
In deciding to reject or not reject the null hypothesis, we can make two
possible decision errors—Type I and Type II errors.
To calculate the standard sample size for discrete data, the average
population proportion non-defective is ‘p’ and value of α is taken as 5%.
The P-value is the probability of observing a difference at least as large as the one in the sample
when the null hypothesis is true, that is, when only random chance or common cause variation is at work.
Alpha value is 10%. Since confidence level is 90% and p is smaller than the alpha value, the null hypothesis is rejected
Knowledge Check 2
An Assembly team desired to see if there was any performance improvement after completing a Six Sigma project. Which hypothesis test could be used?
A. F-test
D. Paired t-test
To see if a process has improved, a paired t-test should be used to compare the before and after improvement state.
Knowledge Check 3
Which non-parametric test is similar to a single factor ANOVA?
A. Sample Sign
C. Mood’s Median
D. Friedman’s Test
Mood’s median is similar to a single factor ANOVA. It is able to test for the difference in medians for more than 2 groups.
Knowledge Check 4
A team wants to test if a new drug reduced pain in the patients. What would be the Type II error?
B. The new drug does not work and team concludes it works
C. The new drug really works and team concludes it does not work
D. The new drug does not work and the team concludes it does not work
A Type II error is a failure to reject the null hypothesis when it is false. Here the null hypothesis is that the drug does not
cause a difference in pain levels, so a Type II error occurs when the drug really works and the team concludes it does not.
Knowledge Check 5
Which hypothesis test is used to compare the variance for two groups with normal data?
A. Z Test
B. F Test
C. t-Test
D. χ2 Test
The F test is used to compare variance for two or more groups with normal data.
Knowledge Check 6
The population standard deviation for the time to resolve customer problems is 20 hours.
What should be the size of a sample that can estimate the average problem resolution time
within ± 2 hours tolerance with 95% confidence?
A. 385
B. 384
C. 386
D. 400
The p-value indicates failure to reject the null and the test statistic indicates rejection of the null. Therefore, there is a
discrepancy because both methods should always lead to the same conclusion.
Knowledge Check 8
Which non-parametric test is similar to a 1 sample t test?
A. Friedman
B. Kruskal-Wallis
D. Mood’s Median
The Wilcoxon Signed Rank test is similar to a 1 sample t test. It is also known as the 1 Sample Wilcoxon Test.