Module 5 Quiz Rev

Uploaded by

Justin Pimentel
Learning Objectives:

At the end of the discussion, the student should be able to:

• Define the Chi-square distribution.
• Describe how the shape of the Chi-square distribution changes as its degrees of freedom increase.
• Determine the appropriate use of the chi-square statistic.
• Understand the different uses of the chi-square test.
• Test a distribution for goodness of fit using the chi-square test.
Video clip on Chi-square:

https://www.youtube.com/watch?v=7_cs1YlZoug
The Chi-square Distribution
The Chi-square distribution is a continuous probability distribution. It is the distribution of a sum
of the squares of k independent standard normal random variables.
As the degrees of freedom increase, the Chi-square distribution approaches a normal distribution.
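The definition can be checked with a small simulation. The sketch below is illustrative only: the function names, seed, and number of draws are assumptions, not part of the module. It builds chi-square values as sums of k squared standard normal values, then prints the sample mean (which should be near k) and the sample skewness (which should shrink as k grows).

```python
import random

def simulate_chi_square(k, n_draws=10000, seed=0):
    """Simulate chi-square values with k degrees of freedom (illustrative helper)."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    # Each value is the sum of squares of k independent standard normal draws.
    return [sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(k)) for _ in range(n_draws)]

def sample_skewness(values):
    """Moment-based sample skewness; values near 0 indicate a symmetric shape."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5

for k in (1, 5, 30):
    values = simulate_chi_square(k)
    mean = sum(values) / len(values)
    print(f"df = {k:2d}: mean = {mean:.2f}, skewness = {sample_skewness(values):.2f}")
```

The skewness stays positive but moves toward zero as the degrees of freedom increase, which is the sense in which the curve approaches a normal shape.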
Characteristics of a Chi-square Distribution:

1. The chi-square distribution is a family of curves
based on the degrees of freedom.
2. The chi-square distributions are positively skewed.
3. All chi-square values are greater than or equal to
zero.
4. The total area under each chi-square distribution
is equal to one.
General Assumptions of a Chi-square
Distribution:
1. The sample was chosen using a random sampling
method.
2. The variables being analyzed are categorical
(nominal or ordinal).
3. All chi-square values are greater than or equal to
zero.
4. The total area under each chi-square distribution is
equal to one.
The Chi-square Distribution can be used to
• find a confidence interval for a variance or standard deviation;
• test a hypothesis about a single variance or standard deviation;
• test hypotheses concerning frequency distributions;
• test goodness of fit;
• test for independence of two categorical variables;
• test the homogeneity of proportions;
• test the normality of a variable.
Chi-square Test:
A chi-square test (or chi-squared test), denoted by χ², is a statistical hypothesis test
▪ used to investigate whether distributions of categorical variables (at the nominal or ordinal
levels of measurement) significantly differ from one another.
▪ commonly used to compare observed data (actual value) with data we would expect (expected
value) to obtain according to a specific hypothesis.
▪ used to test information about the proportion or percentage of people or things who fit into a
category.
Types of Chi-square Tests:
Chi-square Goodness-of-Fit Test
The chi-square goodness-of-fit test
▪ is used if we would like to see whether the distribution of data follows a specific pattern.
For example:
• You would like to see whether the values obtained from an actual observation of the monthly dividend on stocks differ considerably from the expected values.
• You may want to investigate whether the fluctuation of interest rates on Sundays is higher than on the rest of the days of the week.
Chi-square Test of Independence
A chi-square test
▪ can be used to test the independence of two variables.
▪ is used when we would like to see
o whether or not two random variables take their values independently.
o whether the value of one relates to another.
o whether one variable is associated with another.
▪ uses the chi-square distribution and the contingency table.
For example,
• Based on the distribution of data, you want to see whether the success of an individual in his chosen career is independent of, or related to, his academic performance in college. Here, the two variables involved are the success of an individual in his chosen career and his academic performance in college.
• You may want to see whether the life in years of laptops is independent of brand. Here, the two variables involved are the life in years of the laptop and the brand of the laptop.
• A study aims to determine whether job satisfaction is associated with income. The two variables are job satisfaction and income.
Chi-square Test for Homogeneity of Proportions

A chi-square test
▪ can also be used to test the homogeneity of proportions.
▪ is used to determine whether the proportions for a variable are equal when several samples are selected from different populations.
▪ also uses the chi-square distribution and the contingency table.
For example,
• You would like to see if the proportions of students who play online games are equal across programs, say the proportions of accountancy students, engineering students, and architecture students who play online games.
• You may want to see if the proportions of employees who invest in the stock market are equal across professions (IT, Medicine, Accounting, Engineering).
The two main types of Chi-square Tests to be discussed here are:
• Goodness-of-fit tests, which focus on one categorical variable.
• Tests of independence, which focus on the relationship between two categorical variables. For these, the contingency table (or cross-tabulation table) will be used to present the data values.
To illustrate the use of chi-square test:
If, according to Mendel's laws, you expect 10 of 20 offspring to
be male and the actual observed number was 8 males, then you
might want to know about the "goodness-of-fit" between the
observed and expected data.
Were the deviations (differences between observed and expected
value) the result of chance, or were they due to other factors?
How much deviation can occur before we conclude that
something other than chance is at work, causing the observed to
differ from the expected value?
The chi-square test is always testing what scientists call the
null hypothesis, which states that there is no significant
difference between the expected and observed result.
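The offspring illustration can be turned into a short calculation. This is a minimal sketch: the 12 females are implied by the text (20 offspring minus 8 males), and the 0.05 significance level with its table value 3.841 for one degree of freedom is an assumption chosen for illustration. It uses the chi-square statistic, the sum of (O − E)² / E over the categories.

```python
# Observed and expected counts for the two categories (male, female).
observed = [8, 12]    # 12 females is implied: 20 offspring - 8 males
expected = [10, 10]   # the text's expectation of 10 males out of 20

# Chi-square statistic: sum of (O - E)^2 / E over the categories.
statistic = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

critical_value = 3.841  # chi-square table, df = 1, alpha = 0.05 (assumed level)
print(statistic)                    # 0.8
print(statistic > critical_value)   # False: do not reject the null hypothesis
```

At this assumed level, a deviation of two offspring is small enough to be attributed to chance.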
Test for Goodness-of-Fit
Definition:
The chi-square goodness-of-fit test is used to test the
claim that an observed frequency distribution fits some
given expected frequency distribution.

Assumptions of Chi-square Goodness-of-Fit Test:


1. The data are obtained from a random sample.
2. The expected frequency for each category must be 5
or more.
Test Of Goodness-of-Fit
• If the observed frequencies are close to the corresponding expected frequencies, the χ²-value will be small, indicating a good fit.
• If the observed frequencies differ considerably from the expected frequencies, the χ²-value will be large and the fit is poor.
• A good fit leads to the non-rejection of H0, whereas a poor fit leads to its rejection.

Test Statistic, χ²
To test the null hypothesis, the following formula will be used:
$$\chi^2 = \sum \frac{(O - E)^2}{E}$$
Where: O = the observed frequency
E = the expected frequency
df = k – 1, the degrees of freedom, where k is the number of categories
n = total number of observations
To calculate the expected frequencies, there are two rules to follow:
1. If all the expected frequencies are equal, the expected frequency E can be calculated by using $E = n/k$, where n is the total number of observations and k is the number of categories.
2. If the expected frequencies are not all equal, then the expected frequency E can be calculated by $E = n \cdot p$, where n is the total number of observations and p is the probability for that category (that is, the hypothesized proportion from the null hypothesis).
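The two rules for expected frequencies can be sketched in Python as follows. The function names are illustrative assumptions. The inputs are taken from the worked examples in this module (45 observations over 3 categories, and 200 orders with the monthly proportions 17%, 11%, 8%, 14%, 27%, 23%).

```python
def expected_equal(n, k):
    """Rule 1: equal expected frequencies, E = n / k for each of k categories."""
    return [n / k] * k

def expected_from_proportions(n, proportions):
    """Rule 2: unequal expected frequencies, E = n * p for each category."""
    if abs(sum(proportions) - 1.0) > 1e-9:
        raise ValueError("hypothesized proportions must add up to 1")
    return [n * p for p in proportions]

print(expected_equal(45, 3))  # [15.0, 15.0, 15.0]

# Values print rounded because n * p is computed in floating point.
monthly = expected_from_proportions(200, [0.17, 0.11, 0.08, 0.14, 0.27, 0.23])
print([round(e, 2) for e in monthly])
```

Either way, the expected frequencies should add up to n, which is a quick check before computing the test value.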
Consider, for example, a quality control officer of a laptop manufacturing company who would like to see if there is a difference in the life span of laptop batteries among three categories. A sample of 45 student laptop owners is selected. The table below shows the distribution of the life span of laptop batteries in years. If there were no difference, you would expect 45/3 = 15 batteries in each category.

Category             4 years and below   More than 4 years and below 10 years   Above 10 years
Observed frequency   12                  19                                     14

The observed frequencies will almost always differ from the expected frequencies due to sampling error; that is, the values differ from sample to sample. But the question is: Are these differences significant (which means there is a real difference in the life span of the batteries among the categories), or are they due to chance only? Thus, two opposing statements, the null and alternative hypotheses, are necessary before computing the test value. Here, the null hypothesis indicates that there is no difference or change among the categories.
Ho: There is no difference in the life span of laptop batteries among the three categories.
H1: There is a difference in the life span of laptop batteries among the three categories.
Summary Procedures in conducting Chi-Squared Goodness-of-Fit Test:

Step 1: State the hypothesis and identify the claim.


Step 2: Find the critical value from the chi-square table. The test is always right-tailed.
Step 3: Compute the test value using the formula
$$\chi^2 = \sum \frac{(O - E)^2}{E}$$
Step 4: Make the decision.
Reject the null hypothesis if the test value is greater than the critical
value.
Do not reject the null hypothesis if the test value is less than the critical
value.
Step 5: Summarize the results.
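Steps 3 and 4 can be sketched as a small Python function. The function name and return shape are illustrative assumptions, and the critical value is passed in as read from a chi-square table rather than computed. The data are the laptop-battery counts from this module, with the table value 5.991 for df = 2 at α = 0.05.

```python
def chi_square_gof(observed, expected, critical_value):
    """Return the goodness-of-fit test value and the right-tailed decision."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    # Step 3: test value, the sum of (O - E)^2 / E over all categories.
    statistic = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Step 4: reject H0 only when the test value exceeds the critical value.
    reject = statistic > critical_value
    return statistic, reject

observed = [12, 19, 14]   # laptop-battery counts per life-span category
expected = [15, 15, 15]   # E = n / k = 45 / 3
statistic, reject = chi_square_gof(observed, expected, critical_value=5.991)
print(round(statistic, 2), reject)  # 1.73 False
```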
Example 1:
A quality control officer of a laptop manufacturing company would like to see if the life span of laptop batteries is equally distributed among three categories. A sample of 45 student laptop owners is selected. The table below shows the distribution of the life span of laptop batteries in years. At α = 0.05, can it be considered that the life span of laptop batteries is equally distributed among the three categories?

Category             4 years and below   More than 4 years and below 10 years   Above 10 years
Observed frequency   12                  19                                     14

Note that this problem involves only one categorical variable, the life span of laptop batteries classified into
three (4 years and below, more than 4 years and below 10 years, above 10 years), so we use the
goodness-of-fit-test.
Solution:
Step 1: State the hypotheses and identify the claim.
Ho: The ages of laptop batteries are equally distributed over the three
categories. (claim)
(Which is the same as saying that, “There is no difference in
the lifespan of laptop batteries in the three categories.”)
H1: The ages of laptop batteries are NOT equally distributed.
(Which is the same as saying that, “There is difference in the
lifespan of laptop batteries in the three categories.”)
Step 2: Find the critical value. At α = 0.05 and df = 3-1 = 2, locate the
critical value from the chi-square table. Thus, the critical value
is 5.991.
Step 3: Compute the test value
To compute the test value, we solve first for the expected value E.
$$E = \frac{n}{k} = \frac{45}{3} = 15$$

Category             4 years and below   More than 4 years and below 10 years   Above 10 years
Observed frequency   12                  19                                     14
Expected frequency   15                  15                                     15

Then the test value χ² is
$$\chi^2 = \sum \frac{(O - E)^2}{E} = \frac{(12 - 15)^2}{15} + \frac{(19 - 15)^2}{15} + \frac{(14 - 15)^2}{15} = 1.73$$
χ² = 1.73 (test value, computed value, or test statistic)

Step 4: Make the decision. Do not reject the null hypothesis, since the test value
1.73 is less than the critical value 5.991 (1.73 < 5.991)

Step 5: Summarize the results. There is no difference in the ages of laptop batteries over the three categories. The life span of laptop batteries is equally distributed.
To illustrate the goodness-of-fit test, let us analyze the charts showing the
graphs of the observed values and the expected values of different data sets.
From the charts below, you could see whether the observed values and the
expected values are close together or far apart.
[Figure: three bar charts, (A) Laptop Batteries, (B), and (C), each plotting observed frequency and expected frequency against categories 1 to 3.]

From (A), the observed values and the expected values are close together, indicating that the chi-square test value will be small. The decision will be “do not reject the null hypothesis”; hence, there is “a good fit”.
From (B), the observed values and the expected values are far apart, so the chi-square test value will be large. Then “the null hypothesis will be rejected”; hence, there is “not a good fit”.
From (C), the observed values and the expected values are far apart, so the chi-square test value will be large. Then “the null hypothesis will be rejected”; hence, there is “not a good fit”.
Example 2:
A financial analyst wants to determine whether investors have any preference on the type of investment. A sample of 93 investors was interviewed and provided the information shown in the table below. At the 0.10 level of significance, is there a difference in investment preferences among the investors?

Types of Investment   Frequency
Stocks                35
Mutual Funds          18
Bonds                 30
Index Funds           10

Note that this problem involves only one categorical variable, the types of investment classified into four
(stocks, mutual funds, bonds, index funds), so we use the goodness-of-fit-test.
Solution:
Step 1: State the hypotheses and identify the claim.
Ho: Investors show no preferences.
(Which is the same as saying that, “There is no difference in the
preferences on the type of investment among investors.”)
H1: Investors show preferences. (claim)
(Which is the same as saying that, “There is difference in the
preferences on the type of investment among investors.”)
Step 2: Find the critical value. At α = 0.10 and df = 4-1 = 3, locate the
critical value from the chi-square table. Thus, the critical value
is 6.251.
Step 3: Compute the test value
To compute the test value, we solve first for the expected value E. E is kept unrounded so that the expected frequencies still add up to n = 93.
$$E = \frac{n}{k} = \frac{93}{4} = 23.25$$

Types of Investment   Observed Frequency   Expected Frequency
Stocks                35                   23.25
Mutual Funds          18                   23.25
Bonds                 30                   23.25
Index Funds           10                   23.25

Then the test value χ² is
$$\chi^2 = \sum \frac{(O - E)^2}{E} = \frac{(35 - 23.25)^2}{23.25} + \frac{(18 - 23.25)^2}{23.25} + \frac{(30 - 23.25)^2}{23.25} + \frac{(10 - 23.25)^2}{23.25} = 16.63$$
χ² = 16.63 (test value, computed value, or test statistic)

Step 4: Make the decision. Reject the null hypothesis, since the test value 16.63
is greater than the critical value 6.251 (16.63 > 6.251).

Step 5: Summarize the results. There is difference in the preferences on the type
of investment among investors. The investors in fact show preferences.
Example 3:
An article shows statistics of orders made online for a particular product with different online stores within the city. The data are based on the last six months of the previous year, as follows: July 17%, August 11%, September 8%, October 14%, November 27%, and December 23%. The CECT online store manager wants to compare the orders made with his store with the data reported in the article. The manager listed the number of orders in his store for the same product stated in the article. The table below shows the data collected by the manager for the last six months of the previous year. At the 0.01 level of significance, can we support the claim that the proportions of orders with the CECT online store are the same as those of the rest of the online stores within the city?

Months      Number of Orders made with CECT store
July        27
August      17
September   22
October     45
November    30
December    59

Note that this problem involves only one categorical variable, months covered in a year, so we use the
goodness-of-fit-test.
Solution:
Step 1: State the hypotheses and identify the claim.
Ho: The orders made for a particular product in different online stores within the city for the last six months of the year are distributed as follows: July 17%, August 11%, September 8%, October 14%, November 27%, and December 23%. (claim)
(or “There is no difference between the orders made with the CECT online store and those of the rest of the online stores within the city.”)
H1: The distribution is not the same as stated in the null hypothesis.
(or “There is a difference between the orders made with the CECT online store and those of the rest of the online stores within the city.”)
Step 2: Find the critical value. At α = 0.01 and df = 6-1 = 5, locate the critical value from the chi-square table. Thus, the critical value is 15.086.
Step 3: Compute the test value
Months      Number of Orders made with CECT store (O)   p     E = np
July        27                                          17%   (200)(0.17) = 34
August      17                                          11%   (200)(0.11) = 22
September   22                                          8%    (200)(0.08) = 16
October     45                                          14%   (200)(0.14) = 28
November    30                                          27%   (200)(0.27) = 54
December    59                                          23%   (200)(0.23) = 46

Then the test value χ² is
$$\chi^2 = \sum \frac{(O - E)^2}{E} = \frac{(27 - 34)^2}{34} + \frac{(17 - 22)^2}{22} + \frac{(22 - 16)^2}{16} + \frac{(45 - 28)^2}{28} + \frac{(30 - 54)^2}{54} + \frac{(59 - 46)^2}{46} = 29.49$$
χ² = 29.49 (test value, computed value, or test statistic)

Step 4: Make the decision. Reject the null hypothesis, since the test value 29.49 is
greater than the critical value 15.086 (29.49 > 15.086).

Step 5: Summarize the results. There is a significant difference between the orders made with the CECT online store and those of the rest of the online stores within the city.
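To see which months drive the result in Example 3, the per-category terms of the test value can be listed separately. This is a minimal sketch using the observed counts and hypothesized proportions from the example. The variable names are illustrative assumptions.

```python
months = ["July", "August", "September", "October", "November", "December"]
observed = [27, 17, 22, 45, 30, 59]
proportions = [0.17, 0.11, 0.08, 0.14, 0.27, 0.23]  # from the null hypothesis

n = sum(observed)                       # total number of observations
expected = [n * p for p in proportions] # E = n * p for each month

# Each month's contribution (O - E)^2 / E to the chi-square test value.
contributions = [(o - e) ** 2 / e for o, e in zip(observed, expected)]
statistic = sum(contributions)

for month, term in zip(months, contributions):
    print(f"{month:<10} {term:6.2f}")
print(f"chi-square = {statistic:.2f}")  # chi-square = 29.49
```

October and November account for most of the test value, so the store's orders depart from the article's distribution mainly in those two months.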
Exercise 1:

A chef of a fine dining restaurant wants to determine whether customers have any preference among five flavors of ice cream as toppings in their special dessert. A sample of 100 people provided the following data. At the 0.10 level of significance, is there a difference in the flavor preferences among the customers?
Exercise 2:

An operations manager would like to see whether the production of the different parts (A, B, C, D) of a certain electronic equipment on different machines (a laser designing machine for part A, a laser engraving machine for part B, a solid filling machine for part C, and a pressing machine for part D) is in the ratio 2:2:5:1 per day. A randomly selected day is inspected to see if the production of these parts is in the ratio 2:2:5:1. The manager recorded that a total of 900 pieces of these parts consisted of 200 pieces of part A, 165 pieces of part B, 468 pieces of part C, and 67 pieces of part D. At the 0.01 level of significance, test the hypothesis that the machines have produced these parts in the ratio 2:2:5:1.
References:
• Statistical Analysis with Software Applications. Philippines: McGraw-Hill Education (2019).
• Bluman, G. (2010). Elementary Statistics: A Step by Step Approach, A Brief Version, 5th Edition. New York: McGraw-Hill Companies.
• https://tophat.com/marketplace/science-&-math/statistics/full-course/statistics-for-social-science-stephen-hayward/211/34398/
• http://onlinestatbook.com/2/chi_square/distribution.html
• https://www.youtube.com/watch?v=7_cs1YlZoug
• http://www.z-table.com/chi-square-table.html
Learning Objective:

At the end of the discussion, the student should be able to:

• Test the variables for independence using chi-square.
Test for Independence
Definition:
The chi-square independence test is used to test whether two variables are independent of each
other.

Assumptions of Chi-square Independence test:


1. The data are obtained from a random sample.
2. The expected value in each cell must be 5 or more. If the
expected values are not 5 or more, combine categories.
Test for Independence (CATEGORICAL DATA)

The chi-squared test procedure can also be used to test the hypothesis of independence of two variables of classification.
A contingency table with r rows and c columns is referred to as an r × c table.
$$\chi^2 = \sum \frac{(O - E)^2}{E}, \qquad E = \frac{(\text{column sum}) \times (\text{row sum})}{\text{grand total}}$$

The expected frequency E is computed by multiplying the subtotals of the intersecting categories, then dividing the product by the grand total.
Summary Procedures in conducting Independence Test:
Step 1: State the hypothesis and identify the claim.
Step 2: Find the critical value for the right tail using the chi-square table. Determine the
degrees of freedom using the formula df = (r – 1) (c – 1).
Step 3: Compute the test value. To compute the test value, first find the expected values. For each cell of the contingency table, use the formula
$$E = \frac{(\text{column sum}) \times (\text{row sum})}{\text{grand total}}$$
to get the expected value. To find the test value, use the formula
$$\chi^2 = \sum \frac{(O - E)^2}{E}$$
Step 4: Make the decision.
Reject the null hypothesis if the test value is greater than the critical value.
Do not reject the null hypothesis if the test value is less than the critical value.
Step 5: Summarize the results.
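The procedure above can be sketched in Python for a contingency table stored as a list of rows. The function names are illustrative assumptions, and the critical value is supplied from the chi-square table rather than computed. The sample data are the memory-recall counts used in Example 2 of this module, with the table value 9.210 for df = 2 at α = 0.01.

```python
def expected_frequencies(table):
    """E = (row sum)(column sum) / grand total for every cell."""
    row_sums = [sum(row) for row in table]
    col_sums = [sum(col) for col in zip(*table)]
    grand_total = sum(row_sums)
    return [[r * c / grand_total for c in col_sums] for r in row_sums]

def chi_square_independence(table, critical_value):
    """Return the test value, degrees of freedom, and right-tailed decision."""
    expected = expected_frequencies(table)
    statistic = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(table, expected)
        for o, e in zip(obs_row, exp_row)
    )
    df = (len(table) - 1) * (len(table[0]) - 1)  # df = (r - 1)(c - 1)
    return statistic, df, statistic > critical_value

# Rows: sensory, short-term, long-term memory; columns: <= 2 hours, > 2 hours.
table = [[43, 62], [65, 55], [88, 25]]
statistic, df, reject = chi_square_independence(table, critical_value=9.210)
print(f"chi-square = {statistic:.2f}, df = {df}, reject H0: {reject}")
```

Because the expected values are not rounded to two decimals here, the test value may differ in the second decimal place from a hand computation that rounds each E first.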
When there is only one degree of freedom (that is, when a 2 × 2 contingency table is given), Yates's correction for continuity is applied by reducing the absolute value of each difference by 0.5 before squaring. Hence, the formula to use is
$$\chi^2 = \sum \frac{(|O - E| - 0.5)^2}{E}$$
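A minimal sketch of the continuity-corrected statistic for a 2 × 2 table follows. The function name is an illustrative assumption, and the counts are hypothetical, chosen only so that the arithmetic is easy to follow. They do not come from the module.

```python
def chi_square_yates(table):
    """Chi-square test value with Yates's correction; 2 x 2 tables only."""
    if len(table) != 2 or any(len(row) != 2 for row in table):
        raise ValueError("Yates's correction applies to 2 x 2 tables")
    row_sums = [sum(row) for row in table]
    col_sums = [sum(col) for col in zip(*table)]
    grand_total = sum(row_sums)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, o in enumerate(row):
            e = row_sums[i] * col_sums[j] / grand_total
            # Reduce |O - E| by 0.5 before squaring.
            statistic += (abs(o - e) - 0.5) ** 2 / e
    return statistic

hypothetical = [[20, 30], [30, 20]]  # made-up counts; every E equals 25
print(f"{chi_square_yates(hypothetical):.2f}")  # 3.24
```

Without the correction, the same table gives 4 × (5² / 25) = 4.00, so the correction makes the test more conservative.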
Example 1:
An education analyst wishes to see whether the academic achievement a person has completed is related to his or her socioeconomic status (SES). A sample of 88 people is randomly selected. At α = 0.05, can it be concluded that a person’s academic achievement is dependent on the person’s socioeconomic status?

Academic Achievement   Mass   Middle Class   Elite   TOTAL
Up to High School      15     12             8       35
College Degree         8      15             9       32
Advanced Degree        6      8              7       21
TOTAL                  29     35             24      88
(columns: SES)

Note that this problem involves two categorical variables, the academic achievement a person has
completed and his or her socio economic status, so we use the independence test.
Solution:
Step 1: State the hypotheses and identify the claim.
Ho: The academic achievement a person has completed is independent of his or her socioeconomic status.
(Which is the same as saying that, “A person’s academic achievement is not related to his or her socioeconomic status.”)
H1: The academic achievement a person has completed is dependent on his or her socioeconomic status. (claim)
(Which is the same as saying that, “A person’s academic achievement is related to his or her socioeconomic status.”)
Step 2: Find the critical value. At α = 0.05 and df = (3-1)(3-1) = 4,
locate the critical value from the chi-square table. Thus, the
critical value is 9.488.
Step 3: Compute the test value. To compute the test value, first find the expected values. For each cell of the contingency table, use the formula
$$E = \frac{(\text{column sum}) \times (\text{row sum})}{\text{grand total}}$$

The results are shown in the following table (O = observed, E = expected; columns are SES):

Academic Achievement   Mass O   Mass E   Middle Class O   Middle Class E   Elite O   Elite E   TOTAL
High School            15       11.53    12               13.92            8         9.55      35
College Degree         8        10.55    15               12.73            9         8.73      32
Advanced Degree        6        6.92     8                8.35             7         5.73      21
TOTAL                  29                35                                24                  88

To find the test value, use the formula
$$\chi^2 = \sum \frac{(O - E)^2}{E}$$
$$\chi^2 = \frac{(15 - 11.53)^2}{11.53} + \frac{(12 - 13.92)^2}{13.92} + \frac{(8 - 9.55)^2}{9.55} + \frac{(8 - 10.55)^2}{10.55} + \frac{(15 - 12.73)^2}{12.73} + \frac{(9 - 8.73)^2}{8.73} + \frac{(6 - 6.92)^2}{6.92} + \frac{(8 - 8.35)^2}{8.35} + \frac{(7 - 5.73)^2}{5.73} = 3.009$$

Step 4: Make the decision. Do not reject the null hypothesis, since the test
value 3.009 is less than the critical value 9.488 (3.009 < 9.488).
Step 5: Summarize the results. There is not enough evidence to support the claim that the academic achievement a person has completed is dependent on his or her socioeconomic status.
The academic achievement a person has completed is independent of his or her socioeconomic status.
Example 2:
A study was conducted to see if there was a relationship between memory recall and the length of gadget usage per day of children. A sample of 338 grade level pupils is randomly selected and the results are shown in the table below. At the α = 0.01 level of significance, can it be assumed that memory recall and the length of gadget usage per day of children are dependent?

Type of Memory Recall   2 hours and below   Above 2 hours   TOTAL
Sensory Memory          43                  62              105
Short-term Memory       65                  55              120
Long-Term Memory        88                  25              113
TOTAL                   196                 142             338
(columns: gadget usage per day)
Note that this problem involves two categorical variables, the memory recall and the length of gadget usage
per day of children, so we use the independence test.
Solution:
Step 1: State the hypotheses and identify the claim.
Ho: The memory recall is independent of the length of gadget usage
per day of the children.
H1: The memory recall is dependent on the length of gadget usage
per day of the children. (claim)

Step 2: Find the critical value. At α = 0.01 and df = (3-1)(2-1) = 2, locate the critical value from the chi-square table. Thus, the critical value is 9.210.
Step 3: Compute the test value. To compute the test value, first find the expected values. For each cell of the contingency table, use the formula
$$E = \frac{(\text{column sum}) \times (\text{row sum})}{\text{grand total}}$$

The results are shown in the following table (O = observed, E = expected; columns are gadget usage per day):

Type of Memory Recall   2 hours and below O   2 hours and below E   Above 2 hours O   Above 2 hours E   TOTAL
Sensory Memory          43                    60.89                 62                44.11             105
Short-term Memory       65                    69.59                 55                50.41             120
Long-Term Memory        88                    65.53                 25                47.47             113
TOTAL                   196                                         142                                 338

To find the test value, use the formula
$$\chi^2 = \sum \frac{(O - E)^2}{E}$$
$$\chi^2 = \frac{(43 - 60.89)^2}{60.89} + \frac{(65 - 69.59)^2}{69.59} + \frac{(88 - 65.53)^2}{65.53} + \frac{(62 - 44.11)^2}{44.11} + \frac{(55 - 50.41)^2}{50.41} + \frac{(25 - 47.47)^2}{47.47} = 31.58$$

Step 4: Make the decision. Reject the null hypothesis, since the test value
31.58 is greater than the critical value 9.210 (31.58 > 9.210). The
test value lies within the critical region.
Step 5: Summarize the results. There is enough evidence to support the claim that memory recall is dependent on the length of gadget usage per day of the children.
http://www.z-table.com/chi-square-table.html
NONPARAMETRIC TESTS
Learning Objectives:
At the end of the discussion, the student should be able to:
• Define nonparametric tests and explain when they may be desirable.
• Use the One-Sample Runs Test.
• Use the Wilcoxon Signed-Rank Test for dependent samples.
• Use the Wilcoxon Rank Sum Test for two independent samples.
Classifications of Test of Hypothesis
PARAMETRIC AND NONPARAMETRIC
Parametric Tests
• Parametric hypothesis tests require the estimation of one or more
unknown parameters (e.g., population mean or variance).
• Often, unrealistic assumptions are made about the normality of
the underlying population.
• Large sample sizes are often required to invoke the Central Limit
Theorem.
Nonparametric Tests
• In contrast, nonparametric or distribution-free tests
–usually focus on the sign or rank of the data rather than the exact
numerical value.
–do not specify the shape of the parent population.
–can often be used in smaller samples.
–can be used for ordinal data.
– nonparametric methods are procedures that work their magic
without reference to specific parameters (or measure of the
population).
• If the information about the population is completely known by means of its parameters (i.e., µ and σ), then we use the parametric tests (e.g., t-test, z-test, F-test, ANOVA, Pearson correlation coefficient).
• If there is no knowledge about the population parameters (i.e., µ or σ), but it is still required to test a hypothesis about the population, then we use the nonparametric tests (e.g., Single-sign test, Wilcoxon signed rank test, Wilcoxon rank sum test, chi-square test, Kruskal-Wallis test, Spearman rank).

https://www.slideshare.net/saiprakash6/distinguish-between-parametric-vs-nonparametric-test1
http://www.mayo.edu/mayo-edu-docs/center-for-translational-science-activities-documents/berd-5-6.pdf
Parametric tests compared with nonparametric tests:
1. Parametric: The test statistic is based on the distribution.
   Nonparametric: The test statistic is arbitrary.
2. Parametric: Information about the population is completely known (given σ or µ).
   Nonparametric: No information about the population is available.
3. Parametric: Specific assumptions are made regarding the population.
   Nonparametric: No assumptions are made regarding the population.
4. Parametric: The null hypothesis is made on parameters of the population distribution.
   Nonparametric: The null hypothesis is free from parameters.
5. Parametric: Focus on the actual numerical value.
   Nonparametric: Focus on the sign or rank of the data.

https://www.slideshare.net/saiprakash6/distinguish-between-parametric-vs-nonparametric-test1;
Statistical analysis with software applications by McGraw Hill Education; Probability and statistics by Walpole
6. Parametric: Parametric tests require the estimation of one or more unknown parameters (population parameters).
   Nonparametric: A nonparametric test assumes no knowledge of the distribution of the population, except that it is continuous.
7. Parametric: Parametric tests are applicable only to variables.
   Nonparametric: Nonparametric tests are applicable to both variables and attributes.
8. Parametric: No parametric test exists for nominal scale data.
   Nonparametric: Nonparametric tests do exist for nominal and ordinal scale data.
9. Parametric: A parametric test is more powerful when the data are normally distributed.
   Nonparametric: A nonparametric test is more efficient when the data seriously depart from normality.
https://www.slideshare.net/saiprakash6/distinguish-between-parametric-vs-nonparametric-test1
http://www.mayo.edu/mayo-edu-docs/center-for-translational-science-activities-documents/berd-5-6.pdf
Parametric Tests and Analogous Nonparametric
Tests based on the purpose:
Table 1 on the next slide contains the names of several familiar
statistical procedures and categorizes each one as parametric or
nonparametric. All of the parametric procedures listed in the table rely
on an assumption of approximate normality.

https://www.mayo.edu/research/documents/parametric-and-nonparametric-demystifying-the-terms/doc-20408960

Table 1. Parametric procedures and their corresponding nonparametric procedures

1. Parametric Procedure (Normal Theory Based Test): t test for independent samples
   Nonparametric Procedure (Corresponding Nonparametric Test): Mann-Whitney U Test; Wilcoxon Rank Sum Test
   Purpose of Test (Analysis Type): Compares two distinct/independent samples
   Example: Test whether the median income of the residents in barangay A is greater than the median income of the residents in barangay B.
2. Parametric Procedure: Paired t test
   Nonparametric Procedure: Paired Sample Sign Test; Wilcoxon Signed-Rank Test
   Purpose of Test: Examines a set of differences (i.e., compares two quantitative measurements taken from the same individual)
   Example: Determine whether there is a significant change in the standard deduction of single taxpayers after the new tax law has been implemented from 2017 to 2019.
3. Parametric Procedure: Pearson Correlation Coefficient
   Nonparametric Procedure: Spearman Rank Correlation Coefficient
   Purpose of Test: Assesses the linear association between two variables (i.e., estimates the degree of association between two quantitative variables)
   Example: A market analyst would like to see the relationship between the quality of a commodity and its market price.
4. Parametric Procedure: One-Way Analysis of Variance
   Nonparametric Procedure: Kruskal-Wallis Analysis of Variance by Ranks
   Purpose of Test: Compares three or more distinct/independent groups
   Example: Test the hypothesis that defective items produced by the three brands of machines A, B, C are equal.
Source: Applied Statistics in Business and Economics by Doane and Seward p. 694
Advantages and Disadvantages of
Nonparametric Tests
• If parametric and nonparametric tests are both applicable to the same data set, we should carry out the more efficient parametric technique rather than its nonparametric counterpart.
• Since we do not always have quantitative measurements and the assumption of normality is not always justified, the nonparametric tests or distribution-free methods complement their customary parametric counterparts.
• With nonparametric tests, the data analyst has more ammunition to accommodate a wider variety of experimental situations.
Statistical analysis with software applications by McGraw Hill Education; Probability and statistics by Walpole
Advantages of Nonparametric Tests
1. They can be used to test population parameters when the variable is not
normally distributed.
2. They can be used when the data are nominal or ordinal.
3. They can be used to test hypotheses that do not involve population parameters.
4. They can be used with small samples, since they do not require the assumption of normality.
5. In most cases, the computations are easier than those for the parametric
methods.
6. Their interpretation is often more direct than the interpretation of parametric
tests.
7. Nonparametric tests are simple and easy to understand.
8. They do not involve complicated sampling theory.
9. They make few or no assumptions about the population distribution.
Disadvantages of Nonparametric Tests
1. They are less sensitive than their parametric counterparts when the
assumptions of the parametric methods are met.
2. Larger differences are needed before the null hypothesis can be rejected.
3. They do not utilize all the information provided by the sample.
4. They require special tables for small samples.
5. They are less efficient than their parametric counterparts when the assumptions
of the parametric methods are met (normality).
6. Larger sample sizes are needed to overcome the loss of information.
7. They make use of only the nominal or ordinal (rank) information in the data.
Ranking of Data
There are many applications in business where data are reported not as
values on a continuum but on an ordinal scale, so ranks must be assigned to
the values before the data can be analyzed. Distribution-free methods
therefore allow the data analyst to analyze ranks rather than the actual
data values, which makes nonparametric tests very appealing and intuitive.
For example, suppose an HR officer would like to determine the degree of
relationship between the performance ranks obtained by ten trainees during
the first and second evaluation periods. Assuming a nonparametric test is
applicable, it could be used to determine whether the two rank evaluations
agree.

Thus, since nonparametric tests can be applied to data measured on an
ordinal scale, it is important for the analyst to rank data sets
efficiently and correctly.
Example 1:
The following values are ranked from highest to lowest:

Student A B C D E

Grade 87 65 78 93 85

Rank 2 5 4 1 3
Example 2:
The following values are ranked from lowest to highest, using the
average rank for repeated values:

Contestant A B C D E

Score 3 1 3 4 6

Rank 2.5 1 2.5 4 5


Example 3: Ranked scores of 9 students in mathematics, from highest to lowest,
using the average ranking:

Student A B C D E F G H I
Math score 6 3 8 6 9 9 11 2 9
rank 6.5 8 5 6.5 3 3 1 9 3
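
The average-ranking rule used in Examples 1 to 3 can be sketched in Python. This is only a minimal illustration and not part of the original module; the function name `average_ranks` and its `descending` flag are assumptions introduced here.

```python
def average_ranks(values, descending=False):
    """Rank values (1 = first in sort order); tied values share the mean of their ranks."""
    # Indices of the values in sorted order
    order = sorted(range(len(values)), key=lambda i: values[i], reverse=descending)
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the block while the next sorted value ties with the current one
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        shared = (pos + 1 + end + 1) / 2  # mean of the ordinal ranks pos+1 .. end+1
        for k in range(pos, end + 1):
            ranks[order[k]] = shared
        pos = end + 1
    return ranks


grades = [87, 65, 78, 93, 85]               # Example 1, highest to lowest
scores = [3, 1, 3, 4, 6]                    # Example 2, lowest to highest
math_scores = [6, 3, 8, 6, 9, 9, 11, 2, 9]  # Example 3, highest to lowest

print(average_ranks(grades, descending=True))
print(average_ranks(scores))
print(average_ranks(math_scores, descending=True))
```

The three printed lists should match the Rank rows of the three examples, with tied values sharing the average of the positions they occupy.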

Example 4: It may also be necessary to get the sum of ranks according to sign, R+ and R-.

Call Center   Rank During      Rank During      D = Rank 1   /D/   Rank     Average rank   Signed
Trainee       1st Evaluation   2nd Evaluation   - Rank 2           of /D/   of /D/         average rank

A             8                7                 1           1     1        1               1
B             2                5                -3           3     6        8              -8
C             7                10               -3           3     7        8              -8
D             1                4                -3           3     8        8              -8
E             4                2                 2           2     2        3.5             3.5
F             9                6                 3           3     9        8               8
G             3                1                 2           2     3        3.5             3.5
H             6                9                -3           3     10       8              -8
I             10               8                 2           2     4        3.5             3.5
J             5                3                 2           2     5        3.5             3.5
R+ = 23 sum of positive ranks
R- = 32 sum of negative ranks
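
The signed-rank bookkeeping in Example 4 can be checked with a short Python sketch. It is an illustration under stated assumptions: the helper names `average_ranks` and `signed_rank_sums` are hypothetical, and the two rank lists are typed in from the table above.

```python
def average_ranks(values):
    """Ascending ranks; tied values share the mean of the positions they occupy."""
    ordered = sorted(values)
    # A value first found at 0-based index i with c copies occupies ranks i+1 .. i+c
    return [ordered.index(v) + 1 + (ordered.count(v) - 1) / 2 for v in values]


def signed_rank_sums(diffs):
    """Return (R+, R-): sums of the average ranks of /D/ by the sign of D."""
    ranks = average_ranks([abs(d) for d in diffs])
    r_plus = sum(r for d, r in zip(diffs, ranks) if d > 0)
    r_minus = sum(r for d, r in zip(diffs, ranks) if d < 0)
    return r_plus, r_minus


first_eval = [8, 2, 7, 1, 4, 9, 3, 6, 10, 5]   # trainees A to J
second_eval = [7, 5, 10, 4, 2, 6, 1, 9, 8, 3]
diffs = [a - b for a, b in zip(first_eval, second_eval)]

r_plus, r_minus = signed_rank_sums(diffs)
print(r_plus, r_minus)
```

As a quick consistency check, R+ and R- together should add up to n(n + 1)/2 = 55 when there are no zero differences.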
In this module, three nonparametric tests will be presented:
• The One-Sample Runs Test
• Wilcoxon Signed-Rank Test
• Wilcoxon Rank Sum Test
One-Sample Runs Test
The one-sample runs test is also called the Wald-Wolfowitz test
after its inventors, Abraham Wald (1902-1950), and his student Jacob
Wolfowitz.
• The purpose of the one-sample runs test is to detect nonrandomness.
• A nonrandom pattern suggests that the observations are not
independent.
• Here, we investigate whether each observation in a sequence is
independent of its predecessor (or the appearance of one is not
dependent on the appearance of another).
Random: independent observations. Nonrandom: nonindependent observations.
Runs Test
This test is to determine whether a sequence of binary events (two
outcomes involved) follows a random pattern. A nonrandom sequence
suggests nonindependent observations.
The hypotheses are
Ho: Events follow a random pattern.
H1: Events do not follow a random pattern.
To test the hypothesis of randomness, we first count the number of outcomes of
each type:
n1 : number of outcomes in the first type
n2 : number of outcomes in the second type
n = total sample size = n1 + n2
Wald-Wolfowitz Runs Test
• When n1 ≥ 10 and n2 ≥ 10 (large sample situation), the number of
runs R may be assumed to be normally distributed with mean µR and
standard deviation σR:
µR = 2·n1·n2/n + 1
σR = sqrt[ 2·n1·n2·(2·n1·n2 − n) / (n²·(n − 1)) ]
zcalc = (R − µR)/σR
Summary of Procedures in Conducting a One-Sample Runs Test
STEP 1 State the hypotheses and identify the claim.

STEP 2 Count the runs by grouping sequences of similar outcomes.

STEP 3 Compute the test value/statistic zcalc .

STEP 4 Find the critical value using the table for the areas under the normal curve (z - table).
For a given level of significance α, find the critical value zα/2 for a two-tailed test.
Because either too many runs or too few runs would be nonrandom, we choose a two-tailed
test.

STEP 5 Make the decision. Reject the hypothesis of a random pattern if zcalc < −zα/2 or if zcalc > +zα/2.

STEP 6 Summarize the results.
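
Steps 2 to 5 of the procedure can be sketched in Python as follows. This is a minimal illustration that assumes the usual large-sample Wald-Wolfowitz formulas, µR = 2·n1·n2/n + 1 and σR = sqrt[2·n1·n2·(2·n1·n2 − n)/(n²·(n − 1))]; the function names and the coin sequence are hypothetical and do not come from the module.

```python
from math import sqrt


def count_runs(sequence):
    """Count maximal blocks of identical consecutive outcomes."""
    if not sequence:
        return 0
    changes = sum(1 for prev, cur in zip(sequence, sequence[1:]) if prev != cur)
    return changes + 1


def runs_test_z(runs, n1, n2):
    """Large-sample mean, standard deviation, and z statistic for the number of runs."""
    n = n1 + n2
    mu = 2 * n1 * n2 / n + 1
    sigma = sqrt(2 * n1 * n2 * (2 * n1 * n2 - n) / (n ** 2 * (n - 1)))
    return mu, sigma, (runs - mu) / sigma


def one_sample_runs_test(sequence, z_crit):
    """Run steps 2 to 5 on a two-outcome sequence; z_crit is read from a z-table."""
    labels = sorted(set(sequence))
    if len(labels) != 2:
        raise ValueError("the runs test needs exactly two outcome types")
    n1, n2 = sequence.count(labels[0]), sequence.count(labels[1])
    runs = count_runs(sequence)
    mu, sigma, z = runs_test_z(runs, n1, n2)
    # Two-tailed rule: too few or too many runs both count against randomness
    return {"R": runs, "n1": n1, "n2": n2, "mu": mu, "sigma": sigma,
            "z": z, "reject": abs(z) > z_crit}


# Hypothetical sequence (not from the module): four long blocks of heads and tails
coin = "HHHHH" + "TTTTT" + "HHHHH" + "TTTTT"
result = one_sample_runs_test(coin, z_crit=1.96)
print(result)
```

Here the sequence has far fewer runs than expected under randomness, so the sketch reports a rejection at α = 0.05.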


Example 1: Quality Inspection
Inspection of 44 computer chips reveals the following sequence of
defective (D) or acceptable (A) chips. Do defective chips appear at
random? Test the hypothesis using α = 0.01.
DAAAAAAADDDDAAAAAAAADDAAAAAAAADDDDAAAAAAAAAA
Step 1: The hypotheses are
H0: Defects follow a random sequence. (independent)
H1: Defects follow a nonrandom sequence. (not independent)
Step 2: Count the runs.
Group sequences of similar outcomes and count the runs.
• A run is a series of consecutive outcomes of the same type, surrounded
by a sequence of outcomes of the other type.
• A run can be a single outcome if it is preceded and followed by outcomes of
the other type. Thus, there are 8 runs in this example (R = 8).
Step 3. Compute the test statistic zcalc. To compute zcalc , solve first for µR and σR.
The number of outcomes of each type is
n1 = number of defective chips (D) = 11
n2 = number of acceptable chips (A) = 33
n = total sample size = n1 + n2 = 11 + 33 = 44

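Expected number of runs µR if Ho is true:
µR = 2·n1·n2/n + 1 = 2(11)(33)/44 + 1 = 17.5

Because the actual number of runs (R = 8) is far less than the expected number (µR = 17.5),
the sample suggests that the null hypothesis, “defects follow a random sequence”,
may be false. To verify this observation, proceed with solving for the test statistic zcalc.
Standard deviation σR:
σR = sqrt[ 2(11)(33)(2(11)(33) − 44) / (44²(44 − 1)) ] = sqrt(495,132/83,248) ≈ 2.439

Test statistic zcalc:
zcalc = (R − µR)/σR = (8 − 17.5)/2.439 ≈ -3.90
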
Step 4. Find the critical value. Because either too many runs or too few runs
would be nonrandom, we choose a two-tailed test.
Thus, the critical value z0.005 for a two-tailed test at α = 0.01 is ± 2.576
(refer to the z-table for the critical value or the table indicated below).
Step 5. Make the decision. Reject the null hypothesis since the test
statistic zcalc = -3.90 is less than the critical value - 2.576.

Step 6. Summarize the Result. There is sufficient evidence to reject the
hypothesis of randomness. Defects follow a nonrandom
sequence (not independent). The difference between the
observed number of runs and the expected number of runs is
too great to be due to chance.
Example 2:
Perform a runs test for randomness on the sequence of outcomes A and B
shown below, using α = 0.05.
AABBAABBABAABBBAABBBBAABABB

Step 1: The hypotheses are


H0: Outcomes A follow a random sequence. (independent)
H1: Outcomes A follow a nonrandom sequence. (not independent)

Step 2: Count the runs.
Group sequences of similar outcomes and count the runs.
• A run is a series of consecutive outcomes of the same type, surrounded
by a sequence of outcomes of the other type.
• A run can be a single outcome if it is preceded and followed by outcomes of
the other type. Thus, there are 14 runs in this example (R = 14).
Step 3. Compute the test statistic zcalc. To compute zcalc , solve first for µR and σR.
The number of outcomes of each type is
n1 = number of outcomes A = 12
n2 = number of outcomes B = 15
n = total sample size = n1 + n2 = 12 + 15 = 27

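Expected number of runs µR if Ho is true:
µR = 2·n1·n2/n + 1 = 2(12)(15)/27 + 1 ≈ 14.33

Because the actual number of runs (R = 14) is close to the expected number of runs
(µR = 14.33), the sample is consistent with the null hypothesis, “Outcomes A follow a
random sequence”. To verify this observation, proceed with solving for the test
statistic zcalc.
Standard deviation σR:
σR = sqrt[ 2(12)(15)(2(12)(15) − 27) / (27²(27 − 1)) ] = sqrt(119,880/18,954) ≈ 2.515

Test statistic zcalc:
zcalc = (R − µR)/σR = (14 − 14.33)/2.515 ≈ -0.13
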
Step 4. Find the critical value. Because either too many runs or too few runs
would be nonrandom, we choose a two-tailed test.
Thus, the critical value z0.025 for a two-tailed test at α = 0.05 is ± 1.96.
Step 5. Make the decision. Do not reject the null hypothesis since the
test statistic zcalc = -0.13 lies between the critical values -1.96 and +1.96.

Step 6. Summarize the Result. There is not sufficient evidence to reject the
hypothesis of randomness. Outcomes A follow a random
sequence (independent).
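
The table lookups in Step 4 of both examples can be reproduced with Python's standard library. The sketch below is illustrative only: the helper name `two_tailed_decision` is an assumption, and the p-value it returns is an extra that the module itself does not report.

```python
from statistics import NormalDist

STANDARD_NORMAL = NormalDist()  # mean 0, standard deviation 1


def two_tailed_decision(z_calc, alpha):
    """Return (critical value, two-tailed p-value, reject H0?) for a z statistic."""
    z_crit = STANDARD_NORMAL.inv_cdf(1 - alpha / 2)
    p_value = 2 * (1 - STANDARD_NORMAL.cdf(abs(z_calc)))
    return z_crit, p_value, abs(z_calc) > z_crit


# z statistics and significance levels taken from the two runs-test examples
examples = [("Example 1", -3.90, 0.01), ("Example 2", -0.13, 0.05)]
decisions = {}
for name, z_calc, alpha in examples:
    z_crit, p_value, reject = two_tailed_decision(z_calc, alpha)
    decisions[name] = reject
    print(f"{name}: critical value ±{z_crit:.3f}, p-value {p_value:.4f}, reject H0: {reject}")
```

The critical values agree with the tabled ±2.576 and ±1.96 to the precision of the table, and the decisions match Step 5 of each example.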
WILCOXON SIGNED-RANK TEST
• The Wilcoxon signed-rank test was named after Frank Wilcoxon (1892-1965).
• Wilcoxon proposed this test utilizing both direction (+/- sign) and magnitude
(difference).
• This test applies in the case of a symmetric continuous distribution.
• The Wilcoxon signed-rank test is a nonparametric test to compare a sample
median M with a benchmark median Mo (the point of reference or
standard/acceptable value); for example, if the median income in the
United States is $17,593, then Ho: M = 17,593.
• It is also used to test the median difference in paired samples (dependent
samples), M1 – M2.
• It does not require normality but does assume symmetric populations.
• It corresponds to the parametric t test for one mean.
WILCOXON SIGNED-RANK TEST
The Wilcoxon signed-rank test is a nonparametric test to compare a
sample median with a benchmark or to test the median difference in
paired samples. It does not require normality but does assume
symmetric populations. It corresponds to the parametric t test for one
mean.
• Advantages are
– freedom from the assumption of normality.
– robustness to outliers, which means that the test is not strongly affected
by outliers.
– applicability to ordinal data.
– applicability to roughly symmetric populations, since the median is the basis
of the hypothesis.
Applications of the Wilcoxon signed-rank test
The Wilcoxon signed-rank test is a nonparametric test used to:
• compare a sample median with a benchmark
You want to test whether the median income of the residents in a
certain barangay is P2,500 per week. A sample of 5 residents was
selected, and their weekly incomes were recorded.
Resident A B C D E
Income 2700 900 1200 3000 500

Benchmark median: 2500 (standard, acceptable or point of reference)


Ho: The median income of the residents is equal to 2500.
H1: The median income of the residents is not equal to 2500.
Applications of the Wilcoxon signed-rank test
The Wilcoxon signed-rank test is a nonparametric test used to:
• test the median difference in paired samples (dependent samples)
You want to test whether the median income of the residents in a
barangay five years ago differs from the median income at present. A
sample of 5 residents from the barangay was selected:
Five years ago 700 4500 400 1800 2900
Present 900 3300 1200 1800 3200
Ho: The median income of the residents in a barangay five years ago and at
present are equal.
H1: The median income of the residents in a barangay five years ago and at
present are not equal.
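• To compare the sample median (M) with a benchmark median (M0), the
hypotheses are:
Two-tailed test: H0: M = M0, H1: M ≠ M0
Left-tailed test: H0: M ≥ M0, H1: M < M0
Right-tailed test: H0: M ≤ M0, H1: M > M0

• When evaluating the difference between paired observations, use the median
difference Md (paired sample 1 – paired sample 2) and zero as the benchmark:
Two-tailed test: H0: Md = 0, H1: Md ≠ 0
Left-tailed test: H0: Md ≥ 0, H1: Md < 0
Right-tailed test: H0: Md ≤ 0, H1: Md > 0
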
• Calculate the difference between the paired observations.
• Rank the differences from smallest to largest by absolute value.
• Add the ranks of the positive differences to obtain the rank sum W.

Sum of all positive ranks: W = ΣR+
Expected value of the W statistic: µW = n(n + 1)/4
Standard deviation of the W statistic: σW = sqrt[ n(n + 1)(2n + 1)/24 ]
Wilcoxon test statistic for large n: zcalc = (W − µW)/σW


For large samples (n ≥ 20), the test statistic is approximately normal.
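
As a minimal sketch, the large-sample statistic can be computed from W and n alone. It assumes the standard normal approximation for the signed-rank sum, µW = n(n + 1)/4 and σW = sqrt[n(n + 1)(2n + 1)/24]; the function name `signed_rank_z` and the input values are hypothetical.

```python
from math import sqrt


def signed_rank_z(w, n):
    """Mean, standard deviation, and z statistic for the positive rank sum W."""
    mu_w = n * (n + 1) / 4                           # expected value of W under H0
    sigma_w = sqrt(n * (n + 1) * (2 * n + 1) / 24)   # standard deviation of W under H0
    return mu_w, sigma_w, (w - mu_w) / sigma_w


# Hypothetical inputs (not from the module): n = 20 differences with W = 150
mu_w, sigma_w, z = signed_rank_z(w=150, n=20)
print(f"mu_W = {mu_w}, sigma_W = {sigma_w:.2f}, z = {z:.2f}")
```

A positive z indicates that the positive ranks outweigh the negative ones, which points toward a median above the benchmark.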
Example 1: Wilcoxon signed-rank test of the Sample Median Vs. Benchmark
The machines in a company used for the production of paraffin wax candles
operate with a median of 5.6 hours before reaching downtime (due to limited
specifications). The operations manager would like to prove that the same set
of machines used for the mass production of gel wax candles operates with the
same median operating time as for the paraffin wax candles. Twenty randomly
selected machines are inspected to see whether the manager’s claim is valid.
Use a 0.05 level of significance to test the claim.

Machine   Operating time per     Machine   Operating time per
          production X                     production X
1         5.88                   11        4.99
2         5.91                   12        4.84
3         6.11                   13        7.21
4         5.55                   14        4.75
5         5.55                   15        4.30
6         6.53                   16        7.77
7         5.31                   17        12.28
8         5.04                   18        3.74
9         4.99                   19        8.58
10        5.34                   20        11.99
Step 1. The hypotheses are:
(refer to slide 37 to see how the hypotheses are established)
H0: M = 5.6 (claim)
H1: M ≠ 5.6

Step 2. Find the critical value. For a two-tailed test at α = 0.05, the
critical values are ±1.96 (using the z-table or the table indicated
below).
Step 3. Compute the test statistic following the procedure shown below.
• Calculate the difference between the given data value X and the benchmark
median Mo (X - Mo).
• Get the absolute value of the difference /X – Mo/.
• Rank (or average rank) /X – Mo/.
• Separate the positive ranks R+ and the negative ranks R- according to the sign
of the difference X - Mo.
• Get the sum of R+.
• Note that negative ranks are shown but not used in the analysis.

Machine   X        X - 5.6   /X - 5.6/   Rank   R+     R-
1         5.88      0.28     0.28        4      4
2         5.91      0.31     0.31        6      6
3         6.11      0.51     0.51        7      7
4         5.55     -0.05     0.05        1.5           1.5
5         5.55     -0.05     0.05        1.5           1.5
6         6.53      0.93     0.93        13     13
7         5.31     -0.29     0.29        5             5
8         5.04     -0.56     0.56        8             8
9         4.99     -0.61     0.61        9.5           9.5
10        5.34     -0.26     0.26        3             3
11        4.99     -0.61     0.61        9.5           9.5
12        4.84     -0.76     0.76        11            11
13        7.21      1.61     1.61        15     15
14        4.75     -0.85     0.85        12            12
15        4.30     -1.30     1.30        14            14
16        7.77      2.17     2.17        17     17
17        12.28     6.68     6.68        20     20
18        3.74     -1.86     1.86        16            16
19        8.58      2.98     2.98        18     18
20        11.99     6.39     6.39        19     19
Sum       126.65                         210    119    91
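Calculation for test statistic:
Sum of all positive ranks: W = ΣR+ = 119
Expected value of the W statistic: µW = n(n + 1)/4 = 20(21)/4 = 105
Standard deviation of the W statistic: σW = sqrt[ n(n + 1)(2n + 1)/24 ] = sqrt[ 20(21)(41)/24 ] = sqrt(717.5) ≈ 26.79
Wilcoxon test statistic for large n: zcalc = (W − µW)/σW = (119 − 105)/26.79 ≈ 0.52

Step 4. Make the decision. Do not reject the null hypothesis since the
test statistic zcalc = 0.52 lies between the critical values -1.96 and +1.96.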

Step 5. Summarize the results. There is not enough evidence to reject
the manager’s claim that the same set of machines, when used for the
mass production of gel wax candles, operates with the same
median operating time as when it is used for paraffin wax candles.
The data are consistent with equal medians.
Example 2: Wilcoxon Signed-rank test of Paired Samples (Dependent Samples)

The table below shows the number of baggage being shipped by a
shipping company over a 22-week period before and after a change of
management. Data set A shows the number of baggage shipped under the
previous management, while data set B is the number of baggage shipped
under the new management. Determine whether the old and new management,
on average, shipped the same number of baggage against the alternative
that the old management ships more baggage than the new management.
Use a 0.10 level of significance.

Week   Old Management (A)   New Management (B)     Week   Old Management (A)   New Management (B)
1      16                   14                     12     18                   12
2      28                   12                     13     18                   13
3      22                   10                     14     18                   12
4      14                   10                     15     16                   17
5      26                   13                     16     14                   19
6      17                   12                     17     20                   10
7      16                   16                     18     15                   14
8      19                   10                     19     20                   13
9      17                   11                     20     28                   20
10     15                   18                     21     26                   25
11     20                   16                     22     29                   27
Step 1. The hypotheses are:
(refer to slide 37 to see how the hypotheses are established)
H0: Md ≤ 0 (The median difference is at most zero (Old – New ≤ 0))
(claim)
H1: Md > 0 (The median difference is positive (Old – New > 0))

Step 2. Find the critical value. For a right-tailed test at α = 0.10, the critical
value is +1.28 (using the z-table or the table indicated below).
Step 3. Compute the test statistic following the procedure shown below.
• Calculate the difference d between the paired observations.
• Get the absolute value of the difference d.
• Rank (or average rank) the absolute difference.
• Separate the positive ranks R+ and the negative ranks R- according to the
corresponding sign of the difference.
• Get the sum of R+.
• Note that negative ranks are shown but not used in the analysis.

Week   Old   New   d     /d/   Rank   R+     R-
1      16    14    2     2     5.5    5.5
2      28    12    16    16    22     22
3      22    10    12    12    20     20
4      14    10    4     4     8.5    8.5
5      26    13    13    13    21     21
6      17    12    5     5     11     11
7      16    16    0     0     1      1
8      19    10    9     9     18     18
9      17    11    6     6     14     14
10     15    18    -3    3     7             -7
11     20    16    4     4     8.5    8.5
12     18    12    6     6     14     14
13     18    13    5     5     11     11
14     18    12    6     6     14     14
15     16    17    -1    1     3             -3
16     14    19    -5    5     11            -11
17     20    10    10    10    19     19
18     15    14    1     1     3      3
19     20    13    7     7     16     16
20     28    20    8     8     17     17
21     26    25    1     1     3      3
22     29    27    2     2     5.5    5.5
Sum                            253    232    21
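Computation for test statistic:
Sum of all positive ranks: W = ΣR+ = 232
Expected value of the W statistic: µW = n(n + 1)/4 = 22(23)/4 = 126.5
Standard deviation of the W statistic: σW = sqrt[ n(n + 1)(2n + 1)/24 ] = sqrt[ 22(23)(45)/24 ] = sqrt(948.75) ≈ 30.80
Wilcoxon test statistic for large n: zcalc = (W − µW)/σW = (232 − 126.5)/30.80 ≈ 3.43
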
Step 4. Make the decision. Reject the null hypothesis since the test
statistic zcalc = 3.43 is greater than the critical value +1.28.


Step 5. Summarize the results. There is enough evidence to reject the
claim that the old and new management, on average, shipped
the same number of baggage. The median number of baggage
shipped under the old management is greater than the median
number of baggage shipped under the new management.
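
The paired-sample computation above can be reproduced from the raw data with a short Python sketch. It is an illustration, not the module's own procedure: the helper `average_ranks` is hypothetical, and the handling of the zero difference in week 7 (ranked and counted with the positive ranks) is an assumption that simply mirrors the table; many textbooks drop zero differences instead.

```python
from math import sqrt

# Weekly baggage counts typed in from the Example 2 table (weeks 1 to 22)
old = [16, 28, 22, 14, 26, 17, 16, 19, 17, 15, 20,
       18, 18, 18, 16, 14, 20, 15, 20, 28, 26, 29]
new = [14, 12, 10, 10, 13, 12, 16, 10, 11, 18, 16,
       12, 13, 12, 17, 19, 10, 14, 13, 20, 25, 27]


def average_ranks(values):
    """Ascending ranks; tied values share the mean of the positions they occupy."""
    ordered = sorted(values)
    return [ordered.index(v) + 1 + (ordered.count(v) - 1) / 2 for v in values]


diffs = [a - b for a, b in zip(old, new)]        # d = Old - New
ranks = average_ranks([abs(d) for d in diffs])   # ranks of /d/

# Assumption mirrored from the table: d = 0 keeps its rank and goes under R+
w = sum(r for d, r in zip(diffs, ranks) if d >= 0)
r_minus = sum(r for d, r in zip(diffs, ranks) if d < 0)

n = len(diffs)
mu_w = n * (n + 1) / 4
sigma_w = sqrt(n * (n + 1) * (2 * n + 1) / 24)
z = (w - mu_w) / sigma_w

print(f"W = {w}, R- = {r_minus}, z = {z:.2f}")
```

The rank sums and the z statistic should agree with the Sum row of the table and with the value used in Step 4.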
Wilcoxon Rank Sum Test (Mann-Whitney Test)
The Wilcoxon rank sum test is a nonparametric test to compare two populations,
utilizing only the ranks of the data from two independent samples. If the
populations differ only in location (center), it is a test for equality of medians,
corresponding to the parametric two sample t-test.

• Wilcoxon rank sum test is named after statisticians Frank Wilcoxon (1892 - 1965),
Henry B. Mann (1905-2000), and D. Ransom Whitney (1915 - 2007).
• Compares two populations whose distributions are assumed to be the same except for
a shift in location.
• It is a test of differences between the medians of two different populations that are
obviously nonnormal, and samples are independent (no pairing of observations).
• It is analogous to the t – test for two independent sample means, thus, it requires
independent samples from populations with equal variances.
• It does not assume normality.
Wilcoxon Rank Sum Test for Two Independent Samples
The test of the hypothesis can be either one-tailed or two-tailed test.
• If testing for a difference between two population medians, the test is
two-tailed.
• If testing whether one median is greater or less than the other, the test is
one-tailed.

Two-Tailed Test     One-Tailed Test     One-Tailed Test
Ho: M1 = M2         Ho: M1 ≥ M2         Ho: M1 ≤ M2
Ha: M1 ≠ M2         Ha: M1 < M2         Ha: M1 > M2

Where: M1 is the median of population 1, and
M2 is the median of population 2
Assuming that the only difference in the populations is in location, the
hypotheses for a two-tailed test of the population medians can also be expressed
as

H0: M1 − M2 = 0 (no difference in medians).


H1: M1 − M2 ≠ 0 (medians differ for the two groups).

Note:
• This module will illustrate only a large-sample version (n1 ≥ 10, n2 ≥ 10) of
this test, thus we can use the z-test.
• The test statistic is the difference in mean ranks, divided by its standard
error.
For large samples (n1 ≥ 10, n2 ≥ 10), use a z-test. The test statistic is
computed using any of the two methods :
❖ Performing the Test using Method A

❖ Performing the Test using Method B


Summary of Procedures in Conducting a Wilcoxon Rank Sum Test
STEP 1 State the hypotheses and identify the claim.

STEP 2 Find the critical value. For large samples (n1 ≥ 10, n2 ≥ 10), use a z-test.

STEP 3 Compute the statistic zcalc by following the procedures below.


• Sort the combined samples from lowest to highest.
• Assign a rank to each value; use the average of the ranks when there are ties.
• Separate the data into two groups according to classification or grouping (i.e., sample 1,
sample 2).
• Sum the ranks for each group (e.g., T1, T2).
• The sum of the ranks T1 + T2 must be equal to n(n + 1)/2, where n = n1 + n2.
• Calculate the mean rank of each group (T1/n1 and T2/n2).
• The test statistic is computed using Method A or Method B indicated on the previous slide.

STEP 4 Make the decision. For a given α, reject the null hypothesis if zcalc < −zα/2 or if zcalc > +zα/2.

STEP 5 Summarize the results.


Example 1: Wilcoxon Rank Sum Test for Two Independent Samples

The production of certain products, A and B, is intermittent, depending
on the availability of resources. The following table shows the quantity
produced on 28 randomly selected days, each day devoted to one of the two
products. Test the hypothesis that the average production of product A and
product B are equal using a 0.05 level of significance.
Day                 1   2   3   4   5   6   7   8   9   10  11  12  13  14
Product name        A   A   A   B   B   A   B   A   A   B   B   B   A   B
Quantity Produced   18  20  17  17  19  16  20  19  18  14  17  17  15  22

Day                 15  16  17  18  19  20  21  22  23  24  25  26  27  28
Product name        A   B   A   A   B   B   B   A   A   A   B   B   A   A
Quantity Produced   17  18  20  16  20  18  16  21  15  19  15  18  17  20
Step 1. The hypotheses are:

H0: M1 − M2 = 0 (no difference in medians).


H1: M1 − M2 ≠ 0 (medians differ for the two groups).

Step 2. Find the critical value. For a two-tailed test at α = 0.05, the critical
values are ±1.96 (using the z-table or the table indicated below).
Step 3. Compute the test statistic following the procedure shown below.
• Sort the combined samples from lowest to highest.
• Assign a rank to each value; use the average of the ranks when there are
ties, as shown in the table.
• Separate the data into two groups according to classification, in this case
Product A and Product B.
• Sum the ranks for each group (e.g., T1, T2).
• The sum of the ranks T1 + T2 must be equal to n(n + 1)/2, where n = n1 + n2.
• Calculate the mean rank of each group (T1/n1 and T2/n2).

Product Name   Quantity Produced   Rank
B              14                  1
A              15                  3
A              15                  3
B              15                  3
A              16                  6
A              16                  6
B              16                  6
A              17                  10.5
B              17                  10.5
B              17                  10.5
B              17                  10.5
A              17                  10.5
A              17                  10.5
A              18                  16
A              18                  16
B              18                  16
B              18                  16
B              18                  16
B              19                  20
A              19                  20
A              19                  20
A              20                  24
B              20                  24
A              20                  24
B              20                  24
A              20                  24
A              21                  27
B              22                  28

Rank sums: T1 (Product A, n1 = 15) = 220.5 and T2 (Product B, n2 = 13) = 185.5,
so T1 + T2 = 406 = 28(29)/2. The mean ranks are 220.5/15 = 14.70 and 185.5/13 ≈ 14.27.
Computation for test statistic using Method A:
Computation for test statistic using Method B:

Step 4. Make the decision. Do not reject the null hypothesis since the test statistic
zcalc = 0.1382 is less than the critical value +1.96.

Step 5. Summarize the results. There is not enough evidence to reject the claim that the
average production of product A and product B are equal.
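
The rank-sum computation for this example can be sketched in Python from the raw data. The first statistic follows the description given earlier in the module (difference in mean ranks divided by its standard error); the second is the standardized rank sum found in many textbooks, included here as an assumption for comparison, since the module's own Method A and Method B formulas are not reproduced in the text. The helper name `average_ranks` is hypothetical.

```python
from math import sqrt

# Product labels and quantities for days 1 to 28, typed in from the data table
products = "AAABBABAABBBAB" + "ABAABBBAAABBAA"
quantities = [18, 20, 17, 17, 19, 16, 20, 19, 18, 14, 17, 17, 15, 22,
              17, 18, 20, 16, 20, 18, 16, 21, 15, 19, 15, 18, 17, 20]


def average_ranks(values):
    """Ascending ranks; tied values share the mean of the positions they occupy."""
    ordered = sorted(values)
    return [ordered.index(v) + 1 + (ordered.count(v) - 1) / 2 for v in values]


ranks = average_ranks(quantities)  # ranks in the combined sample
t1 = sum(r for p, r in zip(products, ranks) if p == "A")
t2 = sum(r for p, r in zip(products, ranks) if p == "B")
n1, n2 = products.count("A"), products.count("B")
n = n1 + n2

# Difference in mean ranks divided by its standard error
z_mean_ranks = (t1 / n1 - t2 / n2) / (n * sqrt((n + 1) / (12 * n1 * n2)))

# Standardized rank sum of sample 1 (algebraically the same statistic)
z_rank_sum = (t1 - n1 * (n + 1) / 2) / sqrt(n1 * n2 * (n + 1) / 12)

print(f"T1 = {t1}, T2 = {t2}, check = {n * (n + 1) / 2}")
print(f"z (mean ranks) = {z_mean_ranks:.4f}, z (rank sum) = {z_rank_sum:.4f}")
```

Both forms give the same z, which is well inside ±1.96, in line with the decision in Step 4.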
References:
• Statistical analysis with software applications by
McGraw Hill Education
• Applied Statistics in Business and Economics by
Doane and Seward, p. 694
• Probability and statistics by Walpole, R., Myers, R., and
Myers, S.
• http://www.mayo.edu/mayo-edu-docs/center-for-translational-science-activities-documents/berd-5-6.pdf
• https://www.slideshare.net/saiprakash6/distinguish-between-parametric-vs-nonparametric-test1
