Problem: Hypothesis Testing: Chi-Square Test (Usage and Gender)
                      Gender
Usage             Male    Female    Row Total
Light Users        14        5         19
Medium Users        5        5         10
Heavy Users         5       11         16
Column Total       24       21         45
1. The CMO of the company wants to know whether there is any relationship between gender
and usage. Which statistical analysis would you suggest here, and why?
2. Conduct your suggested analysis.
Here we have two variables: gender (non-metric) and usage (non-metric). Since the CMO
wants to know whether there is any relationship between two non-metric variables, the
appropriate statistical analysis is the chi-square test.
The following steps can be taken for the purpose of conducting the chi-square test:
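Step-1: Formulate the H0 and H1: The following null and alternative hypotheses can be
developed here:
H0: There is no relationship between gender and usage.
H1: There is a relationship between gender and usage.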
Step-2: Select an Appropriate Statistical Technique: Since we want to see whether there
is any relationship between two non-metric variables (gender and usage), chi-square will be the
appropriate statistical technique here.
Step-3: Choose the Level of Significance, α: Assuming a 95% level of confidence, the level of
significance is α = .05.
Step-4: Collect Data and Calculate Test Statistic: To calculate the χ² (chi-square) value,
the following table can be constructed from the given data. Each cell shows the observed
frequency with the expected frequency beneath it:
                           Gender
Usage                  Male     Female    Row Total
Light Users    fo       14         5         19
               fe      10.13      8.87
Medium Users   fo        5         5         10
               fe       5.33      4.67
Heavy Users    fo        5        11         16
               fe       8.53      7.47
Column Total            24        21         45
Here $f_o$ is the observed frequency and $f_e$ is the expected frequency. All $f_o$ values are
given for the six cells, and each expected frequency is calculated using the following equation:
$$f_e = \frac{n_r \times n_c}{n}$$
where $n_r$ is the row total, $n_c$ is the column total and $n$ is the total sample size.
Now for the first cell, $f_e = \frac{19 \times 24}{45} = 10.13$. Similarly for the second cell,
$f_e = \frac{19 \times 21}{45} = 8.87$. All expected frequencies were calculated similarly.
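The expected-frequency calculation can be checked with a short script. This is only a minimal sketch using the counts from the table; the variable names (`observed`, `row_totals`, `col_totals`, `expected`) are illustrative and not part of the original solution.

```python
# Observed frequencies from the problem table.
# Each row is a usage level; the columns are (Male, Female).
observed = {
    "Light Users": [14, 5],
    "Medium Users": [5, 5],
    "Heavy Users": [5, 11],
}

# Row totals (n_r), column totals (n_c) and total sample size (n).
row_totals = {usage: sum(counts) for usage, counts in observed.items()}
col_totals = [sum(column) for column in zip(*observed.values())]
n = sum(col_totals)

# f_e = (n_r * n_c) / n for every cell.
expected = {
    usage: [row_totals[usage] * n_c / n for n_c in col_totals]
    for usage in observed
}

for usage, values in expected.items():
    print(usage, [round(value, 2) for value in values])
```

The printed values should agree with the expected frequencies entered in the table above, after rounding to two decimal places.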
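$$\chi^2 = \sum \frac{(f_o - f_e)^2}{f_e}$$
$$= \frac{(14 - 10.13)^2}{10.13} + \frac{(5 - 8.87)^2}{8.87} + \frac{(5 - 5.33)^2}{5.33} + \frac{(5 - 4.67)^2}{4.67} + \frac{(5 - 8.53)^2}{8.53} + \frac{(11 - 7.47)^2}{7.47}$$
$$= 6.34$$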
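The test statistic can be verified by summing the six cell contributions in code. The sketch below is an illustration only: it recomputes the expected frequencies without rounding, and the variable names are assumptions made for this example.

```python
# Observed frequencies: rows are Light, Medium, Heavy users;
# columns are Male, Female.
observed = [[14, 5], [5, 5], [5, 11]]

row_totals = [sum(row) for row in observed]
col_totals = [sum(column) for column in zip(*observed)]
n = sum(row_totals)

# Chi-square is the sum over all cells of (f_o - f_e)^2 / f_e.
chi_square = 0.0
for i, row in enumerate(observed):
    for j, f_o in enumerate(row):
        f_e = row_totals[i] * col_totals[j] / n
        chi_square += (f_o - f_e) ** 2 / f_e

print(round(chi_square, 2))  # 6.34
```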
Step-5: Determine the Critical Value: Now the critical value for χ² can be found from the
relevant table (at the end of your textbook), using the degrees of freedom:
df = (r – 1)(c – 1) = (3 – 1)(2 – 1) = 2 × 1 = 2
The required table entry, for α = .05 and df = 2, is:
χ² critical = 5.991
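The table value can be cross-checked without a printed table in this particular case. With df = 2 the chi-square distribution has the upper-tail probability P(χ² > x) = exp(−x/2), so the critical value is −2 ln(α). The sketch below relies on that shortcut, which holds only for df = 2; for other degrees of freedom a table or a statistics library is needed.

```python
import math

alpha = 0.05
df = (3 - 1) * (2 - 1)  # (r - 1)(c - 1) = 2

# Shortcut valid only when df == 2:
# P(chi-square > x) = exp(-x / 2), so x = -2 * ln(alpha).
assert df == 2, "this closed form applies only to df = 2"
critical_value = -2 * math.log(alpha)

print(round(critical_value, 3))  # 5.991
```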
Step-6: Determine the Calculated Value: Here χ² calculated = 6.34 (previously calculated in
Step-4), which will be compared with the critical value of χ² for the purpose of rejecting or not
rejecting the null hypothesis.
Step-7: Reject or Do Not Reject the H0: The null hypothesis can be rejected only when the
calculated value is greater than the critical value. Here the calculated value (6.34) is greater
than the critical value (5.991).
So the null hypothesis will be rejected here. Now it can be concluded that there is a relationship
between gender and usage.
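The decision rule can also be expressed as a short check. The sketch below compares the calculated and critical values and, as an addition not found in the original solution, computes a p-value using the df = 2 closed form exp(−x/2). The variable names are illustrative.

```python
import math

alpha = 0.05
chi_square_calculated = 6.34  # from Step-4
chi_square_critical = 5.991   # from Step-5 (alpha = .05, df = 2)

# Critical-value approach: reject H0 when calculated > critical.
reject_h0 = chi_square_calculated > chi_square_critical

# Equivalent p-value approach (closed form valid only for df = 2).
p_value = math.exp(-chi_square_calculated / 2)

print(reject_h0)          # True
print(p_value < alpha)    # True
print(round(p_value, 3))  # 0.042
```

Both approaches lead to the same decision: the null hypothesis is rejected at α = .05.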
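Now the strength of association will be calculated by using Cramer's V, because the
dimension of the matrix is 3 × 2.
$$V = \sqrt{\frac{\chi^2 / n}{\min(r - 1,\ c - 1)}}$$ [Here r = number of rows and c = number of columns]
$$= \sqrt{\frac{6.34 / 45}{\min(3 - 1,\ 2 - 1)}}$$ [Here r = 3 and c = 2]
$$= \sqrt{\frac{6.34 / 45}{\min(2,\ 1)}}$$
$$= \sqrt{\frac{6.34 / 45}{1}}$$ [Here the minimum of 2 and 1 is 1; therefore only the minimum value is retained]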
V = .3754 or 37.54%
Here we will follow the following rule of thumb:
Less than 30%: Weak relationship
30% to 50%: Moderate relationship
More than 50%: Strong relationship
So, the relationship between gender and usage is moderate.
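The Cramer's V calculation and the rule of thumb can be sketched as follows. The helper `strength_of_association` is a hypothetical name introduced only for this illustration. It treats the boundaries 30% and 50% as "moderate", which is one reading of the rule of thumb above.

```python
import math

chi_square = 6.34  # calculated value from Step-4
n = 45             # total sample size
rows, cols = 3, 2  # dimension of the table

# Cramer's V = sqrt((chi-square / n) / min(r - 1, c - 1)).
v = math.sqrt((chi_square / n) / min(rows - 1, cols - 1))


def strength_of_association(value):
    """Classify V using the rule of thumb from the text (hypothetical helper)."""
    if value < 0.30:
        return "weak"
    if value <= 0.50:
        return "moderate"
    return "strong"


print(round(v, 4), strength_of_association(v))  # 0.3754 moderate
```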