Eda Reviewer
φ = (ad − bc) / √((a+b)(c+d)(a+c)(b+d))
  = ((40)(30) − (10)(20)) / √((40+10)(20+30)(40+20)(10+30))
  = (1200 − 200) / √((50)(50)(60)(40))
  = 1000 / √6,000,000
  = 1000 / 2449.489743
φ = 0.4082
The Phi coefficient analysis indicates a weak positive correlation between gender
and tea or coffee preference. To assess the significance of this correlation, the chi-
square statistic needs to be calculated.
χ² = nφ²
Where:
χ² is the chi-square statistic
n is the total number of observations (the grand total in your contingency table)
φ is the phi coefficient
4. Calculate the Chi-Square Statistic
χ² = nφ²
   = 100(0.4082)²
χ² = 16.6627
5. Find the Critical Value or P-value:
α = 0.05
For a 2x2 contingency table:
df = (number of rows − 1)(number of columns − 1)
   = (2 − 1)(2 − 1)
df = 1
C.V. = 3.841
6. Decision & Conclusion
Reject the null hypothesis if χ² > C.V.
Since 16.66 > 3.841, we reject the null hypothesis.
At the 0.05 level of significance, there is a significant but weak association
between gender and preference for tea or coffee.
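As a rough check, the worked computation above can be sketched in Python. The cell counts a = 40, b = 10, c = 20, d = 30 are read off the calculation, and the critical value 3.841 comes from step 5:

```python
import math

# Cell counts read off the worked example: a = 40, b = 10, c = 20, d = 30
a, b, c, d = 40, 10, 20, 30

# Phi coefficient: phi = (ad - bc) / sqrt((a+b)(c+d)(a+c)(b+d))
phi = (a * d - b * c) / math.sqrt((a + b) * (c + d) * (a + c) * (b + d))
print(round(phi, 4))  # 0.4082

# Chi-square statistic: chi2 = n * phi^2, where n is the grand total
n = a + b + c + d
chi2 = n * phi ** 2
print(round(chi2, 4))  # 16.6667 with unrounded phi; 16.6627 arises from using phi = 0.4082

# Decision at alpha = 0.05, df = 1: reject H0 if chi2 > 3.841
print(chi2 > 3.841)  # True
```

Note the small discrepancy: carrying the unrounded φ through gives χ² = 16.6667, while squaring the rounded 0.4082 gives the 16.6627 shown above; either way the decision is the same.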
(r_pb)
- Point-biserial correlation is also called the point-biserial correlation coefficient, denoted r_pb.
3. SPEARMAN RHO
Introduction
- Spearman’s rank correlation assesses the strength and direction of association
between two ranked variables, measuring the monotonicity of their relationship.
The calculations for the rank correlation coefficient are simpler than those for
the Pearson coefficient, involving ranking the data and computing differences in
ranks. The resulting coefficient, r_s, ranges from +1 (perfect positive rank
correlation) to -1 (perfect negative rank correlation), with values near 0
indicating no relationship. It is appropriate to use Spearman’s rank correlation
when analyzing the covariation of two ranked variables.
Formula:
r_s = 1 − (6Σd²) / (n(n² − 1))
Where:
r_s = Spearman’s rank correlation coefficient
d = difference in ranks
n = number of observations
The Spearman Rank correlation can take a value from +1 to -1 where,
- A value of +1 means a perfect positive association of rank
- A value of 0 means that there is no association between ranks
- A value of -1 means a perfect negative association of rank
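The rank-difference formula can be sketched in Python. This is a minimal version assuming no tied ranks (the tie correction is omitted); the sample scores below are made up for illustration:

```python
# Spearman's rho via r_s = 1 - 6*sum(d^2) / (n*(n^2 - 1)), assuming no ties.

def rank(values):
    """Rank values from 1 (smallest) upward; no tie correction."""
    order = sorted(values)
    return [order.index(v) + 1 for v in values]

# Made-up paired scores for illustration
x = [86, 97, 99, 100, 101, 103, 106, 110, 112, 113]
y = [2, 20, 28, 27, 50, 29, 7, 17, 6, 12]

rx, ry = rank(x), rank(y)
d_squared = [(a - b) ** 2 for a, b in zip(rx, ry)]
n = len(x)

r_s = 1 - 6 * sum(d_squared) / (n * (n ** 2 - 1))
print(round(r_s, 4))  # -0.1758, a weak negative rank association
```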
Assumptions
There are two main assumptions for Spearman’s rank correlation coefficient:
1. The data must be measured on an ordinal or continuous scale.
2. The two variables should have a monotonic relationship, meaning that as one
variable increases (or decreases), the other variable also increases (or
decreases), allowing for the formation of ranks.
Hypothesis Testing
Application Using Excel
Excel does not have a procedure to compute the Spearman rank correlation
coefficient. However, you may compute this statistic by using the MegaStat Add-in
available on your CD. If you have not installed this add-in, do so by following
the instructions.
1. Enter the rating scores from the raw data you have into columns A and B of a
new worksheet.
2. From the toolbar, select Add-ins, MegaStat>Nonparametric
Tests>Spearman Coefficient of Rank Correlation. Note: You may need to open
MegaStat from the MegaStat.xls file on your computer’s hard drive.
3. Select the cells of your data for Input Range.
4. Check the Correct for ties option.
5. Click [OK] and the result will appear on a different sheet named “Output”
4. KENDALL TAU
Introduction
- Kendall’s Tau is a non-parametric measure of relationships between columns of
ranked data. The Tau correlation coefficient returns a value of 0 to 1, where:
Tau Correlation Interpretation
0 No relationship
1 Perfect relationship
- Kendall's Tau can produce values from -1 to 0; however, a negative value in
ranked data may simply indicate a column switch, so the negative sign can be
disregarded in interpretations. There are different versions of Tau, including Tau-
A, Tau-B, and Tau-C, with Tau-B being the most widely available in statistical
software. Kendall's Tau can be calculated manually using the formula: Kendall’s
Tau = (C – D) / (C + D), where C is the number of concordant pairs and D is the
number of discordant pairs. Non-parametric methods like Kendall's Tau and
Spearman's rank-order correlation are recommended for non-normal data, while
Pearson's product moment correlation is suited for normally distributed data.
When To Use?
Kendall's Tau is a non-parametric test used to assess the strength and direction of
association between two ranked variables. It is recommended to use Kendall's Tau
under the following conditions:
Where:
C = the number of concordant pairs
D = the number of discordant pairs
T = total number of possible pairs
t = number of tied ranks
Tau-C provides a more accurate measure of the association between two rankings
than Tau-A or Tau-B when there are tied ranks. However, it is also more
computationally intensive.
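The pair counting behind Kendall’s Tau can be sketched in Python, using the formula above, Tau = (C − D) / (C + D) (with no ties this equals Tau-A). The sample ranks are made up for illustration:

```python
from itertools import combinations

# Made-up paired ranks for illustration; no tied ranks
x = [1, 2, 3, 4, 5, 6]
y = [2, 1, 4, 3, 6, 5]

C = D = 0
for (xi, yi), (xj, yj) in combinations(list(zip(x, y)), 2):
    s = (xi - xj) * (yi - yj)
    if s > 0:
        C += 1  # concordant: the pair is ordered the same way in both variables
    elif s < 0:
        D += 1  # discordant: the pair is ordered oppositely

tau = (C - D) / (C + D)
print(C, D, round(tau, 4))  # 12 3 0.6
```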
1. Load Data into SPSS: Start SPSS and import your dataset, which should include
the relevant variables, by navigating to File > Open > Data.
2. Access the Analyze Menu: Click on “Analyze” in the top menu, then select
“Correlate” and choose “Bivariate.”
3. Choose Variables: In the “Bivariate Correlations” dialogue box, select the
variables to analyze and move them to the “Variables” box, ensuring to check
Kendall for the correlation coefficient.
4. Generate SPSS Output: Click “OK” to perform the analysis, and SPSS will produce
an output that includes the Kendall’s tau-b correlation coefficients for your
variables.
5. CHI-SQUARE
Introduction
Chi-Square test is a statistical method used to determine whether there is a
significant association between two categorical variables. It evaluates how closely
the observed frequencies of a dataset align with the expected frequencies under
the null hypothesis.
When to Use
The Chi-square test for correlation is appropriate under specific circumstances,
particularly when you aim to evaluate the relationship between two categorical
variables. Here are some scenarios when you should consider using the Chi-square
test:
1. Examine Relationships Between Categorical Variables: Use the Chi-square
test to determine if there is a significant association between two categorical
variables (e.g., gender and voting preference, or smoking status and lung disease).
2. Contingency Tables: It is particularly applicable when the data can be
summarized in a contingency table (also known as a cross-tabulation), which
displays the frequency distribution of the variables.
3. Nominal or Ordinal Data: The test is most commonly applied to nominal data
(categories without intrinsic order). However, it can also be used for ordinal data
(categories with a specific order) as long as the assumptions about independence
and expected frequency are respected.
4. Large Samples: It is best used for larger sample sizes where the expected
frequency in each cell of the contingency table is adequate (typically at least 5). If
you have small sample sizes, consider using Fisher's Exact Test instead.
5. Testing Hypotheses About Independence: The Chi-square test can be used
when you want to test hypotheses about whether two categorical variables are
independent (i.e., no association) or dependent (i.e., some association).
6. Exploratory Analysis: The Chi-square test is often used in exploratory data
analysis to identify potential associations or patterns that may warrant further
investigation.
Assumptions
The Chi-square test for correlation, commonly used in the context of contingency
tables to assess the relationship between two categorical variables, relies on
several assumptions:
1. Categorical Variables: The data must be in the form of categorical variables.
The test is not suitable for continuous data unless it is categorized.
2. Independence: The observations must be independent of each other. This
means that the occurrence of one observation should not influence another.
3. Expected Frequency: For the Chi-square test to be valid, the expected
frequency in each cell of the contingency table should be sufficiently large. A
common rule of thumb is that the expected frequency in each cell should be at
least 5. If this assumption is violated (for example, if the table has many cells with
low expected counts), the Chi-square test may not provide reliable results.
4. Random Sampling: The samples should be randomly selected; this ensures
that the data is representative of the population being studied.
5. Data Type: The data should be nominal (categories without a specific order) or
ordinal (categories with a specific order), but the Chi-square test is primarily used
for nominal data.
By ensuring these assumptions are met, the validity of the conclusions drawn from
the Chi-square test for correlation can be enhanced. If these assumptions are
violated, alternative statistical methods or corrections might be necessary.
Assumptions for the Chi-Square Independence Test
1. The data are obtained from a random sample.
2. The expected value in each cell must be 5 or more.
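The expected-value assumption can be checked directly: for each cell, E = (row total × column total) / grand total. A minimal Python sketch, reusing the 2×2 counts from the phi example earlier in these notes:

```python
# Observed 2x2 contingency table (same counts as the phi example)
observed = [
    [40, 10],
    [20, 30],
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

# Expected frequency for each cell: E_ij = row_i total * col_j total / grand total
expected = [[r * c / grand_total for c in col_totals] for r in row_totals]
print(expected)  # [[30.0, 20.0], [30.0, 20.0]]

# Assumption check: every expected count should be 5 or more
assert all(e >= 5 for row in expected for e in row)

# Chi-square statistic: sum of (O - E)^2 / E over all cells
chi2 = sum(
    (o - e) ** 2 / e
    for o_row, e_row in zip(observed, expected)
    for o, e in zip(o_row, e_row)
)
print(round(chi2, 4))  # 16.6667, matching n * phi^2 from the earlier example
```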
Hypothesis Testing
The null hypothesis (H0) states that there is no association between the two
categorical variables. In other words, the variables are independent of each other.
The alternative hypothesis (H1) asserts that there is an association between the
two categorical variables, meaning they are not independent.
H0: The two categorical variables are independent (i.e., they are not
associated).
H1: The two categorical variables are not independent (i.e., they are
associated).
Assumptions
The assumptions for using the Pearson Product-Moment Correlation Coefficient
include:
1. Both variables must be measured on a continuous scale.
2. Each case must have paired values for the two variables.
3. Observations for each case should be independent of each other.
4. A linear relationship must exist between the two continuous variables.
5. Ideally, both variables should follow a bivariate normal distribution, though
univariate normality is often deemed sufficient in practice.
6. Homoscedasticity should be present, meaning variances along the line of best
fit should remain consistent; varying variances indicate heteroscedasticity.
7. There should be no univariate or multivariate outliers present in the data.
Hypothesis Testing
1. State the null and alternative hypotheses
2. Find the critical values
3. Calculate the value r
4. Calculate the Test Statistic
5. Make the Decision
6. Summarize the result
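Steps 3 and 4 above can be sketched in Python. A standard choice of test statistic for H0: ρ = 0 is t = r√(n − 2) / √(1 − r²) with df = n − 2; the sample data here are made up for illustration:

```python
import math

# Made-up paired continuous data for illustration
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 2.9, 3.2, 4.8, 5.1]

n = len(x)
mean_x = sum(x) / n
mean_y = sum(y) / n

# Sums of squares and cross-products about the means
sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
sxx = sum((a - mean_x) ** 2 for a in x)
syy = sum((b - mean_y) ** 2 for b in y)

# Step 3: the value of r
r = sxy / math.sqrt(sxx * syy)
print(round(r, 4))  # 0.9733

# Step 4: test statistic t with df = n - 2
t = r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)
print(round(t, 2))

# Step 5: compare |t| with the two-tailed critical value for df = n - 2
```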
Application Using SPSS
1. Measuring and Setting Variables
2. Input Data
3. Analyze Data
4. Set Options
5. Run the Analysis