Correlation Coefficient
PRESENTED BY:
ASNA ZIA (1994)
SHEEZA TARIQ (1995)
MAHNOOR IKLAQ (1996)
IQRA MASOOD CHOHAN (1999)
AMMARA ABID (2002)
QURAT-UL-AIN (2003)
Correlation Coefficient:
The correlation coefficient is a statistical measure that quantifies the strength and
direction of the relationship between two variables. It is denoted by r (in the case of
Pearson correlation) and ranges from +1 to −1.
• A value of +1 indicates a perfect positive correlation, meaning that as one variable increases, the
other variable also increases.
• Conversely, a value of -1 indicates a perfect negative correlation, where one variable increases as the other
decreases.
• A value of 0 indicates no correlation.
3. Sample Size:
• Larger samples tend to provide more accurate estimates of the correlation, while smaller samples can
lead to unstable coefficients.
• Therefore, researchers should aim for an adequate sample size to ensure the validity of their findings.
4. Contextual Consideration:
• The satisfactory size of the correlation coefficient can depend on the context of the research.
• In some fields, such as psychology, a correlation of 0.30 may be considered meaningful, while in
other disciplines, a higher threshold might be required.
5. Statistical Significance:
• It is important to consider not just the size of the correlation but also its statistical significance.
• A correlation can be large but not statistically significant if the sample size is small.
6. Practical Significance:
• This involves evaluating whether the strength of the correlation has real-world implications.
• A small correlation might be statistically significant but may not have practical relevance in a
real-world context.
7. Correlation vs. Causation:
• Correlation does not imply causation: a correlation between two variables does not, by itself,
establish a cause-and-effect relationship.
• A high correlation coefficient does not mean that one variable causes changes in another.
• If causation is involved, then it includes independent and dependent variables.
• Not every correlation specifies direction, but in causation we know which variable is the cause
(independent variable) and which is the effect (dependent variable).
8. Coefficient of Determination:
• The square of the correlation coefficient (r²) is known as the coefficient of determination.
• The coefficient of determination (r²) measures how well the independent variable explains the
variability of the dependent variable.
• It provides insight into the proportion of variance in one variable that can be explained by the
other variable.
• For example, if r = 0.6, then r² = 0.36, indicating that 36% of the variance in one variable can
be explained by the other. This can help in understanding the practical significance of the
correlation.
9. Types of Correlation Coefficient:
• There are different types of correlation coefficients, such as Pearson's r, Spearman's rank
correlation, and Kendall's tau.
• Pearson's r is used for linear relationships with interval or ratio data, while Spearman's and Kendall's
are used for ordinal data or non-linear relationships.
Pearson Product Moment Correlation:
r = Σ(X − X̄)(Y − Ȳ) / √[Σ(X − X̄)² · Σ(Y − Ȳ)²]
Spearman Rank Order Correlation:
rₛ = 1 − 6Σd² / [n(n² − 1)]
Kendall's Tau Correlation:
τ = (C − D) / [n(n − 1)/2], where C and D are the numbers of concordant and discordant pairs.
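The Pearson and Spearman formulas above can be checked with a short, dependency-free Python sketch. The paired data here is made up for illustration, and the simple Spearman formula assumes there are no tied values:

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson product-moment correlation for paired samples."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def spearman_r(x, y):
    """Spearman rank-order correlation via 1 - 6*sum(d^2)/(n(n^2-1)).
    Assumes no tied values (the shortcut formula breaks with ties)."""
    def ranks(v):
        order = sorted(range(len(v)), key=lambda i: v[i])
        r = [0] * len(v)
        for rank, i in enumerate(order, start=1):
            r[i] = rank
        return r
    rx, ry = ranks(x), ranks(y)
    n = len(x)
    d2 = sum((a - b) ** 2 for a, b in zip(rx, ry))
    return 1 - (6 * d2) / (n * (n ** 2 - 1))

x = [1, 2, 3, 4, 5]      # invented paired observations
y = [2, 1, 4, 3, 5]
r_p = pearson_r(x, y)    # 0.8
r_s = spearman_r(x, y)   # 0.8
```

For this data both coefficients agree because the relationship is close to linear; on strongly non-linear but monotonic data, Spearman's coefficient would stay high while Pearson's would drop.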
10. Confidence Interval:
• A confidence interval for the correlation coefficient is a range of values that is likely to contain the true
correlation coefficient of a population based on sample data.
• It provides an estimate of the uncertainty around the correlation measurement.
• Typically, a confidence interval is expressed at a certain confidence level, such as 95%, meaning that if
we were to take many samples and compute the confidence interval for each, approximately 95% of
those intervals would contain the true correlation coefficient.
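One standard way to compute such an interval (not spelled out on the slide) is the Fisher z-transformation; the sketch below assumes Pearson's r and a 95% confidence level:

```python
from math import atanh, tanh, sqrt

def correlation_ci(r, n, z_crit=1.96):
    """Approximate 95% CI for Pearson's r via the Fisher z-transformation.
    Transform r to z (approximately normal), build the interval, transform back."""
    z = atanh(r)
    se = 1 / sqrt(n - 3)   # standard error of z
    return tanh(z - z_crit * se), tanh(z + z_crit * se)

lo, hi = correlation_ci(0.6, 30)   # roughly (0.31, 0.79)
```

Note that the interval is not symmetric around r on the original scale, which reflects the bounded range of the correlation coefficient.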
11. Reporting Correlation:
• When reporting correlation results, it is essential to include the correlation coefficient, the sample
size, the p-value, and the confidence interval.
• This comprehensive reporting allows for better interpretation and understanding of the findings.
• Content Sampling Error: This error arises when test items do not adequately represent the full
content domain.
• Solution: Develop a test blueprint that ensures balanced representation of all content areas.
• Use alternate-form reliability techniques, where different but equivalent versions of the test are created
and compared to estimate and reduce such errors.
• Time Sampling Error: This error arises from fluctuations in test-takers’ performance due to
timing factors, such as mood, fatigue, or external circumstances.
• Solution: Administer the test multiple times (test-retest reliability) to check stability over time.
• Use appropriate intervals between testing sessions: longer intervals reduce practice effects
(remembering items from the first test), while shorter intervals limit genuine change in the construct.
• Scorer (Inter-rater) Error: This error arises from inconsistencies in how different raters score the
same responses.
• Solution: Use multiple raters and calculate inter-rater reliability. A correlation of 0.90 or higher between scores
from different raters indicates strong agreement.
• Employ double scoring for high-stakes assessments to cross-check accuracy.
3. Improving Test Design:
• Internal consistency: Evaluates whether all test items measure the same underlying
construct.
• Solution: Use statistical measures such as:
• Split-half reliability: Divide the test into two halves and assess how well scores from each
half correlate.
• Cronbach’s alpha: A measure of how well items are interrelated. Higher alpha values indicate
greater reliability.
• Revise or eliminate poorly performing items (items with low item-total correlations).
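The internal-consistency measures above can be sketched in plain Python; the item scores below are invented for illustration:

```python
def cronbach_alpha(items):
    """Cronbach's alpha from item-score columns.
    items: list of k lists, each holding one item's scores across the same test-takers."""
    def var(v):   # population variance
        m = sum(v) / len(v)
        return sum((x - m) ** 2 for x in v) / len(v)
    k = len(items)
    totals = [sum(scores) for scores in zip(*items)]   # each person's total score
    return (k / (k - 1)) * (1 - sum(var(i) for i in items) / var(totals))

# Three items whose scores rise and fall together -> high alpha
alpha = cronbach_alpha([[2, 4, 5, 3], [3, 4, 5, 3], [2, 5, 5, 2]])
```

When all items move in perfect lockstep, alpha reaches 1.0; items that vary independently of the total drag it down, which is why low item-total correlations flag candidates for revision.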
• Test Length: Rationale: Longer tests generally produce more reliable scores because they reduce the
influence of random errors in individual items.
• Solution: Add more items that assess the same construct, ensuring they are high-quality and well-
designed.
• Develop parallel test forms with equivalent difficulty levels to increase reliability for repeated
testing.
4. Using Advanced Theoretical Models:
• Item Response Theory (IRT): A modern psychometric approach that focuses on the
relationship between item difficulty and test-taker ability.
• Advantages: Allows precise calibration of test items for difficulty and discrimination.
• Enables adaptive testing, where the test adapts to the ability level of the test-taker, improving efficiency
and reliability.
• Application: Use IRT to identify items that function differently across subgroups (differential item
functioning).
• Generalizability Theory (GT): GT goes beyond classical test theory to evaluate multiple
sources of error simultaneously, such as item, time, and scorer variability.
• Application: Use GT to design tests that minimize variance caused by multiple factors, offering a
comprehensive view of reliability.
5. Evaluating Reliability Data in Context:
• Purpose-Specific Reliability: High reliability is more critical for high-stakes tests (e.g.,
licensing exams) than for exploratory or low-stakes measures.
• Reliability coefficients above 0.80 are generally acceptable; however, thresholds of 0.90 or higher may
be needed for critical decisions.
• Sample Considerations: Ensure the reliability analysis is conducted on a representative sample
of the target population to avoid biased estimates.
• Interpretation: Use confidence intervals to interpret reliability coefficients, considering the
specific purpose and constraints of the test.
Standardization of Administration:
• Challenge: Variability in test administration (e.g., differences in instructions, environment, or time
limits) introduces error.
• Solution: Use standardized instructions and administration procedures for all test-takers.
• Control environmental factors such as noise, lighting, and seating arrangements.
• The Standard Error of Measurement (SEM) estimates how much an individual's scores would vary if
they took the same test an infinite number of times. This concept is important because it helps
quantify the imprecision in any single observed score.
• Formula: SEM = SD × √(1 - r)
• where:
• SD = the standard deviation of the test
• r = the reliability coefficient
• The larger the SD (and the lower the reliability r), the larger the SEM.
Why is SEM important?
• Reminds us that test scores are not precise.
• Provides a range of possible true scores for an individual.
• Allows us to quantify the extent to which a test provides accurate scores.
• A low SEM indicates high score accuracy.
• A high SEM indicates low score accuracy.
Example: Applying the SEM
• Case study:
• Maria's Vocabulary subtest score (WAIS-III)
• Calculating the SEM and confidence interval
• Interpreting the results
• SD = 3, M = 10, reliability r = .79
• SEM = 3 × √(1 − .79) ≈ 1.37, which is then used to build an interval around the estimated true
score at the chosen confidence level.
Confidence Interval:
• A statistical tool that provides a range of values within which a population parameter is
likely to lie.
• To calculate confidence interval we apply the percentages of estimated confidence level to
the estimated true scores.
• Importance:
1. Reminds us that test scores are not precise.
2. Prevents overvaluing insignificant score differences.
Confidence Interval:
• Getting estimated true score using formula based on Dudek (1979):
• T ′ = r (Xo – M ) + M
• where:
• T ′ = the individual’s estimated true score
• r= estimated reliability of test scores
• Xo = the individual’s obtained score
• M = the mean of the test score distribution
Confidence Interval:
• Xo = 15, r= .79, and M = 10
T′ = (.79)(15 − 10) + 10 = 13.95 ≈ 14
• If Xo is above M, then T′ is lower than Xo.
• If Xo is below M, then T′ is greater than Xo.
• If Xo = M, then T′ = M.
Confidence Interval:
• Now, calculating the confidence interval:
• 68% CI = T′ ± SEM
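Maria's worked example (Xo = 15, r = .79, M = 10, SD = 3) can be reproduced with a few lines of Python:

```python
from math import sqrt

def estimated_true_score(x_obs, r, mean):
    """Dudek (1979): regress the obtained score toward the mean."""
    return r * (x_obs - mean) + mean

def sem(sd, r):
    """Standard error of measurement: SEM = SD * sqrt(1 - r)."""
    return sd * sqrt(1 - r)

t = estimated_true_score(15, 0.79, 10)   # 13.95, rounds to 14
s = sem(3, 0.79)                          # about 1.37
ci_68 = (t - s, t + s)                    # roughly 12.6 to 15.3
```

Because Xo = 15 is above the mean of 10, the estimated true score of 13.95 is pulled back toward M, exactly as the bullet points above describe.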
SEdiff Formula 1:
SEdiff = SD× √( 2 –r11 –r22 )
where
SD = the standard deviation of Test 1 and Test 2
r11 = the reliability estimate for scores on Test 1
r22 = the reliability estimate for scores on Test 2
Standard Error of the Difference:
SEdiff Formula 2:
SEdiff = √(SEM1^2 + SEM2^2)
Where:
SEM1 = Standard Error of Measurement for Score 1
SEM2 = Standard Error of Measurement for Score 2
SEdiff Formula 1 is used if the two test scores being compared are expressed in
the same scale, and SEdiff Formula 2 is used when they are not.
Example: Applying SEdiff:
• The Standard Error of the Difference (SEdiff) is applied to determine the statistical significance of
differences between Maria's Vocabulary and Information subtest scores on the WAIS-III.
• SEdiff = √(2 × SEM²) ≈ 1.80 (both subtests are on the same scale with approximately equal SEMs)
• z = 5 / 1.80 ≈ 2.78
• Determining Statistical Significance:
p-value = 2 × .0027 = .0054 (two-tailed)
Example: Applying SEdiff:
• Interpretation:
The probability that the 5-point difference between Maria's Vocabulary and Information subtest scores is due
to chance is 5.4 in 1,000.
• Conclusion:
Maria's knowledge of vocabulary most likely exceeds her knowledge of general information.
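The SEdiff calculation for Maria can be sketched in Python. The SEM value of about 1.27 is an assumption back-solved from the slide's SEdiff of roughly 1.80, and the p-value uses a normal approximation:

```python
from math import sqrt, erf

def sediff(sem1, sem2):
    """Standard error of the difference between two scores (Formula 2)."""
    return sqrt(sem1 ** 2 + sem2 ** 2)

def two_tailed_p(diff, se):
    """Two-tailed p-value for a score difference, assuming normally distributed error."""
    z = diff / se
    phi = 0.5 * (1 + erf(z / sqrt(2)))   # standard normal CDF
    return 2 * (1 - phi)

se = sediff(1.27, 1.27)   # about 1.80 when both SEMs are equal
p = two_tailed_p(5, se)   # about .005, matching the slide's .0054
```

With equal SEMs, Formula 2 reduces to SEM × √2, which is the same quantity the slide writes as √(2 × SEM²).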
Profile Analysis:
• Using SEM to analyze profiles of subtest scores
• Example: Maria's WAIS-III subtest scores
[Table: Maria's WAIS-III subtest obtained scores, their SEMs, and 90% confidence intervals]
Relationship Between Reliability and Validity:
• Reliability is a necessary but not sufficient condition for validity.
• Score reliability can be seen as minimal evidence of validity.
• However, a test may produce reliable (consistent, stable) scores without those scores being valid
for a given purpose.
THE END
THANK YOU