
Validity and Reliability

Class 4

Foxcroft, C., & Roodt, G. (2018). Introduction to psychological assessment in the South African context (5th ed., ch. 4 & 5). Oxford University Press.
RELIABILITY
Reliability is the consistency with which a measure measures whatever it measures. For example, a speedometer should consistently reflect the speed of the car.
It is important to define:
- What you want to measure (e.g. weight or height)
- The nature of the measurement instrument you want to use
- How exactly you are going to measure it

A person's performance in one administration of a measure does not reflect with complete accuracy the 'true' amount of the trait that the individual possesses.
Other systematic or chance factors may affect his or her score on the measure,
e.g. the person's emotional state of mind, fatigue, noise outside the test room, etc.
The observed test score thus comprises a true score and an error score.
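In the notation of classical test theory (standard textbook notation, not spelled out on the slide), this is usually written as X = T + E, and the reliability coefficient is the proportion of true-score variance in the observed scores: r_XX = σ²_T / σ²_X.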
TYPES OF RELIABILITY

Test-retest reliability – administer the measure twice to the same individuals and correlate the two sets of scores; the correlation is the reliability coefficient. Personality and exercise example: similar results for both the cardio exercise and the personality measure on two different occasions. (A small computational sketch follows this list.)
Alternate-form reliability – two equivalent forms of the questionnaire and exercises are administered on two different occasions and then correlated.
Split-half reliability – divide the questionnaire and exercises into two halves, score each half separately, and correlate the two half-scores.
Inter-item consistency – a formula (such as coefficient alpha) used to see how consistently the questions in our questionnaire, and the exercises we created, measure the same thing.
Inter-scorer (rater) reliability – how consistently different scientists score or rate participants on their questionnaires and exercises.
Intra-scorer (rater) reliability – whether we as individual scientists rate or score consistently, without being affected by other personal factors of the participant.
FACTORS AFFECTING RELIABILITY
Respondent (testee) error (the person doing the test):
Non-response errors / self-selection bias
Measure is speeded
Variability in individual scores (compare scores to the population for which the measure was intended)
Ability levels (compute reliability separately for homogeneous subgroups, such as age, gender, occupation)
Response bias:
- Extremity bias (consistently very positive or very negative responses)
- Centrality or neutrality bias
- Stringency or leniency bias (raters are overly strict or lenient)
- Acquiescence bias (respondent agrees with all questions, so no preferences are noted)
- Halo effect (raters rate more favourably if they like the individual)
- Social desirability bias (responding in the way you think is socially desirable)
- Purposive falsification
- Unconscious misrepresentation
FACTORS AFFECTING RELIABILITY
Administrative error (the person administering the test)
Variations in:
Instructions
Assessment conditions
Interpretation of instructions
Scoring or ratings

Countered by:
- Manuals with standardised instructions
- Following these instructions
INTERPRETING RELIABILITY
Standardised measures should have reliabilities ranging between .80 and .90. Some scholars state that reliability coefficients should be .85 or higher if measures are used to make decisions about individuals, while .65 or higher may be acceptable for decisions about groups.
Magnitude of the reliability coefficient:
- Standardised measures: 0.80 to 0.90
- Decisions about individuals: 0.85 or higher
- Decisions about groups: 0.65 or higher
- Personality and interest measures: 0.80 to 0.85
- Aptitude measures: 0.90 or higher

Standard error of measurement – another way of interpreting reliability coefficients.
It interprets individual test scores in terms of the reasonable limits within which they are likely to vary as a function of measurement error, and it is independent of the variability of the group on which it was computed.
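The usual formula (standard in psychometric texts, though not shown on the slide) is SEM = SD_X × √(1 − r_XX), where SD_X is the standard deviation of the observed scores and r_XX the reliability coefficient. An individual's score can then be reported as X ± 1.96 × SEM to give approximate 95% confidence limits.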
TYPES OF VALIDITY
Validity refers to how accurately a measure measures what it is supposed to measure (accuracy).

Content-description Procedures:
Face validity – does the instrument look, on the surface, like it measures what it claims to (does it look like a personality measure, or like an exercise watch)?
Content Validity- involves determining whether the content of the measure (the
questions in our questionnaire or the exercises we want our watch to measure) covers
a representative sample of the behaviour domain/ aspect to be measured (e.g. the
competency to be measured). A frequently used procedure to ensure high content
validity is the use of a panel of subject experts to evaluate the items during the test
construction phase.
TYPES OF VALIDITY
Construct-Identification Procedures:
Construct validity: involves a quantitative, statistical analysis procedure. The construct validity of a measure is the extent to which it measures the theoretical construct or trait it is supposed to measure, such as extroversion, neuroticism, resting heart rate, metabolic age, etc.
Correlation with other tests – we would administer our personality measure and compare the results to an already established personality measure, and compare a single participant's readings on our exercise watch to Polar and Fitbit readings.
Factorial validity – factor analysis is a statistical technique for analysing the interrelationships of variables. The aim is to determine the underlying structure or dimensions of a set of variables because, by identifying the common variance between them, it is possible to reduce a large number of variables to a relatively small number of factors or dimensions. We can use this technique when we consult our panel of experts to do a theme analysis on which personality characteristics are important for going to space. We can do a meta-analysis of previous studies on which physical exercises count in your favour for going to space, and correlate those constructs with what our exercise watch actually measures.
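A minimal sketch of how a factor analysis could be run in Python, assuming scikit-learn is available (the data, number of factors and variable names are hypothetical):

import numpy as np
from sklearn.decomposition import FactorAnalysis

# Hypothetical responses: 200 participants x 10 questionnaire items,
# generated from two underlying traits so the example has some structure
rng = np.random.default_rng(42)
latent = rng.normal(size=(200, 2))
weights = rng.normal(size=(2, 10))
X = latent @ weights + rng.normal(scale=0.5, size=(200, 10))

# Extract two factors from the inter-item relationships
fa = FactorAnalysis(n_components=2, random_state=0)
fa.fit(X)

# Loadings show how strongly each item relates to each factor;
# items loading on the same factor are candidates for one underlying dimension
loadings = fa.components_.T          # shape: (10 items, 2 factors)
print(np.round(loadings, 2))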
TYPES OF VALIDITY
Convergent and discriminant validity – a measure demonstrates construct validity when it correlates highly with other variables with which it should theoretically correlate (convergent validity), e.g. extroversion with warmth towards others, or cardiovascular fitness with metabolic age, and when it correlates minimally with variables from which it should differ (discriminant validity), e.g. extroversion and structure, or heart rate and shoe size.
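A minimal illustration of checking convergent versus discriminant correlations in Python (all variable names and data are hypothetical):

import numpy as np

rng = np.random.default_rng(7)
extroversion = rng.normal(size=100)
warmth = 0.8 * extroversion + rng.normal(scale=0.5, size=100)   # should converge
heart_rate = 70 + 8 * rng.normal(size=100)
shoe_size = 42 + 2 * rng.normal(size=100)                       # should be unrelated to heart rate

r_convergent = np.corrcoef(extroversion, warmth)[0, 1]          # expected to be high
r_discriminant = np.corrcoef(heart_rate, shoe_size)[0, 1]       # expected to be near zero
print(round(r_convergent, 2), round(r_discriminant, 2))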
Incremental and differential validity – a measure displays incremental validity when it explains numerically additional variance compared to a set of other measures when predicting a dependent variable. For instance, a measure of emotional intelligence would possess incremental validity if it explains additional variance compared to a set of 'Big Five' personality measures when predicting job performance. In our example: would our personality questionnaire give any more information than an already existing measure?
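A minimal sketch of how incremental validity could be checked by comparing explained variance (R²) with and without the new predictor, assuming scikit-learn and hypothetical data:

import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(3)
big_five = rng.normal(size=(150, 5))        # existing predictors
eq = rng.normal(size=(150, 1))              # hypothetical new measure (e.g. emotional intelligence)
job_perf = big_five @ rng.normal(size=5) + 0.4 * eq[:, 0] + rng.normal(size=150)

# R-squared using only the existing predictors
r2_baseline = LinearRegression().fit(big_five, job_perf).score(big_five, job_perf)

# R-squared after adding the new measure
augmented = np.hstack([big_five, eq])
r2_augmented = LinearRegression().fit(augmented, job_perf).score(augmented, job_perf)

# The increase in R-squared is the numerically additional variance explained
print(round(r2_augmented - r2_baseline, 3))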
Criterion-prediction Procedures:
Concurrent validity- How accurately can our personality questionnaire or exercise watch measure our
participants’ current personality functioning and current fitness levels.
Predictive Validity – How accurately can our personality measure and fitness watch predict future behaviour
or future fitness levels.
INTERPRETATION OF VALIDITY
The predictive validity coefficient is the correlation coefficient between the predictor variable(s) and the criterion variable.
The magnitude of the validity coefficient:
- Should be statistically significant at the 0.05 and 0.01 levels
- For selection purposes, values of 0.30 and 0.20 are acceptable
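A minimal sketch of computing and testing a predictive validity coefficient in Python, assuming SciPy is available (the scores below are hypothetical):

import numpy as np
from scipy.stats import pearsonr

rng = np.random.default_rng(11)
predictor = rng.normal(size=80)                       # e.g. selection test scores
criterion = 0.3 * predictor + rng.normal(size=80)     # e.g. later job performance ratings

r, p_value = pearsonr(predictor, criterion)           # validity coefficient and its p-value
print(round(r, 2), p_value < 0.05)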

Validity depends on, and is limited by, reliability:
- There is no point in trying to validate an unreliable measure
- Reliability does NOT imply validity
- Reliability is a necessary but not sufficient precondition for validity
THANK YOU
