Psychometrics Lecture 2
What was Stern's definition of an IQ? In your answer, explain the concept of mental age. What was the major drawback with the measure that Stern proposed?
Main points
The three essential ingredients of your answer are as follows:
1. Explanation of MENTAL AGE.
2. STERN'S DEFINITION. Define the IQ verbally, as well as giving the formula.
3. The PROBLEM (psychometric mental age doesn't increase beyond 15, which is problematic for the measurement of adult intelligence).
Anything else you may be able to add is a luxury. Remember, this question hasn't asked for a solution to the problem, so you don't have to give one. You need only show the examiner that you understand WHY there's a problem.
Mental age
A person's MENTAL AGE is the chronological age at which most children can perform at the same level. So if a 25-year-old man performs at the levels of typical 9-year-olds, his CHRONOLOGICAL AGE is 25, but his MENTAL AGE is 9.
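Stern's quotient is usually written as mental age divided by chronological age, conventionally multiplied by 100. The slides ask for the formula without printing it, so the form below is the standard textbook one rather than a quotation from the lecture. The function name ratio_iq is made up for this sketch. It also illustrates the problem: if mental age stops rising at 15, the quotient keeps falling as an adult gets older.

```python
def ratio_iq(mental_age: float, chronological_age: float) -> float:
    """Ratio IQ: mental age divided by chronological age, times 100.

    Assumption: this is the standard textbook form of Stern's quotient;
    the function name is hypothetical.
    """
    if chronological_age <= 0:
        raise ValueError("chronological age must be positive")
    return 100.0 * mental_age / chronological_age


# The 25-year-old from the slide who performs like typical 9-year-olds.
print(ratio_iq(9, 25))

# The problem: mental age is capped at 15, chronological age is not.
for age in (15, 30, 60):
    print(age, ratio_iq(15, age))
```

An adult whose performance never changes would appear to lose half of his or her IQ between the ages of 15 and 30, which is why the ratio measure cannot be used for adult intelligence.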
Lecture 2 RELIABILITY
Validity
A test is a MEASURING INSTRUMENT. A test is said to be VALID if it measures what it is supposed to measure. Applicants for a post in senior management may be given a psychometric test of leadership capability. But do a candidate's responses really indicate his or her suitability for the post? This is a question about the VALIDITY of a test.
Validity
When we say the test is VALID, we mean that a person's responses to the questions in the test really do tell us something about how that person would perform in a real situation requiring managerial capability.
Validity
According to Binet, the child who can draw the cone from memory has a greater scholastic aptitude than a child who cannot. But is that true? Is a child's mental age, as measured by Binet's test, really a measure of scholastic aptitude? Is the child who can draw the cone really better at school subjects (such as French, geometry or chemistry) than a child who cannot? These are questions about the VALIDITY of a psychological test. Does it measure the hypothetical quality that it is supposed to measure?
Reliability
If a test is to be VALID, it must, in the first place, be RELIABLE. A RELIABLE test is one that gives CONSISTENT RESULTS if taken by the same participants on different occasions or when they are tested by different examiners.
Reliability
A reliable test produces CONSISTENT results. If John scored at the 70th percentile on the first occasion of testing, he would, if tested on other occasions, score at similar percentile levels. His scores when tested on subsequent occasions are indicated by the dashed lines at the 69th, the 68th, the 73rd and the 72nd percentiles. He's always somewhere near the 70th percentile. This is a RELIABLE test.
Unreliability
An unreliable test gives INCONSISTENT results. John scores at the 70th percentile on the first occasion. On subsequent occasions, however, he scores at the 20th, the 40th, the 45th, the 75th and the 60th percentiles (not necessarily in that order). A test showing this sort of inconsistency is an UNRELIABLE test.
Definition of reliability
1. The scores must have a distribution that DIFFERENTIATES among those tested.
2. A test is said to be reliable if, given that the scores display the necessary variability and DISTRIBUTION, individuals retain their relative standing in the distribution from occasion to occasion of testing, and when tested by different administrators. A child should score at similar PERCENTILES from occasion to occasion.
3. A reliable test thus gives CONSISTENT RESULTS.
Personality tests
Many tests of personality have several subsections, each of which measures a distinct aspect of personality. So an overall aggregate score may be a sum of several scores which are themselves aggregates of scores on the items in the various subsections of the test. Cattell's personality test produces scores on 16 subscales, each supposedly measuring one of 16 personality factors.
Visual patterns
You can build up increasingly complex patterns by increasing the size of the grid. The lower pattern is, of course, much more difficult to reproduce from memory than would be one in a smaller grid.
Age norms
Norms are available for both the Corsi Blocks and Visual Patterns tests. Both Corsi Blocks and Visual Patterns spans decrease noticeably with age. To assess whether someone in their seventies has sustained cognitive impairment, that person's score must be compared with the distribution of scores of people in that age group.
1. Test-retest reliability
Give the test to a large number of people. Give the test again to the same people. You will have a bivariate data set comprising the scores of each person on the two tests. Calculate the Pearson correlation r between Score on the FIRST occasion and Score on the SECOND occasion. The value of r should be at least .75.
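The calculation can be sketched in a few lines. The scores below are invented for illustration (a real study would use a far larger sample), the helper name pearson_r is made up for this sketch, and the .75 threshold is the one given on the slide.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equally long lists of scores."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two lists of equal length, at least 2 scores")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("scores must vary, or r is undefined")
    return sxy / sqrt(sxx * syy)


# Hypothetical scores of six people on the two occasions of testing.
first = [12, 15, 9, 20, 17, 11]
second = [13, 14, 10, 19, 18, 10]

r = pearson_r(first, second)
print(round(r, 2), "meets the .75 criterion:", r >= 0.75)
```

The same calculation serves for any bivariate set of scores from the same people, whichever pair of variables is being correlated. Note that r is undefined when the scores do not vary, which echoes the requirement that scores must differentiate among those tested.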
2. Parallel forms.
Construct two equivalent forms of the same test, Form A and Form B. Ensure that people score at similar levels on the two forms. This has been done with the Visual Span test. Test each of a large sample of people with both Form A and Form B. Let Variable A contain their scores on Form A of the test; Variable B contains their scores on Form B of the test. This is a bivariate data set. Calculate the Pearson correlation between A and B. The correlation should be at least .75.
3. Split-half: the method
Each person can now be given two totals:
1. A total on the odd-numbered items.
2. A total on the even-numbered items.
You will now have a bivariate data set comprising the Odd and Even totals achieved by all the people tested. Calculate the Pearson correlation to determine the split-half reliability. The value of r should be at least .75.
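As a rough sketch of the odd/even procedure, the code below scores a ten-item test marked 0 or 1 per item. The five response patterns are invented, and the function names are made up for this illustration; they are not taken from the lecture.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equally long lists of scores."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def split_half_totals(item_scores):
    """Return (odd total, even total) for one person's item scores."""
    odd = sum(item_scores[0::2])   # items 1, 3, 5, ...
    even = sum(item_scores[1::2])  # items 2, 4, 6, ...
    return odd, even


# Hypothetical responses: one row per person, one 0/1 mark per item.
responses = [
    [1, 1, 1, 1, 1, 1, 1, 1, 1, 0],
    [1, 1, 1, 0, 1, 1, 0, 0, 0, 0],
    [1, 0, 0, 0, 1, 0, 0, 0, 0, 0],
    [1, 1, 1, 1, 0, 1, 1, 0, 1, 0],
    [1, 1, 0, 1, 0, 0, 0, 0, 0, 0],
]

totals = [split_half_totals(person) for person in responses]
odd_totals = [odd for odd, _ in totals]
even_totals = [even for _, even in totals]

split_half_r = pearson_r(odd_totals, even_totals)
print(odd_totals, even_totals, round(split_half_r, 2))
```

Splitting by odd and even items, rather than first half against second half, keeps the two halves comparable when items get harder towards the end of the test.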
An example
Suppose that a test comprises ten items, each item being marked 0 or 1, for a wrong and a right answer, respectively. The score a person finally gets on the test is an aggregate of the ones and zeros over all ten items. A person's total score, therefore, can vary from 0 to 10.
The scatterplot
The scatterplot is indicative of the assumed linear relationship between scores on the odd and even items in the test. The split-half reliability is 0.86.
Test-retest: Disadvantages
On a test of attitudes or prejudice, memory for previous answers would make the test seem more reliable than it really is. The shorter the interval between the first and second testing, the stronger the memory effect is likely to be. There is therefore uncertainty about how long to make the interval between the two sessions. The test-retest method may OVERSTATE a test's reliability.
A person's vocabulary score consists of a TRUE component (true relative size of vocabulary) and a RANDOM component. The random (or ERROR) component is contributed to by the element of luck in the selection of words for the test.
A score's components
Longer tests
Tests with more items produce scores with relatively greater true components and relatively less error. Increase the true component of the total score by HAVING MORE ITEMS IN YOUR TEST.
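One way to put a number on this is the Spearman-Brown formula from classical test theory. The slide does not name it, so treat what follows as a supplementary sketch. It assumes that the added items are comparable to the existing ones, and the function name is made up for this illustration.

```python
def spearman_brown(reliability: float, length_factor: float) -> float:
    """Predicted reliability after changing test length by length_factor.

    Assumption: added items are comparable to the existing ones.
    length_factor = 2 means the test is doubled in length.
    """
    if not 0.0 <= reliability <= 1.0:
        raise ValueError("reliability must lie between 0 and 1")
    if length_factor <= 0:
        raise ValueError("length_factor must be positive")
    return (length_factor * reliability) / (
        1.0 + (length_factor - 1.0) * reliability
    )


# A hypothetical test with reliability .60, lengthened step by step.
for k in (1, 2, 3, 4):
    print(k, round(spearman_brown(0.60, k), 2))
```

Under these assumptions, doubling a test with a reliability of .60 would be expected to raise it to about .75, and each further lengthening adds progressively less.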
Summary
Reliability, in the technical sense of the term, was defined. Three methods of determining reliability were described: (1) the TEST-RETEST method; (2) the PARALLEL FORMS method; (3) the SPLIT-HALF method. Each method has its own advantages, disadvantages and applicability.
Summary
Reliability is a necessary, but not a sufficient, condition for validity. The reliability of intelligence tests, field dependence tests (Rod-and-frame, Embedded Figures) and personality tests (Introversion-Extraversion, Neuroticism-Stability) is very high, often .9 or greater. This fact in itself, however, does not demonstrate the VALIDITY of these tests. Next week, I shall turn to the ways in which psychometricians attempt to validate their tests.
Short question
What, in the context of mental testing, is meant by the RELIABILITY and VALIDITY of a test? Can a test be valid without being reliable? Can a test be reliable without being valid? Describe two approaches to the measurement of reliability, explaining the advantages and disadvantages of each.
Practice question
What is a DEVIATION IQ? In your answer, explain how a deviation IQ differs from IQ as defined by Stern. What is the advantage of a deviation IQ?