Assessment Notes

The document discusses the concept of intelligence, defining it as a multifaceted capacity that includes abilities such as reasoning, problem-solving, and adapting to new situations. It outlines the history and development of intelligence testing, including key figures like Alfred Binet and David Wechsler, and various theories of intelligence such as Spearman's two-factor theory and Gardner's multiple intelligences. Additionally, it covers the evolution of IQ tests, including the Stanford-Binet and Wechsler scales, and their applications in educational and psychological contexts.

Uploaded by

Farzana Afzal

INTELLIGENCE TEST (History and Theories)

What Is Intelligence?
• We may define intelligence as a multifaceted capacity that manifests itself in different
ways across the life span. In general, intelligence includes the abilities to:
■ acquire and apply knowledge
■ reason logically
■ plan effectively
■ make sound judgments and solve problems
■ grasp and visualize concepts
■ pay attention
■ be intuitive
■ find the right words and thoughts with facility
■ cope with, adjust to, and make the most of new situations
Intelligence Defined: Views of the Lay Public
Research 1
For the nonpsychologists, the behaviors most commonly associated with intelligence
were “reasons logically and well,” “reads widely,” “displays common sense,” “keeps an open
mind,” and “reads with high comprehension.”
Research 2 and 3
• In another study (Siegler & Richards, 1980), students enrolled in college developmental
psychology classes were asked to list behaviors associated with intelligence in infancy,
childhood, and adulthood.
• A study conducted with first-, third-, and sixth-graders
Intelligence Defined: Views of Scholars and Test Professionals
• Galton (1883) believed that the most intelligent persons were those equipped with the
best sensory abilities.
• For Binet, the components of intelligence included reasoning, judgment, memory, and
abstraction.
• David Wechsler: Intelligence, operationally defined, is the aggregate or global capacity
of the individual to act purposefully, to think rationally and to deal effectively with his
environment.
History of IQ test
• Paul Broca (1824-1880) and Sir Francis Galton (1822-1911) believed they could
determine intelligence by measuring the size of the human skull.
• Wilhelm Wundt (1832-1920) used introspection, the human ability to reflect on one's
own thoughts.
• These ideas are now outdated.

The first 'real' IQ test (1905)

• The first modern intelligence test in IQ history was developed in 1905 by Alfred Binet
(1857-1911) and Theodore Simon (1872-1961).
• It was commissioned by the French Ministry of Education.
• The test consisted of several components, such as logical reasoning, finding rhyming words
and naming objects.
• Stanford University psychologist Lewis Terman took Binet's original test and
standardized it using a sample of American participants.
• The Stanford-Binet intelligence test used a single number, known as the intelligence
quotient (or IQ), to represent an individual's score on the test. This score was calculated
by dividing the test taker's mental age by their chronological age, and then multiplying
this number by 100. For example, a child with a mental age of 12 and a chronological age
of 10 would have an IQ of 120 (12 / 10 x 100).
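The ratio IQ calculation described above can be sketched in a few lines of Python. The function name `ratio_iq` is hypothetical and chosen only for illustration; both ages are assumed to be given in the same unit (years).

```python
def ratio_iq(mental_age: float, chronological_age: float) -> float:
    """Ratio IQ: mental age divided by chronological age, multiplied by 100."""
    if chronological_age <= 0:
        raise ValueError("chronological age must be positive")
    # Multiply first so that whole-number ages give an exact result.
    return mental_age * 100 / chronological_age


# Worked example from the notes: mental age 12, chronological age 10.
print(ratio_iq(12, 10))  # 120.0
```

A child whose mental age equals their chronological age scores 100 under this formula, which is why 100 serves as the reference point.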

Army Alpha, Army Beta (1917)

• At the outset of World War I, U.S. Army officials were faced with the task of screening
an enormous number of recruits.
• In 1917, as chair of the Committee on the Psychological Examination of Recruits,
psychologist Robert Yerkes developed two tests known as the Army Alpha and Beta
tests.
• At the end of WWI, the tests remained in use in a wide variety of situations outside of the
military with individuals of all ages, backgrounds and nationalities.
The Wechsler Intelligence Scales
Wechsler also developed two different tests specifically for use with children: the
Wechsler Intelligence Scale for Children (WISC) and the Wechsler Preschool and Primary Scale
of Intelligence (WPPSI). The adult version of the test has been revised since its original
publication and is now known as the WAIS-IV.

History of the Wechsler Intelligence Scales

• Wechsler's view of intelligence differed greatly from the Binet scale which, in Wechsler's
day, was generally considered the supreme authority with regard to intelligence testing. A
drastically revised new version of the Binet scale, released in 1937, received a great deal
of criticism from David Wechsler.
• He felt that the 1937 Binet scale did not do a good job of incorporating non-intellective
factors into the scale.
• He did not agree with the idea of a single score that the Binet test gave.
• He argued that the Binet scale items were not valid for adult test-takers.
• He criticized the existing Binet scale because "it did not consider that intellectual
performance could deteriorate as a person grew older."
Charles E. Spearman (1904)
Two factor theory of intelligence
• Developed a statistical procedure called factor analysis, in which related variables are
tested for correlation with each other.
• Tested abilities such as distinguishing pitch, perceiving weight and colors, directions, and
mathematics.
• When analyzing the data he collected, Spearman noted that scores on the different tests
were positively correlated.
• Concluded that there must be one central factor that influences our cognitive abilities.
• Spearman termed this general intelligence g.
• Intelligence consists of only two factors: a general factor (g) and a specific factor (s).
• The ‘g’ factor is general mental ability across different tasks, or general mental energy. This
factor is involved in deductive reasoning and is linked to the "skill, speed, intensity, and
extent of intellectual output."
• The ‘s’ factor is a specific capacity that helps the person to deal with specific problems.
• The ‘g’ factor remains constant while the ‘s’ factor varies with the intellectual activity. For
example, one’s performance in Physics = g + s1; Maths = g + s2; English = g + s3.
• Different activities require different combinations of ‘g’ and ‘s’.
Primary Mental Abilities
• In 1938, Louis L. Thurstone proposed the Primary Mental Abilities theory. This theory
suggests that human intelligence is constituted by seven independent primary mental
abilities. These are the following:
o Verbal Comprehension
o Verbal Fluency
o Number or Arithmetic Ability
o Memory
o Perceptual Speed
o Inductive Reasoning
o Spatial Visualization
Fluid and crystallized intelligence
• In 1966, Raymond B. Cattell and John Horn developed the Fluid and
Crystallized Intelligence theory. That is, intelligence consists of two parts: fluid
intelligence and crystallized intelligence.
Cattell-Horn-Carroll theory of cognitive abilities (1996)
➢ Fluid Intelligence (Gf)
➢ Crystallized Intelligence (Gc)
➢ Visual Processing (Gv)
➢ Auditory Processing (Ga)
➢ Short-Term Memory (Gsm)
➢ Long-Term Storage and Retrieval (Glr)
➢ Processing Speed (Gs)
➢ Decision Speed (Gt)
➢ Quantitative Knowledge (Gq)
Theory of Multiple Intelligences (1983) by Howard Gardner
• Linguistic Intelligence: writers, poets, lawyers
• Logical-Mathematical Intelligence
• Spatial Intelligence: architects, geographers, pilots, sailors
• Musical Intelligence: talent in vocals and instruments
• Bodily-Kinesthetic Intelligence: athletes, actors, dancers
• Interpersonal Intelligence: counselors, social workers, religious leaders
• Intrapersonal Intelligence: artists, writers
Triarchic Theory of Human Intelligence
• Sternberg's (1985) Triarchic Theory of Human Intelligence includes three facets:
• Analytical Intelligence (componential): ability to solve academic problems such as
analogies and puzzles
• Creative Intelligence (experiential): ability to use prior knowledge in new or innovative
ways
• Practical Intelligence (contextual): ability to understand and deal effectively with
everyday tasks
Theory of Successful Intelligence
• Sternberg (1999) holds that individuals who excel in all areas of the triarchic intelligence
test may be considered to have successful intelligence, which he defines as the ability to
achieve success in accordance with one's personal standards and within one's
socio-cultural context.
The three-stratum theory of cognitive abilities
• Proposed by the American psychologist John Carroll in 1993. It is based on a
factor-analytic study of the correlation of individual-difference variables from data such as
psychological tests, school marks and competence ratings.
• The three-stratum theory is derived primarily from Spearman's (1927) model of general
intelligence and Horn & Cattell's (1966) theory of fluid and crystallized intelligence.
• The theory holds that there are a fairly large number of distinct individual differences in
cognitive ability, and that the relationships among them can be derived by classifying them
into 3 different strata: stratum I, "narrow" abilities; stratum II, "broad" abilities; and
stratum III, consisting of a single "general" ability.
Guilford’s Structure of Intellect model
• Evaluation of semantic units (EMU) is measured by the ideational fluency test in which
individuals are asked to make judgements about concepts. For example: "Which of the
following objects best satisfies the criteria, hard and round: an iron, a button, a tennis ball
or a lightbulb?" On the other hand, divergent production of semantic units (DMU) would
require the person to list all items they can think of that are round and hard in a given
time period.
Emotional Intelligence
• Mayer and Salovey conceptualized EQ as it relates to intellectual and emotional growth.
Their processing model contains four branches:
■ Perceiving Emotions
■ Facilitating Thought
■ Understanding Emotions
■ Managing Emotions
The Information-Processing View
• Derives from the work of the Russian neuropsychologist Aleksandr Luria.
• Focuses on how information is processed, rather than what is processed. Two basic types
of processing are distinguished.
• In simultaneous (or parallel) processing, information is integrated all at one time. In
successive (or sequential) processing, each bit of information is individually processed in
sequence.
• Examples: Sherlock Holmes, telephone number, admiring a painting
Binet and Wechsler intelligence scales
DEFINITION OF INTELLIGENCE:
“Intelligence is the aggregate or global capacity of the individual to act purposefully, to think
rationally and to deal effectively with his environment” (Wechsler, 1944)
THEODORE SIMON
• 10 July 1872 – 4 September 1961
• French psychologist
• Influenced by Alfred Binet, who was also his supervisor

ALFRED BINET
• July 8, 1857 – October 18, 1911
• French psychologist
• Influenced by the work of John Stuart Mill

• In 1904 a French professional group for child psychology was called upon by the French
government to appoint a commission on the education of retarded children. Binet, being
an active member of this group, found the impetus for the development of his mental
scale.
• In 1905 Binet and Simon developed the first Intelligence Test known as the Binet-Simon
Scale. The test was developed in order to identify children with learning disabilities so
that they might be placed in a special class.
• 1908, H.H. Goddard
• 1916 Lewis Terman
• The revised Stanford-Binet scale measures five weighted factors and consists of both
verbal and nonverbal subtests.
The original Binet-Simon Intelligence Scale was composed of 30 tasks, including items on
memory, vocabulary, verbal ability, and reasoning, such as:
• Le Regard (coordination in the movement of the head and the eyes which is associated
with the act of vision)
• Prehension Provoked by a Tactile Stimulus (coordination exists b/w a tactile stimulus
of the hand, and the movement of seizing and carrying to the mouth)
• Prehension Provoked by a Visual Perception (coordination exists b/w the sight of an
object and its prehension, when the object is not placed in contact with the hand of the
subject).
• Recognition of Food (subject can make the distinction by sight b/w familiar food and
what cannot be eaten)
• Quest of Food Complicated by a Slight Mechanical Difficulty (to bring into play a
rudiment of memory, an effort of will, and a coordination of movements)
1- Verbal Knowledge of Objects
2- Verbal Knowledge of Pictures
3- Naming of Designated Objects
4- Comparison of Two Weights
5- Immediate Comparison of Two Lines of Unequal Lengths
6- Repetition of Three Figures
7- Suggestibility
8- Verbal Definition of Known Objects
9- Definitions of Abstract Terms
10- Repetition of Sentences of Fifteen Words
11- Comparison of Known Objects from Memory
12- Exercise of Memory on Pictures
13- Drawing a Design from Memory
14- Immediate Repetition of Figures
15- Resemblances of Several Known Objects Given from Memory
16- Comparison of Lengths
17- Five Weights to be Placed in Order
18- Gap in Weights
19- Exercise upon Rhymes
20- Verbal Gaps to be Filled
21- Synthesis of Three Words in One Sentence
22- Reply to an Abstract Question
23- Reversal of the Hands of a Clock
24- Paper Cutting
REVISIONS OF BINET INTELLIGENCE SCALE
• April 1905: Development of Binet-Simon Test announced at a conference in Rome
• June 1905: Binet-Simon Intelligence Test introduced
• 1908 and 1911: New Versions of Binet-Simon Intelligence Test
• 1916: Stanford–Binet First Edition by Terman
• 1937: Second Edition by Terman and Merrill
• 1973: Third Edition by Merrill
• 1986: Fourth Edition by Thorndike, Hagen, and Sattler
• 2003: Fifth Edition by Roid
STANFORD-BINET 5th EDITION
• The Fifth Edition is based in the schooling process to assess intelligence.
• It is also capable of measuring multiple dimensions of abilities.
• It can be administered to individuals as early as two years of age.
• There are ten subtests included in this revision, covering both verbal and nonverbal
domains.
• Five factors are incorporated, each with the following activities:
o Fluid Reasoning: Early reasoning; Verbal absurdities; Verbal analogies; Object series
matrices (non-verbal)
o Knowledge: Vocabulary; Procedural knowledge (non-verbal); Picture absurdities
(non-verbal)
o Quantitative Reasoning: Non-verbal quantitative reasoning (non-verbal); Verbal
quantitative reasoning
o Visual-Spatial Processing: Form board and form patterns (non-verbal); Position and
direction
o Working Memory: Delayed response (non-verbal); Block span (non-verbal); Memory for
sentences; Last word

• Depending on age and ability, administration can range from 15 minutes to 1 hour 15
minutes.
• It incorporated a new scoring system, which can provide a wide range of information
such as four intelligence score composites, five factor indices, and ten subtest scores.
• Additional scoring information includes percentile ranks, age equivalents. Extended IQ
scores and gifted composite scores are available.
• In order to reduce errors and increase diagnostic precision, scores are obtained
electronically through the use of computers now.
IQ Range IQ Classification
145-160 Very gifted or highly advanced
130-144 Gifted or very advanced
120-129 Superior
110-119 High average
90-109 Average
80-89 Low average
70-79 Borderline impaired or delayed
55-69 Mildly impaired or delayed
40-54 Moderately impaired or delayed
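As a rough illustration, the classification table above can be expressed as a simple range lookup. The names `SB5_BANDS` and `classify_sb5` are hypothetical; the bands are taken from the table, and scores outside 40-160 are treated as out of range because the table does not cover them.

```python
# (lowest IQ, highest IQ, classification), taken from the table above.
SB5_BANDS = [
    (145, 160, "Very gifted or highly advanced"),
    (130, 144, "Gifted or very advanced"),
    (120, 129, "Superior"),
    (110, 119, "High average"),
    (90, 109, "Average"),
    (80, 89, "Low average"),
    (70, 79, "Borderline impaired or delayed"),
    (55, 69, "Mildly impaired or delayed"),
    (40, 54, "Moderately impaired or delayed"),
]


def classify_sb5(iq: int) -> str:
    """Return the classification label for a whole-number IQ score."""
    for low, high, label in SB5_BANDS:
        if low <= iq <= high:
            return label
    raise ValueError(f"IQ {iq} is outside the range covered by the table")


print(classify_sb5(100))  # Average
print(classify_sb5(132))  # Gifted or very advanced
```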
RELIABILITY
Several reliability tests performed on it are split-half reliability, standard error of measurement
and test-retest stability. On average, the IQ scores for this scale have been found to be quite
stable across time (Janzen, Obrzut, & Marusiak, 2003).
Internal consistency was tested by split-half reliability and was reported to be substantial and
comparable to other cognitive batteries (Bain & Allin, 2005).
The test has also been found to have great precision at advanced levels of performance meaning
that the test is especially useful in testing children for giftedness (Bain & Allin, 2005).
Re-administration can occur in a six-month interval rather than one year due to the small mean
differences in reliability (Bain & Allin, 2005).
VALIDITY
Content validity has been found based on the professional judgments Roid received concerning
fairness of items and item content as well as items concerning the assessment of giftedness (Bain
& Allin, 2005).
CRITICISM:
• The test cannot compare people of different age categories, since each
category gets a different set of tests.
• Furthermore, very young children tend to do poorly on the test because they lack the
ability to concentrate long enough to finish it.
PRESENT USE:
• Clinical and neuropsychological assessment
• Educational placement,
• Career assessment,
• Adult neuropsychological treatment,
• Forensics and research on aptitude.
WECHSLER SCALES
• David Wechsler (1896 - 1981)
• American psychologist
• Developed prominent intelligence scales:
Wechsler Adult Intelligence Scale (WAIS)
Wechsler Intelligence Scale for Children (WISC)
Wechsler’s Criticism on Binet Scale
Wechsler was an influential advocate for the concept of non-intellective factors.
Non-intellective factors are variables that contribute to the overall score in intelligence but
are not made up of intelligence-related items. These include lack of confidence, fear of
failure, attitudes, etc.
• He felt that the 1937 Binet scale did not do a good job of incorporating these factors into
the scale
• Argued that the Binet scale items focused on use with children rather than adults.
• The "Binet scale's emphasis on speed, with timed tasks scattered throughout the scale,
tended to unduly handicap older adults."
• He believed that "mental age norms clearly did not apply to adults."
• He criticized the Binet scale because "it did not consider that intellectual performance
could deteriorate as a person grew older."
These criticisms of the 1937 Binet test helped produce the Wechsler–Bellevue scale, released in
1939.
Many of the original concepts Wechsler argued for have become standards in psychological
testing, including the point-scale concept and the performance-scale concept.
AGE SCALE (BINET)
• In the Binet scales (prior to the 1986 version) items were grouped according to age level.
• Each of these age levels was composed of a group of tasks that could be passed by two-
thirds to three-quarters of the individuals in that level. This meant that items were not
arranged according to content.
• Additionally, an individual taking a Binet test would only receive credit if a certain
number of the tasks were completed. This meant that falling just one task short of the
number required resulted in no credit at all. For example, if passing three out of four
tasks was required to receive credit, then passing two yielded no credit.
POINT SCALE (WECHSLER)
• The point scale concept assigned credits or points to each item of the test.
• This had two large effects. First, this allowed items to be grouped according to content.
• Second, participants were able to receive a set number of points or credits for each item
passed.
• The result was a test that could be made up of different content areas (or subtests) with
both an overall score and a score for each content area. In turn, this allowed for an
analysis to be made of an individual's ability in a variety of content areas.
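The contrast between the two scoring formats can be sketched as follows. This is a minimal illustration only: the function names, content-area names, and point values are hypothetical and do not come from any test manual.

```python
def age_level_credit(tasks_passed: int, tasks_required: int) -> bool:
    """Age-scale format: all-or-nothing credit for one age level."""
    return tasks_passed >= tasks_required


def point_scale_scores(item_points: dict[str, list[int]]) -> tuple[dict[str, int], int]:
    """Point-scale format: sum the points per content area and overall."""
    area_scores = {area: sum(points) for area, points in item_points.items()}
    return area_scores, sum(area_scores.values())


# Age scale: passing two tasks when three of four are required earns no credit.
print(age_level_credit(tasks_passed=2, tasks_required=3))  # False

# Point scale: every passed item contributes, grouped by content area.
areas, total = point_scale_scores({
    "vocabulary": [1, 1, 0, 1],   # made-up item points
    "arithmetic": [1, 0, 0, 1],   # made-up item points
})
print(areas, total)  # {'vocabulary': 3, 'arithmetic': 2} 5
```

The point-scale result keeps both the per-area scores and the overall score, which is what makes an analysis across content areas possible.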
NON-VERBAL PERFORMANCE SCALE
• Essentially, this scale required a subject to do something (such as copying symbols or
point to a missing detail) rather than just answer questions. This was an important
development as it attempted to overcome biases that were caused by "language, culture,
and education."
• Further, this scale also provided an opportunity to observe a different type of behavior
because something physical was required. Clinicians were able to observe how a
participant reacted to the "longer interval of sustained effort, concentration, and attention"
that the performance tasks required.
• The Wechsler Adult Intelligence Scale (WAIS) is an IQ test designed to measure
intelligence and cognitive ability in adults and older adolescents.
• The WAIS was initially created as a revision of the Wechsler–Bellevue Intelligence Scale
(WBIS), which was a battery of tests published by Wechsler in 1939.
• First released in February 1955 by David Wechsler
• The WBIS was composed of six verbal and five performance subtests. The verbal subtests
were: Information, Comprehension, Arithmetic, Digit Span, Similarities, and Vocabulary.
The performance subtests were: Picture Arrangement, Picture Completion, Block Design,
Object Assembly, and Digit Symbol.
• A verbal IQ, performance IQ and full scale IQ were obtained.
WAIS-R
• A revised form was released in 1981 and this edition did not provide new validity data,
but used the data from the original WAIS.
WAIS-III
• It was released in 1997. In this version four secondary indices were introduced (Verbal
Comprehension, Working Memory, Perceptual Organization, and Processing Speed)
WAIS-IV
• Was released in 2008
• It is appropriate for use with individuals aged 16–90 years.
• Composed of 10 core subtests and 5 supplemental subtests, with the 10 core subtests
comprising the Full Scale IQ.
• The verbal/performance subscales were replaced by the index scores, and a General
Ability Index (GAI) was included. The GAI is clinically useful because it can be used as a
measure of cognitive abilities that are less vulnerable to impairments of processing speed
and working memory.
Index scores and scales
• Verbal Comprehension Index (VCI)
• Perceptual Reasoning Index (PRI)
• Working Memory Index (WMI)
• Processing Speed Index (PSI)
Two broad scores, which can be used to summarize general intellectual abilities, can also be
derived:
• Full Scale IQ (FSIQ), based on the total combined performance of the VCI, PRI, WMI,
and PSI
• General Ability Index (GAI), based only on the six subtests that the VCI and PRI comprise

RELIABILITY AND VALIDITY


• WAIS is a well-established scale and it has fairly high consistency.
• Over a two-to-twelve-week time period, the test-retest reliabilities ranged from 0.70 (7
subscales) to 0.90 (2 subscales).
• Inter-scorer coefficients were very high, all being above 0.90.
• According to the test manual, the instrument targets three areas: psychoeducational
disability, neuropsychiatric and organic dysfunction, and giftedness.
WECHSLER INTELLIGENCE SCALE FOR CHILDREN (WISC)
• It is an individually administered intelligence test for children between the ages of 6 and
16.
• It was originally developed in 1949
• WISC-R was published in 1974
• WISC-III was published in 1991 and brought in a new subtest as a measure of processing
speed. Four new index scores were introduced:
Verbal Comprehension Index (VCI) Perceptual Organization Index (POI)
Freedom from Distractibility Index (FDI) Processing Speed Index (PSI)
• WISC-IV was produced in 2003
• WISC-V introduced in 2014 is the most current version.
• This test takes 45–65 minutes to administer.
• This test also generates a Full Scale IQ (IQ score) that represents a child's general
intellectual ability.
WISC-V provides five primary index scores:
• Verbal Comprehension Index (VCI)
• Visual Spatial Index (VSI)
• Fluid Reasoning Index (FRI)
• Working Memory Index (WMI)
• Processing Speed Index (PSI)
These indices represent a child's abilities in discrete cognitive domains. Two subtests must be
administered to obtain each of the primary index scores; thus there are 10 primary subtests. The
Full Scale IQ is derived from 7 of the 10 primary subtests
WISC-V also contains five ancillary index scores:
• Quantitative Reasoning Index (QRI)
• Auditory Working Memory Index (AWMI)
• Nonverbal Index (NVI)
• General Ability Index (GAI)
• Cognitive Proficiency Index (CPI)
Three of these ancillary index scores (NVI, GAI, and CPI) can be derived from the 10 primary
subtests.
Two ancillary index scores, termed the expanded index scores, were released the year after the
2014 publication, so they are not included in the published manuals: the Verbal (Expanded
Crystallized) Index (VECI) and the Expanded Fluid Index (EFI).

RELIABILITY AND VALIDITY


WISC–V is also linked with measures of achievement, adaptive behavior, executive function,
and behavior and emotion. A number of concurrent studies were conducted to examine the
scale's reliability and validity.
Evidence of the convergent and discriminant validity of the WISC–V is provided by
correlational studies with the following instruments: WISC–IV, WAIS–IV, KABC–II, Vineland–
II, and BASC–II.
Evidence of construct validity was provided through a series of factor-analytic studies and mean
comparisons using matched samples of special group and nonclinical children.
SCORING AND INTERPRETATION OF WAIS/WISC
Calculating the Examinee’s Chronological Age
• The examinee’s exact age is needed to locate the correct norms tables.
• Enter the date of testing and the examinee’s date of birth, then subtract the date of birth
from the date of testing.
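The subtraction step above can be sketched in Python. This is an illustration under a stated assumption: when borrowing, a month is counted as 30 days and a year as 12 months. The function name `chronological_age` and the example dates are hypothetical; the test manual remains the authority on the exact convention.

```python
from datetime import date


def chronological_age(test_date: date, birth_date: date) -> tuple[int, int, int]:
    """Return age as (years, months, days) by column-wise subtraction.

    Assumption: a borrowed month counts as 30 days and a borrowed year
    as 12 months. Check the test manual for the convention it requires.
    """
    if birth_date > test_date:
        raise ValueError("birth date is after the date of testing")
    years = test_date.year - birth_date.year
    months = test_date.month - birth_date.month
    days = test_date.day - birth_date.day
    if days < 0:      # borrow one month (30 days)
        days += 30
        months -= 1
    if months < 0:    # borrow one year (12 months)
        months += 12
        years -= 1
    return years, months, days


# Hypothetical dates, for illustration only.
print(chronological_age(date(2024, 3, 10), date(2015, 8, 25)))  # (8, 6, 15)
```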
Recording Responses and Scores
• The following standard abbreviations may be useful in recording responses.
• P (pass) – the examinee responded or performed correctly
• F (fail) – the examinee responded or performed incorrectly.
• Q (Question/ Query) – a query or question was asked to clarify the examinee’s response.
• DK (don’t know) – the examinee indicated by shaking his or her head or saying, “I don’t
know”.
• NR (no response) – the examinee made no response to an item, either verbally or
by gesture.
• INC (incomplete) – the examinee did not complete an item within the time limit.
• PC (points correctly) – the examinee pointed to the missing part or correct choice on the
Picture Completion or Matrix Reasoning subtest.
• PX (points incorrectly) – the examinee pointed to an incorrect missing part or choice on
the Picture Completion or Matrix Reasoning subtest.
Converting Raw Scores to Scaled Scores
• Scaled scores are based on the examinee’s age (as calculated on the Demographic Page of
the Record Form). The examinee’s age in years determines which page of the table should
be used.
• After you have obtained the raw scores for each subtest by summing the item scores,
transfer the raw scores to the raw score conversion table on the Score Conversion Page of
the Record Form. The subtests are listed according to administration order.
• The subtests are ordered according to Verbal/Performance scale and administration order.
For each subtest, find the raw score under the subtest name. Then, reading across from
this raw score to the extreme left/right column, find the equivalent scaled score for the
subtest.
• Enter the subtest scaled score in the raw score conversion table in the box to the right of
the previously recorded raw score. Enter the subtests scaled score value in the Verbal or
Performance column and in the VC, PO, WM, or PS column. For example the Picture
Completion scaled score is entered in the Performance column and in the PO column.
Obtaining Sum of Scaled Scores
• The examinee’s verbal/performance score is the sum of scaled scores on the subtests.
• The full-scale score is the sum of the verbal score and the performance score; therefore, it
is the sum of 11 subtest scaled scores.
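The summing step can be sketched as below. The scaled-score values are made up for illustration, and the subtest names are the six verbal and five performance subtests listed earlier in these notes. Converting the sums to IQ scores requires the norms tables in the test manual, which are not reproduced here.

```python
# Hypothetical scaled scores for one examinee (illustrative values only).
verbal = {
    "Information": 10, "Comprehension": 11, "Arithmetic": 9,
    "Digit Span": 8, "Similarities": 12, "Vocabulary": 10,
}
performance = {
    "Picture Arrangement": 9, "Picture Completion": 11, "Block Design": 10,
    "Object Assembly": 8, "Digit Symbol": 10,
}

verbal_sum = sum(verbal.values())            # sum of 6 verbal scaled scores
performance_sum = sum(performance.values())  # sum of 5 performance scaled scores
full_scale_sum = verbal_sum + performance_sum  # sum of all 11 scaled scores

print(verbal_sum, performance_sum, full_scale_sum)  # 60 48 108
```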
Determining the IQ and Percentile
IQ LEVELS
IQ Level Descriptive Classification Percentile
130+ Very Superior 98 - 99.9
120 to 129 Superior 91 - 97
110 to 119 High Average 75 - 90
90 to 109 Average 25 - 73
80 to 89 Low Average 9 - 23
70 to 79 Borderline 2-8
69 & below Intellectual Disability .01 – 2
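The percentile column can be approximated from the IQ score if IQ is assumed to follow a normal distribution with a mean of 100 and a standard deviation of 15. That assumption, and the function name `iq_percentile`, are introduced here for illustration; published norms tables should be used for actual scoring.

```python
from statistics import NormalDist

# Assumed distribution of deviation IQ scores: mean 100, standard deviation 15.
IQ_DISTRIBUTION = NormalDist(mu=100, sigma=15)


def iq_percentile(iq: float) -> float:
    """Percentage of the assumed population scoring at or below this IQ."""
    return IQ_DISTRIBUTION.cdf(iq) * 100


print(round(iq_percentile(100), 1))  # 50.0
print(round(iq_percentile(130), 1))  # 97.7
```

Under this assumption a score of 130 lies two standard deviations above the mean, which is broadly in line with the "Very Superior" row of the table.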

Personality Assessment: an overview


Personality and Personality Assessment
• Personality: An individual’s unique constellation of psychological traits that is relatively
stable over time.
• Personality assessment: The measurement and evaluation of psychological traits, states,
values, interests, attitudes, worldview, acculturation, sense of humor, cognitive and
behavioral styles, and/or related individual characteristics.
Personality trait: “Any distinguishable, relatively enduring way in which one individual varies
from another.”
John Holland argued that most people can be categorized as one of six personality types: Artistic,
Enterprising, Investigative, Social, Realistic, or Conventional.
Holland developed the Self-Directed Search test (SDS; Holland et al., 1994), a self-administered
and self-scored aid to offer vocational assistance.
Personality types.
• Cardiologists Meyer Friedman and Ray Rosenman developed a two-category personality
typology.
• Type A personality: A personality type characterized by competitiveness, haste,
restlessness, impatience, feelings of being time-pressured, and strong needs for
achievement and dominance.
• Type B personality: Characterized as being mellow or laid-back; a personality type that is
completely opposite of Type A personality.
• Results from the Minnesota Multiphasic Personality Inventory (MMPI) are frequently
discussed in terms of the patterns of scores that emerge, and these patterns are referred to
as a profile.
• Personality profile: A narrative description of the extent to which a person has
demonstrated certain personality traits, states, or types.
• Personality state: The transitory exhibition of some personality trait, a relatively
temporary predisposition.
• Measuring personality states amounts to a search for and an assessment of the strength of
traits that are relatively transitory or situation-specific.
Personality Assessment: Some Basic Questions
Why assess personality?
Aspects of personality could be explored in:
1- Identifying determinants of knowledge about health.
2- Categorizing different types of commitment in intimate relationships.
3- Determining peer response to a team’s weakest link.
4- Identifying, in the service of national defense, those prone to terrorism.
5- Tracking trait development over time.
6- Studying some uniquely human characteristic such as moral judgment.

Who is being assessed and who is assessing?


Some methods of personality assessment rely on the assessee’s own self-report.
Assessees may respond to interview questions and answer questionnaires in writing or on a
computer. Some forms of personality assessment rely on informants such as parents, teachers, or
peers. Self-report methods are very common when exploring an assessee’s self-concept.
• Self-concept: One’s attitudes, beliefs, opinions, and related thoughts about oneself. Some
self-concept measures are based on the notion that states and traits related to self-concept
are to a large degree context-dependent.
• Self-concept differentiation: The degree to which a person has different self-concepts in
different roles.
In some situations, the best available method for assessment of personality and/or behavior
involves a third party (e.g., a parent, teacher, or spouse). It is necessary to proceed with
caution when using a third-party referent for personality assessment. Knowledge of the context
of the evaluation and the dynamics of the relationship between the rater and the assessee is
important. Raters may vary in the extent to which they are neutral.
What is assessed when a personality assessment is conducted?
Some tests are designed to measure particular traits (example: introversion) or states
(example: test anxiety). Other tests focus on descriptions of behavior, usually in particular
contexts.

• Response style: A tendency to respond to a test item or interview question in
some characteristic manner regardless of the content of the item or question.
• Impression management: The attempt to manipulate others’ impressions through
“the selective exposure of some information…coupled with suppression of [other]
information” (Braginsky et al., 1969, p. 51)
• Response styles can affect the validity of the outcome and can be countered
through the use of a validity scale.
• Validity scale: A subscale of a test designed to assist in judgments regarding how
honestly the test taker responded and whether responses were products of
response style, carelessness, deception, or misunderstanding.
Where are personality assessments conducted?

Traditional sites include schools, clinics, hospitals, academic research laboratories,
employment counseling, vocational selection centers, and the offices of psychologists and
counselors. Personality assessors can also be found observing behavior and making assessments
in natural settings.

How are personality assessments structured and conducted?


The scope of an evaluation may be very wide, seeking to take a general inventory of an
individual’s personality. The California Psychological Inventory (CPI 434) is an example of such
an evaluation; it yields information on many personality-related variables such as responsibility
and dominance. Some instruments purport to measure a much narrower scope. Instruments used
in personality assessment vary in the extent to which they are based on a theory of personality.
An example of a theory-based instrument is the Blacky Pictures Test (Blum, 1950). Other tests
are atheoretical, such as the MMPI.
Personality may be assessed by many different methods, such as face-to-face interviews,
computer-administered tests, behavioral observation, paper-and-pencil tests, evaluation of case
history data, evaluation of portfolio data, and recording of physiological responses. Measures of
personality vary in terms of their structure, with some measures being very structured and others
being relatively unstructured.
• Frame of reference: Aspects of the focus of exploration such as the time frame
(the past, present, or future) as well as other contextual issues that involve people,
places, and events.
• Q-sort technique: An assessment technique in which the task is to sort a group of
statements, usually in perceived rank order ranging from most to least descriptive.
Carl Rogers utilized this technique to identify the discrepancy between the perceived actual self
and the ideal self.
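One way to put a number on the actual-self versus ideal-self discrepancy is to correlate the two sorts. The sketch below is only an illustration: the five statements, their rank placements, and the choice of a Pearson correlation on ranks are all assumptions, not a procedure given in these notes.

```python
from statistics import mean

def pearson(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    mx, my = mean(x), mean(y)
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = (sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y)) ** 0.5
    return num / den

# Hypothetical rank placements (1 = most descriptive) of the same five
# statements under two sorting instructions.
actual_self = [1, 2, 3, 4, 5]   # "sort as you see yourself now"
ideal_self = [2, 1, 3, 5, 4]    # "sort as you would like to be"

# Higher values mean the two sorts agree more (smaller discrepancy).
congruence = pearson(actual_self, ideal_self)
print(round(congruence, 2))
```

A value near 1 would suggest the perceived actual self and the ideal self are ordered similarly; values near 0 or below would suggest a larger discrepancy.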
Personality measures differ with respect to the way conclusions are drawn from the data they
provide.
• Nomothetic approach: Characterized by efforts to learn how a limited number of
personality traits can be applied to all people.
• Idiographic approach: Characterized by efforts to learn about each individual’s
unique constellation of personality traits.
• Normative approach: A test taker’s responses and the presumed strength of a
measured trait are interpreted relative to the strength of that trait in a sample of a
larger population.
• Ipsative approach: A test taker’s responses and the presumed strength of
measured traits are interpreted relative to the strength of measured traits for that
same individual.
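The normative and ipsative approaches can be contrasted with a small Python sketch. The trait names, scores, and norm sample below are made up for illustration, and using a z-score for the normative case and deviations from the person's own mean for the ipsative case are simplifying assumptions.

```python
from statistics import mean, pstdev

def normative_z(score, norm_sample):
    """Normative: one person's trait score relative to a sample of other people."""
    return (score - mean(norm_sample)) / pstdev(norm_sample)

def ipsative_deviations(person_scores):
    """Ipsative: each trait score relative to the same person's other traits."""
    own_mean = mean(list(person_scores.values()))
    return {trait: s - own_mean for trait, s in person_scores.items()}

# Hypothetical data
norm_dominance = [10, 12, 14, 16, 18]   # dominance scores in a norm sample
person = {"dominance": 16, "sociability": 10, "responsibility": 13}

z = normative_z(person["dominance"], norm_dominance)
ips = ipsative_deviations(person)
print(round(z, 2), ips)
```

The normative result says how strong the trait is compared with other people; the ipsative result only says which traits are relatively stronger within this one person, so it does not support comparisons between people.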
Issues in personality test development and use
Personality assessment that relies exclusively on self-report is vulnerable to false
outcomes. Building validity scales into self-report tests provides some protection against false
results. Assessors can affirm the accuracy of self-reported information by consulting external
sources such as peer raters.
Developing Instruments to Assess Personality
Logic and reason may dictate what content is covered by the items on a personality test.
The use of logic and reason in the development of test items is sometimes referred to as the
content or content-oriented approach to test development. A review of the literature on the aspect
of personality that test items are designed to tap will frequently be very helpful to test
developers.
Personality measures differ in the extent to which they rely on a particular theory of
personality in their development and interpretation. Data reduction methods are another class of
widely used tools in contemporary test development. Such methods are used to aid in the
identification of the minimum number of variables or factors that account for the
intercorrelations in observed phenomena.
• In the 1940s Cattell reduced a list of more than 18,000 personality trait names
(produced by Allport and Odbert in 1936) to only 16 “primary” factors of
personality.
• Some have argued that the 16 primary factors may be measuring fewer than 16
factors, because several of the factors are substantially intercorrelated.
• The Revised NEO Personality Inventory (NEO PI-R; Costa & McCrae, 1992) is a
measure of five major dimensions of personality (extraversion, neuroticism,
openness, agreeableness, and conscientiousness) and 30 facets, six of which
define each dimension.
Criterion groups.
• Criterion: A standard on which a judgment or decision can be made.
• Criterion group: A reference group of test takers who share specific characteristics and
whose responses to test items serve as a standard according to which items will be
included or discarded from the final version of a scale.
• Empirical criterion keying: The process of using criterion groups to develop test items.
Development of a test by means of empirical criterion keying involves the following:
• Creation of a large preliminary pool of test items from which the final form of the test
will be selected.
• Administration of the preliminary pool to at least two groups of people: (1) a criterion
group of people known to possess the trait being measured and (2) a random sample.
• Conduct an item analysis to select items indicative of membership in the criterion group.
• Obtain data on test performance from a standardization sample of test takers who are
representative of the population from which future test takers will come.
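The item-analysis step of empirical criterion keying can be sketched as follows. This is a simplified illustration under stated assumptions: responses are coded 1 (True) and 0 (False), the data are invented, and the `min_diff` cutoff of 0.3 is a hypothetical selection rule, not one taken from any published test.

```python
def endorsement_rate(responses, item):
    """Proportion of a group answering True (1) to one item."""
    return sum(person[item] for person in responses) / len(responses)

def select_items(criterion_group, comparison_group, n_items, min_diff=0.3):
    """Keep items whose endorsement rates differ between the two groups.

    Returns (item index, keyed direction) pairs. min_diff is a hypothetical
    cutoff chosen only for this example.
    """
    keyed = []
    for item in range(n_items):
        diff = (endorsement_rate(criterion_group, item)
                - endorsement_rate(comparison_group, item))
        if abs(diff) >= min_diff:
            keyed.append((item, "True" if diff > 0 else "False"))
    return keyed

# Hypothetical responses: one row per person, one column per item.
criterion = [[1, 0, 1], [1, 1, 1], [1, 0, 0], [1, 0, 1]]
comparison = [[0, 0, 1], [0, 1, 1], [1, 0, 1], [0, 0, 1]]

print(select_items(criterion, comparison, n_items=3))
```

Note that item content plays no role in the selection: an item is kept only because the criterion group answers it differently from the comparison group.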
MMPI

• Has three validity scales: the L scale (the Lie scale), the F scale (the Frequency scale),
and the K scale (Correction scale).
• The L scale will call into question the examinee’s honesty.
• The F scale contains items that are infrequently endorsed by members of nonpsychiatric
populations and do not fall into any known pattern of deviance, which can help determine
if the examinee responded to questions randomly and if he/she is a malingerer.
• The K score is associated with defensiveness and social desirability.
• The MMPI has a fourth scale, the Cannot Say scale (denoted with ?), which functions as a
frequency count of the number of items to which the examinee responded "cannot say" or
failed to mark any response.
• The validity of an answer sheet with a cannot say count of 30 or higher is called into
question.
• Harris-Lingoes subscales are groupings of items into subscales (with labels such as
Brooding) that were designed for internal consistency.
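The Cannot Say count described above is a simple tally. The sketch below assumes responses are stored as "T", "F", "?" (cannot say), or None (left blank); that encoding and the function names are assumptions made for illustration, while the cutoff of 30 comes from these notes.

```python
CANNOT_SAY_CUTOFF = 30  # cutoff stated in these notes

def cannot_say_count(responses):
    """Count items answered 'cannot say' ('?') or left unmarked (None)."""
    return sum(1 for r in responses if r not in ("T", "F"))

def protocol_questionable(responses):
    """True when the Cannot Say count reaches the cutoff."""
    return cannot_say_count(responses) >= CANNOT_SAY_CUTOFF

# Hypothetical answer sheet: 20 blanks plus 10 explicit "cannot say" marks.
answer_sheet = ["T"] * 300 + ["F"] * 200 + [None] * 20 + ["?"] * 10
print(cannot_say_count(answer_sheet), protocol_questionable(answer_sheet))
```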

Following publication it was found that the MMPI could not be scored into neat
diagnostic categories. Hathaway and McKinley (1943) suggested a configural interpretation
of scores, that is, interpretation based on a pattern or profile. Paul Meehl (1951) proposed a
2-point code derived from the numbers of clinical scales on which the test taker achieved the
highest scores. Welsh codes were another popular approach to scoring and interpretation.
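A 2-point code can be derived mechanically from a profile. In this sketch the T-scores are invented, the clinical scales are keyed by their conventional numbers (1 through 9, plus 0), and ties are resolved by the order in which scales are listed, which is an assumption made for the example.

```python
def two_point_code(t_scores):
    """Return the numbers of the two highest clinical scales, highest first."""
    top_two = sorted(t_scores, key=t_scores.get, reverse=True)[:2]
    return "".join(str(scale) for scale in top_two)

# Hypothetical T-scores keyed by clinical scale number.
profile = {1: 55, 2: 78, 3: 60, 4: 52, 5: 48, 6: 58, 7: 74, 8: 61, 9: 45, 0: 50}

code = two_point_code(profile)
print(code)
```

Here scales 2 and 7 are the two highest elevations, so the profile would be summarized by the code "27".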
The MMPI-2 is quite similar to its predecessor, though some important differences exist:

• The MMPI-2 was normed on a more representative standardization sample.


• Some content was rewritten to correct grammatical errors and make the language
more contemporary and less discriminatory.

• Items that addressed topics such as drug abuse, suicidality, marital adjustment,
attitudes toward work, and Type A behavior patterns were added.

• Three additional validity scales were added: Back-Page Infrequency (Fb), True
Response Inconsistency (TRIN), and Variable Response Inconsistency (VRIN).
The MMPI-2-RF was devised in response to two basic problems with the MMPI-2.

• Overlapping items: There was an average of more than six overlapping items per
pair of clinical scales in the MMPI-2.

• A pervasive factor (referred to as anxiety, despair, malaise, and maladjustment)
that was common to most forms of psychopathology but unique to none.
One goal of restructuring the MMPI-2 into the MMPI-2-RF was to make the clinical scales more
distinctive and meaningful.
The MMPI-A was developed in response to skepticism about the applicability of the MMPI
to adolescents.

• It contains 10 clinical scales and seven validity scales.

• In addition to basic clinical and validity scales, the MMPI-A contains many
supplementary scales.
• The MMPI-A-RF evaluates aspects of internalizing, externalizing, and somatic symptoms of
distress.

Personality Assessment and Culture


Before any tool of personality assessment can be employed, and before data are imbued
with meaning, the assessor must consider important issues with regard to the assessment of the
assessee.
Acculturation: An ongoing process by which an individual’s thoughts, behaviors,
values, worldview, and identity develop in relation to the thinking, behavior, customs, and values
of a particular cultural group. Acculturation begins at birth and proceeds throughout development.
Important to a discussion of acculturation is an understanding of values.
• Instrumental values: Guiding principles to help one attain some objective
(example: honesty and ambition).

• Terminal values: Guiding principles and a mode of behavior that is an endpoint
objective (example: a comfortable life and a sense of accomplishment).

• Kluckhohn (1954, 1960) conceived of values as answers to key questions with
which civilizations must grapple. For example, in one culture, collectivism is the
ideal, but in another culture, individualism and personal striving are emphasized.
Also important to a discussion of acculturation is the concept of personal identity, or one’s sense
of self.

• Levine and Padilla (1980) defined identification as a process by which an
individual assumes a pattern of behavior characteristic of other people.
An assessee’s worldview must be considered when examining personality.

• Worldview: Assessees’ unique way of interpreting their perceptions as a result of
their experiences, cultural background, and related variables.

Personality Assessment Methods


Objective Methods of Personality Assessment
Objective methods are typically administered by paper-and-pencil or by computer and contain short-answer items for
which the assessee's task is to select one response from those provided. The term "objective" in
relation to personality measures must be considered cautiously.

• Objective personality tests do not contain one correct answer.


• A distinct lack of objectivity is associated with self-report.
Projective Methods
Projective hypothesis: The idea that an individual supplies structure to unstructured
stimuli in a manner consistent with the individual's own unique pattern of conscious and
unconscious needs, fears, desires, impulses, conflicts, and ways of perceiving and responding.
Projective techniques are indirect methods of personality assessment.
Inkblots as Projective Stimuli
Rorschach inkblots.
Hermann Rorschach developed a "form interpretation test" using inkblots as the
forms to be interpreted. There is debate on how to precisely classify Rorschach inkblots. The test
consists of 10 bilaterally symmetrical inkblots printed on separate cards, half of which are achromatic.
Inkblot cards are initially presented to the testtaker in order from 1 to 10; testtakers are asked
to interpret each inkblot and are given a great deal of freedom in doing so.
After the entire set of inkblots has been administered, an inquiry is conducted and
the assessor attempts to determine what features of the inkblot played a role in formulating the
testtaker's percept.
A third component, testing the limits, may also be included to enable the
examiner to restructure the situation by asking specific questions concerning personality
functioning.
Hypotheses concerning personality functioning are formed on the basis of
variables such as content and location of the response and the time taken to respond. Rorschach
protocols are scored according to several categories, including location, determinants, content,
popularity, and form. Patterns of response, recurring themes, and interrelationships among the
different categories are all considered in arriving at a final description of the individual from a
Rorschach protocol.
John E. Exner Jr. developed a comprehensive system for the administration,
scoring, and interpretation of Rorschach tests. Exner's system brought uniformity to Rorschach
use, but despite such improvements the psychometric properties of the tool are still debated.
Traditional test-retest reliability procedures may be inappropriate for use with the
Rorschach, both because familiarity with the cards can affect responses and because
responses may reflect transient states as opposed to enduring traits.
Thematic Apperception Test (TAT)
Designed by Christiana Morgan and Henry Murray in 1935. Thirty picture cards contain
a variety of scenes that present the testtaker with "certain classical human situations." The
administering clinician selects the cards that are believed to elicit responses pertinent to the
objective of testing.
The material used in deriving conclusions includes:
• The stories as they were told by the examinee.
• The clinician's notes about the way or the manner in which the examinee
responded to the cards.
• The clinician’s notes about extra-test behavior and verbalizations.
Interpretive systems incorporate or are based on Henry Murray's concepts of:
• Need: Determinants of behavior arising from within the individual.
• Press: Determinants of behavior arising from within the environment.
• Thema: Unit of interaction between needs and press.
Criticisms of the TAT
• Lack of standardization in administration, scoring, and interpretation procedures.
• Testtaker's responses may be affected by situational factors and transient internal need
states.
• Different TAT cards have different stimulus pulls.
Comparison of TAT-Derived Data and Self-Report-Derived Data
McClelland et al. (1989).
• Argued that self-report measures yielded self-attributed motives, whereas the TAT
yielded implicit motives.
• Implicit motives: Nonconscious influence on behavior typically acquired on the
basis of experience.
Other Tests Using Pictures as Projective Stimuli
Hand test
• Consists of nine cards with pictures of hands on them and a tenth blank card.
• Testtaker is asked what the hands on each card might be doing.

Rosenzweig Picture-Frustration Study.


• Employs cartoons depicting frustrating situations.
• Testtaker is asked to fill in the response of the cartoon figure being frustrated.
Responses are scored in terms of the type of reaction elicited and the direction of the aggression
expressed.
• Intropunitive: Aggression turned inward.
• Extrapunitive: Aggression expressed outwardly.
• Inpunitive: Aggression is evaded so as to avoid or gloss over the situation.
Testtaker's reactions are grouped into the following categories:
• Obstacle dominance: Response concentrates on the frustrating barrier.
• Ego defense: When attention is focused on protecting the frustrated
person.
• Need persistence: When attention is focused on solving the frustrating
problem.
Apperceptive Personality Test (APT).
• Consists of eight stimulus cards that depict recognizable people in everyday
settings.
• Testtakers respond to a series of multiple-choice questions after telling a
story, either orally or in writing, about each of the APT pictures.
Words as Projective Stimuli
Word association tests: Semistructured, individually administered, projective
technique of personality assessment that involves the presentation of a list of stimulus words.
Assessee is expected to respond with whatever comes to mind first upon exposure to the stimulus
word. Responses are analyzed on the basis of content and other variables.
Sentence completion test: Semistructured projective technique of personality assessment that
involves the presentation of a list of words that begin a sentence. Assessee's task is to respond by
finishing each sentence with whatever words come to mind.
Sentence completion stems: May be developed for use in specific settings or for
specific purposes. May be relatively atheoretical or linked closely to some theory.
Projective Test
Developed by B. F. Skinner.
The device was similar to auditory inkblots. Skinner created a series of recorded sounds, much
like muffled, spoken vowels, to which people would be instructed to associate. The device was
called the verbal summator. There was little compelling evidence to show that the instrument could
differentiate between members of clinical and nonclinical groups.
Production of Figure Drawings
Figure drawing test: Assessee produces a drawing that is analyzed on the basis of its content
and related variables.
• Various characteristics of the drawing and of the individual drawn are
formally evaluated in the Draw A Person (DAP) test.
• House-Tree-Person test: Testtaker's task is to draw a picture of a house, a tree, and
a person; the drawings are considered symbolically significant.
• Kinetic Family Drawing (KFD): Helps learn about the examinee in relation to
his/her family in the form of examinee verbalizations while the drawing is being
executed.
Psychometric Considerations of Projective Techniques
Criticisms of projective techniques include:
• Uncontrolled variations in protocol length.
• Inappropriate subject samples.
• Inadequate control groups.
• Poor external criteria.
Behavioral Assessment Methods
Emphasis is on what a person does in situations rather than on inferences about what
attributes he/she has more globally. Differences between traditional and behavioral approaches to
psychological assessment have to do with key variables: 1- Nature of personality. 2- Causes of
behavior.
Methods for Recording Frequency and Intensity of Target Behavior
Timeline followback (TLFB) methodology: Originally designed for use in the context
of a clinical interview for the purpose of assessing alcohol abuse. Has been used to evaluate
problem behaviors, such as gambling, maternal smoking, and HIV risk behaviors.
Ecological momentary assessment: Used to analyze the immediate antecedents of
cigarette smoking.
Behavioral observation. Behavior rating scale.
Self-monitoring. Reactivity: Possible changes in an assessee's behavior, thinking, or
performance that may arise in response to being observed, assessed, or evaluated.
Analogue study: Research investigation in which one or more variables are similar or
analogous to the real variable that the investigator wishes to examine.
Varieties of Behavioral Assessment
Analogue behavioral observation: Observation of a person in an environment designed
to increase the chance that the assessor can observe targeted behaviors and interactions.
Situational performance measure.
• Procedure that allows for observation and evaluation of an individual under a
standard set of circumstances.
• Leaderless group technique: Several people are organized into a group for the
purpose of carrying out a task while an observer records information related to
individual group members' initiative, cooperation, leadership, and related
variables.
Role play.
Psychophysiological methods.
• Biofeedback: Designed to gauge, display, and record a continuous monitoring of
selected biological processes.
• Plethysmograph: Biofeedback instrument that records changes in the volume of
a part of the body arising from variations in blood supply.
• Penile plethysmograph: Instrument designed to measure changes in blood flow,
but more specifically blood flow to the penis.
• Polygraph: Lie detector test.
• Unobtrusive measures: A telling physical trace or record.
Issues in Behavioral Assessment
Contrast effect. Behavioral rating may be excessively positive or negative because a
prior rating was excessively negative or positive. Solution: Composite judgment can be used.
Composite judgment: Averaging of multiple judgments.
Reactivity: Changes in an assessee's behavior, thinking, or performance. May arise in
response to being observed, assessed, or evaluated. Solutions: Hidden observers or clandestine
recording techniques can be used. Equipment costs and cost of training behavioral assessors.
A Perspective
Mental health professionals are called upon by society to make diagnostic judgments and
intervention recommendations on the basis of very little information. The clinical approach is
preferable to the actuarial approach in those situations where data are insufficient to formulate
rules for decision making and prediction. For the most part, contemporary practitioners adopt
the actuarial approach.

Aptitude vs Achievement Tests


Aptitude Tests
• What a person can do (hasn’t done yet)
• How well a person might perform in a school or employment situation
• Examine a broader range of knowledge and experience
• Measure natural talents and abilities
• Higher-order skills
• Measure certain special abilities that a person may possess
• Assess a person’s overall performance over a broad area of mental capabilities
• Provide a general profile of a person’s strengths and weaknesses

Achievement Tests
• What a person has already done
• What a person has generally learned prior to being tested
• Measure recent learning in specific subjects
• Measure acquired skill or knowledge
• Do not measure higher-order thinking skills, problem solving, and teamwork
• Assess a specific subject or area of knowledge

But they look similar. How?

• Because knowledge that a person has already acquired (achievement) is usually a reliable
predictor of their success at more advanced levels (aptitude), particularly when
assessing aptitude in academia.
• e.g., SAT (aptitude) vs. ACT (achievement)
• The ACT is an achievement test, measuring what a student has learned in school (more
specific to the content that children have learned at school).
• The ACT has up to 5 components: English, Mathematics, Reading, Science, and
an optional Writing Test.
• The SAT is more of an aptitude test, testing reasoning and verbal abilities (more general,
predicting college success, what a person can learn).

Theories Of Measurement
• Lead to descriptions of examinees' abilities, independent of particular choice of test items
and sample.
• They provide framework for considering issues and addressing technical problems (e.g.
issue of measurement error)
• Theories: General framework of linking observable variables to unobservable variables
such as true scores or ability scores
• Models specify a detailed relationship of the variables in light of theoretical concepts
• Models provide incomplete representation of the test data to which they fit
• So, all models are wrong, but some are useful.

Classical Test Theory


• Classical Test Theory (CTT) analyses are the easiest and most widely used form of
analysis. The statistics can be computed by readily available statistical packages (or
even by hand).
• Classical analyses are performed on the test as a whole rather than on the item and,
although item statistics can be generated, they apply only to that group of students on that
collection of items.
• CTT is based on the true score model: observed score = true score + error score (X = T + E).
• In CTT we assume that:
• True scores and error scores are uncorrelated
• Average error score in the population is equal to zero
• Error scores on parallel forms are uncorrelated
• Factor structure is common across parallel forms
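The first two assumptions can be illustrated with a small simulation of the true score model, in which each observed score is a true score plus an independent random error. Everything numeric here is an assumption for illustration: normally distributed true scores (mean 100, SD 15), normally distributed errors (mean 0, SD 5), and a sample of 5,000 simulated examinees.

```python
import random
from statistics import mean, pvariance

def corr(x, y):
    """Pearson correlation between two equal-length lists."""
    mx, my = mean(x), mean(y)
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = (sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y)) ** 0.5
    return num / den

random.seed(0)  # fixed seed so the sketch is repeatable
N = 5000
true_scores = [random.gauss(100, 15) for _ in range(N)]   # T
errors = [random.gauss(0, 5) for _ in range(N)]           # E, independent of T
observed = [t + e for t, e in zip(true_scores, errors)]   # X = T + E

# Reliability as the share of observed-score variance due to true scores.
reliability = pvariance(true_scores) / pvariance(observed)

print(round(mean(errors), 2))               # close to 0
print(round(corr(true_scores, errors), 2))  # close to 0
print(round(reliability, 2))                # close to 225 / (225 + 25) = 0.9
```

In real testing the true scores are never observed directly, so reliability has to be estimated from observed scores (for example from parallel forms or internal consistency); the simulation only shows what the model asserts.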
Statistics
• Difficulty (item level statistic)
• Discrimination (item level statistic)
• Reliability (test level statistic)
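The three statistics can be computed by hand or with a few lines of code. The sketch below uses a tiny invented data set (5 examinees, 4 items scored 0/1). Treating discrimination as the uncorrected item-total correlation and reliability as Cronbach's alpha are common choices, but they are assumptions here rather than definitions given in these notes.

```python
from statistics import mean, pvariance

def pearson(x, y):
    """Pearson correlation between two equal-length lists."""
    mx, my = mean(x), mean(y)
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = (sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y)) ** 0.5
    return num / den

def difficulty(data, j):
    """Item difficulty: proportion of examinees answering item j correctly."""
    return mean([row[j] for row in data])

def discrimination(data, j):
    """Item discrimination as the (uncorrected) item-total correlation."""
    return pearson([row[j] for row in data], [sum(row) for row in data])

def cronbach_alpha(data):
    """Test-level reliability: Cronbach's alpha."""
    k = len(data[0])
    item_var = sum(pvariance([row[j] for row in data]) for j in range(k))
    total_var = pvariance([sum(row) for row in data])
    return k / (k - 1) * (1 - item_var / total_var)

# Hypothetical scored responses: rows are examinees, columns are items.
data = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]

print([difficulty(data, j) for j in range(4)])
print(round(discrimination(data, 0), 3))
print(round(cronbach_alpha(data), 3))
```

All three numbers describe this particular group on this particular set of items, which is the sample dependence of CTT statistics noted above.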
Classical Test Theory vs. Latent Trait Models
• Classical analysis has the test (not the item) as its basis. Although the statistics generated
are often generalised to similar students taking a similar test; they only really apply to
those students taking that test
• Latent trait models aim to look beyond that at the underlying traits which are producing
the test performance. They are measured at item level and provide sample-free
measurement
Latent Trait Models
• Latent trait models have been around since the 1940s, but were not widely used until the
1960s. Although theoretically possible, it is practically unfeasible to use these without
specialized software.
• They aim to measure the underlying ability (or trait) which is producing the test
performance rather than measuring performance per se.
• This leads to them being sample-free. As the statistics are not dependent on the test
situation which generated them, they can be used more flexibly
Item Response Theory
• Item Response Theory (IRT) – refers to a family of latent trait models used to establish
psychometric properties of items and scales
• Sometimes referred to as modern psychometrics because in large-scale education
assessment, testing programs and professional testing firms IRT has almost completely
replaced CTT as method of choice
• IRT has many advantages over CTT that have brought IRT into more frequent use
Three Basic Components of IRT
• Item Response Function (IRF) – Mathematical function that relates the latent trait to the
probability of endorsing an item
• Item Information Function – an indication of item quality; an item’s ability to
differentiate among respondents
• Invariance – position on the latent trait can be estimated from any items with known IRFs,
and item characteristics are population independent within a linear transformation
IRT - Item Response Function
• Item Response Function (IRF) – The response to any item is a function of the individual's
trait/ability and the item's characteristics (that is, the difficulty and discrimination of the item).
• The IRF models the relationship between examinee trait level, item properties and the
probability of endorsing the item.
• Examinee trait level is signified by the Greek letter theta (θ) and typically has a mean of 0
and a standard deviation of 1
IRT - Item Characteristic Curves
• IRFs can then be plotted as Item Characteristic Curves (ICC), which are graphs
that represent the probability of endorsing the item as a function of the respondent’s
ability
IRF – Item Parameters: Location (b)
• An item’s location is defined similarly to item difficulty in CTT. It is the amount of the
latent trait a respondent needs in order to be likely to answer the item correctly (in the 1PL
and 2PL models, the trait level at which the probability of endorsement is 0.5).
• The higher the “b” parameter the higher on the trait level a respondent needs to be in
order to endorse the item
• Analogous to difficulty in CTT
• Like Z scores, the values of b typically range from -3 to +3
IRF – Item Parameters: Discrimination (a)
• Indicates the steepness of the IRF at the item’s location
• An item’s discrimination indicates how strongly related the item is to the latent trait, like
loadings in a factor analysis
• Items with high discriminations are better at differentiating respondents around the
location point; small changes in the latent trait lead to large changes in probability
• Vice versa for items with low discriminations
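To make the roles of a and b concrete, here is a minimal Python sketch under the 2PL model. The function names and parameter values are illustrative assumptions. It relies on two standard 2PL results: the slope of the IRF at θ = b equals a/4, and the item information equals a² · P(θ) · (1 − P(θ)), which is largest at θ = b.

```python
import math

def irf_2pl(theta, a, b):
    """2PL item response function: probability of endorsing the item."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

def item_information(theta, a, b):
    """2PL item information: a^2 * P * (1 - P)."""
    p = irf_2pl(theta, a, b)
    return a * a * p * (1.0 - p)

def slope_at(theta, a, b, h=1e-5):
    """Numerical slope of the IRF at theta (central difference)."""
    return (irf_2pl(theta + h, a, b) - irf_2pl(theta - h, a, b)) / (2 * h)

# Two hypothetical items with the same location but different discrimination.
low = dict(a=0.5, b=0.0)
high = dict(a=2.0, b=0.0)

for name, item in (("low a", low), ("high a", high)):
    print(name,
          round(slope_at(0.0, **item), 3),          # steepness at the location
          round(item_information(0.0, **item), 4),  # information at the location
          round(item_information(2.0, **item), 4))  # information far from it
```

The high-discrimination item separates respondents sharply near θ = b but carries little information far from that point, which is why items are usually chosen so that their locations cover the trait range of interest.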
Main difference between classical and item response theories and model
IRT - Item Response Function
The 4-parameter logistic model (4PL):
$$P(\theta) = c + (d - c)\,\frac{e^{a(\theta - b)}}{1 + e^{a(\theta - b)}}$$
where
• θ represents examinee trait level
• b is the item difficulty that determines the location of the IRF
• a is the item’s discrimination that determines the steepness of the IRF
• c is a lower asymptote parameter for the IRF
• d is an upper asymptote parameter for the IRF
The 3-parameter logistic model
• If the upper asymptote parameter is set to 1.0, then the model is termed a 3PL.
• In this model, individuals at low trait levels have a non-zero probability of endorsing the
item.
The 2-parameter logistic model
• If in addition the lower asymptote parameter is constrained to zero, then the model is
termed a 2PL.
• In the 2PLM, IRFs vary both in their discrimination and difficulty (i.e., location)
parameters.
The 1-parameter logistic model
• If the item discrimination is set to 1.0 (or any constant) the result is a 1PL
• A 1PL assumes that all scale items relate to the latent trait equally and items vary only in
difficulty (equivalent to having equal factor loadings across items).
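The nesting of the four models can be shown with a single function whose default arguments apply the constraints described above. This is a sketch assuming the standard logistic form P(θ) = c + (d − c) / (1 + exp(−a(θ − b))) with no additional scaling constant; the function name and example values are hypothetical.

```python
import math

def irf(theta, b, a=1.0, c=0.0, d=1.0):
    """Item response function for the 1PL-4PL family.

    b: location (difficulty), a: discrimination,
    c: lower asymptote, d: upper asymptote.
    Defaults give a 1PL; freeing a gives a 2PL; freeing c gives a 3PL;
    freeing d as well gives a 4PL.
    """
    logistic = 1.0 / (1.0 + math.exp(-a * (theta - b)))
    return c + (d - c) * logistic

print(irf(0.0, b=0.0))                        # 1PL at theta = b
print(irf(0.0, b=0.0, a=2.0))                 # 2PL at theta = b
print(irf(-10.0, b=0.0, c=0.2))               # 3PL: stays above 0.2 at low theta
print(irf(10.0, b=0.0, c=0.2, d=0.9))         # 4PL: stays below 0.9 at high theta
```

In the 3PL case the probability never drops below c, which is why c is often read as a guessing parameter on multiple-choice items.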
Quick Detour: Rasch Models vs. Item Response Theory Models
• Mathematically, Rasch models are identical to the most basic IRT model (1PL); however,
there are some (important) differences
• In the Rasch approach the model takes precedence: data that do not fit the model are discarded
• Rasch does not permit abilities to be estimated for extreme items and persons
• And other differences
IRT - Test Response Curve
• Test Response Curves (TRC) - Item response functions are additive, so the IRFs of
individual items can be combined to create a TRC.
• A TRC relates the latent trait to the expected number of items endorsed
