Week 3 Readings - Chapter 7 and 8

This document discusses assessments for special populations, including infants, preschoolers, and those with disabilities or developmental delays. It describes several measures: 1) The Neonatal Behavioral Assessment Scale assesses newborn behaviors and is used to examine development, predict outcomes, and compare across cultures. 2) The Bayley Scales of Infant Development evaluates cognitive, language, motor, social-emotional, and adaptive behaviors in infants and toddlers. It yields composite and scaled scores to identify strengths and needs. 3) The Devereux Early Childhood Assessment screens preschoolers' social-emotional skills and focuses on protective factors like family and self-regulation that can buffer challenges.

Uploaded by

Bethany Long
Chapter 7 Assessing Special Populations

A. Infant and Preschool Assessment


 Not every examinee falls within the ordinary spectrum of physical and mental abilities.
 A large proportion of the population falls outside the reach of traditional tests and procedures.
 This may be by reason of immature age, physical disability, language weakness, or diminished intellect.
 Example: Infants and very young children require exceptional approaches to assessment
because of their limited capacities for communication.

a. Assessment of Infant Capacities


 Infant and preschool assessment tools
 can help answer questions about the intellectual and emotional development of children,
 whether they are developmentally delayed, intellectually gifted, at-risk for emotional
disorder, or within the normal spectrum
 Includes those 0 months to 6 years of age
 Use of standardized measures in the assessment of preschool children ages 2 ½ or older
 tests measure somewhat different components of intellectual ability – load heavily on
cognitive skills such as verbal comprehension and spatial thinking.
 Neonatal Behavioral Assessment Scale (NBAS)
 How was the measure developed?
 developed by Dr. T. Berry Brazelton and his colleagues
 today is regarded as the most comprehensive examination of newborn behavior available.
 Unique theoretical basis – emphasizes …
 the need to document the contributions of the newborn to the parent–infant system
 the assumption that the newborn infant is both competent and complexly organized
 What is the measure used for?
 Short-term
 looks at a wide range of behaviors
 used to examine the effects of prematurity, low birthweight, undernutrition, and a
range of pre- and perinatal risk factors, the effects of prenatal substance exposure,
environmental toxins, temperament, neonatal behavior in different cultures,
prediction studies, and studies of primate behavior.
 is suitable for examining newborns and infants up to 2 months old.
 Longitudinal
 Expanded our understanding of the range of variability in newborn behavior patterns
and the diversity of child rearing practices and belief systems across settings
 How is the measure scored?
 Does not provide an integrative scoring system; that is, there are no summary scores for the
entire battery or its subcomponents.
 Instead, the “scoring” of the NBAS consists of a summary sheet with ratings on each
specific item. In clinical work, the instrument is used to provide feedback to parents.
 assesses the newborn’s behavioral repertoire with 28 behavioral items, each scored on a
nine-point scale.
 Examples of the behavior items include the following:
 Response decrement to light
 Orientation to inanimate visual stimulus
 Cuddliness
 Consolability
 It also includes an assessment of the infant’s neurological status on 20 items, each scored
on a four-point scale.
 Examples include the following:
 Plantar grasp
 Babinski reflex
 Rooting reflex
 Sucking reflex
 Finally, seven supplementary items can be used to summarize the qualities of
responsiveness of frail, high-risk infants.
 Examples include:
 Quality of alertness
 General irritability
 Examiner’s emotional response to infant
 Psychometric Properties
 Shown to have low test-retest reliability (Majnemer and Mazer, 1998)
 One likely explanation is that in newborn infants, individual traits may fluctuate
rapidly over short periods of time, which would produce an underestimate of true
reliability when the NBAS is given twice over a period of days or weeks
 How is the measure interpreted? (i.e., early school performance or prognosticate adult
functioning)
 By the end of the assessment, the examiner has a behavioral “portrait” of the infant,
describing the baby’s strengths, adaptive responses, and possible vulnerabilities.
 Long-term
 used to examine the effects of postpartum depression on newborn behavior and the
effects of newborn behavioral differences on the parent-child relationship.
 used in longitudinal studies to provide a baseline or starting point against which to
measure future change or continuity, as a predictor of developmental outcome. 
 used extensively in cross-cultural studies to measure a wide range of variability in
newborn behavioral differences across cultures.
 Bayley-III
 How was the measure developed?
 Now in its third edition
 Known formally as the Bayley Scales of Infant and Toddler Development-III
 What is the measure used for?
 The 5 domains and representative capacities
 Cognitive Scale
 91 items involving sensory acuity, perceptual skill, attention, object permanence,
exploration and manipulation, puzzle solving, color matching, and counting.
 does not contain separate subtests.
 Language Scale
 48 items involving receptive and expressive communication.
 Items involve recognition of sounds, nonverbal expression, following simple
directions, identifying action pictures, naming objects, and answering
questions.
 yields
 Separate scores for Expressive Communication and Receptive
Communication
 Composite Language Scale score
 Motor Scale
 138 items pertaining to gross motor and fine motor skills.
 Items involve object manipulation, functional hand skills, postural control,
dynamic movement, and motor planning.
 yields
 Separate scores for Gross Motor and Fine Motor
 Composite Motor Scale score.
 Social-emotional
 35 items involving interactive and purposeful use of emotions, ability to convey
feelings, and connection of ideas and emotions.
 does not contain separate subtests
 Adaptive Behaviour Scale
 Caregivers complete items on a 4-point scale
 0 (is not able), 1 (never when needed), 2 (sometimes when needed), or 3
(always when needed)
 Items pertain to Communication, Community Use, Health and Safety, Leisure,
Self-Care, Self-Direction, Functional Pre-Academics, Home Living, Social,
and Motor
 yields
 Separate scaled scores for each of the ten areas listed
 General Adaptive Composite (GAC).
 Psychometric Properties
 Standardization
 The technical quality and excellent standardization of the Bayley-III mark this test
as the psychometric pinnacle of its field
 The 5 major clusters listed above each yield a composite score reported as a
standard score (M = 100, SD = 15).
 does not yield an overall score akin to an IQ score on a traditional test
 all scores on the instrument (including the many subscales listed above) can be
reported as scaled scores (mean = 10, SD = 3) for purposes of intra-individual
comparison.
 yields a useful chart that helps pinpoint areas of needed intervention
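The two Bayley-III metrics above (composites with M = 100, SD = 15 and scaled scores with M = 10, SD = 3) express the same relative standing on different scales. The following is a minimal sketch of that relationship, assuming normally distributed scores and a simple linear conversion through z-scores. The function names are illustrative only, and actual Bayley-III scores come from the norm tables in the test manual, not from a formula like this.

```python
from statistics import NormalDist

def to_z(score, mean, sd):
    """Express a score as standard deviations from the mean."""
    return (score - mean) / sd

def rescale(score, from_mean, from_sd, to_mean, to_sd):
    """Re-express a score on another metric with the same relative standing."""
    return to_mean + to_z(score, from_mean, from_sd) * to_sd

def percentile(score, mean, sd):
    """Percentile rank, assuming a normal distribution of scores."""
    return 100 * NormalDist(mean, sd).cdf(score)

# A composite of 85 (M = 100, SD = 15) sits one SD below the mean,
# the same relative standing as a scaled score of 7 (M = 10, SD = 3).
composite = 85
scaled_equivalent = rescale(composite, 100, 15, 10, 3)
print(scaled_equivalent)                          # 7.0
print(round(percentile(composite, 100, 15), 1))   # 15.9
```

Putting every subscale on a common metric in this way is what makes intra-individual comparison possible: a child's relative strengths and weaknesses can be read directly from the differences between scores.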

 Reliability
 High internal consistency - Average reliability coefficients as high as .93
(Language) and .91 (Cognitive)
 Average to high test-retest reliability – Coefficients ranging from .67 (Fine
Motor) to .80 (Expressive Communication).
 Average stability coefficient across all ages for the major composites was .80
 Validity
 Validity evidence for the Bayley-III is scant at this time, but wholly supportive.
 Concurrent validity coefficients with other instruments are strong as well.
 E.g., The WPPSI-III Full Scale IQ scores correlated .72 to .79 with Bayley-III
Cognitive composites
 Devereux Early Childhood Assessment-Clinical Form (DECA-C)
 What is the measure used for?
 designed for the assessment of preschoolers aged 2:0 through 5:11 with social and
emotional troubles or significant behavioral concerns
 theoretical focus on protective factors that can buffer the impact of social, emotional, or
behavior difficulties.
 based, in part, on resilience theory, as proposed by Werner (1990) and described by
others (e.g., Masten, Best, & Garmezy, 1990).
 a strengths-based approach that concentrates on protective factors at three levels:
 environmental (high-quality childcare and schools)
 family (nurturing parents and extended family)
 within-child (adaptive personality traits).

 How is the measure Structured and scored?
 62 items require that the parent or teacher rate the frequency of various behaviors on a 5-
point scale (never, rarely, occasionally, frequently, very frequently).
 3 protective factor scales:
 Initiative: Assesses the child’s ability to use independent thought and behavior to
meet his or her needs.
 Items resemble “Retrieves things by himself or herself.”
 Self-control: Measures the child’s capacity to experience and express a range of
emotions in a socially acceptable manner.
 Items resemble “Controls his or her temper.”
 Attachment: Assesses the child’s formation of strong and long-lasting relationships
with parents, teachers, and family members.
 Items resemble: “Accepts adult comforting when upset.”
 4 problem scales:
 Attention Problems: Assesses the child’s ability to focus on a task and ignore
distracting environmental stimuli.
 Items resemble: “Loses focus on the task at hand.”
 Aggression: Measures aggressive or destructive acts directed at other persons or
things.
 Items resemble: “Destroys personal property of others.”
 Withdrawal/Depression: Assesses self-absorption and emotional/social withdrawal.
 Items resemble: “Appears wrapped up in his/her own world.”
 Emotional Control Problems: Measures difficulties in controlling negative emotions
that interfere with goal directed behavior.
 Items resemble: “Loses temper when things don’t go his/her way.”
 Psychometric Properties
 Standardization
 Exemplary standardization
 The sample approximated national data for preschoolers with respect to race,
ethnicity, geographic region, and family income
 Reliability
 Internal consistency reliability – good
 For the parents, coefficient alphas for the subscales were typically in the high
.70s (median .78), whereas the values for teachers were higher, typically in the
high .80s (median .88)
 Jaberg, Dixon, and Weis (2009): adequate internal consistency for the protective
factors scales in a sample of 780 kindergarten children
 Validity
 Discriminant analysis showed a good criterion validity
 How is the measure interpreted? (i.e., early school performance or prognosticate adult
functioning)
 The purpose of appraising protective factors is so that interventions can build upon the
child’s strengths
 Additional Measures of Infant Capacity
 A brief review of alternative instruments would be chapter-length
 See the 400-page review provided by Berry, Bridges, and Zaslow (2004).
b. Assessment of Preschool Intelligence
 Danger in preschool assessment - the examiner may infer that a low score indicates low
cognitive functioning when, in truth, the child is merely unable to sit still, attend, cooperate, etc
 needs to be approached with unusual caution to avoid negative consequences of labeling and
overdiagnosis of disabling conditions
 Differential Ability Scales-II
 Latest edition of a highly respected test initially published in 1990 by Elliott.
 Structuring
 Consists of three batteries:
 The Early Years Battery (lower-level) for ages 2-6 to 3-5 (focus of section)
 The Early Years Battery (upper-level) for ages 3-6 to 6-11
 The School-Age Battery for ages 7-0 to 17-11.
 includes 10 core subtests and 10 diagnostic subtests, but rarely do you administer all 20
 core subtests are the primary measures of cognitive abilities, whereas the diagnostic
subtests provide supplementary information about school readiness and information
processing.
 The Early Years Battery (lower-level) for ages 2-6 to 3-5
 would include 6 core subtests and 7 diagnostic subtests
 Core Subtests
 are used to derive
 three core cluster scores (Verbal, Nonverbal Reasoning, and Spatial)
 overall composite score known as General Conceptual Ability (GCA)
 Diagnostic Subtests
 measure early number concepts, phonological processing, short-term memory, and processing speed.
 used for clinical analysis only
 less dependent on the g factor and therefore do not figure in the GCA or any core composites.
 Psychometric Properties
 Note: there is almost no published research using the test, but what is available shows its value
 Standardization
 Norm Referenced Test – The DAS-II is normed to standard scores (M = 100, SD
= 15) for the GCA and cluster scores, whereas the individual subtests are based on
T scores (M = 50, SD = 10).
 Normed with careful stratification age, gender, race/ethnicity, parental education,
and geographic region.
 Reliability
 Commendable reliability for preschool-aged measurement
 Typically, preschool children are easily distracted and plainly influenced by situational
factors, which tends to lower the reliability of test scores.
 GCA internal consistency reliability is reported to be .95.
 Cluster scores also show excellent reliability, with values ranging from .89 to .95
 Internal consistency reliability of the subtests is predictably lower, although
still laudable, ranging from .81 to .91
 Validity
 Appears very promising
 Very strong concurrent validity with other well-established measures of
intelligence
 Wechsler Preschool and Primary Scale of Intelligence-IV (WPPSI-IV)
 suitable for children ages 2½ to 7 years and 7 months
 Use of child-friendly and developmentally appropriate stimulus materials
 Use of the WPPSI-IV
 The full battery includes up to 13 subtests, but only 6 are needed to obtain a Full Scale IQ
(FSIQ),
 This is rarely the solitary goal of assessment, as it is indispensable to compare and
contrast the various subcomponents of general intelligence, not just to get a FSIQ.
 Verbal Comprehension – Information, Similarities
 Visual Spatial – Block Design
 Fluid Reasoning – Matrix Reasoning
 Working Memory – Picture Memory
 Processing Speed – Bug Search
 An additional 4 subtests are needed, for a total of 10 subtests, which is the most
common WPPSI-IV battery.
 Visual Spatial – Object Assembly
 Fluid Reasoning – Picture Concepts
 Working Memory – Zoo Locations
 Processing Speed – Cancellation
 The final 3 subtests (for a total of 13 subtests) are needed
 Any of the 4 Ancillary Index Scales
 Structuring
 5 Primary Index Scales
 The 5 Primary Index Scales, each based on 2 subtests, are needed to capture the
complexity of cognitive abilities in older children
 Verbal Comprehension – Information, Similarities
 Visual Spatial – Block Design, Object Assembly
 Fluid Reasoning – Matrix Reasoning, Picture Concepts
 Working Memory – Picture Memory, Zoo Locations
 Processing Speed – Bug Search, Cancellation
 4 Ancillary Index Scales
 Vocabulary Acquisition
 2 subtests: Receptive Vocabulary and Picture Naming.
 Nonverbal
 9 subtests with minimal verbal demand: Block Design and Matrix Reasoning.
 General Ability
 8 subtests, mainly untimed: Information, Similarities, and Matrix Reasoning.
 Cognitive Proficiency
 5 subtests: Picture Memory, Cancellation, and Animal Coding.
 Stanford-Binet Intelligence Scales for Early Childhood
 combine the subtests from the Stanford-Binet Intelligence Scales, Fifth Edition (SB5) with a
new Test Observation Checklist and a software-generated Parent Report.
 developed for children ages 2 years to 7 years and 3 months.
 Focus of Section = Test Observation Checklist (TOC)
 Purpose: to provide a qualitative but highly structured format for describing a wide range
of behaviors, including noncompliance, known to affect test performance.
 Focus = behaviors that negatively impact test performance.
 Uses:
 helps the examiner identify problematic behaviors that may affect the validity of
the test results
 may prove helpful in the early detection of developmental difficulties such as
learning disabilities, behavior problems, attentional difficulties, borderline
cognitive function, and neuropsychological deficits
 Test-taking behaviours are divided into two groups:
 (1) Characteristics
 general traits most likely found in many situations
 Motor Skills—includes gross motor skills such as clumsiness and fine motor
skills such as pencil dexterity.
 Activity Level—includes both excessive restlessness as well as underactivity
in relation to child’s age.
 Attention/Distractibility—refers to age-inappropriate inattention, a need for
redirection.
 Impulsivity—indicates the examiner saw fit to intervene, slow the child down.
 Language—includes articulation, receptive language, and expressive
language.
 (2) Specific Behaviors
 Specific behaviors actually observed during the testing session.
 Consistency in Performance—may indicate a haphazard approach to the test.
 Mood—includes specific behavioral indicators such as negative mood,
tantrums, or crying
 Frustration Tolerance—includes aggressiveness, refusal to participate.
 Change in Mental Set—includes noted tendencies toward rigidity of approach
or perseveration.
 Motivation—includes disinterest or boredom and related behaviors.
 Fear of Failure—is qualitatively judged through inference and can be
corroborated through parental report.
 Degree of Cooperativeness/Refusals—a crucial category because numerous
refusals can lead to underestimating cognitive ability.
 Anxiety—includes excessive fearfulness, shyness, or need for parental
presence.
 Need for Redirection—is noted when the child cannot stay on task and
constantly needs reminders.
 Parental Behaviors—includes items such as parental reassurance, tacit
approval for misbehavior, or giving verbal cues.
 Representativeness of Test Behaviors—is based on brief interview with
parent(s), if present during testing.
c. Practical Utility of Infant and Preschool Assessment
 Research shows test scores earned in the first year or two of life show minimal predictive
validity.
 Consequently, it is important to understand the practical utility of infant assessment: infant tests do have
an important but limited role to play.
 Predictive Validity of Infant and Preschool Tests
 Infant Tests
 Test scores correlate positively but unimpressively with childhood test scores.
 McCall (1976): “Generally speaking, there is essentially no correlation between
performance during the first six months of life with IQ score after age 5; the correlations
are predominantly in the 0.20s for assessments made between 7 and 18 months of life
when one is predicting IQ at 5–18 years; and it is not until 19–30 months that the infant
test predicts later IQ in the range of 0.40–0.55.”
 Preschool Tests
 The correlation between preschool test results and later IQ is typically strong, significant,
and meaningful.
 preschool tests are moderately predictive of later intelligence
 Practical Utility of Infant Scales
 most important and sound use = screening for I/DD
 vital because it provides for early intervention and, consequently, allows for improved
outcomes later in life.
 Bayley-III (Bayley, 2005) likely possesses good predictive validity for low scores as well
 Scores 2 or more SDs below the mean on the original and second editions, particularly on the
Mental Scale, reveal a high probability of meeting the criteria for mental retardation
later in childhood.
 Fagan Test of Infant Intelligence (FTII)
 Perhaps new approaches are needed with infants.
 Fagan (1984) developed a new approach to infant assessment known as the Fagan Test of
Infant Intelligence (FTII).
 Purpose: assesses visual recognition memory using a 10-trial habituation format
 Use: may perform better as a screening test than as a general predictor of childhood
intelligence
 Structuring and Scoring
 In each trial, a photograph of a face is shown to the infant, followed by paired
presentation of the original face with either
 (1) a photograph of a similar but new face
 (2) a photograph of the original face in a different orientation.
 Scoring/Interpretation:
 The amount of time spent looking at the new photograph is presumed to indicate
the degree to which the infant has noticed that it is different from the original
picture.
 The examiner observes the infant’s corneal reflections to determine a percent
Novelty Preference, averaged across the 10 trials.
 A score of less than 53 percent for novelty preference identifies children who are
at risk for later mental retardation.
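The scoring rule above can be sketched in a few lines. This is an illustration only: it assumes that novelty preference on each trial is the share of total looking time spent on the new photograph and that every trial has some looking time, which the notes do not spell out. The function name and the looking times are hypothetical.

```python
def novelty_preference(trials):
    """Mean percent of looking time directed at the novel photograph.

    `trials` is a list of (novel_seconds, familiar_seconds) pairs, one per trial.
    Assumes every trial has a nonzero total looking time.
    """
    percents = [100 * novel / (novel + familiar) for novel, familiar in trials]
    return sum(percents) / len(percents)

AT_RISK_CUTOFF = 53  # percent; scores below this flag risk, per the notes above

# Hypothetical looking times (seconds) for 10 trials; not real data.
trials = [(6, 4), (5, 5), (7, 3), (4, 6), (6, 4),
          (5, 5), (6, 4), (5, 5), (7, 3), (4, 6)]
score = novelty_preference(trials)
at_risk = score < AT_RISK_CUTOFF
print(score, at_risk)  # 55.0 False
```

A score of exactly 50 percent would mean the infant looked equally at both photographs, so the 53 percent cutoff sits only slightly above chance-level looking.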
 Psychometric Properties
 Reliability
 very high interrater agreement
 Validity
 Mixed validity outcomes
 DiLalla, Thompson, Plomin, and others (1990): FTII scores correlated only .32 with
Stanford-Binet IQ at age 3
 Andersson (1996): correlations between FTII scores obtained at 7 to 9 months of age and
WPPSI-R IQ at age 5 were very low, about .2
d. Screening for School Readiness 
 Concerns for Long-Term Impacts
 Results from screening tests might be used to delay entry into the school system, or to hold a
child back a year
 children might be permanently labeled as slow learners or cognitively delayed
 Five Approaches to Screening for School Readiness
 Maturationist Model:
 Seen as biological issue
 a question of cognitive, psychomotor, and emotional maturation that stem directly from
unfolding biological maturation.
 Some use this viewpoint as a basis for defining school entry by age and not using
readiness assessments.
 Environmental Model:
 Seen as acquisitional issue
 Readiness = children’s acquisition of skills learned from early socialization
experiences, especially with parents and family members, which vary from child to child.
 Inclusion approach (parental involvement) in school readiness assessments.
 Constructivist Model:
 Seen as an issue of observational learning
 readiness = the extent to which children can learn tasks by interacting not just with
teachers, but also with more knowledgeable peers and adults.
 inclusive approach (parents, teachers, other adults) in the assessment process.
 Cumulative-Skills Model:
 Readiness = the extent to which children possess important prerequisite skills
necessary for learning foundational subjects such as reading and math.
 Example: Policies that require assessment of pre-academic skills upon kindergarten entrance
 Ecological Model:
 a holistic methodology
 readiness = an interaction between developmental status and children’s environments.
 readiness does not reside within the child alone, but stems as well from an interaction
with the readiness of families, communities, services, and preschool settings.
 assessment for readiness is a complex, qualitative evaluation that involves the wider
community.
 Ideal Screening Instrument
 a short test that can be administered by teachers, school nurses, and other individuals who
have received limited training in assessment.
 provides a cutoff score that is accurate in classifying children as normal or at risk.

 Errors in Screening Tests:
 Two kinds of errors can occur.
 Normal children who fail the test would be referred to as false-positive cases
 they are falsely classified as positive for potential disability
 At-risk children who pass the test would be referred to as false-negative cases
 they are falsely classified as negative for potential disability
 Consequences of Errors
 False-positive misclassification
 rarely leads to undesirable consequences
 purpose of screening is merely to identify children in need of additional evaluation,
which means that false-positive cases will receive further evaluation
 false-negative cases
 typically do not receive further evaluation
 misclassification is potentially more serious—because a needy child is deemed to be
normal.
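The four possible screening outcomes described above can be tallied with a short sketch. The function name and the sample data are hypothetical, and "failed the screen" is taken to mean that the child scored in the at-risk range.

```python
from collections import Counter

def classify(failed_screen, truly_at_risk):
    """Label one child's screening outcome."""
    if failed_screen and truly_at_risk:
        return "true positive"
    if failed_screen and not truly_at_risk:
        return "false positive"   # normal child referred for further evaluation
    if not failed_screen and truly_at_risk:
        return "false negative"   # at-risk child missed by the screen
    return "true negative"

# Hypothetical (failed_screen, truly_at_risk) pairs; not real data.
children = [(True, True), (True, False), (False, True),
            (False, False), (False, False), (True, True)]
tally = Counter(classify(f, r) for f, r in children)
print(tally["false positive"], tally["false negative"])  # 1 1
```

In this toy sample the single false-positive child still receives a follow-up evaluation, whereas the single false-negative child does not, which is why the second error is the more serious one.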
 Five Common Pitfalls of Developmental and Behavioural Screening (Glascoe and Shapiro,
2005)
 Waiting until the problem is observable.
 Some clinicians use a screening test only after the problem is manifest—a waste of time
and effort.
 Ignoring screening results.
 Practitioners may adopt a “wait and see” outlook—early intervention is then pointlessly
postponed
 Relying on informal methods.
 Clinicians often employ their own informal methods— consequently, children in need of
services go undetected.
 Using inappropriate tests.
 Some clinicians sparingly use long batteries instead of screening tests—as a result,
children with disabilities are overlooked.
 Assuming services are limited or nonexistent.
 Practitioners often incorrectly assume that services are not available—consequently, they
are reluctant to administer screening tests.
 Qualities of a Good Preschool Screening Instrument
 Basic Rule: Must address a few prerequisite domains for school readiness
 Involves a number of broad areas, including motor, language, cognitive, social, and
emotional functioning.
 Requires that children function at or near age-appropriate levels in all these areas.
 What are the qualities of a good preschool screening instrument?
 The primary purpose is screening rather than assessment, diagnosis, or prediction of
academic success.
 Screening is provided in most or all of these areas: motor, language, cognitive, social, and
emotional functioning.
 Overall test–retest reliability coefficient is a minimum of .70, preferably higher.
 Concurrent validity against a comprehensive assessment is a minimum of .70, preferably
higher.
 Sensitivity and specificity of “at risk” and “not at risk” classifications, respectively, are
both at least .70.
 Practicality and ease of administration are built in, with testing time of 30 minutes or less.
 Cultural, ethnic, and linguistic sensitivity is evident, that is, the test accurately screens
children from diverse cultures.
 Minimum expertise is required for administration, that is, the test is suitable for
paraprofessionals to administer.
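The numeric criteria in this list can be checked mechanically. The sketch below assumes the usual definitions of sensitivity (share of at-risk children correctly flagged) and specificity (share of not-at-risk children correctly passed). The function names, counts, and coefficients are hypothetical and do not describe any published instrument.

```python
def sensitivity(tp, fn):
    """Proportion of at-risk children the screen correctly flags."""
    return tp / (tp + fn)

def specificity(tn, fp):
    """Proportion of not-at-risk children the screen correctly passes."""
    return tn / (tn + fp)

def meets_benchmarks(retest_r, concurrent_r, sens, spec, minutes):
    """Check the numeric criteria listed above (.70 floors, 30-minute ceiling)."""
    return (retest_r >= 0.70 and concurrent_r >= 0.70
            and sens >= 0.70 and spec >= 0.70 and minutes <= 30)

# Hypothetical counts and coefficients for an imaginary screening test.
sens = sensitivity(tp=40, fn=10)      # 0.8
spec = specificity(tn=170, fp=30)     # 0.85
print(meets_benchmarks(0.85, 0.74, sens, spec, minutes=20))  # True
```

The qualitative criteria (screening purpose, cultural sensitivity, suitability for paraprofessionals) still require a judgment that a check like this cannot make.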
 Instruments for Preschool Screening
 A Sample of School Readiness Screening Tools
 Ages and Stages Questionnaire
 Description: Birth to 60 months; parent report of language, cognition, personal-social,
and motor skills; available in English, Spanish, French, and Korean; takes 10 to 20
minutes; clerical or paraprofessional tester.
 Brigance Screens
 Description: Birth to 60 months; observation of social-emotional skills,
speech-language, motor, readiness, and general knowledge; available in English, Spanish,
Laotian, Vietnamese, Cambodian, and Tagalog; takes 15 to 20 minutes; consult
online training module before scoring.
 Early Screening Inventory-Revised
 Description: 36 to 60 months; observation of visual motor/adaptive, language and
cognition, and gross motor skills; available in English and Spanish; takes 15 to 20
minutes; screeners and scorers can be trained with a manual and video.
 FirstSTEP Preschool Screening Tool
 Description: 33 to 62 months; observation of cognitive, communication, and motor
domains and classifications of: within acceptable limits, caution, or at-risk; available
in English only; takes 15 to 20 minutes; screeners and scorers can be trained with a
manual and video.
 Minneapolis Preschool Screening Instrument-Revised
 Description: 36 to 60 months; 64 dichotomous items pertaining to cognitive,
language, literacy, motor, and perceptual development; available in English, Spanish,
Somali, Hmong; takes 12 to 15 minutes to administer, 2 to 5 minutes to score; easy to
learn, suitable for paraprofessionals.
 Parents’ Evaluation of Developmental Status
 Description: Birth to 96 months; parental response in 10 areas such as cognitive,
expressive language, fine motor, social-emotional; available in English, Spanish, and
Vietnamese; takes 5 minutes to administer, 2 minutes to score; suitable for
paraprofessionals and clinic office staff.
 DIAL-4
 uses conventional approaches for the identification of developmental delay
 Structure
 The 5 Domains with Example Items/Description
 Motor
 Fine-motor items include block building, cutting, copying shapes and letters,
name writing, and finger touching; gross-motor items include catching,
jumping, hopping, and skipping.
 Concepts
 Pointing to named body parts, naming or identifying colors, rote counting,
counting blocks, positioning blocks, identifying concepts, and sorting shapes.
 Language
 Giving personal information (name, age, sex), naming objects and actions,
proper articulation, and phonemic awareness (e.g., rhyming).
 Self-Help
 Parent and teacher fill out separate questionnaires with items relevant to the
child’s personal care skills, such as eating, grooming, and dressing.
 Social-Emotional
 Parent and Teacher fill out separate questionnaires with items relevant to the
child’s social skills with other children and parents, such as sharing, empathy,
self-control, and rule compliance
 Administration
 Items in the first three domains (motor, concepts, language) are administered directly to the child by the examiner.
 Two additional domains (self-help and social-emotional) are appraised by means of
questionnaires filled out by a parent (or both parents jointly) and a teacher
 Scoring
 Derives 8 Standard Scores
 Respondent = Child (4 Scores)
 Performance Areas: Motor, Concepts, Language, Total Score
 Respondent = Questionnaire Results (1 score)
 Performance Area = Self Help
 Respondent = Parent (2 scores)
 Performance Area = Social-Emotional, Self Help
 Respondent = Teacher (1 score)
 Performance Area = Social-Emotional
 Total score of direct academic relevance
 obtained by summing the first three area scores (motor, concepts, language)
 yields a total of eight scaled scores (mean of 100, SD of 15)
 Interpretation
 Scoring is discrete and objective for some items, while others require more subjective
interpretation
 This detracts from the reliability of the instrument
 Provides a wealth of information in addition to standardized scores, such as
raw scores, cutoff scores, and percentile ranks
 Cut-Off Scores: key feature of the test is that for each of the eight areas shown,
the manual provides cutoff scores for assigning the child to one of two outcome
groups labeled “potential delay” and “okay.”
 A finding of “potential delay” in one or more areas is a starting point for
further discussion, not a mandate for any high-stakes decision-making
 Psychometric Properties
 Standardization
 Norms provided in 2-month intervals due to quick developmental changes
 available in both English and Spanish, although standardization is now based on
the combined normative sample, that is, separate norms are not provided
 Reliability
 Total Scale Reliability = .87
 Test-Retest Reliability = .90 for tests used to make individual decisions
 Validity
 Content validity is judged to be high
 Criterion-related validity is strong
 Construct validity is supported by confirmatory factor analysis
 Practical Value
 The value of a screening test is best judged by the extent to which it accurately
identifies children in need of further developmental assessment, and accurately
identifies children who are normal as normal
 Specificity and Sensitivity
 Specificity is reported to be in the range of .82 to .86 and sensitivity is reported to
be in the range of .73 to .82.
 Increasing sensitivity inevitably will reduce specificity (percentage of normal
children correctly identified as normal).
 The result would be many over-referrals (children identified as “potential delay” who
actually are normal).
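The trade-off between sensitivity and specificity can be illustrated with a short Python sketch. The scores, the true-status labels, and the two cutoffs below are made-up values chosen only to show the effect, and the rule "flag a child when the score is below the cutoff" is an assumption for the example rather than the procedure of any particular test.

```python
from typing import Sequence


def sensitivity_specificity(scores: Sequence[float],
                            delayed: Sequence[bool],
                            cutoff: float) -> tuple[float, float]:
    """Flag a child as 'potential delay' when the score is below the cutoff."""
    pairs = list(zip(scores, delayed))
    tp = sum(1 for s, d in pairs if d and s < cutoff)       # delayed, flagged
    fn = sum(1 for s, d in pairs if d and s >= cutoff)      # delayed, missed
    tn = sum(1 for s, d in pairs if not d and s >= cutoff)  # normal, passed
    fp = sum(1 for s, d in pairs if not d and s < cutoff)   # normal, over-referred
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical screening scores and true developmental status.
scores = [70, 78, 84, 92, 80, 88, 95, 101, 106, 112]
delayed = [True, True, True, True, False, False, False, False, False, False]

for cutoff in (85, 95):
    sens, spec = sensitivity_specificity(scores, delayed, cutoff)
    print(f"cutoff={cutoff}: sensitivity={sens:.2f}, specificity={spec:.2f}")
```

Raising the cutoff from 85 to 95 catches every delayed child in this toy sample, but it also flags more normal children, which is the over-referral problem noted above.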
 Denver II
 Uses conventional approaches for the identification of developmental delay
 An updated version of the highly popular Denver Developmental Screening Test-
Revised
 Most widely known and researched pediatric screening tool in the United States; the
instrument is also popular worldwide
 Brief Description
 Screening test, assessing developmental progress for children 0-6 years of age.
Defines the ages at which children accomplish a broad variety of heterogeneous tasks.
 Structure
 Items and Domain
 125 items in four areas:
 personal-social
 fine motor-adaptive
 language
 gross motor
 Items arranged chronologically on the test by age of the child and marked
pass/fail
 Administration: mix of parent report, direct elicitation, and observation; 20 minutes
or less
 Scoring and Interpretation
 Scores Produced
 Does not produce a developmental quotient or score.
 Results on the 30 age-appropriate items are interpreted as normal,
questionable, or abnormal in reference to age-based norms
 A category of “untestable” also is included.
 Psychometric Properties
 Standardization
 Consisted of 2,096 children, all from the state of Colorado, stratified by age, race,
and socioeconomic status
 Reliability
 Interrater reliability among trained raters averaged an outstanding .99.
 Test–retest reliability for total score over a 7- to 10-day interval averaged .90
 Validity
 Proof of Validity
 excellent content validity insofar as the behaviors tested are recognized by
authorities in child development as important markers of development
 Concerns for Validity
 Glascoe and Byrne (1993): the Denver II correctly identified 15 of the 18 at-risk
children but performed poorly with the normal children
 A blue-ribbon review panel, the Minnesota Interagency Developmental Screening
Task Force, flatly concluded that the Denver II is not suitable for developmental and
social-emotional screening of preschool children
 Home Observation for Measurement of the Environment (HOME)
 Background and Description
 embodies a radical departure from traditional procedures for the identification of
developmental delay
 developed to provide a direct process measure of children’s environments.
 argue that direct assessment of children’s experiences provides a more accurate
index of variations in the home environment
 probably the most widely used index of children’s environment
 Based on in-home observation and an interview with the primary caretaker, the
instrument provides a measure of children’s physical and social environments.
 Comes in Three Forms
 Infant and Toddler – 45 items in 6 subscales
 Early Childhood – 55 items in 8 subscales
 Middle Childhood – 59 items in 8 subscales.
 Structure of The Infant and Toddler
 The Infant and Toddler form consists of 45 items organized into the following six
subscales:
 Emotional and Verbal Responsivity of Parent
 Acceptance of the Child’s Behavior
 Organization of the Environment
 Provision of Appropriate Play Materials
 Parent Involvement with Child
 Variety of Stimulation
 Technical Features
 Reliability
 Bradley & Caldwell (1984) – Methods used for the assessment of reliability
included interobserver agreement, internal consistency, and long-range test–retest
stability coefficients for 91 families from the standardization sample
 Test–retest stability – The correlation between total scores at 12
and 24 months of age was a highly respectable .77.
 Interobserver Agreement – reports of 90% or higher
 Internal consistency – ranged from .67 to .89 for all subscales except Variety
of Stimulation, which yielded a coefficient of only .44.
 Validity
 Validity coefficients show modest correlations with SES indices.
 scores should be significantly but not highly related to SES indices as the
inventory was proposed as a more meaningful, sensitive index of environment
than social class
 subscale correlations with SES are mainly in the .30s and .40s, while the total
score–SES correlation is .45 (Bradley, Rock, Caldwell, & Brisby, 1989)
 Predictive Validity
 strong relationship with poverty status in Caucasian and minority samples
 predicted that children would exhibit fewer behavior problems and better
preschool ability
 Criterion Validity
 strong, theory-confirming relationships with appropriate external criteria,
including language and cognitive development, school failure, therapeutic
intervention, and mental retardation
 Factor-analytic studies of the HOME also support the construct validity of this
instrument
 Practical Application
 shows promise not only in research but also as a practical adjunct to intervention.

B. Testing Persons with Disabilities 


a. Origins of tests for Special Populations
 1950s – renewed commitment to the needs and rights of physically and mentally disabled
persons arose in the United States
 more supportive stance that favored new programs and initiatives on behalf of the
disabled
 1970’s – renewed concern for the needs of disabled persons was translated into federal
legislation
 Public Law 93-112 of 1973, serving as a “Bill of Rights” for individuals with disabilities
 outlawed discrimination on the basis of disability
 Education for All Handicapped Children Act (Public Law 94-142) of 1975
 mandated that disabled schoolchildren receive appropriate assessment and
educational opportunities
 directed to assess children in all areas of possible disability— mental, behavioral, and
physical—and to use instruments validated for those express purposes.
b. Nonlanguage Tests
 Require little or no written or spoken language from examiner or examinee
 Diagnostic/Classification Uses
 suited for assessment of those who
 are non-English-speaking
 experience speech impairments
 have weak language skills or are nonverbal
 Clarification use:
 Can also be used as supplementary tests for examinees who have no disabilities.
 Leiter International Performance Scale-Revised
 Summary: 
 The Leiter-R is an individually administered test designed to assess cognitive functions in
children and adolescents ages 2-20.
 The battery measures nonverbal intelligence in fluid reasoning and visualization, as well
as appraisals of visuospatial memory and attention.
 There is a complete elimination of verbal instructions
 Structure
 The test is divided into a (1) visualization and reasoning battery and an (2) attention
and memory battery.
 (1) visualization and reasoning battery
 10 subtests – Not all subtests are administered to every child
 6 Visualization and 4 Reasoning: Figure Ground, Design Analogies, Form
Completion, Matching, Sequential Order, Repeated Patterns, Picture Context,
Classification, Paper Folding, Figure Rotation
 (2) attention and memory battery.
 10 subtests - Not all subtests are administered to every child
 8 Memory and 2 Attention: memory span, spatial memory, associative
memory, delayed recognition memory, an underlining test, a measure of
divided attention
 Uses
 children with any of these features:
 non-English-speaking
 autism
 traumatic brain injury
 speech impairment
 hearing problems
 impoverished environment
 useful in the assessment of attentional problems
 Psychometric Properties
 Norm-Referenced
 yields a composite IQ with the familiar mean of 100 and standard deviation of 15.
 produces subtest scaled scores with a mean of 10 and standard deviation of 3, as well
as a variety of composite scores useful in clinical diagnosis.
 Standardization
 normed on over 2,000 children and adolescents, from 2 to 21 years of age
 carefully stratified according to race, age, gender, social class, and geographic region
 Reliability
 Internal consistency reliability for subtests, domain scores, and IQ scores = excellent
– coefficient alphas are in the high .80s for subtests and the low .90s for domain
scores and IQ scores.
 Validity
 Validity-confirming correlation of r = .80 with another nonverbal measure of
intelligence.
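The two score metrics noted above (composite scores with a mean of 100 and standard deviation of 15, subtest scaled scores with a mean of 10 and standard deviation of 3) are linear rescalings of the same z-score. The Python sketch below shows that arithmetic. It assumes normally distributed scores for the percentile step and does not reproduce any test's norm tables; the function names are hypothetical.

```python
from statistics import NormalDist


def rescale(score, from_mean, from_sd, to_mean, to_sd):
    """Convert a score from one standard-score metric to another via its z-score."""
    z = (score - from_mean) / from_sd
    return to_mean + z * to_sd


def percentile_rank(score, mean, sd):
    """Percentile rank under a normal distribution (an assumption for this sketch)."""
    return 100 * NormalDist(mean, sd).cdf(score)


# A subtest scaled score of 13 is one standard deviation above the mean.
subtest_scaled = 13
on_iq_metric = rescale(subtest_scaled, 10, 3, 100, 15)
print(on_iq_metric)                                   # same standing on the 100/15 metric
print(round(percentile_rank(on_iq_metric, 100, 15), 1))
```

In practice, scores are read from the manual's age-based norm tables; the sketch only shows why a scaled score of 13 and a composite of 115 describe the same relative standing.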
 Human Figure Drawing Tests
 Note: they may not be valid for use even as screening measures
 (1) Florence Goodenough (1926) & the Draw-A-Man test – Revised by Harris (1963) and
renamed the Goodenough-Harris Drawing Test
 Description: a brief, nonverbal test of intelligence that can be administered individually
or in a group.
 the instructions are brief and basic, so for all practical purposes, it is a nonlanguage
test
 Purpose: to measure intellectual maturity, not artistic skill.
 Scoring: emphasizes accuracy of observation and the development of conceptual
thinking
 73 scorable items, transformed to a scaled score with the familiar mean of 100 and
standard deviation of 15.
 receives credit for including body parts and details, as well as for providing
perspective, realistic proportion, and implied freedom of movement
 Psychometrics
 Concurrent validity
 Correlations between Goodenough-Harris Drawing Test scores and WPPSI Full
Scale IQ in the range of .72 to .80.
 Correlations with individual IQ tests are more variable, but the majority are over .50.
 (2) Naglieri (1988) & The Draw A Person: A Quantitative Scoring System (DAP)
 Scoring – Provides a clear scoring system
 Psychometric Properties
 Standardization
 normed on a sample of 2,622 individuals ages 5 through 17 years who were
representative of the 1980 U.S. Census data on age, sex, race, geographic region,
ethnic group, social class, and community size.
 standard scores with the familiar mean of 100 and standard deviation of 15
 Reliability and Validity
 Reliability is strong, but there are concerns about the validity of the test
 low to moderate predictive validity
 Hiskey-Nebraska Test of Learning Aptitude
 Description: a nonlanguage performance scale for use with children aged 3 to 17 years
 Uses: useful with children who are deaf, have speech or language impairments or mental
retardation, or are bilingual
 Administration: Entirely through pantomime and requires no verbal response from the
examinee; verbal instructions can be used with children with normal and mild hearing
impairment.
 Structure: consists of 12 subtests
 Bead Patterns
 Block Patterns
 Memory for Color
 Completion of Drawings
 Picture Identification
 Memory for Digits
 Picture Association
 Puzzle Blocks
 Paper Folding
 Picture Analogies
 Visual Attention Span
 Spatial Reasoning
 Scoring: Raw scores on the subtests are converted into a Deviation Learning Quotient (LQ)
with mean of 100 and standard deviation of 16.
 Psychometric Properties:
 Standardization
 chief weakness of the instrument is the inadequacy of the norms used to standardize
scores
 contemporary and more detailed restandardization of the test would be quite helpful.
 Reliability
 Watson (1983): test–retest reliability
 reported to be .79, .85, and .62 after intervals of about 1 year, 3 years, and 5 years,
respectively, which is similar to data for normal children
 ≥ 1/3 of sample showed a 15-point or greater change in scores over the 5-year
time span → importance of basing important decisions on more than a single
measure.
 Test of Nonverbal Intelligence-4 (TONI-4)
 Description: a language-free measure of cognitive ability designed for disabled and
language-impaired populations
 a pragmatic, brief, and simple measure
 administered in 15 to 20 minutes
 response format can include any simple gesture such as nodding or pointing
 Uses:
 well suited for persons who are deaf, language impaired, or physically limited
 Suitable for persons aged 6:0 through 89:11
 Structure:
 two equivalent forms (A and B), each consisting of 60 abstract or figural items that do
not include pictures or cultural symbols.
 require the examinee to solve problems by identifying relationships among the abstract
figures
 Scoring: The test yields three kinds of scores: age equivalents (for younger examinees),
percentile ranks, and TONI-4 quotients (mean of 100 and standard deviation of 15).
 Psychometric Properties
 Standardization
 By adding new items, the fourth edition realized a higher ceiling and a lower floor
than the previous version
 more carefully standardized than most
 Reliability
 possesses excellent reliability
 internal consistency coefficients – greater than or equal to .90
 alternate-forms reliability – range of .80 to .95.
c. Nonreading and Motor-Reduced Tests   
 Description: designed for illiterate examinees who can, nonetheless, understand spoken English
well enough to follow oral instructions.
 Uses: well suited to young children, illiterate examinees, and persons with speech or expressive-
language impairments.
 The Challenge of Assessment in Cerebral Palsy or other orthopedically impairing conditions
 The motor deficits, increased tendency to fatigue, and inexactness of purposive movements
common to persons with cerebral palsy will
 Cause the person to score very poorly on nonreading tests that require manipulatory
responses
 Negatively affect their performance on cognitive assessment tools
 Needs: tests that permit a simple pointing response are well suited to the assessment of
children and adults with cerebral palsy or other motor-impairing conditions
 need tests that are both nonreading and motor reduced.
 Peabody Picture Vocabulary Test-IV (PPVT-4)
 Description
 Measurement Areas
 designed to measure the receptive (hearing) vocabulary of English-speaking adults
and children.
 no specific content areas are described in the manual → designed to cover a broad
range of English-language content.
 Use:
 measure of listening vocabulary with persons who are deaf or who have neurological
or speech impairments (any examinee who cannot verbalize well)
 examinees who also manifest motor-impairing conditions such as cerebral palsy or
stroke.
 Purpose:
 The PPVT-4 is a norm-referenced language assessment tool that can be used to:
 evaluate English-language competence;
 measure language learning in second-language speakers;
 determine the appropriate level and content for educational instruction;
 identify language deficits due to injury or disease.
 Note: the instrument is not a substitute for a general intelligence test and PPVT-4
scores may underestimate intellectual functioning in some groups (e.g., minority
children, high-functioning adults)
 Structure:
 Comes in two parallel versions, each consisting of 4 practice plates and 228 testing
plates.
 single-scale test that is intended to measure English “vocabulary knowledge” in a
person
 Administration, Scoring, and Interpretation
 While the PPVT-4 manual contains no specific suggestions for examiner, scorer, or
interpreter qualifications, it can be assumed that a “Level 2” qualification is reasonable
for interpreters.
 paraprofessionals can administer the test with training and qualified supervision.
 moderately easy to administer, easy to score, and moderately difficult to interpret based
on the interpretation guide in the manual.
 Raw scores are tabulated and converted into standard scores (mean of 100, standard
deviation of 15), percentiles, normal curve equivalents, stanines, approximate age and
grade equivalents, and “growth scale values” (GSVs).
 Psychometric Properties
 Reliability
 Test-retest Reliability: yielded correlations between .92 and .96 (very high).
 Internal Consistency:
 split-half reliability for each form of the test (A and B), yielding .94 and .95 on
each form.
 reliable across all the age groups that were measured.
 Alternate-Form Reliability: The reliability coefficients were calculated between .87
and .93, which are considered “very reliable.”
 Validity
 Demonstrates reasonable content validity for American populations.
 Concurrent and predictive validity data for the Peabody are somewhat limited but
promising
 appropriate discriminant validity
 modest relationships (r’s from .30 to .60) are common
d. Testing Persons with Visual Impairments
 Legally blind—a term used in determining eligibility for government benefits
 individuals with central visual acuity of 20/200 or less in the better eye (with correction)
 those with significant reduction in their visual field to a diameter of 20 degrees or less
 Tests used to assess intellectual functioning of the visually impaired
 Adaptations of the Stanford-Binet
 Hayes-Binet revision
 was based on the 1916 Stanford-Binet; this instrument has since undergone several
revisions.
 Perkins-Binet revision
 retains most of the verbal items from the Stanford-Binet but also adapts other items to
a tactual mode
 possesses acceptable split-half reliability and shows high correlations with verbal
scales of the WISC-R
 Acknowledges that visual problems exist on a continuum by developing separate
norms for children with usable vision (Form U) and no usable vision (Form N).
 The Haptic Intelligence Scale for the Adult Blind (HISAB)
 Modifications of the Wechsler Performance scales
 consists of 6 subtests
 4 resemble the WAIS Performance scale: Digit Symbol, Block Design, Object
Assembly, and Picture Completion
 The other 2 are Bead Arithmetic and Pattern Board
 Psychometrics
 Standardization = Provide normative data on a sample of adults with visual
impairment.
 Reliability = excellent
 Limit = has never been investigated empirically
 Blind Learning Aptitude Test (BLAT)
 Description
 a tactile test for children from 6 to 16 years of age who are blind
 Uses braille-like items of 6 different types:
 recognition of differences
 recognition of similarities
 identification of progressions
 identification of the missing element in a 2 × 2 matrix
 completion of a figure
 identification of the missing element in a 3 × 3 matrix
 Psychometrics
 Standardization = Norm referenced sample is said to be socioeconomically and
racially representative of the U.S. population.
 Reliability = excellent, with internal consistency (Kuder-
Richardson) of .93 and test–retest reliability over a 7-month period of .87 and .92
(two studies).
 Validity = Moderate, shown by correlations with the Hayes-Binet (r = .74)
and the WISC Verbal scale (r = .71).
 Limit = would profit substantially from minor revisions, updated norms, and a more
thorough test manual
 Intelligence Test for Visually Impaired Children (ITVIC)
 Description
 takes about three hours to administer
 Designed for children 6 to 15 years of age, the test has separate norms for partially
sighted and totally blind examinees.
 The instrument includes five verbal subtests adapted from existing instruments such
as the Wechsler scales and seven new nonverbal subtests that rely on tactile
perception:

e. Testing Individuals Who are Deaf or Hard of Hearing


 American Sign Language (ASL) is often a primary means of communication
 Challenges and Solution
 Challenges
 Profound Challenges to a Proper and Valid Assessment
 the typical limited mastery of the English language of persons who are deaf
 the typical psychologist’s limited (or nonexistent) skill in ASL
 the use of an interpreter may inadvertently alter the content of the test, therefore
affecting the validity of the findings.
 Challenges to Test Construction
 Sign language “can now be characterized on a multidimensional continuum
encompassing numerous styles, lexical variants, syntactic structures, dialects, and
approximations to or departures from English word ordering”
 Choosing the correct test
 Gold Standard for Test Choice = the Wechsler Performance subtests
 impact of English language facility is minimized on these subtests, so it is thought
that they provide a more accurate measure of cognitive skill than the Verbal
subtests.
 Others include
 Raven’s Progressive Matrices (Raven, Court, & Raven, 1992)
 the Hiskey-Nebraska Test of Learning Aptitude
 Solutions
 the proper and valid assessment of persons who are deaf requires
 that interested psychologists immerse themselves in the Deaf culture and also seek
relevant educational and training experiences
 That the consulting psychologist refers persons who are deaf to a person or agency
with the requisite talents and expertise
 The examiner to be fluent in sign language, so that any necessary translations stay
within the bounds of standardized procedure if a translator is being used.
f. Assessment of Adaptive Behavior in Intellectual Disability
 The Term Intellectual Disability
 Intellectual disability represents a continuum from very mild to substantially disabling, based
on deficits in intellect and adaptive functioning. These deficits must have arisen during the
developmental period, defined as between birth and the eighteenth birthday
 Skills Related to Adaptive Functioning
 Conceptual skills—language and literacy; money, time, and number concepts; and self-
direction.
 Social skills—interpersonal skills, social responsibility, self-esteem, gullibility, naïveté (i.e.,
wariness), social problem solving, and the ability to follow rules/obey laws and to avoid
being victimized.
 Practical skills—activities of daily living (personal care), occupational skills, health care,
travel/transportation, schedules/routines, safety, use of money, use of the telephone
 The Four Levels of Intellectual Disability and Levels of Support Required
 Mild Intellectual Disability: IQ of 50–55 to 70–75+, Intermittent Support required.
 Reasonable social and communication skills; with special education, attain sixth grade
level by late teens; achieve social and vocational adequacy with special training and
supervision; partial independence in living arrangements.
 Moderate Intellectual Disability: IQ of 35–40 to 50–55, Limited Support required.
 Fair social and communication skills but little self-awareness; with extended special
education, attain fourth grade level; function in a sheltered workshop but need
supervision in living arrangements.
 Severe Intellectual Disability: IQ of 20–25 to 35–40, Extensive Support required.
 Little or no communication skills; sensory and motor impairments; do not profit from
academic training; trainable in basic health habits.
 Profound Intellectual Disability: IQ below 20–25, Pervasive Support required.
 Minimal functioning; incapable of self-maintenance; need constant nursing care and
supervision.
 Assessment of Adaptive Functioning
 First – Vineland Social Maturity Scale (Doll, 1935).
 undergone several revisions and is now known as the Vineland Adaptive Behavior
Scales, Second Edition
 Since – over 100 scales of adaptive behavior have been published
 vary greatly in structure, intended purpose, and targeted population
 can distinguish two types of instruments designed for two different purposes.
 Group 1: mainly norm-referenced scales, used largely to assist in diagnosis and
classification.
 Group 2: mainly criterion-referenced scales, used largely to assist in training and
rehabilitation
 Scales of Independent Behavior-Revised (SIB-R)
 Summary:
 comprehensive, norm-referenced assessment of adaptive and maladaptive behavior
 assesses functional independence and adaptive functioning across settings: school,
home, workplace, and community.
 designed for individual evaluation, individualized program planning, selection, and
placement, and to assess service needs
 Administration and Format
 Administered using the structured interview or a checklist procedure
 14 areas of adaptive behavior, 8 areas of problem behavior.
 Three forms: Early Development, Short Form, Full Scale.
 The Subscales and Clusters of the Scales of Independent Behavior-Revised
 Motor Skills
 Gross Motor—19 large muscle skills such as sitting without support or taking
part in strenuous physical activities.
 Fine Motor—19 small muscle skills such as picking up small objects or
assembling small objects
 Social and Communication Skills
 Social Interaction—18 skills requiring interaction with other people such as
handing toys to others or making plans with friends to attend social activities.
 Language Comprehension—18 skills involving the understanding of spoken
and written language such as looking toward a speaker or reading.
 Language Expression—20 tasks involving talking such as making sounds to
get attention or explaining a written contract.
 Personal Living Skills
 Eating and Meal Preparation—19 skills related to eating and meal preparation,
ranging from drinking from a glass to planning a meal.
 Toileting—17 skills necessary to bathroom and toilet use.
 Dressing—18 skills related to dressing, ranging from holding out arms and
legs while being dressed to arranging for clothing alterations.
 Personal Self-Care—16 tasks involved in basic grooming and health
maintenance, for example, washing hands and making a medical appointment.
 Domestic Skills—18 tasks needed to maintain a home, ranging from putting
empty dishes in the sink to selecting appropriate housing
 Community Living Skills
 Time and Punctuality—19 tasks involving time concepts and time
management such as keeping appointments.
 Money and Value—20 skills related to money concepts, such as saving money
and using credit.
 Work Skills—20 skills related to prevocational and work habits, for example,
indicating that an assigned task is completed.
 Home–Community Orientation—18 skills involved in getting around the
home and neighborhood and traveling in the community, for example,
locating a dentist.
 Scoring
 Yields standard scores, percentile ranks, age equivalents, and developmental range.
 yields two main scale scores plus a derived Support Scale score; different rating systems are used
for the two main scales.
 the Adaptive Behavior Full Scale score
 items are rated based on the extent to which the individual performs a task
completely and independently (with no help or supervision).
 the Problem Behavior Scale score
 rated based on the frequency and severity of each behavior.
 The Support Scale score
 based on the information obtained from the other two scales
 indicates an approximate level of support that an individual may need in order
to be independent in different areas.
 Psychometric Properties
 Standardization
 standardization of the SIB-R was well conceived and executed
 2,182 persons sampled; cover persons from age 3 months to adults over age 80.
 Anchored to the norms for the Woodcock-Johnson Psycho-Educational Battery-
Revised.
 SIB-R is one component of this larger test battery, but can be used on its own.
 Reliability
 generally respectable, but somewhat variable from subscale to subscale and from
one age group to another
 The individual subscales show split-half reliabilities in the vicinity of .80
 the four clusters have median composite reliabilities around .90;
 the Broad Independence Scale has a very robust reliability in the high .90s
 Validity
 Validity data for the SIB-R are very promising
 For examinees with disabilities, SIB-R scores correlate very strongly with intelligence scores (in the .80s), whereas with
nondisabled examinees, the relationship is minimal (Bruininks et al., 1996).
 Possesses excellent convergent validity— .83
 Discriminant validity – helpful in the evaluation of elderly clients with mild
cognitive impairment.
 Inventory for Client and Agency Planning (ICAP)
 Description
 one of the most widely used tests in the field of developmental disabilities
 Focus = Determining the need for special services such as personal care, remedial
education, vocational training, or sheltered work environment.
 suitable for children and adults with mental retardation, individuals who become
disabled as adults through illness or accident, and elderly persons who have slowly
lost their independence and, therefore, need special assistance
 Format
 Scoring
 provides an overall Service Score based on both adaptive and maladaptive
behavior.
 ranges from 0 to 100, indicates the likely level of attention, supervision, and
training needed by the client.
 The lower the score, the greater the need for oversight
 Scales and Subscales of the Inventory for Client and Agency Planning

 Psychometrics
 enhanced reliability (r = .80) in comparison to similar subscales from other
instruments that reveal low reliability (r = .60)
 Additional Measures of Adaptive Behavior
 The Vineland Adaptive Behavior Scales-II (VABS-II, Sparrow, Cicchetti, & Balla,
2005)
 the outcome of a major revision and restandardization
 provides an evaluation in the following domains and subdomains:
 Communication (receptive, expressive, written)
 Daily Living Skills (personal, domestic, community)
 Socialization (interpersonal relationships, play and leisure time, coping skills)
 Motor Skills (gross, fine).
 Psychometrics
 respected instrument with good concurrent validity: correlations in the range of
.50 to .80 with the Wechsler scales and Stanford-Binet.
 The AAMR Adaptive Behavior Scales: Second Edition (Nihira, Leland, & Lambert,
1993)
 assessing the appropriate behavioral domains
 careful attention to maladaptive behaviors, which are evaluated in eight domains
 Violent and antisocial behavior
 Rebellious behavior
 Eccentric and self-abusive behavior
 Untrustworthy behavior
 Withdrawal
 Stereotyped and hyperactive behavior
 Inappropriate body exposure
 Disturbed behavior
 Psychometrics
 Large and broad normative sample
 extensively validated and clearly distinguishes persons independently classified at
different adaptive behavior levels.
g. Assessment of Autism Spectrum Disorders
 Assessment of ASD
 A complex endeavor that includes screening tests, behavioral observations, and diagnostic
evaluation by specialists in pediatrics, neurology, and psychology.
 Early diagnosis and intervention are vital because of the improved prognosis
 Important Note: Excessive reliance on checklists or tests is unwise.
 Assessment Tools
 Modified Checklist for Autism in Toddlers (M-CHAT; Robins, Fein, & Barton, 1999)
 Format: appealing 23-item checklist
 Type: Diagnostic screening test
 use only in conjunction with further diagnostic evaluation, in the event of a “failing”
score.
 Populations: used with toddlers between 16 and 30 months of age
 Purpose: to identify children at risk for ASDs
 Scoring: toddlers who fail three or more items (or two or more critical items) should be referred for
further evaluation by specialists.
 Psychometrics: strong content validity, but yields a high false-positive rate
 an acceptable price to pay for identifying at-risk children who might otherwise go
undetected for additional months or years.
 Baby & Infant Screen for Children with Autism Traits-Part 1 (BISCUIT-Part 1; Matson
et al., 2005)
 Description
 consists of 71 items
 assess the core symptoms of autism in toddlers 17 to 37 months of age.
 Administration and scoring
 completed by a parent or caretaker on a 3-point scale ( 0 = not different, no
impairment; 2 = very different, severe impairment)
 Psychometrics
 Factor analysis supported the construct validity of the scale (Matson, Boisjoli, Hess,
& Wilkins, 2010).
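The referral rule for the 23-item toddler checklist above (fail three or more items, or two or more critical items) can be sketched in Python. This is only an illustration: the function name `needs_referral` and the item numbers in `CRITICAL_ITEMS` are hypothetical placeholders, not the published item key.

```python
# Hypothetical placeholder key: which item numbers count as "critical".
# The real checklist defines its own critical items; these are made up.
CRITICAL_ITEMS = frozenset({2, 7, 9})


def needs_referral(failed_items, critical_items=CRITICAL_ITEMS):
    """Return True if a screening result meets the referral rule.

    Rule from the notes: refer when three or more items are failed,
    or when two or more critical items are failed.
    """
    failed = set(failed_items)
    failed_critical = failed & set(critical_items)
    return len(failed) >= 3 or len(failed_critical) >= 2


if __name__ == "__main__":
    print(needs_referral([1, 4]))     # two non-critical failures
    print(needs_referral([2, 7]))     # two critical failures
    print(needs_referral([1, 4, 5]))  # three failures of any kind
```

A positive result here only means "refer for further evaluation"; as the notes stress, a failing screen is not a diagnosis.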
Chapter 8 Foundations of Personality Testing
A. Theories of Personality and Projective Techniques
 Personality tests:
 seek to measure one or more of the following:
 personality traits
 dynamic motivation
 symptoms of distress
 personal strengths
 attitudinal characteristics.
 Measures of spirituality, creativity, and emotional intelligence also fall within this realm.
a. Personality: An Overview
 Personality: characteristic pattern of thinking, feeling, and behaving that is unique to each
individual, and remains relatively consistent over time and situations.
 invoke the concept of personality to make sense out of the behavior and expressed feelings of
others.
 used to explain behavioral differences between persons and to understand the behavioral
consistency within each individual
 Mayer (2007-2008): reminder that some vital issues can be approached through the
empiricism of psychological research and testing, whereas other crucial matters remain
elusive and are amenable mainly to philosophical and phenomenological inquiry.
b. Psychoanalytic Theories of Personality
 Created by Sigmund Freud, the singular genius of the Victorian and early-twentieth-century
era to whom the substantial foundations of this approach can be traced
 Origins of Psychoanalytic Theory
 Freud developed a general theory of psychological functioning with the concept of the
unconscious as its foundation.
 He believed that the unconscious was the reservoir of instinctual drives and a storehouse
of thoughts and wishes that would be unacceptable to our conscious self
 entire family of projective techniques emerged, including inkblot tests, word association
approaches, sentence completion techniques, and storytelling (apperception) techniques
 Rorschach’s View
 Rorschach (1921) likened his inkblot test to an X-ray of the unconscious mind
 This claim overstated the power of projective techniques
 Even so, the psychoanalytic conception of the unconscious clearly had a
strong influence on testing practices
 The Structure of The Mind
 Freud divided the mind into three structures: the id, the ego, and the superego
 Id and Pleasure Principle
 Id = entirely unconscious
 is also incapable of logic and possesses no concept of time
 mental processes of the id are, therefore, unaltered by the passage of time
 Freud concluded that the id is the seat of all instinctual needs such as for food, water,
sexual gratification, and avoidance of pain
 Follows the Pleasure Principle – the impulsion toward immediate satisfaction
without regard for values, good or evil, or morality
 Ego and Reality Principle
 Part of the id develops into the ego as the child matures
 Ego = Conscious self
 Purpose – to mediate between the id and reality.
 obeys the reality principle – it seeks realistic and safe ways of discharging the
instinctual tensions that are constantly pushing forth from the id.
 Superego
 Purpose
 the ethical component of personality that starts to emerge in the first five years of life.
 roughly synonymous with conscience and comprises the societal standards of right
and wrong that are conveyed to us by our parents
 Function
 partly conscious, but a large part of it is unconscious; that is, we are not always aware
of its existence or operation
 restrict the attempts of the id and ego to obtain gratification. Its main weapon is guilt,
which it uses to punish the wrongdoings of the ego and id
 The Role of Defense Mechanisms
 the ego has a set of tools at its disposal to help carry out its work, namely, mental strategies
collectively labeled defense mechanisms
 come in many varieties, but they all share three characteristics in common:
 (1) Help ego avoid crippling levels of anxiety.
 Anxiety created by the conflicting demands of id, superego, and external reality
 (2) Operate unconsciously
 Controlled by ego but not aware of operation
 (3) Distort inner or outer reality
 may create more problems than it solves
 Assessment of Defense Mechanisms and Ego Functions
 Categories of Defense Mechanisms
 Vaillant (1971)
 developed a hierarchy of ego defense mechanisms based on the assumption that some
mechanisms are healthier or more adaptive than others.
 4 Broad Categories with specific defense mechanisms
 Psychotic
 Immature
 Neurotic
 Mature
 Perry and Henry (2004)
 proposed a similar hierarchy based on the assumption that some mechanisms are
healthier or more adaptive than others.
 developed a sophisticated rating scale of value in clinical practice
 developed the Defense Mechanism Rating Scales (DMRS) as a basis for assessing
the level, type, and severity of defense mechanisms encountered in psychotherapy
patients
 simple quantitative scoring approach in which defense mechanisms were isolated
and identified in short, meaningful segments of the taped interview.
 Likert scale of 1 (highly immature and maladaptive) to 7 (highly mature and
adaptive)
 the most useful score is the Overall Defensive Functioning (ODF) score, which is
the simple average of the ratings of the observed defense mechanisms
 The theoretical range of scores is 1.0 to 7.0
 scores of 3.0 and below are rare
 Scores below 5.0 indicate significant personality disorder or severe depression
 Scores of 6.0 and higher indicate normal or healthy functioning
 Psychometrics
 Interrater reliabilities from six studies were mostly in the mid- to high- .80s
for the ODF scores.
 The stability coefficient for a small sample of patients over a one-month
interval was a respectable .7
 Hierarchy of defense mechanisms (least to most adaptive)
 Psychotic Defense Mechanisms
 Gross denial of external reality such as frank delusions
 Includes denial and distortion
 Acting Out Defense Mechanisms
 Maladaptive behaviors such as impulsive actions
 Includes passive-aggressiveness
 Borderline Defense Mechanisms
 Splitting the image of others into good and bad
 Includes splitting and schizoid fantasy
 Neurotic Defense Mechanisms
 Mechanisms that involve minor reality distortion
 Includes repression and displacement
 Obsessive Defense Mechanisms
 Somewhat adaptive mechanisms
 Includes isolation of affect and intellectualization
 Mature Defense Mechanisms
 Mature forms of defense with minor reality distortion
 Includes humor and sublimation
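The Overall Defensive Functioning (ODF) score described above is a simple average of 1-to-7 ratings, so it can be sketched in a few lines of Python. The function names are illustrative, and the "intermediate" label for scores from 5.0 up to 6.0 is an assumption, since the notes give no interpretation for that range.

```python
from statistics import mean


def overall_defensive_functioning(ratings):
    """Average the 1-7 ratings of the observed defense mechanisms."""
    if not ratings:
        raise ValueError("at least one rated defense mechanism is required")
    if any(not 1 <= r <= 7 for r in ratings):
        raise ValueError("each rating must fall between 1 and 7")
    return mean(ratings)


def interpret_odf(odf):
    """Map an ODF score onto the rough bands given in the notes."""
    if odf >= 6.0:
        return "normal or healthy functioning"
    if odf < 5.0:
        return "significant personality disorder or severe depression"
    # Assumption: the notes give no label for scores from 5.0 up to 6.0.
    return "intermediate (no band given in the notes)"


if __name__ == "__main__":
    # Hypothetical ratings for defenses observed in one taped interview.
    odf = overall_defensive_functioning([4, 5, 3, 6])
    print(odf, interpret_odf(odf))
```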
c. Type Theories of Personality
 Type A Coronary-Prone Behavior Pattern
 Friedman and Rosenman (1974)
 investigated the psychological variables that put individuals at higher risk of coronary
heart disease
 Identified a Type A coronary-prone behavior pattern: behavior pattern consisting of
insecurity of status, hyper-aggressiveness, free-floating hostility, and a sense of time
urgency (hurry sickness).
 First Definition: “an action–emotion complex that can be observed in any person who
is aggressively involved in a chronic, incessant struggle to achieve more and more in
less and less time, and if required to do so, against the opposing efforts of other things
or persons”
 Characteristics
 display a deep insecurity, regardless of their achievement
 desire to dominate others, and typically are indifferent to the feelings of
competitors.
 exhibit a free-floating hostility, and easily find things that irritate them
 suffer from a sense of urgency about getting things done, so they engage in
multitasking
d. Phenomenological Theories of Personality
 Commonality of Theories
 Emphasize the importance of immediate, personal, subjective experience as a determinant of
behavior.
 approaches share a common focus on the person’s subjective experience, personal world
view, and self-concept as the major wellsprings of behavior.
 Origins of the Phenomenological Approach
 Early Viewpoints
 Vestiges of these early viewpoints are evident in virtually every contemporary
phenomenological personality theory
 German philosopher Edmund Husserl (1859–1938) & Phenomenology
 invented a complex philosophy of phenomenology
 concerned with the description of pure mental phenomena.
 Danish writer Søren Kierkegaard (1813–1855) & existentialism.
 well known for his contributions to existentialism.
 Existentialism is the literary and philosophical movement concerned with the
meaning of life and an individual’s freedom to choose personal goals.
 Carl Rogers, Self-Theory, and the Q-Technique
 Carl Rogers → most influential phenomenological theorist
 Developed the personality theory known as self-theory
 Self Theory –
 Popularized Q-Technique
 a generalized procedure that is especially useful for studying changes in the self-concept,
a key element in Rogers’s self-theory.
 developed by Stephenson (1953) but a series of studies by Rogers and his
colleagues served to popularize this measurement approach
 Structure and Administration
 consists of a large number of cards, each containing a printed statement
 Admin Technique 1 (Stephenson)
 examinee is asked to sort a hundred or so statements into nine piles, putting a
prescribed number of cards into each, thus forcing a near-normal distribution.
 examinee puts the cards most descriptive of him or her at one end, those least
descriptive at the opposite end, and those about which he or she is indifferent or
undecided around the middle of the distribution
 Admin Technique 2 (Rogers)
 compare an examinee’s self-sort with his or her ideal sort.
 used the discrepancy between these two sortings as an index of adjustment.
 required to sort the items twice, according to the following instructions:
 1. Self-sort. Sort these cards to describe yourself as you see yourself today,
from those that are least like you to those that are most like you.
 2. Ideal sort. Now sort these cards to describe your ideal person—the person
you would most like within yourself to be (Rogers & Dymond, 1954).
 Using the item pile numbers, Rogers then correlated the two sorts for each subject
separately
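The self-sort versus ideal-sort comparison can be sketched as a Pearson correlation between the pile numbers assigned to each card. This is a minimal illustration, assuming each sort is stored as a list of pile numbers (1 to 9) in the same card order; the card data below are invented.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists of pile numbers."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sorts of equal length with at least 2 cards")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("a sort with no variation cannot be correlated")
    return cov / sqrt(ss_x * ss_y)


if __name__ == "__main__":
    # Hypothetical pile numbers for six cards (a real deck has ~100 cards).
    self_sort = [1, 3, 5, 7, 9, 5]
    ideal_sort = [2, 3, 5, 8, 9, 4]
    print(round(pearson_r(self_sort, ideal_sort), 2))
```

A higher correlation means the self-sort and ideal sort agree more closely, that is, a smaller self-ideal discrepancy.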
e. Behavioral and Social Learning Theories
 Internal-External (I-E) Scale
 Developed by Rotter (1966) on the basis of his social learning views
 Construct Measured: measure of internal versus external locus of control.
 Locus of control refers to the perceptions that individuals have about the source of things
that happen to them
 seeks to assess the examinee’s generalized expectancies for internal versus external
control of reinforcement
 Purpose: to determine the extent to which the examinee believes that reinforcement is
contingent upon his or her behavior (internal locus of control) as opposed to the outside
world (external locus of control).
 Type: a forced-choice self-report inventory – examinee chooses the single statement (from a
pair) with which he or she more strongly concurs.
 balance of internal to external responses determines the overall score on the scale
 Psychometrics: reliable and valid instrument
 stimulated a huge body of research on the nature and meaning of locus of control and
related variables.
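One simple way to turn the "balance of internal to external responses" into a number is to count how many times the examinee picked the external statement in a pair. The sketch below assumes that scoring direction; the three-pair answer key is invented for illustration and is not the published key.

```python
def external_score(responses, external_key):
    """Count the pairs on which the examinee chose the external statement.

    responses:    the option chosen for each pair, e.g. "a" or "b"
    external_key: for each pair, which option is the external statement
    """
    if len(responses) != len(external_key):
        raise ValueError("responses and key must cover the same pairs")
    return sum(1 for chosen, external in zip(responses, external_key)
               if chosen == external)


if __name__ == "__main__":
    # Hypothetical three-pair key and one examinee's choices.
    key = ["a", "b", "a"]
    choices = ["a", "a", "a"]
    print(external_score(choices, key))  # higher = more external
```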
 Bandura and Self-Efficacy
 he has proposed that perceived self-efficacy is a central mechanism in human action
 Self-efficacy is a personal judgment of “how well one can execute courses of action required
to deal with prospective situations”
 useful in explaining why correct knowledge does not necessarily predict efficient action.
 Bandura’s Guidelines for Creating a Self-Efficacy Measure
 he warns against the idea that there can be one all-purpose measure of perceived self-
efficacy
 Scales of self-efficacy need to be adapted to the particular domain of functioning of
interest to the practitioner or researcher
 gives advice on how to construct the best self-efficacy scales, starting with issues of
content validity, response bias, and item analysis, and ending with strategies for validation
of the scale
 must also be practical
f. Trait Conceptions of Personality
 Trait – any “relatively enduring way in which one individual differs from another”
 Trait conceptions of personality have been enormously popular throughout the history of
psychological testing
 Will review three prominent and influential positions from the dozens of trait theories that
have been proposed
 Differ primarily in terms of whether traits are split off into finely discriminable variants
or grouped together into a small number of broad dimensions:
 1. Cattell’s factor-analytic viewpoint identifies 16 to 20 bipolar trait dimensions.
 2. Eysenck’s trait-dimensional approach coalesces dozens of traits into two overriding
dimensions.
 3. Goldberg and others have sought a modern synthesis of all trait approaches by
proposing a five-factor model of personality.
 Cattell’s Factor-Analytic Trait Theory
 Cattell (1950, 1973) refined existing methods of factor analysis to help reveal the basic traits
of personality
 Surface Traits
 the more obvious aspects of personality
 emerged in the first stages of factor analysis when individual test items were correlated
with each other
 tended to come in clusters, as revealed by Cattell’s more sophisticated application
of factor analysis, which he took as evidence of source traits
 Source Traits
 stable and constant sources of behavior
 are less visible but are more important in accounting for behavior
 16 source traits
 independently confirmed by factor analysis of thousands of respondents
 have been incorporated into the Sixteen Personality Factor Questionnaire (16PF),
a trait-based paper-and-pencil test of personality that is discussed in the next
chapter
 The Five-Factor Model of Personality
 Comment on the Trait Concept
 Concerns about the Trait Approach
 whether traits cause behavior or merely describe behavior
 an empty form of circular reasoning
 model merely describes psychopathology but does not explain it.
 low predictive validity
 Mischel (1968) and the Personality Coefficient
 Personality coefficient – a term used to refer to the finding that the predictive validity of
personality scales rarely exceeds .30.
 correlations of r = .30 are of minimal value
 Responses to the Personality Coefficient
 Researchers attempted to
 refine and limit the trait concept
 distinguish the kinds of situations in which behavior is largely determined by
traits
 Modest success, raising the validity of some trait questionnaires substantially beyond
the ominous r = .30 barrier posited by Mischel (1968). But gone forever are the days
of simplistic, generalized assertions such as “trait X predicts behavior Y.”
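A quick way to see why a validity coefficient of .30 is regarded as modest is to square it, which gives the proportion of criterion variance accounted for. This is the standard interpretation of a squared correlation rather than something stated in the notes above:

```latex
r = .30 \quad\Longrightarrow\quad r^{2} = (.30)^{2} = .09
```

In other words, a trait score correlating .30 with a behavior shares about 9% of its variance with that behavior, leaving roughly 91% unaccounted for.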
g. The Projective Hypothesis
 projective hypothesis = the assumption that personal interpretations of ambiguous stimuli must
necessarily reflect the unconscious needs, motives, and conflicts of the examinee.
 Challenge = to decipher underlying personality processes (needs, motives, and conflicts)
based on the individualized, unique, subjective responses of each examinee
 A Classification of Projective Techniques
 Lindzey (1959) – Five Categories of Projectives:
 Association to inkblots or words
 Rorschach inkblot test
 Holtzman Inkblot Technique – psychometrically superior
 word association tests
 Construction of stories or sequences
 Thematic Apperception Test
 The many variations upon this early instrument
 Completions of sentences or stories
 sentence completion test (e.g., Rotter Incomplete Sentences Blank)
 Arrangement/selection of pictures or verbal choices
 Szondi test (discussed in the first chapter)
 currently seldom used
 Expression with drawings or play
 Draw-A-Person test
 House-Tree-Person test
 Very popular despite lack of validity
i. Association Techniques
 The Rorschach
 Background and History
 developed in 1921 by Hermann Rorschach
 inspired by the observation that schizophrenia patients often interpret the things
they see in unusual ways.
 Purpose
 to measure thought disorder for the purpose of identifying mental illness.
 Administration
 In the test, the participant is shown a series of ten ink blot cards and directed to
respond to each with what they see in the inkblot.
 Because completing the Rorschach Test is time intensive and requires a
psychologist trained in its use, there have been many attempts to convert the
Rorschach into an objective test for ease of use
 Information derived
 information about determinants (the aspects of the inkblots that triggered the response,
such as form and color) and location (which details of the inkblots triggered the response)
is often considered more important than content
 "Popularity" and "originality" of responses can also be considered as basic dimensions
in the analysis
 Interpretation is based on use of different scoring systems
 Historically interpreted based on clinician judgment, which is not dependably valid or
reliable
 Comprehensive System (CS) supplanted all previous methods and became the preferred
scoring system because it was more clearly grounded in empirical research
 The Rorschach Performance Assessment System (R-PAS)
 represents an extension and improvement of the CS
 Standardization = availability of an international reference sample for
standardization of scoring variables
 Reliability = Interrater reliability of R-PAS scores is excellent
 Validity = Good for some purposes and not for others
 the Rorschach Prognostic Rating Scale (RPRS)
 Thought Disorder Index (TDI)
ii. Completion Techniques
 Sentence Completion Tests
 respondent is presented with a series of stems consisting of the first few words of a
sentence, and the task is to provide an ending.
 assumes that the completed sentences reflect the underlying motivations, attitudes,
conflicts, and fears of the respondent
 can be interpreted in two different ways:
 subjective-intuitive analysis of the underlying motivations projected in the
subject’s responses
 objective analysis by means of scores assigned to each completed sentence.
 Brief Outline of Representative Sentence Completion Tests
 Sentence Completion Series
 50 sentence stems designed to aid the clinician in identifying underlying
concerns and specific areas of client distress
 eight different forms, parallel in content, which allow for repeated testing.
 Forer Structured Sentence Completion Test
 separate forms for men, women, adolescent boys, and adolescent girls.
 Each form contains 100 sentence stems designed to cover attitude–value
systems, evasiveness, and defense mechanisms.
 Geriatric Sentence Completion Form
 a 30-item form specifically developed for use with older adult clients.
 elicits personal responses to four content domains: physical, psychological,
social, and temporal orientation.
 Washington University Sentence Completion Test
 separate forms for men, women, and younger male and female subjects.
 highly theory-bound; responses are classified according to seven stages of ego
development: presocial and symbiotic, impulsive, self-protective, conformist,
conscientious, autonomous, integrated.
 Rotter Incomplete Sentences Blank
 Three forms, each containing 40 sentence stems
 High school, college, adult
 Scoring
 In the objective scoring system, each completed sentence receives an adjustment score
from 0 (good adjustment) to 6 (very poor adjustment).
 Responses categorized as follows
 Omission—no response or response too short to be meaningful
 Conflict response—indicative of hostility or unhappiness
 Positive response—indicative of positive or hopeful attitude
 Neutral response—declarative statement with neither positive nor negative
affect
 overall adjustment score
 obtained by adding the weighted ratings in the conflict and positive categories.
 can vary from 0 to 240, with higher scores indicating greater maladjustment
 Psychometric Properties
 Reliability of the adjustment score is exceptionally good, even when derived by
assistants with minimal psychological expertise.
 interscorer reliabilities are in the .90s
 split-half coefficients are in the .80s
 Validity
 Proven construct validity of the adjustment index
 Low predictive validity
 Critiques
 the norms for the adjustment index appear to be outdated.
 single score cannot possibly capture any nuances of personality functioning
 subject to the same types of bias as other self-report measures, namely, the
information will reflect mainly what the respondent wants the examiner to
know
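The overall adjustment score for the Rotter Incomplete Sentences Blank follows directly from the figures above: 40 stems, each rated 0 to 6, summed to a total between 0 and 240. The sketch below is a simplification that assumes every stem already has a rating; how omissions are handled is not covered in the notes and is left out here.

```python
STEMS_PER_FORM = 40
MIN_RATING, MAX_RATING = 0, 6


def overall_adjustment_score(item_ratings):
    """Sum the 0-6 adjustment ratings of all 40 completed sentences."""
    if len(item_ratings) != STEMS_PER_FORM:
        raise ValueError(f"expected {STEMS_PER_FORM} ratings")
    for rating in item_ratings:
        if not MIN_RATING <= rating <= MAX_RATING:
            raise ValueError("each rating must fall between 0 and 6")
    # Higher totals indicate greater maladjustment.
    return sum(item_ratings)


if __name__ == "__main__":
    # Hypothetical protocol: every sentence rated at the scale midpoint.
    print(overall_adjustment_score([3] * STEMS_PER_FORM))
```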
iii. Construction Techniques
 The Thematic Apperception Test (TAT)
 Description
 developed by Henry Murray and his colleagues at the Harvard Psychological
Clinic (Morgan & Murray, 1935; Murray, 1938)
 consists of 30 pictures that portray a variety of subject matters and themes in
black-and-white drawings and photographs; one card is blank.
 Most of the cards depict one or more persons engaged in ambiguous activities.
 Some cards are used for adult males (M), adult females (F), boys (B), or girls
(G), or some combination (e.g., BM).
 exactly 20 cards are appropriate for every examinee.
 Psychometric Properties
 difficult to evaluate because of the abundance of scoring and interpretation
methods.
 Note: only a tiny fraction of clinical practitioners rely on a standardized
scoring system; those who interpret the TAT intuitively are likely to over-diagnose
psychological disturbance.
 very low test–retest reliability, with a reported median value of r = .28
 The Picture Projective Test
 uses a set of pictures taken from the Family of Man photo essay published by the
Museum of Modern Art
 Compared to the TAT productions, the PPT stories were
 of comparable length but were much more positive in thematic content and
emotional tone
 much more active, meaning that the central character had an active, self-
determined effect on the situation in the story
 placed greater emphasis on interpersonal rather than intrapersonal themes
(placed more emphasis on “healthy,” adaptive aspects of personality
adjustment than did the TAT production)
 Diagnostic Validity Compared to TAT
 Although the TAT and PPT were essentially equal in their capacity to
discriminate normal from depressed subjects, the PPT was superior in
differentiating psychotics from normals and depressives.
 Promising instrument, but further research is needed on its psychometric qualities
 Children’s Apperception Test
 Consists of 10 pictures and is suitable for children 3 to 10 years of age
 CAT-A – for younger children
 CAT-H – for older children
 No formal scoring system exists for the CAT and no statistical information is
provided on reliability or validity
 The lack of attention to psychometric issues of scoring, reliability, and validity of
the CAT is troublesome to most testing specialists
 Other Variations on the TAT
 Thematic Apperception Tests for Specific Populations
 Family Apperception Test
 For children ages 6 and older, the Family Apperception Test consists of 21
cards depicting a family in various situations. For example, one card shows a
family sitting around a table with parents talking while the children eat. As
with the TAT, the examinee is asked to describe what led up to the scene,
what is happening now, what will happen next, and what the main characters
are feeling. The test is based on family systems theory. The manual provides a
scoring guide for categories such as limit-setting, conflict resolution,
boundaries, quality of relationships, and emotional tone (Sotile, Julian, Henry,
& Sotile, 1988).
 Blacky Pictures
 For children ages 5 and older, the Blacky Pictures test was also based on the
premise that children identify more readily with animals than humans. The 11
cartoon stimuli depict the adventures of the dog Blacky and his family
(Mama, Papa, and sibling Tippy). In addition to requesting a story for each
card, the examiner also presents multiple-choice questions based on stages of
psychosexual development derived from psychoanalytic theory (Blum, 1950).
Although the test was originally developed with adults, children enjoy taking
the Blacky and are quite responsive to the pictures. Problems with this test
include the absence of norms, especially for children, and poor stability of
scores (LaVoie, 1987).
 Michigan Picture Test-Revised
 For older children ages 8 to 14 years, the MPT-R consists of 15 pictures and a
blank card. Responses are scored for Tension Index (e.g., portrayal of
personal adequacy), Direction of Force (whether the central figure acts or is
acted upon), and Verb Tense (e.g., past, present, future). These three scores
can be combined to yield a Maladjustment Index. Reliability and norms are
adequate, although evidence of validity is unsatisfactory. A major problem
with this test is that the cards portray interpersonal relationships so vividly
that little is left to the child’s imagination (Aiken, 1989).
 Senior Apperception Test (SAT)
 Although the 16 situations depicted on the SAT cards include some positive
circumstances, the majority of pictures were designed to reflect themes of
helplessness, abandonment, disability, family problems, loneliness,
dependence, and low self-esteem (Bellak, 1992). Critics complain that the
SAT stereotypes the elderly and therefore discourages active responding
(Schaie, 1978).
iv. Expression Techniques
 The Draw-A-Person Test
 The interpretive premises are colorful, interesting, and plausible. However, they are
based entirely on psychodynamic theory and anecdotal observations.
 judged by contemporary standards of evidence, some reviewers have concluded
that the DAP is an unworthy test that should no longer be used
 Naglieri, McNeish, and Bardos (1991) developed the Draw A Person: Screening
Procedure for Emotional Disturbance (DAP:SPED)
 The House-Tree-Person Test (H-T-P)
 uses freehand drawings of a house, tree, and person
 has much the same familial lineage as the Draw-A-Person Test
 originally conceived as a measure of intelligence, complete with a quantitative
scoring system to appraise an approximate level of ability
 Psychometrics
 The test’s developer never provided any evidence to support the reliability or validity of
this instrument.
 Asserted that validational research is not possible with the H-T-P
 Attempts to validate the H-T-P as a personality measure have failed miserably
v. Use of Projective Tests
 Many practitioners use projective tests as ancillary measures to the clinical interview
 These practitioners use projective techniques as clinical tools to derive tentative
hypotheses about the examinee.
 Most of these hypotheses will turn out to be false when examined more closely.
However, the few that are confirmed may have important implications for the
clinical management of the examinee.
B. Self-Report and Behavioral Assessment of Psychopathology


a. Theory-Guided Inventories
 construction of several self-report inventories was guided closely by formal or informal theories
of personality
 Differences
 Stand in contrast to factor-analytic approaches that often produce a retrospective theory
based upon initial test findings
 also differ from the stark atheoretical empiricism found in criterion-keyed instruments such as
the MMPI and MMPI-2
 Personality Research Form
 Format
 a true–false inventory based loosely on Murray’s (1938) theory of manifest needs
 The forms differ in the number of scales and number of items per scale.
 parallel short tests (forms A and B)
 profiles for the 22 PRF Personality Scales as well as for 8 Vocational Preference
Scales.
 352 true–false items
 parallel long forms (forms AA and BB).
 used primarily with college students
 consist of 440 true–false items.
 20 personality-scale scores and two validity scores, Infrequency and Desirability
 Purpose/Theory Background
 focuses on areas of normal functioning rather than psychopathology.
 an extensively researched and validated measure of normal personality.
 Scores
 designed to yield scores for personality traits relevant to the functioning of individuals in
a wide variety of situations.
 Best For
 Counselors seeking comprehensive, detailed personality trait information on clients
 Psychologists looking for detailed trait information to include in a selection battery
 State-Trait Anxiety Inventory (STAI)
 Description
 Self-report measure of anxiety, used in research and clinical settings
 current version is called Form Y
 similar scale for children also is available
 Purpose: to differentiate between the temporary condition of state anxiety and the more
long-standing quality of trait anxiety.
 State Anxiety – transitory emotional state or condition characterized by subjective
feelings of tension and apprehension, and by activation of the autonomic nervous
system
 Trait Anxiety – relatively stable individual differences in anxiety proneness
 Format
 Scales
 The state scale (A-State scale) consists of 20 items that evaluate how the respondent
feels “right now, at this moment.”
 4-point scale (Not At All, Somewhat, Moderately So, and Very Much So).
 The trait scale (A-Trait scale) consists of 20 items that assess how the respondent
feels “generally.”
 4-point scale (Almost Never, Sometimes, Often, and Almost Always)
 Scoring
 scoring is reversed for positively stated items
 range of scores for each scale is 20 to 80, with higher scores indicating greater
anxiety
 Extensive normative data are available, stratified by age and subdivided by setting
 Reliability
 Test-Retest Reliability
 we can expect that test–retest reliability will be lower for state anxiety than for trait
anxiety
 short-range reliability in the .40s and .50s for the A-State scale and in the high
.80s for the A-Trait scale
 Internal Consistency
 Cronbach’s alpha of .86 for the total score in a sample of medical patients
 Individual alpha values for A-State and A-Trait are robust as well, with results of .95
and .93, respectively
 Validity
 The validity of the STAI is well established from dozens of studies demonstrating content
validity, convergent/discriminant validity, and construct validity (Spielberger, 1989).
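The STAI scoring rule above (20 items per scale, 4-point responses, reversed scoring for positively stated items, totals from 20 to 80) can be sketched as follows. The set of positively worded item numbers passed in the example is a made-up placeholder, not the actual key for Form Y.

```python
ITEMS_PER_SCALE = 20


def score_scale(responses, positively_worded):
    """Score one 20-item scale.

    responses:         dict mapping item number -> response (1 to 4)
    positively_worded: item numbers whose scoring is reversed (4 -> 1, 1 -> 4)
    """
    if len(responses) != ITEMS_PER_SCALE:
        raise ValueError(f"expected {ITEMS_PER_SCALE} item responses")
    total = 0
    for item, value in responses.items():
        if value not in (1, 2, 3, 4):
            raise ValueError("responses must be 1, 2, 3, or 4")
        # Reverse-score positively stated items so that a higher total
        # always means greater anxiety.
        total += (5 - value) if item in positively_worded else value
    return total


if __name__ == "__main__":
    # Hypothetical key: pretend items 1 and 2 are positively worded.
    answers = {item: 1 for item in range(1, 21)}
    print(score_scale(answers, positively_worded={1, 2}))
```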
b. Factor-Analytically Derived Inventories
 Eysenck Personality Questionnaire
 Description
 designed to measure the major dimensions of normal and abnormal personality
 Eysenck isolated three major dimensions of personality: Psychoticism (P), Extraversion
(E), and Neuroticism (N)
 Format
 Scale
 consists of 3 scales to measure P, E, & N dimensions
 also incorporates a Lie (L) scale to assess the validity of an examinee’s responses.
 The EPQ contains 90 statements answered “yes” or “no” and is designed for persons
aged 16 and older.
 A Junior EPQ containing 81 statements is suitable for children ages 7 to 15
 Scoring
 P-Scale
 High Score – aggressive and hostile traits, impulsivity, a preference for odd
or unusual things, and empathy defects.
 E.g., results of antisocial and schizoid patients
 Low Score – more desirable characteristics such as empathy and interpersonal
sensitivity
 E-Scale
 High scores – a loud, gregarious, outgoing, fun-loving person.
 Low scores – introverted traits such as a preference for solitude and quiet
activities.
 N-Scale
 reflects a dimension of emotionality
 nervous, maladjusted, and overemotional (high scores)
 stable and confident (low scores).
 Reliability = Excellent
 One-month test–retest correlations were .78 (P), .89 (E), .86 (N), and .84 (L).
 Internal consistencies were in the .70s for P and the .80s for the other three scales
 Validity
 Construct validity of the EPQ is also well established using behavioral, emotional,
learning, attentional, and therapeutic criteria
 Comrey Personality Scales
 Description
 a short self-report inventory suitable for college students and other adults
 pursued a factor-analytic strategy
 Format
 180-item test (180 statements)
 8 CPS personality scales (20 Items each)
 (T) Trust versus Defensiveness.
 High scores = belief in the basic honesty, trustworthiness, and good intentions of
other people.
 (O) Orderliness versus Lack of Compulsion.
 High scores = careful, meticulous, orderly, and highly organized individuals.
 (C) Social Conformity versus Rebelliousness.
 High scores = accept society as it is, resent nonconformity in others, seek the
approval of society, and respect the law.
 (A) Activity versus Lack of Energy.
 High-scores = a great deal of energy and endurance, work hard, and strive to
excel.
 (S) Emotional Stability versus Neuroticism.
 High-scores = free from depression, optimistic, relaxed, stable in mood, and
confident.
 (E) Extraversion versus Introversion.
 High-scores = meet people easily, seek new friends, feel comfortable with
strangers, and do not suffer from stage fright.
 (M) Mental Toughness versus Sensitivity.
 High Scores = rather tough-minded people who are not bothered by blood,
crawling creatures, vulgarity, and who do not cry easily or show much interest in
love stories.
 (P) Empathy versus Egocentrism.
 High scores = helpful, generous, sympathetic people who are interested in
devoting their lives to the service of others.
 The remaining 20 items (180 minus the 160 on the personality scales) make up 2 further
scales: a validity check and an assessment of social desirability response bias.
 (V) Validity Check.
 A score of 8 is the expected raw score.
 Any score on the V scale that gives a T-score equivalent below 70 is still
within the normal range
 Higher scores are suggestive of an invalid record.
 (R) Response Bias.
 High scores indicate a tendency to answer questions in a socially desirable way,
making the respondent look like a “nice” person.
 Reliability
 The scales possess exceptional internal consistencies, which range from .91 to .96.
 These values indicate that the CPS is most likely a reliable test, but traditional
test–retest data are scant.
 Validity
 Cross-cultural studies with the CPS are highly supportive of its validity.
 Practical Uses
 test is a reasonable predictor of clinical performance and personal suitability
 Critique
 needs updated standardization and additional documentation on its technical qualities
c. Criterion-Keyed Inventories 
 In a criterion-keyed approach, test items are assigned to a particular scale if, and only if, they
discriminate between a well-defined criterion group and a relevant control group
 test developer does not consult any theory to determine which items belong on the respective
scales.
 essence of the criterion-keyed procedure is, so to speak, to let the items fall where they may.
 not well suited to the development of independent measures
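A minimal sketch of the "let the items fall where they may" rule described above, assuming a simple two-proportion z statistic as the test of discrimination. The function names, the cutoff of 2.58, and the endorsement counts are all hypothetical and are not taken from any published inventory.

```python
from math import sqrt

def endorsement_z(criterion_yes, criterion_n, control_yes, control_n):
    """Two-proportion z statistic comparing 'true' endorsement rates for one item."""
    p_criterion = criterion_yes / criterion_n
    p_control = control_yes / control_n
    pooled = (criterion_yes + control_yes) / (criterion_n + control_n)
    se = sqrt(pooled * (1 - pooled) * (1 / criterion_n + 1 / control_n))
    return (p_criterion - p_control) / se if se > 0 else 0.0

def select_items(item_counts, criterion_n, control_n, z_cutoff=2.58):
    """Keep an item if, and only if, it discriminates between the groups.

    Item content is never inspected; only the empirical difference matters.
    The value stored for each kept item is the keyed (scored) direction.
    """
    keyed = {}
    for item, (criterion_yes, control_yes) in item_counts.items():
        z = endorsement_z(criterion_yes, criterion_n, control_yes, control_n)
        if abs(z) >= z_cutoff:  # assumed cutoff, roughly p < .01 two-tailed
            keyed[item] = "true" if z > 0 else "false"
    return keyed

# Hypothetical counts of "true" answers: (criterion group, control group)
counts = {"item_1": (40, 10), "item_2": (25, 24), "item_3": (5, 30)}
print(select_items(counts, criterion_n=50, control_n=50))
```

Here item_2 is dropped because both groups endorse it at nearly the same rate, regardless of what the item appears to measure.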
 Minnesota Multiphasic Personality Inventory-2 (MMPI-2)
 Description
 Provided updated normative sample - a vast improvement over the MMPI normative
sample
 Format
 Items
 567 true or false items carefully designed to assess a wide range of concerns
 dozens of items were rewritten, but most of these revisions are cosmetic and do
not affect the psychometric characteristics of the test
 encompass a wide variety of mainly pathological themes
 Scales
 scored for four validity scales, 10 standard clinical scales, and dozens of
supplementary scales.
 The 4 Validity Scales
 Cannot Say (or ?)
 simply the total number of items omitted or double-marked in completion of
the answer sheet
 very high score - may indicate a reading problem, opposition to authority,
defensiveness, or indecisiveness caused by depression
 Lie Scale (L)
 By answering “false” to these items, the examinee asserts that he or she possesses
a degree of personal virtue that is rarely observed in our culture
 designed to identify a general, deliberate, evasive test-taking attitude
 high score on the L Scale indicates defensiveness and naiveté.
 Frequency or Infrequency Scale (F)
 items reflect a broad spectrum of serious maladjustment, including peculiar
thoughts, apathy, and social alienation.
 High Scores: may reflect significant psychiatric disturbance, but also
insufficient reading ability, random or uncooperative responding, a motivated
attempt to “fake bad” on the test, or an exaggerated “cry for help” in a
distressed client
 Correction Scale (K)
 designed to help detect a subtle form of defensiveness.
 an elevated score indicates a defensive test-taking attitude
 normal range scores suggest good ego strength
 10 Standard Clinical Scales
 clinical scales were constructed in the usual criterion-keyed manner by
contrasting responses of clinical subjects and normal controls, with exception of
Social Introversion
 Dozens of Supplementary Scales.
 Examples of the supplementary scales include Anxiety, Repression, Ego Strength,
and the MacAndrew Alcoholism Scale-Revised.
 MMPI-2 Interpretation
 Interpreting the MMPI requires specialized training and knowledge
 several computerized interpretation systems are available for the MMPI and the
MMPI-2
 can proceed along two different paths: scale by scale or configural.
 Scale by Scale
 The examiner determines the validity of the test, as discussed previously, by
inspecting the four validity scales.
 If the test appears reasonably valid by these criteria, the examiner consults a
relevant resource book and proceeds scale by scale to produce a series of
hypotheses
 Configural
 Somewhat more complicated and consists of classifying the profile as belonging
to one or another loosely defined code type that has been studied extensively
 Code Types
 defined by a combination of elevation (two or more clinical scales elevated beyond a
certain criterion) and definition (two or more clinical scales clearly standing
out from the others).
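The elevation and definition rules can be sketched for a two-point code type as follows. This is an illustration only: the elevation cutoff of T = 65, the 5-point definition gap, and the profile values are assumptions made for the example, not criteria stated in these notes.

```python
def two_point_code(t_scores, elevation=65, definition_gap=5):
    """Return a two-point code such as '2-7', or None if the profile does not qualify.

    elevation:      both of the two highest scales must reach this T-score (assumed value)
    definition_gap: the second-highest scale must exceed the third-highest
                    by at least this many T-score points (assumed value)
    """
    ranked = sorted(t_scores.items(), key=lambda pair: pair[1], reverse=True)
    (first, _), (second, t_second), (_, t_third) = ranked[:3]
    elevated = t_second >= elevation
    defined = (t_second - t_third) >= definition_gap
    return f"{first}-{second}" if elevated and defined else None

# Hypothetical clinical-scale T-scores keyed by scale number
profile = {"1": 55, "2": 78, "3": 52, "4": 60, "6": 58, "7": 74, "8": 61, "9": 49}
print(two_point_code(profile))
```

A profile whose top scales are elevated but not clearly separated from the rest returns None, which corresponds to a profile that cannot be classified configurally under these assumed rules.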
 The Minnesota Report
 Configural report produced by computerized interpretation system
 Generates a very cautious and methodical 16-page report that includes discussion of
profile validity, symptomatic patterns, interpersonal relations, diagnostic
considerations, and treatment considerations.
 Provides a variety of figures and tables to illustrate test results.
 Technical Properties of the MMPI-2
 scores are reported as T-scores, with a mean of 50 and an SD of 10
 MMPI-2 likely will maintain its status as the premier instrument for assessment of
psychopathology in adulthood for many years to come.
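As a worked example of the T-score metric (mean 50, SD 10), the sketch below converts a raw score with a plain linear transformation. The normative mean and SD are hypothetical, and the published norms tables may use a different transformation, so treat this as an illustration of the metric rather than a scoring procedure.

```python
def linear_t_score(raw, norm_mean, norm_sd):
    """Convert a raw score to a linear T-score (mean 50, SD 10)."""
    z = (raw - norm_mean) / norm_sd  # distance from the normative mean in SD units
    return 50 + 10 * z

# Hypothetical scale with a normative mean of 20 and SD of 4:
# a raw score of 26 lies 1.5 SD above the mean.
print(linear_t_score(26, norm_mean=20, norm_sd=4))  # 65.0
```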
 Reliability
 generally positive, with median internal consistency coefficients (alpha) typically in
the .70s and .80s, but as low as the .30s for some scales in some samples
 test–retest coefficients range from the high .50s to the low .90s, with a median in the
.80s
 Validity
 Difficult to summarize, owing to the sheer volume of research on this instrument and
its predecessor
 Slight racial differences do exist in average profiles
 the average validity coefficient for MMPI studies conducted between 1970 and 1981
was a healthy .46 (the content of the MMPI-2 overlaps heavily with that of the MMPI)
 Millon Clinical Multiaxial Inventory-III (MCMI-III)
 Description
 The MCMI is a self-report instrument designed to assess DSM-IV related personality
disorders and clinical syndromes coordinated with Millon’s theory of personality. A
significant revision of the MCMI-II, this instrument incorporates new items, a new
weighting system, and new scales to provide insight into 14 personality disorders and 10
clinical syndromes. It was developed for use with adults who are seeking mental health
treatment and who have at least eighth-grade reading skills.
 Benefits
 shortness, its theoretical anchoring, multiaxial format, tripartite construction and
validation schema, use of base rate scores, and interpretive depth
 Format
 composed of 175 true-false questions and usually takes the average person less than 30
minutes to complete. After the test is scored, it produces 29 scales — 24 personality and
clinical scales, and 5 scales used to verify how the person approached and took the test.
 The test is broken down into the following scales:
 Moderate Personality Disorder Scales
 1. Schizoid
 2A. Avoidant
 2B. Depressive
 3. Dependent
 4. Histrionic
 5. Narcissistic
 6A. Antisocial
 6B. Aggressive (Sadistic)
 7. Compulsive
 8A. Passive-Aggressive (Negativistic)
 8B. Self-Defeating
 Severe Personality Pathology Scales
 S. Schizotypal
 C. Borderline
 P. Paranoid
 Moderate Clinical Syndrome Scales
 A. Anxiety
 H. Somatoform
 N. Bipolar: Manic
 D. Dysthymia
 B. Alcohol Dependence
 T. Drug Dependence
 R. Post-Traumatic Stress Disorder
 Severe Syndrome Scales
 SS. Thought Disorder
 CC. Major Depression
 PP. Delusional Disorder
 Validity (Modifying) Indices
 modify the person’s Base Rate scores based upon the following areas:
 Disclosure (X), Desirability (Y), Debasement (Z),
 two random response indicators — Validity (V) and Inconsistency (W).
 Psychometric Properties
 The psychometrics of the MCMI-III are good and it is considered a reliable and valid
psychological test
 Reliability
 The reliability of the individual scales is good: Internal consistency coefficients
average .82 to .90, and test–retest coefficients for one week range from .81 to .87.
 Validity
 Support for the validity of the MCMI-III is mixed
 Personality Inventory for Children-2 (PIC-2)
 Description
 designed for younger children and can be used as young as 5 years old.
 assesses a range of emotional, behavioral, and cognitive issues
 Purpose and Nature
 Construct(s) Measured: Hyperactivity, Conduct problems, Social skills, Several others
 Population for which designed: Age Range: 5 through 19 years old
 Grade Level: Kindergarten to High School Senior
 Method of Administration: Individual
 Source of Information: Parent
 Subtests and Scores: Cognitive Impairment, Impulsivity and Distractibility, Delinquency,
Family Dysfunction, Reality Distortion, Somatic Concern, Psychological Discomfort,
Social Withdrawal, Social Skills Deficits, Response Validity Scales
 Number of Items: 275
 Type of Scale: Forced choice
 Norms
 Sample Size: 2,306
 Population: Two samples:
 2,306 parents of boys and girls in Kindergarten through 12th grade, collected from 23
urban, rural and suburban schools in 12 states, all socioeconomic levels and ethnic
groups represented;
 1,551 parents whose children had been referred for educational or clinical
intervention.
 Culture/ethnicity: African-American, Asian-American, Caucasian, Hispanic/Latino,
Native-American, Other
 SES Level: Low to High
 Reliability
 Psychometric information: Provided for Subscales.
 The range of Test-Retest Value: 0.82 to 0.92
 The range of Inter-rater reliability: Not assessed
 The range of Internal consistency: 0.81 to 0.92
 Validity
 Criterion validity was assessed and found to be acceptable.
d. Behavioral Assessment
 concentrates on behavior itself
 Skinner - operant conditioning
 Operant conditioning can be described as a process that attempts to modify behavior through
the use of positive and negative reinforcement. Through operant conditioning, an individual
makes an association between a particular behavior and a consequence.
 Wolpe - systematic desensitization
 type of behavioral therapy based on the principle of classical conditioning.
 aims to remove the fear response of a phobia, and substitute a relaxation response to the
conditional stimulus gradually using counter conditioning.
 Bandura - cognitive based learning
 Credited with social learning theory: people learn from one another via observation,
imitation, and modeling.
 theory has often been called a bridge between behaviorist and cognitive learning theories
because it encompasses attention, memory, and motivation
i. Behavior Therapy and Behavioral Assessment
 Five Techniques in 4 Categories: exposure-based methods, cognitive behavior therapies, self-
control procedures, and social skills training.
 Exposure-Based Methods
 Behavioral Avoidance Test (BAT)
 A behavioral procedure in which the therapist measures how long the client can
tolerate an anxiety-inducing stimulus.
 Practical Use Example: anxiety score from the BAT technique was strongly
related to self-reports of catastrophic thoughts (e.g., choking to death, having
a heart attack, acting foolish, becoming helpless; Hoffart, Friis, Strand, &
Olsen, 1994)
 Based on the reasonable assumption that the client’s fear is the main determinant
of behavior in the testing situation.
 Caution: results of BAT assessments may not generalize, and the therapist must
be wary of foreclosing treatment too soon
 Fear Survey Schedule
 A behavioral assessment device which requires respondents to indicate the
presence and intensity of their fears in relation to various stimuli, typically on a 5
or 7 point Likert scale
 used in research projects to screen large samples of persons in search of
subjects who share a common fear
 used to monitor changes in fears, including those that have been targeted for
clinical intervention
 Cautions about reliability of data
 Reliability data for fear surveys are almost nonexistent
 The essential downfall seems to be that fear survey schedules possess such
“obvious” validity that few researchers have bothered to evaluate the
traditional psychometric characteristics of reliability and validity
 Cognitive Behavior Therapies
 Cognitive behavior therapists need not use formal assessment tools in their
clinical practice.
 Many monitor the belief structure of their clients on an informal
session-to-session basis.
 The formal tools that are available are questionnaire measures of cognitive distortion
 mainly research questionnaires suitable for testing group differences, but not
sufficiently validated for individual assessment
 Critiques: instruments were released prematurely, without research on their
concurrent and discriminant validity; they were designed to validate constructs in
research and consequently do not work well in clinical practice.
 Beck Depression Inventory, 2nd edition (BDI-II)
 Description
 21‐item self‐report instrument 
 designed to assess the severity of depression in adults and adolescents aged 13
years and older.
 designed to act as an indicator of depressive symptoms based on diagnostic
criteria in the DSM-IV (the 21 items in the BDI-II are representative of these criteria)
 Format
 The tool consists of 21 items that are self‐rated on a 4 point scale (0-3)
 Total raw scores can range from 0 to 63, and are then converted into descriptive
classifications based on cut scores.
 Total score of 0-13 is considered minimal range, 14-19 is mild, 20-28 is moderate,
and 29-63 is severe.
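The scoring rule above (21 items rated 0 to 3, summed, then classified by cut score) can be sketched as follows. The function name and the example ratings are hypothetical; the cut scores are the ones listed in these notes.

```python
def score_bdi2(item_ratings):
    """Sum 21 item ratings (each 0-3) and classify the total by cut score."""
    if len(item_ratings) != 21:
        raise ValueError("expected exactly 21 item ratings")
    if any(rating not in (0, 1, 2, 3) for rating in item_ratings):
        raise ValueError("each rating must be 0, 1, 2, or 3")

    total = sum(item_ratings)  # possible range: 0 to 63
    if total <= 13:
        label = "minimal"
    elif total <= 19:
        label = "mild"
    elif total <= 28:
        label = "moderate"
    else:
        label = "severe"
    return total, label

# Hypothetical respondent who rates every item as 1
print(score_bdi2([1] * 21))  # (21, 'moderate')
```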
 Reliability
 Internal Consistency: an analysis yielded a Cronbach’s alpha of .92 for
outpatients and .93 for students. Other academic studies have demonstrated
similar internal consistency coefficients in the .89 to .93 range.
 Test-retest Reliability: A sub-sample (26 patients) of the clinical sample was
retested with the BDI-II one week after the first administration. The test-retest
reliabilities were calculated and yielded an average correlation of .93.
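Internal consistency figures such as the alphas quoted above come from Cronbach's formula, alpha = k/(k-1) × (1 − sum of item variances / variance of total scores). The sketch below applies it to a tiny made-up data set of 4 respondents and 3 items; the numbers are purely illustrative and far too few for a real reliability estimate.

```python
from statistics import variance

def cronbach_alpha(responses):
    """Cronbach's alpha for a list of respondents, each a list of item scores."""
    k = len(responses[0])  # number of items
    item_variances = [variance([row[i] for row in responses]) for i in range(k)]
    total_variance = variance([sum(row) for row in responses])
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)

# Hypothetical ratings: 4 respondents x 3 items, each item scored 0-3
data = [
    [0, 1, 0],
    [1, 1, 2],
    [2, 3, 2],
    [3, 3, 3],
]
print(round(cronbach_alpha(data), 2))
```

Alpha rises when items vary together, because the variance of the total scores then exceeds the sum of the separate item variances.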
 Validity
 Validity: Construct validity was high when compared to the BDI (.93)
 Sensitivity to change
 Designed to assess mood within the most recent 2 week period, so comparison
across assessments should reflect change over time
 Critique
 Transparent items – examinees who wish to hide their despair or exaggerate their
depression can do so easily
 Self-Monitoring Procedures
 A therapeutic approach in which the client chooses the goals and actively participates
in supervising, charting, and recording progress toward the end point(s) of
therapy.
 Pleasant Events Schedule (PES; MacPhillamy & Lewinsohn, 1982)
 Used for self-monitoring the frequency and pleasantness of up to 320 largely
ordinary, everyday behaviors. 
 purpose of the PES is two-fold.
 in the baseline assessment phase, the PES is used to self-monitor the
frequency (F) and pleasantness (P) of 320 largely ordinary, everyday events
 During the treatment phase, the PES is used to self-monitor therapeutic
progress.
 Technical Properties
 The instrument has fair to good test–retest reliability (one-month correlations
in the range of .69 to .86), excellent concurrent validity with trained observers,
and promising construct validity.
 In general, the subscales behave as one would predict on the basis of the
constructs they purport to measure
ii. Structured Interview Schedules
 Beyond diagnosis itself, a Structured Interview Schedule serves many
purposes
 Reducing the complexity of clinical phenomena
 Facilitating communication between clinicians
 Predicting the outcome of the disorder
 Deciding on an appropriate treatment
 Assisting in the search for etiology
 Determining the prevalence of diseases worldwide
 Making decisions about insurance coverage
 Concerns about using a Structured Interview Schedule
 The sheer amount of time it can take to determine a multiaxial diagnosis.
 although the DSM-IV textbook describes the diagnostic categories and alternatives
with great precision, it does not specify a coherent method for arriving at the
diagnosis.
 Psychiatric diagnosis is mixed in its reliability.
 2 structured interview schedules that have been developed to reduce the time needed for
psychiatric diagnosis and to improve reliability.
 Schedule for Affective Disorders and Schizophrenia (SADS)
 highly respected diagnostic interview for evaluating Axis I mood and psychotic
disorders
 semistructured inquiry that includes standard questions asked of all patients and
optional probes used to clarify patient responses
 elicits sufficient information to assess the severity of disturbance and also to
elucidate the diagnosis
 Technical Properties
 Consensus from over 21 studies is that the interrater reliability for specific
diagnoses is typically strong, with median kappa coefficients of greater than
.85 (Kappa is the index of interrater agreement, corrected for chance)
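To show what "corrected for chance" means, the sketch below computes Cohen's kappa, the usual form of this index for two raters: kappa = (observed agreement − chance agreement) / (1 − chance agreement). The two raters and their diagnoses are hypothetical.

```python
from collections import Counter

def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa: agreement between two raters, corrected for chance."""
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n

    # Chance agreement: probability both raters pick the same category
    # if each assigned categories independently at their own base rates.
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)

    if expected == 1:
        return 1.0  # both raters used a single identical category throughout
    return (observed - expected) / (1 - expected)

# Hypothetical diagnoses for 10 patients; the raters disagree on the last one
rater_a = ["MDD", "MDD", "BD", "BD", "MDD", "BD", "MDD", "BD", "MDD", "BD"]
rater_b = ["MDD", "MDD", "BD", "BD", "MDD", "BD", "MDD", "BD", "MDD", "MDD"]
print(round(cohens_kappa(rater_a, rater_b), 2))  # 0.8
```

The raters agree on 90% of cases, but half of that agreement would be expected by chance alone, so kappa comes out lower than the raw percentage.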
 Structured Clinical Interview for DSM-IV (SCID)
 Semistructured interview that comes in numerous editions and variations
 SCID-I for Axis I diagnoses
 SCID-II for Axis II diagnoses,
 SCID-P for determining the differential diagnosis of psychotic symptoms
 SCID-NP for nonpatient settings in which a current psychiatric disorder is
unlikely.
 Each follows the same format in which the interviewer reads the SCID questions to
the client in sequence
 the objective = to elicit sufficient information to determine whether individual
DSM-IV criteria are met.
 Average SCID Interrater Agreement for Psychiatric Diagnosis (Based on DSM-IV)
 Good (.70 or greater): MDD, BD, Schizophrenia, Alcohol use disorder, SUD,
Panic Disorder, PTSD, ED, Schizotypal PD, Schizoid PD, Narcissistic PD, APD
 Fair (.50 to .69): Dysthymic Disorder, Social Phobia, OCD, GAD, Avoidant PD,
Dependent PD, Obsessive Compulsive PD, Passive-Aggressive PD,
Self-Defeating PD, Depressive PD, Paranoid PD
 Poor (below .50): Somatoform Disorder

iii. Assessment by Systematic Direct Observation 


 Populations: systematic and direct observation is widely used in the evaluation of
children in a school-based environment and adults with I/DD
 highly structured
 Set Apart from Naturalistic Observation by Five Characteristics
 1. Goal is to measure specific behaviors
 2. Operationally defined target behavior
 3. Standardized observation procedures
 4. Specified times and places
 5. Standardized scoring (does not vary from one observer to another)
 Based on different types of data collection
 Freq./Rate, Duration, Interval, Time sampling, etc
 Example of Systematic Direct Observation: Behavior Assessment System for Children
 straightforward form that consists of six categories of classroom behavior—five
designed for students and one for the teacher
 classifies behaviors as active engagement, passive engagement, off-task motor,
off-task verbal, and off-task passive
 Direct instruction by the teacher also is recorded.
 also allows for the collection of behavioral norms for classmates to determine
normative patterns in each category
 rated in 15-second intervals over a 15-minute observation period.
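Fifteen-second intervals over 15 minutes give 60 coded intervals per observation. The sketch below summarizes such a record as the percentage of intervals in each student category. It assumes exactly one code per interval, which is a simplification, and the observation record itself is hypothetical.

```python
from collections import Counter

INTERVAL_SECONDS = 15
SESSION_MINUTES = 15
N_INTERVALS = SESSION_MINUTES * 60 // INTERVAL_SECONDS  # 60 intervals

CATEGORIES = [
    "active engagement",
    "passive engagement",
    "off-task motor",
    "off-task verbal",
    "off-task passive",
]

def summarize(codes):
    """Percentage of observation intervals assigned to each behavior category."""
    if len(codes) != N_INTERVALS:
        raise ValueError(f"expected {N_INTERVALS} coded intervals")
    counts = Counter(codes)
    return {category: 100 * counts[category] / N_INTERVALS for category in CATEGORIES}

# Hypothetical record: one code per 15-second interval
record = (
    ["active engagement"] * 30
    + ["passive engagement"] * 15
    + ["off-task motor"] * 6
    + ["off-task verbal"] * 6
    + ["off-task passive"] * 3
)
for category, percent in summarize(record).items():
    print(f"{category}: {percent:.0f}%")
```

Running the same summary on a classmate's record would give the comparison figures needed to judge whether the target student's pattern is unusual for that classroom.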
 Threats to Reliability and Validity
 Sattler (2002): catalogued the sources of unreliability; Include personal qualities of
the observer, poor design of instruments, and problems in obtaining a representative
sample of behavior.
 Observer Drift – Personal Quality of Observer
 when an observer becomes fatigued and less vigilant over time, thus failing to
notice target behaviors when they occur.
 primary antidote to observer inaccuracy is careful training and cross-checking
of one observer against another to demonstrate a high level of interrater
agreement
 Coding Complexity – Poor Design of Instruments
 there are too many categories or ill-defined categories.
 Attention to design of rating scales and pretesting of instruments will avert
this problem.
 Tester qualities such as attentional difficulties – problems in obtaining a
representative sample of behavior
 Ratings should be collected throughout the day or, if this is not possible,
during the most salient time periods.
iv. Analogue Behavioral Assessment 
 The observation of clients in a contrived but plausible setting in which they are instructed
to engage in relevant tasks designed to elicit behaviors of interest.
 Closely related to methods of systematic, direct observation
 Done in a contrived but plausible setting, analogous to real life situations
 Uses
 evaluate children referred for assessment of behavior or school problems
 evaluate parent–child interactions.
 assess adult couples, including husbands and wives seeking marital therapy
 Example: Rapid Couples Interaction Scoring System (RCISS)
 consists of 22 codes that address speaker and listener behaviors, both verbal and
nonverbal, in such categories as criticism, disagreement, compromise, positive
solution, questioning, humor, and smiling
 provide information that is helpful in characterizing communication patterns
 Critiques
 Little technical information available
 does not deal adequately with issues of subtext or “reading between the lines” in
couples’ communication
v. Ecological Momentary Assessment 
 defined as the “real-time measurement of patient experience in the real world, at the point
of experience”
 Uses wireless technology to assess attitudes, pain, thoughts on a randomized schedule
 would consist of patients reporting their instantaneous experiences on a handheld
device, with responses immediately transmitted (via the same wireless technology
used by cell phones) to a central computer for ultimate analysis with sophisticated
software.
 responses of clients are immediate and based on a schedule determined by the researcher,
so several biases of human recall are avoided
 Little affected by recency bias
 data are collected in naturalistic settings in real time, and, therefore, not prone to
biases in recall
 The recency bias refers to the fact that people are more likely to recall recent
events than remote events
 Benefit: provides a more accurate and reliable approach to the assessment of patient
experience
 compliance cannot be faked
 tech is becoming streamlined and more affordable
 highly user-friendly
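One way a researcher might build the randomized prompting schedule is sketched below. It assumes a stratified design in which the day is divided into equal blocks and one prompt falls at a random moment inside each block. The function name, the 9:00 to 21:00 window, and the choice of 6 prompts are hypothetical and do not come from any particular EMA system.

```python
import random
from datetime import datetime, timedelta

def random_prompt_times(day_start, day_end, n_prompts, rng):
    """Pick one random prompt time inside each of n equal blocks of the day."""
    block = (day_end - day_start) / n_prompts
    times = []
    for i in range(n_prompts):
        # Random offset within block i, so prompts are unpredictable
        # to the patient but still spread across the whole day.
        offset = timedelta(seconds=rng.uniform(0, block.total_seconds()))
        times.append(day_start + i * block + offset)
    return times

rng = random.Random(0)  # seeded only so the example is repeatable
start = datetime(2024, 1, 1, 9, 0)
end = datetime(2024, 1, 1, 21, 0)
for prompt in random_prompt_times(start, end, n_prompts=6, rng=rng):
    print(prompt.strftime("%H:%M"))
```

Because the patient cannot anticipate the prompts, each report reflects experience at that moment instead of a remembered summary of the day.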
