Psych Ass Notes
PSYCHOLOGICAL TESTING AND ASSESSMENT

-Alfred Binet published a test designed to help place Paris schoolchildren in appropriate classes

-World War I: the military needed a way to screen large numbers of recruits quickly for intellectual and emotional problems

-World War II: the military would depend even more on psychological tests to screen recruits for service

-tests came to measure not only intelligence but also personality, brain functioning, performance at work, and many other aspects of psychological and social functioning

-by World War II: distinction between testing and a more inclusive term, “assessment”

-psychological assessment: gathering and integration of psychology-related data for the purpose of making a psychological evaluation that is accomplished through the use of tools such as tests, interviews, case studies, behavioral observation, and specially designed apparatuses and measurement procedures

-psychological testing: process of measuring psychology-related variables by means of devices or procedures designed to obtain a sample of behavior

-steps in the process of assessment:
2. Pre-assessment: meet with the assessee or others before the formal assessment to clarify aspects of the reason for referral
3. Assessor prepares for the assessment by preparing the tools to be used
4. Assessor writes a report of the findings that is designed to answer the referral question

-therapeutic psychological assessment: assessment that has a therapeutic component to it

-educational assessment: evaluate abilities and skills relevant to success or failure in a school or pre-school context (i.e., intelligence tests, achievement tests, and reading comprehension tests)

-retrospective assessment: use of evaluative tools to draw conclusions about psychological aspects of a person as they existed at some point in time prior to the assessment

-remote assessment: draw conclusions about a subject who is not in physical proximity to the person or people conducting the evaluation

-Ecological Momentary Assessment (EMA): psychological assessment by means of smartphones; “in the moment” evaluation

-may consider society at large as party to the assessment enterprise

TOOLS OF ASSESSMENT

-test: measuring device or procedure

-psychological test: device or procedure designed to measure variables related to psychology; almost always involves an analysis of behavior

-tests differ with respect to:
Content
Format
Administration procedures
Scoring and interpretation

-other tools of assessment:
Interview
Portfolio
Case history data
Behavioral observation
Role-play tests

CAPA (computer-assisted psychological assessment): scoring is immediate

CAT (computer adaptive testing): ability to tailor the test to the testtaker’s ability or test-taking pattern

-advantages over paper-and-pencil tests:
Test administrators have greater access to potential users because of the global reach of the internet

-for security purposes the test publisher will typically require documentation of professional training before filling an order for a test manual

o score: code or summary statement, usually but not necessarily numerical in nature; reflects an evaluation of performance on a test, task, interview, or some other sample of behavior
o scoring: process of assigning such evaluative codes or statements to performance on tests, tasks, interviews, or other behavior samples
o cut score: reference point, usually numerical, derived by judgment and used to divide a set of data into two or more classifications; used in schools, and also used by employers as an aid to decision making about personnel hiring, placement, and advancement

-some tests are self-scored by testtakers themselves, some are scored by a computer, and others require scoring by trained examiners
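The cut score described above simply partitions a set of scores into classifications. A minimal sketch (the passing score of 70 and the pass/fail labels are hypothetical, chosen only for illustration):

```python
def classify(scores, cut_score=70):
    """Divide a set of scores into two classifications using a cut score."""
    # scores at or above the cut score fall into one classification,
    # scores below it into the other
    return {s: ("pass" if s >= cut_score else "fail") for s in scores}

results = classify([55, 70, 88, 64])
print(results)
```

The same idea extends to employment settings, where the classifications might be hiring or placement decisions rather than pass/fail.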
IN WHAT TYPES OF SETTINGS?
Educational
Clinical
Counseling
Geriatric
Business and military
Governmental and organizational
Academic research

HOW ARE ASSESSMENTS CONDUCTED?
-responsible test users have obligations before, during, and after a test or any measurement procedure is administered

-before the test, when test users have discretion with regard to the test administered, they should select and use only the test or tests that are most appropriate for the individual being tested

-the test administrator (or examiner) must be familiar with the test materials and procedures and must have at the test site all the materials needed to properly administer the test

-test users have the responsibility of ensuring that the room in which the test will be conducted is suitable and conducive to the testing

-during test administration, rapport between the examiner and the examinee can be critically important

-after a test administration, these obligations range from safeguarding the test protocols to conveying the test results in a clearly understandable fashion

-if third parties were present during testing, or if anything else that might be considered out of the ordinary happened during testing, it is the test user’s responsibility to make a note of such events on the report of the testing

-if a test is to be scored by people, scoring needs to conform to pre-established scoring criteria

-test users who have responsibility for interpreting scores or other test results have an obligation to do so in accordance with established procedures and ethical guidelines

Alternate assessment
-an evaluative or diagnostic procedure or process that varies from the usual, customary, or standardized way a measurement is derived, either by virtue of some special accommodation made to the assessee or by means of alternative methods designed to measure the same variables

Accommodation
-adaptation of a test, procedure, or situation, or the substitution of one test for another, to make the assessment more suitable for an assessee with exceptional needs

WHERE TO GO FOR AUTHORITATIVE INFORMATION?

Test catalogues
-distributed by the publisher of the test
-usually contain only a brief description of the test and seldom contain the kind of detailed technical information that a prospective user might require; the objective is to sell the test

Test manuals
-detailed information concerning the development of a particular test and technical information relating to it should be found in the test manual, which usually can be purchased from the test publisher

Professional books

Reference volumes

Journal articles
-articles in current journals may contain reviews of the test, updated or independent studies of its psychometric soundness, or examples of how the instrument was used in either research or an applied context

Online databases

HISTORY

ANTIQUITY TO THE 19TH CENTURY

China as early as 2200 B.C.E.
-testing was instituted as a means of selecting who, of many applicants, would obtain government jobs
-tests examined proficiency in subjects like music, archery, horsemanship, writing, and arithmetic, as well as agriculture, geography, civil law, and military strategy
-knowledge of and skill in the rites and ceremonies of public and social life were also evaluated during the Song (or Sung) dynasty
-tests emphasized knowledge of classical literature
-testtakers who demonstrated their command of the classics were perceived as having acquired the wisdom of the past and were therefore entitled to a government position
-passing the examinations could result in exemption from taxes

Ancient Greco-Roman writings
-attempts to categorize people in terms of personality types
-such categorizations typically included reference to an overabundance or deficiency in some bodily fluid (such as blood or phlegm) as a factor believed to influence personality

Renaissance
-psychological assessment in the modern sense began to emerge

-Francis Galton: influential contributor to the field of measurement; aspired to classify people “according to their natural gifts” and to ascertain their “deviation from an average”
-credited with devising or contributing to the development of many contemporary tools of psychological assessment, including questionnaires, rating scales, and self-report inventories

-Karl Pearson: developed the product-moment correlation technique; its roots can be traced directly to the work of Galton

-assessment was also an important activity at the first experimental psychology laboratory, founded at the University of Leipzig in Germany by Wilhelm Max Wundt

-James McKeen Cattell: dealt with individual differences; coined the term mental test

-Spearman: credited with originating the concept of test reliability as well as building the mathematical framework for the statistical technique of factor analysis

-Victor Henri: collaborated with Alfred Binet on papers suggesting how mental tests could be used to measure higher mental processes

20TH CENTURY
Page | 2
Psychological Assessment Transcripts
RGO Review Center | Board Licensure Examination for Psychometrician 2023
-require substantial understanding of testing and supporting psychological fields together with supervised experience in the use of these tests

Guidelines with respect to certain populations
-designed to assist professionals in providing informed and developmentally appropriate services
-standards must be followed by all psychologists; guidelines are more aspirational in nature

THE RIGHTS OF TESTTAKERS
The right of informed consent
The right to be informed of test findings
The right to privacy and confidentiality

-privacy right: recognizes the freedom of the individual to pick and choose for himself the time, circumstances, and particularly the extent to which he wishes to share with or withhold from others his attitudes, beliefs, behavior, and opinions

-privileged information: information that is protected by law from disclosure in a legal proceeding

-confidentiality: concerns matters of communication outside the courtroom

-privilege is not absolute; privilege in the psychologist–client relationship belongs to the client, not the psychologist

SCALES OF MEASUREMENT

-measurement always involves error: the collective influence of all of the factors on a test score or measurement beyond those specifically measured by the test or measurement

Ordinal Scales
-rankings imply nothing about how much greater one ranking is than another
-even though ordinal scales may employ numbers or “scores” to represent the rank ordering, the numbers do not indicate units of measurement

Interval Scales
-contain equal intervals between numbers
-each unit on the scale is exactly equal to any other unit on the scale; interval scales contain no absolute zero point (e.g., IQ tests)

Ratio Scales
-has a true zero point

MEASUREMENT SCALES IN PSYCHOLOGY
-the ordinal level of measurement is most frequently used in psychology
-intelligence, aptitude, and personality test scores are, basically and strictly speaking, ordinal

Describing Data
-distribution: a set of test scores arrayed for recording or study
-raw score: straightforward, unmodified accounting of performance that is usually numerical
-frequency distribution: all scores are listed alongside the number of times each score occurred
-in a bar graph, the rectangular bars typically are not contiguous

Frequency polygon: a continuous line connecting the points where test scores or class intervals meet frequencies

Mean
-most commonly used measure of central tendency
-equal to the sum of the observations divided by the number of observations

Median
-middle score in a distribution
-if the total number of scores ordered is an odd number, then the median will be the score that is exactly in the middle
-when the total number of scores ordered is an even number, then the median can be calculated by determining the arithmetic mean of the two middle scores

Mode
-the most frequently occurring score in a distribution
-except with nominal data, the mode tends not to be a very commonly used measure of central tendency
-the value of the modal score is not calculated; one simply counts and determines which score occurs most frequently

Variability
-an indication of how scores in a distribution are scattered or dispersed

Range
-equal to the difference between the highest and the lowest scores; the simplest measure of variability to calculate
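The central-tendency and variability measures above can be computed directly. A minimal sketch using Python’s statistics module on a hypothetical score distribution:

```python
import statistics

scores = [78, 85, 85, 90, 67, 85, 72, 90]

mean = statistics.mean(scores)           # sum of observations / number of observations
median = statistics.median(scores)       # middle score (mean of the two middle scores when n is even)
mode = statistics.mode(scores)           # not calculated, just counted: the most frequent score
score_range = max(scores) - min(scores)  # highest score minus lowest score

print(mean, median, mode, score_range)
```

Because n is even here (8 scores), the median is the arithmetic mean of the two middle scores, exactly as described above.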
-average deviation: another tool that could be used to describe the amount of variability in a distribution

The standard deviation
-equal to the square root of the average squared deviation about the mean

Kurtosis
-the steepness of a distribution in its center:
Platykurtic (relatively flat)
Leptokurtic (relatively peaked)
Mesokurtic (somewhere in the middle)

-the normal curve has two tails; the area on the normal curve between 2 and 3 standard deviations above the mean, and the area between −2 and −3 standard deviations below the mean, are each referred to as a tail

STANDARD SCORES
-a raw score that has been converted from one scale to another scale, where the new scale has some arbitrarily set mean and standard deviation
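The standard deviation and the raw-score-to-standard-score conversion above can be sketched as follows. The raw scores are hypothetical; z = (x − mean) / SD gives a scale with mean 0 and SD 1, and T = 50 + 10z is one familiar converted scale with mean 50 and SD 10:

```python
import math

scores = [50, 60, 70, 80, 90]
n = len(scores)
mean = sum(scores) / n

# standard deviation: square root of the average squared deviation about the mean
sd = math.sqrt(sum((x - mean) ** 2 for x in scores) / n)

# converting raw scores from one scale to another
z_scores = [(x - mean) / sd for x in scores]   # mean 0, SD 1
t_scores = [50 + 10 * z for z in z_scores]     # mean 50, SD 10

print(round(sd, 2), [round(t, 1) for t in t_scores])
```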
Stanine (M: 5, SD: 2)
-familiar to many students from achievement tests; stanines take on whole values from 1 to 9

Sten (M: 5.5, SD: 2)

-one of the primary advantages of a standard score on one test is that it can readily be compared with a standard score on another test

-transformations should be made only when there is good reason to believe that the test sample was large enough and representative enough and that the failure to obtain normally distributed scores was due to the measuring instrument

CORRELATION AND INFERENCE

-central to psychological testing and assessment are inferences (deduced conclusions) about how some things are related to other things

-although correlation does not imply causation, there is an implication of prediction

PEARSON R
-most widely used of all correlation coefficients
-statistical tool of choice when the two variables being correlated are continuous

SPEARMAN RHO
-frequently used when the sample size is small (fewer than 30 pairs of measurements) and especially when both sets of measurements are in ordinal (or rank-order) form

Scatterplots
-provide a quick indication of the direction and magnitude of the relationship, if any, between the two variables
-useful in revealing the presence of curvilinearity: an “eyeball gauge” of how curved a graph is
-a graph also makes the spotting of outliers relatively easy
-outlier: extremely atypical point located at an outlying distance from the rest of the coordinate points in a scatterplot
-outliers are sometimes the result of administering a test to a very small sample of testtakers

-effect size: the size of the differences between groups

-key advantage of meta-analysis over simply reporting a range of findings is that more weight can be given to studies that have larger numbers of subjects

PARAMETRIC                      NON-PARAMETRIC
Independent samples t-test      Mann-Whitney U
Dependent samples t-test        Wilcoxon Signed Rank
One-way ANOVA                   Kruskal-Wallis
Repeated-measures ANOVA         Friedman test
Pearson r                       Spearman Rho
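Both coefficients can be computed in pure Python. A sketch with hypothetical paired measurements; Spearman rho is computed here as a Pearson r applied to the ranks, which agrees with the usual rank-difference formula when there are no ties:

```python
def pearson_r(x, y):
    """Pearson product-moment correlation between two continuous variables."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def spearman_rho(x, y):
    """Rank-order correlation: Pearson r applied to ranks (no ties assumed)."""
    def ranks(v):
        order = sorted(range(len(v)), key=lambda i: v[i])
        r = [0] * len(v)
        for rank, i in enumerate(order, start=1):
            r[i] = rank
        return r
    return pearson_r(ranks(x), ranks(y))

x = [1, 2, 3, 4, 5]
y = [2, 1, 4, 3, 5]
print(pearson_r(x, y), spearman_rho(x, y))
```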
Mann-Whitney U Test
-sister of the independent samples t-test

Wilcoxon Signed Rank Test
-used to compare the same samples to assess whether the samples differ from each other
-sister of the dependent samples t-test

Kruskal-Wallis H Test
-used to test whether or not a group of independent samples is from the same or different populations
-sister of ANOVA

Friedman Test
-used to test whether or not the data are from the same sample under three conditions

-normative sample: group of people whose performance on a particular test is analyzed for reference in evaluating the performance of individual testtakers

-norming: process of deriving norms

-some test manuals provide user norms or program norms, which consist of descriptive statistics based on a group of testtakers in a given period of time rather than norms obtained by formal sampling methods

TYPES OF NORMS

Percentile norms
-raw data from a test’s standardization sample converted to percentile form
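Converting raw scores to percentile form can be sketched as below. The standardization sample and the “percentage scoring below” convention are illustrative only; published tests use their own norming procedures:

```python
def percentile_rank(score, norm_sample):
    """Percentage of the normative sample scoring below the given raw score."""
    below = sum(1 for s in norm_sample if s < score)
    return 100 * below / len(norm_sample)

# hypothetical standardization sample of raw scores
standardization_sample = [40, 45, 50, 55, 60, 65, 70, 75, 80, 85]

# a raw score of 62 exceeds half of this sample
print(percentile_rank(62, standardization_sample))
```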
Age norms
-indicate the average performance of different samples of testtakers who were at various ages at the time the test was administered

Grade norms
-average test performance of testtakers in a given school grade
-developed by administering the test to representative samples of children over a range of consecutive grade levels
-the mean or median score for children at each grade level is calculated
-the primary use of grade norms is as a convenient, readily understandable gauge of how one student’s performance compares with that of fellow students in the same grade
-not typically designed for use with adults who have returned to school
-grade norms and age norms are referred to more generally as developmental norms

National norms
-derived from a normative sample that was nationally representative of the population at the time the norming study was conducted

National anchor norms
-provide some stability to test scores by anchoring them to other test scores
-begins with the computation of percentile norms for each of the tests to be compared using the equipercentile method; the scores must have been obtained on the same sample

Subgroup norms
-segmented by any of the criteria initially used in selecting subjects for the sample

Local norms
-typically developed by test users themselves; local norms provide normative information with respect to the local population’s performance on some test

SAMPLING TO DEVELOP NORMS

Test standardization
-process of administering a test to a representative sample of testtakers for the purpose of establishing norms

Sampling
-the process of selecting the portion of the universe deemed to be representative of the whole population is referred to as sampling

Sample
-a portion of the universe of people deemed to be representative of the whole population

-after all the test data have been collected and analyzed, the test developer will summarize the data using descriptive statistics, including measures of central tendency and variability

-norms are developed with data derived from a group of people who are presumed to be representative of the people who will take the test in the future

Fixed Reference Group Scoring Systems
-the distribution of scores obtained on the test from one group of testtakers is used as the basis for the calculation of test scores for future administrations of the test

Norm-Referenced Versus Criterion-Referenced Evaluation
-criterion-referenced: a score is evaluated on the basis of whether or not some criterion has been met
-criterion: a standard on which a judgment or decision may be based
-criterion-referenced testing and assessment: a method of evaluation and a way of deriving meaning from test scores by evaluating an individual’s score with reference to a set standard
-this approach has also been referred to as domain- or content-referenced testing and assessment
-in norm-referenced interpretations of test data, a usual area of focus is how an individual performed relative to other people who took the test
-in criterion-referenced interpretations of test data, a usual area of focus is the testtaker’s performance: what the testtaker can or cannot do; what the testtaker has or has not learned; whether the testtaker does or does not meet specified criteria for inclusion in some group
-because criterion-referenced tests are frequently used to gauge achievement or mastery, they are sometimes referred to as mastery tests

RELIABILITY
-consistency in measurement

-the reliability coefficient is an index of reliability: a proportion that indicates the ratio between the true score variance on a test and the total variance

-a statistic useful in describing sources of test score variability is the variance: the standard deviation squared

-true variance: variance from true differences

-error variance: variance from irrelevant, random sources
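The variance partition above can be illustrated numerically. A minimal sketch with hypothetical true scores and error components (the errors are chosen here to be uncorrelated with the true scores, so true variance and error variance add up to the total variance):

```python
def variance(xs):
    """Population variance: average squared deviation about the mean."""
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / len(xs)

true_scores = [80, 90, 100, 110, 120]   # variance from true differences
errors      = [2, -2, 0, -2, 2]         # variance from irrelevant, random sources
observed    = [t + e for t, e in zip(true_scores, errors)]

# reliability as the ratio of true score variance to total variance
reliability = variance(true_scores) / variance(observed)
print(round(reliability, 3))
```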
-measurement error: all of the factors associated with the process of measuring some variable, other than the variable being measured; can be categorized as being either:
random error: caused by unpredictable fluctuations and inconsistencies of other variables in the measurement process
systematic error: typically constant or proportionate to what is presumed to be the true value of the variable being measured

-once a systematic error becomes known, it becomes predictable as well as fixable

SOURCES OF ERROR VARIANCE
Test construction
Test administration
Test scoring and interpretation
Other sources of error

RELIABILITY ESTIMATES

Test-Retest Reliability Estimates
-using the same instrument to measure the same thing at two points in time
-correlating pairs of scores from the same people on two different administrations of the same test
-appropriate when evaluating the reliability of a test that purports to measure something that is relatively stable over time, such as a personality trait
-the passage of time can be a source of error variance; the longer the time that passes, the greater the likelihood that the reliability coefficient will be lower
-coefficient of stability: the estimate obtained when the interval between testings is greater than six months

Parallel-Forms and Alternate-Forms Reliability Estimates
-degree of the relationship between various forms of a test
-two test administrations with the same group are required, and test scores may be affected by factors
-alternate forms of a test are typically designed to be equivalent with respect to variables such as content and level of difficulty
-alternate-forms reliability: an estimate of the extent to which these different forms of the same test have been affected by item sampling error, or other error

Internal consistency estimate of reliability
-evaluation of the internal consistency of the test items; the degree of correlation among all the items on a scale
-a measure of inter-item consistency is calculated from a single administration of a single form of a test
-an index of inter-item consistency is useful in assessing the homogeneity of the test
-tests are said to be homogeneous if they contain items that measure a single trait
-heterogeneity: the degree to which a test measures different factors; a heterogeneous test is composed of items that measure more than one trait
-the more homogeneous a test is, the more inter-item consistency it can be expected to have
-a homogeneous test is often an insufficient tool for measuring multifaceted psychological variables such as intelligence or personality

Split-Half Reliability Estimates
-obtained by correlating two pairs of scores obtained from equivalent halves of a single test administered once
-a useful measure of reliability when it is impractical or undesirable to assess reliability with two tests or to administer a test twice
-one acceptable way to split a test is to randomly assign items to one or the other half of the test
-another way is to assign odd-numbered items to one half of the test and even-numbered items to the other half; this yields an estimate of split-half reliability that is also referred to as odd-even reliability
-another way to split a test is to divide the test by content so that each half contains items equivalent with respect to content and difficulty
-the primary objective in splitting a test in half for the purpose of obtaining a split-half reliability estimate is to create mini-parallel-forms

Spearman–Brown formula
-used to estimate internal consistency reliability from a correlation of two halves of a test
-the KR-21 formula may be used if there is reason to assume that all the test items have approximately the same degree of difficulty

Coefficient alpha
-mean of all possible split-half correlations; used on tests containing nondichotomous items
-preferred statistic for obtaining an estimate of internal consistency reliability
-requires only one administration of the test
-ranges in value from 0 to 1
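These single-administration estimates can be sketched together. The 0/1 item-response matrix below is hypothetical; the odd-even split gives a half-test correlation that the Spearman–Brown formula (r_sb = 2r / (1 + r)) steps up to full test length, and coefficient alpha is computed as (k / (k − 1)) × (1 − Σ item variances / total-score variance). With dichotomous items, alpha reduces to KR-20; the same alpha formula applies to nondichotomous items:

```python
def variance(xs):
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / len(xs)

def pearson_r(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return sxy / (n * variance(x) ** 0.5 * variance(y) ** 0.5)

# hypothetical responses: rows = testtakers, columns = items (1 = correct)
responses = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]

# odd-even split-half reliability, stepped up with the Spearman-Brown formula
odd  = [r[0] + r[2] for r in responses]
even = [r[1] + r[3] for r in responses]
r_half = pearson_r(odd, even)
spearman_brown = 2 * r_half / (1 + r_half)

# coefficient alpha from item variances and total-score variance
k = len(responses[0])
totals = [sum(r) for r in responses]
item_vars = [variance([r[i] for r in responses]) for i in range(k)]
alpha = (k / (k - 1)) * (1 - sum(item_vars) / variance(totals))

print(round(r_half, 3), round(spearman_brown, 3), round(alpha, 3))
```

Note how the Spearman–Brown corrected value exceeds the raw half-test correlation: longer tests, all else equal, yield higher internal consistency estimates.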
-calculated to help answer questions about how similar sets of data are
-a value of alpha above .90 may be “too high” and indicate redundancy in the items

Average proportional distance (APD)
-used to evaluate the internal consistency of a test; focuses on the degree of difference that exists between item scores
-a general “rule of thumb” for interpreting an APD is that an obtained value of .2 or lower is indicative of excellent internal consistency, and that a value of .25 to .2 is in the acceptable range
-a calculated APD of .25 is suggestive of problems with the internal consistency of the test

Inter-scorer reliability
-the simplest way of determining the degree of consistency among scorers in the scoring of a test is to calculate a coefficient of correlation: the coefficient of inter-scorer reliability

USING AND INTERPRETING COEFFICIENTS OF RELIABILITY
-three approaches to the estimation of reliability: test-retest, alternate or parallel forms, and internal or inter-item consistency

PURPOSE OF THE RELIABILITY COEFFICIENT
-for a test designed for a single administration only, an estimate of internal consistency would be the reliability measure of choice
-transient error: a source of error attributable to variations in the testtaker’s feelings, moods, or mental state over time

NATURE OF THE TEST
-considerations such as whether:
The test items are homogeneous or heterogeneous in nature
The characteristic, ability, or trait being measured is presumed to be dynamic or static
The range of test scores is or is not restricted
The test is a speed or a power test
The test is or is not criterion-referenced

Restriction of range or restriction of variance
-if the variance of either variable in a correlational analysis is restricted by the sampling procedure used, then the resulting correlation coefficient tends to be lower

Inflation of range or inflation of variance
-if the variance of either variable is inflated by the sampling procedure, then the correlation coefficient tends to be higher

Speed tests versus power tests

Power test
-when a time limit is long enough to allow testtakers to attempt all items, and if some items are so difficult that no testtaker is able to obtain a perfect score

Speed test
-items of uniform level of difficulty (typically uniformly low) so that, when given generous time limits, all testtakers should be able to complete all the test items correctly
-score differences on a speed test are therefore based on performance speed, because items attempted tend to be correct
-a reliability estimate of a speed test should be based on performance from two independent testing periods using one of the following:
test-retest reliability
alternate-forms reliability
split-half reliability from two separately timed half tests
-the reliability of a speed test should not be calculated from a single administration of the test with a single time limit; the result will be a spuriously high reliability coefficient

Criterion-referenced tests
-scores on criterion-referenced tests tend to be interpreted in pass–fail terms
-traditional measures of reliability favor the development of longer rather than shorter tests

Domain sampling theory and generalizability theory
-estimate the extent to which specific sources of variation under defined conditions are contributing to the test score
-a test’s reliability is conceived of as an objective measure of how precisely the test score assesses the domain from which the test draws a sample
-the items in the domain are thought to have the same means and variances as those in the test that samples from the domain

Generalizability theory
-test scores vary from testing to testing because of variables in the testing situation
-universe score: analogous to a true score
-a generalizability study examines how generalizable scores from a particular test are if the test is administered in different situations; how much of an impact different facets of the universe have on the test score
-the influence of particular facets on the test score is represented by coefficients of generalizability

Decision study
-developers examine the usefulness of test scores in helping the test user make decisions
-designed to tell the test user how test scores should be used and how dependable those scores are as a basis for decisions

ITEM RESPONSE THEORY
-also known as latent-trait theory
-a person with X ability will be able to perform at a level of Y
-a person with X amount of a particular personality trait will exhibit Y amount of that trait on a personality test designed to measure it
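The “person with X ability performs at level Y” idea can be sketched with the one-parameter (Rasch) logistic model, one common latent-trait formulation; the ability (theta) and item difficulty (b) values below are hypothetical:

```python
import math

def p_correct(theta, b):
    """Rasch (1PL) model: probability that a person with ability theta
    answers an item of difficulty b correctly."""
    return 1 / (1 + math.exp(-(theta - b)))

# ability equal to item difficulty -> 50% chance of a correct response;
# higher ability relative to difficulty -> higher probability
print(p_correct(0.0, 0.0), p_correct(2.0, 0.0))
```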
-provides a measure of the precision of an observed test score -construct validity as being “umbrella validity” because every other
variety of validity falls under it
-an estimate of the amount of error inherent in an observed score or
measurement
-appropriateness of inferences drawn from test scores regarding
-the higher the reliability of a test, the lower the SEM individual standings on a variable called a construct
-tool used to estimate or infer the extent to which an observed score -construct: informed, scientific idea developed or hypothesized to
deviates from a true score describe or explain behavior
-also known as the standard error of a score -constructs are unobservable, underlying traits that a test developer
may invoke to describe test behavior or criterion performance
-index of the extent to which one individual’s scores vary over tests
presumed to be parallel -investigating a test’s construct validity must formulate hypotheses
about the expected behavior of high scorers and low scorers on the
-most frequently used in the interpretation of individual test scores test
-useful in establishing what is called a confidence interval: a range - viewed as the unifying concept for all validity evidence
or band of test scores that is likely to contain the true score
Evidence of Construct Validity
-68% CI: 60.7-66.3 (1 SD away) -various techniques of construct validation may provide evidence, for
96% CI: 58.4-67.6 (2 SD away) example, that:
standard error of measurement can be used to set the confidence The test is homogeneous, measuring a single construct
-…interval for a particular score or to determine whether a score is significantly different from a criterion

STANDARD ERROR OF THE DIFFERENCE BETWEEN TWO SCORES
-used in determining how large a difference between two scores should be before it is considered statistically significant
-if the probability is more than 5%, it is presumed that there was no difference
-a more rigorous standard is the 1% standard
-applying the 1% standard, no statistically significant difference would be deemed to exist unless the observed difference could have occurred by chance alone less than one time in a hundred

VALIDITY
Validation
-process of gathering and evaluating evidence about validity
-both the test developer and the test user may play a role in the validation of a test

Local validation studies
-necessary when the test user plans to alter in some way the format, instructions, language, or content of the test
-necessary if a test user sought to use a test with a population of testtakers that differed in some significant way from the population on which the test was standardized

CONTENT VALIDITY
-evaluation of the subjects, topics, or content covered by the items in the test
-judgment of how adequately a test samples behavior representative of the universe of behavior that the test was designed to sample

CONSTRUCT VALIDITY
-Pearson r could be used to correlate average subtest scores with the average total test score
-one way a test developer can improve the homogeneity of a test containing items that are scored dichotomously is by eliminating items that do not show significant correlation coefficients with total test scores
-coefficient alpha may also be used in estimating the homogeneity of a test composed of multiple-choice items
-evidence of construct validity includes:
    Test scores increase or decrease as a function of age, the passage of time, or an experimental manipulation as theoretically predicted
    Test scores obtained after some event or the mere passage of time (posttest scores) differ from pretest scores as theoretically predicted
    Test scores obtained by people from distinct groups vary as predicted by the theory
    Test scores correlate with scores on other tests in accordance with what would be predicted from a theory that covers the manifestation of the construct in question
-method of contrasted groups: one way of providing evidence for the validity of a test is to demonstrate that scores on the test vary in a predictable way as a function of membership in some group
-if a test is a valid measure of a particular construct, then test scores from groups of people who would be presumed to differ with respect to that construct should have correspondingly different test scores

Convergent evidence
-if scores on the test undergoing construct validation tend to correlate highly in the predicted direction with scores on older, more established, and already validated tests designed to measure the same (or a similar) construct
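The Pearson r mentioned above, used for item-total and subtest-total correlations, can be sketched in a few lines (all scores below are made-up illustrative numbers):

```python
from statistics import mean, pstdev

def pearson_r(x, y):
    """Pearson product-moment correlation between two score lists."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / len(x)
    return cov / (pstdev(x) * pstdev(y))

# Dichotomous item scores (0 = wrong, 1 = right) and total test scores
# for ten hypothetical testtakers (illustrative numbers only).
item = [1, 1, 0, 1, 0, 1, 1, 0, 1, 0]
total = [48, 45, 30, 44, 28, 50, 41, 33, 47, 29]

r = pearson_r(item, total)
# A substantial positive r suggests the item "pulls with" the total
# score and contributes to homogeneity; items with near-zero or
# negative r are candidates for elimination.
print(round(r, 2))  # -> 0.95
```

The same function applies to convergent evidence: correlating scores on the new test with scores on an older, validated test of the same construct.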
Page | 12
Psychological Assessment Transcripts
RGO Review Center | Board Licensure Examination for Psychometrician 2023
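The significance check described under the standard error of the difference reduces to a small calculation (the two SEM values below are hypothetical):

```python
import math

def sed(sem1, sem2):
    """Standard error of the difference between two obtained scores."""
    return math.sqrt(sem1 ** 2 + sem2 ** 2)

# Two obtained scores whose SEMs are hypothetical illustrative values.
score1, score2 = 112, 100
z = abs(score1 - score2) / sed(3.0, 4.0)   # 12 / 5.0 = 2.4

# 2.4 exceeds 1.96 (the 5% standard) but not 2.58 (the more rigorous
# 1% standard), so the difference is significant only at the 5% level.
print(round(z, 2))  # -> 2.4
```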
Factor analysis
-both convergent and discriminant evidence of construct validity can be obtained by the use of factor analysis
-designed to identify factors or specific variables that are typically attributes, characteristics, or dimensions on which people may differ
-employed as a data reduction method in which several sets of scores and the correlations between them are analyzed
-factor analysis is conducted on either an exploratory or a confirmatory basis
    Exploratory factor analysis: "estimating, or extracting factors; deciding how many factors to retain; and rotating factors to an interpretable orientation"
    Confirmatory factor analysis: tests the degree to which a hypothetical model (which includes factors) fits the actual data
-factor loading: extent to which the factor determines the test score or scores
-high factor loadings: provide convergent evidence of construct validity
-moderate to low factor loadings: provide discriminant evidence of construct validity

CRITERION-RELATED VALIDITY
-the criterion measure must also be valid for the purpose for which it is being used

Predictive validity
-test score predicts some criterion measure
-the criterion measure is obtained not concurrently but at some future time
-test scores may be obtained at one time and the criterion measures obtained at a future time, usually after some intervening event has taken place
-to evaluate the predictive validity of a test, a test targeting a particular attribute may be administered to a sample of research subjects in which approximately half of the subjects possess or exhibit the targeted attribute and the other half do not
    miss rate: people the test fails to identify as having a particular characteristic or attribute
    false positive: the test predicted that the testtaker possessed the particular characteristic even though the testtaker did not
    false negative: the test predicted that the testtaker did not possess the particular characteristic even though the testtaker actually did
-judgments of criterion-related validity, whether concurrent or predictive, are based on two types of statistical evidence:

Validity coefficient
-relationship between test scores and scores on the criterion measure
-Pearson correlation coefficient is used
-affected by restriction or inflation of range
-attrition in the number of subjects may adversely affect the validity coefficient

OTHER TYPES OF VALIDITY
Ecological validity
-judgment regarding how well a test measures what it purports to measure at the time and place that the variable being measured is actually emitted

Test bias
-a test is biased if some portion of its variance stems from some factor(s) that are irrelevant to performance on the criterion measure; as a consequence, one group of testtakers will systematically perform differently from another
-prevention during test development is the best cure for test bias
-a procedure called estimated true score transformations represents one of many available post hoc remedies

Rating errors
-rating: numerical or verbal judgment (or both) that places a person or an attribute along a continuum identified by a scale of numerical or word descriptors known as a rating scale
-rating error: a judgment resulting from the intentional or unintentional misuse of a rating scale
-leniency error (also known as a generosity error): error in rating that arises from the tendency on the part of the rater to be lenient in scoring
-central tendency error: the rater exhibits a general and systematic reluctance to give ratings at either the positive or the negative extreme; all of this rater's ratings would tend to cluster in the middle of the rating continuum
-one way to overcome restriction-of-range rating errors (central tendency, leniency, severity errors) is to use rankings: measure individuals against one another instead of against an absolute scale
-halo effect: in the eyes of some raters, some ratees can do no wrong

Test Fairness
-the extent to which a test is used in an impartial, just, and equitable way
-some tests, for example, have been labeled "unfair" because they discriminate among groups of people

UTILITY
-how useful a test is
-practical value of using a test to aid in decision making

FACTORS THAT AFFECT A TEST'S UTILITY
Psychometric soundness
-the higher the criterion-related validity of test scores for making a particular decision, the higher the utility of the test is likely to be
-the cost of administering tests can be well worth it if the result is certain noneconomic benefits

UTILITY ANALYSIS
-family of techniques that entail a cost–benefit analysis designed to yield information relevant to a decision about the usefulness and/or practical value of a tool of assessment
-undertaken for the purpose of evaluating whether the benefits of using a test outweigh the costs
-the purpose of a utility analysis is to answer a question related to costs and benefits in terms of money

HOW IS A UTILITY ANALYSIS CONDUCTED?
Taylor-Russell tables
-provide an estimate of the percentage of employees hired by the use of a particular test who will be successful at their jobs, given different combinations of three variables: the test's validity, the selection ratio used, and the base rate

Naylor-Shine tables
-entail obtaining the difference between the means of the selected and unselected groups to derive an index of what the test is adding to already established procedures
-with both tables, the validity coefficient used must be one obtained by concurrent validation procedures

The Brogden-Cronbach-Gleser formula
-used to calculate the dollar amount of a utility gain resulting from the use of a particular selection instrument under specified conditions
-utility gain: estimate of the benefit of using a particular test or selection method
-productivity gain: an estimated increase in work output

Decision theory
-provides guidelines for setting optimal cutoff scores
-a test is obviously of no value if the hit rate is higher without using it

SOME PRACTICAL CONSIDERATIONS
-base rates that are extremely low or high may render the test useless as a tool of selection
The pool of job applicants
The complexity of the job
The cut score in use
    o cut score: usually numerical reference point derived as a result of a judgment and used to divide a set of data into two or more classifications
    o relative cut score (norm-referenced cut score): reference point—in a distribution of test scores used to divide a set of data into two or more classifications—that is set based on norm-related considerations
    o multiple cut scores: use of two or more cut scores with reference to one predictor for the purpose of categorizing testtakers
    o multiple hurdle: a cut score is in place for each predictor used; the cut score used for each predictor will be designed to ensure that each applicant possesses some minimum level of a specific attribute or skill
    o compensatory model of selection: an assumption is made that high scores on one attribute can, in fact, "balance out" or compensate for low scores on another attribute; within the framework of a compensatory model is multiple regression

The Angoff Method
-disadvantage: when there is low inter-rater reliability and major disagreement regarding how certain populations of testtakers should respond to items
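The Brogden-Cronbach-Gleser formula discussed above is commonly written as utility gain = (N)(T)(r_xy)(SD_y)(Z_m) minus the cost of testing. A minimal sketch, with every figure below purely illustrative:

```python
def utility_gain(n_hired, tenure_years, validity, sd_y_dollars,
                 mean_z_hired, n_tested, cost_per_applicant):
    """Brogden-Cronbach-Gleser estimate of the dollar gain from using a
    selection test: (N)(T)(r_xy)(SD_y)(Z_m) - (applicants tested)(C).
    All monetary figures passed in here are hypothetical."""
    benefit = n_hired * tenure_years * validity * sd_y_dollars * mean_z_hired
    testing_cost = n_tested * cost_per_applicant
    return benefit - testing_cost

# 10 hires expected to stay 2 years, validity coefficient .40, SD of
# job performance in dollars $10,000, mean standardized test score of
# those hired 1.0; 50 applicants tested at $100 each.
gain = utility_gain(10, 2, 0.40, 10_000, 1.0, 50, 100)
print(gain)  # -> 75000.0
```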
The Known Groups Method
-collection of data on the predictor of interest from groups known to possess, and not to possess, a trait, attribute, or ability of interest
-the main problem with using known groups is that determination of where to set the cutoff score is inherently affected by the composition of the contrasting groups

The Bookmark Method
-subsequent to this training, the experts are given a book of items, with one item printed per page, such that items are arranged in an ascending order of difficulty
-each expert places a "bookmark" between the two pages (or, the two items) that are deemed to separate testtakers who have acquired the minimal knowledge, skills, and/or abilities from those who have not
-the bookmark serves as the cut score

TEST DEVELOPMENT
-umbrella term for all that goes into the process of creating a test
-the process of developing a test occurs in five stages:
1. test conceptualization
2. test construction
3. test tryout
4. item analysis
5. test revision

Preliminary questions
What is the test designed to measure?
What is the objective of the test? In the service of what goal will the test be employed?
In what way or ways is the objective of this test the same as or different from other tests with similar goals?
What real-world behaviors would be anticipated to correlate with testtaker responses?
Is there a need for this test?
Who benefits from an administration of this test?
Is there any potential for harm as the result of an administration of this test?
How will meaning be attributed to scores on this test?

Pilot Work
-preliminary research surrounding the creation of a prototype of the test
-test items may be pilot studied (or piloted) to evaluate whether they should be included in the final form of the instrument
-once the idea for a test is conceived (test conceptualization), test construction begins

TEST CONSTRUCTION
-a stage in the process of test development that entails writing test items (or rewriting or revising existing items), as well as formatting items, setting scoring rules, and otherwise designing and building a test
-once a preliminary form of the test has been developed, it is administered to a representative sample of testtakers under conditions that simulate the conditions under which the final version of the test will be administered

Scaling
-process of setting rules for assigning numbers in measurement
-process by which a measuring device is designed and calibrated and by which numbers are assigned to different amounts of the trait

Types of scales
Age-based scale
-if the testtaker's test performance as a function of age is of critical interest

Grade-based scale
-if the testtaker's test performance as a function of grade is of critical interest
-some rating scales are unidimensional: only one dimension is presumed to underlie the ratings
-other rating scales are multidimensional: more than one dimension is thought to guide the testtaker's responses

Method of paired comparisons
-testtakers are presented with pairs of stimuli, which they are asked to compare
-for each pair of options, testtakers receive a higher score for selecting the option deemed more justifiable by the majority of a group of judges

-comparative scaling: entails judgments of a stimulus in comparison with every other stimulus on the scale; for example, providing testtakers with a list of 30 items on a sheet of paper and asking them to rank the justifiability of the items from 1 to 30
-categorical scaling: stimuli are placed into one of two or more alternative categories that differ quantitatively with respect to some continuum

Guttman scale
-yields ordinal-level measures
-items on it range sequentially from weaker to stronger expressions of the attitude, belief, or feeling being measured
-all respondents who agree with the stronger statements of the attitude will also agree with milder statements

Scalogram analysis
-item-analysis procedure and approach to test development that involves a graphic mapping of a testtaker's responses

Writing Items
What range of content should the items cover?
Which of the many different types of item formats should be employed?
How many items should be written in total and for each content area covered?

Item pool
-reservoir or well from which items will or will not be drawn for the final version of the test

Item format
-form, plan, structure, arrangement, and layout of individual test items
-selected-response format: requires testtakers to select a response from a set of alternative responses
    multiple-choice: an item written in a multiple-choice format has three elements: (1) a stem, (2) a correct alternative or option, and (3) several incorrect options referred to as distractors or foils
    -a multiple-choice item that contains only two possible responses is called a binary-choice item
    -the most familiar binary-choice item is the true–false item
    -a good binary-choice item contains a single idea, is not excessively long, and is not subject to debate
    matching item: the testtaker is presented with two columns: premises on the left and responses on the right
    -the testtaker's task is to determine which response is best associated with which premise
-constructed-response format: requires testtakers to supply or to create the correct answer, not merely to select it
    completion item: should be worded so that the correct answer is specific; may also be referred to as a short-answer item
    essay item: requires the testtaker to respond to a question by writing a composition, typically one that demonstrates recall of facts, understanding, analysis, and/or interpretation
    -useful when the test developer wants the examinee to demonstrate a depth of knowledge about a single topic

Writing items for computer administration
Item bank
-relatively large and easily accessible collection of test questions

computerized adaptive testing (CAT)
-computer-administered test-taking process wherein items presented to the testtaker are based in part on the testtaker's performance on previous items
-the test administered may be different for each testtaker, depending on the test performance on the items presented
-tends to reduce floor effects and ceiling effects
    -floor effect: diminished utility of an assessment tool for distinguishing testtakers at the low end of the ability, trait, or other attribute being measured
    -ceiling effect: refers to the diminished utility of an assessment tool for distinguishing testtakers at the high end of the ability, trait, or other attribute being measured
-item branching: ability of the computer to tailor the content and order of presentation of test items on the basis of responses to previous items
-item-branching technology may be used in personality tests to recognize nonpurposive or inconsistent responding

Scoring Items
Cumulative model
-the higher the score on the test, the higher the testtaker is on the ability, trait, or other characteristic that the test purports to measure

Class scoring (category scoring)
-responses earn credit toward placement in a particular class or category with other testtakers whose pattern of responses is presumably similar in some way

Ipsative scoring
-comparing a testtaker's score on one scale within a test to another scale within that same test

TEST TRYOUT
-an informal rule of thumb is that there should be no fewer than 5 subjects and preferably as many as 10 for each item on the test
-the more subjects in the tryout the better
-a definite risk in using too few subjects during test tryout comes during factor analysis of the findings, when what we might call phantom factors—factors that actually are just artifacts of the small sample size—may emerge

What Is a Good Item?
-a good test item is reliable and valid
-a good test item is one that is answered correctly by high scorers on the test as a whole
-item analysis: different types of statistical scrutiny that the test data can potentially undergo at this point
-optimal average item difficulty is approximately .5, with individual items on the test ranging in difficulty from about .3 to .8
-a negative d-value on a particular item is a red flag because it indicates that low-scoring examinees are more likely to answer the item correctly than high-scoring examinees; such items need to be revised or eliminated
-the higher the value of d, the more adequately the item discriminates the higher-scoring from the lower-scoring testtakers
-ideal is .3+

Item-Characteristic Curves (ICCs)
-can play a role in decisions about which items are working well and which items are not

3. Some testtakers may be luckier than others in guessing the choices that are keyed correct

Qualitative item analysis
-exploration of the issues through verbal means such as interviews and group discussions conducted with testtakers and other relevant parties

"Think aloud" test administration
-having respondents verbalize thoughts as they occur; sheds light on the testtaker's thought processes during the administration of a test
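The difficulty (p) and discrimination (d) statistics above reduce to simple proportions; a minimal sketch with illustrative numbers:

```python
def item_difficulty(item_scores):
    """p: proportion of testtakers answering the item correctly."""
    return sum(item_scores) / len(item_scores)

def discrimination_index(upper_correct, lower_correct, group_size):
    """d = (U - L) / n: passes in the upper-scoring group minus passes
    in the lower-scoring group, over the size of one group."""
    return (upper_correct - lower_correct) / group_size

# Illustrative numbers: 0/1 scores on one item for eight testtakers,
# and upper/lower scoring groups of 10 in which 9 and 2 passed.
p = item_difficulty([1, 1, 0, 1, 1, 0, 1, 0])   # 0.625, near the .5 optimum
d = discrimination_index(9, 2, 10)               # 0.7, above the .3 ideal
print(p, d)
```

A negative d here would flag the red-flag case described above, where low scorers outperform high scorers on the item.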
-the next step is to administer the revised test under standardized conditions to a second appropriate sample of examinees
-on the basis of an item analysis of data derived from this administration of the second draft of the test, the test developer may deem the test to be in its finished form
-the test's norms may be developed from the data, and the test will be said to have been "standardized" on this (second) sample
-when the item analysis of data derived from a test administration indicates that the test is not yet in finished form, the steps of revision, tryout, and item analysis are repeated until the test is satisfactory and standardization can occur

Test Revision in the Life Cycle of an Existing Test
-tests are revised when significant changes in the domain represented, or new conditions of test use and interpretation, make the test inappropriate for its intended use
-many tests are deemed to be due for revision when any of the following conditions exist:
1. The stimulus materials look dated and current testtakers cannot relate to them.
2. The verbal content of the test, including the administration instructions and the test items, contains dated vocabulary that is not readily understood by current testtakers.
3. As popular culture changes and words take on new meanings, certain words or expressions in the test items or directions may be perceived as inappropriate or even offensive to a particular group and must therefore be changed.
4. The test norms are no longer adequate as a result of group membership changes in the population of potential testtakers.
5. The test norms are no longer adequate as a result of age-related shifts in the abilities measured over time, and so an age extension of the norms (upward, downward, or in both directions) is necessary.
6. The reliability or the validity of the test, as well as the effectiveness of individual test items, can be significantly improved by a revision.
7. The theory on which the test was originally based has been improved significantly, and these changes should be reflected in the design and content of the test.
-the steps to revise an existing test parallel those to create a brand-new one

Quality assurance during test revision
-a mechanism for ensuring consistency in scoring is the anchor protocol
-anchor protocol: test protocol scored by a highly authoritative scorer that is designed as a model for scoring and a mechanism for resolving scoring discrepancies
-a discrepancy between scoring in an anchor protocol and the scoring of another protocol is referred to as scoring drift

The Use of IRT in Building and Revising Tests
-using IRT, test developers evaluate individual item performance with reference to item-characteristic curves (ICCs)
-three of the many possible applications of IRT in building and revising tests include:
    evaluating existing tests for the purpose of mapping test revisions
    determining measurement equivalence across testtaker populations
    developing item banks

Evaluating the properties of existing tests and guiding test revision
-IRT information curves can help test developers evaluate how well an individual item (or entire test) is working to measure different levels of the underlying construct

Determining measurement equivalence across testtaker populations
-helps ensure that the same construct is being measured, no matter what language the test has been translated into

differential item functioning (DIF)
-an item functions differently in one group of testtakers as compared to another group of testtakers known to have the same (or similar) level of the underlying trait

DIF analysis
-test developers scrutinize group-by-group item response curves, looking for what are termed DIF items

DIF items
-items that respondents from different groups at the same level of the underlying trait have different probabilities of endorsing as a function of their group membership
-another application of DIF analysis has to do with the evaluation of item-ordering effects, and the effects of different test administration procedures

Developing item banks
-each of the items assembled as part of an item bank, whether taken from an existing test or written especially for the item bank, has undergone rigorous qualitative and quantitative evaluation
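The item-characteristic curves and DIF items described above can be illustrated with a two-parameter logistic (2PL) model; the item parameters below are hypothetical:

```python
import math

def icc_2pl(theta, a, b):
    """Two-parameter logistic ICC: probability of a correct response at
    trait level theta, given discrimination a and difficulty b."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

# DIF illustration (hypothetical parameters): the same item is harder
# for group B (b = 1.0) than for group A (b = 0.0), so testtakers at
# the SAME trait level have different probabilities of success.
p_group_a = icc_2pl(0.5, a=1.2, b=0.0)
p_group_b = icc_2pl(0.5, a=1.2, b=1.0)
print(round(p_group_a, 2), round(p_group_b, 2))  # -> 0.65 0.35
```

Comparing such group-by-group curves is, in sketch form, what a DIF analysis does.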
-new items may also be written when existing measures are either not available or do not tap targeted aspects of the construct being measured

Piaget
-evolving biological adaptation to the outside world

PERSPECTIVES ON INTELLIGENCE
Interactionism
-refers to the complex concept by which heredity and environment are presumed to interact and influence the development of one's intelligence

Louis L. Thurstone
-conceived of intelligence as composed of what he termed primary mental abilities (PMAs)
-an early model of multiple abilities

Factor-analytic theories
-focus is squarely on identifying the ability or groups of abilities deemed to constitute intelligence

Charles Spearman
-referred to as a two-factor theory of intelligence
-tests that exhibited high positive correlations with other intelligence tests were thought to be highly saturated with g, whereas tests with low or moderate correlations with other intelligence tests were viewed as possible measures of specific factors
-the greater the magnitude of g in a test of intelligence, the better the test was thought to predict overall intelligence
-abstract-reasoning problems were thought to be the best measures of g in formal tests

-some of the abilities (such as Gv) are vulnerable abilities in that they decline with age and tend not to return to preinjury levels following brain damage
-some abilities (such as Gq) are maintained abilities; they tend not to decline with age and may return to preinjury levels following brain damage

Three-stratum theory of cognitive abilities (Carroll)
-the top stratum or level in Carroll's model is g, or general intelligence
-the second stratum is composed of eight abilities and processes: fluid intelligence (Gf), crystallized intelligence (Gc), general memory and learning (Y), broad visual perception (V), broad auditory perception (U), broad retrieval capacity (R), broad cognitive speediness (S), and processing/decision speed (T)
-below each of the abilities in the second stratum are many "level factors" and/or "speed factors"

CHC model
-the model was the product of efforts designed to improve the practice of psychological assessment in education (sometimes referred to as psychoeducational assessment) by identifying tests from different batteries that could be used to provide a comprehensive assessment of a student's abilities
-cross-battery assessment: assessment that employs tests from different test batteries and entails interpretation of data from specified subtests to provide a comprehensive assessment
-one’s ability to learn is determined by the number and speed of the -earlier versions of the Stanford-Binet had employed the ratio IQ,
bonds that can be marshaled which was based on the concept of mental age: the age level at
which an individual appears to be functioning intellectually as
indicated by the level of items responded to correctly
Information-processing theories
-the focus is on identifying the specific mental processes that -ratio IQ: ratio of the testtaker’s mental age divided by his or her
constitute intelligence chronological age, multiplied by 100 to eliminate decimals for its
computation
-Russian neuropsychologist Aleksandr Luria
-child whose mental age and chronological age were equal would
-focuses on the mechanisms by which information is processed—how thus have an IQ of 100
information is processed, rather than what is processed
-deviation IQ: comparison of the performance of the individual with
-two basic types of information-processing styles: the performance of others of the same age in the standardization
sample
simultaneous (or parallel) processing: information is
integrated all at one time; “synthesized.” Information is -M: 100, SD: 16
integrated and synthesized at once and as a whole
-SB5 is exemplary in terms of what is called adaptive testing: testing
-tasks that involve the simultaneous mental representations individually tailored to the testtaker
of images or information involve simultaneous processing
The Wechsler Tests
successive (or sequential) processing: each bit of -Bellevue Hospital in Manhattan, needed an instrument for evaluating
information is individually processed in sequence the intellectual capacity of its multilingual, multinational, and
multicultural clients
-logical and analytic in nature; piece by piece and one piece
-the W-B 1 was a point scale, not an age scale; items were classified
after the other, information is arranged and rearranged so
by subtests rather than by age and was organized into six verbal
that it makes sense
subtests and five performance subtests; all the items in each test were
arranged in order of increasing difficulty
PASS model of intellectual functioning
-Planning: strategy development for problem solving
-WAIS-IV is the current Wechsler adult scale
-Attention: (also referred to as arousal), receptivity of information
-Simultaneous and Successive: type of information process
-WAIS-IV It is made up of subtests that are designated either as core
employed
or supplemental
MEASURING INTELLIGENCE
-core subtest: administered to obtain a composite score
-measurement of intelligence entails sampling an examinee’s
-supplemental subtest: (also sometimes referred to as an optional
performance on different types of tests and tasks as a function of
subtest) is used for purposes such as providing additional clinical
developmental level
information or extending the number of abilities or processes
SOME TASKS USED TO MEASURE INTELLIGENCE sampled
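The ratio IQ and deviation IQ described above reduce to small calculations; a sketch in which the raw score, age-group mean, and SD are hypothetical:

```python
def ratio_iq(mental_age, chronological_age):
    """Ratio IQ: mental age divided by chronological age, times 100."""
    return round(mental_age / chronological_age * 100)

def deviation_iq(raw_score, age_group_mean, age_group_sd, scale_sd=16):
    """Deviation IQ: place a score relative to same-age peers in the
    standardization sample (M 100; the early Stanford-Binet used SD 16)."""
    z = (raw_score - age_group_mean) / age_group_sd
    return round(100 + scale_sd * z)

print(ratio_iq(12, 10))          # mental age 12, chronological age 10 -> 120
print(deviation_iq(65, 50, 10))  # 1.5 SD above the age-group mean -> 124
```

Equal mental and chronological ages give the ratio IQ of 100 noted above.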
-the fifth index score, the General Ability Index (GAI): a kind of "composite of two composites"; calculated using the Verbal Comprehension and Perceptual Reasoning Indexes
-the GAI is useful to clinicians as an overall index of intellectual ability
-another composite score that has clinical application is the Cognitive Proficiency Index (CPI): comprised of the Working Memory Index and the Processing Speed Index, the CPI is used to identify problems related to working memory or processing speed
-the CPI was calibrated to have a M: 100 and SD: 15
-group intelligence test results provide school personnel with valuable information for instruction-related activities and increased understanding of the individual pupil
-the first group intelligence test to be used in U.S. schools was the Otis-Lennon School Ability Test, formerly the Otis Mental Ability Test
-designed to measure abstract thinking and reasoning ability and to assist in school evaluation and placement decision-making

Other measures of intellectual abilities
-tests designed to measure creativity may well measure variables related to intelligence; measures of creativity may also be thought of as tools for assessing intelligence
-four terms common to many measures of creativity are:
    Originality: ability to produce something that is innovative or nonobvious
    Fluency: ease with which responses are produced; usually measured by the total number of responses produced
    Flexibility: variety of ideas presented and the ability to shift from one approach to another
    Elaboration: richness of detail in a verbal explanation or pictorial display
-convergent thinking: a deductive reasoning process that entails recall and consideration of facts as well as a series of logical judgments to narrow down solutions and eventually arrive at one solution
-divergent thinking: thought is free to move in many different directions, making several solutions possible; requires flexibility of thought, originality, and imagination

Remote Associates Test (RAT)
-presents the testtaker with three words; the task is to find a fourth word associated with the other three

ISSUES IN THE ASSESSMENT OF INTELLIGENCE
The Construct Validity of Tests of Intelligence
-evaluation of a test's construct validity proceeds on the assumption that one knows in advance exactly what the test is supposed to measure

ASSESSMENT FOR EDUCATION
Integrative assessment
-employs not only various tools of assessment, but input from various school personnel, as well as parents, and other relevant sources of information

Dynamic Assessment
-originally developed for use with children
-exploring learning potential that is based on a test-intervention-retest model

Achievement tests
-designed to measure accomplishment
-measure the degree of learning that has taken place
-items may be characterized by the type of mental processes required by the testtaker to successfully retrieve the information needed to respond to the item
-fact-based items and conceptual items
occurred as a result of relatively structured input -measure of reading readiness, reading achievement, and reading
difficulties
-referred to as prognostic tests; used to make predictions
-takes between 15 and 45 minutes to administer the entire battery
-tend to draw on a broader fund of information and abilities and may
be used to predict a wider variety of variables -used with children as young as 4½, adults as old as 80
THE PRESCHOOL LEVEL -subtests:
Apgar number Letter Identification
- everybody’s first test” Word Identification
Word Attack
-conducted at 1 minute after birth to assess how well the infant Word comprehension
tolerated the birthing process
-three subtests new to the third edition are Phonological Awareness,
-evaluation is conducted again at 5 minutes after birth to assess how Listening Comprehension, and Oral Reading Fluency
well the infant is adapting to the environment
Psychoeducational Test Batteries
-each evaluation is made with respect to the same five variables; each -test kits that generally contain two types of tests: those that measure
variable can be scored on a range from 0 to 2; and each score (at 1 abilities related to academic success and those that measure
minute and 5 minutes) can range from 0 to 10 educational achievement in areas such as reading and arithmetic
-activity (or muscle tone), pulse (or heart rate), grimace (or reflex -data derived from these batteries allow for normative comparisons as
irritability), appearance (or color), respiration well as an evaluation of the testtaker’s own strengths and weaknesses
*Approximately one hour is a good rule-of-thumb limit for an Kaufman Assessment Battery for Children (K-ABC)
entire test session with a preschooler; less time is preferable -use with testtakers from age 2½ through age 12½
*As testing time increases, so does the possibility of fatigue and -measurement for intelligence and achievement
distraction
THE ELEMENTARY-SCHOOL LEVEL -divided into two groups, reflecting the two kinds of information-
processing skills identified by Luria and his students: simultaneous
skills and sequential skills
Metropolitan Readiness Tests (MRT6)
-assesses the development of the reading and mathematics skills Kaufman Assessment Battery for Children, Second Edition
important in the early stages of formal school learning (KABC-II)
-age range for the second edition of the test was extended upward
-orally administered (ages 3 to 18) to expand the possibility of making ability/achievement
comparisons with the same test through high school
-runs about 90 minutes
THE SECONDARY-SCHOOL LEVEL -10 new subtests were created, 8 of the existing subtests were
removed, and only 8 of the original subtests remained
SAT
-aptitude test widely used in the schools at the secondary level
-test has been of value not only in the college selection process but
also as an aid to high-school guidance and job placement counselors
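The Apgar arithmetic described above (five signs, each rated 0 to 2, summed into a 0-to-10 total at 1 and 5 minutes) can be sketched in a few lines of Python. The function name and dictionary layout are invented for illustration; this is not part of any assessment software:

```python
# Sketch of the Apgar total as described in these notes:
# five signs, each rated 0-2, summed to a 0-10 score.
APGAR_SIGNS = ("activity", "pulse", "grimace", "appearance", "respiration")

def apgar_total(ratings):
    """Sum the five 0-2 sign ratings into a 0-10 Apgar score."""
    if set(ratings) != set(APGAR_SIGNS):
        raise ValueError("expected exactly the five Apgar signs")
    for sign, value in ratings.items():
        if value not in (0, 1, 2):
            raise ValueError(f"{sign} must be rated 0, 1, or 2")
    return sum(ratings.values())

# Example: a newborn rated at 1 minute after birth
one_minute = {"activity": 2, "pulse": 2, "grimace": 1,
              "appearance": 1, "respiration": 2}
print(apgar_total(one_minute))  # 8
```

The same function is simply called twice in practice, once with the 1-minute ratings and once with the 5-minute ratings.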
-whereas traits are frequently discussed as if they were characteristics
possessed by an individual, types are more clearly descriptions of
people
-typology devised by Carl Jung became the basis for the Myers-
Briggs Type Indicator
-MBTI: an assumption guiding the development of this test was that
people exhibit definite preferences in the way that they perceive or
become aware of—and judge or arrive at conclusions
Meyer Friedman and Ray Rosenman
-conceived of a Type A personality: competitiveness, haste,
restlessness, impatience, feelings of being time-pressured, and strong
needs for achievement and dominance
-Type B personality: the opposite of the Type A's traits; mellow
or laid-back
-the use of the word trait presupposes a relatively enduring behavioral
predisposition, whereas the term state is indicative of a relatively
temporary predisposition
PERSONALITY ASSESSMENT: SOME BASIC QUESTIONS
Who is being assessed, and who is doing the assessing?
-some methods of personality assessment rely on the assessee's own
self-report: process wherein information about assessees is supplied
by the assessees themselves
-self-reported information may be obtained in the form of
diaries kept by assessees or in the form of responses to oral or written
questions or test items
-self-report methods are very commonly used to explore an
assessee's self-concept: one's attitudes, beliefs, opinions, and related
thoughts about oneself
DEVELOPING INSTRUMENTS TO ASSESS PERSONALITY
Logic and Reason
-may dictate what content is covered by the items
-the use of logic and reason in the development of test items is
sometimes referred to as the content or content-oriented approach
to test development
Theory
-personality measures differ in the extent to which they rely on a
particular theory of personality in their development as well as their
interpretation
Data Reduction Methods
-include several types of statistical techniques collectively
known as factor analysis or cluster analysis
Neuroticism: referred to as the Emotional Stability factor;
adjustment and emotional stability, including how people
cope in times of emotional turmoil
Extraversion: sociability, how proactive people are in
seeking out others, as well as assertiveness
Openness: referred to as the Intellect factor; openness to
experience as well as active imagination, aesthetic
sensitivity, attentiveness to inner feelings, preference for
variety, intellectual curiosity, and independence of judgment
Agreeableness: interpersonal tendencies that include
altruism, sympathy toward others, friendliness, and the
belief that others are similarly inclined
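The data reduction idea mentioned above can be made concrete with a toy simulation. This is only a sketch assuming NumPy is available; the six items and two latent traits are invented for illustration and are not actual Big Five data. It shows the factor-analytic intuition that many correlated test items can be summarized by a small number of underlying factors:

```python
# Toy illustration of data reduction: six simulated test items driven
# by two latent traits. The 6x6 inter-item correlation matrix then has
# roughly two dominant eigenvalues, the intuition behind factor-analytic
# derivations of personality dimensions.
import numpy as np

rng = np.random.default_rng(0)
n = 2000
trait_a = rng.normal(size=n)  # latent trait behind items 1-3
trait_b = rng.normal(size=n)  # latent trait behind items 4-6

items = np.column_stack([
    trait_a + 0.5 * rng.normal(size=n),  # items loading on trait A
    trait_a + 0.5 * rng.normal(size=n),
    trait_a + 0.5 * rng.normal(size=n),
    trait_b + 0.5 * rng.normal(size=n),  # items loading on trait B
    trait_b + 0.5 * rng.normal(size=n),
    trait_b + 0.5 * rng.normal(size=n),
])

corr = np.corrcoef(items, rowvar=False)       # 6x6 inter-item correlations
eigenvalues = np.linalg.eigvalsh(corr)[::-1]  # sorted, largest first
factors_kept = int(np.sum(eigenvalues > 1.0)) # Kaiser criterion
print(factors_kept)  # 2: six items reduce to two underlying factors
```

Real factor-analytic work on personality inventories follows the same logic at a much larger scale: hundreds of items, scored by thousands of respondents, reduce to a handful of factors such as the Big Five.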
Psychological Assessment Transcripts
RGO Review Center | Board Licensure Examination for Psychometrician 2023
Acculturation
-an ongoing process by which an individual’s thoughts, behaviors,
values, worldview, and identity develop in relation to the general
thinking, behavior, customs, and values of a particular cultural group