Psych Ass
GREEN - SURE
RED - NOT SURE
1. Depression is more common among people with insomnia than among those with satisfactory
sleep. To determine the reasons for this relationship, investigators identified 40 people suffering
from both depression and insomnia. For each of these 40, they paired two other people of the
same gender and age who were neither depressed nor suffering from any sleep disorder. One of
these was designated the "normal-sleep control," and the other was designated the "yoked
control." All participants slept in a laboratory for one week. The normal-sleep control person slept
without restrictions. During that same time, the yoked control was permitted to sleep when the
depressed-insomniac person slept but was required to awaken whenever the
depressed-insomniac was awakened. A valid questionnaire for measuring depression was
administered at the end of the one-week study. Assume that higher scores on the questionnaire
reflect greater depressive symptomatology. What pattern of results on the depression
questionnaire would justify the conclusion that sleeplessness leads to depression?
2. Which of the following is the best way to establish rapport with a test-taker?
3. A researcher conducted a study to determine the effects of gender and status on the perceived
credibility of an eyewitness testifying in a trial. Participants watched one of four video recordings
depicting the eyewitness and rated the credibility of the eyewitness. In order to determine whether
gender, as a specific variable, had an effect on perceived credibility of the eyewitness, which of
the following must be significant?
● personal values
● creativity and motivation
● traits and states
● social communication skills
6. Which of the following is most appropriate for determining the psychometric soundness of
behavioral assessment?
8. Dr. Chen is interested in feminist attitudes of young adult women in the United States.
Consequently, she administered a feminist attitude questionnaire to a total of 100 young adult
women from three universities. The 100 women tested and the number of young adult women in
the United States are which of the following, respectively?
9. The Beck Depression Inventory-II measures the test-takers’ feelings over what period of time?
● 1 week
● 2 weeks
● 3 weeks
● the last month
10. According to Roberts and DelVecchio (2000), trait consistency tends to increase until one is
between _________ years of age, at which time it peaks.
● 10 and 20
● 30 and 40
● 50 and 60
● 40 and 50
11. A psychologist who does not act in the same or similar way that other reasonable psychologists
would have acted under the same or similar circumstances may be found liable for:
● abuse.
● incompetency.
● malpractice.
● negligence.
12. In a normal distribution of scores, approximately what percentage of test scores falls between +1
and –1 standard deviations from the mean?
● less than 1%
● 75%
● 50%
● 66%
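As a numerical check on item 12, here is a minimal Python sketch of the proportion of a normal distribution that lies within k standard deviations of the mean. The function name fraction_within is illustrative and not from the source.

```python
import math


def fraction_within(k: float) -> float:
    """Proportion of a normal distribution within k SDs of the mean."""
    # For a standard normal variable Z, P(-k < Z < k) = erf(k / sqrt(2)).
    return math.erf(k / math.sqrt(2))


print(f"{fraction_within(1):.1%}")  # about 68%
```

The result is roughly 68%, so among the listed options the nearest value is 66%.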
● paired methods
● content
● summative
● categorical
15. The lower test-retest reliability coefficients found to exist for state anxiety when compared with
higher test-retest reliability coefficients obtained for trait anxiety support which premise?
● None of these.
● States are more enduring personality characteristics than traits.
● Traits are more enduring personality characteristics than states.
● Exhibition of anxiety is very situation-dependent.
16. If a psychologist determines that a client is a danger to others, that psychologist has a legal
obligation to:
18. If an instructor assigns a grade of “A” to all students who earn 900 or more points out of a total of
1000 points during the semester, 900 represents:
19. A 40-item vocabulary test was administered to a group of students. A second similar test of
vocabulary terms was administered to this same group of students approximately one week later.
The researcher reported that the correlation between these two tests was r = .90. What type of
reliability is represented in this example?
● split-half
● test-retest
● alternate forms
● inter-rater
● Conformity increases as group size increases from two people to four or five people.
● The presence of one dissenter in a group is not strong enough to reduce conformity.
● Higher levels of conformity are found in individualistic societies than in collectivistic societies.
● Individuals will follow orders to shock innocent strangers.
● Guttman
● Strong
● Holland
● Wonderlic
23. A recent article in an educational journal described a university at which the average age is 26.
This article also mentioned that 38 percent of the students are over 25 years of age. What can be
concluded from this information?
24. Which of the following statements is TRUE of the role of personality measures in
industrial/organizational psychology?
● Personality tests are playing less and less a role in I/O settings with every passing year.
● The distinction between task-related and people-related aspects of a job is irrelevant to
personality measurement.
● The MMPI-2-RF has quickly become the most widely used measure of personality in I/O settings.
● The same personality test may not be equally suited for use with every job.
25. A neuropsychologist blindfolds a patient and then moves the patient’s arms and legs in various
positions. However, the patient cannot identify where his limbs are located. The
neuropsychologist would MOST likely suspect that the patient has suffered damage to the:
● temporal lobe
● frontal lobe
● parietal lobe
● occipital lobe
27. Who is credited with being the originator of the psychometric concept of test reliability?
● Kraepelin
● Pearson
● Titchener
● Spearman
28. In the language of psychological testing and assessment, reliability refers to:
29. In order to make norms for a certain test more appropriate for use with test-takers from Taiwan,
data from the original standardization sample of a test is supplemented with Taiwanese norms. In
this instance, it would be:
● necessary to reevaluate the wording of the test’s items in order to make certain that test-takers in
Taiwan do not find any of the items offensive in any way.
● perfectly appropriate to continue to use the terms “standardization sample” and “normative
sample” interchangeably with reference to this test.
● desirable to integrate the Taiwanese norms into the original norms so that both norms could be
referred to as the “standardization sample” norms with reference to this test.
● inaccurate to continue to use the terms “standardization sample” and “normative sample”
interchangeably with reference to the test.
30. As part of the test development process, a test revision may entail:
31. You wish to determine if the student you are evaluating scored higher on a mathematics test than
on a reading test. What statistic(s) would you calculate?
33. An example of a personality test that employed empirical criterion keying in its development is
the:
● NEO-PI-R
● MMPI
● 16 PF
● Rorschach
34. Which question might a mental health professional be MOST likely to be asked during the course
of a civil proceeding?
37. If a test-taker earns a z score of +2 on a test, approximately how many other test-takers obtained
higher scores, assuming the distribution of test scores is normal?
● 2.5%
● 14%
● 16%
● 25%
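For item 37, the share of scores above a given z score is the upper-tail area of the standard normal distribution. The sketch below, with the illustrative name upper_tail, computes it using only the standard library.

```python
import math


def upper_tail(z: float) -> float:
    """Proportion of a normal distribution lying above the given z score."""
    # P(Z > z) = 0.5 * erfc(z / sqrt(2)) for a standard normal variable Z.
    return 0.5 * math.erfc(z / math.sqrt(2))


print(f"{upper_tail(2):.1%}")  # a little over 2%
```

The exact tail area for z = +2 is a little over 2%, which textbooks commonly round to 2.5% (half of the roughly 5% that falls outside plus or minus 2 standard deviations).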
38. Which model of intelligence guided the development of the fourth edition of the Stanford-Binet
Intelligence Scale?
● reusing items in an original test that were originally developed for use in a parallel test.
● the creation of alternate and parallel forms of tests based on a group of test-takers’ responses to
the original test.
● statistical efforts to ensure that items translated into foreign languages are of the same difficulty.
● administering certain test items on a test depending on the test-takers’ responses to
previous test items.
40. Users of psychological tests are frequently tempted to treat ordinal data as if it were interval data.
This is the case because of the:
● difficulties that would be encountered if the data were treated as ratio data.
● unwritten rule that exists pertaining to the equal intervals between points measured.
● added flexibility of interval level data for statistical manipulation.
● frequent need to do more than simply rank order test scores.
41. The 7-Minute Screen was developed to identify symptoms associated with which of the following?
● All of these.
● Alzheimer’s disease
● personality disorders
● seizure disorders
43. A 16-year-old male suspected of drug abuse is referred for neuropsychological evaluation. Which
tool of assessment is LEAST likely to be used?
45. Which of the following statistics is the preferred measure of central tendency for a skewed
distribution?
● the median
● None of these.
● the mode
● the mean
46. The Children’s Apperception Test (CAT) depicts ________ in its pictures.
47. For use in a study of abnormal and bizarre behavior, a psychologist seeks to use a number of
psychological tests. The psychologist does not want reviews of tests, but only a description of
what is available for possible use in this research. Which reference source should the
psychologist consult?
48. The terms basal level, ceiling level, adaptive approach, and routing test are all associated with
the:
● WISC-IV
● OLSAT
● WAIS-III
● SB-5
49. Children with chronic middle-ear infections often have low scores on the Sequenced Inventory of
Communication Development (SICD). This is evidence of the _________________ for the SICD.
● construct validity
● internal consistency validity
● test-retest validity
● content validity
● stability
● test-retest reliability
● All of these
● internal consistency
51. The type of research that attempts to replicate a real-world problem in a research or clinical
setting is called:
52. A patient exhibits deficits in word recall, vocabulary, and finding words to name things. A
neuropsychologist would be MOST likely to diagnose this patient with:
● frontal lobes
● occipital lobes
● limbic system
● parietal lobes
53. Which ethical issue is particularly relevant when assessing substance abusers in the context of
research?
● the issue of informed consent
● the issue of cultural sensitivity
● the issue of right to treatment
● the issue characterized by the phrase, “First, do no harm”
54. A student taking a course entitled “Ancient History” is administered a history test. Years later, data
from this test is reviewed by assessment professionals who are preparing a case study on the
test-taker. In their report, the “Ancient History” test is referred to as:
55. Melody exclaims, “I got a C- on the statistics exam, and I was miserable until I thought how
terrible it must be for those who got F’s.” Melody’s attitude is an example of which of the
following?
● social learning
● social comparison
● social anxiety
● social validation
57. A student scores very high on a graduate school admission test and is admitted to graduate
school, largely on the basis of his test score. The student subsequently flunks out. The type of
test outcome described in this situation is known as a:
● true negative.
● false negative.
● positive hit.
● false positive.
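The four outcome labels in item 57 come from crossing the test's prediction with the actual criterion outcome. A minimal sketch follows; it assumes that "admitted on the basis of the test" counts as a positive prediction and "succeeded in graduate school" as a positive outcome, and the function name classify_outcome is hypothetical.

```python
def classify_outcome(predicted_success: bool, actual_success: bool) -> str:
    """Label a selection decision by comparing prediction with outcome."""
    if predicted_success and actual_success:
        return "true positive"
    if predicted_success and not actual_success:
        return "false positive"
    if not predicted_success and actual_success:
        return "false negative"
    return "true negative"


# High test score (predicted success), then flunked out (actual failure).
print(classify_outcome(predicted_success=True, actual_success=False))
```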
58. An administration of the Montreal Neuropsychological Institute Battery entails the administration
of:
60. If a student’s performance on a newly developed math achievement test is compared with his or
her recent performance on another achievement test known to measure math skills, this would be
an example of ________ validity.
● predictive criterion-related
● content
● concurrent criterion-related
● construct
● have a friend or family member of the group who is fluent in English and Japanese read the test
to the group, simultaneously translating the items word for word.
● have a professional translator read the test to the group, simultaneously translating the items
word for word.
● have a teacher fluent in Japanese and English conduct a brief tutorial in English prior to
administering the test in English, with specific attention given to the meaning of the wording of
key items and corresponding responses.
● None of these.
62. The Hand-Tool Dexterity Test and the O’Connor Tweezer Dexterity Test would most likely be used
by an employer interested in:
64. The Trail Making Tests are part of which neuropsychological test battery?
● general aptitude
● achievement
● intelligence
● both achievement and intelligence
67. You are interested in developing a test for social adjustment in a college fraternity or sorority. You
begin by interviewing persons who had graduated from college after having been a member of a
fraternity or sorority for at least 2 years. Which stage of test development best describes the one
that you are in?
● Stanford-Binet 5
● Woodcock-Johnson III
● K-ABC
● Hilton PTB
69. A Director of Human Resources is setting up a series of tests to use to select applicants for sales
positions. Inherent in the tests, and applied in the model of selection, is the Director’s assumption
that high sales ability can make up for limited product knowledge. The model of selection being
applied could BEST be characterized as:
70. A correlation coefficient is equal to .30. Using the concept of coefficient of determination, the
variance accounted for by chance, error, and other unexplained factors would be:
● None of these.
● approximately 30%.
● approximately 3%.
● approximately 9%.
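Item 70 rests on the coefficient of determination: squaring r gives the proportion of variance the two variables share, and the remainder is left to chance, error, and other unexplained factors. A short sketch, with the illustrative name unexplained_variance:

```python
def unexplained_variance(r: float) -> float:
    """Proportion of variance NOT accounted for by a correlation of r."""
    explained = r ** 2  # coefficient of determination
    return 1 - explained


r = 0.30
print(f"explained:   {r ** 2:.0%}")
print(f"unexplained: {unexplained_variance(r):.0%}")
```

With r = .30 the explained share is about 9%, so the unexplained share the item asks about is about 91%.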
71. An educational psychologist conducts a utility analysis of a teaching program used to improve the
handwriting of very young children. The measure of utility in this analysis will most likely be:
● decrease in costs.
● increase in performance level.
● reduction in accidents.
● increase in revenue.
73. An assessor determines that John is incompetent to stand trial. This means that John:
● qualitative
● subjective
● brief
● concrete
75. The Schedule for Affective Disorders and Schizophrenia is an example of:
76. The specific objective of a utility analysis will dictate what sort of information will be required, as
well as the specific:
77. If a new test was developed to assist a college in selecting applicants, which group of test-takers
should ideally be administered the test items developed for use during the item tryout phase of the
new test’s development?
● all high school juniors who are college-bound
● seniors in high school who were accepted to the college on the basis of criteria other than
the test under development
● college students who put a hold on their academic studies in order to backpack through Europe
for 1 year or more
● freshmen in college admitted who had taken one or more advanced placement courses in high
school
78. Which of the following is LEAST likely to be used in pre-employment screening of job applicants
for unskilled positions in a large corporation?
● application blanks
● letters of recommendation
● aptitude measures
● interviews
80. Which of the following represents a problem unique to self-report personality tests?
● The reading ability of respondents may prevent them from responding accurately to items.
● Respondents may be too “low” on the construct being measured to register on the test.
● All of these.
● Respondents might be unwilling to reveal something negative about themselves.
82. A psychologist working in a mental hospital is interested in predicting suicide risk from three
variables: severity of current depression, duration of current depression, and number of previous
suicide attempts. The psychologist gathers information on these variables for 100 subjects and
wants to combine the information in such a way that suicide risk is most accurately predicted. An
appropriate statistical technique would be:
84. In evaluating a child who is described as “fidgety, restless, and impulsive,” what type of drawing
on the Machover Draw-A-Person Test could that child reasonably be expected to produce?
85. In the language of psychological testing and assessment, scoring refers to assigning evaluative
numbers, codes or statements to performance on:
● Interviews
● Tasks
● Tests
● All of these
88. The higher the item-difficulty index, the ________ the item.
● more robust
● less robust
● harder
● easier
89. If all scores in a set of test scores were the same, the variance would be equal to:
● None of these
● one
● two
● zero
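Item 89 can be checked directly: variance is the average squared deviation from the mean, and identical scores have no deviations. The scores below are hypothetical values chosen only for illustration.

```python
import statistics

# Hypothetical set of test scores in which every score is the same.
scores = [80, 80, 80, 80, 80]

# Every deviation from the mean is zero, so the variance is zero too.
variance = statistics.pvariance(scores)
print(variance)
```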
90. A test developer designs a test for the sole purpose of identifying the most highly skilled
individuals among those tested. During the test revision stage of test development, the test
developer will be particularly interested in:
● item reliability
● item bias
● item validity
● item discrimination
● a list of guidelines for a standardized test used to ensure that all test-takers are similar in key
ways to the population of the original standardization sample.
● a statistical procedure in which weights are assigned to each item of a model test to maximize
predictive validity.
● a previously developed test with known validity that can be used as a comparison for newly
developed tests.
● a model for scoring and a mechanism for resolving scoring discrepancies.
92. A key difference between concurrent and predictive validity has to do with
● None of these.
● the time frame during which data on the criterion measure is collected.
● the magnitude of the reliability coefficient that will be considered significant at the .05 level.
● the magnitude of the validity coefficient that will be considered significant at the .05 level.
93. A large corporation scrupulously avoids any possibility of discrimination and adverse impact in its
hiring practices. The selection procedure it probably has in place with regard to its entry-level test
is one that entails:
94. As a result of the normalization of the standard scores on the MMPI-2, a T score of 70 on the
Depression Scale and a T score of 70 on Hypomania Scale will indicate that:
96. According to Vroom's expectancy theory of motivation, employees will expend energy:
99. The standard error of measurement of a particular test of anxiety is 8. A student earns a score of
60. What is the confidence interval for this test score at the 95% level?
● 40-68
● 44-76
● 36-84
● 52-60
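The interval in item 99 follows from the observed score plus or minus z times the standard error of measurement. The sketch below assumes the conventional z = 1.96 for the 95% level (textbooks often round this to 2); the function name confidence_interval is illustrative.

```python
def confidence_interval(score: float, sem: float, z: float = 1.96):
    """Return (lower, upper) bounds around an observed test score."""
    margin = z * sem
    return score - margin, score + margin


low, high = confidence_interval(score=60, sem=8)
print(round(low), round(high))
```

With a score of 60 and an SEM of 8, the bounds round to 44 and 76.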
100. Using a cut score of 50 on a predictor test a researcher finds a base rate of 1.00. This means
that when a cut score of 50 is used:
● 100% of applicants will perform successfully on a criterion measure.
● 50% of applicants will fail on a criterion measure.
● 100% of applicants will fail on a criterion measure
● 50% of applicants will perform successfully on a criterion measure.
● could have been expected to occur by chance alone 99 times or more in 100.
● has a 99% chance of being accurate.
● accounts for about 1% of the variance.
● could have been expected to occur by chance alone one time or less in 100.
102. The Myers-Briggs Type Indicator (MBTI) is based on the theoretical writings of:
● Holland Opus
● Carl Jung
● B. F. Skinner
● Sigmund Freud
104. The Otis-Lennon School Ability Test yields which of the following composite scores?
107. If the results of an examination are negatively skewed, the exam questions were likely:
● Difficult
● Biased
● Easy
● Quite novel in many respects
108. Forced-choice item formats are typically employed to control which of the following?
109. The plethysmograph is perhaps MOST useful in the assessment and treatment of:
● sexual offenders
● cardiac arrest
● people suffering from migraine
● agitated depression
110. Which statement is TRUE of the test tryout phase of test construction?
111. The statistical tool that is ideally suited for making selection decisions within the framework of a
compensatory model is:
● utility analysis.
● multiple regression.
● expectancy data.
● the Brogden-Cronbach-Gleser formula.
112. An aptitude test that includes both psychomotor and paper-and-pencil tasks is the:
113. A review of a new personality test is published in a journal. In that review, it would be
reasonable to expect to find information about:
115. To ensure that a test developed for national use is indeed suitable for national use, test
developers:
● All of these.
● post sample items on the Web to gauge response of different groups.
● have a culturally representative panel of experts review test items.
● employ a culturally representative group of examiners.
116. A group of researchers was interested in learning whether a newly developed exam would be
useful in determining whether a student will be successful in college. The researchers designed a
study in which students took the new exam prior to entering college. Later, each student took another
exam, which was designed to measure how much information they had learned during their first
year. The score on this exam was then correlated with the student's score on the newly
developed exam. What type of validity was being evaluated in the study?
● Predictive
● Divergent
● Discriminant
● Concurrent
117. If there is common ground among all of the varied approaches to psychological testing and
assessment, that common ground may MOST have to do with the assessor's:
● interrater reliability.
● test-retest reliability.
● noneconomic costs
● incremental validity.
119. If an ANOVA yields a significant F value, you could rely on ________ to test for significant differences
between group means.
121. A counselor created an achievement test with a reliability coefficient of .82. The test was
shortened since many clients felt it was too long. The counselor shortened the test but logically
assumed that the reliability coefficient would now
● remain at .82.
● be at least 10 points higher or lower.
● be lower than .82.
● be approximately .88.
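The reasoning behind item 121 can be illustrated with the Spearman-Brown formula, which estimates reliability after a test's length is changed by a factor n. The item does not say how much the test was shortened, so the halving used below (n = 0.5) is purely an assumption for illustration, as is the function name spearman_brown.

```python
def spearman_brown(r: float, n: float) -> float:
    """Estimated reliability when test length is multiplied by n."""
    return (n * r) / (1 + (n - 1) * r)


original = 0.82
shortened = spearman_brown(original, n=0.5)  # assumed: test cut in half
print(f"{shortened:.2f}")
```

Under that assumption the estimate drops to roughly .69. For any n below 1 and a positive r, the formula yields a value lower than the original coefficient.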
● the work of Cronbach, and later, the work of Brogden and Gleser.
● the team of Brogden, Cronbach, and Gleser working together.
● the work of Brogden and later, the work of Cronbach and Gleser.
● Brogden, Cronbach, and Gleser, each working independently.
123. On the Wechsler tests of intelligence, the Full Scale IQ has a mean of __ and a standard
deviation of ___
● 50; 10
● 100; 16
● 50; 15
● 100; 15
124. Ideally, the first draft of a test should include at least how many items as compared with the
final version of the test?
125. A researcher was interested in whether or not jazz vocals and opera influence men's and
women's emotional states. She hypothesized that these types of music influence men and
women differently. In a study investigating this hypothesis, 40 men and 40 women heard a jazz
piece, and 40 men and 40 women heard an operatic piece. The jazz piece was sung by a man,
and the operatic piece was sung by a woman. Afterward, participants rated themselves on an
inventory measuring emotional state. Higher scores on the inventory indicate a positive mood.
Results of this study are presented in the graph below:
The researcher concludes from her study that jazz music positively changes men's moods and
operatic music positively changes women's moods. Which of the following invalidates that
conclusion?
● Previous studies have shown that men are less emotional than women.
● Men and women were randomly assigned to groups.
● Only one scale was used to measure mood.
● Men's and women's moods were not measured before exposure to the two types of
music.
128. If someone tells you that they took a test and received scores on scales called School Problems,
Conduct Problems, and Immaturity, it is a good bet that this person was administered:
● the NEO-PI-R.
● the MMPI-2-RF.
● the School Problems Checklist.
● the MMPI-A.
129. The task of sorting statement cards from least descriptive to most descriptive is most characteristic
of which type of assessment method?
● Semantic differential rating technique
● Forced-choice
● T-sort
● Q-sort
130. Depression is more common among people with insomnia than among those with satisfactory sleep.
To determine the reasons for this relationship, investigators identified 40 people suffering from both
depression and insomnia. For each of these 40, they paired two other people of the same gender and age
who were neither depressed nor suffering from any sleep disorder. One of these was designated the
"normal-sleep control," and the other was designated the "yoked control." All participants slept in a
laboratory for one week. The normal-sleep control person slept without restrictions. During that same
time, the yoked control was permitted to sleep when the depressed-insomniac person slept but was
required to awaken whenever the depressed-insomniac was awakened. A valid questionnaire for
measuring depression was administered at the end of the one-week study. Assume that higher scores on
the questionnaire reflect greater depressive symptomatology. Suppose that the results were consistent
with the hypothesis that sleeplessness does NOT lead to depression. Of the following, which would be the
most serious criticism of the study and its conclusion?
131. Individuals who lack awareness, insight, and the ability to recognize problems are referred to as
● A. agnostic.
● B. anosognostic.
● C. amnesiac.
● D. anhedonic.
132. CARRYOVER
133. Which of the following tests employed by the Army during World War I was probably the most
"culture-fair"?
● the Army Beta
● the Armed Services Vocational Aptitude Battery (ASVAB)
● the Army Alpha Test
● the Army General Classification Test (AGCT)
134. In the acronym presented in text for remembering what each of the variables in the APGAR
measures, the letter G in APGAR stands for:
A 16-year-old male suspected of drug abuse is referred for neuropsychological evaluation. Which tool of
assessment is LEAST likely to be used?
A: measures of creative thinking
Depression is more common among people with insomnia than among those with satisfactory sleep. To
determine the reasons for this relationship, investigators identified 40 people suffering from both
depression and insomnia. For each of these 40, they paired two other people of the same gender and age
who were neither depressed nor suffering from any sleep disorder. One of these was designated the
"normal-sleep control," and the other was designated the "yoked control." All participants slept in a
laboratory for one week. The normal-sleep control person slept without restrictions.
During that same time, the yoked control was permitted to sleep when the depressed-insomniac person
slept but was required to awaken whenever the depressed-insomniac was awakened.
A valid questionnaire for measuring depression was administered at the end of the one-week study.
Assume that higher scores on the questionnaire reflect greater depressive symptomatology.
What pattern of results on the depression questionnaire would one expect if depression were to arise for
reasons other than sleeplessness?
A. Yoked control < normal sleep control = depressed
B. Normal sleep control = yoked control < depressed
C. Normal sleep control < yoked control = depressed
D. Yoked control < normal sleep control < depressed
Sally and Bob both apply for a position as an accounting clerk. The Human Resource (HR) professional
responsible for selecting the best candidate for the position administers a standardized test of basic
mathematical skills to both Sally and Bob. Based on their scores, the HR professional chooses Sally. The
reason for this choice is that Sally has an 85% chance of performing at an acceptable level. By contrast,
Bob's score indicated that he only had a 50% chance of performing successfully. The tool of assessment
used to make this hiring decision was:
A. good, old-fashioned intuition.
B. the method of predictive yield.
C. a Taylor-Russell table.
D. an expectancy table.
Which ethical issue is particularly relevant when assessing substance abusers in the context of research?
A. the issue characterized by the phrase, "First, do no harm"
B. the issue of cultural sensitivity
C. the issue of informed consent
D. the issue of right to treatment
A counselor created an achievement test with a reliability coefficient of .82. The test was shortened since
many clients felt it was too long. The counselor shortened the test but logically assumed that the
reliability coefficient would now
A. remain at .82.
B. be at least 10 points higher or lower.
C. be lower than .82.
D. be approximately .88.
An aptitude test that includes both psychomotor and paper-and-pencil tasks is the:
A. Bennett Mechanical Comprehension Test
B. O'Connor Tweezer Dexterity Test
C. General Aptitude Test Battery
D. Minnesota Clerical Test.
The 7-Minute Screen was developed to identify symptoms associated with which of the following?
A. Alzheimer's disease
B. personality disorders
C. seizure disorders
D. All of these.
Depression is more common among people with insomnia than among those with satisfactory sleep. To
determine the reasons for this relationship, investigators identified 40 people suffering from both
depression and insomnia. For each of these 40, they paired two other people of the same gender and age
who were neither depressed nor suffering from any sleep disorder. One of these was designated the
"normal-sleep control," and the other was designated the "yoked control." All participants slept in a
laboratory for one week. The normal-sleep control person slept without restrictions.
During that same time, the yoked control was permitted to sleep when the depressed-insomniac person
slept but was required to awaken whenever the depressed-insomniac was awakened.
A valid questionnaire for measuring depression was administered at the end of the one-week study.
Assume that higher scores on the questionnaire reflect greater depressive symptomatology.
What pattern of results on the depression questionnaire would justify the conclusion that sleeplessness
leads to depression?
A. Normal sleep control = yoked control < depressed
B. Yoked control < normal sleep control = depressed
C. Normal sleep control < yoked control = depressed
D. Yoked control < normal sleep control < depressed
A test score or index derived from the combination of, and/or mathematical transformation of, one or more
subtest scores is known as a:
A. standard score
B. test composite
C. test requisite
D. scaled score
The Myers-Briggs Type Indicator (MBTI) is based on the theoretical writings of:
A. Holland Opus
B. Sigmund Freud
C. Carl Jung
D. B. F. Skinner
Traditional measures of reliability are inappropriate for criterion-referenced tests because variability:
A. is variable with criterion-referenced tests.
B. is minimized with criterion-referenced tests.
C. cannot be determined with criterion-referenced tests.
D. is maximized with criterion-referenced tests.
The lower test-retest reliability coefficients found to exist for state anxiety when compared with higher
test-retest reliability coefficients obtained for trait anxiety support which premise?
A. None of these.
B. Traits are more enduring personality characteristics than states.
C. Exhibition of anxiety is very situation-dependent.
D. States are more enduring personality characteristics than traits.
You wish to determine if the student you are evaluating scored higher on a mathematics test than on a
reading test. What statistic(s) would you calculate?
A. the raw score on each test as well as the mean of each distribution
B. the standard error of measurement for each test score
C. the standard error of the difference between two scores
D. the mean of each distribution and index of test difficulty for each test
"Multiple predictors may be used so that applicants must meet or exceed the cut score for each predictor
before moving to the next round of the selection process." What process is being described?
A. multiple hurdle selection
B. top down selection.
C. compensatory model of selection.
D. known-groups selection
Forced-choice item formats are typically employed to control which of the following?
A. None of these.
B. test-takers' tendency toward impression management
C. interviewers' tendency to stray from the points of focus
D. the reliability of respondent's response patterns
You are interested in developing a test for social adjustment in a college fraternity or sorority. You begin
by interviewing persons who had graduated from college after having been a member of a fraternity or
sorority for at least 2 years. Which stage of test development best describes the one that you are in?
A. the test revision stage
B. the test construction stage
C. the test-tryout stage
D. the pilot work stage
A "real time, live action" approach to assessment that requires the assessees to demonstrate abilities that
typically are characteristic of those they might encounter on-the-job is referred to as:
A. authentic assessment
B. portfolio assessment
C. performance assessment.
D. curriculum-based assessment
Psychological testing:
A. is characteristically broader in scope than assessment.
B. tends to be less accurate than assessment.
C. is typically more lengthy than assessment.
D. may be one component of the process of assessment.
Melody exclaims, "I got a C- on the statistics exam, and I was miserable until I thought how terrible it must
be for those who got F's." Melody's attitude is an example of which of the following?
A. social learning
B. social anxiety
C. social comparison
D. social validation
The higher the item-reliability index:
A. the higher the internal consistency of the test.
B. the more likely the test developer is to eliminate the item.
C. the lower the internal consistency of the test.
D. the more likely the test-taker is to miss the item.
Which of the following is the best way to establish rapport with a test-taker?
A. shaking hands with the test-taker on arrival to the facility
B. presenting the test-taker with a business card
C. a few words of "small talk" on meeting
D. playing calming and soothing music prior to testing
Who is credited with being the originator of the psychometric concept of test reliability?
A. Spearman
B. Kraepelin
C. Pearson
D. Titchener
During the course of a mental status examination, the examiner asks the examinee: "Do you know why
we are here and why I am interviewing you today?" By raising these questions, the examiner is MOST
likely trying to assess the examinee's:
A. orientation.
B. verbal abilities.
C. intellectual resources.
D. insight.
Using a cut score of 50 on a predictor test a researcher finds a base rate of 1.00. This means that when a
cut score of 50 is used:
A. 50% of applicants will fail on a criterion measure.
B. 100% of applicants will perform successfully on a criterion measure.
C. 100% of applicants will fail on a criterion measure.
D. 50% of applicants will perform successfully on a criterion measure.
If an instructor assigns a grade of "A" to all students who earn 900 or more points out of a total of 1000
points during the semester, 900 represents:
A. the selection ratio.
B. the base rate of A-level students.
C. the success rate.
D. the cut score for an A
The specific objective of a utility analysis will dictate what sort of information will be required, as well as
the specific:
A. Rise-and-Shine tables to be used.
B. expectancy tables to be used.
C. methods to be used.
D. Naylor-Shine tables to be used.
A neuropsychologist blindfolds a patient and then moves the patient's arms and legs in various positions.
However, the patient cannot identify where his limbs are located. The neuropsychologist would MOST
likely suspect that the patient has suffered damage to the:
A. frontal lobe
B. occipital lobe
C. temporal lobe
D. parietal lobe
The integration of data from statistical procedures, empirical methods, and formal rules to formulate
descriptions and make predictions is referred to as:
A. actuarial prediction
B. formal prediction
C. empirical prediction
D. clinical prediction
The Hand-Tool Dexterity Test and the O'Connor Tweezer Dexterity Test would most likely be used by an
employer interested in:
A. finding the best employee for a position on an assembly line.
B. understanding a worker's motivation to respond quickly with accuracy.
C. assessing a worker's ability to physically manipulate materials.
D. increasing profit margins by lowering expenses.
The type of research that attempts to replicate a real-world problem in a research or clinical setting is
called:
A. analogue research.
B. research with unobtrusive measures.
C. a case history approach to research.
D. the sign approach to research.
A test score or index derived from the combination of, and/or mathematical transformation of, one or more
subtest scores is known as a:
A. standard score
B. test composite
C. scaled score
D. test requisite
Poor performance on the Block Design and other performance subtests of the Wechsler scales along with
high scores on the Verbal subtests would be suggestive of:
A. a "deterioration quotient" (DQ) of 10 or more.
B. possible damage in the right hemisphere of the brain.
C. severe head trauma.
D. possible damage in the left hemisphere of the brain.
What is the correlation coefficient of choice when two variables are ordinal?
A. Chi square
B. Pearson r
C. Spearman rho
D. Cronbach alpha
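For review, a rank-order correlation can be computed by hand. The sketch below uses the rank-difference formula, which assumes both variables are already expressed as ranks with no ties. The function name and the two sets of ranks are made up for illustration.

```python
def rank_order_correlation(x_ranks, y_ranks):
    """Rank-difference formula: 1 - 6 * sum(d^2) / (n * (n^2 - 1)).
    Assumes both inputs are ranks with no ties."""
    n = len(x_ranks)
    d_squared = sum((x - y) ** 2 for x, y in zip(x_ranks, y_ranks))
    return 1 - (6 * d_squared) / (n * (n ** 2 - 1))

# Hypothetical ranks assigned to five examinees by two raters.
rater_a = [1, 2, 3, 4, 5]
rater_b = [2, 1, 4, 3, 5]
rho = rank_order_correlation(rater_a, rater_b)
print(round(rho, 2))  # 0.8
```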
In order to make norms for a certain test more appropriate for use with test-takers from Taiwan, data from
the original standardization sample of a test is supplemented with Taiwanese norms. In this instance, it
would be:
A. inaccurate to continue to use the terms "standardization sample" and "normative sample"
interchangeably with reference to the test.
B. necessary to reevaluate the wording of the test's items in order to make certain that test-takers in
Taiwan do not find any of the items offensive in any way.
C. desirable to integrate the Taiwanese norms into the original norms so that both norms could be
referred to as the "standardization sample" norms with reference to this test.
D. perfectly appropriate to continue to use the terms "standardization sample" and "normative sample"
interchangeably with reference to this test.
Which of the following tests employed by the Army during World War I was probably the most
"culture-fair"?
A. the Army Beta
B. the Armed Services Vocational Aptitude Battery (ASVAB)
C. the Army General Classification Test (AGCT)
D. the Army Alpha Test
Which tool of psychological assessment is MOST likely to be used after tests have been administered to
a patient in order to evaluate that patient's level of premorbid functioning?
A. the interview
B. All of these.
C. behavioral observation
D. the case history
A psychologist working in a mental hospital is interested in predicting suicide risk from three variables:
severity of current depression, duration of current depression, and number of previous suicide attempts.
The psychologist gathers information on these variables for 100 subjects and wants to combine the
information in such a way that suicide risk is most accurately predicted. An appropriate statistical
technique would be:
A. incremental validity analysis.
B. meta-analysis.
C. multiple regression.
D. simple regression.
Traditional measures of reliability are inappropriate for criterion-referenced tests because variability:
A. is variable with criterion-referenced tests.
B. cannot be determined with criterion-referenced tests.
C. is minimized with criterion-referenced tests.
D. is maximized with criterion-referenced tests.
Dr. Chen is interested in feminist attitudes of young adult women in the United States. Consequently, she
administered a feminist attitude questionnaire to a total of 100 young adult women from three universities.
The 100 women tested and the number of young adult women in the United States are which of the
following, respectively?
A. Random assignment and random selection
B. Effect size and population
C. Independent and dependent variables
D. Sample and population
A student scores very high on a graduate school admission test and is admitted to graduate school,
largely on the basis of his test score. The student subsequently flunks out. The type of test outcome
described in this situation is known as a:
A. true negative
B. positive hit
C. false positive
D. false negative
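The four possible outcomes of a selection decision can be laid out as a small lookup. This is only a sketch: the function name and the convention that "positive" means "predicted to succeed" are assumptions made for this example.

```python
def classify_outcome(predicted_success: bool, actual_success: bool) -> str:
    """Label a selection decision, treating 'positive' as
    'the test predicted success on the criterion'."""
    if predicted_success and actual_success:
        return "true positive"
    if predicted_success and not actual_success:
        return "false positive"
    if not predicted_success and actual_success:
        return "false negative"
    return "true negative"

# A high test score (predicted success) followed by failure on the criterion.
print(classify_outcome(predicted_success=True, actual_success=False))
```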
To ensure that a test developed for national use is indeed suitable for national use, test developers:
A. employ a culturally representative group of examiners.
B. post sample items on the Web to gauge response of different groups.
C. have a culturally representative panel of experts review test items.
D. All of these.
The plethysmograph is perhaps MOST useful in the assessment and treatment of:
A. agitated depression
B. people suffering from migraine
C. sexual offenders
D. cardiac arrest
A researcher was interested in whether or not jazz vocals and opera influence men's and women's
emotional states. She hypothesized that these types of music influence men and women differently. In a
study investigating this hypothesis, 40 men and 40 women heard a jazz piece, and 40 men and 40
women heard an operatic piece. The jazz piece was sung by a man, and the operatic piece was sung by
a woman. Afterward, participants rated themselves on an inventory measuring emotional state. Higher
scores on the inventory indicate a positive mood. Results of this study are presented in the graph below:
The researcher concludes from her study that jazz music positively changes men's moods and operatic
music positively changes women's moods.
Which of the following invalidates that conclusion?
A. Only one scale was used to measure mood.
B. Previous studies have shown that men are less emotional than women.
C. Men and women were randomly assigned to groups.
D. Men's and women's moods were not measured before exposure to the two types of music.
Question 1
A 16-year-old male suspected of drug abuse is referred for neuropsychological evaluation. Which tool of
assessment is LEAST likely to be used?
● referral for blood and urine tests
● measures of creative thinking
● familial medical history data
● case history data including school records
Question 2
The Trail Making Tests are part of which neuropsychological test battery?
● Halstead-Reitan Neuropsychological Battery
● Luria-Nebraska Neuropsychological Battery
● Woodcock-Johnson III
● Kaufman Assessment Battery
Question 3
Item branching refers to:
● administering certain test items on a test depending on the test-takers’ responses to previous test
items.
● statistical efforts to ensure that items translated into foreign languages are of the same difficulty.
● reusing items in an original test that were originally developed for use in a parallel test.
● the creation of alternate and parallel forms of tests based on a group of test-takers’ responses to
the original test.
Question 4
A psychologist wishes to compare the performances of an experimental group and a control group on a
continuous measure. Which of the following would be the most typical way to make this comparison?
● computing a multiple correlation
● conducting a t-test on the two means
● computing single correlation coefficient
● conducting a chi-square test
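A comparison of two group means on a continuous measure can be sketched with the pooled-variance independent-samples t statistic. The version below assumes equal population variances and independent groups. The function name and the scores are hypothetical, and no p-value is computed.

```python
import math
from statistics import mean, variance

def independent_t(group_a, group_b):
    """Pooled-variance t statistic for two independent groups."""
    n_a, n_b = len(group_a), len(group_b)
    pooled_var = ((n_a - 1) * variance(group_a)
                  + (n_b - 1) * variance(group_b)) / (n_a + n_b - 2)
    standard_error = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean(group_a) - mean(group_b)) / standard_error

# Hypothetical scores for an experimental and a control group.
experimental = [1, 2, 3, 4, 5]
control = [2, 3, 4, 5, 6]
t_value = independent_t(experimental, control)
print(round(t_value, 2))  # -1.0
```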
Question 5
On the Wechsler tests of intelligence, the Full Scale IQ has a mean of ________________ and a
standard deviation of _______________.
● 100; 15
● 100; 16
● 50; 15
● 50; 10
Question 6
An educational psychologist conducts a utility analysis of a teaching program used to improve the
handwriting of very young children. The measure of utility in this analysis will most likely be:
● reduction in accidents.
● increase in performance level.
● decrease in costs.
● increase in revenue.
Question 7
The Schedule for Affective Disorders and Schizophrenia is an example of:
● a projective test.
● an unstructured clinical interview.
● an objective personality test.
● a structured clinical interview.
Question 8
A self-report rating scale of neurological impairment is:
● the Neuropsychological Impairment Scale.
● the Seashore Rating Scale.
● the Short Portable Mental Status Questionnaire.
● the Patient’s Assessment of Own Functioning Scale.
Question 9
A psychologist who does not act in the same or similar way that other reasonable psychologists would
have acted under the same or similar circumstances may be found liable for:
● negligence.
● abuse.
● incompetency.
● malpractice.
Question 10
Users of psychological tests are frequently tempted to treat ordinal data as if it were interval data. This is
the case because of the:
● difficulties that would be encountered if the data were treated as ratio data.
● frequent need to do more than simply rank order test scores.
● added flexibility of interval level data for statistical manipulation.
● unwritten rule that exists pertaining to the equal intervals between points measured.
Question 11
A patient is administered the Minnesota Multiphasic Personality Inventory-2-RF (MMPI-2-RF) by an
experienced clinician. The clinician concludes that the patient has schizophrenia. The clinician’s diagnosis
best supports which of the following additional conclusions?
● The clinician’s interpretation of the MMPI-2-RF findings is based on knowledge of projective
testing.
● The patient’s pattern of responses to the MMPI-2-RF resembles that of people who are known to
have schizophrenia.
● The patient received a high score on the lie scale of the MMPI-2-RF.
● A brief interview with the patient would reveal that the patient harbors delusions of grandeur.
Question 12
In the acronym presented in text for remembering what each of the variables in the APGAR measures,
the letter G in APGAR stands for:
● grimace
● good muscle tone
● glucose
● genetic inheritance
Question 13
In ipsative scoring, a test-taker’s scores are compared only to:
● the scores of other test-takers from past years who have taken the same test under the same or
similar conditions.
● the scores of other test-takers from the same geographic area who are similar with regard to key
variables such as gender.
● his or her other scores on a parallel form of the same test.
● his or her other scores on the test.
Question 14
The mental status examination used as part of the neuropsychological evaluation:
● will typically delve into specific areas of interest more extensively than the one used as part of a
clinical or counseling assessment.
● typically includes the administration of a neuropsychologically oriented adjective checklist.
● is exactly the same as a mental status examination used during a clinical or counseling
assessment.
● typically includes the administration of an intelligence test.
Question 15
A patient exhibits deficits in word recall, vocabulary, and finding words to name things. A
neuropsychologist would be MOST likely to diagnose this patient with:
● limbic system
● parietal lobes
● occipital lobes
● frontal lobes
● anomia (I looked this up) (2)
Question 16
As a result of the normalization of the standard scores on the MMPI-2, a T score of 70 on the Depression
Scale and a T score of 70 on Hypomania Scale will indicate that:
● the scores will be significantly different depending on the gender of the test-taker.
● the two T scores equal the same level of clinical elevation.
● the two scores result in different percentile ranks for each scale.
● neither score is significantly elevated.
Question 17
If a patient suddenly begins to experience extremes in mood ranging from blunted affect to emotional
outbursts, a neuropsychologist would suspect damage to the:
● cerebellum
● spinal cord
● occipital lobe
● limbic system
Question 18
A group of researchers was interested in learning whether a newly developed exam would be useful in
determining whether a student will be successful in college. The researchers designed a study in which
students took the new exam prior to entering college. Later, the students took another exam, which was designed
to measure how much information they had learned during their first year. The score on this exam was
then correlated with the student’s score on the newly developed exam. What type of validity was being
evaluated in the study?
● Concurrent
● Divergent
● Predictive
● Discriminant
Question 19
A researcher was interested in whether or not jazz vocals and opera influence men's and women's
emotional states. She hypothesized that these types of music influence men and women differently. In a
study investigating this hypothesis, 40 men and 40 women heard a jazz piece, and 40 men and 40
women heard an operatic piece. The jazz piece was sung by a man, and the operatic piece was sung by
a woman. Afterward, participants rated themselves on an inventory measuring emotional state. Higher
scores on the inventory indicate positive mood. Results of this study are presented in the graph below:
Which of the following describes the pattern of findings displayed in the graph?
● Women who heard the jazz piece and men who heard the operatic piece scored higher on the
mood inventory than those in the other two groups.
● Men who heard the jazz piece and women who heard the operatic piece scored higher on the
mood inventory than those in the other two groups.
● Men scored higher than women on the mood inventory regardless of the type of music they
heard.
● Women scored higher than men on the mood inventory regardless of the type of music they
heard.
Question 20
The standard deviation of a sample of test scores is a measure of the
● normality of the distribution
● central tendency of scores
● concurrent validity of the test
● variability of individual scores
Question 21
Critics have argued that projective tests are too
● qualitative
● concrete
● subjective
● brief
Question 22
The Myers-Briggs Type Indicator (MBTI) is based on the theoretical writings of:
● Holland Opus
● Sigmund Freud
● Carl Jung
● B. F. Skinner
Question 23
A Director of Human Resources is setting up a series of tests to use to select applicants for sales
positions. Inherent in the tests, and applied in the model of selection, is the Director’s assumption that
high sales ability can make up for limited product knowledge. The model of selection being applied could
BEST be characterized as:
● a multiple hurdle model of selection.
● the method of predictive yield in action.
● a compensatory model of selection.
● the method of contrasting group for selection.
Question 24
Detailed information regarding how a particular test was developed is typically found in:
● the current test catalogue distributed by the test’s publisher.
● the Standards for Educational and Psychological Tests.
● the test manual
● a review of the test published in a journal.
Question 25
If someone tells you that they took a test and received scores on scales called School Problems, Conduct
Problems, and Immaturity, it is a good bet that this person was administered:
● the NEO-PI-R.
● the MMPI-A.
● the MMPI-2-RF.
● the School Problems Checklist.
Question 26
Validity is to ____________ as utility is to ____________.
● consistency; accuracy
● usefulness; accuracy
● usefulness; consistency
● accuracy; usefulness
Question 27
Sally and Bob both apply for a position as an accounting clerk. The Human Resource (HR) professional
responsible for selecting the best candidate for the position administers a standardized test of basic
mathematical skills to both Sally and Bob. Based on their scores, the HR professional chooses Sally. The
reason for this choice is that Sally has an 85% chance of performing at an acceptable level. By contrast,
Bob’s score indicated that he only had a 50% chance of performing successfully. The tool of assessment
used to make this hiring decision was:
● the method of predictive yield.
● an expectancy table.
● good, old-fashioned intuition.
● a Taylor-Russell table.
Question 28
A researcher conducted a study to determine the effects of gender and status on the perceived credibility
of an eyewitness testifying in a trial. Participants watched one of four video recordings depicting the
eyewitness and rated the credibility of the eyewitness.
What type of design was used in this study?
● between-subjects
● between- and within-subjects
● within-subjects
● multivariate correlational
Question 29
Which of the following is most appropriate for determining the psychometric soundness of behavioral
assessment?
● the experimental analysis of behavior
● generalizability theory
● classical test theory
● empirical methods
Question 30
A key difference between concurrent and predictive validity has to do with:
● the magnitude of the reliability coefficient that will be considered significant at the .05 level.
● the time frame during which data on the criterion measure is collected.
● the magnitude of the validity coefficient that will be considered significant at the .05 level.
● None of these.
Question 31
As compared to more traditional, one-on-one and face-to-face assessments, a disadvantage of CAPA is
that it typically deprives the assessor of the opportunity to:
● have a Bowflex workout during the assessment.
● tailor the test’s content to the responses.
● make certain that test forms can be kept secure.
● observe the test-taker’s test-taking behavior.
Question 32
If a test-taker earns a z score of +2 on a test, approximately how many other test-takers obtained higher
scores, assuming the distribution of test scores is normal?
● 2.5%
● 25%
● 16%
● 14%
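The area above a given z score can be checked with the standard normal distribution, assuming scores are exactly normal. The helper name below is hypothetical; `NormalDist` comes from Python's standard `statistics` module.

```python
from statistics import NormalDist

def proportion_above(z: float) -> float:
    """Proportion of a standard normal distribution lying above z."""
    return 1 - NormalDist().cdf(z)

upper_tail = proportion_above(2.0)
print(round(upper_tail, 4))  # 0.0228
```

The exact figure is about 2.3%; the familiar 68-95-99.7 rule of thumb rounds it to roughly 2.5%.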
Question 33
The plethysmograph is perhaps MOST useful in the assessment and treatment of:
● people suffering from migraine
● sexual offenders
● agitated depression
● cardiac arrest
Question 34
Which model of intelligence guided the development of the fourth edition of the Stanford-Binet Intelligence
Scale?
● the Cattell-Horn theory
● Gardner’s theory of multiple intelligences
● the Cattell-Horn-Carroll (CHC) model
● Spearman’s two-factor theory
Question 35
Wakefield’s definition of a disorder includes:
● all of these
● an assumption that an evolutionary failure occurred.
● a value judgment regarding the basic goodness of people.
● a strong belief in herbal remedies for treatment.
Question 36
The specific objective of a utility analysis will dictate what sort of information will be required, as well as
the specific:
● methods to be used.
● Naylor-Shine tables to be used.
● expectancy tables to be used.
● Rise-and-Shine tables to be used.
Question 37
The thalamus acts as:
● a brake on emotional impulses and a calming influence when one is angered.
● visual-spatial sequencer for perceiving complex patterns of movement.
● a communications relay station for sensory information being transmitted to the cerebral cortex.
● an executive controller for volitional motor movements.
Question 38
Research on the Psychopathy Checklist suggests that it is useful in:
● identifying criminal recidivists.
● predicting violence within prisons.
● predicting crimes committed by inmates.
● identifying a psychotic disorder.
Question 39
Exploratory factor analysis is used for all of the following EXCEPT:
● summarizing large data sets efficiently.
● determining the number of dimensions present in the data.
● determining which items correlate with which dimensions in the data.
● determining whether one factor causes the appearance of another.
Question 40
In the language of psychological testing and assessment, scoring refers to assigning evaluative numbers,
codes or statements to performance on:
● Tasks
● Interviews
● All of these
● Tests
Question 41
Question 42
An assessor determines that John is incompetent to stand trial. This means that John:
Question 43
Which statement is TRUE regarding the definition of personality?
Question 45
Which of the following is the best way to establish rapport with a test-taker?
● Presenting the test-taker with a business card
● Playing calming and soothing music prior to testing
● A few words of “small talk” on meeting
● Shaking hands with the test-taker on arrival to the facility
Question 46
Which BEST describes what is typically measured in personality assessment?
● Creativity and motivation
● Personal values
● Social communication skills
● Traits and states
Question 47
A psychologist working in a mental hospital is interested in predicting suicide risk from three variables:
severity of current depression, duration of current depression, and number of previous suicide attempts.
The psychologist gathers information on these variables for 100 subjects and wants to combine the
information in such a way that suicide risk is most accurately predicted. An appropriate statistical
technique would be:
● Meta-analysis.
● Multiple regression.
● Simple regression.
● Incremental validity analysis.
Question 48
A significant, positive relationship exists between scores on a new test of intelligence and scores on the
fourth edition of the Stanford-Binet intelligence scale. These data may be viewed as supportive of which
type of validity evidence for the new test?
● Convergent evidence of construct validity
● Content validity
● Discriminant evidence of construct validity
● Criterion-related validity
Question 49
The deviation IQ reflects a comparison of the performance of the individual with the performance of
others:
● In the entire standardization sample.
● In the same grade in the standardization sample.
● Of the same age in the standardization sample.
● In the same grade and of the same age in the standardization sample.
Question 50
You are interested in developing a test for social adjustment in a college fraternity or sorority. You begin
by interviewing persons who had graduated from college after having been a member of a fraternity or
sorority for at least 2 years. Which stage of test development best describes the one that you are in?
● The test construction stage
● The test revision stage
● The pilot work stage
● The test-tryout stage
● Single factor within subjects
Question 51
If John earns a full-scale IQ of 90 on the WISC-IV:
● John correctly answered 90 questions.
● John correctly answered 90% of the questions.
● John scored at the low end of the average range.
● Ninety percent of the students in John’s age group scored lower than John on this test.
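To see where a deviation IQ falls relative to age peers, the score can be converted to a percentile. The sketch assumes a normal distribution with a mean of 100 and a standard deviation of 15; the function name is hypothetical.

```python
from statistics import NormalDist

def iq_percentile(iq: float, mean: float = 100.0, sd: float = 15.0) -> float:
    """Proportion of the reference distribution scoring below the given IQ,
    assuming normally distributed scores."""
    return NormalDist(mu=mean, sigma=sd).cdf(iq)

percentile_90 = iq_percentile(90)
print(round(percentile_90 * 100))  # 25
```

Under these assumptions, an IQ of 90 sits near the 25th percentile, so it is not the same thing as 90% of items correct or a 90th-percentile standing.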
Question 52
A test score or index derived from the combination of, and/or mathematical transformation of, one or more
subtest scores is known as a:
● standard score
● scaled score
● test requisite
● test composite
Question 53
As part of the test development process, a test revision may entail:
● The reprinting of a test.
● Rewording, deletion, or development of new items; and, development of a new edition of a test.
● Rewording, deletion, or development of new items.
● Development of a new edition of a test.
Question 54
What is the correlation coefficient of choice when two variables are ordinal?
● Chi square
● The Spearman rho
● Cronbach alpha
● Pearson r
Question 55
In a distribution that is symmetrical, which of the following is true?
● The distances from Q2 and Q3 to the median are the same.
● The distances from Q1 and Q2 to the median are the same.
● The distances from Q1 and Q3 to the median are the same.
● The distances from Q1 and Q4 to the median are the same.
Question 56
In a study of a new psychopharmacological treatment for clinical depression, 40 participants diagnosed
with depression each received four (4) different amounts of a new medication called Deplow. The first
week, they were given a placebo. During the second week of the study, they took 1 mg. of Deplow each
day. During the third week, they took 3 mg. of Deplow each day, and during the fourth week, they took 5
mg. of Deplow each day. Although the participants took different amounts of the medication each week,
they were not informed about the amount they were taking. The participants also completed a depression
symptom checklist at the end of each week. Results are presented below. The score on the checklist
could range from 0 to 30 indicating severe depression. Assume statistical significance for differences
greater than 3.0. What type of design was used in this study?
Question 57
A “good” test item on an ability test is one:
● To which almost all test-takers respond incorrectly.
● That distinguishes high scorers from low scorers.
● To which almost all test-takers respond correctly.
● In which it is absolutely impossible to guess the correct answer.
Question 58
Kate received a z score of 1 on a reading test. What do we know about Kate’s performance, assuming
that the reading test scores are distributed normally?
● She scored better than only 2/3 of the other students.
● She scored worse than 84% of other students.
● She scored better than 84% of other students.
● She scored worse than only 2/3 of other students.
Question 59
If the results of an examination are negatively skewed, the exam questions were likely:
● Biased
● Easy
● Difficult
● Quite novel in many respects
Question 60
Criterion validity of the General Aptitude Test Battery (GATB) tends to be low, probably because of:
● A limitation of the Taylor-Russell tables.
● The low test-retest reliability of the GATB.
● The low reliability of supervisory ratings.
● Scoring that is based in part on the race of the test-taker.
Question 61
The Brogden-Cronbach-Gleser formula was developed by:
● Brogden, Cronbach, and Gleser, each working independently.
● The team of Brogden, Cronbach, and Gleser working together.
● The work of Brogden and later, the work of Cronbach and Gleser.
● The work of Cronbach, and later, the work of Brogden and Gleser.
Question 62
If an ANOVA yields a significant F value, you could rely on _______ to test significant differences
between group means.
● Duncan’s multiple-range, Tukey’s, or Scheffe’s test.
● Percentile rank.
● Summative or formative evaluation.
● One- and two-tailed t tests.
Question 63
Citing only positive attributes in a self-report measure of personality is a phenomenon referred to as:
● Amplifying
● Projecting
● Self-deception
● Socially desirable responding
Question 64
The greater the magnitude of the item-discrimination index:
● The more people in the lower-scoring group answered the item correctly as compared with those
in the higher-scoring group.
● The more reliable the test.
● The more valid the test.
● The more people in the higher-scoring group answered the item correctly as compared with those
in the lower-scoring group.
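The item-discrimination index can be illustrated with a short calculation. The sketch assumes the usual definition d = (U - L) / n, where U and L are the numbers of test-takers in the upper- and lower-scoring groups who answered the item correctly and n is the size of each group. The counts are invented.

```python
def item_discrimination(upper_correct: int, lower_correct: int, group_size: int) -> float:
    """d = (U - L) / n, with equal-sized upper and lower scoring groups."""
    return (upper_correct - lower_correct) / group_size

# Hypothetical item: 20 of 32 high scorers and 16 of 32 low scorers pass it.
d_index = item_discrimination(20, 16, 32)
print(d_index)  # 0.125
```

A negative value would mean more low scorers than high scorers answered the item correctly.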
Question 65
Which tool of psychological assessment is MOST likely to be used after tests have been administered to
a patient in order to evaluate that patient’s level of premorbid functioning?
● The case history
● The interview
● Behavioral observation
● All of these.
Question 66
If a test-taker is asked to name familiar objects, write familiar words, and follow verbal instructions on a
test that takes 15 minutes or less, he or she would most likely be taking:
● The Aphasia Screening Test.
● The Halstead-Reitan Neuropsychological Battery.
● The Wechsler Memory Scale.
● The Wisconsin Card Sorting Test.
Question 67
An administration of the Montreal Neuropsychological Institute Battery entails the administration of:
● The Wisconsin Card Sorting Test.
● The Wechsler Intelligence Test.
● All of these
● The Mooney Faces Test.
Question 68
In a normal distribution of scores, approximately what percentage of test scores falls between +1 and –1
standard deviations from the mean?
● 66%
● less than 1%
● 50%
● 75%
Question 69
Which of the following statements is TRUE of the role of personality measures in industrial/organizational
psychology?
● The distinction between task-related and people-related aspects of a job is irrelevant to
personality measurement.
● Personality tests are playing less and less a role in I/O settings with every passing year.
● The same personality test may not be equally suited for use with every job.
● The MMPI-2-RF has quickly become the most widely used measure of personality in I/O settings.
Question 70
If a time limit is long enough to allow test-takers to attempt all items, and if some items are so difficult that
no test-taker is able to obtain a perfect score, then the test is referred to as a ____ test.
● power
● valid
● reliable
● speed
Question 71
● All of these.
● Respondents may be too "low" on the construct being measured to register on the test.
● Respondents might be unwilling to reveal something negative about themselves.
● The reading ability of respondents may prevent them from responding accurately to items.
Question 72
The integration of data from statistical procedures, empirical methods, and formal rules to formulate
descriptions and make predictions is referred to as:
● clinical prediction
● formal prediction
● empirical prediction
● actuarial prediction
Question 73
● test-retest reliability
● internal consistency
● All of these
● stability
Question 74
Dr. Chen is interested in feminist attitudes of young adult women in the United States. Consequently, she
administered a feminist attitude questionnaire to a total of 100 young adult women from three universities.
The 100 women tested and the number of young adult women in the United States are which of the
following, respectively?
Question 75
Behavioral assessment has many advantages over other forms of assessment. Which is NOT one of
those advantages?
Question 76
The higher the item-difficulty index, the ____ the item.
● more robust
● harder
● easier
● less robust
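The item-difficulty index in Question 76 is conventionally the proportion of test-takers who answer the item correctly. The sketch below assumes that definition and uses made-up response data (1 = correct, 0 = incorrect); the function name `item_difficulty` is hypothetical.

```python
def item_difficulty(responses: list[int]) -> float:
    """Proportion of test-takers answering the item correctly (1 = correct)."""
    return sum(responses) / len(responses)

# Hypothetical data: item A is passed by most test-takers, item B by few
item_a = [1, 1, 1, 0, 1]
item_b = [0, 0, 1, 0, 0]

print(item_difficulty(item_a))  # 0.8
print(item_difficulty(item_b))  # 0.2
```

Under this definition, a larger index means more people passed the item.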
Question 77
During the course of a mental status examination, the examiner asks the examinee: "Do you know why
we are here and why I am interviewing you today?" By raising these questions, the examiner is MOST
likely trying to assess the examinee's:
● intellectual resources
● insight
● verbal abilities
● orientation
Question 78
The task of sorting statement cards from least descriptive to most descriptive is most characteristic of
which type of assessment method?
● Forced-choice
● T-sort
● Q-sort
● Semantic differential rating technique
Question 79
Aphasia patients suffer the loss of the ability to:
● perceive smell.
● hold their hands steady.
● express themselves orally or in writing.
● perceive sounds lower in volume than a "dollar watch."
Question 80
Melody exclaims, "I got a C- on the statistics exam, and I was miserable until I thought how terrible it must
be for those who got F's." Melody's attitude is an example of which of the following?
● social anxiety
● social comparison
● social learning
● social validation
Question 81
The Otis-Lennon School Ability Test yields which of the following composite scores?
● Verbal Performance Composite
● O-L Composite
● Full Scale IQ
● School Ability Index
Question 82
A test developer designs a test for the sole purpose of identifying the most highly skilled individuals
among those tested. During the test revision stage of test development, the test developer will be
particularly interested in:
● item bias
● item reliability
● item validity
● item discrimination
Question 83
The 16 PF exemplifies which approach to assessment?
● impressionistic
● nomothetic
● projective
● idiographic
Question 84
An example of a personality test that employed empirical criterion keying in its development is
the:
● MMPI
● 16 PF
● NEO-PI-R
● Rorschach
Question 85
Psychometrics may best be defined as:
● the science of psychological measurement.
● the study of psychic phenomena.
● the science of test development.
● the study and use of correlational techniques.
Question 86
Which of the following increases the power of a statistical test?
● changing from a two-tailed to a one-tailed test
● changing alpha from .05 to .01
● using a smaller critical area in the distribution of sample means
● decreasing the sample size from N = 100 to N = 75
Question 87
Research by Solomon Asch supports which of the following?
● Individuals will follow orders to shock innocent strangers.
● Higher levels of conformity are found in individualistic societies than in collectivistic societies.
● The presence of one dissenter in a group is not strong enough to reduce conformity.
● Conformity increases as group size increases from two people to four or five people.
Question 88
If a new test was developed to assist a college in selecting applicants, which group of test-takers should
ideally be administered the test items used during the item tryout phase of the new test's
development?
● college students who put a hold on their academic studies in order to backpack through Europe
for 1 year or more
● seniors in high school who were accepted to the college on the basis of criteria other than the test
under development
● freshmen in college admitted who had taken one or more advanced placement courses in high
school
● all high school juniors who are college-bound
Question 89
Psychoeducational test batteries are designed to measure:
● academic motivation
● adjustment and personality
● ability and achievement
● scholastic aptitude
Question 90
A researcher was interested in whether or not jazz vocals and opera influence men's and women's
emotional states. She hypothesized that these types of music influence men and women differently. In a
study investigating this hypothesis, 40 men and 40 women heard a jazz piece, and 40 men and 40
women heard an operatic piece. The jazz piece was sung by a man, and the operatic piece was sung by
a woman. Afterward, participants rated themselves on an inventory measuring emotional state. Higher
scores on the inventory indicate a positive mood. Results of this study are presented in the graph below:
[Figure: Average mood inventory scores of men and women by music type]
Which of the following is the most serious problem with the methodology of this research?
● Only one type of music should have been used.
● The sample size was too small to draw a valid conclusion.
● Men and women did not listen to both types of music. (isn't it this one?)
● The singers were not the same gender.
Question 91
If a psychologist determines that a client is a danger to others, that psychologist has a legal obligation to:
● keep the information privileged.
● warn the person who is in danger.
● share this information with a colleague before taking any action.
● seek a legal opinion from a lawyer.
Question 92
Which question might a mental health professional be MOST likely to be asked during the course of a civil
proceeding?
● Is this individual competent to stand trial?
● All of these.
● To what extent did this individual suffer emotional distress?
● Was this individual sane at the time the crime was committed?
Question 93
A 40-item vocabulary test was administered to a group of students. A second, similar test of vocabulary
terms was administered to this same group of students approximately one week later. The researcher
reported that the correlation between these two tests was r = .90. What type of reliability is represented in
this example?
● split-half
● alternate forms
● test-retest
● inter-rater
Question 94
In a study of a new psychopharmacological treatment for clinical depression, 40 participants diagnosed
with depression each received four (4) different amounts of a new medication called Deplow. The first
week, they were given a placebo. During the second week of the study, they took 1 mg. of Deplow each
day. During the third week, they took 3 mg. of Deplow each day, and during the fourth week, they took 5
mg. of Deplow each day. Although the participants took different amounts of the medication each week,
they were not informed about the amount they were taking. The participants also completed a depression
symptom checklist at the end of each week. Results are presented below. The score on the checklist
could range from 0 to 30, with 30 indicating severe depression. Assume statistical significance for differences
greater than 3.0.
Which of the following would make it difficult to conclude that any decrease in depressive symptoms is
due to Deplow and not to other aspects of the study?
● The lack of comparison with an established antidepressant medication
● The increasing doses of Deplow
● The low sample size
● The lack of a control group
Question 95
When a cut score is set based on norm-related considerations rather than on the relationship of test
scores to a criterion, it is known as:
● an absolute cut score.
● a referential cut score.
● a fixed cut score.
● a relative cut score.
Question 96
Which of the following is NOT an assumption of utility analysis?
● psychological tests are always preferred over other means of assessment.
● the value of people and their performance can be estimated.
● large amounts of information can be integrated to make good decisions.
● the performance of people in organizations can affect organizational viability.
Question 97
A researcher was interested in whether or not jazz vocals and opera influence men's and women's
emotional states. She hypothesized that these types of music influence men and women differently. In a
study investigating this hypothesis, 40 men and 40 women heard a jazz piece, and 40 men and 40
women heard an operatic piece. The jazz piece was sung by a man, and the operatic piece was sung by
a woman. Afterward, participants rated themselves on an inventory measuring emotional state. Higher
scores on the inventory indicate positive mood. Results of this study are presented in the graph below:
The researcher concludes from her study that jazz music positively changes men's moods and operatic
music positively changes women's moods. Which of the following invalidates that conclusion?
● Previous studies have shown that men are less emotional than women.
● Men's and women's moods were not measured before exposure to the two types of music.
● Only one scale was used to measure mood.
● Men and women were randomly assigned to groups
Question 98
It is one of the vital tools of psychological assessment which pertains to how consistently and accurately a
psychological test measures what it purports to measure.
● Utility
● Reliability
● Psychometric Soundness
● Inter-item Consistency
Question 99
Who is credited with being the originator of the psychometric concept of test reliability?
● Spearman
● Pearson
● Kraepelin
● Titchener
Question 100
If there is common ground among all of the varied approaches to psychological testing and
assessment, that common ground may MOST have to do with the assessor's:
● strict adherence to ethical guidelines.
● psychoanalytically-based interpretation of findings.
● All of these
● use of an ability test and a test of personality.
Question 101
The lower test-retest reliability coefficients found to exist for state anxiety when compared with higher
test-retest reliability coefficients obtained for trait anxiety support which premise?
● None of these.
● Traits are more enduring personality characteristics than states.
● States are more enduring personality characteristics than traits.
● Exhibition of anxiety is very situation-dependent.
Question 102
If a student's performance on a newly developed math achievement test is compared with his or her
recent performance on another achievement test known to measure math skills, this would be an example
of _______ validity.
● predictive criterion-related
● concurrent criterion-related
● content
● construct
Question 103
The mental status examination used as part of the neuropsychological evaluation:
● is exactly the same as a mental status examination used during a clinical or counseling
assessment.
● will typically delve into specific areas of interest more extensively than the one used as part of a
clinical or counseling assessment.
● typically includes the administration of an intelligence test.
● typically includes the administration of a neuropsychologically oriented adjective checklist
Question 104
Which is NOT a typical question that is raised and answered during the test conceptualization stage of
test development?
● Is there a need for the test?
● How valid are the items on the test?
● What is the objective of the test?
● What types of responses will be required of the test-taker?
Question 105
The standard error of measurement of a particular test of anxiety is 8. A student obtains a score of 60. What is the
confidence interval for this test score at the 95% level?
● 44-76
● 40-68
● 36-84
● 52-68
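The interval asked for in Question 105 follows from the usual formula: observed score plus or minus z times the standard error of measurement. The sketch below assumes z = 1.96 for the 95% level; many textbooks round this to 2 SEM, which gives whole-number bounds. The function name `confidence_interval` is illustrative.

```python
def confidence_interval(score: float, sem: float, z: float = 1.96) -> tuple[float, float]:
    """Return the (lower, upper) bounds of score +/- z * SEM."""
    margin = z * sem
    return (score - margin, score + margin)

# Values from the question: observed score 60, SEM 8
low, high = confidence_interval(60, 8)
print(f"{low:.2f} to {high:.2f}")  # prints 44.32 to 75.68

# With the common textbook rounding of z to 2
low2, high2 = confidence_interval(60, 8, z=2)
print(f"{low2:.0f} to {high2:.0f}")  # prints 44 to 76
```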
Question 106
The Children's Apperception Test (CAT) depicts ____ in its pictures.
● humans interacting with animals
● animals
● humans
● dolls and puppets
Question 107
Psychologists who are called on by the courts to render an opinion regarding a person's sanity must be
prepared to:
● undergo an examination evaluating their own mental status prior to being admitted to the
courtroom.
● have their license revoked if their opinions regarding the sanity of the defendant are at odds with
that of the court.
● deal with all the ramifications of the fact that diagnoses of "sanity" and "insanity" are ultimately left
to a judge or a jury to decide.
● explain all the ramifications of the fact that "sanity" and "insanity" are psychological, and not legal
terms.
Question 108
In the language of psychological testing and assessment, reliability refers to:
● the proportion of total variance that can be attributed to true variance.
● the lack of systematic errors.
● whether or not a test publisher consistently publishes high quality instruments.
● how well a test measures what it is intended to measure under specified conditions.
Question 109
A norm group is a group of test-takers:
● for whom a particular test is deemed appropriate.
● that is typically described in the test manual
● taking a particular test for the very first time.
● for whom a particular test is deemed inappropriate
Question 110
The term "RIASEC" is BEST associated with
● Strong
● Wonderlic
● Holland
● Guttman
Question 111
The 7-Minute Screen was developed to identify symptoms associated with which of the following?
● personality disorders
● seizure disorders
● All of these.
● Alzheimer's disease
Question 112
A student scores very high on a graduate school admission test and is admitted to graduate school,
largely on the basis of his test score. The student subsequently flunks out. The type of test outcome
described in this situation is known as a:
● positive hit.
● false negative.
● false positive.
● true negative.
Question 113
Forced-choice item formats are typically employed to control which of the following?
● None of these.
● the reliability of respondents' response patterns
● test-takers' tendency toward impression management
● interviewers' tendency to stray from the points of focus
Question 114
A self-report rating scale of neurological impairment is:
● The Short Portable Mental Status Questionnaire.
● the Seashore Rating Scale.
● The Neuropsychological Impairment Scale.
● the Patient's Assessment of Own Functioning Scale
Question 115
An anchor protocol is:
● a list of guidelines for a standardized test used to ensure that all test-takers are similar in key
ways to the population of the original standardization sample.
● a model for scoring and a mechanism for resolving scoring discrepancies.
● a previously developed test with known validity that can be used as a comparison for newly
developed tests.
● a statistical procedure in which weights are assigned to each item of a model test to maximize
predictive validity.
Question 116
In the administration of the TAT:
● all stimulus cards are presented to all subjects
● a minimum of ten cards must be presented.
● the number of cards presented is left to examiner discretion
● a maximum of twenty cards is presented
Question 117
In order to make norms for a certain test more appropriate for use with test-takers from Taiwan, data from
the original standardization sample of a test is supplemented with Taiwanese norms. In this instance, it
would be:
● necessary to reevaluate the wording of the test's items in order to make certain that test-takers in
Taiwan do not find any of the items offensive in any way.
● inaccurate to continue to use the terms "standardization sample" and "normative sample"
interchangeably with reference to the test.
● perfectly appropriate to continue to use the terms "standardization sample" and "normative
sample" interchangeably with reference to this test
● desirable to integrate the Taiwanese norms into the original norms so that both norms could be
referred to as the "standardization sample norms" with reference to this test.
Question 118
The strongest psychometric aspect of the Rorschach is its:
● test-retest reliability over a short period of time.
● interrater reliability with respect to scoring categories
● interrater reliability with respect to interpretations.
● internal-consistency split-half reliability for odd and even items
Question 119
The method of paired comparisons is used to:
● provide test-takers with a sufficient number of pairs of choices to express their "true" opinions.
● provide test-takers with a limited number of pairs of choices in order to lessen testing time.
● maximize the opportunity of selecting a socially desirable response
● minimize the opportunity of selecting a socially desirable response
Question 120
To ensure that a test developed for national use is indeed suitable for national use, test developers:
● All of these.
● post sample items on the Web to gauge response of different groups
● have a culturally representative panel of experts review test items.
● employ a culturally representative group of examiners
Question 121
The Likert scale is an example of which type of rating scale?
● paired methods
● summative
● content
● categorical
Question 122
Hit rate is equivalent to:
● the miss rate/the selection ratio.
● the success rate/base rate of successful performance
● number of correct classifications /total number of classifications
● the base rate/the selection ratio
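The hit rate named in Question 122 can be illustrated with a small calculation over the four possible outcomes of a selection decision. This is a sketch under the common definition of hit rate as the proportion of classifications that turn out to be correct; the counts below are invented for illustration and the function name `hit_rate` is hypothetical.

```python
def hit_rate(true_pos: int, true_neg: int, false_pos: int, false_neg: int) -> float:
    """Proportion of all classifications that were correct."""
    correct = true_pos + true_neg
    total = true_pos + true_neg + false_pos + false_neg
    return correct / total

# Hypothetical outcome counts for 100 applicants
print(hit_rate(true_pos=40, true_neg=35, false_pos=15, false_neg=10))  # 0.75
```

The complementary miss rate would be the false positives plus false negatives over the same total.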
Question 123
Which of the following tests employed by the Army during World War I was probably the most
"culture-fair"?
● the Army Alpha Test
● the Armed Services Vocational Aptitude Battery (ASVAB)
● the Army General Classification Test (AGCT)
● the Army Beta
Question 124
An intelligence test originally written in English is to be administered to a group of Japanese immigrants
who do not speak English. In order to obtain an accurate measure of intelligence and completely
eliminate any possible effects due to language, the test administrator should:
● have a teacher fluent in Japanese and English conduct a brief tutorial in English prior to
administering the test in English, with specific attention given to the meaning of the wording of
key items and corresponding responses.
● None of these.
● have a professional translator read the test to the group, simultaneously translating the items
word for word.
● have a friend or family member of the group who is fluent in English and Japanese read the test to
the group, simultaneously translating the items word for word.
Question 125
The Beck Depression Inventory-II measures the test-takers' feelings over what period of time?
● The last month
● 3 weeks
● 1 week
● 2 weeks
Question 126
Which of the following statements is TRUE of the role of personality measures in industrial/organizational
psychology?
● Personality tests are playing less and less of a role in I/O settings with every passing year.
● The distinction between task-related and people-related aspects of a job is irrelevant to
personality measurement.
● The MMPI-2-RF has quickly become the most widely used measure of personality in I/O settings.
● The same personality test may not be equally suited for use with every job.
Question 127
A counselor created an achievement test with a reliability coefficient of .82. The test is shortened since
many clients felt it was too long. The counselor shortened the test but logically assumed that the reliability
coefficient would now
● remain at .82.
● be at least 10 points higher or lower
● be lower than .82.
● be approximately .88.
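The effect of changing test length on reliability, as in Question 127, is usually estimated with the Spearman-Brown prophecy formula. The sketch below starts from a reliability of .82 as in the question; the length factors (halving and doubling) are assumptions for illustration, since the question does not say how much the test was shortened.

```python
def spearman_brown(reliability: float, length_factor: float) -> float:
    """Predicted reliability when test length is multiplied by length_factor."""
    return (length_factor * reliability) / (1 + (length_factor - 1) * reliability)

original = 0.82

# Assumed example: the test is cut to half its original length
print(f"{spearman_brown(original, 0.5):.2f}")  # prints 0.69

# For contrast: doubling the test length
print(f"{spearman_brown(original, 2.0):.2f}")  # prints 0.90
```

Any length factor below 1 yields a predicted coefficient lower than the original, all else being equal.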
An applicant for a job with the U.S. Postal Service scores in the bottom 5% of all applicants on a test that
measures the ability to sort mail. This is an example of:
a. norm-referenced assessment. (comparing that persons score to a normative sample)
b. criterion-referenced assessment.
c. behavioral assessment.
d. an individual who may one day "go postal."
A third-grade student who earned a grade-equivalent score of 5.0 on a standardized test of mathematics:
a. has the same mathematics ability as the average fifth-grade student in that same school.
b. should not be enrolled in a fifth-grade math class.
c. performed similarly to a hypothetical fifth-grade student.
d. will most probably earn a grade of A in the course.
Which type of reliability estimate would be appropriate only when evaluating the reliability of a test that
measures a trait that is relatively stable over time?
a. parallel-forms
b. alternate-forms
c. test-retest
d. split-half
An estimate of test-retest reliability is often referred to as a coefficient of stability when the time interval
between the test and retest is more than:
a. 30 days.
b. 60 days.
c. 3 months.
d. 6 months.
Which term is used to refer to the tendency of a rater to evaluate a person higher than they objectively
deserve because of the rater's inability to discriminate between aspects of the person's behavior?
a. halo effect
b. random error
c. generosity error
d. severity error
Although there are some exceptions, in practice, most reliability coefficients, regardless of the specific
type of reliability they are measuring, range in value from:
a. -1 to +1
b. 0 to 100
c. 0 to 1.
d. negative infinity to positive infinity
Which of the following concepts is synonymous with utility as used in the text?
a. consistency
b. truthfulness
c. usefulness
d. accuracy
Which is an example of the use of a completion format on a test? (all other answers are selection format)
a. true-false items
b. matching items
c. short-answer items
d. multiple-choice items
The higher an item-validity index (items correspond to what the test says it measures), the greater the
__________ validity.
a. construct
b. content
c. criterion
d. face
Which is an example of a false positive in the context of employee selection?
a. hired applicants who scored at or above the cut-off score on the employment test went on to fail on
the job
b. hired applicants who scored at or above the cut-off score on the employment test went on to
succeed on the job
c. rejected applicants who scored below the cut-off score on the employment test were rejected but
would have gone on to succeed on the job had they been given a chance
d. rejected applicants who scored below the cut-off score on the employment test were rejected but
went on to succeed at another, totally different job.
On a particular test, men and women tend to have the same total score. Men and women do, however,
tend to exhibit different response patterns to specific items. A reasonable conclusion is that the test is:
a. unreliable.
b. invalid.
c. biased.
d. patently unfair.
In Guttman scaling:
a. test-takers are presented with a forced-choice format
b. each item is completely independent of every other item and nothing can be concluded as the
result of the endorsement of an item.
c. when one item is endorsed by a test-taker, the less extreme aspects of that item are also
endorsed.
d. when more than one item tapping a particular content area is endorsed, the less extreme aspects
of those items are also endorsed.
a. .001; 1.00
b. -1; +1
c. 0%; 100%
d. 1 to 100
a. a stem.
b. a distractor.
c. a foil.
d. All of these
a. item reliability.
b. item validity.
c. item difficulty.
Test items that contain alternatives with five points ranging from "strongly agree" to "strongly disagree"
are characterized as using this approach to scaling:
a. Guttman scaling.
b. Likert scaling.
c. Nielson scaling.
d. opinion scaling.
Multiple hurdles as used in a decision-making process regarding a selection decision refers to:
a. the use of two or more cut scores with reference to one predictor for the purpose of categorizing
test-takers.
b. the multiple stages each applicant must successfully complete in order to get to the next stage in
the evaluation process.
c. the obstacles to success placed before each of the contestants on Project Runway.
d. All of these
Shelly applies for a job at a company that gives all applicants a drug test during the hiring process.
Despite the fact that Shelly smokes marijuana almost daily, the company's test report indicates that
she is drug-free. In this case:
Why might a clinician interview a culturally different client about cultural aspects of his or her life?
A. to develop hypotheses about the intelligence and personality of the interviewee
B. to distinguish psychopathological behavior from that which is more typical of the culture of the
interviewee
C. to differentiate between what constitutes psychopathology in the majority culture from what constitutes
psychopathology in the client's culture
D. to sample new foods
In creating a test designed to measure personality constructs, the test developer's first step would BEST
be to
Henry A. Murray is the author of a "personology" theory of personality and is perhaps best associated
with:
A. the proportion of people the test correctly identifies as possessing a particular trait, behavior,
characteristic, or attribute
B. the proportion of people in the general population who possess the particular trait, behavior,
characteristic, or attribute
C. the proportion of people the test incorrectly identifies as possessing a particular trait, behavior,
characteristic, or attribute
D. the degree of validity of a particular test
The term used to describe the proportion of people in a population who are distinctive due to their
exhibition of a particular trait is
A. success rate.
B. base rate.
C. target rate.
D. cut rate.
A. construct validity.
B. criterion-related validity.
C. content validity.
D. All of these