
Psych Ass

Uploaded by

Krazzky TV
Copyright
© All Rights Reserved

PSYCHOLOGICAL ASSESSMENT PRE-TEST ITEMS REVIEWER

GREEN - SURE
RED - NOT SURE

1. Depression is more common among people with insomnia than among those with satisfactory
sleep. To determine the reasons for this relationship, investigators identified 40 people suffering
from both depression and insomnia. For each of these 40, they paired two other people of the
same gender and age who were neither depressed nor suffering from any sleep disorder. One of
these was designated the "normal-sleep control," and the other was designated the "yoked
control." All participants slept in a laboratory for one week. The normal-sleep control person slept
without restrictions. During that same time, the yoked control was permitted to sleep when the
depressed-insomniac person slept but was required to awaken whenever the
depressed-insomniac was awakened. A valid questionnaire for measuring depression was
administered at the end of the one-week study. Assume that higher scores on the questionnaire
reflect greater depressive symptomatology. What pattern of results on the depression
questionnaire would justify the conclusion that sleeplessness leads to depression?

● Normal sleep control = yoked control < depressed


● Normal sleep control < yoked control = depressed
● Yoked control < normal sleep control = depressed
● Yoked control < normal sleep control < depressed

2. Which of the following is the best way to establish rapport with a test-taker?

● presenting the test-taker with a business card


● a few words of “small talk” on meeting
● playing calming and soothing music prior to testing
● shaking hands with the test-taker on arrival to the facility

3. A researcher conducted a study to determine the effects of gender and status on the perceived
credibility of an eyewitness testifying in a trial. Participants watched one of four video recordings
depicting the eyewitness and rated the credibility of the eyewitness. In order to determine whether
gender, as a specific variable, had an effect on perceived credibility of the eyewitness, which of
the following must be significant?

● a post hoc analysis of gender


● the main effect of gender
● the interaction between gender and status
● the main effect of status

4. The hit rate is equivalent to:

● the success rate/base rate of successful performance.


● number of correct classifications/total number of classifications.
● the base rate/the selection ratio.
● the miss rate/the selection ratio.

5. Which BEST describes what is typically measured in personality assessment?

● personal values
● creativity and motivation
● traits and states
● social communication skills

6. Which of the following is most appropriate for determining the psychometric soundness of
behavioral assessment?

● classical test theory


● the experimental analysis of behavior
● empirical methods
● generalizability theory

7. The NEPSY is to the Luria-Nebraska Neuropsychological Battery as the:

● Halstead-Reitan is to the Luria-Nebraska.


● WISC-III is to the WAIS-III.
● Tower of Hanoi is to the Bender.
● Trail Making Test is to the Porteus Maze Test.

8. Dr. Chen is interested in feminist attitudes of young adult women in the United States.
Consequently, she administered a feminist attitude questionnaire to a total of 100 young adult
women from three universities. The 100 women tested and the number of young adult women in
the United States are which of the following, respectively?

● Independent and dependent variables


● Random assignment and random selection
● Effect size and population
● Sample and population

9. The Beck Depression Inventory-II measures the test-takers’ feelings over what period of time?

● 1 week
● 2 weeks
● 3 weeks
● the last month

10. According to Roberts and DelVecchio (2000), trait consistency tends to increase until one is
between _________ years of age, at which time it peaks.

● 10 and 20
● 30 and 40
● 50 and 60
● 40 and 50
11. A psychologist who does not act in the same or similar way that other reasonable psychologists
would have acted under the same or similar circumstances may be found liable for:

● abuse.
● incompetency.
● malpractice.
● negligence.

12. In a normal distribution of scores, approximately what percentage of test scores falls between +1
and –1 standard deviations from the mean?

● less than 1%
● 75%
● 50%
● 66%
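
A quick numeric check of item 12, as a minimal sketch: the share of a normal distribution lying within one standard deviation of the mean can be computed from the standard normal CDF. The variable names are illustrative assumptions. The theoretical value is about 68%, so the listed options should be read as approximations.

```python
from statistics import NormalDist

# Standard normal distribution: mean 0, standard deviation 1.
standard_normal = NormalDist(mu=0, sigma=1)

# Area under the curve between z = -1 and z = +1.
share_within_one_sd = standard_normal.cdf(1) - standard_normal.cdf(-1)

print(f"{share_within_one_sd:.1%}")  # 68.3%
```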

13. The Likert scale is an example of which type of rating scale?

● paired methods
● content
● summative
● categorical

14. The semantic differential rating technique consists of:

● a 7-point numerical scale with numbers keyed to descriptors.


● a forced-choice format.
● an alphabetical 7-point rating scale with letters keyed to descriptors.
● bipolar adjectives and a 7-point rating scale.

15. The lower test-retest reliability coefficients found to exist for state anxiety when compared with
higher test-retest reliability coefficients obtained for trait anxiety support which premise?

● None of these.
● States are more enduring personality characteristics than traits.
● Traits are more enduring personality characteristics than states.
● Exhibition of anxiety is very situation-dependent.

16. If a psychologist determines that a client is a danger to others, that psychologist has a legal
obligation to:

● warn the person who is in danger.


● keep the information privileged.
● seek a legal opinion from a lawyer.
● share this information with a colleague before taking any action.

17. Aphasia patients suffer the loss of the ability to:


● express themselves orally or in writing.
● hold their hands steady.
● perceive sounds lower in volume than a “dollar watch.”
● perceive smell.

18. If an instructor assigns a grade of “A” to all students who earn 900 or more points out of a total of
1000 points during the semester, 900 represents:

● the base rate of A-level students.


● the cut score for an A.
● the selection ratio.
● the success rate.

19. A 40-item vocabulary test was administered to a group of students. A second similar test of
vocabulary terms was administered to this same group of students approximately one week later.
The researcher reported that the correlation between these two tests was r = .90. What type of
reliability is represented in this example?

● split-half
● test-retest
● alternate forms
● inter-rater

20. Research by Solomon Asch supports which of the following?

● Conformity increases as group size increases from two people to four or five people.
● The presence of one dissenter in a group is not strong enough to reduce conformity.
● Higher levels of conformity are found in individualistic societies than in collectivistic societies.
● Individuals will follow orders to shock innocent strangers.

21. The Work Preference Inventory (WPI) is designed to assess:

● intrinsic and extrinsic motivation.


● active managerial style as distinct from passive managerial style.
● interest in intellectually demanding work as opposed to routine, nondemanding work.
● ability to work with people and ability to work alone.

22. The term “RIASEC” is BEST associated with:

● Guttman
● Strong
● Holland
● Wonderlic

23. A recent article in an educational journal described a university at which the average age is 26.
This article also mentioned that 38 percent of the students are over 25 years of age. What can be
concluded from this information?

● The median age must be greater than the mean age.


● The standard deviation must be relatively small.
● The distribution must be skewed.
● The distribution must be bimodal.

24. Which of the following statements is TRUE of the role of personality measures in
industrial/organizational psychology?

● Personality tests are playing less and less a role in I/O settings with every passing year.
● The distinction between task-related and people-related aspects of a job is irrelevant to
personality measurement.
● The MMPI-2-RF has quickly become the most widely used measure of personality in I/O settings.
● The same personality test may not be equally suited for use with every job.

25. A neuropsychologist blindfolds a patient and then moves the patient’s arms and legs in various
positions. However, the patient cannot identify where his limbs are located. The
neuropsychologist would MOST likely suspect that the patient has suffered damage to the:

● temporal lobe
● frontal lobe
● parietal lobe
● occipital lobe

26. What is the difference between achievement and aptitude tests?

● Aptitude tests are more limited in scope than achievement tests.


● Aptitude tests are not used to make predictions about future performance, whereas achievement
tests are used for this purpose.
● Aptitude tests draw on a broader fund of knowledge than achievement tests.
● Aptitude tests require skills that are formally taught in school, and achievement tests
require skills that are learned informally.

27. Who is credited with being the originator of the psychometric concept of test reliability?

● Kraepelin
● Pearson
● Titchener
● Spearman

28. In the language of psychological testing and assessment, reliability refers to:

● the proportion of total variance that can be attributed to true variance.


● how well a test measures what it is intended to measure under specified conditions.
● the lack of systematic errors.
● whether or not a test publisher consistently publishes high quality instruments.

29. In order to make norms for a certain test more appropriate for use with test-takers from Taiwan,
data from the original standardization sample of a test is supplemented with Taiwanese norms. In
this instance, it would be:
● necessary to reevaluate the wording of the test’s items in order to make certain that test-takers in
Taiwan do not find any of the items offensive in any way.
● perfectly appropriate to continue to use the terms “standardization sample” and “normative
sample” interchangeably with reference to this test.
● desirable to integrate the Taiwanese norms into the original norms so that both norms could be
referred to as the “standardization sample” norms with reference to this test.
● inaccurate to continue to use the terms “standardization sample” and “normative sample”
interchangeably with reference to the test.

30. As part of the test development process, a test revision may entail:

● the reprinting of a test.


● rewording, deletion, or development of new items.
● development of a new edition of a test.
● rewording, deletion, or development of new items; and, development of a new edition of a
test.

31. You wish to determine if the student you are evaluating scored higher on a mathematics test than
on a reading test. What statistic(s) would you calculate?

● the standard error of measurement for each test score


● the standard error of the difference between two scores
● the raw score on each test as well as the mean of each distribution
● the mean of each distribution and index of test difficulty for each test
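
For item 31, comparing two scores earned by the same person is usually done with the standard error of the difference, which combines the two tests' standard errors of measurement. The sketch below is only an illustration: the SEM values and scores are hypothetical, both tests are assumed to be reported on the same score scale, and the function name is made up.

```python
import math

def standard_error_of_difference(sem_a, sem_b):
    """Combine two standard errors of measurement: sqrt(SEM_a^2 + SEM_b^2)."""
    return math.sqrt(sem_a ** 2 + sem_b ** 2)

# Hypothetical SEMs for the mathematics and reading tests (same score scale).
sem_math, sem_reading = 3.0, 4.0
se_diff = standard_error_of_difference(sem_math, sem_reading)

# Hypothetical scores; the gap is compared with 1.96 * SE_diff (95% level).
math_score, reading_score = 110, 98
is_reliable_difference = abs(math_score - reading_score) > 1.96 * se_diff

print(se_diff)                 # 5.0
print(is_reliable_difference)  # True
```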

32. A norm group is a group of test-takers:

● that is typically described in the test manual.


● for whom a particular test is deemed appropriate.
● for whom a particular test is deemed inappropriate.
● taking a particular test for the very first time.

33. An example of a personality test that employed empirical criterion keying in its development is
the:

● NEO-PI-R
● MMPI
● 16 PF
● Rorschach

34. Which question might a mental health professional be MOST likely to be asked during the course
of a civil proceeding?

● To what extent did this individual suffer emotional distress?


● Was this individual sane at the time the crime was committed?
● All of these.
● Is this individual competent to stand trial?
35. The higher the item-reliability index:

● the more likely the test developer is to eliminate the item.


● the lower the internal consistency of the test.
● the more likely the test-taker is to miss the item.
● the higher the internal consistency of the test.

36. Which statement is TRUE regarding the definition of personality?

● Freud’s definition of personality is universally accepted.


● There is no universal definition of personality.
● Hall and Lindzey’s definition of personality is universally accepted.
● None of these.

37. If a test-taker earns a z score of +2 on a test, approximately how many other test-takers obtained
higher scores, assuming the distribution of test scores is normal?

● 2.5%
● 14%
● 16%
● 25%
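
A numeric check of item 37, sketched under the item's own assumption of a normal distribution: the share of test-takers scoring above z = +2 is the upper tail of the standard normal curve. The exact tail is about 2.3%, which the familiar 68-95-99.7 rule of thumb rounds to roughly 2.5%.

```python
from statistics import NormalDist

# Proportion of a normal distribution lying above z = +2.
share_above_z2 = 1 - NormalDist().cdf(2)

print(f"{share_above_z2:.1%}")  # 2.3%
```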

38. Which model of intelligence guided the development of the fourth edition of the Stanford-Binet
Intelligence Scale?

● the Cattell-Horn-Carroll (CHC) model


● Gardner’s theory of multiple intelligences
● Spearman’s two-factor theory
● the Cattell-Horn theory

39. Item branching refers to:

● reusing items in an original test that were originally developed for use in a parallel test.
● the creation of alternate and parallel forms of tests based on a group of test-takers’ responses to
the original test.
● statistical efforts to ensure that items translated into foreign languages are of the same difficulty.
● administering certain test items on a test depending on the test-takers’ responses to
previous test items.

40. Users of psychological tests are frequently tempted to treat ordinal data as if it were interval data.
This is the case because of the:

● difficulties that would be encountered if the data were treated as ratio data.
● unwritten rule that exists pertaining to the equal intervals between points measured.
● added flexibility of interval level data for statistical manipulation.
● frequent need to do more than simply rank order test scores.
41. The 7-Minute Screen was developed to identify symptoms associated with which of the following?

● All of these.
● Alzheimer’s disease
● personality disorders
● seizure disorders

42. In ipsative scoring, a test-taker’s scores are compared only to:

● his or her other scores on the test.


● his or her other scores on a parallel form of the same test.
● the scores of other test-takers from the same geographic area who are similar with regard to key
variables such as gender.
● the scores of other test-takers from past years who have taken the same test under the same or
similar conditions.

43. A 16-year-old male suspected of drug abuse is referred for neuropsychological evaluation. Which
tool of assessment is LEAST likely to be used?

● familial medical history data


● case history data including school records
● referral for blood and urine tests
● measures of creative thinking

44. Psychometrics may best be defined as:

● the study of psychic phenomena.


● the study and use of correlational techniques.
● the science of test development.
● the science of psychological measurement.

45. Which of the following statistics is the preferred measure of central tendency for a skewed
distribution?

● the median
● None of these.
● the mode
● the mean

46. The Children’s Apperception Test (CAT) depicts ________ in its pictures.

● humans interacting with animals


● animals
● dolls and puppets
● humans

47. For use in a study of abnormal and bizarre behavior, a psychologist seeks to use a number of
psychological tests. The psychologist does not want reviews of tests, but only a description of
what is available for possible use in this research. Which reference source should the
psychologist consult?

● The Supplement to the 17th Mental Measurements Yearbook


● Tests in print
● The 17th Mental Measurements Yearbook
● Harper’s Bazaar

48. The terms basal level, ceiling level, adaptive approach, and routing test are all associated with
the:

● WISC-IV
● OLSAT
● WAIS-III
● SB-5

49. Children with chronic middle-ear infections often have low scores on the Sequenced Inventory of
Communication Development (SICD). This is evidence of the _________________ for the SICD.

● construct validity
● internal consistency validity
● test-retest validity
● content validity

50. An item-reliability index provides a measure of a test’s:

● stability
● test-retest reliability
● All of these
● internal consistency

51. The type of research that attempts to replicate a real-world problem in a research or clinical
setting is called:

● a case history approach to research.


● analogue research.
● research with unobtrusive measures.
● the sign approach to research.

52. A patient exhibits deficits in word recall, vocabulary, and finding words to name things. A
neuropsychologist would be MOST likely to suspect damage to this patient’s:

● frontal lobes
● occipital lobes
● limbic system
● parietal lobes

53. Which ethical issue is particularly relevant when assessing substance abusers in the context of
research?
● the issue of informed consent
● the issue of cultural sensitivity
● the issue of right to treatment
● the issue characterized by the phrase, “First, do no harm”

54. A student taking a course entitled “Ancient History” is administered a history test. Years later, data
from this test is reviewed by assessment professionals who are preparing a case study on the
test-taker. In their report, the “Ancient History” test is referred to as:

● a school achievement test.


● a school aptitude test.
● ancient history.
● a school ability test.

55. Melody exclaims, “I got a C- on the statistics exam, and I was miserable until I thought how
terrible it must be for those who got F’s.” Melody’s attitude is an example of which of the
following?

● social learning
● social comparison
● social anxiety
● social validation

56. A self-report rating scale of neurological impairment is:

● the Short Portable Mental Status Questionnaire.


● the Seashore Rating Scale.
● the Neuropsychological Impairment Scale.
● the Patient’s Assessment of Own Functioning Scale.

57. A student scores very high on a graduate school admission test and is admitted to graduate
school, largely on the basis of his test score. The student subsequently flunks out. The type of
test outcome described in this situation is known as a:

● true negative.
● false negative.
● positive hit.
● false positive.

58. An administration of the Montreal Neuropsychological Institute Battery entails the administration
of:

● the Mooney Faces Test.


● All of these
● the Wechsler Intelligence Test.
● the Wisconsin Card Sorting Test

59. When the selection ratio goes down:


● None of these.
● top-down selection policy can become discriminatory.
● hiring becomes less selective.
● competition for the position is likely to increase.

60. If a student’s performance on a newly developed math achievement test is compared with his or
her recent performance on another achievement test known to measure math skills, this would be
an example of ________ validity.

● predictive criterion-related
● content
● concurrent criterion-related
● construct

61. An intelligence test originally written in English is to be administered to a group of Japanese
immigrants who do not speak English. In order to obtain an accurate measure of intelligence and
completely eliminate any possible effects due to language, the test administrator should:

● have a friend or family member of the group who is fluent in English and Japanese read the test
to the group, simultaneously translating the items word for word.
● have a professional translator read the test to the group, simultaneously translating the items
word for word.
● have a teacher fluent in Japanese and English conduct a brief tutorial in English prior to
administering the test in English, with specific attention given to the meaning of the wording of
key items and corresponding responses.
● None of these.

62. The Hand-Tool Dexterity Test and the O’Connor Tweezer Dexterity Test would most likely be used
by an employer interested in:

● finding the best employee for a position on an assembly line.


● assessing a worker’s ability to physically manipulate materials.
● increasing profit margins by lowering expenses.
● understanding a worker’s motivation to respond quickly with accuracy.

63. The strongest psychometric aspect of the Rorschach is its:

● interrater reliability with respect to scoring categories.


● internal-consistency split-half reliability for odd and even items.
● test-retest reliability over a short period of time.
● interrater reliability with respect to interpretations.

64. The Trail Making Tests are part of which neuropsychological test battery?

● Kaufman Assessment Battery


● Woodcock-Johnson III
● Luria-Nebraska Neuropsychological Battery
● Halstead-Reitan Neuropsychological Battery
65. A psychological autopsy typically includes which of the following?

● an interview with the coroner.


● All of these.
● a review of archival records.
● a postmortem interview with the assessee.

66. The K-ABC is designed to measure:

● general aptitude
● achievement
● intelligence
● both achievement and intelligence

67. You are interested in developing a test for social adjustment in a college fraternity or sorority. You
begin by interviewing persons who had graduated from college after having been a member of a
fraternity or sorority for at least 2 years. Which stage of test development best describes the one
that you are in?

● the test revision stage


● the pilot work stage
● the test construction stage
● the test-tryout stage

68. Which of the following is a psychoeducational test battery?

● Stanford-Binet 5
● Woodcock-Johnson III
● K-ABC
● Hilton PTB

69. A Director of Human Resources is setting up a series of tests to use to select applicants for sales
positions. Inherent in the tests, and applied in the model of selection, is the Director’s assumption
that high sales ability can make up for limited product knowledge. The model of selection being
applied could BEST be characterized as:

● a compensatory model of selection.


● the method of contrasting groups for selection.
● the method of predictive yield in action.
● a multiple hurdle model of selection.

70. A correlation coefficient is equal to .30. Using the concept of coefficient of determination, the
variance accounted for by chance, error, and other unexplained factors would be:

● None of these.
● approximately 30%.
● approximately 3%.
● approximately 9%.
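
The arithmetic behind item 70 can be sketched as follows. The coefficient of determination is the square of the correlation coefficient and gives the proportion of variance the two variables share; whatever remains is attributed to chance, error, and other unexplained factors. Variable names are illustrative.

```python
# Correlation coefficient from the item.
r = 0.30

# Coefficient of determination: proportion of shared (explained) variance.
explained = r ** 2

# Remaining variance: chance, error, and other unexplained factors.
unexplained = 1 - explained

print(f"explained:   {explained:.0%}")    # explained:   9%
print(f"unexplained: {unexplained:.0%}")  # unexplained: 91%
```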
71. An educational psychologist conducts a utility analysis of a teaching program used to improve the
handwriting of very young children. The measure of utility in this analysis will most likely be:

● decrease in costs.
● increase in performance level.
● reduction in accidents.
● increase in revenue.

72. In a distribution that is symmetrical, which of the following is true?

● The distances from Q1 and Q4 to the median are the same.


● The distances from Q1 and Q3 to the median are the same.
● The distances from Q2 and Q3 to the median are the same.
● The distances from Q1 and Q2 to the median are the same.
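
To make the quartile relationship in item 72 concrete, the sketch below computes quartiles for a small, made-up symmetrical data set. Q2 is the median, so the comparison of interest is how far Q1 and Q3 each lie from it. The data values are hypothetical.

```python
from statistics import median, quantiles

# Hypothetical scores, symmetrical around 5.
scores = [1, 2, 3, 4, 5, 6, 7, 8, 9]

# quantiles(..., n=4) returns the three cut points Q1, Q2, Q3.
q1, q2, q3 = quantiles(scores, n=4)

print(q1, q2, q3)        # 2.5 5.0 7.5
print(q2 - q1, q3 - q2)  # 2.5 2.5
print(q2 == median(scores))  # True
```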

73. An assessor determines that John is incompetent to stand trial. This means that John:

● is mentally retarded or psychotic.


● All of these.
● may have been under the influence of alcohol or some controlled substance at the time the
alleged offense was committed.
● is unable to understand the charges against him and is unable to assist in his own defense.

74. Critics have argued that projective tests are too:

● qualitative
● subjective
● brief
● concrete

75. The Schedule for Affective Disorders and Schizophrenia is an example of:

● a structured clinical interview.


● an objective personality test.
● a projective test.
● an unstructured clinical interview.

76. The specific objective of a utility analysis will dictate what sort of information will be required, as
well as the specific:

● expectancy tables to be used.


● Naylor-Shine tables to be used.
● methods to be used.
● Rise-and-Shine tables to be used.

77. If a new test was developed to assist a college in selecting applicants, which group of test-takers
should ideally be administered the test items during the item-tryout phase of the
new test’s development?
● all high school juniors who are college-bound
● seniors in high school who were accepted to the college on the basis of criteria other than
the test under development
● college students who put a hold on their academic studies in order to backpack through Europe
for 1 year or more
● freshmen in college admitted who had taken one or more advanced placement courses in high
school

78. Which of the following is LEAST likely to be used in pre-employment screening of job applicants
for unskilled positions in a large corporation?

● application blanks
● letters of recommendation
● aptitude measures
● interviews

79. Psychoeducational test batteries are designed to measure:

● ability and achievement


● adjustment and personality
● academic motivation
● scholastic aptitude

80. Which of the following represents a problem unique to self-report personality tests?

● The reading ability of respondents may prevent them from responding accurately to items.
● Respondents may be too “low” on the construct being measured to register on the test.
● All of these.
● Respondents might be unwilling to reveal something negative about themselves.

81. If John earns a full-scale IQ of 90 on the WISC-IV:

● John scored at the low end of the average range.


● Ninety percent of the students in John’s age group scored lower than John on this test.
● John correctly answered 90 questions.
● John correctly answered 90% of the questions.

82. A psychologist working in a mental hospital is interested in predicting suicide risk from three
variables: severity of current depression, duration of current depression, and number of previous
suicide attempts. The psychologist gathers information on these variables for 100 subjects and
wants to combine the information in such a way that suicide risk is most accurately predicted. An
appropriate statistical technique would be:

● incremental validity analysis.


● simple regression.
● meta-analysis.
● multiple regression.
83. If a test-taker is asked to name familiar objects, write familiar words, and follow verbal instructions
on a test that takes 15 minutes or less, he or she would most likely be taking:

● the Aphasia Screening Test.


● the Wechsler Memory Scale.
● the Wisconsin Card Sorting Test.
● the Halstead-Reitan Neuropsychological Battery.

84. In evaluating a child who is described as “fidgety, restless, and impulsive,” what type of drawing
on the Machover Draw-A-Person Test could that child reasonably be expected to produce?

● a person of average size drawn in the middle of the page


● a person drawn in an “x-ray” perspective
● a large person extending off the page
● a small person drawn in the left-hand corner of the page

85. In the language of psychological testing and assessment, scoring refers to assigning evaluative
numbers, codes or statements to performance on:

● Interviews
● Tasks
● Tests
● All of these

86. In an evaluation to determine dangerousness, information pertinent to ______________ is
typically gathered.

● the specificity and detail of the plan


● All of these.
● the type of weapon to be used
● the availability of a weapon

87. As compared to more traditional, one-on-one and face-to-face assessments, a disadvantage of
CAPA is that it typically deprives the assessor of the opportunity to:

● observe the test-taker’s test-taking behavior.


● tailor the test’s content to the responses.
● make certain that test forms can be kept secure.
● have a Bowflex workout during the assessment.

88. The higher the item-difficulty index, the ________ the item.

● more robust
● less robust
● harder
● Easier
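
For item 88, the item-difficulty index is conventionally the proportion of test-takers who answer the item correctly, so it runs from 0 to 1. The sketch below uses made-up response data (1 = correct, 0 = incorrect) and a hypothetical helper name to show how the index behaves for two contrasting items.

```python
def item_difficulty(responses):
    """Proportion of test-takers answering the item correctly (p value)."""
    return sum(responses) / len(responses)

# Hypothetical responses from ten test-takers.
item_a = [1, 1, 1, 1, 1, 1, 1, 1, 1, 0]  # nine of ten correct
item_b = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]  # two of ten correct

print(item_difficulty(item_a))  # 0.9
print(item_difficulty(item_b))  # 0.2
```

The item that most people pass has the higher index, which is why the name "item-difficulty index" is often described as counterintuitive.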

89. If all scores in a set of test scores were the same, the variance would be equal to:
● None of these
● one
● Two
● Zero
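
Item 89 can be checked directly: variance is the average squared deviation from the mean, so a set of identical scores has no deviations to square. A minimal sketch with hypothetical scores:

```python
from statistics import pvariance

# Five hypothetical test-takers who all earned the same score.
identical_scores = [75.0, 75.0, 75.0, 75.0, 75.0]

# Every score equals the mean, so every squared deviation is zero.
variance = pvariance(identical_scores)

print(variance)  # 0.0
```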

90. A test developer designs a test for the sole purpose of identifying the most highly skilled
individuals among those tested. During the test revision stage of test development, the test
developer will be particularly interested in:

● item reliability
● item bias
● item validity
● item discrimination

91. An anchor protocol is:

● a list of guidelines for a standardized test used to ensure that all test-takers are similar in key
ways to the population of the original standardization sample.
● a statistical procedure in which weights are assigned to each item of a model test to maximize
predictive validity.
● a previously developed test with known validity that can be used as a comparison for newly
developed tests.
● a model for scoring and a mechanism for resolving scoring discrepancies.

92. A key difference between concurrent and predictive validity has to do with:

● None of these.
● the time frame during which data on the criterion measure is collected.
● the magnitude of the reliability coefficient that will be considered significant at the .05 level.
● the magnitude of the validity coefficient that will be considered significant at the .05 level.

93. A large corporation scrupulously avoids any possibility of discrimination and adverse impact in its
hiring practices. The selection procedure it probably has in place with regard to its entry-level test
is one that entails:

● a top-down selection policy based on test score.


● personnel selection based on a cut score.
● interviews using translators, if necessary.
● None of these.

94. As a result of the normalization of the standard scores on the MMPI-2, a T score of 70 on the
Depression Scale and a T score of 70 on the Hypomania Scale will indicate that:

● Neither score is significantly elevated.


● the two T scores equal the same level of clinical elevation.
● the two scores result in different percentile ranks for each scale.
● the scores will be significantly different depending on the gender of the test-taker.

95. Which of the following is NOT an assumption of utility analysis?


● the value of people and their performance can be estimated.
● the performance of people in organizations can affect organizational viability.
● psychological tests are always preferred over other means of assessment.
● large amounts of information can be integrated to make good decisions.

96. According to Vroom's expectancy theory of motivation, employees will expend energy:

● to satisfy a higher category of need.


● to receive recognition for performance
● to achieve an outcome they desire
● to experience feelings of accomplishment.

97. A "good" test item on an ability test is one:

● in which it is absolutely impossible to guess the correct answer


● that distinguishes high scorers from low scorers
● to which almost all test-takers respond incorrectly
● to which almost all test-takers respond correctly.

98. In a study of a new psychopharmacological treatment for clinical depression, 40 participants
diagnosed with depression each received four (4) different amounts of a new medication called
Deplow. The first week, they were given a placebo. During the second week of the study, they
took 1 mg. of Deplow each day. During the third week, they took 3 mg. of Deplow each day, and
during the fourth week, they took 5 mg. of Deplow each day. Although the participants took
different amounts of the medication each week, they were not informed about the amount they
were taking. The participants also completed a depression symptom checklist at the end of each
week. Results are presented below. The score on the checklist could range from 0 to 30, with
30 indicating severe depression. Assume statistical significance for differences greater than 3.0.
Which of the following would make it difficult to conclude that any decrease in depressive
symptoms is due to Deplow and not to other aspects of the study?

● The lack of a control group
● The increasing doses of Deplow
● The low sample size
● The lack of comparison with an established antidepressant medication

99. The standard error of measurement of a particular test of anxiety is 8. A student earns a score of
60. What is the confidence interval for this test score at the 95% level?

● 40-68
● 44-76
● 36-84
● 52-60
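
As a worked illustration of the arithmetic behind item 99, the sketch below builds a confidence band of score ± z × SEM. It assumes the conventional z of 1.96 for the 95% level (some textbooks round this to 2); the function name `confidence_interval` is illustrative and does not come from the reviewer.

```python
def confidence_interval(score, sem, z=1.96):
    """Return the (lower, upper) bounds of the band score +/- z * SEM."""
    margin = z * sem  # half-width of the band
    return score - margin, score + margin


# Values from item 99: observed score of 60, SEM of 8.
low, high = confidence_interval(60, 8)
print(round(low), round(high))  # prints "44 76"
```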

100. Using a cut score of 50 on a predictor test a researcher finds a base rate of 1.00. This means
that when a cut score of 50 is used:
● 100% of applicants will perform successfully on a criterion measure.
● 50% of applicants will fail on a criterion measure.
● 100% of applicants will fail on a criterion measure
● 50% of applicants will perform successfully on a criterion measure.

101. A correlation coefficient that is significant at the p < .01 level:

● could have been expected to occur by chance alone 99 times or more in 100.
● has a 99% chance of being accurate.
● accounts for about 1% of the variance.
● could have been expected to occur by chance alone one time or less in 100.

102. The Myers-Briggs Type Indicator (MBTI) is based on the theoretical writings of:

● Holland Opus
● Carl Jung
● B. F. Skinner
● Sigmund Freud

103. The Barnum effect in psychological report writing refers to:

● statements that are prejudicial in nature.
● very technical jargon that is difficult for lay readers of the report to understand or interpret.
● vague and general statements that could be applied to most people in many situations.
● conflicting statements about the person within the same report.

104. The Otis-Lennon School Ability Test yields which of the following composite scores?

● Verbal Performance Composite
● Full Scale IQ
● School Ability Index
● O-L Composite

105. Application forms used in employment settings:

● are generally considered to be useful for quick screening.
● are frequently unnecessary, since they usually duplicate information that can be later obtained
through interviews, tests, and other methods.
● None of these
● have been shown to have poor validity and reliability in numerous research studies

106. Research on the Psychopathy Checklist suggests that it is useful in:

● predicting violence within prisons.
● predicting crimes committed by inmates.
● identifying criminal recidivists.
● identifying a psychotic disorder.

107. If the results of an examination are negatively skewed, the exam questions were likely:
● Difficult
● Biased
● Easy
● Quite novel in many respects

108. Forced-choice item formats are typically employed to control which of the following?

● test-takers' tendency toward impression management
● interviewers' tendency to stray from the points of focus
● None of these.
● the reliability of respondents' response patterns

109. The plethysmograph is perhaps MOST useful in the assessment and treatment of:

● sexual offenders
● cardiac arrest
● people suffering from migraine
● agitated depression

110. Which statement is TRUE of the test tryout phase of test construction?

● A large number of subjects should be included to ensure accurate results
● None of these statements are true.
● Test conditions should be as similar to the actual administration as possible.
● The sample used must be nationally representative

111. The statistical tool that is ideally suited for making selection decisions within the framework of a
compensatory model is:

● utility analysis.
● multiple regression.
● expectancy data.
● the Brogden-Cronbach-Gleser formula.

112. An aptitude test that includes both psychomotor and paper-and-pencil tasks is the:

● O'Connor Tweezer Dexterity Test
● Bennett Mechanical Comprehension Test
● Minnesota Clerical Test
● General Aptitude Test Battery

113. A review of a new personality test is published in a journal. In that review, it would be
reasonable to expect to find information about:

● the intelligence range of prospective test-takers.
● All of these.
● what prompted the publisher to publish this test.
● the psychometric soundness of the test.

114. Which tool of psychological assessment is MOST likely to be used after tests have been
administered to a patient in order to evaluate that patient's level of premorbid functioning?

● the case history
● behavioral observation
● All of these.
● the interview

115. To ensure that a test developed for national use is indeed suitable for national use, test
developers:

● All of these.
● post sample items on the Web to gauge response of different groups.
● have a culturally representative panel of experts review test items.
● employ a culturally representative group of examiners.

116. A group of researchers was interested in learning whether a newly developed exam would be
useful in determining whether a student will be successful in college. The researchers designed a
study in which students took the new exam prior to entering college. Later, the students took another
exam, which was designed to measure how much information they had learned during their first
year. The score on this exam was then correlated with the student's score on the newly
developed exam. What type of validity was being evaluated in the study?

● Predictive
● Divergent
● Discriminant
● Concurrent

117. If there is common ground among all of the varied approaches to psychological testing and
assessment, that common ground may MOST have to do with the assessor's:

● use of an ability test and a test of personality.
● strict adherence to ethical guidelines.
● All of these
● psychoanalytically-based interpretation of findings.

118. The "Achilles heel" of the Angoff method is:

● interrater reliability.
● test-retest reliability.
● noneconomic costs
● incremental validity.

119. If an ANOVA yields a significant F value, you could rely on ________ to test for significant
differences between group means.

● Duncan's multiple-range, Tukey's, or Scheffe's test.
● one- and two-tailed t tests.
● percentile rank.
● summative or formative evaluation.

120. A test format could be normative or ipsative. In the normative format:

● each item depends on the item after it.
● each item is independent of all other items.
● each item depends on the item before it.
● the client must possess an IQ within the normal range.

121. A counselor created an achievement test with a reliability coefficient of .82. The test was
shortened since many clients felt it was too long. The counselor shortened the test but logically
assumed that the reliability coefficient would now

● remain at .82.
● be at least 10 points higher or lower.
● be lower than .82.
● be approximately .88.
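
The reasoning behind item 121 can be sketched with the Spearman-Brown formula, which estimates reliability after a test is lengthened or shortened. The item does not say how much the test was shortened, so the length factor of 0.5 (a test cut to half its length) is purely an assumption for illustration, as is the function name.

```python
def spearman_brown(reliability, length_factor):
    """Estimate reliability after changing test length.

    length_factor is new length / original length
    (0.5 = half as many items, 2 = twice as many).
    """
    n = length_factor
    return (n * reliability) / (1 + (n - 1) * reliability)


# Assumed scenario: the .82 test is cut to half its original length.
shortened = spearman_brown(0.82, 0.5)
print(round(shortened, 2))  # prints 0.69
```

Any length factor below 1 yields an estimate below the original coefficient, which is the general point of the item.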

122. The Brogden-Cronbach-Gleser formula was developed by:

● the work of Cronbach, and later, the work of Brogden and Gleser.
● the team of Brogden, Cronbach, and Gleser working together.
● the work of Brogden and later, the work of Cronbach and Gleser.
● Brogden, Cronbach, and Gleser, each working independently.

123. On the Wechsler tests of intelligence, the Full Scale IQ has a mean of __ and a standard
deviation of ___

● 50; 10
● 100; 16
● 50; 15
● 100; 15

124. Ideally, the first draft of a test should include at least how many items as compared with the
final version of the test?

● about twice the number of the final version
● about three times the number of the final version
● roughly the same number as the final version
● about half the number of the final version

125. A researcher was interested in whether or not jazz vocals and opera influence men's and
women's emotional states. She hypothesized that these types of music influence men and
women differently. In a study investigating this hypothesis, 40 men and 40 women heard a jazz
piece, and 40 men and 40 women heard an operatic piece. The jazz piece was sung by a man,
and the operatic piece was sung by a woman. Afterward, participants rated themselves on an
inventory measuring emotional state. Higher scores on the inventory indicate a positive mood.
Results of this study are presented in the graph below:
The researcher concludes from her study that jazz music positively changes men's moods and
operatic music positively changes women's moods. Which of the following invalidates that
conclusion?

● Previous studies have shown that men are less emotional than women.
● Men and women were randomly assigned to groups.
● Only one scale was used to measure mood.
● Men's and women's moods were not measured before exposure to the two types of
music.

126. In a study of a new psychopharmacological treatment for clinical depression, 40 participants
diagnosed with depression each received four (4) different amounts of a new medication called
Deplow. The first week, they were given a placebo. During the second week of the study, they
took 1 mg. of Deplow each day. During the third week, they took 3 mg. of Deplow each day, and
during the fourth week, they took 5 mg. of Deplow each day. Although the participants took
different amounts of the medication each week, they were not informed about the amount they
were taking. The participants also completed a depression symptom checklist at the end of each
week. Results are presented below. The score on the checklist could range from 0 to 30
indicating severe depression. Assume statistical significance for differences greater than 3.0.
Which of the following would make it difficult to conclude that any decrease in depressive
symptoms is due to Deplow and not to other aspects of the study?

● The lack of a control group
● The increasing doses of Deplow
● The low sample size
● The lack of comparison with an established anti-depressant medication

127. A syndrome is described as:

● a set of co-occurring emotional and behavioral problems.
● a source of distress or perplexity.
● a condition that inhibits optimal functioning.
● a harmful set of conditions that impairs cognitive ability.

128. If someone tells you that they took a test and received scores on scales called School Problems,
Conduct Problems, and Immaturity, it is a good bet that this person was administered:

● the NEO-PI-R.
● the MMPI-2-RF.
● the School Problems Checklist.
● the MMPI-A.

129. The task of sorting statement cards from least descriptive to most descriptive is most characteristic
of which type of assessment method?
● Semantic differential rating technique
● Forced-choice
● T-sort
● Q-sort

130. Depression is more common among people with insomnia than among those with satisfactory sleep.
To determine the reasons for this relationship, investigators identified 40 people suffering from both
depression and insomnia. For each of these 40, they paired two other people of the same gender and age
who were neither depressed nor suffering from any sleep disorder. One of these was designated the
"normal-sleep control," and the other was designated the "yoked control." All participants slept in a
laboratory for one week. The normal-sleep control person slept without restrictions. During that same
time, the yoked control was permitted to sleep when the depressed-insomniac person slept but was
required to awaken whenever the depressed-insomniac was awakened. A valid questionnaire for
measuring depression was administered at the end of the one-week study. Assume that higher scores on
the questionnaire reflect greater depressive symptomatology. Suppose that the results were consistent
with the hypothesis that sleeplessness does NOT lead to depression. Of the following, which would be the
most serious criticism of the study and its conclusion?

● The normal sleep-control group was unnecessary.
● The study failed to examine other factors that might also contribute to depression.
● One week of sleep deprivation may have been adequate to produce depression.
● The yoked-control group was unnecessary

131. Individuals who lack awareness, insight, and the ability to recognize problems are referred to as

● A. agnostic.
● B. anosognostic.
● C. amnesiac.
● D. anhedonics

132. CARRYOVER

133. Which of the following tests employed by the Army during World War I was probably the most
"culture-fair"?
● the Army Beta
● the Armed Services Vocational Aptitude Battery (ASVAB)
● the Army Alpha Test
● the Army General Classification Test (AGCT)

134. In the acronym presented in text for remembering what each of the variables in the APGAR
measures, the letter G in APGAR stands for:

● good muscle tone
● glucose
● grimace
● genetic inheritance

Research on the Psychopathy Checklist suggests that it is useful in:


A. predicting crimes committed by inmates.
B. predicting violence within prisons.
C. identifying a psychotic disorder.
D. identifying criminal recidivists.

Application forms used in employment settings:


A. are generally considered to be useful for quick screening.
B. are frequently unnecessary, since they usually duplicate information that can be later obtained through
interviews, tests, and other methods.
C. have been shown to have poor validity and reliability in numerous research studies.
D. None of these.

An intelligence test originally written in English is to be administered to a group of Japanese immigrants
who do not speak English. In order to obtain an accurate measure of intelligence and completely
eliminate any possible effects due to language, the test administrator should:
A. have a teacher fluent in Japanese and English conduct a brief tutorial in English prior to administering
the test in English, with specific attention given to the meaning of the wording of key items and
corresponding responses.
B. have a professional translator read the test to the group, simultaneously translating the items
word for word.
C. None of these.
D. have a friend or family member of the group who is fluent in English and Japanese read the test to the
group, simultaneously translating the items word for word.

A 16-year-old male suspected of drug abuse is referred for neuropsychological evaluation. Which tool of
assessment is LEAST likely to be used?
A: measures of creative thinking

Depression is more common among people with insomnia than among those with satisfactory sleep. To
determine the reasons for this relationship, investigators identified 40 people suffering from both
depression and insomnia. For each of these 40, they paired two other people of the same gender and age
who were neither depressed nor suffering from any sleep disorder. One of these was designated the
"normal-sleep control," and the other was designated the "yoked control." All participants slept in a
laboratory for one week. The normal-sleep control person slept without restrictions.
During that same time, the yoked control was permitted to sleep when the depressed-insomniac person
slept but was required to awaken whenever the depressed-insomniac was awakened.
A valid questionnaire for measuring depression was administered at the end of the one-week study.
Assume that higher scores on the questionnaire reflect greater depressive symptomatology.
What pattern of results on the depression questionnaire would one expect if depression were to arise for
reasons other than sleeplessness?
A. Yoked control < normal sleep control = depressed
B. Normal sleep control = yoked control < depressed
C. Normal sleep control < yoked control = depressed
D. Yoked control < normal sleep control < depressed

In the administration of the TAT:


A. a minimum of ten cards must be presented.
B. the number of cards presented is left to examiner discretion.
C. a maximum of twenty cards is presented.
D. all stimulus cards are presented to all subjects.

Sally and Bob both apply for a position as an accounting clerk. The Human Resource (HR) professional
responsible for selecting the best candidate for the position administers a standardized test of basic
mathematical skills to both Sally and Bob. Based on their scores, the HR professional chooses Sally. The
reason for this choice is that Sally has an 85% chance of performing at an acceptable level. By contrast,
Bob's score indicated that he only had a 50% chance of performing successfully. The tool of assessment
used to make this hiring decision was:
A. good, old-fashioned intuition.
B. the method of predictive yield.
C. a Taylor-Russell table.
D. an expectancy table.

Which ethical issue is particularly relevant when assessing substance abusers in the context of research?
A. the issue characterized by the phrase, "First, do no harm"
B. the issue of cultural sensitivity
C. the issue of informed consent
D. the issue of right to treatment

A counselor created an achievement test with a reliability coefficient of .82. The test was shortened since
many clients felt it was too long. The counselor shortened the test but logically assumed that the
reliability coefficient would now:
A. remain at .82.
B. be at least 10 points higher or lower.
C. be lower than .82.
D. be approximately .88.

An aptitude test that includes both psychomotor and paper-and-pencil tasks is the:
A. Bennett Mechanical Comprehension Test
B. O'Connor Tweezer Dexterity Test
C. General Aptitude Test Battery
D. Minnesota Clerical Test.

The 7-Minute Screen was developed to identify symptoms associated with which of the following?
A. Alzheimer's disease
B. personality disorders
C. seizure disorders
D. All of these.

Depression is more common among people with insomnia than among those with satisfactory sleep. To
determine the reasons for this relationship, investigators identified 40 people suffering from both
depression and insomnia. For each of these 40, they paired two other people of the same gender and age
who were neither depressed nor suffering from any sleep disorder. One of these was designated the
"normal-sleep control," and the other was designated the "yoked control." All participants slept in a
laboratory for one week. The normal-sleep control person slept without restrictions.
During that same time, the yoked control was permitted to sleep when the depressed-insomniac person
slept but was required to awaken whenever the depressed-insomniac was awakened.
A valid questionnaire for measuring depression was administered at the end of the one-week study.
Assume that higher scores on the questionnaire reflect greater depressive symptomatology.
What pattern of results on the depression questionnaire would justify the conclusion that sleeplessness
leads to depression?
A. Normal sleep control = yoked control < depressed
B. Yoked control < normal sleep control = depressed
C. Normal sleep control < yoked control = depressed
D. Yoked control < normal sleep control < depressed

A test score or index derived from the combination of, and/or mathematical transformation of, one or more
subtest scores is known as a:
A. standard score
B. test composite
C. test requisite
D. scaled score

As compared to more traditional, one-on-one and face-to-face assessments, a disadvantage of CAPA is
that it typically deprives the assessor of the opportunity to:
A. make certain that test forms can be kept secure.
B. have a Bowflex workout during the assessment.
C. observe the test-taker's test-taking behavior.
D. tailor the test's content to the responses.

A key difference between concurrent and predictive validity has to do with:


A. the magnitude of the reliability coefficient that will be considered significant at the .05 level.
B. the magnitude of the validity coefficient that will be considered significant at the .05 level.
C. None of these.
D. the time frame during which data on the criterion measure is collected.

The Myers-Briggs Type Indicator (MBTI) is based on the theoretical writings of:
A. Holland Opus
B. Sigmund Freud
C. Carl Jung
D. B. F. Skinner

A self-report rating scale of neurological impairment is:


A. the Patient's Assessment of Own Functioning Scale.
B. the Neuropsychological Impairment Scale.
C. The Short Portable Mental Status Questionnaire.
D. the Seashore Rating Scale.

Traditional measures of reliability are inappropriate for criterion-referenced tests because variability:
A. is variable with criterion-referenced tests.
B. is minimized with criterion-referenced tests.
C. cannot be determined with criterion-referenced tests.
D. is maximized with criterion-referenced tests.

Critics have argued that projective tests are too


A. brief
B. qualitative
C. subjective
D. concrete

The lower test-retest reliability coefficients found to exist for state anxiety when compared with higher
test-retest reliability coefficients obtained for trait anxiety support which premise?
A. None of these.
B. Traits are more enduring personality characteristics than states.
C. Exhibition of anxiety is very situation-dependent.
D. States are more enduring personality characteristics than traits.

Which of the following represents a problem unique to self-report personality tests?


A. The reading ability of respondents may prevent them from responding accurately to items.
B. Respondents may be too "low" on the construct being measured to register on the test.
C. All of these.
D. Respondents might be unwilling to reveal something negative about themselves.

You wish to determine if the student you are evaluating scored higher on a mathematics test than on a
reading test. What statistic(s) would you calculate?
A. the raw score on each test as well as the mean of each distribution
B. the standard error of measurement for each test score
C. the standard error of the difference between two scores
D. the mean of each distribution and index of test difficulty for each test
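
To illustrate the statistic named in this item, the sketch below computes the standard error of the difference as the square root of the sum of the two squared standard errors of measurement, then checks whether two scores differ by more than z times that value. It assumes both scores are on the same scale; the SEM values of 3 and 4, the scores, and the function names are hypothetical.

```python
import math


def se_difference(sem_1, sem_2):
    """Standard error of the difference between two scores."""
    return math.sqrt(sem_1 ** 2 + sem_2 ** 2)


def differs_reliably(score_1, score_2, sem_1, sem_2, z=1.96):
    """True if the gap between the scores exceeds z * SE of the difference."""
    return abs(score_1 - score_2) > z * se_difference(sem_1, sem_2)


# Hypothetical math and reading scores on a common scale.
print(se_difference(3, 4))               # prints 5.0
print(differs_reliably(112, 100, 3, 4))  # prints True  (12 > 9.8)
print(differs_reliably(105, 100, 3, 4))  # prints False (5 < 9.8)
```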

"Multiple predictors may be used so that applicants must meet or exceed the cut score for each predictor
before moving to the next round of the selection process." What process is being described?
A. multiple hurdle selection
B. top down selection.
C. compensatory model of selection.
D. known-groups selection
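
The process quoted in this item can be sketched as a sequence of filters: an applicant advances only by meeting or exceeding the cut score at each stage. The applicant names, predictor names, and cut scores below are invented for illustration.

```python
def multiple_hurdle(applicants, hurdles):
    """Keep only applicants who meet or exceed every cut score, in order.

    applicants: {name: {predictor: score}}
    hurdles:    [(predictor, cut_score), ...] in the order they are applied
    """
    remaining = dict(applicants)
    for predictor, cut_score in hurdles:
        # Applicants below the cut score are dropped before the next round.
        remaining = {
            name: scores
            for name, scores in remaining.items()
            if scores[predictor] >= cut_score
        }
    return sorted(remaining)


# Hypothetical data: two predictors, each with a cut score of 60.
applicants = {
    "Ana": {"aptitude": 72, "interview": 55},
    "Ben": {"aptitude": 48, "interview": 90},
    "Cy": {"aptitude": 65, "interview": 70},
}
hurdles = [("aptitude", 60), ("interview", 60)]
print(multiple_hurdle(applicants, hurdles))  # prints ['Cy']
```

Ben's high interview score cannot offset his low aptitude score here, which is what separates this approach from a compensatory model.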

Forced-choice item formats are typically employed to control which of the following?
A. None of these.
B. test-takers' tendency toward impression management
C. interviewers' tendency to stray from the points of focus
D. the reliability of respondents' response patterns

When the selection ratio goes down:


A. hiring becomes less selective.
B. None of these.
C. competition for the position is likely to increase.
D. top-down selection policy can become discriminatory.
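
As a small numeric illustration of this item, the selection ratio is commonly defined as the number of positions to be filled divided by the number of applicants. The figures below are hypothetical.

```python
def selection_ratio(openings, applicants):
    """Number of positions to fill divided by number of applicants."""
    return openings / applicants


# Hypothetical: the same 10 openings with a growing applicant pool.
print(selection_ratio(10, 100))  # prints 0.1
print(selection_ratio(10, 200))  # prints 0.05
```

With openings held constant, the ratio falls as the number of applicants rises.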

You are interested in developing a test for social adjustment in a college fraternity or sorority. You begin
by interviewing persons who had graduated from college after having been a member of a fraternity or
sorority for at least 2 years. Which stage of test development best describes the one that you are in?
A. the test revision stage
B. the test construction stage
C. the test-tryout stage
D. the pilot work stage

A "real time, live action" approach to assessment that requires the assessees to demonstrate abilities that
typically are characteristic of those they might encounter on-the-job is referred to as:
A. authentic assessment
B. portfolio assessment
C. performance assessment.
D. curriculum-based assessment

Item branching refers to:


A. the creation of alternate and parallel forms of tests based on a group of test-takers' responses to the
original test.
B. reusing items in an original test that were originally developed for use in a parallel test.
C. administering certain test items on a test depending on the test-takers' responses to previous
test items.
D. statistical efforts to ensure that items translated into foreign languages are of the same difficulty.

Psychological testing:
A. is characteristically broader in scope than assessment.
B. tends to be less accurate than assessment.
C. is typically more lengthy than assessment.
D. may be one component of the process of assessment.

Aphasia patients suffer the loss of the ability to:


A. perceive sounds lower in volume than a "dollar watch."
B. perceive smell.
C. express themselves orally or in writing.
D. hold their hands steady.

In an evaluation to determine dangerousness, information pertinent to _______ is typically gathered.


A. the availability of a weapon
B. the type of weapon to be used
C. All of these.
D. the specificity and detail of the plan

Melody exclaims, "I got a C- on the statistics exam, and I was miserable until I thought how terrible it must
be for those who got F's." Melody's attitude is an example of which of the following?
A. social learning
B. social anxiety
C. social comparison
D. social validation

The higher the item-reliability index:
A. the higher the internal consistency of the test.
B. the more likely the test developer is to eliminate the item.
C. the lower the internal consistency of the test.
D. the more likely the test-taker is to miss the item.

The greater the magnitude of the item-discrimination index:


A. the more people in the lower-scoring group answered the item correctly as compared with those in the
higher-scoring group.
B. the more people in the higher-scoring group answered the item correctly as compared with
those in the lower-scoring group.
C. the more reliable the test.
D. the more valid the test.
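
A minimal sketch of the index discussed in this item, using the common definition d = (U - L) / n, where U and L are the numbers of test-takers in the upper- and lower-scoring groups who answered the item correctly and n is the size of each group. The counts are hypothetical.

```python
def item_discrimination(upper_correct, lower_correct, group_size):
    """d = (U - L) / n, assuming equally sized upper and lower groups."""
    return (upper_correct - lower_correct) / group_size


# Hypothetical item: 24 of 30 high scorers and 12 of 30 low scorers pass it.
d = item_discrimination(24, 12, 30)
print(round(d, 2))  # prints 0.4
```

A negative value would mean that more low scorers than high scorers answered the item correctly.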

What is the difference between achievement and aptitude tests?


A. Aptitude tests are more limited in scope than achievement tests.
B. Aptitude tests are not used to make predictions about future performance, whereas achievement tests
are used for this purpose.
C. Aptitude tests draw on a broader fund of knowledge than achievement tests.
D. Aptitude tests require skills that are formally taught in school, and achievement tests require skills that
are learned informally.

In the language of psychological testing and assessment, reliability refers to:


A. whether or not a test publisher consistently publishes high quality instruments.
B. the lack of systematic errors.
C. the proportion of total variance that can be attributed to true variance.
D. how well a test measures what it is intended to measure under specified conditions.
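
In classical test theory, total score variance is treated as the sum of true variance and error variance. The sketch below computes the ratio of true to total variance under that assumption; the variance values and the function name are made up for illustration.

```python
def reliability_coefficient(true_variance, error_variance):
    """Proportion of total variance attributable to true variance."""
    total_variance = true_variance + error_variance
    return true_variance / total_variance


# Hypothetical variances: 80 units of true variance, 20 of error variance.
print(reliability_coefficient(80, 20))  # prints 0.8
```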

Which of the following is the best way to establish rapport with a test-taker?
A. shaking hands with the test-taker on arrival to the facility
B. presenting the test-taker with a business card
C. a few words of "small talk" on meeting
D. playing a calming and soothing music prior to testing

Who is credited with being the originator of the psychometric concept of test reliability?
A. Spearman
B. Kraepelin
C. Pearson
D. Titchener

During the course of a mental status examination, the examiner asks the examinee: "Do you know why
we are here and why I am interviewing you today?" By raising these questions, the examiner is MOST
likely trying to assess the examinee's:
A. orientation.
B. verbal abilities.
C. intellectual resources.
D. insight.

Using a cut score of 50 on a predictor test a researcher finds a base rate of 1.00. This means that when a
cut score of 50 is used:
A. 50% of applicants will fail on a criterion measure.
B. 100% of applicants will perform successfully on a criterion measure.
C. 100% of applicants will fail on a criterion measure.
D. 50% of applicants will perform successfully on a criterion measure.

If an instructor assigns a grade of "A" to all students who earn 900 or more points out of a total of 1000
points during the semester, 900 represents:
A. the selection ratio.
B. the base rate of A-level students.
C. the success rate.
D. the cut score for an A

The specific objective of a utility analysis will dictate what sort of information will be required, as well as
the specific:
A. Rise-and-Shine tables to be used.
B. expectancy tables to be used.
C. methods to be used.
D. Naylor-Shine tables to be used.

As compared to more traditional, one-on-one and face-to-face assessments, a disadvantage of CAPA is
that it typically deprives the assessor of the opportunity to:
A. tailor the test's content to the responses.
B. make certain that test forms can be kept secure.
C. observe the test-taker's test-taking behavior.
D. have a Bowflex workout during the assessment.

In a distribution that is symmetrical, which of the following is true?


A. The distances from Q1 and Q3 to the median are the same.
B. The distances from Q1 and Q4 to the median are the same.
C. The distances from Q1 and Q2 to the median are the same.
D. The distances from Q2 and Q3 to the median are the same.

A patient is administered the Minnesota Multiphasic Personality Inventory-2-RF (MMPI-2-RF) by an
experienced clinician. The clinician concludes that the patient has schizophrenia. The clinician's diagnosis
best supports which of the following additional conclusions?
A. The patient's pattern of responses to the MMPI-2-RF resembles that of people who are known to
have schizophrenia.
B. A brief interview with the patient would reveal that the patient harbors delusions of grandeur.
C. The clinician's interpretation of the MMPI-2-RF findings is based on knowledge of projective testing.
D. The patient received a high score on the lie scale of the MMPI-2-RF.

A neuropsychologist blindfolds a patient and then moves the patient's arms and legs in various positions.
However, the patient cannot identify where his limbs are located. The neuropsychologist would MOST
likely suspect that the patient has suffered damage to the:
A. frontal lobe
B. occipital lobe
C. temporal lobe
D. parietal lobe

The integration of data from statistical procedures, empirical methods, and formal rules to formulate
descriptions and make predictions is referred to as:
A. actuarial prediction
B. formal prediction
C. empirical prediction
D. clinical prediction

Which statement is TRUE regarding the definition of personality?


A. There is no universal definition of personality.
B. None of these.
C. Freud's definition of personality is universally accepted.
D. Hall and Lindzey's definition of personality is universally accepted.

The Hand-Tool Dexterity Test and the O'Connor Tweezer Dexterity Test would most likely be used by an
employer interested in:
A. finding the best employee for a position on an assembly line.
B. understanding a worker's motivation to respond quickly with accuracy.
C. assessing a worker's ability to physically manipulate materials.
D. increasing profit margins by lowering expenses.

The type of research that attempts to replicate a real-world problem in a research or clinical setting is
called:
A. analogue research.
B. research with unobtrusive measures.
C. a case history approach to research.
D. the sign approach to research.

A test score or index derived from the combination of, and/or mathematical transformation of, one or more
subtest scores is known as a:
A. standard score
B. test composite
C. scaled score
D. test requisite

Poor performance on the Block Design and other performance subtests of the Wechsler scales along with
high scores on the Verbal subtests would be suggestive of:
A. a "deterioration quotient" (DQ) of 10 or more.
B. possible damage in the right hemisphere of the brain.
C. severe head trauma.
D. possible damage in the left hemisphere of the brain.

What is the correlation coefficient of choice when two variables are ordinal?
A. Chi square
B. Pearson r
C. Spearman rho
D. Cronbach alpha
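
For readers who want to see how a rank-order correlation is computed, the sketch below uses the rank-difference formula rho = 1 - 6Σd² / (n(n² - 1)). It assumes there are no tied values, since the shortcut formula is exact only in that case; the data and helper names are hypothetical.

```python
def ranks(values):
    """Rank values from 1 (smallest) to n, assuming no ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        result[index] = rank
    return result


def spearman_rho(x, y):
    """Spearman rank-order correlation via the rank-difference formula."""
    n = len(x)
    d_squared = sum((a - b) ** 2 for a, b in zip(ranks(x), ranks(y)))
    return 1 - (6 * d_squared) / (n * (n ** 2 - 1))


# Hypothetical rankings of five examinees by two judges.
judge_1 = [1, 2, 3, 4, 5]
judge_2 = [2, 1, 4, 3, 5]
print(round(spearman_rho(judge_1, judge_2), 2))  # prints 0.8
```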

In order to make norms for a certain test more appropriate for use with test-takers from Taiwan, data from
the original standardization sample of a test is supplemented with Taiwanese norms. In this instance, it
would be:
A. inaccurate to continue to use the terms "standardization sample" and "normative sample"
interchangeably with reference to the test.
B. necessary to reevaluate the wording of the test's items in order to make certain that test-takers in
Taiwan do not find any of the items offensive in any way.
C. desirable to integrate the Taiwanese norms into the original norms so that both norms could be
referred to as the "standardization sample" norms with reference to this test.
D. perfectly appropriate to continue to use the terms "standardization sample" and "normative sample"
interchangeably with reference to this test.

Which of the following tests employed by the Army during World War I was probably the most
"culture-fair"?
A. the Army Beta
B. the Armed Services Vocational Aptitude Battery (ASVAB)
C. the Army General Classification Test (AGCT)
D. the Army Alpha Test

Which tool of psychological assessment is MOST likely to be used after tests have been administered to
a patient in order to evaluate that patient's level of premorbid functioning?
A. the interview
B. All of these.
C. behavioral observation
D. the case history

A psychologist working in a mental hospital is interested in predicting suicide risk from three variables:
severity of current depression, duration of current depression, and number of previous suicide attempts.
The psychologist gathers information on these variables for 100 subjects and wants to combine the
information in such a way that suicide risk is most accurately predicted. An appropriate statistical
technique would be:
A. incremental validity analysis.
B. meta-analysis.
C. multiple regression.
D. simple regression.

Traditional measures of reliability are inappropriate for criterion-referenced tests because variability:
A. is variable with criterion-referenced tests.
B. cannot be determined with criterion-referenced tests.
C. is minimized with criterion-referenced tests.
D. is maximized with criterion-referenced tests.

Dr. Chen is interested in feminist attitudes of young adult women in the United States. Consequently, she
administered a feminist attitude questionnaire to a total of 100 young adult women from three universities.
The 100 women tested and the number of young adult women in the United States are which of the
following, respectively?
A. Random assignment and random selection
B. Effect size and population
C. Independent and dependent variables
D. Sample and population

In an evaluation to determine dangerousness, information pertinent to ______ is typically gathered.
A. the availability of a weapon
B. All of these.
C. the specificity and detail of the plan
D. the type of weapon to be used

The NEPSY is to the Luria-Nebraska Neuropsychological Battery as the:
A. Halstead-Reitan is to the Luria-Nebraska.
B. WISC-III is to the WAIS-III.
C. Tower of Hanoi is to the Bender.
D. Trail Making Test is to the Porteus Maze Test.

A student scores very high on a graduate school admission test and is admitted to graduate school,
largely on the basis of his test score. The student subsequently flunks out. The type of test outcome
described in this situation is known as a:
A. true negative
B. positive hit
C. false positive
D. false negative

The Brogden-Cronbach-Gleser formula was developed by:
A. the work of Brogden and later, the work of Cronbach and Gleser.
B. the work of Cronbach, and later, the work of Brogden and Gleser.
C. the team of Brogden, Cronbach, and Gleser working together.
D. Brogden, Cronbach, and Gleser, each working independently.

A test format could be normative or ipsative. In the normative format:
A. each item depends on the item after it.
B. the client must possess an IQ within the normal range.
C. each item depends on the item before it.
D. each item is independent of all other items.

To ensure that a test developed for national use is indeed suitable for national use, test developers:
A. employ a culturally representative group of examiners.
B. post sample items on the Web to gauge response of different groups.
C. have a culturally representative panel of experts review test items.
D. All of these.

The plethysmograph is perhaps MOST useful in the assessment and treatment of:
A. agitated depression
B. people suffering from migraine
C. sexual offenders
D. cardiac arrest

A researcher was interested in whether or not jazz vocals and opera influence men's and women's
emotional states. She hypothesized that these types of music influence men and women differently. In a
study investigating this hypothesis, 40 men and 40 women heard a jazz piece, and 40 men and 40
women heard an operatic piece. The jazz piece was sung by a man, and the operatic piece was sung by
a woman. Afterward, participants rated themselves on an inventory measuring emotional state. Higher
scores on the inventory indicate a positive mood. Results of this study are presented in the graph below:
The researcher concludes from her study that jazz music positively changes men's moods and operatic
music positively changes women's moods.
Which of the following invalidates that conclusion?
A. Only one scale was used to measure mood.
B. Previous studies have shown that men are less emotional than women.
C. Men and women were randomly assigned to groups.
D. Men's and women's moods were not measured before exposure to the two types of music.

The greater the magnitude of the item-discrimination index:
A. the more valid the test.
B. the more reliable the test.
C. the more people in the lower-scoring group answered the item correctly as compared with those in the
higher-scoring group.
D. the more people in the higher-scoring group answered the item correctly as compared with
those in the lower-scoring group.
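
The item above can be illustrated with a short calculation. This is a minimal sketch that assumes the usual textbook form of the index, d = (U - L) / n, where U and L are the numbers of test-takers in the upper- and lower-scoring groups who answered the item correctly and n is the size of each group. The function name and the counts are hypothetical.

```python
def item_discrimination(upper_correct, lower_correct, group_size):
    """d = (U - L) / n for equal-sized upper- and lower-scoring groups."""
    return (upper_correct - lower_correct) / group_size


# Hypothetical item: 27 of 30 high scorers and 9 of 30 low scorers got it right.
good_item = item_discrimination(27, 9, 30)   # 0.6

# Hypothetical flawed item: low scorers outperform high scorers.
poor_item = item_discrimination(5, 20, 30)   # -0.5

print(good_item, poor_item)
```

A larger positive d means relatively more high scorers than low scorers passed the item; a negative d signals an item that may need revision.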

The higher the item-difficulty index, the ______ the item.
A. less robust
B. more robust
C. easier
D. harder
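
A quick way to check the direction of this index is to compute it. The sketch below assumes the standard definition of item difficulty as the proportion of test-takers who answered the item correctly; the function name and the counts are invented for illustration.

```python
def item_difficulty(num_correct, num_test_takers):
    """Proportion of test-takers answering the item correctly (p)."""
    return num_correct / num_test_takers


# Hypothetical items given to 50 test-takers.
p_first = item_difficulty(45, 50)   # 0.9
p_second = item_difficulty(10, 50)  # 0.2

print(p_first, p_second)
```

Because p counts correct answers, the item with p = 0.9 was passed by far more people than the item with p = 0.2.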

Which BEST describes what is typically measured in personality assessment?
A. social communication skills
B. personal values
C. creativity and motivation
D. traits and states

Validity is to _________ as utility is to __________.
A. consistency; accuracy
B. usefulness; consistency
C. usefulness; accuracy
D. accuracy; usefulness

A "good" test item on an ability test is one:
A. to which almost all test-takers respond correctly.
B. that distinguishes high scorers from low scorers.
C. to which almost all test-takers respond incorrectly.
D. in which it is absolutely impossible to guess the correct answer.

Question 1
A 16-year-old male suspected of drug abuse is referred for neuropsychological evaluation. Which tool of
assessment is LEAST likely to be used?
● referral for blood and urine tests
● measures of creative thinking
● familial medical history data
● case history data including school records
Question 2
The Trail Making Tests are part of which neuropsychological test battery?
● Halstead-Reitan Neuropsychological Battery
● Luria-Nebraska Neuropsychological Battery
● Woodcock-Johnson III
● Kaufman Assessment Battery
Question 3
Item branching refers to:
● administering certain test items on a test depending on the test-takers’ responses to previous test
items.
● statistical efforts to ensure that items translated into foreign languages are of the same difficulty.
● reusing items in an original test that were originally developed for use in a parallel test.
● the creation of alternate and parallel forms of tests based on a group of test-takers’ responses to
the original test.
Question 4
A psychologist wishes to compare the performances of an experimental group and a control group on a
continuous measure. Which of the following would be the most typical way to make this comparison?
● computing a multiple correlation
● conducting a t-test on the two means
● computing single correlation coefficient
● conducting a chi-square test
Question 5
On the Wechsler tests of intelligence, the Full Scale IQ has a mean of ________________ and a
standard deviation of _______________.
● 100; 15
● 100; 16
● 50; 15
● 50; 10
Question 6
An educational psychologist conducts a utility analysis of a teaching program used to improve the
handwriting of very young children. The measure of utility in this analysis will most likely be:
● reduction in accidents.
● increase in performance level.
● decrease in costs.
● increase in revenue.
Question 7
The Schedule for Affective Disorders and Schizophrenia is an example of:
● a projective test.
● an unstructured clinical interview.
● an objective personality test.
● a structured clinical interview.
Question 8
A self-report rating scale of neurological impairment is:
● the Neuropsychological Impairment Scale.
● the Seashore Rating Scale.
● the Short Portable Mental Status Questionnaire.
● the Patient’s Assessment of Own Functioning Scale.
Question 9
A psychologist who does not act in the same or similar way that other reasonable psychologists would
have acted under the same or similar circumstances may be found liable for:
● negligence.
● abuse.
● incompetency.
● malpractice.
Question 10
Users of psychological tests are frequently tempted to treat ordinal data as if it were interval data. This is
the case because of the:
● difficulties that would be encountered if the data were treated as ratio data.
● frequent need to do more than simply rank order test scores.
● added flexibility of interval level data for statistical manipulation.
● unwritten rule that exists pertaining to the equal intervals between points measured.
Question 11
A patient is administered the Minnesota Multiphasic Personality Inventory-2-RF (MMPI-2-RF) by an
experienced clinician. The clinician concludes that the patient has schizophrenia. The clinician’s diagnosis
best supports which of the following additional conclusions?
● The clinician’s interpretation of the MMPI-2-RF findings is based on knowledge of projective
testing.
● The patient’s pattern of responses to the MMPI-2-RF resembles that of people who are known to
have schizophrenia.
● The patient received a high score on the lie scale of the MMPI-2-RF.
● A brief interview with the patient would reveal that the patient harbors delusions of grandeur.
Question 12
In the acronym presented in text for remembering what each of the variables in the APGAR measures,
the letter G in APGAR stands for:
● grimace
● good muscle tone
● glucose
● genetic inheritance
Question 13
In ipsative scoring, a test-taker’s scores are compared only to:
● the scores of other test-takers from past years who have taken the same test under the same or
similar conditions.
● the scores of other test-takers from the same geographic area who are similar with regard to key
variables such as gender.
● his or her other scores on a parallel form of the same test.
● his or her other scores on the test.
Question 14
The mental status examination used as part of the neuropsychological evaluation:
● will typically delve into specific areas of interest more extensively than the one used as part of a
clinical or counseling assessment.
● typically includes the administration of a neuropsychologically oriented adjective checklist.
● is exactly the same as a mental status examination used during a clinical or counseling
assessment.
● typically includes the administration of an intelligence test.
Question 15
A patient exhibits deficits in word recall, vocabulary, and finding words to name things. A
neuropsychologist would be MOST likely to diagnose this patient with:
● limbic system
● parietal lobes
● occipital lobes
● frontal lobes
● anomia (I looked this one up) (2)
Question 16
As a result of the normalization of the standard scores on the MMPI-2, a T score of 70 on the Depression
Scale and a T score of 70 on Hypomania Scale will indicate that:
● the scores will be significantly different depending on the gender of the test-taker.
● the two T scores equal the same level of clinical elevation.
● the two scores result in different percentile ranks for each scale.
● neither score is significantly elevated.
Question 17
If a patient suddenly begins to experience extremes in mood ranging from blunted affect to emotional
outbursts, a neuropsychologist would suspect damage to the:
● cerebellum
● spinal cord
● occipital lobe
● limbic system

Question 18
A group of researchers was interested in learning whether a newly developed exam would be useful in
determining whether a student will be successful in college. The researchers designed a study in which
students took the new exam prior to entering college. Later, the students took another exam, which was
designed to measure how much information they had learned during their first year. The score on this exam was
then correlated with the student’s score on the newly developed exam. What type of validity was being
evaluated in the study?
● Concurrent
● Divergent
● Predictive
● Discriminant
Question 19
A researcher was interested in whether or not jazz vocals and opera influence men's and women's
emotional states. She hypothesized that these types of music influence men and women differently. In a
study investigating this hypothesis, 40 men and 40 women heard a jazz piece, and 40 men and 40
women heard an operatic piece. The jazz piece was sung by a man, and the operatic piece was sung by
a woman. Afterward, participants rated themselves on an inventory measuring emotional state. Higher
scores on the inventory indicate positive mood. Results of this study are presented in the graph below:
Which of the following describes the pattern of findings displayed in the graph?
● Women who heard the jazz piece and men who heard the operatic piece scored higher on the
mood inventory than those in the other two groups.
● Men who heard the jazz piece and women who heard the operatic piece scored higher on the
mood inventory than those in the other two groups.
● Men scored higher than women on the mood inventory regardless of the type of music they
heard.
● Women scored higher than men on the mood inventory regardless of the type of music they
heard.
Question 20
The standard deviation of a sample of test scores is a measure of the
● normality of the distribution
● central tendency of scores
● concurrent validity of the test
● variability of individual scores
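
To make the item above concrete, the sketch below compares two made-up sets of scores that share the same mean. It uses the sample standard deviation from Python's `statistics` module; the score lists are hypothetical.

```python
from statistics import mean, stdev

# Two hypothetical classes with the same average score.
class_a = [48, 49, 50, 51, 52]
class_b = [30, 40, 50, 60, 70]

print(mean(class_a), mean(class_b))    # both means are 50
print(round(stdev(class_a), 2))        # small spread around the mean
print(round(stdev(class_b), 2))        # much larger spread around the mean
```

The means are identical, so the standard deviation is the statistic that tells the two distributions apart.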

Question 21
Critics have argued that projective tests are too
● qualitative
● concrete
● subjective
● brief
Question 22
The Myers-Briggs Type Indicator (MBTI) is based on the theoretical writings of:
● Holland Opus
● Sigmund Freud
● Carl Jung
● B. F. Skinner
Question 23
A Director of Human Resources is setting up a series of tests to use to select applicants for sales
positions. Inherent in the tests, and applied in the model of selection, is the Director’s assumption that
high sales ability can make up for limited product knowledge. The model of selection being applied could
BEST be characterized as:
● a multiple hurdle model of selection.
● the method of predictive yield in action.
● a compensatory model of selection.
● the method of contrasting group for selection.
Question 24
Detailed information regarding how a particular test was developed is typically found in:
● the current test catalogue distributed by the test’s publisher.
● the Standards for Educational and Psychological Tests.
● the test manual
● a review of the test published in a journal.
Question 25
If someone tells you that they took a test and received scores on scales called School Problems, Conduct
Problems, and Immaturity, it is a good bet that this person was administered:
● the NEO-PI-R.
● the MMPI-A.
● the MMPI-2-RF.
● the School Problems Checklist.
Question 26
Validity is to ____________ as utility is to ____________.
● consistency; accuracy
● usefulness; accuracy
● usefulness; consistency
● accuracy; usefulness
Question 27
Sally and Bob both apply for a position as an accounting clerk. The Human Resource (HR) professional
responsible for selecting the best candidate for the position administers a standardized test of basic
mathematical skills to both Sally and Bob. Based on their scores, the HR professional chooses Sally. The
reason for this choice is that Sally has an 85% chance of performing at an acceptable level. By contrast,
Bob’s score indicated that he only had a 50% chance of performing successfully. The tool of assessment
used to make this hiring decision was:
● the method of predictive yield.
● an expectancy table.
● good, old-fashioned intuition.
● a Taylor-Russell table.

Question 28
A researcher conducted a study to determine the effects of gender and status on the perceived credibility
of an eyewitness testifying in a trial. Participants watched one of four video recordings depicting the
eyewitness and rated the credibility of the eyewitness.
What type of design was used in this study?
● between-subjects
● between- and within-subjects
● within-subjects
● multivariate correlational
Question 29
Which of the following is most appropriate for determining the psychometric soundness of behavioral
assessment?
● the experimental analysis of behavior
● generalizability theory
● classical test theory
● empirical methods

Question 30
A key difference between concurrent and predictive validity has to do with:
● the magnitude of the reliability coefficient that will be considered significant at the .05 level.
● the time frame during which data on the criterion measure is collected.
● the magnitude of the validity coefficient that will be considered significant at the .05 level.
● None of these.

Question 31
As compared to more traditional, one-on-one and face-to-face assessments, a disadvantage of CAPA is
that it typically deprives the assessor of the opportunity to:
● have a Bowflex workout during the assessment.
● tailor the test’s content to the responses.
● make certain that test forms can be kept secure.
● observe the test-taker’s test-taking behavior.
Question 32
If a test-taker earns a z score of +2 on a test, approximately how many other test-takers obtained higher
scores, assuming the distribution of test scores is normal?
● 2.5%
● 25%
● 16%
● 14%
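
The area beyond a given z score can be checked directly. This is a minimal sketch using the standard normal distribution from Python's `statistics` module; the variable names are illustrative. The exact upper-tail area beyond z = +2 is a little under 2.3%, and review items usually round this using the rule of thumb that about 95% of scores fall within 2 standard deviations of the mean.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0, sigma=1)

# Proportion of scores expected above z = +2 in a normal distribution.
proportion_above = 1 - standard_normal.cdf(2)

print(round(proportion_above, 3))  # roughly 0.023
```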
Question 33
The plethysmograph is perhaps MOST useful in the assessment and treatment of:
● people suffering from migraine
● sexual offenders
● agitated depression
● cardiac arrest
Question 34
Which model of intelligence guided the development of the fourth edition of the Stanford-Binet Intelligence
Scale?
● the Cattell-Horn theory
● Gardner’s theory of multiple intelligences
● the Cattell-Horn-Carroll (CHC) model
● Spearman’s two-factor theory
Question 35
Wakefield’s definition of a disorder includes:
● all of these
● an assumption that an evolutionary failure occurred.
● a value judgment regarding the basic goodness of people.
● a strong belief in herbal remedies for treatment.
Question 36
The specific objective of a utility analysis will dictate what sort of information will be required, as well as
the specific:
● methods to be used.
● Naylor-Shine tables to be used.
● expectancy tables to be used.
● Rise-and-Shine tables to be used.
Question 37
The thalamus acts as:
● a brake on emotional impulses and a calming influence when one is angered.
● visual-spatial sequencer for perceiving complex patterns of movement.
● a communications relay station for sensory information being transmitted to the cerebral cortex.
● an executive controller for volitional motor movements.
Question 38
Research on the Psychopathy Checklist suggests that it is useful in:
● identifying criminal recidivists.
● predicting violence within prisons.
● predicting crimes committed by inmates.
● identifying a psychotic disorder.
Question 39
Exploratory factor analysis is used for all of the following EXCEPT:
● summarizing large data sets efficiently.
● determining the number of dimensions present in the data.
● determining which items correlate with which dimensions in the data.
● determining whether one factor causes the appearance of another.
Question 40
In the language of psychological testing and assessment, scoring refers to assigning evaluative numbers,
codes or statements to performance on:
● Tasks
● Interviews
● All of these
● Tests
Question 41

In an evaluation to determine dangerousness, information pertinent to ______ is typically gathered.
● the type of weapon to be used
● the specificity and detail of the plan
● the availability of a weapon
● All of these.

Question 42

An assessor determines that John is incompetent to stand trial. This means that John:
● Is mentally retarded or psychotic.
● May have been under the influence of alcohol or some controlled substance at the time the
alleged offense was committed.
● All of these.
● Is unable to understand the charges against him and is unable to assist in his own defense.

Question 43
Which statement is TRUE regarding the definition of personality?
● Hall and Lindzey’s definition of personality is universally accepted.
● Freud’s definition of personality is universally accepted.
● There is no universal definition of personality.
● None of these.
Question 44
A counselor who fears the client has an organic, neurological, or motoric difficulty would most likely use
the
● Minnesota Multiphasic Personality Inventory.
● Thematic Apperception Test.
● Bender Gestalt.
● Rorschach.

Question 45
Which of the following is the best way to establish rapport with a test-taker?
● Presenting the test-taker with a business card
● Playing calming and soothing music prior to testing
● A few words of “small talk” on meeting
● Shaking hands with the test-taker on arrival to the facility
Question 46
Which BEST describes what is typically measured in personality assessment?
● Creativity and motivation
● Personal values
● Social communication skills
● Traits and states

Question 47
A psychologist working in a mental hospital is interested in predicting suicide risk from three variables:
severity of current depression, duration of current depression, and number of previous suicide attempts.
The psychologist gathers information on these variables for 100 subjects and wants to combine the
information in such a way that suicide risk is most accurately predicted. An appropriate statistical
technique would be:
● Meta-analysis.
● Multiple regression.
● Simple regression.
● Incremental validity analysis.

Question 48

A significant, positive relationship exists between scores on a new test of intelligence and scores on the
fourth edition of the Stanford-Binet intelligence scale. These data may be viewed as supportive of which
type of validity evidence for the new test?
● Convergent evidence of construct validity
● Content validity
● Discriminant evidence of construct validity
● Criterion-related validity

Question 49
The deviation IQ reflects a comparison of the performance of the individual with the performance of
others:
● In the entire standardization sample.
● In the same grade in the standardization sample.
● Of the same age in the standardization sample.
● In the same grade and of the same age in the standardization sample.

Question 50

You are interested in developing a test for social adjustment in a college fraternity or sorority. You begin
by interviewing persons who had graduated from college after having been a member of a fraternity or
sorority for at least 2 years. Which stage of test development best describes the one that you are in?
● The test construction stage
● The test revision stage
● The pilot work stage
● The test-tryout stage

Question 51
If John earns a full-scale IQ of 90 on the WISC-IV:
● John correctly answered 90 questions.
● John correctly answered 90% of the questions.
● John scored at the low end of the average range.
● Ninety percent of the students in John’s age group scored lower than John on this test.
Question 52
A test score or index derived from the combination of, and/or mathematical transformation of, one or more
subtest scores is known as a:
● standard score
● scaled score
● test requisite
● test composite

Question 53
As part of the test development process, a test revision may entail:
● The reprinting of a test.
● Rewording, deletion, or development of new items; and, development of a new edition of a test.
● Rewording, deletion, or development of new items.
● Development of a new edition of a test.

Question 54
What is the correlation coefficient of choice when two variables are ordinal?
● Chi square
● The Spearman rho
● Cronbach alpha
● Pearson r

Question 55
In a distribution that is symmetrical, which of the following is true?
● The distances from Q2 and Q3 to the median are the same.
● The distances from Q1 and Q2 to the median are the same.
● The distances from Q1 and Q3 to the median are the same.
● The distances from Q1 and Q4 to the median are the same.

Question 56
In a study of a new psychopharmacological treatment for clinical depression, 40 participants diagnosed
with depression each received four (4) different amounts of a new medication called Deplow. The first
week, they were given a placebo. During the second week of the study, they took 1 mg. of Deplow each
day. During the third week, they took 3 mg. of Deplow each day, and during the fourth week, they took 5
mg. of Deplow each day. Although the participants took different amounts of the medication each week,
they were not informed about the amount they were taking. The participants also completed a depression
symptom checklist at the end of each week. Results are presented below. The score on the checklist
could range from 0 to 30, with higher scores indicating more severe depression. Assume statistical
significance for differences greater than 3.0. What type of design was used in this study?
● Single factor between subjects
● Cross-sectional
● Multifactor between subjects
● Single factor within subjects

Question 57
A “good” test item on an ability test is one:
● To which almost all test-takers respond incorrectly.
● That distinguishes high scorers from low scorers.
● To which almost all test-takers respond correctly.
● In which it is absolutely impossible to guess the correct answer.

Question 58
Kate received a z score of 1 on a reading test. What do we know about Kate’s performance, assuming
that the reading test scores are distributed normally?
● She scored better than only 2/3 of the other students.
● She scored worse than 84% of other students.
● She scored better than 84% of other students.
● She scored worse than only 2/3 of other students.

Question 59
If the results of an examination are negatively skewed, the exam questions were likely:
● Biased
● Easy
● Difficult
● Quite novel in many respects
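
The link between skew and exam difficulty can be seen with a small made-up data set. This sketch assumes the moment-based skewness coefficient (mean cubed deviation divided by the cubed population standard deviation); the function name and the scores are hypothetical.

```python
from statistics import mean, median, pstdev


def skewness(scores):
    """Moment-based skewness: negative when the long tail points to low scores."""
    m = mean(scores)
    sd = pstdev(scores)
    return sum((x - m) ** 3 for x in scores) / (len(scores) * sd ** 3)


# Hypothetical exam where most students scored high and a few scored low.
scores = [95, 92, 90, 88, 94, 91, 60, 93, 89, 45]

print(skewness(scores) < 0)            # True: the tail is on the low end
print(mean(scores) < median(scores))   # True: low scores pull the mean down
```

When most scores pile up near the top of the scale and only a few trail off toward the bottom, the distribution is negatively skewed.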
Question 60
Criterion validity of the General Aptitude Test Battery (GATB) tends to be low, probably because of:
● A limitation of the Taylor-Russell tables.
● The low test-retest reliability of the GATB.
● The low reliability of supervisory ratings.
● Scoring that is based in part on the race of the test-taker.

Question 61
The Brogden-Cronbach-Gleser formula was developed by:
● Brogden, Cronbach, and Gleser, each working independently.
● The team of Brogden, Cronbach, and Gleser working together.
● The work of Brogden and later, the work of Cronbach and Gleser.
● The work of Cronbach, and later, the work of Brogden and Gleser.

Question 62
If an ANOVA yields a significant F value, you could rely on _______ to test significant differences
between group means.
● Duncan’s multiple-range, Tukey’s, or Scheffe’s test.
● Percentile rank.
● Summative or formative evaluation.
● One- and two-tailed t tests.

Question 63
Citing only positive attributes in a self-report measure of personality is a phenomenon referred to as:
● Amplifying
● Projecting
● Self-deception
● Socially desirable responding

Question 64
The greater the magnitude of the item-discrimination index:
● The more people in the lower-scoring group answered the item correctly as compared with those
in the higher-scoring group.
● The more reliable the test.
● The more valid the test.
● The more people in the higher-scoring group answered the item correctly as compared with those
in the lower-scoring group.

Question 65
Which tool of psychological assessment is MOST likely to be used after tests have been administered to
a patient in order to evaluate that patient’s level of premorbid functioning?
● The case history
● The interview
● Behavioral observation
● All of these.

Question 66
If a test-taker is asked to name familiar objects, write familiar words, and follow verbal instructions on a
test that takes 15 minutes or less, he or she would most likely be taking:
● The Aphasia Screening Test.
● The Halstead-Reitan Neuropsychological Battery.
● The Wechsler Memory Scale.
● The Wisconsin Card Sorting Test.
Question 67
An administration of the Montreal Neuropsychological Institute Battery entails the administration of:
● The Wisconsin Card Sorting Test.
● The Wechsler Intelligence Test.
● All of these
● The Mooney Faces Test.

Question 68
In a normal distribution of scores, approximately what percentage of test scores falls between +1 and –1
standard deviations from the mean?
● 66%
● less than 1%
● 50%
● 75%

Question 69
Which of the following statements is TRUE of the role of personality measures in industrial/organizational
psychology?
● The distinction between task-related and people-related aspects of a job is irrelevant to
personality measurement.
● Personality tests are playing less and less a role in I/O settings with every passing year.
● The same personality test may not be equally suited for use with every job.
● The MMPI-2-RF has quickly become the most widely used measure of personality in I/O settings.

Question 70

If a time limit is long enough to allow test-takers to attempt all items, and if some items are so difficult that
no test-taker is able to obtain a perfect score, then the test is referred to as a ____ test.

● power
● valid
● reliable
● speed
Question 71

Which of the following represents a problem unique to self-report personality tests?

● All of these.
● Respondents may be too "low" on the construct being measured to register on the test.
● Respondents might be unwilling to reveal something negative about themselves.
● The reading ability of respondents may prevent them from responding accurately to items.

Question 72

The integration of data from statistical procedures, empirical methods, and formal rules to formulate
descriptions and make predictions is referred to as:

● clinical prediction
● formal prediction
● empirical prediction
● actuarial prediction

Question 73

An item-reliability index provides a measure of a test's:

● test-retest reliability
● internal consistency
● All of these
● stability

Question 74

Dr. Chen is interested in feminist attitudes of young adult women in the United States. Consequently, she
administered a feminist attitude questionnaire to a total of 100 young adult women from three universities.
The 100 women tested and the number of young adult women in the United States are which of the
following, respectively?

● Random assignment and random selection
● Independent and dependent variables
● Sample and population
● Effect size and population

Question 75

Behavioral assessment has many advantages over other forms of assessment. Which is NOT one of
those advantages?

● Behavioral assessment can provide behavioral baseline data.
● Behavioral assessment can be used to pinpoint environmental conditions that are acting to
trigger, maintain, or extinguish certain behaviors.
● Behavioral assessment can provide adequate explanations for apparently contradictory dynamics
in motivation.
● Behavioral assessment can provide a record of the assessee's behavioral strengths and
weaknesses across a variety of situations.

Question 76
The higher the item-difficulty index, the ______ the item.

● more robust
● harder
● easier
● less robust

Question 77

During the course of a mental status examination, the examiner asks the examinee: "Do you know why
we are here and why I am interviewing you today?" By raising these questions, the examiner is MOST
likely trying to assess the examinee's:

● intellectual resources
● insight
● verbal abilities
● orientation

Question 78

The task of sorting statement cards from least descriptive to most descriptive is most characteristic of
which type of assessment method?

● Forced-choice
● T-sort
● Q-sort
● Semantic differential rating technique

Question 79
Aphasia patients suffer the loss of the ability to:
● perceive smell.
● hold their hands steady.
● express themselves orally or in writing.
● perceive sounds lower in volume than a "dollar watch."
Question 80
Melody exclaims, "I got a C- on the statistics exam, and I was miserable until I thought how terrible it must
be for those who got F's." Melody's attitude is an example of which of the following?
● social anxiety
● social comparison
● social learning
● social validation
Question 81
The Otis-Lennon School Ability Test yields which of the following composite scores?
● Verbal Performance Composite
● O-L Composite
● Full Scale IQ
● School Ability Index
Question 82
A test developer designs a test for the sole purpose of identifying the most highly skilled individuals
among those tested. During the test revision stage of test development, the test developer will be
particularly interested in:
● item bias
● item reliability
● item validity
● item discrimination
Question 83
The 16 PF exemplifies which approach to assessment?
● impressionistic
● nomothetic
● projective
● idiographic
Question 84
An example of a personality test that employed empirical criterion keying in its development is
the:
● MMPI
● 16 PF
● NEO-PI-R
● Rorschach
Question 85
Psychometrics may best be defined as:
● the science of psychological measurement.
● the study of psychic phenomena.
● the science of test development.
● the study and use of correlational techniques.

Question 86
Which of the following increases the power of a statistical test?
● changing from a two-tailed to a one-tailed test
● changing alpha from .05 to .01
● using a smaller critical area in the distribution of sample means
● decreasing the sample size from N = 100 to N = 75
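
The options above can be compared numerically. The sketch below computes the power of a simple z-test under assumed values (a standardized effect size of 0.25 and a known standard deviation), using only the Python standard library. The function name `z_test_power` and all numbers are illustrative assumptions, not taken from the reviewer.

```python
from math import sqrt
from statistics import NormalDist

def z_test_power(effect_size, n, alpha=0.05, tails=2):
    """Power of a z-test for a true standardized effect (assumed known SD)."""
    z = NormalDist()
    delta = effect_size * sqrt(n)  # expected value of the test statistic
    if tails == 1:
        return 1 - z.cdf(z.inv_cdf(1 - alpha) - delta)
    crit = z.inv_cdf(1 - alpha / 2)
    # Two-tailed: probability of landing in either rejection region
    return (1 - z.cdf(crit - delta)) + z.cdf(-crit - delta)

baseline = z_test_power(0.25, 100)                 # two-tailed, alpha = .05
one_tailed = z_test_power(0.25, 100, tails=1)      # one-tailed, alpha = .05
stricter_alpha = z_test_power(0.25, 100, alpha=0.01)
smaller_sample = z_test_power(0.25, 75)
print(baseline, one_tailed, stricter_alpha, smaller_sample)
```

Under these assumptions, only the one-tailed version ends up with more power than the baseline; a smaller alpha or a smaller sample both reduce it.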

Question 87
Research by Solomon Asch supports which of the following?
● Individuals will follow orders to shock innocent strangers.
● Higher levels of conformity are found in individualistic societies than in collectivistic societies.
● The presence of one dissenter in a group is not strong enough to reduce conformity.
● Conformity increases as group size increases from two people to four or five people.

Question 88
If a new test was developed to assist a college in selecting applicants, which group of test-takers should
ideally be administered the test items during the item tryout phase of the new test's
development?
● college students who put a hold on their academic studies in order to backpack through Europe
for 1 year or more
● seniors in high school who were accepted to the college on the basis of criteria other than the test
under development
● freshmen in college admitted who had taken one or more advanced placement courses in high
school
● all high school juniors who are college-bound
Question 89
Psychoeducational test batteries are designed to measure:
● academic motivation
● adjustment and personality
● ability and achievement
● scholastic aptitude
Question 90
A researcher was interested in whether or not jazz vocals and opera influence men's and women's
emotional states. She hypothesized that these types of music influence men and women differently. In a
study investigating this hypothesis, 40 men and 40 women heard a jazz piece, and 40 men and 40
women heard an operatic piece. The jazz piece was sung by a man, and the operatic piece was sung by
a woman. Afterward, participants rated themselves on an inventory measuring emotional state. Higher
scores on the inventory indicate a positive mood. Results of this study are presented in the graph below:
[Figure: average mood inventory scores of men and women by music type]
Which of the following is the most serious problem with the methodology of this research?
● Only one type of music should have been used.
● The sample size was too small to draw a valid conclusion.
● Men and women did not listen to both types of music. (isn't it this one?)
● The singers were not the same gender.
Question 91
If a psychologist determines that a client is a danger to others, that psychologist has a legal obligation to:
● keep the information privileged.
● warn the person who is in danger.
● share this information with a colleague before taking any action.
● seek a legal opinion from a lawyer.
Question 92
Which question might a mental health professional be MOST likely to be asked during the course of a civil
proceeding?
● Is this individual competent to stand trial?
● All of these.
● To what extent did this individual suffer emotional distress?
● Was this individual sane at the time the crime was committed?
Question 93
A 40-item vocabulary test was administered to a group of students. A second, similar test of vocabulary
terms was administered to this same group of students approximately one week later. The researcher
reported that the correlation between these two tests was r = .90. What type of reliability is represented in
this example?
● split-half
● alternate forms
● test-retest
● inter-rater
Question 94
In a study of a new psychopharmacological treatment for clinical depression, 40 participants diagnosed
with depression each received four (4) different amounts of a new medication called Deplow. The first
week, they were given a placebo. During the second week of the study, they took 1 mg. of Deplow each
day. During the third week, they took 3 mg. of Deplow each day, and during the fourth week, they took 5
mg. of Deplow each day. Although the participants took different amounts of the medication each week,
they were not informed about the amount they were taking. The participants also completed a depression
symptom checklist at the end of each week. Results are presented below. The score on the checklist
could range from 0 to 30, with 30 indicating severe depression. Assume statistical significance for differences
greater than 3.0.

Which of the following would make it difficult to conclude that any decrease in depressive symptoms is
due to Deplow and not to other aspects of the study?
● The lack of comparison with an established antidepressant medication
● The increasing doses of Deplow
● The low sample size
● The lack of a control group

Question 95
When a cut score is set based on norm-related considerations rather than on the relationship of test
scores to a criterion, it is known as:
● an absolute cut score.
● a referential cut score.
● a fixed cut score.
● a relative cut score.
Question 96
Which of the following is NOT an assumption of utility analysis?
● psychological tests are always preferred over other means of assessment.
● the value of people and their performance can be estimated.
● large amounts of information can be integrated to make good decisions.
● the performance of people in organizations can affect organizational viability.

Question 97
A researcher was interested in whether or not jazz vocals and opera influence men's and women's
emotional states. She hypothesized that these types of music influence men and women differently. In a
study investigating this hypothesis, 40 men and 40 women heard a jazz piece, and 40 men and 40
women heard an operatic piece. The jazz piece was sung by a man, and the operatic piece was sung by
a woman. Afterward, participants rated themselves on an inventory measuring emotional state. Higher
scores on the inventory indicate positive mood. Results of this study are presented in the graph below:
The researcher concludes from her study that jazz music positively changes men's moods and operatic
music positively changes women's moods. Which of the following invalidates that conclusion?
● Previous studies have shown that men are less emotional than women.
● Men's and women's moods were not measured before exposure to the two types of music.
● Only one scale was used to measure mood.
● Men and women were randomly assigned to groups
Question 98
It is one of the vital tools of psychological assessment which pertains to how consistently and accurately a
psychological test measures what it purports to measure.
● Utility
● Reliability
● Psychometric Soundness
● Inter-item Consistency
Question 99
Who is credited with being the originator of the psychometric concept of test reliability?
● Spearman
● Pearson
● Kraepelin
● Titchener
Question 100
If there is common ground among all of the varied approaches to psychological testing and
assessment, that common ground may MOST have to do with the assessor's:
● strict adherence to ethical guidelines.
● psychoanalytically-based interpretation of findings.
● All of these
● use of an ability test and a test of personality.
Question 101
The lower test-retest reliability coefficients found to exist for state anxiety when compared with higher
test-retest reliability coefficients obtained for trait anxiety support which premise?
● None of these.
● Traits are more enduring personality characteristics than states.
● States are more enduring personality characteristics than traits.
● Exhibition of anxiety is very situation-dependent.
Question 102
If a student's performance on a newly developed math achievement test is compared with his or her
recent performance on another achievement test known to measure math skills, this would be an example
of_______validity.
● predictive criterion-related
● concurrent criterion-related
● content
● construct
Question 103
The mental status examination used as part of the neuropsychological evaluation:
● is exactly the same as a mental status examination used during a clinical or counseling
assessment.
● will typically delve into specific areas of interest more extensively than the one used as part of a
clinical or counseling assessment.
● typically includes the administration of an intelligence test.
● typically includes the administration of a neuropsychologically oriented adjective checklist
Question 104
Which is NOT a typical question that is raised and answered during the test conceptualization stage of
test development?
● Is there a need for the test?
● How valid are the items on the test?
● What is the objective of the test?
● What types of responses will be required of the test-taker?
Question 105
The standard error of measurement of a particular test of anxiety is 8. A student obtains a score of 60. What is the
confidence interval for this test score at the 95% level?
● 44-76
● 40-68
● 36-84
● 52-68
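
A confidence interval around an observed score is commonly worked out as score ± z × SEM, with z ≈ 1.96 at the 95% level (many textbooks round this to 2 SEM). The sketch below applies that rule to the numbers in the item above; the function name `confidence_interval` is a hypothetical helper.

```python
def confidence_interval(score, sem, z=1.96):
    """Return (lower, upper) bounds of the interval score +/- z * SEM."""
    margin = z * sem
    return score - margin, score + margin

# Values from the item: observed score 60, SEM 8, 95% level
low, high = confidence_interval(60, 8)
print(round(low), round(high))  # 44 76
```

The unrounded bounds are 44.32 and 75.68.
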
Question 106
The Children's Apperception Test (CAT) depicts ____ in its pictures.
● humans interacting with animals
● animals
● humans
● dolls and puppets
Question 107
Psychologists who are called on by the courts to render an opinion regarding a person's sanity must be
prepared to:
● undergo an examination evaluating their own mental status prior to being admitted to the
courtroom.
● have their license revoked if their opinions regarding the sanity of the defendant are at odds with
that of the court.
● deal with all the ramifications of the fact that diagnoses of "sanity" and "insanity" are ultimately left
to a judge or a jury to decide.
● explain all the ramifications of the fact that "sanity" and "insanity" are psychological and not legal
terms.
Question 108
In the language of psychological testing and assessment, reliability refers to:
● the proportion of total variance that can be attributed to true variance.
● The lack of systematic errors.
● whether or not a test publisher consistently publishes high quality instruments.
● How well a test measures what it is intended to measure under specified conditions.
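
In classical test theory, total score variance is treated as the sum of true variance and error variance, and the reliability coefficient is the share of the total that is true variance. The sketch below illustrates that ratio with made-up variance figures; the function name `reliability_coefficient` is hypothetical.

```python
def reliability_coefficient(true_variance, error_variance):
    """Proportion of total variance attributable to true variance."""
    total_variance = true_variance + error_variance
    return true_variance / total_variance

# Hypothetical variances: 80 units of true variance, 20 of error variance
r_xx = reliability_coefficient(80, 20)
print(r_xx)  # 0.8
```
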
Question 109
A norm group is a group of test-takers:
● for whom a particular test is deemed appropriate.
● that is typically described in the test manual
● taking a particular test for the very first time.
● for whom a particular test is deemed inappropriate

Question 110
The term "RIASEC" is BEST associated with
● Strong
● Wonderlic
● Holland
● Guttman
Question 111
The 7-Minute Screen was developed to identify symptoms associated with which of the following?
● personality disorders
● seizure disorders
● All of these.
● Alzheimer's disease
Question 112
A student scores very high on a graduate school admission test and is admitted to graduate school,
largely on the basis of his test score. The student subsequently flunks out. The type of test outcome
described in this situation is known as a:
● positive hit.
● false negative.
● false positive.
● true negative.
Question 113
Forced-choice item formats are typically employed to control which of the following?
● None of these.
● the reliability of respondent's response patterns
● test-takers' tendency toward impression management
● interviewers' tendency to stray from the points of focus
Question 114
A self-report rating scale of neurological impairment is:
● The Short Portable Mental Status Questionnaire.
● the Seashore Rating Scale.
● The Neuropsychological Impairment Scale.
● the Patient's Assessment of Own Functioning Scale
Question 115
An anchor protocol is:
● a list of guidelines for a standardized test used to ensure that all test-takers are similar in key
ways to the population of the original standardization sample.
● a model for scoring and a mechanism for resolving scoring discrepancies.
● a previously developed test with known validity that can be used as a comparison for newly
developed tests.
● a statistical procedure in which weights are assigned to each item of a model test to maximize
predictive validity.
Question 116
In the administration of the TAT:
● all stimulus cards are presented to all subjects
● a minimum of ten cards must be presented.
● the number of cards presented is left to examiner discretion
● a maximum of twenty cards is presented
Question 117
In order to make norms for a certain test more appropriate for use with test-takers from Taiwan, data from
the original standardization sample of a test is supplemented with Taiwanese norms. In this instance, it
would be:
● necessary to reevaluate the wording of the test's items in order to make certain that test-takers in
Taiwan do not find any of the items offensive in any way.
● inaccurate to continue to use the terms "standardization sample" and "normative sample"
interchangeably with reference to the test.
● perfectly appropriate to continue to use the terms "standardization sample" and "normative
sample" interchangeably with reference to this test
● desirable to integrate the Taiwanese norms into the original norms so that both norms could be
referred to as the "standardization sample norms" with reference to this test.
Question 118
The strongest psychometric aspect of the Rorschach is its:
● test-retest reliability over a short period of time.
● interrater reliability with respect to scoring categories
● interrater reliability with respect to interpretations.
● internal-consistency split-half reliability for odd and even items
Question 119
The method of paired comparisons is used to:
● provide test-takers with a sufficient number of pairs of choices to express their "true" opinions.
● provide test-takers with a limited number of pairs of choices in order to lessen testing time.
● maximize the opportunity of selecting a socially desirable response
● minimize the opportunity of selecting a socially desirable response
Question 120
To ensure that a test developed for national use is indeed suitable for national use, test developers:
● All of these.
● post sample items on the Web to gauge response of different groups
● have a culturally representative panel of experts review test items.
● employ a culturally representative group of examiners
Question 121
The Likert scale is an example of which type of rating scale?
● paired methods
● summative
● content
● categorical
Question 122
Hit rate is equivalent to:
● the miss rate/the selection ratio.
● the success rate/base rate of successful performance
● number of correct classifications /total number of classifications
● the base rate/the selection ratio
Question 123
Which of the following tests employed by the Army during World War I was probably the most
"culture-fair"?
● the Army Alpha Test
● the Armed Services Vocational Aptitude Battery (ASVAB)
● the Army General Classification Test (AGCT)
● the Army Beta
Question 124
An intelligence test originally written in English is to be administered to a group of Japanese immigrants
who do not speak English. In order to obtain an accurate measure of intelligence and completely
eliminate any possible effects due to language, the test administrator should:
● have a teacher fluent in Japanese and English conduct a brief tutorial in English prior to
administering the test in English, with specific attention given to the meaning of the wording of
key items and corresponding responses.
● None of these.
● have a professional translator read the test to the group, simultaneously translating the items
word for word.
● have a friend or family member of the group who is fluent in English and Japanese read the test to
the group, simultaneously translating the items word for word.
Question 125
The Beck Depression Inventory-II measures the test-takers' feelings over what period of time?
● The last month
● 3 weeks
● 1 week
● 2 weeks
Question 126
Which of the following statements is TRUE of the role of personality measures in industrial/organizational
psychology?
● Personality tests are playing less and less of a role in I/O settings with every passing year.
● The distinction between task-related and people-related aspects of a job is irrelevant to
personality measurement.
● The MMPI-2-RF has quickly become the most widely used measure of personality in I/O settings.
● The same personality test may not be equally suited for use with every job.
Question 127
A counselor created an achievement test with a reliability coefficient of .82. The test is shortened since
many clients felt it was too long. The counselor shortened the test but logically assumed that the reliability
coefficient would now:
● remain at .82.
● be at least 10 points higher or lower
● be lower than .82.
● be approximately .88.
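
The effect of changing test length on reliability is usually estimated with the Spearman-Brown formula, r_new = n·r / (1 + (n − 1)·r), where n is the ratio of new length to old length. The item above does not say how much the test was shortened, so the sketch below assumes, purely for illustration, that it was cut in half (n = 0.5) and reads the original coefficient as .82. The function name `spearman_brown` is a hypothetical helper.

```python
def spearman_brown(reliability, length_ratio):
    """Estimate reliability after changing test length by length_ratio."""
    return (length_ratio * reliability) / (1 + (length_ratio - 1) * reliability)

original = 0.82
# Assumption for illustration only: the shortened test is half as long
shortened = spearman_brown(original, 0.5)
print(round(shortened, 2))  # 0.69
```

Any length ratio below 1 gives an estimate below the original coefficient, and a ratio above 1 raises it.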

Test reliability refers to:


a. how accurately a test measures what it purports to measure.
b. how consistently a test measures what it purports to measure.
c. the "depth" of measurement of a particular construct.
d. the "bandwidth" of measurement of a particular construct

An applicant for a job with the U.S. Postal Service scores in the bottom 5% of all applicants on a test that
measures the ability to sort mail. This is an example of:
a. norm-referenced assessment. (comparing that person's score to a normative sample)
b. criterion-referenced assessment.
c. behavioral assessment.
d. an individual who may one day "go postal."

A third-grade student who earned a grade-equivalent score of 5.0 on a standardized test of mathematics:
a. has the same mathematics ability as the average fifth-grade student in that same school.
b. should not be enrolled in a fifth-grade math class.
c. performed similarly to a hypothetical fifth-grade student.
d. will most probably earn a grade of A in the course.

A source of error variance may take the form of:


a. item sampling.
b. test-takers' reactions to environment-related variables such as room temperature and lighting.
c. test-taker variables such as amount of sleep the night before a test, amount of anxiety, or drug effects.
d. All of these

Which type of reliability estimate would be appropriate only when evaluating the reliability of a test that
measures a trait that is relatively stable over time?
a. parallel-forms
b. alternate-forms
c. test-retest
d. split-half

An estimate of test-retest reliability is often referred to as a coefficient of stability when the time interval
between the test and retest is more than:
a. 30 days.
b. 60 days.
c. 3 months.
d. 6 months.

A test is considered valid when the test:


a. measures what it purports to measure.
b. measures whatever it is that it measures consistently.
c. can be administered efficiently and cost-effectively.
d. has little or no error associated with it

Which is NOT a method of evaluating the validity of a test?


a. evaluating scores on the test to scores obtained on other tests
b. evaluating the content of the test
c. evaluating the percentage of passing and failing grades on the test
d. evaluating test scores as they relate to predictions from a particular theory

Which term is used to refer to the tendency of a rater to evaluate a person higher than they objectively
deserve because of the rater's inability to discriminate between aspects of the person's behavior?
a. halo effect
b. random error
c. generosity error
d. severity error

Although there are some exceptions, in practice, most reliability coefficients, regardless of the specific
type of reliability they are measuring, range in value from:
a. -1 to +1
b. 0 to 100
c. 0 to 1.
d. negative infinity to positive infinity

Which of the following concepts is synonymous with utility as used in the text?
a. consistency
b. truthfulness
c. usefulness
d. accuracy

Which is an example of the use of a completion format on a test? (all other answers are selection format)
a. true-false items
b. matching items
c. short-answer items
d. multiple-choice items

The higher an item-validity index (items correspond to what the test says it measures), the greater the
__________ validity.
a. construct
b. content
c. criterion
d. face

Which is an example of a false positive in the context of employee selection?
a. hired applicants who scored at or above the cut-off score on the employment test went on to fail on
the job
b. hired applicants who scored at or above the cut-off score on the employment test went on to
succeed on the job
c. rejected applicants who scored below the cut-off score on the employment test were rejected but
would have gone on to succeed on the job had they been given a chance
d. rejected applicants who scored below the cut-off score on the employment test were rejected but
went on to succeed at another, totally different job.

On a particular test, men and women tend to have the same total score. Men and women do, however,
tend to exhibit different response patterns to specific items. A reasonable conclusion is that the test is:
a. unreliable.
b. invalid.
c. biased.
d. patently unfair.

In Guttman scaling:
a. test-takers are presented with a forced-choice format
b. each item is completely independent of every other item and nothing can be concluded as the
result of the endorsement of an item.
c. when one item is endorsed by a test-taker, the less extreme aspects of that item are also
endorsed.
d. when more than one item tapping a particular content area is endorsed, the less extreme aspects
of those items are also endorsed.

Who is best associated with the development of the scaling methodology?


a. Galton
b. Cohen
c. Spearman
d. Thurstone

Item-discrimination indexes can range from ________ to ________.

a. .001; 1.00
b. -1; +1
c. 0%; 100%
d. 1 to 100
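
One common form of the item-discrimination index is d = (U − L) / n, where U and L are the numbers of test-takers in the upper- and lower-scoring groups who answered the item correctly and n is the size of each group. The sketch below uses made-up counts and a hypothetical function name, `item_discrimination`, to show the values the index can take.

```python
def item_discrimination(upper_correct, lower_correct, group_size):
    """d = (U - L) / n for equal-sized upper and lower scoring groups."""
    return (upper_correct - lower_correct) / group_size

# Hypothetical groups of 25 test-takers each
typical = item_discrimination(20, 5, 25)    # most high scorers pass, few low scorers do
best_case = item_discrimination(25, 0, 25)  # every high scorer passes, no low scorer does
worst_case = item_discrimination(0, 25, 25) # the reverse pattern
print(typical, best_case, worst_case)
```

The two extreme patterns mark the ends of the possible range, and a negative value flags an item that low scorers pass more often than high scorers.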

The elements of a multiple-choice item include:

a. a stem.
b. a distractor.
c. a foil.
d. All of these

An anchor protocol is:


a. a previously developed test with known validity that can be used as a comparison for newly developed
tests.
b. a statistical procedure in which weights are assigned to each item of a model test to maximize
predictive validity.
c. a list of guidelines for a standardized test used to ensure that all test-takers are similar in key ways to
the population of the original standardization sample.
d. a model for scoring and a mechanism for resolving scoring discrepancies.

Item analysis is conducted to analyze:

a. item reliability.
b. item validity.
c. item difficulty.

Test items that contain alternatives with five points ranging from "strongly agree" to "strongly disagree"
are characterized as using this approach to scaling:

a. Guttman scaling.
b. Likert scaling.
c. Nielson scaling.
d. opinion scaling.

Multiple hurdles as used in a decision-making process regarding a selection decision refers to:

a. the use of two or more cut scores with reference to one predictor for the purpose of categorizing
test-takers.
b. the multiple stages each applicant must successfully complete in order to get to the next stage in
the evaluation process.
c. the obstacles to success placed before each of the contestants on Project Runway.
d. All of these

A potential noneconomic benefit of a well-run evaluation program is:

a. increase in quantity and quality of workers' on-the-job performance.


b. decrease in time it takes to train new workers.
c. reduction in the number of workplace accidents.
d. All of these

When the selection ratio goes down:

a. top-down selection policy can become discriminatory.


b. hiring becomes less selective.
c. competition for the position is likely to increase.
d. Both a and b
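
The selection ratio is generally defined as the number of people to be hired divided by the number of applicants. The sketch below uses made-up figures and a hypothetical function name, `selection_ratio`, to show what happens to the ratio when the applicant pool grows while the number of openings stays fixed.

```python
def selection_ratio(openings, applicants):
    """Number of positions to fill divided by number of applicants."""
    return openings / applicants

# Hypothetical hiring rounds with 10 openings each
small_pool = selection_ratio(10, 50)   # 1 in 5 applicants can be hired
large_pool = selection_ratio(10, 200)  # 1 in 20 applicants can be hired
print(small_pool, large_pool)
```

A lower ratio here means more applicants per opening.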

Shelly applies for a job at a company that gives all applicants a drug test during the hiring process.
Despite the fact that Shelly smokes marijuana almost daily, the company's test report indicates that
she is drug-free. In this case:

a. a false positive was reported.


b. a false negative was reported.
c. a confirmed hit occurred.
d. someone should investigate what the personnel director is smoking

The terms organicity and neurological damage:


A. were generally used interchangeably from about the time of World War I to the 1950s.
B. are basically the same diagnostic entities and are unitary in nature.
C. refer to the fact that most brain-damaged children share a similar pattern of cognitive, behavioral, and
motor deficits.
D. both refer to damage to the brain, the spinal cord, and the peripheral nervous system.

Why might a clinician interview a culturally different client about cultural aspects of his or her life?
A. to develop hypotheses about the intelligence and personality of the interviewee
B. to distinguish psychopathological behavior from that which is more typical of the culture of the
interviewee
C. to differentiate between what constitutes psychopathology in the majority culture from what constitutes
psychopathology in the client's culture
D. to sample new foods

In the administration of the Thematic Apperception Test (TAT):

A. a minimum of ten cards must be presented.


B. all stimulus cards are presented to all subjects.
C. a maximum of twenty cards is presented.
D. the number of cards presented is left to examiner discretion.

In creating a test designed to measure personality constructs, the test developer's first step would BEST
be to

A. determine which items would lead to socially desirable responses.


B. create a large pool of potential items.
C. define the construct or constructs being measured.
D. select a representative sample of test-takers for test tryout.

Citing or endorsing only positive attributes in a self-report measure of personality is a phenomenon
referred to as

A. putting on the Barnum.


B. amplifying.
C. self-deception.
D. socially desirable responding.

Henry A. Murray is the author of a "personology" theory of personality and is perhaps best associated
with:

A. the Rorschach Inkblot Test.


B. the Mooney Problem Checklist.
C. the Draw-A-Person Technique.
D. the Thematic Apperception Test.

Which of the following is the BEST definition of hit rate?

A. the proportion of people the test correctly identifies as possessing a particular trait, behavior,
characteristic, or attribute
B. the proportion of people in the general population who possess the particular trait, behavior,
characteristic, or attribute
C. the proportion of people the test incorrectly identifies as possessing a particular trait, behavior,
characteristic, or attribute
D. the degree of validity of a particular test

The term used to describe the proportion of people in a population who are distinctive due to their
exhibition of a particular trait is

A. success rate.
B. base rate.
C. target rate.
D. cut rate.

The item-validity index is key in determining

A. construct validity.
B. criterion-related validity.
C. content validity.
D. All of these
