
Psychological Testing

Uploaded by

zoyaaak201
Copyright
© © All Rights Reserved

Psychological Test

Psychological testing, also called psychometrics, is the systematic use of tests to quantify
psychophysical behavior, abilities, and problems and to make predictions about psychological
performance.
The word “test” refers to any means (often formally contrived) used to elicit responses to
which human behavior in other contexts can be related. When intended to predict relatively
distant future behavior (e.g., success in school), such a device is called an aptitude test. When
used to evaluate the individual’s present academic or vocational skill, it may be called
an achievement test. In such settings as guidance offices, mental-health clinics, and psychiatric
hospitals, tests of ability and personality may be helpful in the diagnosis and detection of
troublesome behavior. Industry and government alike have been prodigious users of tests for
selecting workers. Research workers often rely on tests to translate theoretical concepts
(e.g., intelligence) into experimentally useful measures.

Five main characteristics of a good psychological test are as follows:

1. Objectivity

2. Reliability

3. Validity

4. Norms

5. Practicability

1. Objectivity:
The test should be free from subjective judgment regarding the ability, skill,
knowledge, trait or potentiality to be measured and evaluated.

2. Reliability:
This refers to the extent to which the obtained results are consistent or reliable.

When the test is administered to the same sample more than once with a
reasonable gap of time, a reliable test will yield the same scores. It means the test is
trustworthy. There are many methods of testing the reliability of a test.
3. Validity:
It refers to the extent to which the test measures what it intends to measure. For example,
when an intelligence test is developed to assess the level of intelligence, it should assess the
intelligence of the person, not other factors.

Validity tells us whether the test fulfils the objective of its development. There are many
methods to assess the validity of a test.

4. Norms:
Norms refer to the average performance of a representative sample on a given test. They
give a picture of the average standard of a particular sample in a particular aspect. Norms are
the standard scores developed by the person who develops the test. Future users of the test can
compare their scores with the norms to know the level of their sample.

5. Practicability:
The test must be practicable in terms of the time required for completion, the length, the number
of items or questions, scoring, etc. The test should not be too lengthy or too difficult to answer
or to score.
ROTTER INCOMPLETE SENTENCE BLANK
- Sentence Completion Test

- Semi-structured projective technique in which the subject is asked to finish a sentence for
which the first word or words are supplied.

Introduction:
RISB was developed by Julian Rotter and Benjamin Willerman in the early 1940s as
a means of screening large groups of soldiers to evaluate adjustment and fitness to return to duty
and to obtain specific information for evaluation and treatment.

The original RISB was published in 1950, and the most recent revisions, including
separate forms for clients in high school, college and adulthood, were published in 1992.

Measuring both adjustment and maladjustment is a chief aim of the RISB, with the goal of
identifying both the presence and the relative absence of psychopathology. Therefore, the RISB
is intended to help guide an initial clinical interview, formulate a diagnosis and arrive at a
treatment plan, rather than provide a comprehensive evaluation of the personality dynamics.

This over-all adjustment score is of particular value for screening purposes with college students
and in experimental studies. The ISB has also been used in a vocational guidance center to select
students requiring broader counseling than was usually given, in experimental studies of the
effect of psychotherapy and in investigations of the relationship of adjustment to a variety of
variables.

TEST: ADMINISTRATION, SCORING, NORMS, VALIDITY, RELIABILITY
- 40 item test composed of sentence stems

- 20-40 minute administration

In the manual for the most recent edition of the RISB (1992), new norms were based on data
collected in three studies conducted between 1977 and 1988.

Compared to other projective tests, sentence completion tests (SCTs) have been described as among
the most valid. Among SCTs, the RISB has the most consistent evidence supporting its use in the
diagnosis and assessment of adjustment. Initial studies by Rotter and colleagues indicated that
the RISB correctly identified 78% of the adjusted respondents and 59% of the maladjusted
respondents among women, and 89% of the adjusted and 52% of the maladjusted respondents among men.
In terms of face validity, the RISB was constructed with low face validity, a test attribute
in keeping with the SCT orientation towards uncovering latent personality characteristics of
which an individual may be unaware or which they may be unwilling to divulge directly.

An overall score of 145 is generally perceived as the cutoff score for identifying significant
adjustment issues. However, as Rotter points out, this cutoff score is not absolute as an index of
psychopathology; rather, it should be used as a guide in the clinical judgement process.

A formal scoring system exists for the RISB, but some assessors choose not to use it, and clinical
judgement typically plays a significant role in scoring. Thus, like that of the TAT, its scientific
standing is questioned by those who insist on tests with established reliability and validity. As
such, the RISB is often used to complement other personality measures and to provide more
personal details about the psychological problems of a particular client.

Advantages:
1. Freedom of Response

2. Some disguise in the purpose of the test is present

3. Group administration is relatively efficient.

4. No special training is ordinarily necessary for administration

5. The method is extremely flexible in that new sentence beginnings can be constructed or
tailor-made for a variety of clinical, applied and experimental purposes.

Disadvantages:
1. Although susceptible to semi-objective scoring, it cannot be machine scored and requires
general skill and knowledge of personality analysis for clinical appraisal and interpretation.

2. There is not as much disguise of purpose as in other projective methods. Consequently, a
sophisticated subject may be able to keep the examiner from knowing what he does not wish to
reveal.

3. Insufficient material is obtained in some cases, particularly from illiterate, disturbed or
uncooperative subjects. Application of the method as a group test also requires writing and
language skills, and it has not yet been adequately evaluated for potential clinical usefulness
with younger children.

Omission Response: Not Scored


“C” or conflict responses are those indicating an unhealthy or maladjusted frame of mind.
These include hostility reactions, pessimism, symptom elicitation, hopelessness and suicidal
wishes, statements of unhappy experiences, and indications of past maladjustment.

Responses range from C1 to C3 according to the severity of the conflict or maladjustment
expressed. The numerical weights for the conflict responses are:

C1 (4) = Typical of the C1 category are responses in which concern is expressed regarding such
things as the world state of affairs, financial problems, specific school difficulties, physical
complaints, identification with minority groups, and so on. In general it might be said that
subsumed under C1 are minor problems which are not deep-seated or incapacitating, and more or
less specific difficulties.

C2 (5) = More serious indications of maladjustment are found in the C2 category. On the whole
the responses refer to broader, more generalized difficulties than are found in C1. Included here
are expressions of inferiority feelings, psychosomatic complaints, concern over possible failure,
generalized school problems, lack of goals, feeling of inadequacy, concern over vocational
choice, and difficulty in heterosexual relationships as well as generalized social difficulty.

C3 (6) = Expressions of severe conflict or indications of maladjustment are rated C3. Among the
difficulties found in this area are suicidal wishes, sexual conflicts, severe family problems, fear
of insanity, strong negative attitudes toward people in general, feelings of confusion, expression
of rather bizarre attitudes, and so forth.

“P” or positive responses are those indicating a healthy or hopeful frame of mind. These are
evidenced by humorous or flippant remarks, optimistic responses, and acceptance reactions.
Responses range from P1 to P3 depending on the degree of good adjustment expressed in the
statement. The numerical weights for the positive responses are:

P1 (2) = In the P1 class, common responses are those which deal with positive attitudes toward
school, hobbies, sports, expressions of interest in people, expressions of warm feeling toward
some individual, and so on.

P2 (1) = Generally found under the heading of P2 are those replies which indicate a generalized
positive feeling toward people, good social adjustment, healthy family life, optimism and humor.

P3 (0) = Clear-cut good-natured humor, real optimism, and warm acceptance are types of
responses which are subsumed under the P3 group. The ISB deviates from the majority of
tests in that it scores humorous responses.
NEUTRAL RESPONSES
“N” or neutral responses are those not falling clearly into either of the above categories. They
are generally on a simple descriptive level. Two general types of responses account for a large
share of those that fall in the neutral category. One group includes those lacking emotional tone
or personal reference. The other group is composed of many responses which are found as often
among maladjusted as among adjusted individuals and which, through clinical judgment, could not
be legitimately placed in either the C or the P group. All the N responses are scored 3. For
example, “Most girls . . . are females” or “When I was a child . . . I spoke as a child”. These
types of responses fall in the neutral category.

RELATIVELY LONG RESPONSES


Additional Point (+1)

RELIABILITY

NORMS
The RISB manual reports adequate internal consistency, stability and interrater agreement.
Because the RISB is designed to sample broad content areas, assessing the internal consistency
of the measure yields only a conservative estimate of its reliability. However, the RISB still
yields moderate reliability values for both the split-half reliability estimate and Cronbach's
alpha. Split-half estimates for different forms of the RISB range from .74 to .84 for males and
from .83 to .86 for females. Cronbach's alpha was .69 for a sample of college men. Thus moderate
internal consistency is evident in spite of the RISB's diverse content.
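
As a rough illustration of the statistic named above, the sketch below computes Cronbach's alpha from the standard formula, alpha = k/(k-1) * (1 - sum of item variances / variance of total scores). The function name `cronbach_alpha` and the toy ratings are hypothetical and are not RISB data; the RISB manual's own computation may differ in detail.

```python
from statistics import variance


def cronbach_alpha(item_scores):
    """Cronbach's alpha for a list of items, each a list of scores (one per respondent)."""
    k = len(item_scores)
    if k < 2:
        raise ValueError("need at least two items")
    # Total score of each respondent across all items.
    totals = [sum(person) for person in zip(*item_scores)]
    # Sum of the sample variances of the individual items.
    item_variance_sum = sum(variance(item) for item in item_scores)
    return (k / (k - 1)) * (1 - item_variance_sum / variance(totals))


# Hypothetical ratings: three items answered by four respondents.
toy_items = [
    [2, 4, 3, 5],
    [1, 4, 3, 4],
    [2, 5, 2, 5],
]
print(round(cronbach_alpha(toy_items), 2))
```

Values closer to 1 indicate that the items vary together; a test that samples broad content, as the RISB does, would be expected to give lower values.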

SCORING
RISB in Comparison with:

SCORING:
In terms of inter-scorer reliability, the original validity study of the RISB found
coefficients of .91 for males and .96 for females. Since that time, such estimates have been
replicated in the literature, and the coefficients of agreement have ranged from as high as .99
to a low of .72.
Interpretation of RISB

The raw score is a simple numerical count of responses, such as the number of correct
responses on an intelligence test.

Conflict responses:
Indicate an unhealthy or maladjusted frame of mind. Conflict responses range over C1, C2 and C3
according to the severity of the conflict.
Conflict responses

Total responses of C1 0
Total responses of C2 7
Total responses of C3 1

Positive responses:
Indicate a hopeful and healthy frame of mind. Positive responses range over P1, P2 and P3
according to the degree of good adjustment expressed.
Positive responses
Total responses of P1 3
Total responses of P2 16
Total responses of P3 5

Neutral responses:
Neither positive nor conflict responses; each N response is scored 3.
Neutral responses
Total responses of N 8
Omission
Means no answer given or incomplete thought

Quantitative Analysis

Category               C1   C2   C3    N   P1   P2   P3   Total
Weight                  4    5    6    3    2    1    0
Number of responses     0    7    1    8    3   16    5      40
Weighted score          0   35    6   24    6   16    0      87

Total Responses (weighted score by response type)
Positive response   Neutral response   Conflict response   Total
22                  24                 41                  87

Predetermined cut-off score   Subject's score
135                           87

Analysis of RISB:
Her responses showed positive conditions in most areas, and she feels comfortable in society. The
client's score is 87, which is below the cut-off of 135 and indicates mental stability. Her
thoughts and feelings about the social world were positive.
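
The arithmetic behind the quantitative analysis above can be sketched as follows. This is only an illustration under stated assumptions: the function names `risb_total` and `at_or_above_cutoff` are hypothetical, omissions are simply left out of the counts, the additional point for relatively long responses is not modelled, and treating a score exactly at the cut-off as reaching it is an assumption. The weights and response counts are the ones given in this report.

```python
# Numerical weights taken from the scoring section of this report.
WEIGHTS = {"C1": 4, "C2": 5, "C3": 6, "N": 3, "P1": 2, "P2": 1, "P3": 0}


def risb_total(counts):
    """Return the weighted total for a dict of {category: number of responses}."""
    unknown = set(counts) - set(WEIGHTS)
    if unknown:
        raise ValueError(f"unknown response categories: {sorted(unknown)}")
    return sum(WEIGHTS[category] * n for category, n in counts.items())


def at_or_above_cutoff(total, cutoff=135):
    """True when the total reaches the cut-off (135 in this report's worked example)."""
    return total >= cutoff


# Response counts from the quantitative analysis (40 scored responses).
example_counts = {"C1": 0, "C2": 7, "C3": 1, "N": 8, "P1": 3, "P2": 16, "P3": 5}

total = risb_total(example_counts)
print(total)                      # 87
print(at_or_above_cutoff(total))  # False
```

The cut-off is a parameter because the report mentions both 135 and 145 as cut-off values, and in either case it is meant as a guide to clinical judgement rather than an absolute index.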

The Beck Depression Inventory


There are many assessment and diagnostic tools that measure intelligence, aptitudes,
achievements, and behaviors, so it was no surprise when the Beck Depression Inventory (BDI)
was created in 1961 by Aaron T. Beck, a pioneer in cognitive therapy, with the sole purpose of
determining the severity and intensity of the symptoms of depression. The Beck Depression
Inventory is a validated measure that has been instrumental in numerous diagnoses of
depression, in part because its most recent revisions more closely resemble the
diagnostic criteria for depression (Gregory, 2007). Over the years many studies have questioned
the credibility of the BDI, but its soundness has been established through documentation of the
internal consistency of the scale, its test-retest reliability, and its extensive validation
against other measures of depression and independent criteria for depression (Gregory, 2007).
Use of the Beck Depression Inventory continues to expand into a variety of clinical
and non-clinical practice sites to identify depressive symptoms that might otherwise have gone
unrecognized and undiagnosed. As with many other assessment tools, its credibility and
usability have come into question, but research studies have shown that the Beck Depression
Inventory succeeds in measuring what it is intended to measure.

Description of the test and its History:


The Beck Depression Inventory is a 21-item self-report scale widely used in both clinical
practice and research studies (Beck et al., 1996). The scale was originally developed in 1961 in
an interviewer-assisted format but has undergone several revisions over 35 years, from the
BDI-IA (1978) to the most recent version, the Beck Depression Inventory-II (1996), which is
completely self-administered. The Beck Depression Inventory-II is a depression rating scale that
can be used with individuals aged 13 years and older. It rates symptoms of depression in terms
of severity on a scale from 0 to 3 for each of the 21 specific items. Patients who endorse
multiple items on the questionnaire (e.g., sadness, pessimism, past failure, loss of pleasure,
guilty feelings, punishment fears, self-dislike, and so forth) typically have higher scores than
others; the maximum score is 63. The sum of the BDI items generally represents the severity of
the depression, and the test is scored differently for the general population than for
individuals with an established clinical diagnosis of depression. For the general population, a
score of 21 or greater is associated with depression. For individuals who have been clinically
diagnosed, scores from 0 to 9 represent minimal depressive symptoms, scores of 10 to 16 indicate
mild depression, scores of 17 to 29 indicate moderate depression, and scores of 30 to 63
indicate severe depression.

Content and use of Beck Depression Inventory:
The self-report questionnaire primarily focuses on the cognitive distortions that underlie
depression (Beck & Steer, 1987). The current version of the inventory was specifically developed
to assess symptoms that correspond to the criteria for diagnosing depressive disorders listed in
the American Psychiatric Association's Diagnostic and Statistical Manual of Mental Disorders,
Fourth Edition, Text Revision (American Psychiatric Association, 2000). To further reflect the
current DSM-IV diagnostic criteria for depression, new items such as agitation, worthlessness,
loss of energy, and concentration difficulty were included. In addition, increases and decreases
in appetite were combined in a single item, as were hypersomnia and hyposomnia in another item.
Items related to changes in body image, hypochondria, and difficulty working were replaced, but
items dealing with thoughts of suicide, interest in sex, and feelings of being punished
remained the same. In terms of content, the items of the revised questionnaire generally cover
the cognitive and affective components of depression. Including these components in the
assessment allows a more definitive conclusion to be reached when the presence or absence of
depression is in question.

Benefits of the Beck Depression Inventory:


The rating period for the BDI was changed from the past week to the past 2 weeks in the revised
BDI-II. The revised version represents a significant milestone because it improves on the
original structure, with revisions to the content, psychometric validity, external validity,
and suitability for use in widespread clinical practice sites (Beck et al., 1996). One of the
primary reasons for the increased popularity of the BDI-II is that most people are able to
complete the 21 items of the self-report within 5-10 minutes. For this to occur, however, the
administrator must preserve the integrity of the test results, for example by ensuring that the
testing environment has sufficient illumination for reading and is quiet enough for the test
taker to concentrate.

Reliability, Validity, and Factor Analysis:


The reliability of the Beck Depression Inventory has been established through its use in clinical
studies, and it has been shown to be exemplary in both depressed and nondepressed samples of
older individuals, as in other age groups (Gallagher et al., 1982). Wiebe et al. (2005) compared
the psychometric properties of the English and Spanish language versions of the Beck Depression
Inventory in substantial samples of undergraduate students. The results provided evidence of
strong internal consistency of the BDI-II across both languages, and the test-retest reliability
of the BDI-II was acceptable for both languages (Wiebe, 2005). The reliability of the Spanish
translation slightly exceeded that of the original English version, with the Spanish translation
producing a coefficient of 0.91 and the English version a coefficient of 0.89, leaving little or
no variability that could be attributed to language (Wiebe, 2005). An analysis of factorial
validity, using confirmatory factor analysis (CFA) to find the best fit for the two-factor
model, demonstrated that the English language factor structure showed a good fit with the data
from the Spanish instrument. The results of the analysis demonstrated that the translation is
appropriate for use in both medical and student samples. With the ever-increasing need to
provide mental health services that are sensitive to any given culture, continued research is
required to document the validity and reliability of commonly used clinical and research
instruments.

Psychometric Qualities of Beck Depression Inventory :


Various studies indicate that the psychometric qualities of the BDI are quite sound (Beck et
al., 1996). The manual is well written and provides the reader with information regarding norms,
factor analysis, and nonparametric item-option characteristic curves for each item. The BDI
retains a high level of standardization by maintaining uniform procedures for test
administrators and by advising examiners about scenarios that may distort test results: because
self-report inventories are subject to response bias, this should be factored into the overall
interpretation of the test results (Dozois et al., 1998). Additionally, the wording of the
directions allows for a consistent administration process. For example, the statements for each
of the 21 items follow the same format, such as (0 = I am upbeat about the future),
(1 = I feel slightly discouraged about the future), (2 = I feel the future has little to offer
for me), (3 = I feel that the future is utterly hopeless). This format is representative of all
21 items, and the total raw score is the sum of the endorsed ratings of the symptoms of
depression.

Critique of the Beck Depression Inventory:


A critique of the strengths and weaknesses of the Beck Depression Inventory highlights aspects
of its psychometric qualities. One advantage is its uniform standardization procedure, which
consists of easy and formalized directions for test administration, together with guidelines in
the instructional manual that the reader can readily comprehend. On the other hand, there are
disadvantages that the manual does not adequately address, such as the potential for clients to
alter their presentation based on an incentive or personal agenda associated with being
diagnosed with depression. The Beck Depression Inventory manual does not completely address how
such an issue can be handled so that it does not interfere with the test results, but rather
provides an ambiguous answer for a possible resolution. The Beck Depression Inventory reports
“correlations of 0.93 and 0.84” between the BDI-II and its predecessors in two samples of 191
and 84 outpatients, and correlations of 0.68 and 0.71, respectively, between the BDI-II and two
other depression instruments (the Revised Hamilton Psychiatric Rating Scale for Depression and
the Beck Hopelessness Scale) (Sprinkle et al., 2002, p. 381).

Use of Beck Depression Inventory in clinical practice:


A clinical psychologist can use the Beck Depression Inventory during a patient encounter to
gauge whether or not a patient endorses feelings of depression. If, after multiple encounters,
the patient exhibits classical symptoms of depression, the inventory could be used to confirm
or rule out this suspicion through self-report. The Beck Depression Inventory can serve as the
first tier in the assessment of depression, with the DSM-IV-TR following to provide an official
diagnosis (American Psychiatric Association, 2000). The benefits of using the Beck Depression
Inventory come from its ease of administration and its understandable questions, which allow
the user to work through the 21 items of the questionnaire. The simplicity of the questionnaire
allows for its use with a wide variety of patients, from adolescents to adults, which can lead
to increased identification of previously undiagnosed or unrecognized depressive symptoms.
Along with its ability to identify patients who might be exhibiting depressive symptoms, the
inventory can produce problems, with more patients stating that they are depressed in order to
benefit from receiving the diagnosis or because of a personal agenda that comes with having the
label of depression.

Disadvantage (Challenges) of the Beck Depression Inventory:

For a clinical psychologist, the challenge comes in determining which patients are exhibiting
legitimate signs and symptoms of depression and which patients are pretending in the hope of
reaping some type of benefit. Patients can either hide their despair or exaggerate their
depression on the Beck Depression Inventory, but for patients who are motivated to report their
inner emotions accurately, the inventory represents one of the best instruments for identifying
the presence and/or severity of depressive symptoms (Stehouwer, 1987). The results of the Beck
Depression Inventory can be used not only to assess and monitor changes in depressive symptoms
among people in a mental health care setting but also in other practice settings, whether
inpatient or outpatient (Beck et al., 1988). The widespread use of the inventory can potentially
support a confirmed diagnosis of depression based on aspects of the DSM-IV-TR, or assist in
producing a list of other mood disorders that might be the culprit if it is not depression.

Summarization of the Beck Depression Inventory:

The current status of the Beck Depression Inventory is that of a test that has reached a
pinnacle in terms of merit and credibility. Over many years, it has undergone two significant
revisions, with the most recent (the 2nd edition) displaying the greatest improvements ever made
to the instrument. To better represent the diagnostic criteria for depression established by the
American Psychiatric Association, it has incorporated key components of the DSM-IV-TR criteria
for depression into the 21 items of the inventory. The revised version of the Beck Inventory
brings the scale into better accord with current psychiatric diagnostic criteria (Ward, 2006).
Emphasis has been placed on the cognitive and affective components of depression, such as
pessimism, guilt, crying, indecision, and self-accusations, with eight items focusing on somatic
and performance variables such as sleep problems, body image, and work difficulties (Gregory,
2007). The Beck Depression Inventory has established itself as a reliable instrument for
providing a comprehensive assessment of depressive symptoms across genders and a wide age
range. The transition from an interview-based format to self-report has brought both advantages
and disadvantages for the assessment process. For example, an advantage is that patients are
more likely to express their inner feelings if they are in control of the process, but a
disadvantage is that non-motivated patients can manipulate the process and state what they feel
will lead to a diagnosis of depression and some type of personal gain.
Interpretation of BDI

Minimal Depression   Mild Depression   Moderate Depression   Severe Depression
0-9                  10-18             19-29                 30-63

Analysis of BDI:
The client's score on the BDI is 5, which falls in the “Minimal Depression” range.
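
The interpretation step above can be sketched in a few lines. This is a minimal illustration, assuming the score bands from the interpretation table in this report (0-9, 10-18, 19-29, 30-63) and the 21-item, 0-to-3 rating format described earlier; the names `bdi_total` and `bdi_band` are hypothetical, and the item ratings in the example are invented so that they sum to the client's reported score of 5.

```python
# Score bands from the "Interpretation of BDI" table in this report.
BANDS = [
    (0, 9, "Minimal Depression"),
    (10, 18, "Mild Depression"),
    (19, 29, "Moderate Depression"),
    (30, 63, "Severe Depression"),
]


def bdi_total(item_ratings):
    """Sum the 21 item ratings, each of which must be an integer from 0 to 3."""
    if len(item_ratings) != 21:
        raise ValueError("the BDI has 21 items")
    if any(rating not in (0, 1, 2, 3) for rating in item_ratings):
        raise ValueError("each item is rated from 0 to 3")
    return sum(item_ratings)


def bdi_band(total):
    """Map a total score (0 to 63) to its severity band."""
    for low, high, label in BANDS:
        if low <= total <= high:
            return label
    raise ValueError("total must be between 0 and 63")


# Invented ratings that sum to 5, the client's score in this report.
ratings = [1] * 5 + [0] * 16
print(bdi_band(bdi_total(ratings)))  # Minimal Depression
```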
Thematic Apperception Test

The Thematic Apperception Test (TAT) is a projective psychological test developed during the
1930s by Henry A. Murray and Christiana D. Morgan at Harvard University. Proponents of the
technique assert that subjects' responses, in the narratives they make up about ambiguous
pictures of people, reveal their underlying motives, concerns, and the way they see the social
world. Historically, the test has been among the most widely researched, taught, and used of such
techniques.

History
The TAT was developed by American psychologist Murray and lay psychoanalyst
Morgan at the Harvard Clinic at Harvard University during the 1930s. Anecdotally, the idea for
the TAT emerged from a question asked by one of Murray's undergraduate students, Cecilia
Roberts. She reported that when her son was ill, he spent the day making up stories about images
in magazines and she asked Murray if pictures could be employed in a clinical setting to explore
the underlying dynamics of personality.

Murray wanted to use a measure that would reveal information about the whole person but found
the contemporary tests of his time lacking in this regard. Therefore, he created the TAT. The
rationale behind the technique is that people tend to interpret ambiguous situations in accordance
with their own past experiences and current motivations, which may be conscious or
unconscious. Murray reasoned that by asking people to tell a story about a picture, their defenses
to the examiner would be lowered as they would not realize the sensitive personal information
they were divulging by creating the story.

Murray and Morgan spent the 1930s selecting pictures from illustrative magazines and
developing the test. After 3 versions of the test (Series A, Series B, and Series C), Morgan and
Murray decided on the final set of pictures, Series D, which remains in use today. Although she
was given first authorship on the first published paper about the TAT in 1935, Morgan did not
receive authorship credit on the final published instrument. Reportedly, her role in the creation of
the TAT was primarily in the selection and editing of the images, but due to the primacy of the
name on the original publication the majority of written inquiries about the TAT were addressed
to her; since most of these letters included questions that she could not answer, she requested that
her name be removed from future authorship.

During the time Murray was developing the TAT he was also involved in Herman Melville
studies. The therapeutic technique originally came to him from the "Doubloon chapter" in Moby
Dick. In this chapter, multiple characters inspect the same image (a doubloon), but each
character has vastly different interpretations of the imagery—Ahab sees symbols of himself in
the coin, while the religiously devout Starbuck sees the Christian Trinity. Other characters
provide interpretations of the image that give more insight into the characters themselves based
on their interpretations of the imagery. Crew members, including Ahab, project their self
perceptions onto the coin which was nailed to the mast. Murray, a lifelong Melvillist, often
maintained that all of Melville's oeuvre was for him a TAT.

After World War II, the TAT was adopted more broadly by psychoanalysts and clinicians to
evaluate emotionally disturbed patients. Later, in the 1970s, the Human Potential
Movement encouraged psychologists to use the TAT to help their clients understand themselves
better and stimulate personal growth.

Procedure
The TAT is popularly known as the picture interpretation technique because it uses a
series of provocative yet ambiguous pictures about which the subject is asked to tell a story. The
TAT manual provides the administration instructions used by Murray, although these procedures
are commonly altered. The subject is asked to tell as dramatic a story as they can for each picture
presented, including the following:

- what has led up to the event shown
- what is happening at the moment
- what the characters are feeling and thinking
- what the outcome of the story was

If these elements are omitted, particularly for children or individuals of low cognitive abilities,
the evaluator may ask the subject about them directly. Otherwise, the examiner is to avoid
interjecting and should not answer questions about the content of the pictures. The examiner
records stories verbatim for later interpretation.
The complete version of the test contains 32 picture cards. Some of the cards show male figures,
some female, some both male and female figures, some of ambiguous gender, some adults, some
children, and some show no human figures at all. One card is completely blank and is used to
elicit both a scene and a story about the given scene from the storyteller. Although the cards were
originally designed to be matched to the subject in terms of age and gender, any card may be
used with any subject. Murray hypothesized that stories would yield better information about a
client if the majority of cards administered featured a character similar in age and gender to the
client.
Although Murray recommended using 20 cards, most practitioners choose a set of between 8 and
12 selected cards, either using cards that they feel are generally useful, or that they believe will
encourage the subject's expression of emotional conflicts relevant to their specific history and
situation. However, the examiner should aim to select a variety of cards in order to get a more
global perspective of the storyteller and to avoid confirmation bias (i.e., finding only what you
are looking for).
Many of the TAT pictures depict recurring themes such as success and failure, competition
and jealousy, feelings about relationships, aggression, and sexuality.

Psychometric characteristics
Thematic Apperception Tests are meant to evoke an involuntary
display of one’s subconscious. There is no standardization for evaluating one’s TAT responses;
each evaluation is completely subjective because each response is
unique. Validity and reliability are, consequently, the largest question marks of the TAT. There
are trends and patterns, which help identify psychological traits, but there are no distinct
responses to indicate different conditions a patient may or may not have. Medical professionals
most commonly use it in the early stages of patient treatment. The TAT helps professionals
identify a broad range of issues that their patients may suffer from. Even when individual scoring
procedures are examined, the absence of standardization or norms makes it difficult to compare
the results of validity and reliability research across studies. Specifically, even studies using the
same scoring system often use different cards, or a different number of cards. Standardization is
also absent amongst clinicians, who often alter the instructions and procedures. Murstein
explained that different cards may be more or less useful for specific clinical questions and
purposes, making the use of one set of cards for all clients impractical.

Reliability
Internal consistency, a reliability estimate focusing on how highly test items correlate with
each other, is often quite low for TAT scoring systems. Some authors have argued that internal
consistency measures do not apply to the TAT. In contrast to traditional test items, which should
all measure the same construct and be correlated with each other, each TAT card represents a
different situation and should yield highly different response themes. Lilienfeld and
colleagues countered this point by questioning the practice of compiling TAT responses to form
scores. Both inter-rater reliability (the degree to which different raters score TAT responses the
same) and test–retest reliability (the degree to which individuals receive the same scores over
time) are highly variable across scoring techniques. However, Murray asserted that TAT answers
are highly related to internal states such that high test–retest reliability should not be
expected. Gruber and Kreuzpointner (2013) developed a new method for calculating internal
consistency using categories instead of pictures. As they demonstrated in a mathematical proof,
their method provides a better fit for the underlying construction principles of the TAT, and it also
achieved adequate Cronbach's alpha scores of up to .84.
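
As a rough illustration of what an internal-consistency estimate involves, the sketch below computes Cronbach's alpha from a table of scores, treating each TAT card as one item. The function name cronbach_alpha and the toy ratings are assumptions made for this example; they do not come from any published TAT scoring system, and this is the conventional per-item formula rather than the category-based method of Gruber and Kreuzpointner.

```python
from statistics import pvariance


def cronbach_alpha(scores):
    """Cronbach's alpha for a list of rows (one row per examinee, one column per item)."""
    k = len(scores[0])  # number of items (here: cards)
    if k < 2:
        raise ValueError("alpha needs at least two items")
    # Variance of each item (column) across examinees.
    item_vars = [pvariance([row[j] for row in scores]) for j in range(k)]
    # Variance of each examinee's total score.
    total_var = pvariance([sum(row) for row in scores])
    if total_var == 0:
        raise ValueError("total scores have zero variance")
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)


# Hypothetical ratings: four examinees scored on three cards.
ratings = [
    [2, 3, 3],
    [4, 4, 5],
    [1, 2, 1],
    [3, 3, 4],
]
print(round(cronbach_alpha(ratings), 2))
```

Alpha approaches 1 when scores on the cards rise and fall together across examinees and falls toward 0 when they vary independently, which is why cards designed to elicit different themes tend to produce low values.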

Validity
The validity of the TAT, or the degree to which it measures what it is supposed to
measure, is low. Jenkins has stated that “the phrase ‘validity of the TAT’ is meaningless,
because validity is specific not to the pictures, but to the set of scores derived from the
population, purpose, and circumstances involved in any given data collection.” That is, the
validity of the test would be ascertained by seeing how clinicians' decisions were assisted by
the TAT. Evidence on this front suggests it is a weak guide at best. For example, one study
indicated that clinicians classified individuals as clinical or non-clinical at close to chance levels
(57%, where 50% would be guessing) based on TAT data alone. The same study found that
classifications were 88% correct based on MMPI data. Using the TAT in addition to the MMPI
reduced accuracy to 80%.

Alternate considerations
Despite the conflicting information about the psychometric
characteristics of the TAT, proponents have argued that the TAT should not be judged using
traditional standards of reliability and validity. According to Holt, “the TAT is a complex
method of assessing people, which does not lend itself to the standard rules of thumb about test
standards [. . .]” (p. 101). For example, it has been argued that the purpose of the TAT is to
reveal a wide range of personality characteristics and complex, nuanced patterns, as opposed to
traditional psychological tests that are designed to measure unitary and narrow
constructs. Hibbard and colleagues examined several considerations about traditional views of
reliability and validity as they apply to the TAT. First, they noted that traditional views of
reliability may limit the validity of a measure (such as occurs with multi-faceted concepts in
which characteristics are not necessarily related to each other, but are meaningful in
combination). Further, Cronbach's alpha, a commonly used measure of internal consistency, is
dependent on the number of items in a scale. For the TAT, most scales use only a small number of
cards (with each card treated like an item), so alphas would not be expected to be very high.
Many clinicians also discount the importance of psychometrics, believing that generalizability of
the findings to a given client’s situation is more important than generalizing findings to the
population.
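
The dependence of alpha on the number of items can be made concrete with the standardized form of the coefficient, which expresses alpha in terms of the number of items k and the average inter-item correlation. The sketch below is only an illustration: the function name standardized_alpha and the average correlation of 0.2 are assumed values chosen for the example, not figures reported for any TAT scale.

```python
def standardized_alpha(k, mean_r):
    """Standardized Cronbach's alpha for k items with average inter-item correlation mean_r."""
    if k < 2:
        raise ValueError("alpha needs at least two items")
    return k * mean_r / (1 + (k - 1) * mean_r)


# Same assumed average correlation, increasing number of cards.
for k in (5, 10, 20):
    print(k, round(standardized_alpha(k, 0.2), 2))
```

With the same average correlation, alpha rises from about .56 with five cards to about .83 with twenty, which is why scales built on only a handful of cards tend to show modest alphas.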

Scoring systems
When he created the TAT, Murray also developed a scoring system based on
his need-press theory of personality. Murray's system involved coding every sentence given for
the presence of 28 needs and 20 presses (environmental influences), which were then scored
from 1 to 5, based on intensity, frequency, duration, and importance to the plot. However,
this scoring system is time-consuming to implement and was not widely used. Rather, examiners
have traditionally relied on their clinical intuition to come to conclusions about storytellers.
Although not widely used in the clinical setting, several formal scoring systems have been
developed for analyzing TAT stories systematically and consistently. Three common methods
that are currently used in research are the following:
Defense Mechanisms Manual (DMM)
This assesses three defense mechanisms: denial (least
mature), projection (intermediate), and identification (most mature). The storyteller's thoughts
and feelings are assumed to be projected into the stories they tell.

Social Cognition and Object Relations (SCOR) scale
This assesses four different dimensions of object
relations: Complexity of Representations of People, Affect-Tone of Relationship Paradigms,
Capacity for Emotional Investment in Relationships and Moral Standards, and Understanding of
Social Causality.

Personal Problem-Solving System—Revised (PPSS-R)
This assesses how people identify, think about and
resolve problems through the scoring of thirteen different criteria. This scoring system is useful
because theoretically, good problem-solving ability is an indicator of an individual’s mental
health. Although the TAT is a projective personality technique that is based primarily on the
psychoanalytic perspective, the PPSS-R scoring system is designed for clinicians and researchers
working from a cognitive behavioral framework. The PPSS-R scoring system has been studied in
a wide range of populations, including college students, community residents, jail inmates,
university clinic clients, community mental health center clients, and psychiatric day treatment
clients. Thus, the PPSS-R scoring system allows clinicians and researchers to assess
problem-solving ability and social functioning in many types of people, without being hindered by social
desirability effects.
As with other scoring systems, TAT cards scored with the PPSS-R are typically administered
individually and examinees' responses are recorded verbatim. Unlike other scoring systems, the
PPSS-R only uses six of the 31 TAT cards: 1, 2, 4, 7BM, 10, and 13MF. The PPSS-R provides
information about four different areas related to problem-solving ability: Story Design, Story
Orientation, Story Solutions, and Story Resolution. These four areas are assessed by the 13
scoring criteria, 12 of which are rated on a 5-point scale that ranges from -1 to 3.
Each of these scoring categories attempts to measure the following information:

• Story Design measures the examinee's ability to identify and formulate a problem
situation.
• Story Orientation assesses the examinee's level of personal control, emotional distress,
confidence and motivation.
• Story Solutions assesses how impulsive the examinee is. In addition to evaluating the
types of problem solutions that are provided, the number of problem solutions that
examinees provide for each of the TAT cards is summed.
• Story Resolution provides information on the examinee's ability to formulate problem
solutions that maximize both short and long-term goals.
Examiners are encouraged to explore information obtained from the TAT stories as hypotheses
for testing rather than concrete facts.

General Interpretation
Interpretation of the responses will vary depending on the examiner and
the type of scoring used. Standard scoring systems are used more often in
research settings than in clinical settings. Examiners can select a particular scoring system if
their goal is to evaluate a specific variable such as motivation, defense mechanisms, achievement,
or problem-solving skills. If a clinician chooses not to use a scoring system, there are some
general guidelines that can be followed. For example, the stories created by individuals in
response to the TAT cards are a combination of three things: the card stimulus, the testing
environment, and the personality of the examinee. For each card, the individual must
subjectively interpret the picture, which involves drawing on their own experiences and
feelings to create a story. Therefore, it is beneficial to look at the common themes in the stories’
content and structure to help draw conclusions.
When interpreting the responses, the clinician should keep several cautions in mind so that
the information is as accurate as possible. First, the examiner should always be
conservative when interpreting responses, erring on the side of caution
instead of drawing bold conclusions. The examiner should also consider all the data when using
the TAT in a testing or evaluative setting. No single response should be given more importance
than the others. Additionally, the examiner should take the individual’s developmental
status and cultural background into consideration when examining responses.

Criticisms
Like other projective techniques, the TAT has been criticized on the basis of poor
psychometric properties (see above). Criticisms include that the TAT is unscientific because it
cannot be proved to be valid (that it actually measures what it claims to measure), or reliable
(that it gives consistent results over time). As stories about the cards are a reflection of both the
conscious and unconscious motives of the storyteller, it is difficult to disprove the conclusions of
the examiner and to find appropriate behavioral measures that would represent the personality
traits under examination. Characteristics of the TAT that make conclusions based on the stories
yielded from TAT cards hard to disprove have been termed "immunizing tactics." These
characteristics include the Walter Mitty effect (i.e., the assertion that individuals will exhibit high
levels of a given trait in TAT stories that do not match their overt behavior because TAT
responses may represent how a person wishes they were, not how they truly are) and the
inhibition effect (i.e., the assertion that individuals will not exhibit high levels of a trait in TAT
responses because they are repressing that trait). In addition, as the present needs of the
storyteller change over time, it is not expected that later stories will produce the same results.
The lack of standardization of the cards given and scoring systems applied is problematic
because it makes comparing research on the TAT very difficult. With a dearth of sound evidence
and normative samples, it is tough to determine how much useful information can be gathered in
this manner.
Some critics of the TAT cards have observed that the characters and environments are dated,
even 'old-fashioned', creating a 'cultural or psycho-social distance' between the patients and the
stimuli that makes identifying with them less likely. In some situations it is even hard to
identify with people of the opposite gender. Also, in research comparing subjects' responses to
photographs with their responses to the TAT, researchers found that the TAT cards evoked more
'deviant' stories (i.e., more negative) than photographs, leading them to conclude that the
difference was due to differences in the characteristics of the images used as stimuli.
In a 2005 dissertation, Matthew Narron, Psy.D., attempted to address these issues by reproducing
a Leopold Bellak 10-card set photographically and performing an outcome study. The study
concluded that the old TAT elicited answers that included many more specific time references
than the new TAT.

Contemporary applications
Despite criticisms, the TAT continues to be used as a tool for research into
areas of psychology such as dreams, fantasies, mate selection and what motivates people to
choose their occupation. Sometimes it is used in a psychiatric or psychological context to
assess personality disorders and thought disorders, in forensic examinations to evaluate crime
suspects, or to screen candidates for high-stress occupations. It is also commonly used in routine
psychological evaluations, typically without a formal scoring system, as a way to explore
emotional conflicts and object relations.
The TAT is widely used in France and Argentina within a psychodynamic approach.
David McClelland and Ruth Jacobs conducted a 12-year longitudinal study of leadership using
the TAT and found no gender differences in motivational predictors of attained management level.
The content analysis, however, "revealed 2 distinct styles of power-related themes that
distinguished the successful men from the successful women. The successful male managers
were more likely to use reactive power [that is, aggressive] themes while the successful female
managers were more likely to use resourceful [that is, nurturing] power themes. Differences
between the sexes in the power themes were less pronounced among the managers who had
remained in lower levels of management."

Popular culture
Due to the test's earlier popularity within psychology, the TAT has appeared in
a wide variety of media. For example, the Thomas Harris novel Red Dragon (1981) includes a
scene where the imprisoned psychiatrist and serial killer Dr. Hannibal Lecter mocks a previous
attempt to administer the test to him, while Michael Crichton included the TAT in the battery of
tests given to the disturbed main character Harry Benson in his novel The Terminal Man (1972).
The test is also given to the main characters in two widely differing tales about the human
mind: A Clockwork Orange (1962) and Daniel Keyes's Flowers for Algernon (1958–1966).
Italian poet Edoardo Sanguineti wrote a collection of poetry called T.A.T (1966–1968) that refers
to the test.

