
Zlib - Pub Personality Tests and Assessments

This document provides a summary of Philip E. Vernon's 1953 book "Personality Tests and Assessments". The book was one of the first comprehensive accounts of personality assessment methods by a British author. It surveys different techniques for personality testing including interviews, tests based on physical or psychological measures, tests of behavior, ratings and rating scales, questionnaires, and projective techniques. For each method, the evidence for or against its use is examined and references to relevant literature are provided. The book aims to provide an overview of contributions from British, American, and other psychologists at the time in the developing field of personality assessment.


PSYCHOLOGY REVIVALS

Personality Tests
and Assessments

Philip E. Vernon
Originally published in 1953 this book provided the first
comprehensive account of methods of personality assessment by a
British author. It starts with a short survey of personality theory,
pointing out the difficulties in any method of testing or
assessment. Next it describes the weaknesses of the common interview
method. (Throughout the emphasis is on methods which are
usable in educational or vocational guidance and selection, not on
methods which are mainly of scientific interest.) Thereafter it takes
up each main type of technique - tests based on physique or
psychological measures, on expressive movement such as gestures and
handwriting, tests of behaviour (including War Office Selection
Board ‘house party’ methods), ratings and rating scales,
questionnaires, and so-called projective techniques. The evidence for or
against each test or method is surveyed and numerous references
provided for relevant literature. Illustrative excerpts are given of
many of the more promising tests, and some pictorial illustrations.
British work in this field at the time is covered completely, and an
attempt is made to provide a fair summary of the main
contributions of American and other psychologists of the day.
Personality Tests
and Assessments

Philip E. Vernon

Routledge
Taylor & Francis Group
LONDON AND NEW YORK
First published in 1953
by Methuen & Co Ltd

This edition first published in 2014 by Routledge
27 Church Road, Hove, BN3 2FA

Simultaneously published in the USA and Canada
by Routledge
711 Third Avenue, New York, NY 10017

Routledge is an imprint of the Taylor & Francis Group, an informa business

All rights reserved. No part of this book may be reprinted or reproduced
or utilised in any form or by any electronic, mechanical, or other means,
now known or hereafter invented, including photocopying and recording,
or in any information storage or retrieval system, without permission in
writing from the publishers.

Publisher’s Note
The publisher has gone to great lengths to ensure the quality of this
reprint but points out that some imperfections in the original copies may
be apparent.

Disclaimer
The publisher has made every effort to trace copyright holders and
welcomes correspondence from those they have been unable to contact.

A Library of Congress record exists under LC Control no.: 53013238

ISBN: 978-0-415-71665-9 (hbk)
ISBN: 978-1-315-87953-6 (ebk)
PERSONALITY TESTS
AND
ASSESSMENTS

BY

PHILIP E. VERNON
PROFESSOR OF EDUCATIONAL PSYCHOLOGY IN THE
INSTITUTE OF EDUCATION, UNIVERSITY OF LONDON

METHUEN & CO. LTD. LONDON


11 New Fetter Lane, E.C.4
First published in 1953
Reprinted 1957, 1962 and 1965

1-4

CATALOGUE NO. 02/5438/36

REPRINTED BY LITHOGRAPHY IN GREAT BRITAIN BY
JARROLD AND SONS LTD. NORWICH

TO
GORDON ALLPORT

FOREWORD
MOST of my own work in the field of personality assessment
was done between 1928 and 1935, at which time such
rapid developments were taking place that it seemed hardly
possible to encompass them in a book. However, I attempted
to provide a systematic survey of verbal tests, questionnaires
and ratings in a monograph, which was published as a Report
of the Industrial Health Research Board in 1938. This has
long been out of print, and Chapters VII-IX of the present
volume cover much the same ground. I am grateful to the
Medical Research Council and to the Controller of Her
Majesty’s Stationery Office for permission to reproduce some
of the material contained in that Report.

Since the 1930s, personality testing has become more
stabilized. The exigencies of applied psychology in war-time
showed that certain methods could be put to immediate use,
under carefully controlled conditions, whereas many bright
ideas were quite impracticable. It showed also that so-called
‘clinical’ methods like the interview and projection techniques,
in spite of their apparent advantages in yielding insight into
the personality as an organized whole, are very subjective and
untrustworthy tools for vocational purposes. Thus the time
is now ripe for an overall appraisal of the various approaches to
personality assessment, for noting the more solid achievements
and the most promising lines for further development, and for
dismissing the unsuccessful.

The 1940s have also seen some decline of interest in the
assessment of individual personalities, and a greater emphasis
on the psychology of people’s behaviour in social groups. But
however true it may be that the individual person behaves
differently according to the structure of the group of which
he is a member, the problem of assessment still remains. The
results of most investigations in general and social psychology
are still affected by personality differences among the people
being studied, and these need to be measured. The employer
or teacher, and the educational or vocational psychologist,
still wish to find out the character, social and emotional
qualities, the attitudes and interests, of prospective pupils,
students, and employees. While this book tries to cover all such
types of assessment, it does not deal specifically with the
diagnosis of abnormal personalities, nor discuss clinical methods
in detail. Some use is made of the evidence from tests of neurotic
and psychotic patients, but chiefly in order to prove the worth
of the tests among normal adults and children.

Though there are numerous and valuable reviews of parts of
the field of personality assessment, and these are listed in the
short bibliography at the end, no book dealing with the whole
of it appears to have been published since Symonds’s Diagnosing
Personality and Conduct in 1931, and none at all by a British
author. (Special mention should be made of R. B. Cattell’s
Description and Measurement of Personality and H. J. Eysenck’s
Dimensions of Personality; but both of these are primarily
outlines of the writers’ own contributions.) I have tried to
cover the majority of British publications, but the American
literature is by now so vast that I have had to be highly
selective. My judgments as to what to leave out are doubtless
open to question, but I believe that sufficient references are
supplied in the footnotes or the bibliography to enable the
research worker to follow up any topic that I appear to have
slighted. The book is written mainly for psychology students,
but it avoids technicalities as far as possible and will, it is
hoped, be useful to the intelligent layman who wishes to know
what psychology can contribute to the important problem of
personality assessment. Chapter I deals with fundamental
theoretical and methodological matters; it may, if preferred,
be read last rather than first. The final chapter sums up the
practical implications of the survey.

My interest in personality psychology was first aroused by
the writings of Gordon W. Allport, and I owe more than I can
say to the stimulus of working with him during 1930-1931 and
1937. Thus I have ventured to dedicate this to him, although
I have diverged from his views in many respects. I am
particularly grateful also to Mark A. May and Henry A.
Murray for their inspiration, and for the help that they gave
with my studies while I was in America.

Acknowledgements are also due to the following for permission
to quote illustrative items from published tests:
Dr. E. A. Doll, The Training School, Vineland, New Jersey:
The Vineland Social Maturity Scale. Dr. D. A. Laird, Colgate
University, Hamilton, N.Y.: Personal Inventories B2 and C8.
Professor T. F. Lentz, Washington University, St. Louis, Mo.:
C-R Opinionaire. Dr. E. J. Shoben and Genetic Psychology
Monographs: Scale of Parental Attitudes to Child Adjustment.
Dr. Lydia Jackson, London: Test of Family Attitudes. F. H. and
G. W. Allport and Houghton Mifflin Co., Boston, Mass.:
Allport’s A-S Reaction Study and Allport-Vernon Study of
Values. Stanford University Press: Strong’s Vocational Interest
Blank, Bernreuter’s Personality Inventory, and Willoughby’s
Emotional Maturity Scale. Bureau of Publications, Teachers
College, Columbia University, New York: G. B. Watson’s Test of
Public Opinion and Maller’s Character Sketches. C. H. Stoelting
Co., Chicago, Ill.: Woodworth’s Personal Data Sheet and
Pressey’s X-O Test.
P.E.V.

CONTENTS

I. Introduction 1
II. The Interview : Its Reliability and Validity 20
III. Physical Signs of Personality 32
IV. Expressive Movements 40
V. Simple Behaviour and Cognitive Tests 68
VI. Miniature and Real Life Situation Tests 88
VII. Ratings and Judgments of Personality 101
VIII. Self-Ratings and Personality Questionnaires 122
IX. Measurement of Attitudes and Interests 144
X. Projection Techniques 170
XI. Conclusions and Future Developments 199
Short Bibliography of Suggested Reading 207
Index of Authors 209
Index of Subjects 215

I

Introduction
IN bringing up children wisely, and guiding them into suitable
educational careers and occupations, we have to take account
of their personality qualities. The doctor will tell us
their physical capacities and defects; school examinations and
psychological tests will give us at least an approximate
indication of their educational abilities, intelligence and
aptitudes along special lines. But many a child with high
intellectual qualifications at 11 years does not fulfil his promise
in a secondary grammar school through lack of perseverance,
weak academic interests, or emotional instability. Other
children who are dull according to tests, or in their school work,
develop into worthy members of society owing to their sound
personalities. In Personnel Selection in the British Forces¹
Dr. Parry and the writer have described the success of
psychological methods of allocating recruits to jobs in the Services
during the Second World War. But there were many individuals
who did very much better, or less well, in some employment
than had been predicted, largely because of the difficulties of
making accurate personality assessments. For example, a little
progress was made (but far too little) in diagnosing the men
with poor morale or neurotic tendencies who were a liability
to the army, or the potential leaders who would make good
officers.

¹ Cf. Bibliography. N.B. Other references not listed in footnotes will
be found in the Bibliography at the end of the book.

Cannot the psychologist, then, devise some tests of
personality, analogous to tests of intelligence and other abilities,
which could be applied by the teacher, the vocational or
personnel officer, or others interested in the future of a pupil
or employee? Such tests would be of the utmost value too to
Child Guidance Clinic workers, and to those concerned with
abnormal personalities such as delinquents and criminals,
neurotics or the insane. Much unhappiness and failure, not
only in school or employment, but also in marriage, might be
obviated by scientific personality testing. But though an
enormous amount of research and experiment on tests has, in
fact, been carried out by psychologists during the past thirty
years or so, the answer to our question is indubitably negative.
We shall see in this book that many personality qualities can
be measured or diagnosed fairly effectively, but that the methods
are far too elaborate and time-consuming, or far too dependent
on the skill and experience of the psychologist, to be generally
applicable for any practical purpose, or to be used by anyone
not specially trained. True, it is possible to suggest some
improvements on the unreliable methods that the average
layman habitually employs. Moreover, the outlook is more
hopeful in certain fields, such as the measurement of interests
and attitudes. But in the writer’s opinion it is safer to be
pessimistic regarding the future of personality testing in general,
and one of the objects of this chapter is to explain why.

DEFINITIONS OF PERSONALITY, CHARACTER AND TEMPERAMENT

Personality has been variously defined, and its nature and
origins variously explained, as is shown by G. W. Allport in
his Personality: A Psychological Approach. Here we mean by
it, simply, what sort of a person is so-and-so, what is he like?
Or as R. B. Cattell expresses it: personality is that which
enables us to predict a person’s behaviour in a given situation.
While a man’s intelligence, his bodily strength and skills are
certainly part of his personality, yet the term refers chiefly to
his emotional and social qualities, together with his drives,
sentiments and interests. (Note, however, that this is much
broader than the colloquial usage, whereby an individual is
sometimes said to ‘have personality’ if he is very domineering,
impressive or attractive.)

Character is often used synonymously with personality, but
is usually a more evaluative term. That is, it refers to certain
traits of personality which are approved or disapproved, such
as honesty, reliability, integrity, self-control and their opposites.

The word ‘temperament’ has also received diverse definitions,
but is most usefully limited to the constitutional and inborn
factors underlying personality: the instinctive drives, the
effects of the endocrine glands or other physiological factors on
a person’s behaviour, and certain general tendencies which may
be at least in part hereditarily determined, such as the strength
or urgency of drives, excitability vs. placidity, and emotional
instability. We cannot in fact ever observe temperament
directly, since even in early infancy it is influenced and modified
by the parents’ or nurse’s handling and other environmental
factors. Nevertheless quite marked individual differences do
occur in the personalities of young babies, also among brothers
or sisters who have apparently been brought up alike. Hence
the existence of innate temperamental factors seems a reasonable
hypothesis.¹ (Like personality, temperament often has a
narrower colloquial usage, namely, the unstable or hysterical
traits commonly found among artists and actresses. But
according to our view this is merely one kind of temperament.
Indeed, very probably it is not innate at all, but is a kind of
personality.)

¹ Recently Eysenck and Prell have obtained strong evidence, from a
study of twins, that emotional stability-instability is hereditarily
determined to at least the same extent as is general intelligence. Eysenck, H. J.
and Prell, D. B., ‘The Inheritance of Neuroticism: An Experimental
Study’. J. Ment. Sci., 1951, 97, 441-465.

THE STRUCTURE OF PERSONALITY

Personality develops, then, from the interaction of the living
human organism with an environment that frustrates or
encourages, and conditions its impulses. Psychoanalysts have
shown how the manner of early handling, feeding, and weaning,
the love and security that the parents may give or withhold,
and the ‘sanctions’ that society imposes, mould the growing
child. Although their theories are largely unverified,
controversial and unscientific (so that few, if any, definite
associations are established between particular methods of upbringing
and later personality traits²), yet we may agree that an
organized system or structure is built up which includes the
conscious sentiments and interests, and the unconscious
‘mechanisms’ or complexes, and which determines the child’s
or adult’s behaviour in any situation. Much progress is indeed
being made through the researches of medical and of
experimental depth psychologists towards formulating general
principles of personality dynamics,³ though this is not the place
to describe such work. Special mention should be made,
however, of Allport’s principle of functional autonomy, the view
that fresh mechanisms and interests continue to develop during
a man’s life-time and to become self-supporting. There is no
justification for the implication that the motive forces of his
behaviour invariably trace back to McDougallian instincts, or
to Freudian complexes which are fixed during the pre-school
years. The evidence presented in later chapters does not
support the view that personality can only be understood if it
is studied longitudinally or historically. Nevertheless this
structure is fairly stable, and thus produces consistency of
behaviour towards similar situations from time to time. (For
example, Neilon⁴ has compared personality sketches of
2-year-old children with independent sketches of the same
individuals 15 years later, and found that judges could match
or identify the one set with the other fairly successfully. In
terms of the present writer’s matching formula, the consistency
of personality over this time is represented by a coefficient of
0.64.) It is the extreme complexity of the structure, and the
fact that many of its links are repressed into the unconscious
mind, which make it so difficult for us to understand a person’s
motives. Personal behaviour is always meaningful, and even
at its most irrational (as in the neurotic or psychotic) it is
logically determined by this structure. Yet it may appear to
vary inexplicably. The same child may display very different
characteristics at home, at play with his friends, in school classes
under different teachers. And we interpret it so diversely
that experimental investigations rarely yield correlations higher
than 0.5 to 0.6 between assessments of a person’s traits by
different acquaintances or observers.

² Cf. Orlansky, H., Bibliography.
³ E.g. Sears, R. R., Survey of Objective Studies of Psychoanalytic Concepts.
New York: Social Science Research Council, 1943. Cattell, R. B.,
Personality. New York: McGraw-Hill, 1950. Stagner, R., see
Bibliography.
⁴ Neilon, P., ‘Shirley’s Babies After Fifteen Years: A Personality
Study’. J. Genet. Psychol., 1948, 73, 175-186.

Research on leadership during and since the war has been
particularly enlightening in showing that this quality of
behaviour is not, as it were, a fixed ‘property’ of the individual,
but that it varies according to the kind of social group of which
he is a member and the activities in which the group is engaged.⁵
Other relevant studies, such as those of Hartshorne and May
on honesty, will be cited in later chapters. But this does not
mean, as some American writers have supposed, that personality
consists of vast numbers of independent habits, specific to each
situation. The consistency is there if we could but trace it.
Nevertheless the traits or qualities of behaviour by which we
describe people are, it must be admitted, very rough and
over-simplified generalizations. We are far too apt to jump to
conclusions, to ignore the complexities of structure, and to
assume that people will always react in certain limited ways:
that a boy who, say, is caught cheating in class is destined for a
life of crime, whereas the cheating may have arisen from all
sorts of motives. Hence experimental research reveals a
tendency that Thorndike⁶ called the ‘halo effect’. If we rate
or assess a number of people on several presumably distinctive
traits (say good looks, intelligence, sociability, moral character),
it is always found that the ratings overlap rather closely.
Presumably we are influenced, unwittingly, by our general
good or bad impressions of the people, and so attribute all the
desirable traits to some, undesirable ones to others, almost
regardless of their meaning.

⁵ Cf. Carter, L., Haythorn, W., and Howell, M., ‘A Further Investigation
of the Criteria of Leadership’. J. Abn. Soc. Psychol., 1950, 45, 350-358.
⁶ Thorndike, E. L., ‘A Constant Error in Psychological Ratings’.
J. Appl. Psychol., 1920, 4, 25-29.

PERSONALITY TRAITS

Since this book is mainly concerned with attempts to measure
or judge personality traits, we must define the sense in which
the term is used. Trait refers to any characteristic in which
people differ or vary from one another. But such differences
exist at many levels, as it were. There are physical
characteristics, and objective features of behaviour such as speed of
walking, amount of time spent at church or the cinema, etc.
At the other end of the scale there are the psychoanalyst’s
mechanisms or complexes, which cannot be directly observed
at all, but which are inferred as the underlying ‘ganglia’ of
personality structure. Personality traits lie between these
extremes; they are more general qualities of social and
emotional behaviour: the common features which we abstract
from observing how people differ. And we would regard them
(like Stagner) as descriptive rather than (like Allport) as
explanatory. Cattell distinguishes between descriptive or
‘surface’ traits, and underlying ‘source’ traits. We would
accept this, but doubt his claim to be able to determine source
traits merely by factor analysis (cf. p. 12).

The scientific study and measurement of traits is exceedingly
difficult, for several reasons. First, they are mostly very vague
and ambiguous in meaning, and different people often include
different modes of behaviour within any one trait. Such things
as height and weight are objective; any two observers would
arrive at practically the same measurements. Mechanical
ability, arithmetical attainment and the like, also comprise
certain types of behaviour which can be fairly readily recognized
and agreed upon by different observers (though even here the
notorious unreliability of examinations shows that their
objectivity is limited). Leadership, honesty, persistence,
introversion, aggressiveness, timidity, etc., are still more
complex. Behaviour which one person interprets as aggressive
might be called adventurous by another, or limelight
exhibitionism by another.

Our second difficulty is that they involve subjective
interpretation. They are partly dependent on the observer. His own
personality and viewpoint both influence what he notices in
other people’s behaviour, and his interpretations of the traits
responsible for such behaviour.¹ For example, a Victorian
parent sees naughty actions in a child, and attributes these to
rebelliousness, where a progressive parent either notices nothing,
or regards the actions as a sign of the child’s initiative. We
recognize this tendency in everyday life; for example, we do
not accept A’s judgment of B at its face value if we regard A
as a prejudiced person. And the more we penetrate behind the
superficial behaviour characteristics to the underlying motives,
the more subjective interpretation enters. The novelist,
dramatist and biographer who excel in depicting personality do
not merely give us a factual record of a man’s actions over a
period. They select the most significant incidents, and infer
or intuit the man’s feelings, desires, fears, ideas, and so integrate
the facts into a meaningful whole. Much the same is true of
the psychoanalyst’s or clinical psychologist’s case study.

¹ Cf. Vernon, P. E., ‘The Biosocial Nature of the Personality Trait’.
Psychol. Rev., 1933, 40, 533-548.

A third point, referred to above, is the artificiality of studying
personality in isolation from society. This runs counter to
one of the dominant trends in contemporary psychology.
Nevertheless the applied psychologist or the layman continually
makes judgments about individuals as though they possessed
distinctive traits, attitudes, and interests in almost any social
context. The extent to which success or failure in a job, or the
development of neurotic or criminal tendencies, is determined
by the qualities of the individual or of the group, is a matter
for experiment rather than for theoretical argument.

Fourthly, we have seen that any particular piece of behaviour
depends on such a multiplicity of factors in the personality
structure and the environment, that no one reacts in accordance
with a trait all the time. Nevertheless this does not mean that
it is hopeless to try to measure personality traits or to assess
people. For it is clear that some individuals behave more
markedly and frequently than others do in a manner that
most of us would call, say, timid, and that others are more
bold or fearless. Of course we cannot measure timidity in
absolute physical units as we can height, temperature, etc. But
so long as we can arrange people in rank order for the trait,
or agree that some are ‘high’, some ‘low’, the essential
requirements of measurement are met.¹ The most fruitful
approach (which originated in May and Hartshorne’s
investigations of character) has been described by the writer elsewhere
as the trait-composite method.²

¹ Cf. Banks, C., and Burt, C., ‘Statistical Analysis in Educational
Psychology’. Current Trends in British Psychology (edit. C. A. Mace and
P. E. Vernon). London: Methuen, 1953.
² Vernon, P. E., ‘Human Temperament’. Eugen. Rev., 1932, 23,
325-331.

THE TRAIT-COMPOSITE APPROACH

Suppose we desire to obtain a measure of an individual’s
timidity-boldness: we first select a number of situations in
everyday life to which people react either timidly or boldly, or
devise special situations or tests which seem likely to bring out
the trait (cf. Chaps. IV-VI). We must apply these to a large
group of comparable people, not only to the one individual,
and record their responses, since the only way of measuring
the strength or weakness of his timidity in each situation is by
seeing how he stands relative to the average and spread of
scores in such a group, i.e. by reference to a statistical
distribution. Here, then, we have a series of samples of the
individuals’ timid or bold behaviour. Next we present them
with a questionnaire or ‘paper-and-pencil test,’ which contains,
say, fifty questions about their past behaviour or their feelings
under conditions which were likely to stimulate them to fear
or bravery (cf. Chap. VIII). We get them to check the answer
to each question which best describes themselves, and if they
do this frankly one or more scores will be obtained which
represent a sample of their own opinions about their timidity.

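Seeing how one person ‘stands relative to the average and spread of
scores’ in a comparison group is what a standard score expresses. The
following is a minimal sketch only, not the writer’s own procedure: it
assumes that spread is measured by the standard deviation, and the
function name and the scores are invented for illustration.

```python
from statistics import mean, pstdev


def standard_scores(raw_scores):
    """Express each raw score as its distance from the group mean,
    measured in units of the group's standard deviation."""
    average = mean(raw_scores)
    spread = pstdev(raw_scores)  # population standard deviation of the group
    if spread == 0:
        raise ValueError("all scores are identical; relative standing is undefined")
    return [(score - average) / spread for score in raw_scores]


# Hypothetical timidity scores for five people in one test situation.
group_scores = [12, 15, 9, 20, 14]
relative_standing = standard_scores(group_scores)
```

A positive value marks someone above the group average on that sample
and a negative value someone below it, so scores from quite different
situations can be set side by side.
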
Thirdly, we ask other people who are well acquainted with
the individuals to estimate timidity and other related traits on
some form of rating scale (cf. Chap. VII). Here again we shall
get better results if they base their assessments on actual
behaviour that they have observed in the past, rather than on
their personal subjective impressions. And we should try to get
hold of acquaintances who will have diverse slants or viewpoints.
The judgments of schoolfellows may afford one quite useful
sample, but different samples should be collected from relatives,
teachers or employers, or others who have observed them in
various phases of their existence.

We now possess scores on, say, a dozen different samples of
behaviour or opinion, and proceed to find by correlation
methods the extent to which the samples ‘hang together’ or
are consistently associated. Thus if we take the schoolfellows’
opinions and the questionnaire scores, we shall probably find
some agreement, but perhaps no higher than a coefficient of 0.4.
This indicates a slight tendency for those individuals who think
that they behave timidly to be regarded as timid by their
friends, although there are many exceptions. Different sets of
ratings by associates may agree more highly, but this may be
largely because they have similar biases. That is why it is
important to get raters from various walks of life, even if the
inter-correlations sink. The coefficients for the objective
situations or tests are likely to be more irregular, yielding an
average correlation with all the other samples of around 0.2
(in the writer’s experience). But this should not surprise us
in view of the previous discussion. Some samples may show
virtually no agreement because they are so remote from the
underlying personality structure (e.g. this might happen if we
took softness of voice or weak handwriting pressure as tests of
timidity). Other samples may have been affected by some
unanticipated motive so that they are really reflecting quite
different traits (e.g. dislike of the experimenter, or some
unconscious compensatory mechanism, etc.).

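The ‘average correlation with all the other samples’ can be worked out
directly once each sample has been scored for the same group of people.
The sketch below is illustrative only: it assumes the ordinary
product-moment coefficient, and the sample names and scores are
invented, not taken from any study cited here.

```python
from math import sqrt


def pearson(x, y):
    """Product-moment correlation between two equally long score lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def mean_correlation_with_rest(samples):
    """For each named sample, average its correlation with every other sample."""
    result = {}
    for name, scores in samples.items():
        others = [pearson(scores, other)
                  for key, other in samples.items() if key != name]
        result[name] = sum(others) / len(others)
    return result


# Invented scores for six people on three hypothetical samples of timidity.
samples = {
    "questionnaire": [3, 5, 2, 8, 6, 4],
    "schoolfellows": [4, 4, 3, 7, 5, 6],
    "voice_softness": [5, 2, 6, 4, 3, 5],
}
averages = mean_correlation_with_rest(samples)
```

With these made-up figures the voice sample averages a negative
correlation with the other two, which is the pattern described above
for a sample too remote from the trait to agree with the rest.
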
The next step is now obvious, namely, to combine as many
diverse, yet overlapping, samples as possible. This combination
of scores is what we call a trait-composite, and this provides us
with the best available measurement of the trait in which we
are interested. Samples which fail to correlate positively with
the rest are eliminated, leaving perhaps only half a dozen or so
with a fairly high average inter-correlation. But one has to be
very cautious at this stage, since omission of the more diverse
samples necessarily narrows the scope of the composite and
renders it less representative. Thus it would not do to use
only ratings or opinions, although these would usually overlap
quite highly. (This is a problem which statisticians have not
yet solved even in the field of abilities; and in the personality
sphere, it looks as if the subjective judgment of psychologists
and the practicability of the sample measures must largely
determine the choice.) Provided, however, that we finish with
half a dozen or more really varied samples, having an average
inter-correlation of 0.30 or more, we can be satisfied that our
composite scores have a ‘theoretical validity’ of at least 0.85.

The notion of theoretical validity implies that the perfect
criterion of our trait is a complete record of all the individual’s
behaviour of the timid-bold variety, together with his
self-expressed ideas and wishes, and all the impressions that he
makes on acquaintances, or interpretations that they offer
(whether biased or not). Naturally we can never collect all
this information, but if we have a representative set of
overlapping samples, we can predict that other samples would
overlap similarly, and that the correlation between our
combined samples and the complete survey would amount to 0.85.
Note that this approach forces us to realise that a test is not ‘a
miraculous instrument for revealing otherwise unsuspected
qualities’ (cf. Vernon and Parry). It is either a good or a
poor sample of the sort of behaviour, or opinion, which goes to
constitute some trait. If this viewpoint were more generally
adopted, less credence would be given to horoscopes, or
physiognomical signs of personality, or to some of the very
artificial tests of temperament and personality on which much
time has been wasted in the past.
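The text does not say how the figure of 0.85 is reached. One reading that reproduces it, offered here as an assumption and not as the author's stated derivation, treats the composite as a lengthened test: the Spearman-Brown formula gives the reliability of a sum of n samples whose mean inter-correlation is r, and the square root of that reliability gives the expected correlation of the sum with the hypothetical complete survey. The function names are illustrative.

```python
from math import sqrt

def composite_reliability(n_samples, mean_r):
    """Spearman-Brown reliability of a sum of n samples with mean inter-correlation r."""
    return n_samples * mean_r / (1 + (n_samples - 1) * mean_r)

def theoretical_validity(n_samples, mean_r):
    """Assumed reading: correlation of the composite with the complete survey."""
    return sqrt(composite_reliability(n_samples, mean_r))

# Six varied samples with an average inter-correlation of 0.30, as in the text.
print(round(theoretical_validity(6, 0.30), 2))   # 0.85

# A dozen samples with an average inter-correlation of 0.2.
print(round(theoretical_validity(12, 0.20), 2))  # 0.87
```

On this reading, adding further overlapping samples raises the figure even when the average inter-correlation is modest, which is consistent with the advice to keep the composite broad.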
Though no really satisfactory external criterion of the validity
of a personality test exists, the correlation between any sample
and such a composite provides us with the best available
evidence. In many experiments, tests have been validated by
correlating with associates’ ratings. But while this may be a
useful first step, it really amounts to no more than comparing
one imperfect sample of the trait with another biased sample.
The same objection applies to comparisons with personality
questionnaires.
Some psychologists would argue, with considerable justification,
that this approach is still too vague and subjective to be
of value, and that it would be better to do without the concept
of traits. For example, Eysenck starts with the much more
definite datum: the difference between neurotic patients and
non-patients, or the difference between hysteric patients and
dysthymic (anxiety and obsessional) patients, etc. He then
proceeds to build up composites of tests which differentiate
the groups as effectively as possible. Many other recognized
‘syndromes’ or types of abnormal personality can be and have
been used as criteria. Again, we can contrast delinquent with
non-delinquent children, males with females, or children who
succeed in grammar school with others (of the same intelligence
and social background) who fail, and likewise develop batteries
of predictive measures. This approach too, however, does not
provide a sure and complete solution, for several reasons. First,
it is doubtful how far the syndromes are distinctive; many
different, yet overlapping, classifications of mental patients
are possible.1 Neither schizophrenics, hysterics, nor juvenile
delinquents constitute homogeneous or clear-cut groups.
1 Several psychologists have applied factor analysis in an attempt to
settle this problem. Unfortunately no two arrive at the same classification.
Cf. Moore, T. V., ‘The Essential Psychoses and their Fundamental
Syndromes’. Stud. Psychol. Psychiat. Cath. Univ. Amer., 1933, 3, 1-128.
Burt, C. L., ‘The Analysis of Temperament’. Brit. J. Med. Psychol., 1938,
17, 158-188. Eysenck, H. J., see Bibliography. Wittenborn, J. R.,
‘Symptom Patterns in a Group of Mental Hospital Patients’. J. Consult.
Psychol., 1951, 15, 290-302.
Secondly, psychiatric diagnoses are subjective and unreliable,
just like associates’ ratings. Thirdly, there are numerous traits
that we would like to measure which cannot be anchored to any
syndrome. It is reasonable to regard mental patients as
representing the extremes of certain normal personality
tendencies. But where are we to find extreme cases of, say,
tolerance vs. prejudice, impulsiveness-cautiousness, persistence,
sense of humour, and so on? Actually, both the trait-composite
and the group-difference approaches are valuable, and should
be regarded as complementary. A trait-composite for emotional
stability might be based largely on measures which differentiate
neurotics from normals and vice versa; one for persistence on
tests which help to predict scholastic or job success.

FACTOR ANALYSIS IN THE FIELD OF PERSONALITY

The relations of these methods to factor analysis require
some consideration. By statistical treatment of the correlations
between personality tests or ratings it is possible to determine
the common elements or factors, in the same way that g (general
intellectual factor) is extracted from intelligence tests, v from
verbal tests, k or S from spatial tests, and so on.1 If we are
studying two or more trait-composites or syndromes
simultaneously, factorization will show whether they are
independent, or whether they would be better combined or divided
up differently. It helps, then, in mapping out the sphere of
personality more parsimoniously, so that we do not waste our
efforts in constructing composites for a lot of traits which come
to much the same thing (e.g. boldness, aggressiveness, leadership,
impulsiveness, initiative, self-confidence and sociability
probably overlap considerably, and factorization might reduce
them to a small number of more fundamental dimensions). Also
factorization enables us to arrive at more accurate weighted
trait-composite scores than the simple sum of results on a set of
samples.
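To make the idea of a common element concrete, the sketch below extracts the first principal component of a small correlation matrix by power iteration. It is a modern stand-in chosen for brevity, not necessarily the extraction method used by any of the authors cited, and the correlations among the four traits are invented for illustration; `first_factor_loadings` is a hypothetical name.

```python
from math import sqrt

def first_factor_loadings(R, iterations=200):
    """Loadings on the first principal component of correlation matrix R.

    Uses power iteration to find the dominant eigenvector, then scales it
    by the square root of its eigenvalue so that entries read as loadings.
    """
    n = len(R)
    v = [1.0 / sqrt(n)] * n
    for _ in range(iterations):
        w = [sum(R[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    eigenvalue = sum(v[i] * sum(R[i][j] * v[j] for j in range(n)) for i in range(n))
    loadings = [x * sqrt(eigenvalue) for x in v]
    return loadings, eigenvalue

# Hypothetical inter-correlations among four overlapping traits.
traits = ["boldness", "aggressiveness", "leadership", "sociability"]
R = [
    [1.00, 0.55, 0.50, 0.35],
    [0.55, 1.00, 0.45, 0.25],
    [0.50, 0.45, 1.00, 0.40],
    [0.35, 0.25, 0.40, 1.00],
]

loadings, eigenvalue = first_factor_loadings(R)
for name, loading in zip(traits, loadings):
    print(f"{name:15s} loading = {loading:.2f}")
print(f"share of total variance = {eigenvalue / len(R):.2f}")
```

Loadings of this kind are one possible source of the weights for a weighted trait-composite, in place of the simple sum.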
But statistically established factors may either be broad in
content like our trait-composites, or they may be narrow. Thus
1 Cf. Vernon, P. E., The Structure of Human Abilities. London: Methuen,
1950. For surveys of factor analysis in the field of personality, see Cattell,
R. B., Description and Measurement of Personality. London: Harrap,
1946. Eysenck, H. J., The Structure of Personality. London: Methuen,
1953.
a combination of several tests of reaction time would give a
clear factor, but this would be a very unimportant personality
trait. Indeed the majority of factorial studies which have been
made in the field of personality have been conducted with
limited kinds of samples: associates’ ratings, self-rating
questionnaires, etc. Cattell has surveyed these studies very
fully and has put forward a list of a dozen primary factors or
‘source traits’ from his own analyses of ratings. But his
attempts to equate others’ results with his own involve
considerable guesswork, and in later investigations where he did
apply objective tests and questionnaires, these failed for the
most part to group themselves convincingly under his primary
factors.1
There is also the difficulty that an infinite number of solutions
exist to any factorial problem. Cattell tries to decide the most
appropriate set of rotations by Thurstone’s criterion of Simple
Structure, Eysenck by linking his factors to criterion groups
(neurotics vs. normals, etc.). But it is still true to say that no
two leading psychologists agree as to which are the best
dimensions. Factors vary again in different groups. Most
investigations have been done with male college students.
But we could not expect, nor do we usually find, the same
patterning among women, among unselected adults, among
neurotic or other patients, or among children of different ages.
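The statement that an infinite number of solutions exist can be checked numerically. In the sketch below a two-factor loading matrix (invented for illustration) is rotated through several angles; the loadings change, yet the correlations they imply between the tests do not, so the data alone cannot choose among the rotations. The function names are hypothetical.

```python
from math import cos, sin, radians

def implied_correlations(loadings):
    """Correlations between tests implied by orthogonal factor loadings
    (the diagonal holds communalities rather than unities)."""
    n = len(loadings)
    return [
        [sum(a * b for a, b in zip(loadings[i], loadings[j])) for j in range(n)]
        for i in range(n)
    ]

def rotate(loadings, angle_degrees):
    """Rotate a two-factor loading matrix through the given angle."""
    c, s = cos(radians(angle_degrees)), sin(radians(angle_degrees))
    return [[f1 * c + f2 * s, -f1 * s + f2 * c] for f1, f2 in loadings]

# Hypothetical loadings of four tests on two factors.
loadings = [[0.8, 0.1], [0.7, 0.3], [0.2, 0.6], [0.1, 0.7]]
original = implied_correlations(loadings)

for angle in (0, 30, 45):
    rotated = rotate(loadings, angle)
    implied = implied_correlations(rotated)
    max_diff = max(
        abs(implied[i][j] - original[i][j])
        for i in range(len(loadings))
        for j in range(len(loadings))
    )
    print(angle, [[round(x, 2) for x in row] for row in rotated],
          f"max difference = {max_diff:.1e}")
```

Criteria such as Simple Structure or criterion groups are therefore additional rules for selecting one rotation, not consequences of the correlations themselves.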
Nevertheless it is possible to discover a modicum of agreement
on two major dimensions, or at least a good deal of
overlapping between different authors. Fig. 1 attempts to give
a rough portrayal of such results. In accordance with
factorial practice, traits shown at right angles are independent
or uncorrelated, and those making small angles with each
other are closely correlated. More than two dimensions are
really needed, and dotted lines are intended to indicate
spokes that project in various dimensions from the plane of
the paper.
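The convention for reading such a diagram is that the correlation between two traits equals the cosine of the angle between their lines. The short sketch below converts in both directions. It assumes trait vectors of unit length lying within the factor space, so on a flat drawing of a structure with more than two dimensions the angles are only approximate; the function names are illustrative.

```python
from math import acos, cos, degrees, radians

def angle_between_traits(r):
    """Angle in degrees between two unit-length trait vectors with correlation r."""
    return degrees(acos(r))

def correlation_from_angle(angle_degrees):
    """Correlation implied by the angle between two unit-length trait vectors."""
    return cos(radians(angle_degrees))

for r in (0.0, 0.3, 0.7, -0.5):
    print(f"r = {r:+.1f}  ->  angle = {angle_between_traits(r):.0f} degrees")
```

A right angle thus corresponds to zero correlation, an acute angle to a positive correlation, and an obtuse angle to a negative one.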
The most pervasive or far-reaching dimension might be
termed dependability, a blend of persistence, purposiveness,
1 Cattell, R. B., ‘Primary Personality Factors in the Realm of Objective
Tests’. J. Person., 1948, 16, 459-487. Cattell, R. B., and Saunders, D. R.,
‘Inter-relation and Matching of Personality Factors from Behavior
Ratings, Questionnaires, and Objective Test Data’. J. Soc. Psychol.,
1950, 31, 243-260.
stability and good character. As early as 1915 Webb 1 described
such a factor, W or will-persistence, though this was based on
ratings and was therefore certainly contaminated with halo or
general good impression. Cattell’s Factor C is similar; it
combines such traits as emotionally stable, realistic and
persevering. In another careful analysis of ratings, Reyburn
and Raath 2 found separate Stable-Mature-Balanced and
Persistence factors, but these were quite closely correlated.

[Fig. 1. Diagram of Relations between Main Personality Dimensions.
Labels legible in the diagram: Extravert (Social), Introvert
(Unsocial), Persistence, Neurotic/Unstable, Undependable.]

Conscientious Effort was the most important of the syndromes
arrived at in Sanford’s elaborate research with children.3 He
attributed it to harmonious and integrated development of the
Ego and Super-ego. More objective backing is forthcoming from
1 Webb, E., ‘Character and Intelligence’. Brit. J. Psychol. Monogr.
Suppl., 1915, No. 3.
2 Reyburn, H. A., and Raath, M. J., ‘Primary Factors of Personality’.
Brit. J. Psychol., Statist. Sec., 1950, 3, 150-158.
3 Sanford, R. N., et al., ‘Physique, Personality, and Scholarship’.
Monogr. Soc. Res. Child Devel., 1943, 8, No. 34.
Hartshorne and May’s work. For though the correlations
between their measures of honesty, persistence, self-control,
co-operation, and consistency were low, yet there was a common
element, which Maller 1 identified as ‘readiness to forgo an
immediate gain for the sake of a more remote but greater
gain’. (Actually persistence showed least overlap with the
other traits mentioned; hence the ‘good character’ aspect of
dependability is shown by a dotted line on our diagram.)
Brogden 2 factorized 29 character tests, and though he rotated
them into half a dozen factors, it is clear that there must have
been quite a strong common element or general factor. Several
other researches have been done on persistence tests (cf. p. 88),
and though here too some authors find very little overlapping,
and others break down the trait into distinct group-factors or
types of persistence, Ryans concludes that a general factor is
justified. In a very thorough research on London secondary
school boys, MacArthur 3 recently demonstrated a strong
persistence factor in objective tests and ratings (together with
subsidiary group-factors), and found that this, as well as
intelligence, contributed to school achievement. Our notion of
dependability corresponds, indeed, to Alexander’s 4 X-factor of
industriousness which, as shown elsewhere,5 so largely
determines educational and vocational success. It links too with
Charlotte Bühler’s conception of the ‘work attitude’ among
young children and with the ‘sense of duty’ which, she
considers, differentiates older and more mature from younger
adults.6 Finally our factor certainly overlaps with Eysenck’s
neuroticism factor, which was based on the contrast between
normals and neurotic patients. For some of his best
differentiating tests are usually accepted as tests of persistence.
1 Maller, J. B., ‘General and Specific Factors in Character’. J. Soc.
Psychol., 1934, 5, 97-102.
2 Brogden, H. E., ‘A Factor Analysis of Forty Character Tests’.
Psychol. Monogr., 1940, 52, No. 234, 39-55.
3 MacArthur, R. S., An Experimental Investigation of Persistence and its
Measurement at the Secondary School Level. Ph.D. Thesis, University of
London, 1951.
4 Alexander, W. P., ‘Intelligence, Concrete and Abstract’. Brit. J.
Psychol. Monogr. Suppl., 1935, No. 19.
5 Vernon, op. cit., p. 11.
6 Bühler, C., From Birth to Maturity. London: Kegan Paul, 1935.
Frenkel, E., ‘Studies in Biographical Psychology’. Char. & Person., 1936,
5, 1-34.
Whether Burt’s 1 factor of general emotionality, obtained from
observations and ratings of children’s emotional reactions, can
also be identified with undependability is more doubtful; it
is therefore shown in Fig. 1 on a different plane. Note that our
factor is not entirely independent of general intelligence (cf. p.
71), but we should need yet another plane to portray this.
The underlying core of the second main dimension, orthogonal
to or independent from the first, is more controversial. For
though it corresponds to the familiar extravert-introvert
dichotomy, this conception has been variously interpreted.
Jung’s original notion of objective vs. subjective orientation
is difficult to pin down, and we would suggest that the most
clear-cut dimension is social-co-operative-liking people vs.
unsociable. Factorial studies of personality questionnaires show
that this type of introversion overlaps largely with neurotic
tendencies.2 Hence the angle on our Figure between
Introversion and Instability is acute; while Eysenck’s own introvert
dimension (derived from dysthymic vs. hysteric patients) is
drawn at 90 degrees to neuroticism. Cattell’s Factor F of
Surgency (optimistic, sociable vs. melancholic, seclusive) is very
close to our conception, as also are dichotomies based on manic
vs. depressed patients, or on the Freudian oral-gratified and
ungratified types.3 Banks’s 4 two major factors, obtained
from ratings of women students, are emotional stability and
‘euthymic-dysthymic’. Reyburn and Raath’s 5 first factor was
Spontaneity-Cheerful-Sociable, and Fiske 6 found a first factor
1 Burt himself regards it as the innate temperamental element
underlying (the inverse of) Webb’s acquired character factor, W. It seems to be
more akin to Cattell’s Factor D (hypomanic vs. phlegmatic). Cf. Burt, C. L.,
‘The Factorial Analysis of Emotional Traits’. Char. & Person., 1939, 7,
238-254, 285-299.
2 Cf. Eysenck, Bibliography; Vernon, P. E., ‘The Assessment of
Psychological Qualities by Verbal Methods’. Industr. Hlth. Res. Board
Rep., No. 83. London: H.M. Stat. Off., 1938.
3 Goldman, F., ‘Breast-feeding and Character Formation’. J. Person.,
1948, 17, 83-103.
4 Banks, C., ‘Primary Personality Factors in Women: A Re-analysis’.
Brit. J. Psychol., Statist. Sec., 1948, 1, 204-218.
5 Op. cit.
6 Fiske, D. W., ‘Consistency of the Factorial Structures of Personality
Ratings from Different Sources’. J. Abn. Soc. Psychol., 1949, 44, 329-344.
In this investigation the most prominent second and third factors were
named Emotional Maturity and Conformity. Our conception of
Dependability partakes of both of these.
of Social Adaptability in ratings of clinical psychology students
by staff, other students, and themselves. Burt,1 however,
distinguishes two bipolar emotional factors: sthenic vs. asthenic
and pleasant vs. unpleasant. Kretschmer’s 2 cyclothyme-schizothyme
typology is even more difficult to place. It overlaps
with Jung’s description of extravert-introvert in many
respects, but is quite different in others. Moreover it is not,
as Kretschmer claimed, continuous with the distinction between
manic-depressive insanity and schizophrenia (cf. p. 86).
Cattell distinguishes it sharply from surgency, but as his
cyclothyme Factor A is based on the traits Easygoing, Frank,
Co-operative vs. Obstructive, Reserved, it seems impossible to
disregard the overlapping. We have, therefore, shown it on
our Figure by a dotted line.
Other traits which are certainly connected with, but should
probably be considered as oblique to, extraversion-introversion
are dominance-submissiveness (Cattell’s Factor E), and
impulsiveness-caution. There seems to be an obscure relation also
with the masculine-feminine dichotomy, with the practical-verbal
(k:m vs. v:ed) factor in abilities, and with types of
interests.3 Obviously the situation is very confused. It might
be cleared up by sub-dividing Dependability and
Extraversion-introversion into group factors, or by the addition of other
independent dimensions. But there seems to be little prospect
of different factorists agreeing. At least this brief survey has
indicated that factor analysis is not yet in a position to supply
a complete and acceptable map of personality. Though an
invaluable tool, it cannot by itself answer the question: what
trait composites should we try to measure? Nevertheless it
would be no mean achievement if these two factors alone could
be firmly established and measured by practicable batteries of
tests and ratings. Not only would it provide a foundation for
more detailed personality studies, but also it would cover a
great deal of what is needed in educational and vocational
selection and guidance.
1 Op. cit.
2 Kretschmer, E., Physique and Character. London: Kegan Paul,
1925.
3 Cf. North, R. D., ‘An Analysis of the Personality Dimensions of
Introversion-Extroversion’. J. Person., 1949, 17, 352-808.

PERSONALITY TYPES

Mention should also be made of the German typological
approach. Representative doctrines are those of Jung with his
sensation, thinking, feeling, and intuition as well as
extravert-introvert, types; Kretschmer’s cyclothyme-schizothyme;
Jaensch’s T and B, integrate and disintegrate, and the
synthetic-analytic types of older writers; and Spranger’s types of
value (cf. p. 162). Such types are quite different from
trait-composites or syndromes, that is, clusters of behaviour samples
which are known to inter-correlate.1 They are admitted to be
idealizations or intuitive generalizations, whose value resides
in their bringing together a whole set of phenomena and showing
how they interact in the hypothetical typical person. They
help in the understanding rather than in the exact description
or measurement of people. For example, Kretschmer 2
interrelates physique, psychopathological syndromes, normal
temperamental variations, certain characteristics of perception,
imagery and movement, and styles of artistic and literary
productions. But no one individual is supposed to show all
these aspects, any more than we expect to meet pure and
complete examples of such types as the aesthete, the Aberdonian
or the absent-minded professor. The latter are referred to as
‘stereotypes’ by social psychologists; they are very similar to
the German concept, though they possess even less logical or
empirical backing. Kretschmer, Jaensch and their collaborators
have carried out many experiments which claim to show
differences between groups of people representing different
types, though they seldom apply statistical tests of significance,
or correlation techniques, for demonstrating just how closely
the characteristics are linked. The present writer has surveyed
some of the literature elsewhere, and shown how chaotic are
1 They can, however, be expressed quantitatively by means of what
Stephenson calls Q-technique. Stephenson shows also that there is no
validity in the common objection that most of the traits or attributes of
which a type is composed tend to be normally, not bimodally, distributed.
Stephenson, W., ‘Methodological Considerations of Jung’s Typology’.
J. Ment. Sci., 1939, 85, 185-205. Introduction to Q-Technique. Univ.
Chicago Psychol. Dept. (mimeographed), 1951.
2 For a sympathetic appraisal of Kretschmerian typology, see Eysenck,
H. J., ‘Cyclothymia and Schizothymia as a Dimension of Personality.
I. Historical Review’. J. Person., 1950, 19, 123-152.
the results, due to the lack of exact specification of the persons
and situations investigated, and of proper experimental
controls.1 Such work, together with the writings of Jung and
Spranger, is full of interesting insights and hypotheses which
would be worth following up by objective research. But it
appears to contribute even less to the scientific assessment of
personality than does the trait-composite, the syndrome or
group-difference, or the factorial approaches.
This brings us to the extremely important and controversial
question of how far personality can be measured. Is it not a
Gestalt or totality, which is disrupted by our breaking it down
into separate traits or factors? ‘The mere enumeration of a
person’s traits and habits does not give us the person himself,
since it omits the essential aspect of organized structure. Each
single characteristic has to be considered in relation to the
whole.’ 2 Allport has put the arguments very clearly for the
idiographic, clinical, or intuitive approach to personality, as
opposed to the nomothetic, analytic and psychometric approach;
while Eysenck has forcefully supported the nomothetic
position. Now we would agree that the clinical standpoint
always operates in everyday life; we see the individual as a
unique and integrated whole, and we make our predictions about
him, or advise him, in the light of our understanding of his past
history and present structure. Every biographer or novelist,
every vocational or medical psychologist who is concerned with
personalities, does the same; and they would regard a
cross-section of a person’s measurements on a series of separate
traits as entirely inadequate. But unfortunately the factual
evidence, some of which is surveyed in our next chapter, seems
to show that every layman’s or psychologist’s insight into an
individual’s personality is different. The ‘clinical’ approach
is so subjective and unreliable that more accurate predictions
can often be made by narrower, though more objective, methods.
The present writer would not accept Eysenck’s extreme position,
since the objective methods themselves give very variable and far
from certain results. But he would urge that clinical judgments
would always be assisted, and often corrected, by making use of
1 Vernon, P. E., ‘The Rorschach Inkblot Test’. Brit. J. Med. Psychol.,
1933, 13, 179-200.
2 Vernon, P. E., ‘Can the “Total Personality” Be Studied Objectively?’
Char. & Person., 1935, 4, 1-10.
such objective techniques and scientific tests as are available.
Moreover, he would reject the view that these two approaches
must necessarily be opposed. On the one hand, most of the
hypotheses about personality to be investigated, and most of the
ideas for possible tests, derive from the clinical (medical or
vocational) psychologist, or from everyday-life observations. On
the other hand, the psychometrist can devise appropriate objective
techniques for validating or disproving these hypotheses.
For example, the lay or psychological interviewer can be
given full freedom to use his ordinary methods of enquiry, and
to apply his insight, and yet express his conclusions in terms of
predictions, say, of success in some job, which can then be
validated in the same way as a test. The matching method,
described in Chap. IV, enables correlations to be calculated
between two sets of ‘wholistic’ data, such as impressions of
the voice or of handwriting, and sketches of the personalities.
Even the uniqueness of every personality could be covered,
theoretically at least, by the standard psychometric approach;
for if sufficient traits were measured, each personality would
show a different pattern or profile of scores. Cronbach 1 has
suggested one technique for comparing patterns of scores, as
distinct from isolated variables, and many other statisticians
are interested in this problem.2 Especially fruitful are the
techniques based on ‘correlations-between-persons’ or
‘between-occasions’ (Q- and P-techniques) developed by Burt,
Stephenson, and Cattell,3 since these make possible the
statistical treatment of structural properties of individual
personalities. Not only cross-sectional but also genetic-dynamic
patterns can be formulated quantitatively. With the influx,
especially in America, of laboratory-trained psychologists into
the fields of clinical and consulting psychology, we may hope for
more rapid advances in bringing the two approaches together.
1 Cronbach, L. J., ‘“Pattern Tabulation”: A Statistical Method for
Analysis of Limited Patterns of Scores, with Particular Reference to the
Rorschach Test’. Educ. Psychol. Measmt., 1949, 9, 149-171.
2 Cf. Banks, C., and Burt, C., ‘Statistical Analysis in Educational
Psychology’. Current Trends in British Psychology (edit. C. A. Mace and
P. E. Vernon). London: Methuen, 1953.
3 Burt, C. L., and Watson, H., ‘Factor Analysis of Assessments for a
Single Person’. Brit. J. Psychol., Statist. Sec., 1951, 4, 179-192. Cattell,
R. B., Personality. New York: McGraw-Hill, 1950. Stephenson, W.,
op. cit., p. 17.
II
The Interview : Its Reliability and Validity

The interview is the method most frequently used for
assessing personality either for educational or vocational
purposes, and for the diagnosis of maladjusted children and
mental patients. There may be a single formal session conducted
by a headmaster, prospective employer, or a group of people,
where a rough picture of the personality of the interviewee is
obtained and recorded in the form of notes or a personality
sketch, or used to guide the decision to accept or reject. Or
there may be a series of interviews, interspersed with
observations of behaviour in the school, factory, or clinic, leading
to the completion of a school’s or personnel manager’s record card,
or the writing of a full case-study. The more sophisticated
vocational psychologist may sum up his impressions by rating
a series of standard traits. The psychological processes involved
in conducting interviews and making judgments of personality
are well described by Oldfield 1 (cf. also Chap. VII), and several
useful books have been published on the technique of
interviewing. A general summary is given by Vernon and Parry.
The interview is also the most comprehensive of methods,
since it makes more or less use of all the techniques which we
shall discuss in subsequent chapters, though usually in a
haphazard manner: observations of external appearance, of
gestures, voice and other modes of expression, and of behaviour
under the stress of the interview situation or in response to
difficult questions; evidence regarding behaviour and
achievements in the past; self-descriptive data regarding the
interviewee’s interests, social attitudes, etc.; and reference is
usually made to testimonials or assessments provided
beforehand by associates. The psychiatric interview may in addition
include some projection material, such as free association and
analysis of dreams. It is this very multifariousness which
makes its results so uncertain; there is so much scope for the
1 Oldfield, R. C., The Psychology of the Interview. London: Methuen,
1941.
interviewer to jump to false conclusions, and to be influenced
by prejudices and unsound theories. A notorious instance of
this was described by Rice.1 Two interviewers tried to discover
the reasons for the destitution of large numbers of persons
applying for relief. One, a socialist, found that 39% of his
cases were attributable to industrial conditions, 22% to drink.
The other, an advocate of temperance, found 7% and 62%
respectively under these headings.
The selection interview is obviously unsatisfactory, too,
because it provides such an unrepresentative and limited sample
of the interviewee’s behaviour. He is keyed up to make a
good impression, and subjected to a situation very different
from that commonly occurring in his later school career or
job. But in spite of all the adverse evidence the interview is
likely to remain the major technique for the following reasons.
First, it is almost universally accepted not only by employers
and others who require assessments of personality, but also by
people who are to be assessed. The former would resent being
unable to exercise their own powers of ‘summing up’, and the
latter are suspicious of more impersonal (even if more efficient)
techniques such as tests, examinations, or written information.
Secondly, it is quicker and more economical than any method
involving tests, and also more flexible or adaptable to new
purposes or to special circumstances. The development and
validation of a battery of tests involves protracted work by
skilled psychologists, and if candidates of a different age or
different educational or social background come forward, the
whole process needs to be repeated. Tests and other methods
more objective than the interview can certainly be of great use
in research on personality, but it will rarely be possible to apply
them for any practical purpose other than supplementing the
interview.

RELIABILITY AND VALIDITY

Before proceeding, we should clarify the meaning of the terms
‘reliability’ and ‘validity’. When a layman states that a test or
interview is reliable he usually means that its results are correct
and useful. This, however, is what the psychologist calls
1 Rice, S. A., ‘Contagious Bias in the Interview’. Amer. J. Sociol.,
1929, 35, 420-423.

valid, that is, the extent to which the method predicts, or correlates
with, some external criterion. As shown in Chap. I, the validity
of a test or interview can be investigated: (a) by its correlations
with other samples of behaviour or opinion (e.g. ratings) which
are presumed to cover the same trait or quality; (b) by its
correlation with a trait-composite or factor; (c) by its capacity
to discriminate between groups known on other grounds to
differ in personality, such as different types of neurotic patients,
or successes and failures in a job.
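Approaches (b) and (c) can each be reduced to a single correlation, as the following minimal sketch suggests. It assumes that the trait-composite is formed as an unweighted sum of standardized sample scores and that group membership is coded 1 or 0, which makes the second coefficient a point-biserial correlation. All scores and names are invented for illustration.

```python
from math import sqrt

def pearson(x, y):
    """Product-moment correlation between two equal-length score lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def standardize(scores):
    """Convert raw scores to standard scores (mean 0, standard deviation 1)."""
    n = len(scores)
    mean = sum(scores) / n
    sd = sqrt(sum((s - mean) ** 2 for s in scores) / n)
    return [(s - mean) / sd for s in scores]

def composite(samples):
    """Unweighted sum of standardized scores, one total per subject."""
    standardized = [standardize(s) for s in samples]
    return [sum(column) for column in zip(*standardized)]

# Hypothetical data for eight subjects.
test_scores = [12, 15, 9, 18, 11, 16, 8, 14]
other_samples = [
    [3, 5, 2, 6, 4, 7, 1, 5],   # e.g. a questionnaire
    [4, 4, 3, 5, 2, 6, 2, 6],   # e.g. associates' ratings
    [2, 5, 3, 6, 3, 5, 2, 4],   # e.g. teachers' ratings
]
in_criterion_group = [0, 1, 0, 1, 0, 1, 0, 1]  # 1 = member of the contrasted group

# (b) validity against a trait-composite of the other samples
print(f"r with composite = {pearson(test_scores, composite(other_samples)):.2f}")
# (c) capacity to discriminate between the two groups
print(f"r with group membership = {pearson(test_scores, in_criterion_group):.2f}")
```

With real data the composite should not include the test being validated, since that would inflate the coefficient.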
Reliability is used to indicate the trustworthiness or stability
of the test itself, apart from its representativeness or capacity
for predicting anything else. Different kinds of tests or
judgments possess different kinds of reliability and unreliability,
and it will be as well to list some of the main ones. In
subsequent chapters the type that is mentioned should usually
be clear from the context.
1. The agreement between two or more persons in their
judgments of, or decisions about, the personalities or traits of a
group of individuals or ‘subjects’. This applies to interview
assessments, to diagnoses of mental patients by psychiatrists,
to ratings, and to interpretations of personality based on
projection tests, or expressive movements.
2. The agreement between two or more observers in recording
specified types of behaviour; also the agreement between two
or more scorers in sorting the responses of subjects to projection
tests into the same categories (apart from any interpretation of
the behaviour or the responses).
3. The agreement between scores or ratings received by the
same subjects when the test, interview, or other method is
repeated. Any method can be investigated in this way, but it
is seldom done, both because the subjects’ responses may differ
when they meet a situation a second time, and because personality
qualities themselves are admitted to be somewhat unstable
or liable to fluctuation. This may be denoted as repeat
reliability.
4. The agreement between scores derived from one half of
the test and those from the other half (e.g. the corrected odd-even
technique), or the consistency of responses to all the items
(Kuder-Richardson technique). The comparison of scores on
two alternate forms of a test given on different occasions is
intermediate between Nos. 3 and 4.
The Interview: Its Reliability and Validity 23
So is the method sometimes applied to ‘time-sampling’ (p. 94),
where records of behaviour on alternate days over a considerable
period may be inter-correlated. This type of reliability (better
termed consistency) chiefly applies to objective tests and
questionnaires, and to certain projection test scores (e.g. Rorschach).
It can be studied too when a judge gives ratings on several
items presumed to cover the same trait, as in third-person
questionnaires (p. 109).
5. A number of tests such as the Haggerty-Olson-Wickman
Behavior Schedules, Kent-Rosanoff word association, Strong
Interest Blank, etc. are scored by the resemblance of the
subject’s responses to those of a given group (mental patients,
people in different vocations, etc.). Such scores tend to be
unstable unless the standardization group was very large. In
this instance, reliability can best be established by what is
called cross-validation. Two groups of, say, mental patients
are used, and two scoring keys developed which differentiate
them from normal persons. Group I is then scored on Group
II’s key, and vice versa. The reliability of the keys is shown by
the extent to which each group is differentiated from normals
by the other’s key. It is generally recognized nowadays that
the same kind of procedure should be applied in all validatory
studies, particularly of batteries of tests. Validities established
within a single group only are often very unreliable.
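
The Type 4 coefficients in the list above can be made concrete with a short sketch. It assumes that the ‘corrected odd-even technique’ means correlating odd-item and even-item half scores and stepping the result up with the Spearman-Brown formula, and that the ‘Kuder-Richardson technique’ is the KR-20 formula for items scored 0 or 1. The function names and the tiny response matrix are invented purely for illustration.

```python
import math

def pearson(x, y):
    """Product-moment correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def corrected_odd_even(responses):
    """Split-half reliability, stepped up by the Spearman-Brown formula."""
    odd = [sum(row[0::2]) for row in responses]   # items 1, 3, 5, ...
    even = [sum(row[1::2]) for row in responses]  # items 2, 4, 6, ...
    r_half = pearson(odd, even)
    return 2 * r_half / (1 + r_half)

def kr20(responses):
    """Kuder-Richardson formula 20 for items scored 0 or 1."""
    n_people = len(responses)
    k = len(responses[0])
    totals = [sum(row) for row in responses]
    mean_total = sum(totals) / n_people
    var_total = sum((t - mean_total) ** 2 for t in totals) / n_people
    sum_pq = 0.0
    for j in range(k):
        p = sum(row[j] for row in responses) / n_people  # proportion passing item j
        sum_pq += p * (1 - p)
    return (k / (k - 1)) * (1 - sum_pq / var_total)

# Hypothetical data: 6 subjects (rows) answering 4 right/wrong items (columns).
example_responses = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
    [1, 1, 1, 1],
]

print("corrected odd-even:", round(corrected_odd_even(example_responses), 3))
print("KR-20:", round(kr20(example_responses), 3))
```

With so few subjects and items the two figures are only illustrative; both summarize internal consistency, not stability over time (Type 3).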

RELIABILITY OF THE INTERVIEW

Obviously no method of personality assessment can be valid
unless it is first reasonably reliable in any or all of the above
senses. No exact standards can be laid down, but lower
coefficients of reliability are normally accepted in the field of
personality than in that of abilities. Only Type 2 correlations
are expected to approximate to 1.0. For the other types, .70
upwards would be considered satisfactory, and .50 to .70 as low
but passable. Coefficients of .85 and over are suspiciously
high, unless obtained from exceptionally thorough tests or from
the combined judgments of several persons. They may indicate
that the method is assessing some rather artificial or stereotyped
quality which possesses poor validity.
Hollingworth carried out one of the first studies of the
reliability (Type 1) of interviewing. Twelve sales managers
interviewed 57 applicants independently and ranked them on
‘suitability for the position in question’. He states that every
applicant received rankings most of the way from the top to the
bottom of the list, though he does not quote any inter-correlations.
Hartog and Rhodes’s¹ study of interviewing of
16 candidates by two boards of experienced Civil Service
examiners deserves particular notice, since it was typical of
much employment or scholastic interviewing. Each board had
the candidates’ educational records, and questioned each
candidate for a quarter to half an hour. The four or five members
recorded separate assessments of alertness, intelligence, and
suitability of personality for the Civil Service, and after
discussion reached a final percentage mark. Within each board
there was fairly close agreement, but the two boards differed
greatly, the correlation between their final marks being only
.41 ± .14. The average discrepancy in marks was 12%, but it
ranged from 1% to 31% for different candidates.
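
The text does not say what the ± figure attached to the Hartog and Rhodes correlation represents. A minimal sketch, assuming it is the ‘probable error’ of a correlation coefficient (a convention common in older reports, taken here as 0.6745(1 − r²)/√n), gives a value close to the one quoted for r = .41 and n = 16 candidates. The function name and the use of √n rather than √(n − 1) are assumptions made for illustration.

```python
import math

def probable_error_of_r(r, n):
    """Classical probable error of a correlation: 0.6745 * (1 - r^2) / sqrt(n)."""
    return 0.6745 * (1.0 - r * r) / math.sqrt(n)

# Correlation between the two boards' final marks, 16 candidates.
pe = probable_error_of_r(0.41, 16)
print(round(pe, 2))
```

On this reading the true correlation could plausibly lie anywhere from about .3 to about .5, which underlines how little a single study of 16 candidates can establish.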
Other studies by Webb, Magson, Fearing, Cantril, Hill and
Williams, and work carried out in the Services during the war,
are summarized by Vernon and Parry. They mostly yield
correlations between interviewers of around .5 to .6. It may be
useful to illustrate this figure. If 2 schoolmasters independently
interviewed 100 candidates for a grammar school, and each
picked the 20 best in his opinion, they would agree on 9 of their
choices only and disagree on 11. This reliability is low, but it
is no lower than that usually found for ratings of personal
qualities by acquaintances who have known the subjects for
some time (cf. Chap. VII). In one investigation by Newman,
Bobbitt, and Cameron,² much higher figures of over .8 were
obtained. Here the psychologist and psychiatrist interviewers
of officer candidates had thoroughly analysed and specified
beforehand just what qualities they were looking for. The
reliability of psychiatric diagnoses (that is the assignation of
patients to particular neurotic or psychotic categories) cannot
readily be stated, since so much depends on the coarseness of
the classification and on the heterogeneity of the patients.
1 Hartog, P., and Rhodes, E. C., An Examination of Examinations.
London: Macmillan, 1935.
2 Newman, S. H., Bobbitt, J. M., and Cameron, D. C., ‘The Reliability
of the Interview Method in an Officer Candidate Evaluation Program’.
Amer. Psychologist, 1946, 1, 103-109.
Quite commonly there is disagreement on 50% of cases.¹
Large variations are also found in the proportions of cases
(from comparable populations) which different psychiatrists
put into the various psychopathological groups.
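
The schoolmaster illustration given earlier in this section (two interviewers, 100 candidates, 20 picked by each) can be checked with a small simulation. This is only a sketch: it assumes the two interviewers’ judgments behave like normally distributed scores correlated at about .55, the middle of the range quoted, and the book does not say how its figure of 9 agreed choices was obtained. The function name, the number of trials and the seed are arbitrary choices for illustration.

```python
import math
import random

def mean_overlap(rho, n_candidates=100, n_picked=20, n_trials=2000, seed=1):
    """Average number of candidates chosen by both of two judges whose
    scores correlate rho, each judge picking his own top n_picked."""
    rng = random.Random(seed)
    total = 0
    for _ in range(n_trials):
        a, b = [], []
        for _ in range(n_candidates):
            z1, z2 = rng.gauss(0, 1), rng.gauss(0, 1)
            a.append(z1)
            # Second judge's score shares variance rho^2 with the first.
            b.append(rho * z1 + math.sqrt(1 - rho * rho) * z2)
        order_a = sorted(range(n_candidates), key=a.__getitem__, reverse=True)
        order_b = sorted(range(n_candidates), key=b.__getitem__, reverse=True)
        total += len(set(order_a[:n_picked]) & set(order_b[:n_picked]))
    return total / n_trials

if __name__ == "__main__":
    for rho in (0.0, 0.5, 0.55, 0.6, 0.8):
        print(rho, round(mean_overlap(rho), 1))
```

Under these assumptions the average agreement for correlations of .5 to .6 comes out in the neighbourhood of 9 or 10 of the 20 choices, against 4 expected by chance alone.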

VALIDITY OF THE INTERVIEW

The evidence regarding validity is still less favourable,
though Vernon and Parry quote studies where particular
psychologists or personnel selection officers (PSOs) were notably
successful in picking recruits for Service employments or for
officer training. Usually the PSO in the Army or Navy had a
record of the candidate’s success on a number of intelligence,
educational, or aptitude tests, and his answers to a biographical
questionnaire. After interviewing him for about a quarter of an
hour, he summarized his judgments of suitability in a final
recommendation and, in some experiments, assessed his
likelihood of success in that employment. The remarkable
result of several such investigations was that the PSOs’ predictions,
although combining the indications from tests with
judgments of experience, personality and interests, were on the
average less valid than predictions based on the best tests
alone. Since there were big individual differences in the
selecting ability of different PSOs, it follows that several of
them must have given much worse predictions than the tests.
Stuit² quotes similar results from the U.S. Navy where interviewers
did no better, or even worse, than ability tests. (No
personality tests were used in these studies.)
Similarly McClelland³ followed up pupils in Dundee secondary
schools and showed that, if account had been taken of primary
school teachers’ judgments of industriousness or other personality
qualities, the number of instances of bad selection would
not have been reduced. No investigation seems to have been
made of grammar school headmasters’ interviews, but they
would hardly be likely to do better than the teachers who had
known the pupils for some years.
1 Cf. Ash, P., ‘The Reliability of Psychiatric Diagnoses’. J. Abn. Soc.
Psychol., 1949, 44, 272-276.
2 Stuit, D. B. (edit.), Personnel Research and Test Development in the
Bureau of Naval Personnel. Princeton, N.J.: Princeton University
Press, 1947.
3 McClelland, W., Selection for Secondary Education. London: University
of London Press, 1942.
This does not mean that the
teachers could not judge suitability for secondary schools at
all, but that the valuable element in their judgments was
already incorporated in the ordinary school marks. Himmelweit’s¹
research on the selection of students at the London
School of Economics is even more devastating. Interview
judgments by a board of university teachers showed zero
validity when checked against degree marks 1 and 2 years
later; whereas a short entrance examination gave small
positive predictions of success, and a battery of aptitude tests
considerably better predictions.
Such experiments cannot always be accepted at their face
value, for two technical reasons. First, the accepted candidates
who are followed up may have been actually selected chiefly on
the basis of the interview. In such a selected group the validity
of the interview inevitably drops, whereas the validity of tests,
etc., which were used only indirectly or not at all for selection,
is less affected.
Secondly, many writers compare the interview with the best
of a number of tests, or with the best combination of a battery
of tests. Such validity coefficients are by no means stable,
especially if the group is small. Different tests might rise to
the top on another occasion, and the same battery with the
same weighting of its component parts would certainly give a
much lower coefficient (a phenomenon known as shrinkage).
These snags can be allowed for, but they are often forgotten by
those who condemn the interview as worthless.
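
The first of the two technical reasons, the drop in validity within a group selected on the interview itself, can be illustrated by simulation. The sketch below is hypothetical throughout: it supposes an interview and a test that are equally valid (.4) in the unselected population, admits the top 30% on the interview alone, and then recomputes both validities among those admitted. None of these figures comes from the studies cited.

```python
import math
import random

def pearson(x, y):
    """Product-moment correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def simulate_selection(n=20000, validity=0.4, keep_fraction=0.3, seed=2):
    """Return ((interview_r, test_r) for all applicants,
               (interview_r, test_r) for those selected on the interview)."""
    rng = random.Random(seed)
    noise = math.sqrt(1 - validity ** 2)
    people = []
    for _ in range(n):
        criterion = rng.gauss(0, 1)  # later success on the job or course
        interview = validity * criterion + noise * rng.gauss(0, 1)
        test = validity * criterion + noise * rng.gauss(0, 1)
        people.append((interview, test, criterion))

    def validities(group):
        interview, test, criterion = zip(*group)
        return pearson(interview, criterion), pearson(test, criterion)

    # Admit only the candidates ranked highest on the interview.
    ranked = sorted(people, key=lambda p: p[0], reverse=True)
    selected = ranked[: int(n * keep_fraction)]
    return validities(people), validities(selected)

if __name__ == "__main__":
    everyone, admitted = simulate_selection()
    print("all applicants  (interview, test):", [round(r, 2) for r in everyone])
    print("admitted only   (interview, test):", [round(r, 2) for r in admitted])
```

In runs of this sketch the interview’s coefficient falls much further than the test’s among those admitted, although the two were constructed to be equally valid to begin with.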

RESEARCHES INTO THE SELECTION OF CLINICAL
PSYCHOLOGISTS AND OF CIVIL SERVANTS

The most comprehensive investigation of interviews, ability
and personality tests yet made is that of Kelly and Fiske into
the selection of candidates for training as clinical psychologists.
Each candidate was studied for 7 days. So far results have
been published for a group of 76 to 98 cases, who were carefully
assessed after 2 years of their 4-year training, and again at the
end of their training. Thus a somewhat different picture may
emerge when larger numbers are available, or when success on
the actual job can be determined. The coefficients quoted in
Table I are averages of correlations with the 2-year and 4-year
assessments.
1 Himmelweit, H. T., and Summerfield, A., ‘Student Selection: An
Experimental Investigation, II’. Brit. J. Sociol., 1951, 2, 59-75.
They are all very small and statistically unreliable,
but this is to be expected in highly selected groups. The
following conclusions are indicated:

Table I

AVERAGE VALIDITY COEFFICIENTS OF PROCEDURES USED
IN SELECTING CLINICAL PSYCHOLOGISTS

                                                                       Correlations
Procedure                                                              2 yr.   4 yr.
Paper qualifications only, judged by 2 psychologists                    .17     .22
Paper qualifications + 1 hour’s interview                               .15     .25
Judgments based on paper qualifications + a number of objective tests   .27     .28
Separate objective tests only:
  Miller Analogies                                                      .17     .30
  Certain scores on Guilford-Martin Personality Inventory               .22     .16
  Strong Interest Blank, scored for psychologist                        .25     .20
  Strong Interest Blank, scored for clinical psychologist               .32     .16
Judgments based on above data + a series of projection tests            .29     .26
Separate projection tests, each given by an independent tester:
  Rorschach Inkblots                                                    .12     .05
  Thematic Apperception                                                 .11     .12
  Sentence Completion (group test)                                      .19     .21
Ditto + a further intensive interview                                   .27     .26
Judgments of a team based on all the above                              .24     .30
Ditto after observing ‘Situations’ tests (final prediction)             .20     .38
Situations alone judged by 3 independent observers                      .27     .22

1. The addition of the first or second interview does not
increase the validity of predictions. (Note that these judgments
were not used for the actual selection of candidates.)
2. The study of paper qualifications and objective test scores
gives almost as good predictions as any of the methods involving
personal contact with the candidates. The collection of more
data, or the bringing of additional judges into the decision,
seldom improves, and may lower, the validity.
3. Certain objective tests are superior to any method involving
‘clinical’ judgment, though we cannot say whether
these would stand up as well on subsequent occasions. A
battery of the best objective ability and personality tests might
well achieve a validity of .4. The Situations tests also give
promising results (cf. group-observation methods, p. 96).
4. The projection tests are decidedly less valuable than the
better objective ones, and the simplest group projection
technique is better than the elaborate individual ones.
5. The degree of confidence felt by the judges in any method
provided completely misleading indications of its true value.
Kelly and Fiske’s explanation of these results is that the
more data the judge of personality has to go on, the more he is
likely to over-weight indicators that have poor significance,
and to under-weight those with good validity. If their findings
are confirmed, it would follow that diagnostic methods which
claim to penetrate most deeply into the personality or to give
the fullest insight, such as psychiatric interviews and batteries
of projection tests, are of less value for practical predictive
purposes than quite superficial but more objective methods.
We are not, of course, denying the usefulness of psychiatric
diagnoses of psychopathological conditions, nor the ability of
psychiatrists to make moderately good predictions of the onset
of neurotic breakdown. Hunt¹ shows that psychiatric screening
of recruits in the American Services was of some value; though
in the absence of experimental evidence, we cannot state that
other more objective methods might not have been more
successful. Again, in Chap. X, we shall see that projection
tests such as Rorschach and T.A.T. are of proved worth in
differential diagnosis of mental patients. But all these methods
which depend largely on subjective, intuitive assessment,
appear to be of very dubious value for vocational predictions.
At the same time, Kelly and Fiske’s work must be accepted
with caution. The somewhat similar selection and follow-up of
British candidates for the higher administrative Civil Service
and Foreign Service, described by the writer elsewhere,² points
to rather different conclusions. These candidates too underwent
a cumulative procedure, though it was less elaborate
(lasting only 2 to 3 days), and no objective personality tests
were used. The criterion against which validity was measured
consisted of assessments after they had spent 1½ to 2 years
at the job. The specimen coefficients quoted in Table II
refer to 330 selected candidates.
1 Cf. Wittson, C. L., Hunt, W. A., and Stevenson, I., ‘A Follow-up
Study of Neuropsychiatric Screening’. J. Abn. Soc. Psychol., 1946, 41,
79-82.
2 Vernon, P. E., ‘The Validation of Civil Service Selection Board
Procedures’. Occup. Psychol., 1950, 24, 75-95.
They are higher than Kelly
and Fiske’s, mainly because they have been corrected for
selectivity. The following points emerge:
1. Objective tests and examinations have poor predictive
value. Even the best combination of them would hardly give
a coefficient greater than .3. Personality tests might have
helped. Thus the sociometric rating is promising.

Table II

VALIDITY COEFFICIENTS OF VARIOUS PARTS OF
CIVIL SERVICE SELECTION PROCEDURE

Entrance examinations, or verbal intelligence tests (average coefficient)         .22
Observation of discussion among groups of candidates (average for 3 observers)    .32
Ditto + observation of committee and other exercises (2 observers)                .44
Consideration of all above evidence + individual interviews (2 interviewers)      .47
Sociometric ratings by candidates themselves                                      .29
Final mark after discussion between 3 observer-interviewers                       .50
Separate board considers all above evidence and re-interviews                     .58

2. With the accumulation of evidence from successive
exercises (Situations), the validities rise.
3. The addition of interviews produces no appreciable
improvement (but the judgments based on exercises may to
some extent have been contaminated by interview data, as the
interviews did not always come last).
4. A re-interview by an independent board (containing no
psychologists) does improve the final predictions significantly.
Another relevant research into the assessment of children’s
personalities is briefly described by Burt. Using teachers’
ratings as criteria, he found that objective tests and judgments
based on projection material gave the poorest validities.
Individual interviews combined with observation of behaviour
were more successful, and observations in standard social
situations (cf. p. 96) better still. But the combination of
methods, after discussion among 3 or more observers and
interviewers, gave the highest validity.
There is good evidence also for the value of one type of interviewing
in the follow-up studies of cases given vocational
guidance by the National Institute of Industrial Psychology
(cf. Vernon and Parry). A few tests of abilities are employed,
but the recommendations are based mainly on the psychologist’s
synthesis of information from the school and the parents, and
from an interview with the candidate. Over 90% of those
who follow the recommendation are found to be satisfied and
satisfactory in their jobs some years later, as compared with
some 50% of those who go against the recommendation. It is
possible that fuller tests of ability together with some personality
tests would do better than this, but it is difficult to see how
these could be chosen and applied when an unlimited range of
occupations is under consideration. Here (as also in Civil
Service selection and in Burt’s investigation), the interviewer
follows a systematic procedure designed to bring out the most
relevant traits and interests, but makes little use of psychoanalytic
interpretations or of projective testing. This may have
something to do with the apparently more valid results.

CONCLUSIONS

We may conclude that many interviews given by untrained,
and also some by very highly trained, persons are of little or no
value for the practical assessment of personality. Nevertheless
the method does give useful results in some circumstances.
Unfortunately we know little about choosing good interviewers
and training them. Clearly it is desirable to develop more
objective methods, despite the difficulties pointed out in the
previous chapter, and despite the fact that they do not seem to
help much in increasing one’s insight into or understanding of
people. In guidance, counselling or therapeutic situations they
cannot be expected to replace the overall picture of the development
and structure of personality provided by the systematic
interview and case-study; yet they should often help to correct
biased judgments or undue confidence on the part of the
interviewer. In selection situations where large numbers of
candidates are involved, and where scientific follow-up is
possible, they might well improve on current interview
procedures. It would be better indeed if the interview was
confined to assessing certain traits which cannot readily be
covered by other methods, that is, treated as one test whose
results are combined with those of other tests. This approach
is used, for example, in the selection of officers for the American
Army, sometimes also in the British Civil Service and in secondary
school selection. Jenkins describes the American officer
boards, where the interviewers do not explore experience or
background but look for and assess only those aspects of manner
and speech, and social traits which are readily brought out in
an interview situation. It is worth remembering that the
object of selection is usually to pick people who will impress
employers, colleagues, and subordinates favourably, not
merely individuals who will do the job well. Often, therefore,
the interview may constitute a useful analogous exercise
(cf. p. 99).
We can now turn to a discussion of the more objective
methods, or at least a selection of them. There has been so
much ingenuity in the construction of tests, and so much
experimentation often leading to variable or contradictory
results, that we shall consider only those whose value, or lack of
value, appears to be best authenticated. Elsewhere the writer
has pointed out that psychological tests derive initially from
the methods we normally employ in judging people in everyday
life, or in interviewing them, though they are refined in a
number of ways. ‘A psychological test, by presenting a
standardized task or situation, elicits a sample of the testee’s
behaviour which can be objectively scored and compared with
norms of performance, and which has been proved to be
predictive of future occupational or other behaviour.’ In the
field of personality this is an ideal rather than an actuality.
Few tests have reached a stage where application and scoring
are as standardized as in testing, say, intelligence; and few
have trustworthy norms, except for specialized groups such as
American university students. Reliabilities are low, as already
pointed out, and validities very difficult to establish. It is
best, perhaps, to ask not what tests are available, but what
methods show most and least promise in the scientific study of
personality.
III
Physical Signs of Personality
PHYSIOGNOMY

How far are outer physical characteristics indicative of
personality traits? Height and weight, body build,
dimensions of the head and bumps on the skull, shape of
profile, height of forehead, size of jaw, colour of hair and eyes,
shape of the fingers, lines on the palm: all these at various
times have been regarded as significant, and still play a part
in popular lore. Red hair is sometimes said to show irritability,
blue eyes, innocence; the jaw reveals determination or
weakness, the forehead intellect, and so on. The ancient Greeks
attributed to people the characteristics of animals which they
resembled. For example, people with aquiline features were
noble, but grasping, like eagles. Lavater and the physiognomists
made careful studies of the features of outstanding people
(artists, philosophers, soldiers, criminals) on the assumption
that others who resembled them physically would be similar
psychologically. The phrenologists analysed personality into a
series of propensities and faculties, the strength of each of
which was represented by the protuberance of a certain
section of the skull. We should also mention, in passing, the
astrologers’ claim that personality is influenced by the stars
under which one is born, since a remarkable number of people,
particularly women, appear to find this credible. Many of our
epithets for personality (martial, saturnine, lunatic, etc.) derive
from such superstitions.
When these alleged correlations are put to experimental test,
scarcely a single one shows the slightest value. D. G. Paterson’s
Physique and Intellect gives a useful summary of the evidence.
In general there is no agreement between measurements of any
physical characteristic and judgments of personality given by
acquaintances. For example, Paterson and Ludgate¹ asked
94 students to assess people well known to them on 26 traits,
and, on dividing these people into blondes and brunettes,
found no appreciable difference in any trait.
1 Paterson, D. G., and Ludgate, K. E., ‘Blonde and Brunette Traits:
A Quantitative Study’. J. Personnel Res., 1922, 1, 122-128.
There was no
support for the common belief that blondes are more domineering,
dynamic, or impatient. The claims of German racial
theorists regarding the superiority of Nordic physical types to
Mediterraneans are equally baseless. These so-called races
are distinguished not only by colouring of hair and eyes, but
also by shape of skull; Nordics being dolichocephalic or long-headed,
Mediterraneans brachycephalic or broad-headed. But
experiment has shown no relation between length or breadth
of skull and either intelligence or other psychological qualities.
Another influential physiognomical theory was that of
Lombroso, the Italian criminologist, who claimed that criminals
belong to a degenerate physical type, which can be recognized
by characteristic features or stigmata. But he omitted to
enquire how often these same stigmata occur in non-criminals.
Later research certainly shows poor physique and physical
defects to be somewhat more common among criminals and
delinquents, but there is no distinct criminal type.
Palmistry and Chirognomy. Psychologists often classify these
along with astrology, phrenology and physiognomy as mere
charlatanry. Recently, however, C. Wolff¹ has provided
plausible reasons why certain dimensions of the hand and
fingers, and lines on the palm, might be affected by endocrinological
and neurological conditions, and thus indirectly reflect
health, temperament, and intelligence. She describes a well-controlled
experiment in reading from the hands alone; her
judgments of the personality traits of 24 students agreed with
their self-judgments (an obviously unsatisfactory criterion) to a
small but statistically significant extent. So far her methods
have not been confirmed by others, and it is doubtful whether
most professional palmists work on equally sound principles.
Readers may object that they themselves or their acquaintances
have had remarkably accurate character-readings from
palmists, phrenologists, and the like. Quite apart from fraud,
there are good reasons why this is possible. First, there are a
number of vague characteristics which anyone will accept as
applying to themselves, e.g. ‘strong sense of humour, abilities
not fully appreciated by others’, etc.
1 Wolff, C., ‘Character and Mentality as Related to Hand-Markings’.
Brit. J. Med. Psychol., 1941, 18, 364-382.
Morgenthaler¹ gave copies
of a single phrenological diagnosis to 10 women, independently;
on the average each of them considered that 70% of its statements
accurately described herself. Secondly, the character reader
seldom bases his judgments solely on specific features of the hand
or head. There are numerous other clues (conversation, manner,
dress, etc.), and these contribute to a total impression which, as
we shall see in the next chapter, may be much more revealing.
Thirdly, we are apt to remember a few striking coincidences much
more clearly than a large number of inaccurate statements.
Nevertheless there are some positive findings to be mentioned.
First, there are gross pathological conditions such as cretinism
and acromegaly, where disorders of the endocrine glands result
both in physical and psychological abnormalities. The cretin
(who has not received thyroxin treatment) is both mentally
defective and sluggish in temperament. Again, men with very
feminine, or women with very masculine, physique do tend to
show some of the emotional qualities and interests of the
opposite sex, according to Terman and Miles;² though there
are far too many exceptions for this to be a safe generalization
in judging people. Secondly, there are slight positive correlations
between height or weight, also size of head, and intelligence
or academic ability on the one hand, and leadership or aggressive
traits on the other. Galton and Pearson found that Honours
students at Cambridge had heads some 2% larger in volume
than ordinary or Pass Degree students. Industrial executives,
political and religious leaders have been shown to be significantly
taller and heavier than others less successful in their
careers. The highly intelligent children studied by Terman³ were
generally superior in physique and health to average children of
the same age, not (as is sometimes supposed) puny prodigies.
The most thorough experiment was that of Murdock and Sullivan,⁴
who found correlations of +.14 to .16 between the heights and
weights of 600 children and their intelligence test scores.
1 Morgenthaler, W., ‘Ueber populäre Charakterdiagnostik’. Schweiz.
med. Wochenschrift, 1930, 60, 851-860.
2 Terman, L. M., and Miles, C. C., Sex and Personality. New York:
McGraw-Hill, 1936. W. H. Sheldon (cf. p. 37) has also described a feminine
physical type in men and its personality correlates.
3 Terman, L. M., et al., Genetic Studies of Genius, Vol. I. Stanford,
California: Stanford University Press, 1925.
4 Murdock, K., and Sullivan, L. R., ‘A Contribution to the Study of
Mental and Physical Measurements in Normal Children’. Amer. Phys.
Educ. Rev., 1923, 28.
But it should not be concluded that physical size means a
large brain and that a large brain indicates intelligence. It is
at least as likely that the more intelligent and successful
individuals have usually been brought up in more favourable
circumstances, and so tend to be superior in health and physique.
Although in the evolution of animal species, increase in size of
forebrain or cerebrum goes with increase in intelligence, and
although the white human race has larger brains than certain
primitive peoples such as Australian aboriginals, there is no
proven correlation among whites. It has been suggested that
complexity of convolutions or other features of the brain,
rather than mere size, underlie intelligence. But in fact
physiologists cannot, at present, tell us anything about the
intellectual or other personality characteristics of an individual
from examining his brain, except in cases of disease or gross
abnormality. We need hardly add that there is no justification
for any of the claims of the phrenologists. Not only is it untrue
that mental capacities and personality traits depend on
particular sections of the cerebrum, but also the strength of a
faculty would not affect the size of that section, nor produce
any swelling visible on the skull surface.

TYPES OF BODY BUILD

A third and more important correlation is that between
body build and certain traits which may be roughly labelled
introversion-extraversion. The notion of physical and temperamental
types has a lengthy history, well described by Roback¹
in his Psychology of Character. Shakespeare summed it up:

    Let me have men about me that are fat;
    Sleek-headed men and such as sleep o’ nights:
    Yond Cassius has a lean and hungry look;
    He thinks too much: such men are dangerous.

The best-known modern formulation is that of Kretschmer,²
who found that a majority of schizophrenic patients were of
asthenic or leptosome (tall-thin-pale) and athletic (intermediate)
build, whereas a majority of cycloid or manic-depressive
patients were of pyknic (rotund-florid) physique.
1 Roback, A. A., The Psychology of Character. New York: Harcourt,
Brace, 1927.
2 Kretschmer, E., Physique and Character. London: Kegan Paul, 1925.
He extended
this generalization to normal persons, claiming that asthenics
tend to be schizothyme or quiet, sensitive, reserved in temperament,
whereas pyknics are cyclothyme or emotionally labile,
genial, and sociable. He also classified historical figures by
body build, stating that they show distinctive philosophies or
interests. For example, realists, humorists, and materialistic
scientists tend to be pyknic, while romantics, idealist philosophers
and metaphysicians tend to be asthenics.
Now we cannot accept Kretschmer’s theories without
qualifications. Later surveys, even of psychotic patients, do
not always confirm his findings, particularly when age is
controlled. (For manic-depressive insanity tends to occur later
in life than schizophrenia, when more people are fat.) So far
as geniuses are concerned, it is easy to pick out cases that fit;
no check was made on those that didn’t. It is questionable,
again, whether deductions based on psychotics apply to normals.
Indeed by an ingenious adaptation of factor analysis (termed
criterion analysis), Eysenck 1 has shown that schizophrenia and
cycloid insanity cannot be regarded as extreme forms of a
normal schizothyme-cyclothyme dimension. He did find,
however, that the neurotic conditions of hysteria and dysthymia
(anxiety + obsessional) are continuous with the normal, and
that patients belonging to these extreme groups are to some
extent differentiated by bodily build. Perhaps the most striking
results among normals are those of Burt,2 who prefers Viola’s
classification of build into macrosplanchnic (predominance of
trunk over limbs) and microsplanchnic (predominance of
vertical over horizontal dimensions), to Kretschmer’s. He
quotes correlations of .32 in adults and .26 in children with his
sthenic-asthenic or extravert-introvert emotional factor.
Another controversial point is just what index of physical
type to use. Kretschmer himself has a complicated series of
criteria for classification, whose application appears somewhat
1 Eysenck, H. J., The Scientific Study of Personality. London: Routledge,
1952. A better name would be ‘rotation to a criterion’, since it
does not, in fact, involve any analysis of the criterion.
2 Burt, C. L., ‘The Analysis of Temperament’. Brit. J. Med. Psychol.,
1938, 17, 158-188.
subjective. Moreover, his theory implies that every individual
falls definitely into one of four distinct types, rather than that
build is a matter of degree (the fourth type is the dysplastic or
irregular). Other writers have suggested various more objective
and quantitative morphological indices. Eysenck compared a
large number of bodily measurements by factor analysis, and
found the following simple index to be representative:

    Height × 100 / (6 × Chest Diameter)

The average male adult obtains an index of 100, while asthenics
range up to about 130, pyknics down to about 70.
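
As a rough illustration of how an index of this kind behaves, the following Python sketch computes height × 100 divided by six times the chest diameter. The function name and the sample measurements are hypothetical, the numerator is an assumption about the form of the index, and both measurements are assumed to be in the same unit. It is not a reproduction of Eysenck's own procedure.

```python
def body_build_index(height: float, chest_diameter: float) -> float:
    """Return height * 100 / (6 * chest diameter).

    Both arguments must use the same unit (for example centimetres).
    The form of the index is an assumption made for illustration.
    """
    if height <= 0 or chest_diameter <= 0:
        raise ValueError("measurements must be positive")
    return height * 100.0 / (6.0 * chest_diameter)


if __name__ == "__main__":
    # Hypothetical measurements in centimetres, chosen only for illustration.
    for label, height, chest in [
        ("narrow chest", 174.0, 24.0),
        ("medium chest", 174.0, 29.0),
        ("broad chest", 174.0, 36.0),
    ]:
        print(label, round(body_build_index(height, chest), 1))
```

A chest diameter equal to one sixth of the height gives exactly 100; narrower chests push the index up towards the asthenic end, broader chests pull it down towards the pyknic end.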
In recent years an ingenious threefold classification of build
and temperament has been put forward by Sheldon.1 It is
based on the relative development of three bodily components :
endomorphy (roundness, softness), mesomorphy (hardness,
muscularity) and ectomorphy (delicate, ‘ linear ’ physique with
weak development of both visceral and somatic structures). An
individual is photographed from standard positions. From
measurements of the pictures he is assigned a threefold rating
on a 1 to 7 scale, indicating his standing on each component.
For example a definite pyknic, on Kretschmer’s system, might
be 7 1 1, and a moderate asthenic 2 4 6. Sheldon’s temperamental
classification, which is claimed to correspond closely to
the physical, is based on thorough clinical interviewing and
observation. By means of ratings on 60 traits the relative
prominence of viscerotonia (sociable, affectionate, love of
comfort), somatotonia (vigorous, assertive, love of muscular
activity), and cerebrotonia (reserved, love of privacy and mental
activity), is determined. Sheldon’s original correlations are
obviously spurious, and later work suggests that the connection
between his physical types and personality traits amounts to the
usual figure of around .2 to .3.2 Investigations have been made
1 Sheldon, W. H., and Stevens, S. S., The Varieties of Temperament.
New York: Harper, 1942. See also Hunt, J. McV., Bibliography.
2 Cf. Child, I. L., and Sheldon, W. H., ‘The Correlation Between Components
of Physique and Scores on Certain Psychological Tests’. Char.
& Person., 1941, 10, 23-34. Fiske, D. W., ‘A Study of Relationships to
Somatotype’. J. Appl. Psychol., 1944, 28, 504-519. Smith, H. C.,
‘Psychometric Checks on Hypotheses Derived from Sheldon’s Work on
Physique and Temperament’. J. Person., 1949, 17, 310-320. Child, I. L.,
‘The Relation of Somatotype to Self-ratings on Sheldon’s Temperamental
Traits’. J. Person., 1950, 18, 440-453.

into the somatotypes of officer candidates and air-force pilots,
but there seems to be no evidence that they are of any value in
selecting men of suitable temperament. Like Kretschmer’s
types, somatotypes probably depend considerably on age.

PSYCHOSOMATIC AND ENDOCRINOLOGICAL RELATIONSHIPS

Although these small positive correlations are almost the
only examples of direct relations between physique and personality,
we should not forget the complex network of interactions
between the physical and the emotional known as psychosomatics.1
Adler pointed out that many persons suffering from
an organic defect of speech, or the senses, or from a deformity,
strive to compensate for this inferiority, and even develop
special talent in the very field of their weakness. Demosthenes’s
impediment of speech and Beethoven’s deafness are stock
examples. People of very small stature not infrequently display
excessive assertiveness or talkativeness. But there is, of course,
no straightforward correlation here between some physical
condition and a psychological quality. Rather it is a dynamic
situation which differs widely in different individuals and which,
while it may be traced out clinically, can hardly be expressed
statistically. Demosthenes’s impediment may have been one
factor in making him a great orator; in someone with a different
personality it might just as well lead to shyness and taciturnity.
Nevertheless one interesting research by Faterson 2 did yield
a correlation of .23 between the scores of students on an inferiority
self-rating questionnaire (cf. p. 127) and the numbers of
bodily defects or symptoms of disease elicited in medical
examinations of these students. Another physical fact to which
Adler attached much importance was order of birth. Being
the oldest or youngest child in a family may affect personality
development in various ways, but as Jones’s 3 survey of the
literature shows, no safe generalizations are possible. There does,
1 Cf. Dunbar, H. F., Emotions and Bodily Changes. New York:
Columbia University Press, 1938.
2 Faterson, H. F., ‘Organic Inferiority and the Inferiority Attitude’.
J. Soc. Psychol., 1931, 2, 87-101.
3 Jones, H. E., ‘Order of Birth in Relation to the Development of the
Child’. A Handbook of Child Psychology (edit. C. Murchison). Worcester,
Mass.: Clark University Press, 1931.
however, seem to be a slight tendency for only children to be more
susceptible to maladjustment and delinquency than others.
There is some connection, again, between lefthandedness (and
possibly left-eyedness) and personality. Certainly an undue
proportion of delinquent and of educationally backward
children are left-handed. But we cannot say whether left-handedness
is a symptom of nervous and temperamental
instability, or whether some young children with maladjusted
personalities tend unconsciously to express their revolt against
society by contrariness over handedness. Burt 1 gives an excellent
discussion of the complex nature and origins of handedness.
Certain sensory deficiencies are often the product of neurotic,
rather than of purely physiological, conditions. Thus Eysenck
finds defective dark adaptation or night vision useful as a measure
of neuroticism. Similarly Slater 2 obtained poor visual acuity
more frequently among neurotic patients than among normals.
Peptic ulcer is the most notorious example of a psychosomatic
disorder. A large proportion of ulcer patients (though by no
means all) tend to show a characteristically drawn and anxious
facial expression, and to be asthenic in physique. They are
often highly vigorous and ambitious people who drive themselves
to the detriment of their health and digestion; and the development
of the ulcer often follows some serious disappointment or
frustration in their lives.3 But it is hardly possible to say how
far physical factors underlie the psychological, or vice versa.
Certain forms of asthma and many other diseases have similarly
been shown to be associated with, or to be unconscious expressions
of, psychological mechanisms. Yet there is no one-to-one
connection which would justify the diagnosis of people who are
liable to particular illnesses as always belonging to particular
personality types.
The notion that bodily chemistry underlies temperament goes
back to the Greeks and Romans, to Hippocrates and Galen.4
The four classical types of temperament—sanguine, choleric,
1 Burt, C. L., The Backward Child. London: University of London Press,
1937.
2 Slater, E., and Slater, P., ‘A Heuristic Theory of Neurosis’. J. Neurol.
Neurosurg. & Psychiat., 1944, 7, 49-55.
3 Cf. Davies, D. T., and Wilson, A. T. M., ‘Observations on the Life-History
of Chronic Peptic Ulcer’. Lancet, 1937, 1353-1360.
4 Cf. Smith, M., ‘The Nervous Temperament’. Brit. J. Med. Psychol.,
1930, 10, 99-174.
melancholic, and phlegmatic—were attributed to the relative
prominence of four bodily fluids or humours—blood, yellow
bile, black bile, and phlegm. The modern version of this
doctrine is based on the discovery of the vital effects of the
endocrine glands and their secretions on growth, health, and
the emotions. Berman,1 for example, not only describes a
pituitary, a thyroid, and other types of personality in whom
these glands are said to be dominant, but also claims to diagnose
the types to which historical personalities belonged. And he
attributes criminality and insanity largely to glandular disorders.
Now it is certainly false to regard each hormone as
responsible for some particular set of traits, although it may
well be true that disturbances in the normal equilibrium of
bodily chemistry do influence the emotions and behaviour.
(Equally, personality maladjustment may in some cases bring
about endocrine malfunction.) It is far too simple, for example,
to regard the sex glands as determining masculine and feminine
temperamental characteristics, or their efflorescence during
puberty as producing the well-known instability of adolescence.
For psychologists recognize that these sex differences and
adolescent traits vary widely in different societies and are largely
culturally determined. In other words, they are matters of
personality, not only of temperament. It is, therefore, much
too optimistic to expect a system of blood tests, or X-rays of
the glands, to provide us with anything more than very rough
indications of personality trends. Almost the only well-substantiated
result is that disorders of the endocrine system,
particularly of the pituitary, are found in a considerable
proportion, perhaps 20%, of children showing serious delinquency
or maladjustment.2 But it is not clear whether such
disorders can be diagnosed easily and objectively, nor how
frequently they occur in psychologically normal children. Some
investigators claim a correlation between alkalinity of body
fluids and emotional stability, but this is denied by others.3
1 Berman, L., The Glands Regulating Personality. New York: Macmillan,
1928.
2 Cf. Healy, W., and Bronner, A. F., New Light on Delinquency and its
Treatment. New Haven, Conn.: Yale University Press, 1936. Lurie, L. A.,
‘Endocrinology and Behavior Disorders of Children’. Amer. J. Orthopsychiat.,
1935, 5, 141-153.
3 Cf. Gilchrist, J. C., and Furchtgott, E., ‘Salivary pH as a Psychophysiological
Variable’. Psychol. Bull., 1951, 48, 193-210.
One would expect that persons with high basal metabolism would
in general be more active and energetic than those who live at a
lower rate. An investigation by Dispensa was unpromising, but
both Sanford and Herrington provide confirmatory evidence.1
Another plausible theory which may help to link biochemical
factors with bodily build and temperament is that some people
are dominated by the sympathetic nervous system, others by
the parasympathetic. ‘ Sympathicotonics ’ are supposed to be
more dominating, impulsive, active, while ‘ vagotonics ’ are
more anxious, depressed, and cautious. A large amount of
work has been done with such indices of autonomic activity as
basal metabolism, pulse, respiration, blood pressure, salivation,
tendency to flushing, and sweating, etc. Unfortunately these
variables fluctuate widely from day to day, and are mostly
very specific; thus it is difficult to get reliable diagnostic
measures on reasonable numbers of cases. And though factor
analysis has been applied, there is no agreement as to what are
the main physiological ‘ dimensions ’.2 Nevertheless the
evidence suggests a general physiological activity factor, and
an autonomic imbalance or sympathetic vs. parasympathetic
factor. Sanford finds that parasympathetic response does link
with asthenic build and (negatively) with social, outgoing,
lively personality traits. Similarly Eysenck, using Wenger’s
best index—the salivation rate—found it to be higher among
hysteric than dysthymic patients. He admits that this might
be due to greater anxiety at being tested among dysthymics,
which would inhibit salivation. Darling 3 considers that
measures of psychogalvanic response (cf. below) and of blood
pressure in children (which tend to be inversely related)
correspond to parasympathetic and sympathetic activity. The
1 Dispensa, J., ‘Relationship of the Thyroid with Intelligence and
Personality’. J. Psychol., 1938, 6, 181-186. Sanford, R. N., et al.,
‘Physique, Personality and Scholarship’. Monogr. Soc. Res. Child Devel.,
1943, 8, No. 34. Herrington, L. P., ‘The Relation of Physiological and
Social Indices of Activity Level’. Studies in Personality (edit. Q. McNemar
and M. A. Merrill). New York: McGraw-Hill, 1942.
2 Cf. Darling, R. P., ‘Autonomic Action in Relation to Personality
Traits of Children’. J. Abn. Soc. Psychol., 1940, 35, 246-260. Wenger,
M. A., ‘Studies of Autonomic Balance in Army Air Forces Personnel’.
Compar. Psychol. Monogr., 1948, 19, No. 4. Cattell, R. B., Personality.
New York: McGraw-Hill, 1950. Sanford, R. N., op. cit. Herrington,
L. P., op. cit.
3 Op. cit.
difference between them correlated to about .3 with ratings on
Activity, Alertness, Co-operativeness, and Attention.
Two other theories should be mentioned. McDougall 1
considered that introversion arises from inhibition of the lower
by the higher nervous centres, and that this is released to
some extent by chemical influences; in other words, the
extravert is normally in a state akin to mild intoxication.
This has not led to any useful tests (cf. p. 79). Finally, there
are Jaensch’s T (tetanoid) and B (Basedowoid) types, which are
believed to depend on the parathyroid and thyroid glands, and
on calcium metabolism.2 Originally derived from differences
in eidetic imagery, the types are claimed to show not only
distinct sensory and perceptual characteristics, but also different
bodily physique and structure of the capillaries. (As pointed
out in Chap. I, few of these correlations were based on any
convincing evidence.) Jaensch later expanded his theories into
an elaborate system of integrate and disintegrate types; the
former were synthetic, intuitive people, the latter more analytic
and inflexible. During the Nazi regime his typology became
entangled with so-called racial psychology, and hardly merits
further consideration.

THE PSYCHOGALVANIC REFLEX AND ELECTROENCEPHALOGRAPHY

At one time it was hoped that the psychogalvanic reflex would
provide a significant clue to emotional traits (cf. Fig. 2). The
electrical resistance of some part of the body, usually the hand,
is measured continuously by a Wheatstone Bridge apparatus,
and it shows marked fluctuations (which are outwith conscious
control) when the person is stimulated, say, by an electric shock,
or by the threat of a shock, or by emotionally toned words in a
free association experiment (cf. p. 172). Like the pulse, blood
pressure and respiration, or muscular tonus (cf. p. 53), it is
affected by psychological tensions. Thus it is often included
in so-called Lie Detection instruments. Unfortunately, however,
there is no simple association between the reflex and any
1 McDougall, W., ‘The Chemical Theory of Temperament Applied to
Introversion and Extroversion’. J. Abn. Soc. Psychol., 1930, 24, 293-309.
2 Jaensch, E. R., Eidetic Imagery and Typological Methods of Investigation.
London: Kegan Paul, 1930.
particular mental state. The PGR is the end result of a variety
of factors—circulatory, postural, sweat secretion and temperature,
not to speak of the thickness of the cuticle, the position
of the electrodes, and other vagaries of the apparatus ; and it
is elicited to varying extents in different people by a great
variety of stimuli. Thus, although twenty or more experiments
on the psychological correlates have been published, their

Fig. 2.—Psychogalvanic Reflex. Electrodes are attached
to the back and palm of the subject’s hand, or to two
fingers, and are connected to a battery by a Wheatstone
Bridge circuit. The variable resistance is then adjusted
until no deflection appears in the galvanometer; the
subject’s resistance (multiplied by 10) can be read off.
When a stimulus is applied the resistance tends to drop,
and the galvanometer needle moves.

results are largely contradictory. Perhaps the most thorough
work is that of Darrow and Heath.1 Neither initial, nor
average resistance, nor total drop in resistance during the
application of a series of standard stimuli, show any clear
relation to other tests or ratings of emotional traits, though
rapid recovery rate from stimulation appears more promising.
Since there is a definite tendency for more emotional word
associations to evoke bigger responses than less emotional,2
1 Darrow, C. W., and Heath, L. L., ‘Reaction Tendencies Relating to
Personality’. Studies in the Dynamics of Behavior (edit. K. S. Lashley).
Chicago, Ill.: Chicago University Press, 1932.
2 Cf. Smith, W. W., The Measurement of Emotion. London: Kegan
Paul, 1922.
the average or percentage response per stimulus is often
considered to give a measure of ‘emotionality’. Some studies
bear this out, but others, including those of Darrow and of the
writer, fail to do so and suggest more connection with social,
extraverted, traits. We would conclude that the PGR may be
of value in clinical studies of the associations or other stimuli
which arouse most tension in an individual, but that it can
hardly be employed as a generalized test of any identifiable
personality trait.
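
The null-balance reading described for the Wheatstone Bridge in Fig. 2 can be sketched in Python as follows. This is a minimal sketch under stated assumptions: the function names, the component values, and the fixed ratio arms in the proportion 10 to 1 (one reading of "multiplied by 10") are all hypothetical and are not taken from the apparatus itself.

```python
def bridge_voltage(v_supply: float, r_arm_subject: float, r_subject: float,
                   r_arm_variable: float, r_variable: float) -> float:
    """Voltage across the galvanometer of a Wheatstone bridge.

    One branch is the fixed arm in series with the subject; the other is
    the second fixed arm in series with the variable resistance. The
    galvanometer sits between the two midpoints.
    """
    v_subject_side = v_supply * r_subject / (r_arm_subject + r_subject)
    v_variable_side = v_supply * r_variable / (r_arm_variable + r_variable)
    return v_subject_side - v_variable_side


def subject_resistance_at_balance(r_variable: float, ratio: float = 10.0) -> float:
    """At balance the unknown equals the variable resistance times the arm ratio."""
    return ratio * r_variable


if __name__ == "__main__":
    # Hypothetical values in ohms; fixed arms are in the assumed 10:1 ratio.
    r_arm_subject, r_arm_variable = 10_000.0, 1_000.0
    r_variable = 5_000.0
    r_subject = subject_resistance_at_balance(r_variable)
    print("resistance read off:", r_subject)
    print("balanced:", bridge_voltage(1.5, r_arm_subject, r_subject,
                                      r_arm_variable, r_variable))
    # A drop in the subject's resistance unbalances the bridge.
    print("after drop:", bridge_voltage(1.5, r_arm_subject, 0.9 * r_subject,
                                        r_arm_variable, r_variable))
```

The point of the null method is that the reading depends only on the ratio of the resistances, not on the battery voltage; any change in the subject's resistance then shows up as a galvanometer deflection.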
The electroencephalograph, which records the overall
electrical activity of the brain in the living subject, has also
been less fruitful as an objective indicator of mental phenomena
than was anticipated. The main rhythm—the alpha waves—
does develop with age during early childhood, but is apparently
unrelated to intelligence. Walter 1 claimed to be able to
distinguish visual thinkers from verbalizers among adults by
means of their EEG records, though this has not been confirmed.
One very useful finding is the appearance of certain
abnormal wave forms in almost all epileptics; and the EEG
is now regularly employed in diagnosing this condition. Similar
abnormalities occur in some 60% of psychopaths, and in
considerable proportions of delinquents and criminals, but only
in 5-10% of normal stable adults.2 But it turned out to be
quite useless in the selection of air-crew. Although then we
have here a reliable test of some condition of the brain which
underlies certain mental abnormalities, its significance for
personality is still obscure.
In conclusion: although this chapter is so largely negative,
it would be as incorrect to say that physical characteristics are
unrelated to personality as it would be to claim that a large jaw
always means strength of will, or that slender fingers always
indicate an artistic nature. What we have shown is that, with
rare exceptions, no isolated physical or chemical symptom has
any invariable significance. Hence any system of personality
diagnosis based on such signs is certain to be false. None the
less, many anatomical or biochemical factors may exert an
important influence in the development of the individual
1 Walter, W. G., ‘Electro-Encephalography’. Recent Advances in
Psychiatry. London: Churchill, 1944.
2 Cf. Hill, D., and Watterson, D., ‘Electro-Encephalographic Studies of
Psychopathic Personalities’. J. Neurol. & Psychiat., 1942, 5, 47-65.

personality. Some of these—such as body build, and certain
chemical, psychogalvanic and EEG measures—might occasionally
be included in appropriate trait composites, although
they do not constitute samples of behaviour in the sense
described in Chap. I. We shall see, also, in the next chapter
that the features, hands and body may, when subjectively
interpreted, be expressive of personality.
IV
Expressive Movements

JUDGMENTS FROM PHOTOGRAPHS

In judging personality from external appearance we normally
rely more on an unanalysed impression of the general balance
of facial and bodily proportions than on specific signs, also on
the movements or play of the features and limbs during conversation,
walking, or other behaviour. Consider first, however,
what can be deduced from the total static appearance as given
in photographs of individuals otherwise unknown to us. In
numerous experiments observers have been asked to rank sets of
‘photographees’ for intelligence or other traits, or to identify
such characteristics as vocations, or to decide which of a number
of emotions the photographee is expressing. The last of these
is fairly easy. Adults and older children are usually quite
successful in naming the more primitive emotions—anger, fear,
laughter, disgust, etc., rather less so in judging the more subtle
expressions such as disillusionment (though much depends on
the aptness of the particular photograph). However, this is an
artificial type of experiment. Landis 1 has shown that actors
(who usually supply the photographs) can readily portray a
number of stereotyped and conventional expressions, but that
unsophisticated persons subjected to highly emotional situations
do not necessarily assume any such consistent or recognizable
expressions. In fact we depend as much on the context or total
situation in judging facial expressions of emotions as on the
features themselves: witness the difficulty of telling what the
people on a cinema screen are expressing if one enters in the
middle of a film, until one has had time to pick up the thread
of the plot. Even if, as Charles Darwin suggested, there were
consistent and universal muscular adjustments for each main
emotion, it would not necessarily follow that more permanent
traits would be revealed by similar contractions. The highly
1 Landis, C., ‘The Interpretation of Facial Expression in Emotion’.
J. Gen. Psychol., 1929, 2, 59-72.
irritable or the good-humoured individual might or might not
habitually show the expressions typical of anger or happiness,
respectively.
The ranking of photographs for intelligence gives very low
average correlations of +.10 or less with intelligence test scores,
though some raters achieve fairly high correlations and others
negative coefficients. There is no tendency for psychologists,
teachers, or doctors to do any better at this than other judges,
and women are no more successful than men. It is found,
however, that most judges agree more closely with one another
than with the true order; in other words, that there is a fairly
widespread convention as to what the intelligent or dull person
looks like—a convention which is largely fallacious. Pintner 1
questioned his judges and found that they relied on a diversity
of signs such as bright eyes, wearing spectacles, also on resemblances
to acquaintances. Judgments of personality traits
such as sociability, efficiency, energy, humour, etc., may be
slightly more successful, though it is, of course, more difficult
to get an objective criterion of accuracy, and most investigators
have employed ratings by associates. Burt 2 obtained coefficients
ranging up to +.37 for certain traits, though the average
was only +.18; and the present writer obtained an average
of only .01 when rankings of 10 students’ photographs on 4
traits were compared with composite measures of these traits.
It was noticeable that the judgments of the different traits—
intelligence, artistic interests, sociability and efficiency—overlapped
considerably; and that the 5 students who received
the highest average judgments on all traits were all smiling
when photographed, whereas the 5 with low judgments all
happened to be caught frowning or with solemn or disagreeable
expressions. Uhrbrock has published similar results, and
Thornton shows that wearers of spectacles tend to be rated
especially high in intelligence, industriousness, and honesty.3
1 Pintner, R., ‘Intelligence as Estimated from Photographs’. Psychol.
Rev., 1918, 25, 286-290.
2 Burt, C. L., ‘Facial Expression as an Index of Mentality in Children’.
Child Study, 1919, 12, 1-10.
3 Uhrbrock, R. S., ‘Estimating Intelligence from Photographs’. Proc.
IX Intern. Cong. Psychol. Princeton, N.J.: Psychol. Rev. Co., 1930,
451-452. Thornton, G. R., ‘The Effect of Wearing Glasses upon Judgments
of Personality Traits of Persons Seen Briefly’. J. Appl. Psychol.,
1944, 28, 203-207.
Landis 1 and several other investigators have obtained virtually
no agreement between judgments of vocational success from
photographs and actual success.
This shows the worthlessness of asking for photographs with
applications for jobs, except perhaps in choosing a private
secretary or a mannequin. If the applicant puts on a pleasant
smile when being photographed he is likely to be credited with
all the desirable traits of intellect and character, whereas a less
fortunate pose will damn him with most employers. Although
it is true that some judges are more accurate than others, this
ability to judge is highly variable and uncertain. Success at
one set of photographs, or at one trait, is very little indication
of success at other sets, or other kinds of judgments (cf. p. 118).
At the same time, this method of correlating rankings with
external criteria of the photographees’ traits is open to serious
criticism. It is likely that most individuals do reveal some
aspects of their personalities in static photographs, though they
do not all provide any reliable evidence of the same aspects.
In another experiment the writer listed 30 traits in random
order, 6 of which were known from a previous investigation
to be particularly applicable to each of 5 photographees. Judges
managed to match 33% of these traits with the appropriate
photograph, and this represents a moderate agreement, equivalent
to a correlation of .32.2
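
The procedure criticized above, in which a judge's ranking of the photographs is correlated with an external criterion ranking, can be sketched in Python. This is only an illustration: the text does not say which coefficient the investigators used, so Spearman's rank-difference formula (valid when there are no tied ranks) is assumed here, and the two rankings are invented.

```python
from typing import Sequence


def spearman_rho(ranks_a: Sequence[int], ranks_b: Sequence[int]) -> float:
    """Spearman rank correlation for two untied rankings of the same items.

    Uses rho = 1 - 6 * sum(d^2) / (n * (n^2 - 1)), where d is the
    difference between the two ranks given to each item.
    """
    n = len(ranks_a)
    if n != len(ranks_b) or n < 2:
        raise ValueError("need two rankings of the same length (at least 2)")
    sum_d_squared = sum((a - b) ** 2 for a, b in zip(ranks_a, ranks_b))
    return 1.0 - 6.0 * sum_d_squared / (n * (n * n - 1))


if __name__ == "__main__":
    # Hypothetical data: one judge's ranking of 10 photographs for a trait,
    # and the ranking of the same 10 people on a criterion measure.
    judged = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
    criterion = [4, 1, 7, 2, 9, 3, 10, 5, 6, 8]
    print(round(spearman_rho(judged, criterion), 2))
```

In a study of this kind the coefficient would be computed for each judge and each trait and then averaged, which is how a few fairly accurate judges can coexist with an average near zero.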
A famous study of stereotypes was made by Rice.3 Nine
photographs, including a French premier, a senator, a bootlegger,
and a Russian Bolshevik were given to several groups of
judges to fit or match with their titles. Many of the matches
revealed the existence of biased or stereotyped opinions as to
what such people should look like. Thus one of the photographees
had a beard, and the majority called him the Bolshevik;
but the Russian himself presented a distinguished foreign
1 Landis, C., and Phelps, L. W., ‘The Prediction from Photographs of
Success and of Vocational Aptitude’. J. Exper. Psychol., 1928, 11, 313-324.
2 This and other matching coefficients quoted below are contingency
coefficients. Burt’s matching formula: √((c − 1)/(t − 1))
(where c = correct choices, t = number of choices) yields higher
correlations, for example .41 in this experiment
(cf. Vernon and Burt, Bibliography).
3 Rice, S. A., ‘“Stereotypes”: A Source of Error in Judging Human
Character’. J. Personnel Res., 1926, 5, 267-276.
appearance, and the majority identified him as the French
premier. Nevertheless one quarter of all the judgments were
correct, where one ninth would be expected by pure chance,
suggesting therefore that people do to some extent conform to
our stereotypes, or that our judgments do have some slight
validity. Arnheim, Gahagan, and the present writer 1 carried
out numerous experiments where small sets of photos of unknown
writers, politicians, or other persons had to be matched
with excerpts from their writings, with their professions, with
short case-studies, specimens of handwriting, records of their
voices and so forth, and in almost all of these a considerable
superiority to chance success was found, corresponding to
coefficients of around .4. Yet another technique which works
well is to get the judges to write free characterizations of the
photographees in their own words. Most of the sketches
appear to contain a good deal of accurate material, and this
can be proved objectively by asking other judges who know
the original photographees to guess which sketch refers to
which person.
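
The chance baseline quoted for Rice's nine photographs (one ninth correct) can be checked by brute force. The Python sketch below assumes, purely for illustration, that each judge assigns the nine titles one-to-one at random; the function name is hypothetical. It enumerates every possible assignment and averages the fraction of correct matches.

```python
from fractions import Fraction
from itertools import permutations


def chance_fraction_correct(n_items: int) -> Fraction:
    """Expected fraction of correct matches when n titles are assigned
    one-to-one to n photographs completely at random."""
    total_correct = 0
    total_assignments = 0
    for assignment in permutations(range(n_items)):
        # A match is correct when title i lands on photograph i.
        total_correct += sum(1 for i, j in enumerate(assignment) if i == j)
        total_assignments += 1
    return Fraction(total_correct, total_assignments * n_items)


if __name__ == "__main__":
    baseline = chance_fraction_correct(9)
    observed = Fraction(1, 4)  # "one quarter of all the judgments were correct"
    print("chance:", baseline)
    print("observed / chance:", observed / baseline)
```

On average exactly one match per set is right by chance whatever the size of the set, so the baseline is 1/n; the same baseline holds if judges guess each title independently.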
In this field of expression the matching method 2 is an advance
on methods which deal with one trait at a time. But it has
various weaknesses. Although numerous judges can be used,
they cannot deal with more than about half a dozen personalities
at once. Hence their success is very variable, depending
largely on the heterogeneity of the particular personalities. It
is better for each judge to match several sets of material,
chosen at random. When matching, say, photographs with
character sketches or case-studies, it often happens that some
single phrase in a sketch happens to give the clue. A remark
about health or stature, neatness, maturity, etc., may allow an
identification, without any real consideration of the personality
as a whole or the appearance as a whole. Other photographs
in the set which do not happen to fit in with the judges’
stereotypes may never be matched correctly except by a process
of elimination.

1 Arnheim, R., ‘Experimentell-psychologische Untersuchungen zum
Ausdrucksproblem’. Psychol. Forsch., 1928, 11, 1-132. Gahagan, L.,
‘Judgments of Occupations from Printed Photographs’. J. Soc. Psychol.,
1933, 4, 128-134. Vernon, P. E., ‘Can the “Total Personality” be Studied
Objectively?’ Char. & Person., 1935, 4, 1-10.
2 Cf. Vernon, Bibliography.

EXPRESSIVE MOVEMENTS

One would expect to get better judgments from observations
of living, moving people, or from motion pictures, than from
static photographs. It is useful to distinguish here between
expressive and adaptive behaviour. The latter implies reactions
to particular situations, where we are interested in what an
individual does and why, i.e. in his aims and achievements,
while the former connotes how he acts, regardless of the
‘content’ of his behaviour. It includes facial expressions,
gestures, postures, poses; also voice and style of speech, handwriting
and literary or artistic style (though not the content of
what is said or written). Dress and furnishings of a room are
among the other characteristics which to some extent reflect
the personality of the owner. Naturally the two types of
behaviour overlap, and we commonly rely on both content and
style in judging a person; but it is not difficult to separate
them for research purposes.
In an attractive book, A Mirror of Personality, J. G. Vance 1
writes : ‘ Almost everything we do gives some clue to the
hidden self, and particularly everything which just happens
without premeditation.’ Part of Freud’s Psychopathology of
Everyday Life is devoted to showing that apparently aimless
movements are no more determined by chance than are slips of
the tongue or lapses of memory, and that they unwittingly reveal
inner traits and wishes. A full discussion and classification is
given by Allport and Vernon in Studies in Expressive Movements.
To a larger extent than is commonly recognized, these modes
of expression depend on the conventions and customs in which
the individual has been reared. Thus the clergyman’s voice,
the army officer’s gait and straight back, the Bohemian artist’s
clothes, the stenographer’s painted finger-nails, the Frenchman’s
free (and the Englishman’s restricted) use of his hands in
conversation, have little significance for personality. They are
useful, of course, in enabling one to identify the social groups
to which an individual belongs. But one needs to be familiar
with national, occupational, or social class differences before
one can observe genuine individual differences. Similarly, the
graphologist must know the script which a writer was originally
1 London : Williams and Norgate, 1927.
taught, and base his interpretations on deviations from this
model. Movements are much affected also by health or fatigue,
and by environmental restrictions such as the pressure of
clothes or furniture on the body or, in the case of writing, the
kind of pen-nib and paper.
This does not mean, however, that our non-adaptive
movements are merely specific learned habits. Thus every adult’s
writing differs from what he or she was originally taught in a
manner peculiar to himself. Moreover, this individual style
does not depend only on the finger and wrist muscles and nerves,
for it reappears if one writes on the blackboard, or with the
left hand, or with the toe on a smooth sand surface.
Handwriting is a particularly useful mode of expressive movement to
study because it leaves a permanent trace ; it is a crystallized
gesture. But the same conclusions hold good for any other
mode. How they arise, and how far they reflect either conscious
traits or unconscious tendencies, is too complex a psychological
problem to be considered here. Only in rare instances, such as
obsessional hand-washing, can the clinical psychologist trace
fairly clearly the origin of any particular movement.1 And
this means that the interpretation of any expression is highly
subjective and uncertain. The obvious deductions (that a
loud voice shows domineeringness, a flabby handshake
weakness, an illegible handwriting disorderliness) are probably as
often as not untrue. Many expressions are compensatory.
Sometimes different parts of the body are contradictory, as
when a shy and nervous person controls his face and voice but
gives himself away with his hands or feet. And intentional
distortion of manner, or playing a part, is not difficult ; many
people, such as salesmen, largely live by it.
Probably our judgments of temporary emotions or moods are
reasonably successful, in that they enable us to get on with one
another in daily life, without too many misunderstandings.
But, as in the case of the features, it does not follow that more
permanent traits can be inferred or intuited. One thing we
do know from experiment is that there is a strong trend towards
reliability and consistency. Allport and Vernon measured the
speed, extension, and pressure of a large number of simple
1 Cf. Krout, M. H., ‘ A Preliminary Note on Some Obscure Symbolic
Muscular Responses of Diagnostic Value in the Study of Normal Subjects ’.
Amer. J. Psychiat., 1931, 11, 29-71.
movements, in walking, writing, drawing, etc., and found quite
high correlations between different occasions and different
muscle groups. Moreover, the man who was expansive in
having large handwriting tended to draw large figures on the
blackboard or with his foot in sand, to take big strides in
walking, and to overestimate angles when his lower arm was
rotated. In addition to this ‘ areal ’ or expansive quality of
movement, it was possible to establish a quality of
‘ centrifugality ’ or ‘ outward-tendency ’, and a factor of force or
emphasis. Similarly, a series of experiments by Wolff 1 showed
that independent judges could match different modes of
expression with some success : motion pictures of bodily
actions, records of the voice, profiles, handwritings, and styles
of retelling a story.
Investigations which try to correlate judgments or
measurements of expressive movements with ratings by associates,
test scores or other criteria of personality, generally give
positive but not very good results. Cleeton and Knight 2 had
a number of people sitting silent on a platform. Judgments by
observers on several traits gave an average correlation of only
+.28 with friends’ ratings, but this was noticeably better than
the average validity of .00 for several measurements of
physiognomical features which were studied in the same
experiment. Eysenck, Himmelweit, and Petrie 3 find no good
differentiation on a number of simple expressive movement
tests between neurotic and normal adults or children, but
claim more promising results with psychotic patients. Estes 4
took motion pictures of several pairs of subjects engaged in
such actions as removing coat and shirt, holding a lighted
match, and wrestling with each other. These subjects had been
very comprehensively studied and assessed previously in
Murray’s 5 research ; thus it was possible to show that judgments
1 Wolff, W., Expression of Personality. New York : Harper, 1943.
2 Cleeton, G. U., and Knight, F. B., ‘ Validity of Character Judgments
Based on External Criteria ’. J. Appl. Psychol., 1924, 8, 215-231.
3 Eysenck, H. J., The Scientific Study of Personality. London :
Routledge, 1952. Himmelweit, H. T., and Petrie, A., ‘ The Measurement of
Personality in Children ’. Brit. J. Educ. Psychol., 1951, 21, 9-29.
4 Estes, S. G., ‘ Judging Personality from Expressive Behavior ’.
J. Abn. Soc. Psychol., 1938, 33, 217-236.
5 Murray, H. A., et al., Explorations in Personality. New York : Oxford
University Press, 1938.
based on the films alone agreed significantly with independent
evidence about their personalities. One of the most
interesting points was that university teachers and professional
psychologists were rather poor judges of the films, artistic and
literary people better than average. In Wolff’s study, free
characterizations of the subjects were written by judges who
observed their modes of expression, and these were identified by
acquaintances of the subjects, with varying success. Wolff
found that subjects themselves often failed to identify their own
modes of expression (e.g. their voices), but nevertheless showed
strongly emotional reactions to them. For example, they wrote
longer and more favourable or unfavourable characterizations
of them than they did of other people’s. This suggested that
some modes reflect, not the conscious personality structure, but
deeper tendencies which the subject is unwilling to accept.
However, Huntley,1 who confirmed most of the experimental
results, points out that they could be more simply explained by
‘ ego-involvement ’. A subliminal or partial recognition of one’s
own voice might be sufficient to evoke the same halo effect that
occurs in ordinary self-ratings.
Another approach is that of Enke,2 who claims characteristic
differences between the expressive movements of pyknics and
asthenics, which are therefore presumably related to
temperament. Pyknics tend to be more unrestrained, smooth, flexible,
and varied in gestures and postures, play of the features,
voice, and handwriting. For example, in carrying a full
glass of water across a room, asthenics are said to move
cautiously with ‘ anguished ’ expressions, pyknics to be more
slap-dash.
Numerous techniques of recording the tension of various
muscle groups, and its variations, have been devised,3 and there
is much evidence pointing to a connection between such tension
and emotional tension. Jacobson,4 for example, bases his
1 Huntley, C. W., ‘ Judgments of Self Based upon Records of Expressive
Behavior ’. J. Abn. Soc. Psychol., 1940, 35, 398-427.
2 Enke, W., ‘ Die Psychomotorik der Konstitutionstypen ’. Zsch. f.
ang. Psychol., 1930, 36, 237-287.
3 Cf. Davis, R. C., ‘ Methods of Measuring Muscular Tension ’. Psychol.
Bull., 1942, 39, 329-346.
4 Jacobson, E., Progressive Relaxation. Chicago : University of Chicago
Press, 1929. ‘ The Neurovoltmeter ’. Amer. J. Psychol., 1939, 52,
620-624.

treatment of neurotic and unstable patients largely on the
practice of progressive muscular relaxation. He describes
an instrument, the neurovoltmeter, for summating action
potentials, and thus measuring the total contraction tendencies
of, say, the relaxed arm. As yet, however, there is no direct

Fig. 3. Motor Reactions on a Luria Apparatus of : (A) a stable,
and (B) an unstable subject, during a free association test. In each
record the upper line represents the pressure of the left hand resting
passively ; the next line shows the compression by the right hand of a
rubber bulb as the subject responds to each stimulus ; the bottom
line indicates the time between the stimulus word and the response.
(Reproduced from The Nature of Human Conflicts, by A. R. Luria.
By permission of Liveright Publishers. Copyright 1932.)

demonstration that individual differences correlate with
recognizable differences in personality. The Russian
psychologist, Luria,1 reports very striking results from free word
association tests in which the subject presses a bulb with his
right hand as he gives each response, and also rests his left
hand on a surface which records involuntary tremors or pressure
1 Luria, A. R., The Nature of Human Conflicts. New York : Liveright,
1932.
variations. This continuous record from both hands shows
irregularities when highly emotional verbal stimuli are given
(in this respect it is quite similar to the psychogalvanic reflex),
or when the subject is under strain, and even more marked
disorganization occurs among neurotics (cf. Fig. 3). These effects
are widely confirmed in the literature, though it can scarcely be
said that they have yet led to a simple and straightforward test.
Clarke and Albino 1 report correlations between neuroticism
and right-hand and left-hand disturbances, respectively, during
word association tests. Several studies of muscular tension in
young children have been carried out by Duffy,2 where the
pressure of the hand on a rubber bulb was recorded during
reaction time and other tests. She claims correlations of
around .5 with assessments of the children’s excitability and
emotionality.

VOICE, SPEECH, AND HANDWRITING

The voice is an interesting form of gesture which happens to
be made by the throat muscles, and to be audible instead of
visible. Intonation and dynamics ; speed, rhythm, and
continuity ; pronunciation ; vocabulary and choice of words,
and style all provide important clues in our reactions to one
another’s personalities.3 Pear 4 has pointed out the rôle of the
voice in judgments of social class and, in an experiment over
the radio, found that occupation and age can be guessed fairly
accurately. Dusenbury and Knower 5 showed that emotions
can be expressed by, and recognized from, tone of voice as
accurately as from facial expressions. Asthenic or pyknic
1 Clarke, A. D. B., The Measurement of Emotional Instability by Means of
Objective Tests. Ph.D. Thesis, University of London, 1950. Albino, R. C.,
‘ The Stable and Labile Personality Types of Luria in Clinically Normal
Individuals ’. Brit. J. Psychol., 1948, 39, 54-60.
2 Duffy, E., ‘ The Measurement of Muscular Tension as a Technique for
the Study of Emotional Tendencies ’. Amer. J. Psychol., 1932, 44, 146-162.
3 Cf. the analyses by Newman, S., and Mather, V. G., ‘ Analysis of
Spoken Language of Patients with Affective Disorders ’. Amer. J.
Psychiat., 1938, 94, 913-942 ; and Moses, P. J., ‘ The Study of Personality
from Records of the Voice ’. J. Consult. Psychol., 1942, 6, 257-261.
4 Pear, T. H., Voice and Personality. London : Chapman and Hall,
1931.
5 Dusenbury, D., and Knower, F. H., ‘ Experimental Studies of the
Symbolism of Action and Voice, II ’. Quart. J. Speech, 1939, 25, 67-75.
physique can be identified with moderate success,1 and different
types of psychotics show many characteristic differences in
voice and speech.2 The most extensive study was that of
Allport and Cantril,3 where sets of three speakers read identical
passages to large audiences, either from behind a curtain, or
through microphone and loudspeaker. Various characteristics
of the speakers were known, and the listeners tried to match
these with the voices. Occupations, political affiliations,
specimens of handwriting, and photographs were identified only
slightly better than chance would allow (coefficients of around
.2). But high or low scores on tests of ascendance-submission
and extraversion-introversion were correctly matched 47% of
the time (vs. 33% by chance), and short descriptions of the total
personalities of the speakers were even better recognized, the
correlations being .29 and .41 respectively. As in experiments
with photographs, judges tend to have common stereotypes
about voices, whether or not these are correct.
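The chance level quoted for a matching experiment follows from simple counting. As a minimal sketch (the function names are illustrative and not taken from any of the studies cited), the Python below assumes that a judge pairs k items with k targets completely at random, enumerates every possible pairing, and shows that the expected proportion correct is 1/k, i.e. about 33% for sets of three. The actual designs may have departed from this simple one-to-one assumption.

```python
from fractions import Fraction
from itertools import permutations


def match_distribution(k):
    """Exact distribution of the number of correct pairings when k items
    are paired one-to-one with k targets completely at random.

    Entry h of the returned list is the probability of exactly h hits.
    Brute-force enumeration, so only suitable for small k.
    """
    counts = [0] * (k + 1)
    total = 0
    for perm in permutations(range(k)):
        hits = sum(1 for i, p in enumerate(perm) if i == p)
        counts[hits] += 1
        total += 1
    return [Fraction(c, total) for c in counts]


def chance_proportion(k):
    """Expected proportion of correct pairings under random matching."""
    dist = match_distribution(k)
    expected_hits = sum(h * p for h, p in enumerate(dist))
    return expected_hits / k


if __name__ == "__main__":
    for k in (3, 5, 6):
        print(k, float(chance_proportion(k)))
```

Under this assumption the expected number of correct pairings is always one, whatever the size of the set, which is why larger sets give a lower chance percentage.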
In the field of speech, much work has been done on the
development of vocabulary, grammatical forms and length of
sentences, with age. More relevant to personality are Piaget’s
observations on the egocentricity and dogmatism of the
language of young children, and their inability to conceive
impersonal causation. As Sanford 4 suggests, there may well
be significant differences among adults in these same qualities,
which might be measured from their conversation. We
commonly regard rate of speaking as indicative of impulsiveness-
deliberation, and fluency or facility as characteristic of
extraversion, and there is some slight supporting evidence. Rogers
(cf. p. 70) found that oral fluency did not overlap at all with
fluency in written tests, when general verbal ability was held
constant ; the former, but not the latter, correlated around .3
with ratings on such traits as cheerfulness. Johnson 5 discusses
1 Cf. Fay, P. J., and Middleton, W. C., ‘ Judgment of Kretschmerian
Body Types from the Voice as Transmitted over a Public Address System ’.
J. Soc. Psychol., 1940, 12, 151-162 ; Moses, op. cit.
2 Cf. Newman and Mather, op. cit.
3 Allport, G. W., and Cantril, H., ‘ Judging Personality from Voice ’.
J. Soc. Psychol., 1934, 5, 37-55.
4 Sanford, see Bibliography. Cf. also Henle, M., and Hubbell, M.,
‘ “ Egocentricity ” in Adult Conversation ’. J. Soc. Psychol., 1938, 9, 227-234.
5 Johnson, W. B., ‘ Studies in Language Behavior, I. A Program of
Research ’. Psychol. Monogr., 1944, 56, No. 2, 1-15.

a number of quantitative measures of language, such as the
Type/Token ratio (the number of different words over total
words). One of the most promising is the Verb/Adjective
ratio, which has been shown by Busemann 1 to correlate
positively with ratings of emotional stability. Again Balken
and Masserman 2 find the highest proportion of verbs in the
speech of anxiety neurotics, the lowest among hysteric patients.
Much investigation needs to be done into the consistency
of speech measures ; it is only too likely that a person’s
style varies greatly in different contexts and different social
situations.
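The two ratios named above are simple counts, and a short sketch may make the definitions concrete. The Python below is an illustration only: the function names, the tag labels and the tiny hand-tagged sample are all invented for the example, and the parts of speech are assumed to have been marked by hand rather than by any particular scoring scheme used in the studies cited.

```python
def type_token_ratio(words):
    """Number of different words divided by total words (case-insensitive)."""
    tokens = [w.lower() for w in words]
    if not tokens:
        raise ValueError("empty sample")
    return len(set(tokens)) / len(tokens)


def verb_adjective_ratio(tagged_words):
    """Number of verbs divided by number of adjectives.

    tagged_words is a list of (word, tag) pairs; the tags "VERB" and "ADJ"
    are labels assumed for this example.
    """
    verbs = sum(1 for _, tag in tagged_words if tag == "VERB")
    adjectives = sum(1 for _, tag in tagged_words if tag == "ADJ")
    if adjectives == 0:
        raise ValueError("no adjectives in sample")
    return verbs / adjectives


# Invented, hand-tagged sample: 8 tokens, 7 different words.
SAMPLE = [
    ("the", "OTHER"), ("old", "ADJ"), ("dog", "NOUN"), ("ran", "VERB"),
    ("and", "OTHER"), ("the", "OTHER"), ("cat", "NOUN"), ("jumped", "VERB"),
]

if __name__ == "__main__":
    print(type_token_ratio([word for word, _ in SAMPLE]))
    print(verb_adjective_ratio(SAMPLE))
```

Because the Type/Token ratio falls as a sample grows longer, samples of equal length would be needed before comparing speakers, which bears on the question of consistency raised above.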
In the ordinary interview, the interviewer not only has the
full range of modes of expression to judge from, but can also
take account of the content of what the interviewee says. The
fact that interview judgments are often so inconsistent and
inaccurate, even with these additional data, indicates that
expressive movements and voice alone are too readily
misinterpreted to yield any easy test of personality which would possess
high validity.
For this reason one would not expect the analysis of hand­
writing to be as useful as graphologists claim, though it has the
advantages not only of permanence but also of being more
spontaneous and less liable to intentional distortion or disguise
than any other mode of expression. It has the disadvantage
that only about a quarter of the population are sufficiently
fluent writers to have developed a mature, individual style.
Graphologists admit that children’s and poorly educated adults’
scripts are much less revealing, though H. J. Jacoby believes
that even the scribblings of pre-school children are to some
extent diagnostic. (His book, Analysis of Handwriting,3 may
be recommended for a general review. Allport and Vernon, and
Bell, survey the psychological literature.)
We must distinguish clearly between the older graphological
systems which attributed significance to specific signs or details
(such as length of t-bars, height of upper and lower
‘ projections ’, etc.) and more modern methods based on the work of
1 Busemann, A., Die Sprache der Jugend als Ausdruck der
Entwicklungsrhythmik. Jena : Fischer, 1925.
2 Balken, E. R., and Masserman, J. H., ‘ The Language of Phantasy ’.
J. Psychol., 1940, 10, 75-86.
3 London : Allen and Unwin, 1939.
Klages and Saudek, which emphasize the dynamic pattern of
the script as a whole, its flow and rhythm, its control and drive.
Objective investigations in which measurements of details are
correlated with ratings of the writers give results as negative
as do studies of physiognomical signs.1 Nevertheless there seem
to be a few significant relations, e.g. backhanded slope, relatively
small capital letters, and insufficient space between lines, tend
to go with certain traits of emotionality or introversion.2 Many
psychologists concluded from the early investigations that the
whole of graphology was mere quackery. Some of their other
studies have shown a complete lack of understanding of the
graphologist’s aims and methods (e.g. several which prove that
untrained judges can guess the sex of the writer in about 66%
of cases, where 50% could be got by chance). On their side,
graphologists have been somewhat unwilling to submit to the
requirements of scientific experimentation, and are apt to use
an esoteric jargon for describing personality which renders
validation very difficult. They claim that the usefulness of
graphology is proved if the writers themselves, or their
acquaintances, accept their personality sketches as accurate.
This is quite unconvincing to the psychologist.
The possibilities of a more Gestalt-like approach are shown
by Wolff’s and Arnheim’s work, already mentioned, and by
Bobertag’s 3 study. Here several graphologists analysed
handwriting specimens and wrote case-studies describing the
personalities of the writers. Friends of the writers then matched
or identified these with an average success of 80%, where
only 20% would be expected by chance. In a similar experiment
Theiss 4 had untrained judges match scripts with thumbnail
personality sketches. His enquiries showed that half the judges
relied on the general pattern of the writing, while a third
1 Cf. Hull, C. L., and Montgomery, R. B., ‘ An Experimental
Investigation of Certain Alleged Relations between Character and Handwriting ’.
Psychol. Rev., 1919, 26, 63-74.
2 Cf. Land, A. H., ‘ Graphology, a Psychological Analysis ’. Univ. of
Buffalo Stud., 1924, 8, 81-114. Harvey, O. L., ‘ The Measurement of
Handwriting Considered as a Form of Expressive Movement ’. Char. &
Person., 1934, 2, 310-321.
3 Bobertag, O., Ist die Graphologie zuverlässig ? Heidelberg : Kampmann,
1929.
4 Theiss, H., ‘ Experimentelle Untersuchungen über die Erfassung des
handschriftlichen Ausdrucks durch Laien ’. Psychol. Forsch., 1931, 15,
276-358.
employed more specific signs. Powers,1 also Cantril and Rand 2
in other matching experiments got moderately good results
from professional graphologists, and poorer ones from non­
graphologists. Stein Lewinson 3 was able to predict failures in
college work and personality maladjustment with considerable
success from her analyses of the handwritings of women
students. Eysenck 4 found that a graphologist could to some
extent predict the answers of neurotic patients to a personality
questionnaire, and could match her analyses with personality
diagnoses written by psychiatrists. Some of the patients
appeared to express themselves in their writing much more
completely than others, and some questions were much better
guessed than others. In a later research,5 another graphologist
(considered to be professionally skilled) entirely failed to
differentiate the handwritings of a group of neurotics from those
of normals. There is plenty of evidence, however, that certain
types of psychotics show strongly differentiated handwriting
characteristics. This is summarized by Bell.
In conclusion it should be pointed out that modern graphology
is a highly skilled art and science. The amateur who attempts
to apply it after reading one or two books is unlikely to give
diagnoses of any value at all. Many so-called professionals are
also inept, and there is no ready means of distinguishing the
bad from the good. The best undoubtedly are able to produce
very penetrating diagnoses of some (though not all) mature
writers. But even they are limited in what they can cover.
For example, handwriting gives little scope for the recognition
of special talents, or social attitudes. It may chiefly be expected
to throw light on the emotional structure, conscious and
unconscious, of the personality, on character integration and
neurotic mechanisms. How completely these are expressed in
graphic movements we do not yet know.
1 Powers, E., cf. Allport and Vernon, Bibliography.
2 Cantril, H., and Rand, H. A., ‘ An Additional Study of the
Determination of Personal Interests by Psychological and Graphological
Methods ’. Char. & Person., 1934, 3, 72-78.
3 Cf. Munroe, R., Stein Lewinson, T., and Waehner, T. S., ‘ A Comparison
of Three Projective Methods ’. Char. & Person., 1944, 13, 1-21.
4 Eysenck, H. J., ‘ Graphological Analysis and Psychiatry : An
Experimental Study ’. Brit. J. Psychol., 1945, 35, 70-81.
5 Eysenck, H. J., ‘ “ Neuroticism ” and Handwriting ’. J. Abn. Soc.
Psychol., 1948, 43, 94-96.
Handwriting pressure and its variations constitute an
important element in graphological analysis. Usually they are

Fig. 4. Handwriting Pressure Variations as Expressive of Personality.
Two students (one an immature, self-assertive, extravagant and
unstable individual, the other a colourless, quiet, agreeable and
dependable person) drew some short straight lines and then wrote
a sentence containing the words ‘ the lazy dogs ’. These records show
remarkable differences between the writers in the application of pressure
to the point of the pencil while carrying out the same simple tasks,
differences which appear to be characteristic of their personalities.
(Reproduced from Studies in Expressive Movement, 1932, by G. W.
Allport and P. E. Vernon, by permission of The Macmillan Company.)

judged by the thickness of the trace made by the nib, or the
depth of indentation of the paper. Numerous instruments
have been devised for giving more accurate records both of
point-pressure (on the paper) and grip-pressure (of the hand on
the pen) ; though almost all of these have the disadvantage of
interfering to some extent with normal writing movements.
Such records show marked individual differences both of average
pressure and of pattern, which appear to reflect the personalities
of the writers (cf. Fig. 4), and which are characteristically
distorted in some organic and functional disorders. Both
Allport and Vernon, and Pascal,1 quote correlations of .5 to .6
between point-pressure and such traits as Energy and
Expressiveness. More systematic exploration of the possibilities of
such apparatus is badly needed.
Two other methods developed out of graphic movements by
clinical psychologists deserve mention. Mira’s 2 myokinetic
diagnosis (P.M.K.) requires the patient to draw with each
hand in turn series of straight, zigzag and circular lines of
standard form, in various spatial directions. Half-way through
each task his vision is blocked, and the subsequent kinaesthetic-
ally-guided movements naturally tend to diverge from the
original form. The type of drift is claimed to throw light on
inner emotional and aggressive trends. Bender’s 3 Visual-
Motor Gestalt test gives scope for distorted perception as well as
inaccurate motor reproduction of shapes. It is based, like the
Binet Memory for Designs, on observation and drawing of a
number of simple figures. Characteristic disturbances occur in
cases of nervous and mental disorder. Similarly Bühler 4 has
described abnormal features in responses to the Stanford-Binet
Ball and Field test among neurotic children. There is little
evidence as to the diagnostic possibilities of such tests in more
normal personalities.
Artistic productions and literary style might logically be
considered here, but are more conveniently postponed to the
chapter on projective techniques (pp. 180, 192).
1 Pascal, G. R., ‘ Handwriting Pressure : Its Measurement and
Significance ’. Char. & Person., 1943, 11, 235-254.
2 Mira, E., ‘ Myokinetic Psychodiagnosis ’. Proc. Royal Soc. Med., 1940,
33, 173-194.
3 Bender, L., ‘ A Visual Motor Gestalt Test and its Clinical Use ’. Res.
Monogr. Amer. Orthopsychiat. Ass., 1938, No. 3. See also Pascal, G. R.,
and Suttell, B. J., The Bender-Gestalt Test. New York : Grune and
Stratton, 1951.
4 Bühler, C., ‘ The Ball and Field Test as a Help in the Diagnosis of
Emotional Difficulty ’. Char. & Person., 1938, 6, 257-278.

BEHAVIOUR DURING PERFORMANCE TESTS

One of the chief difficulties in direct, objective testing of
personality is that subjects so readily guess the object of the
test, correctly or incorrectly, and tend to modify their normal
behaviour in order to make a favourable impression. There is
much to be said, therefore, for more indirect tests where subjects
think that their abilities are being tested, and do not realize
that at the same time their expressive movements and manner
of performance are being observed. Discussing the methods of
personality assessment available in the 1920s, Burt 1 wrote :
‘ The most helpful suggestions are to be gained, not from any
formal or quantitative work, but rather from an alert attention
to his method of attack . . . his confidence, heedlessness, his
readiness to co-operate, his attitude when faced by difficulty
or doubt ; the sidelights so secured are often far more
illuminating and far more correct than any single score on
a scale.’
Most psychologists who have written about individual Binet,
Merrill-Palmer, or performance tests, have stressed the value
of observing the child’s reactions to difficulties and his social
behaviour in the testing situation. And many test record sheets
contain rating scales for the personality traits which commonly
emerge. Goodenough 2 found that negativism, shyness, and
distractibility in young children, as rated by different testers at
successive tests, are fairly reliable, at least over short periods.
Most children do not behave entirely differently on different
occasions or with different testers. On the next page is
reproduced a fairly comprehensive rating sheet which the writer
uses in any suitable test or interview situation with adults. It
follows the principle of Freyd’s graphic rating scales (cf. p. 107).
A cross or tick is made on each of the vertical lines in order to
express rapidly the approximate standing of the subject on
each quality. Additional items may be added for particular
tests, e.g. the Porteus Mazes. It should be completed directly
after the testing session and supplemented by descriptive notes.
1 Burt, C. L., The Young Delinquent. London : University of London
Press, 1925.
2 Goodenough, F. L., ‘ The Emotional Behavior of Young Children
during Mental Tests ’. J. Juv. Res., 1929, 13, 204-219.

Although the tester can base his judgments on the full range
of expressive movements, including conversation, the method
has serious limitations. The subject is in an unusual situation,
and there is no guarantee that he will react to the problems set
before him in the same manner as he habitually reacts in daily
life. Particularly in testing adults, it is difficult to produce a
natural, spontaneous atmosphere. It is only too easy, also,
to jump to conclusions from slight evidence, to misinterpret
facial, vocal, or other expressions, or to be biased by chance
resemblances to acquaintances, by general like or dislike of the
testee, or by hasty first impressions, etc. On the other hand
the psychologist usually has the advantage of coming fresh to
each testee, uninfluenced by previous knowledge of him.
Clearly everything depends on his impartiality, experience, and
intuitive skill.
In one experiment by the writer,1 25 students were observed
by 3 testers (2 of them quite inexperienced) in three different
performance test sessions. The average agreement between
their impressions, determined by the matching of personality
sketches, was represented by a correlation of .72. They also
rated the subjects on Practical Intelligence, Quickness, and
Impulsiveness, Extraversion-Introversion, and Emotional
Stability. The average inter-correlation of .56 shows a
considerable amount of variation either in the subjects’ behaviour
or in the testers’ interpretations. Nevertheless the correlations
of the summed judgments with composite measures of the same
traits averaged .50 (.48 to .61), showing a very promising
validity.
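The two kinds of figure quoted in this experiment, the average inter-correlation between testers and the correlation of their summed judgments with a criterion, can be sketched as follows. This is a minimal illustration in Python, not the procedure actually used : the function names, the tester labels and all the ratings are invented for the example, and an ordinary product-moment correlation is assumed.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Product-moment correlation between two equal-length lists of scores."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two lists of equal length, at least 2 scores")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("scores must vary")
    return sxy / sqrt(sxx * syy)


def mean_inter_rater_correlation(ratings):
    """Average of the correlations between every pair of raters.

    ratings maps each rater to a list of scores, one per subject,
    with the subjects in the same order for every rater.
    """
    pairs = list(combinations(ratings.values(), 2))
    return sum(pearson(a, b) for a, b in pairs) / len(pairs)


def validity_of_summed_ratings(ratings, criterion):
    """Correlation of the raters' summed scores with a criterion measure."""
    summed = [sum(scores) for scores in zip(*ratings.values())]
    return pearson(summed, criterion)


# Invented ratings of five subjects by three hypothetical testers.
RATINGS = {
    "tester_a": [3, 4, 2, 5, 1],
    "tester_b": [2, 4, 3, 5, 1],
    "tester_c": [3, 5, 2, 4, 2],
}
# Invented composite criterion scores for the same five subjects.
CRITERION = [3, 4, 2, 5, 1]

if __name__ == "__main__":
    print(round(mean_inter_rater_correlation(RATINGS), 2))
    print(round(validity_of_summed_ratings(RATINGS, CRITERION), 2))
```

Summing the judgments before correlating them with the criterion allows the individual testers’ idiosyncrasies partly to cancel out, which is one reason a pooled rating may show useful validity even when agreement between single testers is only moderate.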
The method can be made still more useful by including a wide
variety of tests which will stimulate the subjects to display more
significant behaviour. Ordinary individual intelligence, educa­
tional and performance tests are hardly provocative enough. It
can also be developed in such a way as to yield quantitative
indices at least of certain traits, and thus reduce the amount of
subjective judgment. A good example of this is the Q or Quality
score in the Porteus Mazes,2 based on the number of times the
subject crosses or touches the printed lines, cuts corners, starts
to go up wrong turnings, lifts his pencil, etc. Porteus states
1 Vernon, op. cit., p. 49.
2 Porteus, S. D., Qualitative Performance in the Maze Test. Vineland,
N.J. : Smith Printing House, 1942.
GENERAL RATING SCALE FOR QUALITATIVE OBSERVATIONS
DURING TESTING AND INTERVIEWING
Name................................................ Date Examiner

ACTIVITY
Excited, restless, unable to keep still | Impulsive
Quick and vivacious
Stable
Calm and deliberate | Cautious
Inert and listless | Inhibited
Poses, motor attitudes....................
Tics.................... Nail-biting.................... Twitchings
Fiddling with material.............. Clothes............... Hands............... Feet
Peculiar expressions.................... Excessive wrinklings....................

MOVEMENT
Fluent and graceful
Accurate and well-controlled | Quick stride and movements
Angular and awkward | Slow stride and movements
Clumsy

PHYSIQUE AND BEARING
Impressive in bearing | Healthy looking, well developed and nourished
Satisfactory impression
Unimpressive | Unhealthy, feeble physique
Forceful, efficient, energetic, upright posture and gait
Slouching gait
Weak, inefficient movements and bearing
Plump (pyknic) proportions | Florid
Well and symmetrically proportioned
Thin (asthenic) | Pale

PERSONAL APPEARANCE AND EXPRESSION
Attractive and good-looking (positive reaction)
Pleasant | Sensual.....
Uninteresting, indifferent attractiveness
Ugly and repulsive (negative reaction) | Effeminate
Strong expressiveness of face and gestures | Frank
Expressionless | Secretive
Quick and strong sense of humour | Cheerful, optimistic
Slow but sure
Unable to see humour
Depressed, melancholy
Mature, serious, philosophical | Excitable, irritable
Even-tempered
Immature, childish | Calm, phlegmatic

PERSONAL CARE
Fastidious in dress, over-manicured
Good taste, neat and clean
Passable and inconspicuous
Careless in dress and cleanliness
Slovenly and unkempt

SPEECH
Voice resonant, pleasing, well-modulated | Clear, fluent, distinct
Hard, harsh, pinched | Stutters, stammers
Expresses meaning directly, grammatically, with facility
Unable to express himself, ungrammatical
Accent....................
Garrulous, over-talkative
Rather voluble | Brilliant in talking, wide vocabulary
Seldom speaks of own accord | Dull and stolid, narrow vocabulary
Reticent, taciturn

SELF-ASSERTION
Pompous and overbearing
Complacent | Decisive
Self-confident and possessed
Wavering
Self-critical and deprecatory
Embarrassed, bashful, self-conscious | Contrasuggestible
Anxious, apprehensive
Submissive, retiring | Suggestible

CO-OPERATIVENESS
Willing to co-operate in every respect; enters into spirit
Reserved and formal
Constrained and suspicious, outside the situation
Surly and hostile
Scrupulous, punctual and regular in attendance and application
Industrious
Easy-going, indifferent
Lazy and irregular

ALERTNESS AND CONCENTRATION
Intelligently attentive, wide-awake
Concentrated
Absent-minded
Easily distracted, inattentive

TEST REACTIONS: PLANNING
Analytical
Serious but unsystematic | Profits by past experience
Trial and error
Haphazard | Repeats same mistakes

EMOTION
Wild and unrestrained emotional behaviour and remarks
Wilful and childish reactions, capricious
Some loss of self-control and overt emotion
Humorous and unconcerned
Serious, philosophical
Repressed and inhibited

SPECIAL CHARACTERISTICS
66 Personality Tests and Assessments
that delinquents and adult criminals make an average of two
to three times as many such errors as normal controls, but this
has not yet been independently confirmed. Foulds,1 for
example, finds few differences between neurotic and normal
adults, though different types of neurotics show some characteristic
differences, particularly in speed of maze performance.
A series of measurements indicative of emotional reactions to
psychomotor and performance tests, together with rating scales,
were devised by Biesheuvel 2 and his associates in selecting
South African Air Force pilots during the last war. It is claimed
that these helped in choosing men with suitable personalities
as well as aptitudes. However, the evidence is equivocal, for
similar methods tried out in the USAAF gave very meagre
correlations of .1 to .2 with the criterion of passing vs. failing
pilot training. In fact they were scarcely superior to judgments
based on appearance alone.3 Similarly M. D. Vernon 4 attempted
to develop methods of gauging liability to breakdown under
stress. Subjects worked at a dotting machine (cf. p. 85) at
high speed for several minutes, and at other tests likely to
induce strain, and records were made of their performance in
successive half-minutes. From the upward or downward, or
irregular trend, and from qualitative observations, a somewhat
subjective rating of stability was reached. But no good evidence
of the predictive value of these ‘trend tests’ is available.
Other more objective techniques are described in Chap. VI.
Even in group paper-and-pencil tests of abilities there is
some scope for individual differences in manner of performance,
which may express the testees’ personalities without their
being aware of it. Some people check multiple-choice answers
only when they are certain, others guess more wildly. Guilford
and Lacey 5 have shown that a consistent error-score factor
(distinct from ability at the tests as such) may be extracted,
1 Foulds, G. A., ‘Temperamental Differences in Maze Performance.
Part I. Characteristic Differences Among Psychoneurotics’. Brit. J.
Psychol., 1951, 42, 209-217.
2 Biesheuvel, S., ‘An Observational Technique of Temperament and
Personality Assessment’. Nat. Inst. Personnel Res. Bull., 1949, 1, No. 4.
3 Guilford, J. P., and Lacey, J. I., Printed Classification Tests. Army
Air Forces Aviat. Psychol. Prog. Res. Rep. No. 5. Washington, D.C. :
U.S. Government Printing Office, 1947.
4 Cf. Vernon and Parry.
5 Op. cit.
Expressive Movements 67

which they term ‘carefulness’. In her research on student
selection, Himmelweit 1 found such an error score to add
appreciably to the prediction of subsequent degree results.
Conceivably this might be useful too in selecting grammar
school pupils by objective group tests, but it has not yet been
tried. (The pupils and their teachers would, of course, have
to be kept in ignorance, otherwise the very careless might be
trained to modify their test behaviour.)
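To make the idea of an error score concrete, the sketch below counts wrong answers separately from right answers and expresses them as a proportion of the items attempted. This is only one assumed way of scoring; the text does not give the formula used by Guilford and Lacey or by Himmelweit, and the function name, answer key, and responses are hypothetical.

```python
def score_answer_sheet(responses, key):
    """Return (number right, number wrong, error proportion).

    `responses` holds the chosen option per item, or None when the
    item was left unanswered; `key` holds the correct options.
    """
    right = sum(1 for r, k in zip(responses, key) if r is not None and r == k)
    wrong = sum(1 for r, k in zip(responses, key) if r is not None and r != k)
    attempted = right + wrong
    error_proportion = wrong / attempted if attempted else 0.0
    return right, wrong, error_proportion


key = ["a", "c", "b", "d", "a", "b"]
cautious = ["a", "c", None, "d", None, None]   # answers only when sure
guesser = ["a", "c", "d", "d", "b", "a"]       # attempts every item

print(score_answer_sheet(cautious, key))  # (3, 0, 0.0)
print(score_answer_sheet(guesser, key))   # (3, 3, 0.5)
```

The two hypothetical testees obtain the same number right, yet differ sharply in the error proportion, which is the kind of information an ability score alone would miss.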
In conclusion it would seem that the evidence for objective
indices is somewhat less favourable than that for subjective
judgments of manner of performance by experienced testers.
But much more research is needed.
1 Himmelweit, H. T., and Summerfield, A., ‘Student Selection—An
Experimental Investigation, II’. Brit. J. Sociol., 1951, 2, 59-75. Cf. also
Brown, W. M., ‘A Study of the “Caution” Factor and its Importance in
Intelligence Test Performances’. Amer. J. Psychol., 1924, 35, 368-386.
Manson, G. E., ‘Personality Differences in Intelligence Test Performance’.
J. Appl. Psychol., 1925, 9, 230-255. Fruchter, B., ‘Error Scores as a
Measure of Carefulness’. J. Educ. Psychol., 1950, 41, 279-291.
V
Simple-Behaviour and Cognitive Tests
THERE is no hard and fast distinction between the indirect
tests of the previous chapter and many of the tests
described below such as oscillation, will-temperament, dotting
machine, and the group observation methods. However, most
of the tests in this and the subsequent chapter were devised to
measure directly, or to sample, particular traits or types of
behaviour. It will be apparent that early workers in the field
of objective tests (probably influenced by the highly analytical
trend of German experimental psychology) resorted chiefly to
very simple sensory, motor, or ideational tests. Particularly
in Britain, the development of mental testing was dominated
by C. E. Spearman, and most of the work on personality, apart
from that of Burt, was restricted by his narrow views of
temperament. He laid it down that there were three main
factors or dimensions 1 :
p—perseveration, the tendency to inertia, or hang-over
  effect, in mental processes, as contrasted with the
  ability to switch quickly from one process to another;
f—fluency or quickness and richness of mental associations;
o—oscillation or variability in the performance of any task.

The very elementary tests of these factors, together with
the analogous temperament tests constructed by Downey
in America, were generally unreliable and highly specific.
They showed little overlap with one another, and they had
scarcely any bearing on the personality qualities of everyday
life. Thus the majority of more recent tests, as shown in
Chap. VI, have approximated more closely to natural samples
of behaviour.
1 There was also, of course, Webb’s W or will-character factor, but this
was based on ratings, not tests. A useful summary of the work of the
Spearman school is given by Wynn Jones, L., An Introduction to Theory
and Practice of Psychology. London : Macmillan, 1934.

SPEED AND FLUENCY

The first question to ask is whether some people are consistently
quick or slow in all their actions and thoughts. In other
words, is there a factor of personal tempo, possibly related to
extravert-introvert temperament, which could be measured by
reaction time, tapping or other dexterity tests, or speed of
walking, or quickness at intelligence or other tests? The
writer has discussed the evidence elsewhere,1 and has shown
that considerable consistency does exist, both among manual
and cognitive tests, but that to a greater extent speed is specific
to particular kinds of activities. For example, different
measures of reaction time correlate very closely, also different
tapping tests; but reaction times give much lower correlations
with tapping. The same is true when records are made of
normal speed of movement, in walking, speaking, writing, etc.,
as distinct from maximum speed. Thus Rimoldi 2 repeated and
extended Allport and Vernon’s investigation by giving 59
tests of normal speed to a group of students. He analysed his
results into a series of multiple factors representing different
types of speed. But in fact almost all his inter-correlations were
positive, so that it would be justifiable to recognize a small
general factor, together with group factors for quickness at
special types of activity.
Schwegler, Notcutt,3 and the present writer have obtained
small correlations between various speed or fluency tests and
extraversion among children and adults; and Kretschmer’s
claim for quicker movements and reaction times in pyknics
than in asthenics has had some confirmation. Himmelweit 4
points out that most dexterity tests can be scored for speed or
accuracy, and that these two measures are largely independent,
or even negatively related. When a choice is given, hysteric
1 Vernon, P. E., The Structure of Human Abilities. London : Methuen,
1950. Cf. also Allport and Vernon, Bibliography.
2 Rimoldi, H. J. A., ‘Personal Tempo’. J. Abn. Soc. Psychol., 1951, 46,
283-303.
3 Schwegler, R. A., ‘A Study of Introvert-Extrovert Responses to Certain
Test Situations’. Teach. Coll. Contr. Educ., 1929, No. 361. Notcutt, B.,
‘Perseveration and Fluency’. Brit. J. Psychol., 1943, 33, 200-208.
4 Himmelweit, H. T., ‘Speed and Accuracy of Work as Related to
Temperament’. Brit. J. Psychol., 1946, 36, 132-144.

patients tend to do better on tests scored for speed, dysthymics
on tests scored for accuracy.
In the intellectual, as distinct from the motor, field it is quite
difficult to separate speed from ‘power’, i.e. from general
intelligence. However, such tests as those mentioned below
usually yield a factor for fluency of mental associations
(Spearman’s f), over and above g and v factors, which may have
some significance for personality. Here also some investigators
sub-divide it into more specialized types.1 Examples of fluency
tests used by Cattell, Stephenson and Studman, Thurstone,
Eysenck, Rogers, and others 2 include :
Writing as many words as possible in a minute beginning
  with the letter S; or words ending with ‘tion’;
Writing as many names of animals, birds, plants, as
  possible, in a minute each;
Writing adjectives to describe a house; listing names of
  round objects;
Giving associations to inkblots;
Speed of free word association, or numbers of words in
  continuous association;
Suggesting objects which could be inserted at a certain spot
  in a picture.
Normal speed of reading, and productivity in writing compositions,
or in building words from a given set of letters, have
also been used; and several of the above tests have been
applied orally. It has been shown that manic patients score
higher in f than melancholics, and Cattell 3 claims that his
battery of written tests scores as highly as .6 with assessments
of ‘surgency’ among normal subjects. Probably this is
largely attributable to halo (cf. p. 5), for no one else has
approached this figure. Petrie 4 found that f tests do not
differentiate between hysterics and dysthymics, and an extensive
and thorough research by Rogers 5 gave negative results. He
1 Cf. Vernon, op. cit.
2 Cattell, R. B., A Guide to Mental Testing. London : University of
London Press, 1936. Stephenson, W., Studman, G. L., et al., ‘Spearman
Factors and Psychiatry’. Brit. J. Med. Psychol., 1934, 14, 101-135.
Thurstone, L. L., ‘Primary Mental Abilities’. Psychometr. Monogr., 1938,
1. Rogers, C. A., A Factorial Study of Verbal Fluency and Related
Dimensions of Personality. Ph.D. Thesis, University of London, 1952.
3 Op. cit. 4 See Eysenck, Bibliography. 5 Op. cit.
applied a large variety of fluency tests and personality ratings
to a representative group of 14-year-olds. His oral fluency tests
did show some relation to ‘surgent’ traits, but his written
ones none at all. It has been shown by Bousfield and Sedgwick 1
that the production of responses in a typical fluency test tends
to follow an exponential curve, and the constants for such a
curve might be identified with the total reservoir of the subject’s
responses and with their rate of exhaustion. Rogers found no
evidence that these more analytic fluency measures gave any
better correlations with personality traits.
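
The two constants of such a curve can be estimated from a record of cumulative responses. The sketch below assumes the curve has the form n(t) = c(1 - e^(-mt)), where c stands for the total reservoir and m for the rate of exhaustion; the function name, the grid of candidate rates, and the word counts are hypothetical and are not taken from Bousfield and Sedgwick or from Rogers.

```python
from math import exp


def fit_exponential(times, counts, rates=None):
    """Least-squares fit of n(t) = c * (1 - exp(-m * t)).

    For each candidate rate m the best asymptote c has a closed form,
    so only m is searched over a grid. Returns (c, m).
    """
    if rates is None:
        rates = [i / 1000 for i in range(1, 2001)]  # m from 0.001 to 2.0
    best = None
    for m in rates:
        f = [1 - exp(-m * t) for t in times]
        c = sum(y * g for y, g in zip(counts, f)) / sum(g * g for g in f)
        sse = sum((y - c * g) ** 2 for y, g in zip(counts, f))
        if best is None or sse < best[0]:
            best = (sse, c, m)
    return best[1], best[2]


# Hypothetical cumulative word counts at the end of each minute.
minutes = [1, 2, 3, 4, 5, 6, 7, 8]
cumulative = [13, 23, 30, 35, 39, 42, 44, 45]
c, m = fit_exponential(minutes, cumulative)
print(f"estimated reservoir c = {c:.1f}, rate m = {m:.3f}")
```

A grid search is used here only to keep the example free of external libraries; a non-linear least-squares routine would serve equally well.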

INTELLIGENCE AND PERSONALITY

As mentioned in Chap. I, general intelligence itself overlaps
to some extent with the emotional and moral sides of personality.
For example, in Terman’s Genetic Studies of Genius,2 the gifted
children tended to have a wider range and better quality of
interests than average, to be superior on character tests and
ratings, and in emotional adjustment (though those with the
highest I.Q.s of all were somewhat more liable than the rest
to show difficulties of adjustment). Conversely, mental
defectives are certified on grounds of social inadequacy and
emotional instability as well as of low intelligence, and these
qualities are to some extent associated.3 Among adults, again,
intelligence is quite clearly connected with cultural interests,
and small positive correlations have been found with such
attitudes as internationalism, tolerance, liberal, or progressive
opinions, etc.4
An extremely tangled situation exists in the field of psychopathology,
and we can review it only rather superficially,
since its bearing on the assessment of normal personalities is
1 Bousfield, W. A., and Sedgwick, C. H. W., ‘An Analysis of Sequences
of Restricted Associative Responses’. J. Gen. Psychol., 1944, 30, 149-165.
This equation was first put forward by Thomson, G. H., and Thompson,
J. R., ‘Outlines of a Method for the Quantitative Analysis of Writing
Vocabularies’. Brit. J. Psychol., 1915, 8, 52-69.
2 Stanford, Calif. : Stanford University Press, 1925.
3 Cf. Doll, E. A., ‘Preliminary Standardization of the Vineland Social
Maturity Scale’. Amer. J. Orthopsychiat., 1936, 6, 283-293. O’Connor, N.,
and Tizard, J., ‘Predicting the Occupational Adequacy of Certified Mental
Defectives’. Occup. Psychol., 1951, 25, 205-211.
4 Cf. Allport, G. W., ‘The Composition of Political Attitudes’. Amer.
J. Sociol., 1929, 35, 220-238.
limited.1 On most intelligence tests the scores of neurotics differ
little if at all from those of normals; but psychotics, especially
those of the organic type, are distinctly poorer. Performance
declines also in aphasic conditions, or with brain injury, and
to some extent with age; (indeed much of the deficit among
psychotics may be an age effect). But this deterioration is
certainly more marked on some kinds of tests than others, and
is usually least on tests of what Cattell 2 calls ‘crystallized’
intelligence, such as Vocabulary, whose level represents, as it
were, what intelligence has achieved in the past. Tests of
‘fluid’ intelligence, which are most affected, include speeded
tests, tests involving complex abstraction or conceptualization,
spatial judgment and orientation, memory (e.g. of digit series),
etc. Several deterioration indices have been devised which
contrast scores on tests that ‘hold’ and ‘don’t hold’, for
example, the Babcock-Levy, the Shipley-Hartford, and the
Wechsler-Bellevue.3 Unfortunately such indices are not very
reliable, and although they show considerable group differences,
they give very low correlations with one another in individual
diagnosis. (It is seldom realized that difference scores are
always much less reliable statistically than are the separate
scores, from which they are derived; also, that two such scores
obtained from two imperfectly correlated sets of tests may
scarcely correlate at all with one another.)4 Naturally the
1 Useful reviews are given by Hunt, J. McV., Bibliography, and Brody,
M. B., ‘A Survey of the Results of Intelligence Tests in Psychosis’. Brit.
J. Med. Psychol., 1942, 19, 215-261.
2 Cattell, R. B., ‘The Measurement of Adult Intelligence’. Psychol.
Bull., 1943, 40, 153-193.
3 Babcock, H., and Levy, L., Revision of the Babcock Examination for
Measuring Efficiency of Mental Functioning. Chicago, Ill. : Stoelting,
1940. Shipley, W. C., ‘A Self-administering Scale for Measuring
Intellectual Impairment and Deterioration’. J. Psychol., 1940, 9, 371-377.
Wechsler, D., The Measurement of Adult Intelligence. Baltimore, Md. :
Williams and Wilkins, 3rd edit., 1944.
4 Discrepancies between school achievement and intelligence level are
often regarded as giving an index of some personality factor such as
industriousness. These too are apt to be very unreliable when measured
by single attainment and intelligence tests. The Achievement Quotient
is particularly unsatisfactory; educational attainment should be compared,
not with M.A. or I.Q., but with predicted attainment based on the
regression of attainment on intelligence. Cf. Chapman, J. C., ‘The Unreliability
of the Difference between Intelligence and Educational Ratings’. J. Educ.
Psychol., 1923, 14, 103-108.

type of deterioration differs in different psychopathological
conditions, and elaborate systems of diagnosis have been based
on discrepancies between the various Wechsler-Bellevue sub-tests,
concept-formation tests, particular Stanford-Binet items,
and so forth.1 Rabin and Guertin 2 have surveyed the Wechsler
literature, and concluded that there is little or no prospect for
the success of any mechanical system of differential diagnosis
based on profiles of performances on cognitive tests. Yet at
the same time it may well be true that an experienced clinical
psychologist can, from studying a patient’s pattern of responses
to diverse tests, obtain useful insights into his mental condition,
and thus assist in making the psychiatric diagnosis. In spite
of the enormous amount of material available in clinics and
mental hospitals, and the hundreds of investigations that have
been published, we still seem to have no thoroughly established
quantitative indices. The contradictory and confusing results
arise, no doubt, partly because patients in any one psychiatric
group are never homogeneous and entirely distinct from other
groups, partly because of the influence on test performances of
poor co-operation during mental illness, and partly because of
the difficulties of making allowance for age and previous
education.
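The earlier remark that difference scores are less reliable than the scores from which they are derived can be illustrated with the usual classical test theory formula for the reliability of a difference between two standardized scores, r_dd = ((r_xx + r_yy)/2 - r_xy) / (1 - r_xy). The sketch below assumes equal variances and uncorrelated errors of measurement; the function name and the reliabilities chosen are illustrative, not values reported in the text.

```python
def difference_score_reliability(r_xx, r_yy, r_xy):
    """Reliability of the difference X - Y for standardized scores.

    r_xx, r_yy: reliabilities of the two tests.
    r_xy: correlation between the two tests (must be below 1).
    """
    return ((r_xx + r_yy) / 2 - r_xy) / (1 - r_xy)


# Illustrative values only: two tests, each with reliability .90.
for r_xy in (0.0, 0.5, 0.7, 0.8):
    r_dd = difference_score_reliability(0.90, 0.90, r_xy)
    print(f"r_xy = {r_xy:.1f} -> reliability of difference = {r_dd:.2f}")
```

Under these assumptions the reliability of the difference falls steadily as the two tests correlate more highly, which is the usual situation when a 'hold' test is contrasted with a 'don't hold' test of the same general ability.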
The same conclusion holds for variability or range of scores on
different tests. Many psychologists working in Child Guidance
Clinics believe that a wide scatter of passes and fails on the
Stanford-Binet or Terman-Merrill scales is indicative of
maladjustment; and numerous measures of variability for this
or for the Wechsler scale have been proposed. Jastak 3 has
discussed the evidence, and concluded that Binet scatter is not
diagnostic, although certain items such as Digits Backward
and Memory for Designs tend to offer special difficulties to
abnormal children. Unevenness in the performance of recruits
at various intelligence, mechanical and educational tests was
studied during the war, and could not be found to have any
significance for personality. In his Progressive Matrices test

1 Cf. Rapaport, Gill, and Schafer, Bibliography.
2 Rabin, A. I., and Guertin, W. H., ‘Research with the Wechsler-
Bellevue Test : 1945-1950’. Psychol. Bull., 1951, 48, 211-248.
3 Jastak, J., Variability of Psychometric Performances in Mental Diagnosis.
New York : J. Jastak, 1934. ‘Problems of Psychometric Scatter Analysis’.
Psychol. Bull., 1949, 46, 177-197.
(1938), Raven 1 provides an index of unreliability or irregularity
of score pattern; this too appeared to bear no relation to
neuroticism. Other kinds of variability will be considered
later when we come to Spearman’s oscillation factor.
Is the difference between Vocabulary or v-tests, and more
abstract intelligence or spatial (g and k) tests of any value in
personality assessment? Several studies suggest that scores on
Kohs Blocks, Porteus Mazes, Picture Completion, also Progressive
Matrices, are relatively high among delinquents, but
lowered among neurotics. For example, Earl 2 claims that the
profile of scores on Binet Vocabulary, a verbal absurdities test,
Kohs Blocks and Dearborn No. 3 Formboard, is particularly
useful (with the assistance of qualitative observations) in
diagnosing instability and social inadequacy among morons.
Himmelweit,3 however, finds the ratio of Vocabulary to
Matrices scores to relate to introversion, dysthymics obtaining
larger ratios than hysterics.
Another possible lead is provided by the concept-formation
or sorting tests of Goldstein and Scheerer, Weigl, Vigotsky,
Berg, and others.4 Brain-injured and schizophrenic patients,
it is claimed, can cope with these at the concrete or perceptual
level but cannot realize or formulate abstract principles of
classification. Actually no one has proved that these are
anything more than rather unreliable tests of g, though it
would be worth investigating whether the perceptual vs. the
conceptual approach is a consistent personality variable.5 In
the typical sorting test the subject is presented with a series of
objects which can be classified in various ways (for example,
wooden blocks of different shapes, sizes, and colours), and he is
told to sort them or to pick out those which are similar to
certain standard objects. If he has mastered one principle of
1 Raven, J. C., Progressive Matrices. London : Lewis, 1938.
2 Earl, C. J. C., ‘A Psychograph for Morons’. J. Abn. Soc. Psychol.,
1940, 35, 428-448.
3 Himmelweit, H. T., ‘The Intelligence-Vocabulary Ratio as a Measure
of Temperament’. Char. & Person., 1945, 14, 93-105.
4 Goldstein, K., and Scheerer, M., Tests of Abstract and Concrete
Thinking. New York : Psychological Corporation, 1945. Hanfmann, E.,
‘A Study of Personal Patterns in an Intellectual Performance’. Char. &
Person., 1941, 9, 315-325. Berg, E. A., ‘A Simple Objective Technique
for Measuring Flexibility in Thinking’. J. Gen. Psychol., 1948, 39, 15-22.
5 Cf. Hanfmann, op. cit.
classification, e.g. by colour, he is given a new and more complex
problem, and the process is repeated. Such tests are thus
considered to involve not only the capacity for abstracting and
generalising, but also flexibility in shifting to fresh principles.
Whether or not this is true, flexibility and its converse—rigidity
or inertia of mental processes—have played an important part
in the development of temperament tests.

PERSEVERATION AND RIGIDITY

Some of the earliest tests were based on sensory functions,
for example, slowness of dark adaptation, or low speed of fusion
of colours on a colour wheel. It was claimed that melancholic
patients showed greater perseveration at such tests than manics,
but this was not borne out by later research. (Nor, incidentally,
has any connection been found between any perseveration tests
and the perseverative tendencies that occur in certain types of
psychosis.) More popular were motor perseveration tests,
where the subject writes certain letters or figures at maximum
speed, and then reverses the letters or writes them in some
unusual way. The decrease in speed of performance is taken as
a measure of the perseverative effect of the normal performance.
Examples include :
Copying prose with and without dotting the i’s and
  crossing the t’s;
Doing simple 4-rule sums, but substituting + for − and ×
  for ÷ signs, and vice-versa;
Writing SS.. for half-minute, 88.. for half-minute, then
  S8S8.. for 1 minute;
Writing ee... forwards, and ee... starting at the
  ‘wrong’ end, then alternating.
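The decrease in speed can be expressed in more than one way. The sketch below shows two assumed indices, a raw decrement and a decrement taken as a proportion of normal speed; neither is claimed to be the scoring used by any of the investigators cited, and the function name and counts are hypothetical.

```python
def perseveration_indices(normal_count, altered_count):
    """Return (raw decrement, proportional decrement).

    normal_count: items written per period in the habitual manner.
    altered_count: items written per period in the unusual manner.
    """
    raw = normal_count - altered_count
    proportional = raw / normal_count
    return raw, proportional


# Hypothetical half-minute counts for a fast and a slow writer.
print(perseveration_indices(60, 45))  # (15, 0.25)
print(perseveration_indices(40, 30))  # (10, 0.25)
```

In this made-up case the raw decrement is larger for the faster writer even though both slow down by the same proportion, which shows how a score can depend on initial speed of writing.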
Mirror drawing, which involves the breaking down of established
eye-hand co-ordinations, is also sometimes regarded as
a p test. There has been much dispute as to the best measure
of decrement or hang-over; all the scores tend to be affected by
initial speed of writing or copying, unless appropriate correction
is made.1 After correction, the correlations between different
1 Cf. Walker, K. F., Staines, R. G., and Kenna, J. C., ‘The Influence of
Scoring Methods Upon Score in Motor Perseveration Tests’. Brit. J.
Psychol., 1945, 35, 51-60.
tests seldom reach statistical significance (unless the tests are
very similar), and they vary so widely in different researches
using the same tests that it is extremely dubious whether any
genuine factor of perseveration can be said to exist.1 Lankes
and Bernstein 2 obtained correlations as high as .5 between
pooled test scores and self-ratings or assessments of perseverative
behaviour in daily life, but correlations with measures of
introversion or of depressive tendency (with which perseveration
has been identified) have generally been negligible. An entirely
different claim was put forward by Pinard,3 namely, that both
very high and very low perseverators are more difficult,
unreliable, and lacking in self-control and perseverance than
subjects obtaining moderate scores. In other words, perseveration
itself is not a significant trait, but abnormal reactions to p
tests correlate to some extent with personality abnormality.
Both Cattell 4 and Eysenck provide some independent
confirmation of this.
More recently both Walker, Staines, and Kenna and Cattell 5
have stated that the alternation type of test and the miscellaneous
sensory tests are useless, but that ‘creative effort’
tests, where the subject has to break down some well-established
habit (such as i : t and + : −) are more consistent, and that they
measure a quality of ‘disposition rigidity’. Cattell obtained
negative correlations around .4 between a battery of such tests
and ratings of Dominance and Character Integration, implying
that the person with strong Ego development is more flexible and
capable of modifying his habits. He argues, too, that there are
important racial temperamental differences between Nordics
and Mediterraneans in this trait. So far no other evidence of the
validity of disposition-rigidity tests seems to have been published.
1 Cf. Jasper, H. H., ‘Is Perseveration a Functional Unit Participating in
All Behavior Processes’. J. Soc. Psychol., 1931, 2, 28-51. Notcutt, B.,
op. cit., p. 69.
2 Lankes, W., ‘Perseveration’. Brit. J. Psychol., 1915, 7, 387-419.
Bernstein, E., ‘Quickness and Intelligence’. Brit. J. Psychol. Monogr.
Suppl., 1924, No. 7.
3 Pinard, J. W., ‘Tests of Perseveration’. Brit. J. Psychol., 1932, 23,
5-19, 114-126.
4 Cattell, R. B., ‘The Riddle of Perseveration’. J. Person., 1946, 14,
229-267.
5 Walker, K. F., Staines, R. G., and Kenna, J. C., ‘P-tests and the
Concept of Mental Inertia’. Char. & Person., 1943, 12, 32-45. Cattell,
op. cit.
Yet another conception of rigidity is put forward by Lewin
and Kounin,1 who regard it as the main characteristic
differentiating the personalities of feeble-minded from normal
children. When given a choice of drawing tasks they tend to
persist longer at a single task, where normal children prefer
a change. Perseveration was also demonstrated at card-sorting
tests, and lack of flexibility on a concept-formation test,
especially among defective adults. But how far such tests
represent a consistent trait of personality, or how far they
merely reflect lack of intelligence, has not been investigated.
Luchins 2 suggests that inefficient methods of education may
produce rigid rather than flexible minds, and he has devised a
number of paper-and-pencil concept-formation tests. Subjects
are given a series of problems and find that these can be solved
by the application of one principle. Half-way through the
principle alters, but the original ‘set’ may delay, or entirely
prevent, them from realizing this. Note that the interfering
set in such tests is established temporarily during the course of
the test, whereas in perseveration tests it consists of a fully
automatized habit. An investigation by Oliver and Ferguson 3
suggests that the former largely depend upon g, whereas the
latter do involve a separate factor. Frenkel-Brunswik argues
that highly prejudiced or ethnocentric persons tend to be rigid
or intolerant of ambiguities in the perceptual sphere, and
experiments by Rokeach with a problem similar to Luchins’s
provide some rather slight support for this.4
Luchins’s tests also bear a considerable resemblance to some
of the tests, described later, which have been proposed as
measures of suggestibility. The notion of interference enters
into a great variety of psychological phenomena, for example
in retroactive inhibition, in positive and negative transfer, in
1 Lewin, K., A Dynamic Theory of Personality. New York : McGraw-
Hill, 1935. Kounin, J. S., ‘Experimental Studies of Rigidity’. Char. &
Person., 1941, 9, 251-282. Their topological theory of personality rigidity
is strongly criticised by Werner, H., ‘The Concept of Rigidity : A Critical
Evaluation’. Psychol. Rev., 1946, 53, 43-52.
2 Luchins, A. S., ‘Proposed Methods of Studying Degrees of Rigidity’.
J. Person., 1947, 15, 242-246.
3 Oliver, J. A., and Ferguson, G. A., ‘A Factorial Study of Tests of
Rigidity’. Canad. J. Psychol., 1951, 5, 49-59.
4 Adorno, T. W., Frenkel-Brunswik, E., et al., The Authoritarian
Personality. New York : Harper, 1950.
memory for incompleted tasks,1 in conditioning, and in the
sets observed in psychophysical experiments. In all of these
there are, no doubt, individual differences in flexibility, but
it seems extremely unlikely th at they all depend on one
and the same trait.* Nor is there any good evidence that
any of the tests we have mentioned reflect any socially
important trait. Even the traditional p tests are so boring
to the subjects, so troublesome to give and score, and of
such dubious validity, that no psychologist seems to make
any practical use of them.

OSCILLATION AND VARIABILITY

Fluctuations of performance have been measured in a large
number of tests such as reaction time, simple addition sums,
cancellation, muscle tone, handwriting, tapping and other
motor tasks, usually with disappointing results. The Mean
Variation or Standard Deviation of a long series of reaction
times is said to be much higher in unstable mental patients
than among normals, since they find it difficult to maintain the
necessary concentration. But it is not clear how far this
difference arises merely from their lack of experience in doing
reaction times. To measure oscillation in addition and other
similar tasks, the subject works continuously, but makes a
tick, say every 15 seconds, to show the amount done.
Considerable practice, or fatigue, effects may occur, but fluctuations
can be measured from a moving average. Day-to-day variations
can also be obtained, though naturally a large number of
records is needed if the oscillation scores are to achieve
reasonable reliability. There is some doubt as to whether absolute
1 In a factorial study of tests of persistence, Rethlingshafer included
tests of memory for incompleted tasks and some of Cattell’s alternating
type of p tests. Both of these showed appreciable loadings (.35 to .45) on
her ‘ willingness to continue ’ factor. Other motor and sensory
perseveration tests and a perseveration questionnaire yielded a distinct
factor, or else were wholly specific. It is a pity that this interesting
research with 29 tests was carried out on only 88 students.
Rethlingshafer, D., ‘ Relationship of Tests of Persistence to Other
Measures of Continuance of Activities ’. J. Abn. Soc. Psychol., 1942, 37,
71-82.
2 Further useful discussions of rigidity, particularly in the abnormal
field, are given by Werner, op. cit., and Fisher, S., ‘ An Overview of Trends
in Research Dealing with Personality Rigidity ’. J. Person., 1949, 17,
342-351.
or relative variability should be measured, but whatever the
method, correlations between variability at different (simple)
tasks are generally so low as to cast doubt on the existence of
any consistent trait.1 Nor is there any good evidence that such
tests are diagnostic of emotional instability in general. One of
the best results is that of Walton,2 who found a correlation of
-.28 between oscillation among schoolchildren on 4 tests, and
teachers’ ratings of steadiness of character. In another research
with several types of variability, Connor 3 found some
differentiation between maladjusted and normal children, but no
correlation with tests or ratings of instability among normals.
More promising results obtained with more complex and
realistic tests are described in the next chapter.
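The moving-average procedure mentioned in this section can be illustrated with a minimal sketch. It assumes that the output in each short period (e.g. the number of sums completed between successive ticks) has been counted, and it takes the oscillation score to be the mean absolute deviation of each count from a centred moving average; the function name, the window length, and the choice of absolute deviations are illustrative assumptions, not a procedure laid down in the text.

```python
from statistics import mean


def oscillation_score(counts, window=5):
    """Mean absolute deviation of each period's output from a centred
    moving average (hypothetical scoring rule, for illustration only)."""
    if window < 3 or window % 2 == 0:
        raise ValueError("window must be an odd number of at least 3")
    if len(counts) < window:
        raise ValueError("need at least one full window of periods")
    half = window // 2
    deviations = []
    for i in range(half, len(counts) - half):
        # The moving average estimates the practice/fatigue trend at period i.
        trend = mean(counts[i - half:i + half + 1])
        deviations.append(abs(counts[i] - trend))
    return mean(deviations)


# A steadily improving subject shows no oscillation about the trend,
# whereas an erratic subject with the same overall gain shows a good deal.
steady = [10, 11, 12, 13, 14, 15, 16, 17]
erratic = [10, 14, 9, 16, 11, 18, 13, 20]
print(oscillation_score(steady, window=3))
print(oscillation_score(erratic, window=3))
```

Because the trend is removed before the deviations are taken, a smooth practice gain does not by itself inflate the score.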
Another type of fluctuation is the reversals that occur when
looking at ambiguous perspective figures like the Necker cube,
the staircase, or shadow-windmill. McDougall 4 suggested that
extraverts have a slower rate of reversal than introverts, and
that this rate is decreased by an extraverting drug such as
alcohol. While there is some evidence of these tests
differentiating manic from schizophrenic patients, they fail to
distinguish hysteric and dysthymic neurotics; and several
experiments show no correlations with self-rating tests of
extraversion-introversion among normals.5
Some exceptions to our general condemnation of sensory and
motor tests are provided by the work of Eysenck and his
collaborators. They have shown that certain tests of muscular
co-ordination give quite promising correlations with normality
vs. neuroticism. Presumably, good muscular control tends to
be associated with emotional stability. Thus neurotics often
perform badly on the Track Tracer and the O'Connor Tweezer
1 Cf. Cockett, R., Variability in Human Task Efficiency. Ph.D. Thesis,
University of London, 1950.
2 Walton, R. D., ‘ Individual Differences in Amplitudes of Oscillation
and Their Connection with Steadiness of Character ’. Brit. J. Psychol.,
1939, 30, 33-40.
3 Connor, D. V., The Effect of Temperamental Traits upon Intelligence
Test Performance. Ph.D. Thesis, University of London, 1952.
4 McDougall, W., and Smith, M., ‘ The Effects of Alcohol and Some
Other Drugs during Normal and Fatigued Conditions ’. Med. Res. Counc.
Spec. Rep., No. 56. London : H.M. Stat. Office, 1920.
5 Guilford, J. P., and Hunt, J. M., ‘ Some Further Experimental Tests
of McDougall’s Theory of Introversion-Extroversion ’. J. Abn. Soc.
Psychol., 1932, 26, 324-332.
Dexterity tests. Again, N. O’Connor and Tizard 1 found that
the Heath rail-walking test gave the best indication of
‘ employability ’ or ‘ work success ’ in a large battery of tests applied
to feeble-minded youths. This test 2 measures the distance a
subject can walk along 4-, 2-, and 1-inch rails without falling off.
Most striking is the body-sway test 3 of static ataxia and
suggestibility. The subject stands with his eyes closed, and a
thread is attached to his back which connects with a pointer
and records on a smoked drum. The amount of sway is
considerably higher in the average neurotic patient than in
normals. But the effect is enhanced when a gramophone
record is played reiterating the suggestion : ‘ You are falling,
falling forwards . . .’ Correlations as high as .6 with Eysenck’s
neuroticism factor have been obtained among adults, and the
test has some diagnostic value among maladjusted children.

PERCEPTUAL TESTS

Brief mention should be made of the numerous tests which
have been claimed, by Continental typologists, to differentiate
between pyknics and asthenics, or integrates and disintegrates,
and so forth. Form vs. colour dominance is usually measured
by exposing sets of coloured shapes for a fraction of a second
with a tachistoscope, and seeing whether subjects more readily
perceive and recall the colours or the forms. Tachistoscopic
perception of complex visual material has also been used to
classify people into synthetic (vague perception of the whole)
or analytic (precise perception of details). Other tests aim to
show the capacity for spreading the attention, as contrasted
with concentrating on a single task. The Gottschaldt figures test,
in which the subject has to pick a given shape out of a complex
Gestalt, is presumed to involve flexibility in manipulating
configurations. None of the work done in this field gives
satisfactory evidence that such perceptual types or tendencies
are consistent, or that they bear any close relation to personality.
The most thorough study of a large number of perceptual tests
1 Op. cit., p. 71.
2 Heath, S. R., ‘ Clinical Significance of Motor Defect, with Military
Implications ’. Amer. J. Psychol., 1944, 57, 482-499.
3 Hull, C. L., Hypnosis and Suggestibility. New York : Appleton-
Century, 1933.
was that of Thurstone,1 in which several factors were isolated,
including: Perceptual Closure, Flexibility with Configurations,
Susceptibility to Visual Illusions, Oscillation of Reversible
Perspective Figures, etc. But there is no confirmation for his
suggestion that some of these factors might differentiate
successful leaders or administrators, or good and bad readers.

JUNE DOWNEY’S WILL-TEMPERAMENT TESTS

This series of tests 2 had a considerable vogue in the 1920s,
but the results were as unsatisfactory as those of p and o tests,
and they are now regarded as little more than a psychological
curiosity. A dozen traits were supposed to be measured by
elementary tests based on handwriting, which were given in
group form or, better, individually. For example:
Speed of Movement.—Tested by normal speed of handwriting.
Flexibility.—By ability to disguise writing, and to copy a
model rapidly.
Care and Accuracy in Detail.—By time spent spontaneously
on disguises and accuracy of copying.
Impulsiveness.—By tendency to write larger and faster
under distraction, e.g. with eyes closed.
The tests were in fact based on a careful study of graphological
systems, and Downey herself found that the pattern or profile
of scores on the various tests provided an illuminating picture
of her subjects’ personalities. However, in numerous researches
the separate scores were shown to have low reliability coefficients
of around .5, and little or no agreement was obtained between
these scores and ratings by acquaintances on the relevant traits.
Possibly our criticism of the tests described in this chapter
may seem unduly harsh, because their authors did not usually
expect them to be diagnostic of everyday life personality
qualities. Rather they were concerned with fundamental
temperamental tendencies. We would say, however, that the
attempt to measure inborn factors was a misguided one. Most
1 Thurstone, L. L., A Factorial Study of Perception. Chicago, Ill. :
Chicago University Press, 1944.
2 Downey, J. E., The Will-Temperament and its Testing. Yonkers, N.Y. :
World Book Co., 1924.
psychologists nowadays admit that intelligence can be tested
only as a product of heredity and environment. Similarly, in
the field of personality we can never really separate what is
innate from what is acquired, and even if we could test the
former it would scarcely help us in any of the practical problems
of predicting people’s behaviour.
VI
Miniature and Real Life Situation Tests
In order to obtain more realistic samples of behaviour, which
will yield quantitative scores, the psychologist is usually
forced to simplify and restrict the testing situation. This may
lead to considerable artificiality; hence subjects may in fact
react to ‘ miniature ’ situations very differently from the way
they would react in more natural circumstances. Thus it very
commonly occurs that the validity of an objective personality
test varies widely in different researches, depending on slight
variations in the manner of application or in the attitudes of
the subjects to being tested. And although quite promising
results have been achieved with many of the tests described in
the first half of this chapter, few if any of them can be trusted
to work in the same straightforward manner as tests of
intelligence and special abilities. The observational methods
mentioned later in the chapter are less restricted, but are
consequently more open to the vagaries of human judgment.
And they, too, constitute only limited samples of the general
behaviour tendencies or traits which we are trying to measure.
Thus, as pointed out in Chap. I, numerous, varied samples are
desirable ; and as each of these may require elaborate arrange­
ments and materials, successful objective personality testing is
chiefly confined to experimental researches.

PERSISTENCE TESTS

One of the first attempts was Fernald’s 1 test of endurance,
which consisted simply in recording the length of time a subject
would stand with his heels raised off the ground. The median
time for normal students was 86 minutes, whereas among a
group of prisoners the median was only 15 minutes. Other
versions of this test depend on the subject’s holding his breath
as long as possible, or sitting with one leg outstretched and the
1 Fernald, G. G., ‘ An Achievement Capacity Test ’. J. Educ. Psychol.,
1912, 3, 331-336.
foot raised an inch above another chair, or maintaining a grip
on a hand dynamometer equivalent to two-thirds of his own
maximum grip. Several of these have worked well in Eysenck’s
researches, and have correlated quite highly with his stability-
neuroticism factor. Dysthymics also show somewhat greater
persistence than hysterics. Howells 1 similarly applied a
battery of tests of resistance to bodily pain and fatigue, which
showed some value in the prediction of academic grades.
Obviously such tests have to be ‘ put across ’ in such a way as
to stimulate the subjects’ efforts, and their results might be
seriously affected if, for instance, the subjects thought they
were being applied for selection purposes.
Several other tests have been developed in which subjects
are given some difficult manipulative puzzle or intellectual
problem, and the time they are willing to spend on it is recorded.2
Wordbuilding or hard intelligence test items have been used,
and a promising Number Series test for adults, where the
subjects are told that some of the problems are insoluble, is
described by French.3 Here, too, the motivation of the subjects
is vital. MacArthur 4 showed that results differ somewhat
when the tasks are given individually, or in group situations
where a subject can compare himself with his fellows. He
found it best to avoid suggesting that persistent trying was
desirable, and instead to supply some similar alternative
employment for his subjects to go on to when they had
spontaneously spent as long as they wished on the original task.
A useful variant, which must be given individually, is
Morgan and Hull’s 5 Maze test. This consists of grooves cut in
a board, and covered by a movable piece of card with a small
hole in it so that the subject can see only half an inch of the
board around his stylus. By inserting or removing blocks,
the experimenter can set four problems of increasing difficulty,
1 Howells, T. H., ‘ An Experimental Study of Persistence ’. J. Abn. Soc.
Psychol., 1933, 28, 14-29.
2 Cf. Hartshorne and May, and Ryans, Bibliography.
3 French, J. W., ‘ The Validity of a Persistence Test ’. Psychometr.,
1948, 13, 271-277.
4 MacArthur, R. S., An Experimental Investigation of Persistence and its
Measurement at the Secondary School Level. Ph.D. Thesis, University of
London, 1951.
5 Morgan, J. J. B., and Hull, H. L., ‘ The Measurement of Persistence ’.
J. Appl. Psychol., 1926, 10, 180-187.
the last being insoluble. Instead of scoring objectively by time,
the tester applies a 9-point rating scale which describes various
degrees of persistence and hard work or of fiddling and wanting
to quit. The writer obtained good validity for this test among
students, but substituted a very difficult fourth problem for the
insoluble one in order to avoid tricking them.
Thornton 1 considers that the correlations between such tests
are too low to justify the conception of a common factor of
persistence, and claims that withstanding-discomfort tests and
keeping-on-at-a-task tests, etc., yield separate factors. But
investigations such as those of Ryans, MacArthur 2 and others
do support a general factor in all types of persistence tests,
together with small group factors in particular types (cf. p. 14).
Moreover, a battery of persistence tests correlates not only with
teachers’ or acquaintances’ ratings (at least among children),
but also with scholastic success (over and above any effects of
intelligence), and, inversely, with emotional instability or
neuroticism.

EMOTIONAL STABILITY

Several tests have been devised to elicit instability or
variations of performance in more ‘ provoking ’ situations than
those characteristic of o (oscillation) tests. The McDougall-Schuster
dotting machine exposes to the subject a sequence of
small circles, irregularly spaced, in each of which he puts a dot
with a pencil or stylus. The circles move faster and faster, and
sooner or later the subject breaks down. But he may continue
systematically dotting every second or third circle, or he may
lose his head and make feeble jabs, or give up. Smith, Culpin,
and Farmer 3 gave the test to telegraphists and scored it by
the total dots up to the point where five consecutive circles
were missed. These scores correlated from .33 to .46 with
assessments of neuroticism based on clinical interviews.
Neurotics with obsessional tendencies, however, tended to score

1 Thornton, G. R., ‘ A Factor Analysis of Tests Designed to Measure
Persistence ’. Psychol. Monogr., 1939, 51, No. 229.
2 Op. cit.
3 Smith, M., Culpin, M., and Farmer, E., ‘ A Study of Telegraphists’
Cramp ’. Industr. Fat. Res. Board Rep., No. 43. London : H.M. Stat.
Office, 1927.

highly. The validity of this particular score requires
confirmation ; it may be that, as in Morgan and Hull’s Maze test,
observations of manner of reacting to difficulties would be
more useful. The fact that such motor co-ordination tests as
rail-walking and manual dexterity are done badly by neurotics
(cf. p. 79) gives some support.
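The scoring rule reported for the dotting test, namely the total dots made up to the point where five consecutive circles are missed, can be sketched as follows. The record format (one True or False per circle, in order of presentation) and the function name are assumptions made for illustration only; the original workers scored from the paper record.

```python
def dotting_score(hits, miss_run=5):
    """Total circles dotted before the first run of `miss_run`
    consecutive misses (hypothetical record format: one bool per circle)."""
    total = 0
    consecutive_misses = 0
    for hit in hits:
        if hit:
            total += 1
            consecutive_misses = 0
        else:
            consecutive_misses += 1
            if consecutive_misses == miss_run:
                break  # breakdown point reached; later dots are not counted
    return total


# Ten hits, a brief lapse, three more hits, then the breakdown;
# the four dots after the breakdown do not count.
record = [True] * 10 + [False] * 2 + [True] * 3 + [False] * 5 + [True] * 4
print(dotting_score(record))  # 13
```

Such a score says nothing about the manner of breakdown (systematic dotting of every second circle, feeble jabs, or giving up), which would have to be recorded separately.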
Cattell’s 1 C.M.S. (Cursive Miniature Situation) test is a much
more elaborate version of dotting, in which the subject crosses
out or encircles various kinds of lines and patterns which pass
before him at a rapid rate. If he keeps his head and chooses
the most profitable sets of lines, he can raise his score consider­
ably. Possibly the task is a bit too sophisticated except for
intelligent and/or experienced subjects. But Cattell claims
remarkably good differentiation between psychotic, delinquent,
and normal adults. Unfortunately the scoring is so tedious
that no one else has tried to confirm the value of the test.
Probably there is scope for a test intermediate in complexity
between the McDougall and the Cattell, which could be scored
by electric counters. Kehr in Germany, and Freeman 2 in
America, have described discriminatory reaction tests which
likewise put the subject in ‘ stress ’ situations, and which appear
to have given good results. Freeman compares the perform­
ances under difficult, with those under easy conditions, and
measures the time taken to recover efficiency after the stress
period.
Instability of performance at learning tests is also promising.
Thus Ball 3 finds that the learning curves of unstable or neurotic
boys at a high-relief finger maze are much more irregular than
those of normals. Rey 4 describes similar results with a learning
problem based on a kind of formboard which can be solved
only by trial and error. Neither test as yet gives a quantitative
1 Cattell, R. B., ‘ An Objective Test of Character-Temperament ’.
J. Gen. Psychol., 1941, 25, 59-73.
2 Kehr, T., ‘ Versuchsanordnung zur experimentellen Untersuchung
einer kontinuierlichen Aufmerksamkeitsleistung ’. Zsch. f. ang. Psychol.,
1916, 11, 485-479. Freeman, G. L., ‘ Suggestions for a Standardized
“ Stress ” Test ’. J. Gen. Psychol., 1945, 32, 3-11.
3 Ball, R. J., ‘ An Objective Measure of Emotional Instability ’. J. Appl.
Psychol., 1929, 13, 226-256.
4 Rey, A., ‘ D’un procédé pour évaluer l’éducabilité ’. Arch. de Psychol.,
1934, 24, 297-337. Cf. also Zangwill, O. L., ‘ Some Clinical Applications
of the Rey-Davis Performance Test ’. J. Ment. Sci., 1946, 92, 19-34.
score for instability, and inspection of the curves needs to be
supplemented by qualitative observations of the children’s
reactions to difficulties. But a more practicable and more
objective test might be developed along these lines. Thiesen
and Meister 1 report an experiment on maze learning under
frustrating conditions—namely, no solution possible and
criticisms by the tester. Alterations in blood pressure and
psychogalvanic response during stress appeared to relate to
inability to tolerate frustration, and to school adjustment in
general; but only 10 children were tested. Mirror drawing is
another learning task which has, at various times, been alleged
to show freedom from perseveration, or emotional control, or
adventurousness vs. timidity, or yet other traits. Actually it
correlates moderately with intelligence; and in a research by
the writer a score based on the number of errors during the
first four trials correlated better with a trait-composite of
‘ impulsiveness ’ than with one of ‘ emotionality ’.

SUGGESTIBILITY

As early as 1916 Brown 2 showed that the correlations
between a number of tests supposed to involve suggestibility
were very low and irregular, and subsequent research has
confirmed the lack of generality of this trait. Eysenck considers
that it includes at least three distinct traits which he calls
primary, secondary, and prestige suggestibility. Primary
suggestibility is best measured by the body-sway test, described
on p. 80. The Chevreul pendulum test of ability to hold a
bob steadily at the end of a string, despite suggestions, is
similar; also possibly the autokinetic phenomenon. These
and other tests correlate with susceptibility to hypnosis and, as
already mentioned, with neuroticism. (Note that all neurotics
tend to be more suggestible, in this sense, than normals, not
only hysterics. Contrary to common psychiatric opinion,
dysthymics obtain higher average scores.)
Secondary suggestibility includes the numerous visual,
1 Thiesen, J. W., and Meister, R. K., ‘ A Laboratory Investigation of
Measures of Frustration Tolerance of Pre-Adolescent Children ’. J. Genet.
Psychol., 1949, 75, 277-291.
2 Brown, W., ‘ Individual and Sex Differences in Suggestibility ’.
Univ. Calif. Publ. Psychol., 1916, 2, No. 6.
cutaneous, olfactory, and other tests, such as Binet’s progressive
lines and weights, where the subject is led to anticipate a
series of increasing stimuli ; his suggestibility is measured by
the number of stimuli that he judges larger after the actual
increase has stopped. Acceptance of suggestions in recalling a
picture (‘ Aussage ’ test), or suggestions that certain objects
can be seen in inkblots, also fall under this heading, though the
amount of positive correlation among such varied tests is low.
Secondary suggestibility appears to have no correlation with
emotional traits or syndromes, but (at least among children)
it correlates negatively with intelligence to a marked degree.
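The scoring of a progressive-series test of the Binet type, i.e. the number of stimuli judged larger after the actual increase has stopped, might be sketched as below. The data layout (a list of physical magnitudes and one judgment per successive pair of stimuli) and the function name are illustrative assumptions; nothing in the text fixes them.

```python
def progressive_series_score(magnitudes, judged_larger):
    """Count 'larger' judgments given after the real increase has stopped.

    magnitudes    -- physical size of each stimulus, in presentation order
    judged_larger -- judged_larger[i] is True if the subject called
                     stimulus i+1 larger than stimulus i
    (Both names and the layout are hypothetical.)
    """
    if len(judged_larger) != len(magnitudes) - 1:
        raise ValueError("need one judgment per successive pair of stimuli")
    # Index of the first stimulus that is no larger than its predecessor.
    plateau_start = next(
        (i for i in range(1, len(magnitudes))
         if magnitudes[i] <= magnitudes[i - 1]),
        None,
    )
    if plateau_start is None:
        return 0  # the series never stopped increasing
    return sum(1 for said_larger in judged_larger[plateau_start - 1:] if said_larger)


# Lines lengthen up to 16 units and then stay constant; the subject
# still reports two of the three constant lines as longer.
lengths = [10, 12, 14, 16, 16, 16, 16]
judgments = [True, True, True, True, False, True]
print(progressive_series_score(lengths, judgments))  # 2
```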
‘ Prestige ’ suggestibility has been measured by giving an
attitude or opinion questionnaire, then repeating it while
informing the subjects that most people, or certain important
people, answer each question in such a way. It is scored by
noting how often the subject changes his previous response in
the direction suggested. Ferguson 1 has found that scores
obtained from several such tests are moderately consistent.
But there is no evidence to show that suggestibility to prestige
in everyday life (e.g. to political speakers, to doctors or ministers,
to newspapers, advertisements, or fashions) can be predicted by
tests of this, or of any other type.
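The prestige-suggestibility score, the number of items on which the subject changes his previous response in the direction suggested, can be illustrated by a short sketch. It assumes that answers are recorded on a numerical scale and that a change counts as being in the suggested direction when the second answer lies nearer to the announced answer than the first did; both the data layout and that criterion are assumptions introduced here.

```python
def prestige_shift_count(first, second, suggested):
    """Number of items on which the retest answer moved toward the
    answer attributed to the majority or to prestige figures.
    (Hypothetical scoring rule: 'toward' means numerically closer.)"""
    if not (len(first) == len(second) == len(suggested)):
        raise ValueError("all three answer lists must have the same length")
    shifts = 0
    for before, after, target in zip(first, second, suggested):
        changed = after != before
        closer = abs(after - target) < abs(before - target)
        if changed and closer:
            shifts += 1
    return shifts


# Four attitude items answered on a 1-5 scale.
first_answers = [1, 5, 3, 4]
second_answers = [3, 5, 2, 4]   # item 1 moves toward the suggestion, item 3 away
announced = [5, 5, 5, 1]
print(prestige_shift_count(first_answers, second_answers, announced))  # 1
```

Changes away from the suggested answer are ignored here; a stricter scheme might subtract them to allow for mere unreliability of response.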

LEVEL OF ASPIRATION

This is an important concept in modern personality theory,
which refers to the goals or standards at which a person aims.2
Some people are highly ambitious, optimistic, or confident,
while others, whose actual capacity may be no less, are more
realistic and cautious, or else unduly pessimistic and afraid of
failure. It has been studied by setting some task which provides
considerable scope for improvement, and in which the subject
can readily gauge how well or badly he is doing; for example,
simple addition sheets, substitution tests, or such manual tests
as the pursuitmeter. This task is done for, say, ten 1-minute
1 Ferguson, L. W., ‘ An Analysis of the Generality of Suggestibility to
Group Opinion ’. Char. & Person., 1944, 12, 237-243.
2 Cf. Hunt, Frank, Bibliography. The procedure here described is that
used by Himmelweit, H. T., ‘ A Comparative Study of the Level of
Aspiration of Normal and of Neurotic Persons ’. Brit. J. Psychol., 1947,
37, 41-59.
periods. After each period the subject guesses his own score
(A); he is told his actual score (B), and guesses what his score
will be in the next period (C). The differences between A and B
yield a measure of what is called Judgment Discrepancy, and
differences between C and the previous achievement B yield a
Goal Discrepancy score. Other measures such as Responsiveness
and Rigidity are derived from the number of times the
estimates (C) go up or down in accordance with achievement,
or remain unaltered.
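The discrepancy scores just defined can be made concrete with a small sketch. It assumes that A, B, and C are recorded for every period, that signed mean differences are wanted (A minus B for Judgment Discrepancy, C minus B for Goal Discrepancy), and that Responsiveness and Rigidity are simple counts of how often the goal moves with achievement or stays the same; these operational details and all names are assumptions, since the text gives only the general definitions.

```python
from statistics import mean


def aspiration_scores(guessed, actual, goals):
    """Level-of-aspiration indices from per-period records.

    guessed -- A: the subject's guess of the score he has just made
    actual  -- B: the score he actually made
    goals   -- C: his stated expectation for the next period
    (Signed means and the counting rules are illustrative assumptions.)
    """
    if not (len(guessed) == len(actual) == len(goals)) or not actual:
        raise ValueError("need equal-length, non-empty records")
    judgment_discrepancy = mean([a - b for a, b in zip(guessed, actual)])
    goal_discrepancy = mean([c - b for c, b in zip(goals, actual)])
    responsive = rigid = 0
    for i in range(1, len(actual)):
        change_in_score = actual[i] - actual[i - 1]
        change_in_goal = goals[i] - goals[i - 1]
        if change_in_goal == 0:
            rigid += 1            # goal left unaltered
        elif change_in_score * change_in_goal > 0:
            responsive += 1       # goal moved in step with achievement
    return {
        "judgment_discrepancy": judgment_discrepancy,
        "goal_discrepancy": goal_discrepancy,
        "responsiveness": responsive,
        "rigidity": rigid,
    }


# Three periods of an addition task.
print(aspiration_scores(guessed=[20, 22, 25], actual=[18, 23, 24], goals=[22, 25, 25]))
```

A positive Goal Discrepancy here means that the subject sets his next goal above what he has just achieved.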
Unfortunately the results from different tests are rather
unreliable and inconsistent, though Lewin 1 claims that this
can be overcome by standardizing the technique and scoring,
and the motivation of the subjects. There is some evidence of
lowered aspiration or discrepancy scores among cripples and
physically handicapped children, among pupils or students who
are failing in their work, and students affected by poor economic
circumstances.2 Males usually score higher than females.
Himmelweit 3 found that Goal Discrepancy tends to be higher
and Judgment Discrepancy lower in dysthymics than in
hysterics, but the results of aspiration tests have been less
favourable in subsequent researches by this author and Eysenck.
In any case the discovery of plausible and interesting group
differences does not necessarily show that the method is of any
real value for individual diagnosis ; and Gardner 4 obtained
very small (though meaningful) correlations between individual
scores and ratings on a number of relevant traits. Thus it
seems to the writer that aspiration tests are based on
over-simplified and trivial situations ; that reactions to them are
extremely chancy and have little bearing on the manner in
which a person’s self-esteem operates in his real-life behaviour.
The same criticism applies to the early attempts of Moore and
Gilliland 5 to test ‘ aggressiveness ’. One of their tests was based
on the subject’s ability to carry out oral addition sums while
1 Cf. Hunt, Bibliography.
2 Cf. Rotter, J. B., ‘ Level of Aspiration as a Method of Studying
Personality, III. Group Validity Studies ’. Char. & Person., 1943, 11,
255-274.
3 Op. cit.
4 Gardner, J. W., ‘ The Relation of Certain Personality Variables to
Level of Aspiration ’. J. Psychol., 1940, 9, 191-206.
5 Gilliland, A. R., ‘ A Revision and Some Results with the Moore-Gilliland
Aggressiveness Test ’. J. Appl. Psychol., 1926, 10, 143-150.
looking the tester straight in the eye ; another on the speed
and the aggressive or neutral content of free associations to
such words as ‘ enterprise, success, danger . . .’ The correlations
between the various sub-tests were low and sometimes negative,
in a research by the writer with college students. Although
small positive validity coefficients were obtained with a trait-
composite for Dominance or Leadership, the tests were clearly
too artificial, and too greatly affected by the personal relations
between tester and subject, to be worth following up.

OBJECTIVE TESTS OF INTERESTS

Interests are generally assessed by some form of
questionnaire (cf. Chap. IX), but Fryer’s book on interests describes a
number of attempts to test them through objective measures
of behaviour. Other possible approaches are listed by Cattell
and Heist 1 ; for example :
Fraction of income and/or leisure time spent on each of a
number of types of interest ;
Free association tests with words representing various
types, scored by speed of response and/or psycho­
galvanic reflex ;
A large sheet of miscellaneous pictures and photographs is
presented ; the subject studies these and is scored by
the number he recalls belonging to different types ;
Reading a passage concerned with an interest which
contains nonsense words ; scored by failure to notice
and cross out such words ;
Subject studies pages containing statements relevant to
types of interests, and irrelevant words ; strength of
interest expected to be shown by failure to recall
irrelevant material, but the reverse was found.
Cattell and his colleagues applied a dozen such tests in an
attempt to measure twelve types of interests and attitudes
among small groups of students. The inter-correlations were
so low that, though it might be feasible to build up a battery of
the best objective tests for giving a composite measure of some
1 Cattell, R. B., et al., ‘ The Objective Measurement of Attitudes ’.
Brit. J. Psychol., 1949, 40, 81-90. ‘ The Objective Measurement of
Dynamic Traits ’. Educ. Psychol. Measmt., 1950, 10, 224-248.
single type of interest, it would obviously be impracticable to
use such batteries in assessing numerous different types, e.g.
for vocational purposes.
One kind of objective test, however, which does offer
considerable promise is the test of information or knowledge about
a given field. No doubt such tests depend on aptitude and
training as well as interest, but they do tend to correlate with
other interest measures, and they do help in the prediction of
vocational or educational success. Fryer describes a number of
early tests, particularly of mechanical information, which were
employed in the American Army in 1918. Cattell 1 provides a
set of fifteen short information tests dealing with the following
fields : Travel, Sport, Commercial, Mechanical, Scientific,
Things of the Mind, Rural-Naturalistic, Religious, Literary,
Artistic, Decorative, Sensual, Sex, Social, and Home. These
are not sufficiently thorough, nor well enough standardized, to
be recommended for immediate use, but they provide a useful
starting point. Tests of general information about everyday
mechanical and electrical matters were of considerable value in
allocating British recruits during the war,2 and are being
adapted by educational psychologists in some areas for selecting
boys for technical education. The American Air Force
psychologists also designed information tests covering technical and
other types of interests. Items which best differentiated
successful from unsuccessful pilots and navigators were picked
out, and used in a revised test, which made a valuable
contribution to aircrew selection.3
Peel and Lambert 4 have published an ingenious combination
of information and ‘ miniature situation ’ test for measuring
academic vs. technical interests, to be used in selecting for
different types of secondary education. Several blocks of
questions are given, in each of which the pupil is told to
answer half only. Thus he may choose to answer mostly
technical, or mostly literary questions, and his score is based
1 Cattell, R. B., A Guide to Mental Testing. London : University of
London Press, 1936.
2 Cf. Vernon and Parry.
3 Cf. Guilford, J. P., and Lacey, J. I., Printed Classification Tests.
Army Air Forces Aviat. Psychol. Prog. Res. Rep. No. 5. Washington,
D.C. : U.S. Government Printing Office, 1947.
4 Peel, E. A., ‘ Assessment of Interest in Practical Topics ’. Brit. J.
Educ. Psychol., 1948, 18, 41-47.
on the proportion of choices as well as on their correctness.
For example:
Write the meaning, and if necessary make a rough drawing,
of three of the following words : monastery, brake,
actor, clawhammer, optician, hacksaw.
Another cognitive test of relevance to personality is the
George Washington Test of Social Intelligence.1 This group test
was devised by Moss, Hunt, and Omwake to measure the
ability to understand and get along with people; it can be used
with high school or college students. The five sub-tests involve:
Selecting the best responses in various social problem
situations ;
Recognizing the emotions underlying verbal statements ;
True-False statements about human behaviour ;
Memorizing and recalling names and photographs of people ;
Choosing the best completion to humorous stories.
Primarily the battery measures the same general intelligence as
ordinary verbal tests, but it has also given useful correlations of
around .4 with assessments or other tests of social extraversion.

MORAL KNOWLEDGE OR JUDGMENT, AND CHARACTER, TESTS

Large numbers of paper-and-pencil tests have been published
in America,2 including vocabulary tests based on ethical, or
conversely on criminal or slang terms ; tests of biblical and
religious knowledge ; ranking offences in order of seriousness ;
and comprehension tests (presented either verbally or
pictorially), where children are asked the proper thing to do in various
moral situations. E.g.:
If someone steals your lunch you should :
Steal another lunch to even it up ;
Report it to the teacher ;
Cry about it ;
Say nothing about it.
1 Moss, F. A., et al., George Washington Social Intelligence Test. Washington,
D.C.: George Washington University, Center for Psychological
Service, 1930.
2 Symonds (cf. Bibliography) provides an excellent description.
Such tests attain good reliabilities and inter-correlations, but
the fact that they are heavily weighted with intelligence
indicates that they are unlikely to be highly predictive of ‘good
character’. Burt and others have pointed out that delinquents
often know as well as non-delinquents what society regards as
right or wrong. Nevertheless in Hartshorne and May’s
investigation, described below, a battery of tests, which
correlated .70 with intelligence, did correlate .35 with a battery
of objective tests of honesty. In other words, the information
and judgment tests were rather more valid than teachers’
ratings of the same children’s honesty.
As described by Symonds, several miniature situation tests of
honesty and cheating were developed by Voelker, Cady,
Raubenheimer, and others in the 1920s. For example, children
(or college students) may be given a straightforward attainment
test; unknown to them a copy of their answers is taken, and
this is used to check the marks when they score their own
papers. Alternatively the self-marked scores are compared with
the scores on an exactly parallel test marked by the tester.
Peeping tests set the children to trace mazes or carry out other
tasks with the eyes closed. As these tasks are impossible
without vision, successful performance gives a measure of
cheating. In the Overstatement test, children are given a list
of book titles, several of which are plausible but fictitious. The
number of these they claim to have read is again supposed to
show dishonesty.
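
The scoring of such tests reduces to simple counts. A minimal sketch follows; the function names and data layouts are hypothetical, chosen only for illustration, and are not the scoring rules actually used by Voelker, Cady, or Raubenheimer.

```python
def self_marking_gain(self_marked: int, checked: int) -> int:
    """Marks a pupil awarded himself beyond what the concealed copy supports.

    A pupil who marks himself at or below the checked score gets 0.
    """
    return max(0, self_marked - checked)


def overstatement_score(claimed, fictitious) -> int:
    """Number of fictitious book titles the pupil claims to have read."""
    return len(set(claimed) & set(fictitious))
```

A gain of zero shows only that no marks were added on this occasion; it is not evidence of honesty in general.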
Such tests may strike the reader as thoroughly obnoxious;
but one research in the field of character, using these and
many other extremely ingenious tests, was of outstanding
importance, namely, Hartshorne and May’s Character Education
Inquiry. This included not only rather artificial classroom
situations, but also a variety of real-life situations, which were
so arranged as to yield quantitative scores for character. For
example, games were arranged at parties which gave scope for
cheating. Children were sent on standard errands and given
excess change, so that the amount they appropriated could be
measured. Opportunities were provided for doing work for,
and giving away things to, other children, or for being selfish.
Several of the persistence tests described above, together with
ethical judgment tests, and ratings of children by one another
(cf. p. 118) or by teachers, were also employed.
When the results of different honesty measures, or persistence
or altruism tests, were inter-correlated, the agreement was
often so low that the authors concluded that there is no such
thing as good character in general, rather that children should
acquire specific good habits in specific situations. Yet most of
the correlations were in fact positive, so that it is equally
legitimate to think of honesty, persistence, etc., as general
underlying factors, which can be measured by combining the
results of a number of different tests, that is by trait-composites.
Moreover, there was positive overlapping between several
different composites constructed for measuring honesty,
persistence, service and self-control, and consistency of
behaviour. Thus a kind of super-composite representing
character in general would be justifiable (cf. p. 14), although
this manifests itself so differently in different situations that
no single test could be taken as a valid index.
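
The argument for trait-composites can be illustrated numerically. The sketch below averages standardized scores over several tests and applies the Spearman-Brown formula, which is brought in here as an assumption and is not cited in the text. With nine tests inter-correlating only .2 on average (an illustrative figure, not Hartshorne and May’s), the composite would have an estimated reliability of about .69.

```python
from statistics import mean, pstdev


def standardize(scores):
    """Convert raw scores on one test to z-scores (scores must not all be equal)."""
    m, s = mean(scores), pstdev(scores)
    return [(x - m) / s for x in scores]


def trait_composite(tests):
    """Average each child's z-scores over several tests of the same trait.

    `tests` is a list of score lists, one per test, children in the same order.
    """
    z_by_test = [standardize(scores) for scores in tests]
    return [mean(child) for child in zip(*z_by_test)]


def composite_reliability(k, mean_r):
    """Spearman-Brown estimate for a composite of k tests whose
    average inter-correlation is mean_r."""
    return k * mean_r / (1 + (k - 1) * mean_r)
```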

OBSERVATIONAL AND TIME-SAMPLING TECHNIQUES

Olson, Goodenough, Thomas,1 and others in America have
shown that it is possible to measure personality traits even
more directly, and in less ‘miniature’ types of situations, than
did Hartshorne and May. An observer can record every
instance of some specified form of behaviour, say aggressiveness,
among children in a nursery school group. But it is usually
more convenient to do this at regular time intervals. Olson
and Cunningham define time-sampling as: ‘systematic
recording of a definitely delimited unit of behavior described
in terms of action over a stated time interval, yielding quantitative
individual scores by means of repeated time units’. And
they describe applications of this technique to some forty types
of behaviour. For example, Parten2 studied social participation
among 34 nursery school children. She first drew up a
list of categories of participant behaviour including solitary
play, onlooker behaviour, organized group play, etc. She
1 Olson, W. C., and Cunningham, E. M., ‘Time-Sampling Techniques’.
Child Devel., 1934, 5, 41-58. Thomas, D. S., Some New Techniques for
Studying Social Behavior. New York: Teachers College, Columbia, Bur.
Publ., 1929. Cf. also Arrington, Bibliography.
2 Parten, M., ‘An Analysis of Social Participation, Leadership, and
Other Factors in Pre-school Play Groups’. Instit. Child Welfare Monogr.
Ser. Minneapolis: University of Minnesota Press, 1931.
then observed each child for 1 minute a day for 60 days
(distributing each child’s minutes throughout an hour’s school
period), and ticked off the category of behaviour into which he
fell during that minute. A score for social participation was
thus derived from the numbers of minutes in which he had been
engaged in each category. It was found in this and other such
studies that time-sample scores possess good reliability; for
example, participation scores on odd-numbered days correlated
highly with scores on even-numbered. Moreover, they predict
future behaviour of the same type, or else other types which
one would naturally expect to be related (e.g. leadership,
talkativeness, laughter, and physical activity) in a consistent
fashion. Time-sampling is reliable or consistent too, in the
sense that two observers making records of the behaviour of
the same children agree very closely provided that they are
thoroughly trained, and that the behaviour is defined sufficiently
unequivocally. Thomas shows that such consistency is higher
when highly specific and objective activities are recorded (e.g.
total physical contacts with other children) rather than activities
which involve some interpretation (e.g. number of social
contacts). But the latter are certainly more meaningful and
useful for personality study.
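
The derivation of a time-sample score and its odd-even reliability might be sketched as follows. The category weights are hypothetical (Parten’s own weighting is not given above), and the record layout, one category label per observed minute, is likewise an assumption made for illustration.

```python
from math import sqrt

# Hypothetical weights for three of the categories; not Parten's own values.
WEIGHTS = {"solitary": 0, "onlooker": 1, "group": 2}


def participation_score(minutes):
    """Sum the category weights over a child's observed minutes."""
    return sum(WEIGHTS[category] for category in minutes)


def odd_even_scores(minutes):
    """Score odd-numbered and even-numbered days separately (day 1 is index 0)."""
    return participation_score(minutes[0::2]), participation_score(minutes[1::2])


def pearson(xs, ys):
    """Product-moment correlation between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def odd_even_reliability(records):
    """Correlate odd-day with even-day scores across children.

    `records` maps each child's name to the list of daily category labels.
    """
    halves = [odd_even_scores(minutes) for minutes in records.values()]
    odd, even = zip(*halves)
    return pearson(odd, even)
```

The same `pearson` function could be applied to the totals recorded by two observers for the same children, as a rough index of observer agreement.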
The technique is mainly applicable among very young
children, both because their behaviour is less complex than that
of older children or adults, and easier to classify consistently,
and because they can readily be observed without becoming
self-conscious (if a one-way observation screen is available,
they need not know that they are being watched at all). It
has been extended to older children in a research by Olson,1
who recorded the nervous habits, nail-biting, nose-picking,
head-scratching, tics, etc., of children in class, unknown to
them. Though it was difficult to get high observer-consistency,
a reliable total score was built up, which bore some relation to
teachers’ ratings of behaviour difficulty. Newcomb2 applied a
similar technique to a study of extraverted and introverted

1 Olson, W. C., ‘The Measurement of Nervous Habits in Normal
Children’. Instit. Child Welfare Monogr. No. 3. Minneapolis: University
of Minnesota Press, 1929.
2 Newcomb, T. M., ‘The Consistency of Certain Extrovert-Introvert
Behavior Patterns in 51 Problem Boys’. Teachers College Columbia
Contr. Educ., No. 382, 1929.
behaviour at a boys’ summer camp. It might well be used with
adults in a factory situation or committee; indeed it overlaps
with time and motion and accident studies, with Mass Observation,
and with recent work on social dynamics of groups, though
these are not, of course, concerned with personality differences.
Such research is much more troublesome than would appear at
first sight because of the difficulties of defining significant
behaviour sufficiently rigidly, of securing impartial and consistent
observers, and ensuring that their presence does not
affect the behaviour of the individuals they are observing.
Nevertheless it has yielded results of the greatest value in, for
example, Murphy’s study of sympathy and aggressiveness
among young children, D. E. M. Gardner’s comparison of the
personalities of 5- to 10-year-old children taught in progressive
and orthodox schools, and Lewin’s investigations of the effects
of frustration and of authoritarian, laissez-faire and democratic
club leadership on boys’ social behaviour.1

GROUP OBSERVATION TECHNIQUES

Burt describes a research carried out during the First World
War, where 58 12- to 14-year-old children were assessed on a
number of traits (Emotional Stability, Extraversion, Leadership,
Delinquent Tendencies, etc.) by several independent
observers employing different techniques. These judgments
were validated against exceptionally thorough teachers’ ratings.
The average coefficient for judgments based on interviews was
.44, and for those based on projection, questionnaire, and other
tests was only .27. Considerably more successful (the coefficients
averaging .54) were observations of children in specially
arranged, but natural, situations such as a tea-party and a visit
to the Zoo. ‘On these occasions a number of stock little crises
were stage-managed, so that each child’s reactions to typical
everyday emergencies could be observed.’ No further details
are given, but similar observational methods were applied in

1 Murphy, L. B., Social Behavior and Child Personality. New York:
Columbia University Press, 1937. Gardner, D. E. M., Testing Results in
the Infant School. London: Methuen, 1942. Long Term Results of
Infant School Methods. London: Methuen, 1950. Lewin, K., Lippitt, R.,
and White, R. K., ‘Patterns of Aggressive Behavior in Experimentally
Created “Social Climates”’. J. Soc. Psychol., 1939, 10, 271-299.
assessing students engaged on biological field-work.1 Clearly
these investigations were forerunners of important later
developments.
In the 1930s, German military psychologists devised elaborate
methods of officer selection based less on objective tests than on
qualitative observations of expressive movements and of
behaviour in situations involving stress. In some of the tasks
candidates had to drill recruits, give instruction or short
lectures, or carry out complex orders requiring quickness of
uptake, physical agility, or endurance, or improvisation in
emergencies. Their reactions were observed and interpreted by
trained psychologists. No good evidence of the validity of such
methods was ever collected, and it is only too likely that the
tasks were too artificial or the judgments too subjective to be of
much value. Nevertheless, during the war, British psychologists
in War Office Selection Boards, and American psychologists in
the Office of Strategic Services,2 developed similar techniques,
which combined objective tests of abilities, questionnaires and
projection tests of personality, with interviews by military
officers and psychiatrists, and with observations of behaviour
at certain group exercises. The W.O.S.B. technique has
frequently been described. Groups of about eight candidates
for commissions were studied over a 2- to 3-day period by a
senior army officer, a military testing officer, a psychiatrist, and
a psychologist, who at the end pooled their information before
deciding on the suitability of each candidate. The exercises,
which were watched by the testing officer and sometimes by the
other staff, were designed, not to bring out particular traits
(leadership, co-operativeness, initiative, etc.), but to be
analogous to some of the common social situations of army life,
and to afford opportunities for observing how each candidate
behaved in a small group. In a typical ‘leaderless-group’ test,
for instance, the group is assigned some task such as moving a
heavy object over a set of obstacles, and is left to work out its
own solution. Some candidates behave merely as passengers,
others try to dominate the rest, while some seem naturally to
come to the fore, though acting in the group’s rather than their
1 Burt, C. L., ‘The Factorial Analysis of Emotional Traits’. Char. &
Person., 1939, 7, 238-254, 285-299.
2 Cf. Vernon and Parry. Also, Office of Strategic Services Staff,
Assessment of Men. New York: Rinehart, 1948.
own interests. Note that this is not an objective test of any
quality. It falls rather under the indirect or expressive methods
of Chap. IV, since the candidates would usually assume that
their ingenuity rather than their social response was being
tested. And, along with the other exercises and interviews,
it yields, not scores or measurements, but subjective ratings
by the observer of the candidate’s personality as a whole.
Hence it depends enormously on the skill and experience
of the observers. Other weaknesses have been pointed out
by the writer elsewhere, for example, the obvious dependence
of the candidate’s behaviour on his interpretation of
the procedure and his preconceived notions of the sort of
personality that he should try to display. Thus the agreement
between different observers, or between ratings given
on two or more occasions, is nothing like so high as that of
time-sampling; it is usually around .6 to .7. Nevertheless
this is somewhat higher than the reliability of judgments based
on interview alone, or of ratings based on general acquaintance
and on casual, as contrasted with directed, observation.
Various follow-up studies of candidates selected by these
procedures have given quite low correlations with the officers’
subsequent success at a training unit or in the field. There are,
however, extreme difficulties in securing a reliable criterion of
‘success’, and validation is necessarily confined to the highly-selected
candidates who have actually been chosen; i.e. there
is no way of proving how unsuccessful the rejected candidates
would have been. One can at least state that the method is
superior to the older methods based on interview alone. Moreover,
it has the tremendous advantage of appearing to be
fair to the army, thus stimulating the supply of candidates
and improving the confidence of senior officers in their
subordinates.
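
The effect of confining validation to the chosen candidates can be illustrated by a small simulation. Every figure in it (a correlation of .6 among all applicants, the top fifth selected, normally distributed scores) is an assumption made for the sketch and does not come from the W.O.S.B. follow-up data.

```python
import random
from math import sqrt


def pearson(xs, ys):
    """Product-moment correlation between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def simulate(n=5000, true_r=0.6, top_fraction=0.2, seed=0):
    """Return (r among all applicants, r among the selected top fraction)."""
    rng = random.Random(seed)
    pairs = []
    for _ in range(n):
        board = rng.gauss(0, 1)  # selection board rating
        # Later success, built to correlate true_r with the board rating.
        success = true_r * board + sqrt(1 - true_r ** 2) * rng.gauss(0, 1)
        pairs.append((board, success))
    pairs.sort(key=lambda p: p[0], reverse=True)
    chosen = pairs[: int(n * top_fraction)]
    return pearson(*zip(*pairs)), pearson(*zip(*chosen))
```

Under these assumptions the correlation within the selected group falls well below that among all applicants, so a low coefficient obtained from follow-up of accepted candidates alone understates the value of the procedure.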
It is because of this high ‘face validity’, and the acceptability
to candidates and to users, not because of their proven value,
that group observational procedures have had such a vogue
since the war. They are still used, in somewhat abbreviated
form, by all three Services for officer selection. The most
important adaptation was in selecting high-grade civil servants
for the administrative class and foreign service (where the
ordinary interview plus academic examination method of
selection was obviously inappropriate for assessing men and
women whose careers had been interrupted by war service).1
Here the analogous exercises were designed chiefly to bring out
desirable qualities of intellect, for example: free discussions
on a given topic among a group of candidates, giving a short
lecture, studying a brief and expounding it to a committee,
and acting as chairman to a committee. Judgments based on
these situations were supplemented by the study of objective
test and examination results, background data and references,
and by interviews. It was possible to demonstrate quite high
validity for the procedure as a whole when the selected candidates
were followed up after 2 years in the Civil Service (cf.
p. 29). Correlations of around .6 were almost as high as the
correlation between the Civil Service’s own judgments of
suitability after 1 and after 2 years. In other words, the
systematic 2- to 3-day procedure was nearly as accurate in
forecasting success as was casual observation of the whole of
the first year’s work. But it should be stressed that the
exercises, although quite time-consuming, supplied only about
a quarter or a third of the material on which the final choices
were made. Moreover, the staff were exceptionally experienced
and very stable in composition. It certainly does not follow
that any ad hoc adaptations of the ‘house-party’ method,
either by psychologists or laymen, will be equally successful.
Both the Army and the Civil Service have departed from their
original procedures, on grounds of economy, and satisfactory
validation of their present watered-down methods is not yet
available.
Munro Fraser2 describes numerous applications to the
selection of industrial executives, where an appointments
committee, including a psychologist, spends half or a full day
on a group of candidates. He reports general satisfaction
with the products of the method, but this is no substitute
for scientific validation. Candidates for the ministry, for
teacher training, and for youth leaders, have been similarly
dealt with.

1 Cf. Wilson, N. A. B., ‘The Work of the Civil Service Selection Board’.
Occup. Psychol., 1948, 22, 204-212. Vernon, P. E., ‘The Validation of
Civil Service Selection Board Procedures’. Occup. Psychol., 1950, 24,
75-95.
2 Fraser, J. M., ‘New-Type Selection Boards in Industry’. Occup.
Psychol., 1947, 21, 170-178.
In one or two education areas, groups of 11-year-old children
who are borderline candidates for grammar school places have
been collected at a convenient centre for a day, and observed at
a variety of tasks not unlike the Army leaderless-group tests:
group games, constructional, imaginative, and dramatic
activities. The teachers and psychologists who watch them
apparently reach an agreed judgment quite readily as to which
children show most initiative, co-operation, and other desirable
personality qualities; but again there is no evidence yet as to
how far this predicts anything relevant to grammar school
success. Here, too, there is some danger of children behaving
unnaturally, through a sense of the importance of the occasion,
or as a result of coaching by schools or parents. One would
have thought that a week or a month's trial period in an actual
grammar school, with a teacher specially trained to observe
their social and intellectual adjustments, would be more
diagnostic; or indeed that a system of intermediate schools
before a final decision is reached at 13 years would be even
more effective.
In conclusion: these group procedures do not constitute
personality tests. They are likely to be somewhat superior to
the conventional interview method of assessing people, because
they provide a more prolonged and varied set of situations in
which to observe and interpret. But they are just as dependent
as the interview on the skill, experience, and impartiality of
the observer, and they should be applied with all the more
caution because they engender in the observers an undue
measure of confidence in the accuracy of their judgments.
One might expect them to be superior also to the observation
of behaviour at performance or other tests, described in Chap.
IV, because they bring out social reactions of the candidates to
their fellows, instead of only to the tester. But this is a dubious,
and as yet unsubstantiated, advantage since it also means that
the situation is more complex, less standardized, more apt to
stimulate self-consciousness and playing a part.
VII
Ratings and Judgments of Personality
The object of the rating method is to draw on the knowledge
that a person’s associates have acquired about him, and to
turn this into numerical estimates of his standing on various
personality traits. Let us look first at the acquisition of such
knowledge. As soon as we meet a person we jump to conclusions
about him. We interpret his features and expressive
movements, and any actions we see or words we hear, and
arrive at a kind of picture or schema of his personality as a
whole. Our further contacts, observations, and conversations,
help to fill in and extend, sometimes to modify, this schema.
But when we are asked to rate him and give him, say, a high
mark for Sociability or a low mark for Dependability, it is not
so much because we have observed any particular pieces of
behaviour which are representative of these traits, as because
we generalize from our total impressions. Sometimes certain
observations stand out in our minds and influence our judgments:
he may have failed to carry out some commission, so
we call him undependable. But usually a whole conglomeration
of more or less unanalysed recollections and emotional reactions
is bound up in any judgment. Earlier conclusions about him
considerably affect later observations; once the schema has
been formed we tend to interpret what we see of him to fit in
with it. Thus the schema is not an objective portrait or
summary of the person. Although it may embody visual
images and verbal descriptions, it also involves an emotional
attitude or sentiment towards him. Landis1 has studied the
reasons given by raters for their judgments, and pointed out
that good or bad reasons have little effect on accuracy. They
tend to be rationalizations, in the psychoanalytic sense, of
whose real origins the rater is largely unaware.
Our own theories of human nature and the meanings we
attach to various traits also affect our judgments. In the
1 Landis, C., ‘The Justification of Judgments’. J. Personnel Res.,
1925, 4, 7-19.
course of our lifetime, our analysis of self, our contacts with
other people, and the books and newspapers we read or the
cinema films and plays that we see, all help to build up in us
a set of stereotypes or stock personalities—the typical athlete,
the aesthete, the absent-minded professor, the pedantic civil
servant, etc. We are very apt to fit each new acquaintance into
one or other of these categories. Actions which fail to conform
are often not noticed. Hence our schemas remain primitive
and far too simple to cover the complexities of the personalities
we actually meet. (An enlightening discussion of the development
of conceptions of people among children and poorly
educated adults is given by Watts.1 The interplay between
our linguistic education and our understanding of people
deserves much more study.)
The result, as mentioned in Chap. I, is the halo phenomenon.
Either our general liking or disliking affects our judgments of
what should be distinct traits; or our schema embodies so
strong an impression of one personality type or trait—say
joviality, devotion to work, selfishness, or their opposites—that
we interpret all other behaviour and rate other traits to accord
with this. The subjectivity of judgments of personality is
apparent whether these judgments are expressed in a free
description (witness the varied interpretations of Napoleon’s
or Hitler’s personalities by different authors), or as ratings.
Although the latter are given in more standardized form, and
the rater is usually warned to avoid halo, yet discrepancies
between raters are probably as great as, or greater than,
between biographers because of ambiguities in the interpretation
of traits to be rated, and variations in standards of judgment.
Ratings are therefore best regarded as samples of the ‘reputation’
of the subject in the eyes of the rater. They are most
inadequate as sole criteria of a person’s traits, or as the sole
source of data for the scientific study of personality. Yet at
the same time they cover a much wider range of more natural
behaviour than any practicable battery of personality tests or
time-samples, and they have the tremendous advantage of
being applicable without taking up the time of the subjects—
even without their knowing anything about it. In an extensive
research by the writer on a small group of students, the average
1 Watts, A. F., The Language and Mental Development of Children.
London: Harrap, 1944.
validity coefficient of sets of associates’ ratings, when compared
with trait-composites, was +.60, whereas most of the better
objective tests yielded coefficients between .30 and .45. Undoubtedly,
then, ratings are useful, and they should be included
in any approach to the assessment of people, provided that certain
precautions are observed. Indeed they have probably been more
widely used (in the form of school reports and record cards, merit
ratings in industry, etc.) and more thoroughly studied than any
other psychometric technique except the intelligence test.
It should be noted that ratings overlap with many of the
more objective methods described in previous chapters, and
particularly with the expressive methods of Chap. IV. There
we saw that measurements of the speed, extent, pressure, etc., of
expressive behaviour were seldom as diagnostic as judgments
of the behaviour by an impartial observer. (Nevertheless
there is much to justify the argument1 that a properly weighted
combination of measurements from such tests as Porteus
Mazes, the Luria apparatus, or some form of dotting or stress
test, would be more accurate than subjective interpretation.)
Time-sampling and group observational methods are kinds of
rating, and the former—when applicable—is superior to the
latter just because it leaves so little to the judgment of the
observer. Some researches based on ratings such as Webb’s,
Burt’s, and Newcomb’s 2 have required the judges to observe
their subjects systematically over a considerable period; and
several rating devices mentioned below try to make the judges
rate more from direct observation than from generalized
recollections. These steps should help to make their schemas
fuller and more impartial, though they certainly do not
eliminate all halo, stereotypy, and bias.

RATING TECHNIQUES

1. Ranking and Paired Comparisons. A school teacher may
arrange her class, or an officer his platoon, in order of merit for
1 Upheld, for example, by Eysenck, H. J., The Scientific Study of
Personality. London: Routledge, 1952.
2 Webb, E., ‘Character and Intelligence’. Brit. J. Psychol. Monogr.
Suppl., 1915, No. 3. Burt, C. L., ‘The Factorial Study of Temperamental
Traits’. Brit. J. Psychol. Statist. Sec., 1948, 1, 178-208. Newcomb, T. M.,
‘The Consistency of Certain Extrovert-Introvert Behavior Patterns in 51
Problem Boys’. Teachers College Contr. Educ., 1929, No. 382.
a trait. This is hardly applicable when the number of cases
exceeds, say, twenty because of the difficulty of distinguishing
among the bulk of middling people. But it has the advantage
of avoiding the vagaries of absolute standards (cf. below). It
is best to put each name on a separate card and let the
rater sort them out. If a printed list is given, judges are
apt to rank people at the top too high, those at the bottom
too low.
In the paired comparison method the rater is given every
possible pair of names and asked to say which of the two is
higher. The results can be expressed finally as a ranking or a
scaled score.1 For most purposes this is an unnecessary
refinement. It is usual to convert rankings into normally
distributed scores, for example by means of Hull’s or Symonds’s2
tables. If this is done it is quite simple to combine or average
rankings provided by different raters, each of whom may have
judged a somewhat different list.3 (For example, the Mathematics,
French, and English teachers may each have ranked
some, but not all, of a group of children.)
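
The conversion of rankings into normally distributed scores, and their combination across raters, might be sketched as follows. The mid-rank percentile rule used here is an assumption standing in for the published tables, and the function names are hypothetical.

```python
from statistics import NormalDist


def normalized_scores(ranking):
    """Turn one rater's ranking (best first) into normal deviates.

    Each person is placed at the mid-point of his rank interval and the
    corresponding point on the unit normal curve is read off.
    """
    n = len(ranking)
    unit_normal = NormalDist()
    return {
        name: unit_normal.inv_cdf(1 - (position + 0.5) / n)
        for position, name in enumerate(ranking)
    }


def combine(rankings):
    """Average the normalized scores each person received from his raters."""
    collected = {}
    for ranking in rankings:
        for name, z in normalized_scores(ranking).items():
            collected.setdefault(name, []).append(z)
    return {name: sum(zs) / len(zs) for name, zs in collected.items()}
```

As the author's footnote warns, averages based on different numbers of judges are not strictly comparable, so `combine` is best used where every person has been ranked by the same number of raters.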
2. Numerical Ratings. It is an old parlour game to give
people marks for traits. Dr. Johnson is said to have been
annoyed at getting 0 out of 20 for Good Temper and Manners
from Mrs. Thrale, though he also received 20/20 for Morality.
Obviously it is impossible to distinguish as many as twenty
grades, or to give any consistent meaning to percentage marks
and the like. Experiments by Symonds and others show that
five, or at most seven, grades is the largest number that the
average rater can cope with. These may be denoted 5, 4, 3, 2, 1,
or +2, +1, 0, -1, -2, or turned into letters, A to E, or
verbal labels:
Strongly present, Present, Average, Lacking, Strongly
Lacking, and so on. A smaller number of grades than five (e.g.
Yes, Doubtful, No, or +, 0, -) is rather wasteful of the rater’s
powers of discrimination, but is nevertheless often used where
1 Cf. Guilford, J. P., Psychometric Methods. New York: McGraw-Hill,
1936.
2 Hull, C. L., Aptitude Testing. Yonkers, N.Y.: World Book Co., 1928.
Symonds, see Bibliography.
3 Note that it is essential that each person should be rated or ranked by
the same number of judges. If 4 judges rate some, and only 2 judges
others, the spread or scatter of ratings among the former will necessarily
be smaller than among the latter.
there are many questions to be answered or traits to be rated,
or when the rater has little detailed knowledge of the subjects
or ‘ratees’.
The outstanding defect of this type of scale is the variations
in standards and distributions adopted by different raters.
(Just the same difficulty arises in the marking of English
essays or essay-type examinations, which is, of course, a
form of rating.) The psychologist would naturally prefer
the ratings of any large group of subjects to conform fairly
closely to a normal distribution, say:

A    B    C    D    E
7   24   38   24    7%
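
These percentages are what a normal curve gives when it is cut into five bands, each one standard deviation wide, centred on the mean. The cut points of ±0.5 and ±1.5 standard deviations are an assumption (they are not stated in the text), but they reproduce the figures above when rounded to whole percentages.

```python
from statistics import NormalDist


def grade_percentages(cuts=(-1.5, -0.5, 0.5, 1.5)):
    """Percentage of a normal population falling in each band between cuts.

    Bands run from the lowest grade to the highest; with the default cuts
    the result is symmetrical, so the order E..A or A..E reads the same.
    """
    unit_normal = NormalDist()
    edges = [0.0] + [unit_normal.cdf(c) for c in cuts] + [1.0]
    return [round(100 * (upper - lower)) for lower, upper in zip(edges, edges[1:])]
```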

Most raters tend to be too generous; indeed they seem to
regard C or Average as a term of abuse; and most avoid
using the extremes. Thus the distribution often reduces to
something like this:

A    B    C    D    E
3   60   30    7    0%
But unfortunately no two raters distort in the same way, and
this means that their judgments cannot be compared or combined.
If one rater scarcely ever awards an A, his A’s represent
a much higher standing on the trait than do those of another
rater who frequently gives them. The writer would go so far
as to say that the ratings given on record cards by primary
school teachers from numerous different schools are of practically
no value to the secondary school teacher of the same pupils,
because the latter cannot know what distributions the different
raters adopted. The same is true for different supervisors in
industry, or different officers commanding groups of soldiers.
Again, it is impossible to combine the results of several sets of
ratings of overlapping groups of people when distributions
vary (just as different sets of examination marks cannot
properly be combined unless scaled to a common standard).1
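
Scaling to a common standard can be sketched as a linear rescaling of each rater's marks to the same mean and standard deviation. The target values of 50 and 15 are arbitrary assumptions, and the method presumes that the groups judged by different raters are of comparable merit, which the text itself goes on to question.

```python
from statistics import mean, pstdev


def scale_to_common_standard(marks, target_mean=50.0, target_sd=15.0):
    """Rescale one rater's marks to a chosen mean and standard deviation.

    `marks` maps each ratee's name to the rating given by this rater.
    """
    values = list(marks.values())
    m, s = mean(values), pstdev(values)
    if s == 0:
        raise ValueError("rater gave everyone the same mark; nothing to scale")
    return {name: target_mean + target_sd * (x - m) / s for name, x in marks.items()}
```

A lenient rater who awards 5, 4, 3 and a severe rater who awards 3, 2, 1 to the same three people yield identical scaled marks, since only relative standing within each rater's own distribution is kept.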
Sometimes, therefore, raters or markers are presented with
1 Cf. Vernon, P. E., The Measurement of Abilities. London: University
of London Press, 1940.
an ideal distribution which they are advised to adhere to,
such as:

A    B    C    D    E
10   20   40   20   10%

This is a very simple pattern, which approximates closely
enough to normality. But they seldom conform to this unless
they are trained to do so, and are frequently checked. It may
be preferable, then, to force them to use relative rather than
absolute ratings, that is, to get them to pick out the best 10%
and worst 10% of their group of ratees, then 20% of B’s and
20% of D’s. (The same method can be applied to other forms
of distribution.) They will naturally object that their groups
are almost all higher, or lower, than the general run. This
may well be true when, for example, ratings are collected from
different primary schools that feed one grammar school, or
from different streams within any one school. If the groups
are quite small they are especially apt to vary in merit, so that
it is unfair to reduce them all to the same mean and standard
deviation. It is sometimes possible to adjust the ratings to
allow for such group differences, as when essay examination
marks are scaled against objective attainment tests.1 But the
difficulty is such a serious one in the field of personality, that
we must conclude that ratings should generally be used only
within groups all of whose members are known to two or more
raters. There is no really satisfactory way of comparing
ratings by different judges of different groups, unless the groups
are large enough and similar enough for relative ratings to be
fair. Obviously this greatly restricts the practical usefulness
of ratings.
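The relative-rating procedure just described (best 10%, next 20%, and so on) can be sketched in code. This is only an illustration under stated assumptions: the function name `forced_grades`, the use of a numerical score to order the ratees, and the rounding of the cut-off ranks are not taken from the text.

```python
def forced_grades(scores, quotas=(10, 20, 40, 20, 10), labels="ABCDE"):
    """Grade ratees by rank so that each grade receives roughly its
    quota (in per cent) of the group; A is the highest grade."""
    n = len(scores)
    # Indices of ratees ordered from highest to lowest score.
    # sorted() is stable, so tied ratees keep their original order.
    order = sorted(range(n), key=lambda i: -scores[i])
    grades = [None] * n
    cumulative, start = 0, 0
    for quota, label in zip(quotas, labels):
        cumulative += quota
        end = round(cumulative * n / 100)  # cut-off rank for this grade
        for i in order[start:end]:
            grades[i] = label
        start = end
    return grades


# Hypothetical group of ten ratees (illustrative scores only).
group = [55, 72, 64, 90, 48, 61, 77, 30, 68, 59]
print(forced_grades(group))
```

With very small groups the rounded quotas become uneven, which is one more reason for the caution expressed above about reducing small groups to a common distribution.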
3. Man-to-Man Scales. Various devices have been introduced
in an attempt to pin raters down to more consistent standards.
Thus in rating Leadership among American Army officers in
1917-18, each rater was told to think of an officer, A, whom he
regarded as highest in this trait, then another, E, who was
lowest, another half-way between, and a B and a D. These
names were retained as a private yardstick, so that in rating
any new officer, X, the rater would judge which of the five X
most closely resembled. Similar scales were to be constructed
1 Cf. McClelland, W., Selection for Secondary Education. London:
University of London Press, 1942.
for other traits. Since different raters still have different scales,
it is doubtful whether this is of much value. Its application
to the marking of handwriting specimens or children’s
compositions or drawings by means of quality scales, where all
markers use the same standard set of specimens, is a different
matter.
4. Verbal and Graphic Scales. The substitution of such terms
as: Excellent, Good, Average, Poor, Bad, for letters or numbers
is of little help; though a clever choice of terms will sometimes
help to counteract the tendency to undue generosity or the
tendency to avoid extremes. For example: Excellent, Very
Good, Good, Fair, Weak, may produce a better distribution.
But an extension of this idea, proposed by Freyd,1 has been
very widely adopted. In the graphic scale, each step is defined
as concretely as possible, so that the rater no longer has to
think quantitatively or bother about standards. In addition
the ambiguity of vague trait-names like leadership,
industriousness, etc., is avoided to a considerable extent; verbs
describing behaviour are substituted for nouns and adjectives.
Here is an example from the American Council on Education
scale for rating college students 2:
Does X need constant prodding, or does he go ahead with his
work without being told?

|------|------|------|------|------|------|------|------|------|------|
Needs much     Needs occas-   Does ordin-    Completes      Seeks and sets
prodding in    ional prod-    ary assign-    suggested      for himself
doing ordin-   ding           ments of his   supplement-    additional
ary assign-                   own accord     ary work       tasks
ments

The rater merely puts a tick or cross at whatever point on the
line he thinks appropriate, but the experimenter can
measure off this position as accurately as he wishes. Similar
scales were used during the war for assessing the suitability of
officer candidates, or the efficiency of serving men and officers,
and are still considered the most satisfactory type. Generally
they embody several questions with only three steps or grades
1 Freyd, M., ‘The Graphic Rating Scale’. J. Educ. Psychol., 1923, 14,
83-102.
2 Cf. Bradshaw, F. F., ‘The American Council on Education Rating
Scale’. Arch. Psychol., 1930, 18, No. 119.
(occasionally two or four), these being couched as far as possible
in Service language. For example:
A. 1. Hard conditions tended to get him down.
   2. He accepted bad conditions cheerfully enough.
   3. He helped to keep up the men’s spirits when conditions
      were bad.
B. 1. He has a flair for improvising (tools, materials, etc.)
      in an unexpected difficulty.
   2. He is reasonably good at making the best use of what
      is to hand when things go wrong.
   3. He is lost without the usual tools, materials, etc.

The grading on any one question would naturally be too
coarse, but with half a dozen to two dozen or so questions,
each covering some different aspect of efficiency, a total score
can be derived, and these scores tend to show good
distributions. Another advantage of such scales is that inexperienced
raters find them easy to understand and apply. Even though
the number of questions looks formidable they can actually
be answered quite quickly. A minor objection is that they
use up a lot of paper. Major ones are that some raters
continue to be much more generous than others, and that
correlations between different raters of the same subjects are
still quite low.
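As a rough illustration of how a total score might be derived from such three-step questions, the sketch below sums the step chosen on each question. The scoring convention is an assumption, not something stated in the text: each question is scored 1 to 3 with 3 as the favourable end, and questions printed with the favourable statement first (like question B above) are reversed. The names `total_score` and `reversed_items` are hypothetical.

```python
def total_score(choices, reversed_items=(), steps=3):
    """Sum the steps chosen on a set of questions.

    choices        -- {question label: step ticked (1..steps)}
    reversed_items -- questions whose first step is the favourable one
    """
    total = 0
    for item, step in choices.items():
        if not 1 <= step <= steps:
            raise ValueError(f"step {step} out of range for question {item}")
        # Reverse the scoring where step 1 is the favourable statement.
        total += (steps + 1 - step) if item in reversed_items else step
    return total


# Question A runs from unfavourable to favourable, question B the other way.
print(total_score({"A": 3, "B": 1}, reversed_items={"B"}))  # most favourable answers
print(total_score({"A": 1, "B": 3}, reversed_items={"B"}))  # least favourable answers
```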
5. Analytic Scales. Our last example illustrates the device of
breaking up a general trait into a number of more specific
components, which are separately rated and the scores
combined. Such components should, of course, be relatively
independent of one another, in order to cover the whole scope
of the trait as efficiently as possible. However, halo is usually
so strong that most of the items within such a scale tend to be
rated in the same direction. Components regarded as more
important can, of course, be given higher weight in the total
score, though actually this makes so little difference that it is
seldom worth while. One would expect such scales to be more
objective, and to be more consistently answered by different
raters, because they avoid indefinite and equivocal trait-names.
But the evidence is not very favourable. In several studies,
the correlations between total scores given by two raters have
been around the same .5 level which is normally found for
general-trait ratings. (Similarly in the marking of essays,
analytic schemes are often found to be no more reliable than
impressionistic.)
A number of detailed scales, or third-person questionnaires
(i.e. personality questionnaires to be filled in by a rater) have
been published. Marston’s scale of twenty items indicative of
extraversion-introversion in children, and Bridges’s scales for
the social and emotional maturity of pre-school children, are
good examples.1 Laird’s C-3 test of introversion, Willoughby’s
Emotional Maturity scale, and Heidbreder’s questionnaires
on introversion and inferiority attitudes,2 have given rather
poor results, perhaps because they are designed for adults,
whose emotional traits are less overtly expressed than those
of children.
The writer would suggest, then, that breaking down a trait
into a small number of 3-point sub-scales may be worth while,
because most raters find this easy to answer, and because it
will usually yield a good distribution of total scores; but that
the inclusion of more than half a dozen items or aspects of a
trait is likely to be a waste of time. If several traits, presumed
to be distinct, are to be rated at the same time, each should be
covered by three or four items, and the complete scale tried out
on a typical sample of raters and ratees. If every item is now
inter-correlated with every other, the coefficients will show
whether they do group together as expected, whether some of
the items overlap so much that they are better combined or
reformulated, or whether some items presumed to represent
Trait A actually overlap more closely with Trait B items, and
so on. Either by means of factorial analysis, or by simply
studying average inter-correlations, a much improved
composite scale can thus be constructed.
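The simpler of the two approaches, studying average inter-correlations, might be sketched as follows. The item names, the ratings and the helper `mean_r` are hypothetical; the sketch only shows the comparison between the average correlation of items within a presumed trait and their average correlation with the items of another trait.

```python
from math import sqrt


def pearson(x, y):
    """Product-moment correlation between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def mean_r(items_a, items_b, data):
    """Average correlation of every item in items_a with every
    different item in items_b."""
    rs = [pearson(data[a], data[b])
          for a in items_a for b in items_b if a != b]
    return sum(rs) / len(rs)


# Hypothetical 3-point ratings of six ratees on two items per trait.
data = {
    "a1": [1, 2, 3, 2, 1, 3], "a2": [1, 2, 3, 3, 1, 2],
    "b1": [3, 1, 2, 1, 3, 2], "b2": [3, 1, 2, 2, 3, 1],
}
trait_a, trait_b = ["a1", "a2"], ["b1", "b2"]
print("within Trait A :", mean_r(trait_a, trait_a, data))
print("A with B items :", mean_r(trait_a, trait_b, data))
```

Items that belong together should show a clearly higher average correlation within their own trait than with the items of the other trait; an item for which the reverse holds is a candidate for reassignment or reformulation.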
6. Standardized and ‘Derived’ Scales. We generally assume,
without justification, that the various steps on a graphic, verbal,
1 Marston, L. R., ‘The Emotions of Young Children’. Univ. Iowa
Stud. Child Welfare, 1925, 3. Bridges, K. M. B., The Social and Emotional
Development of the Pre-School Child. London: Kegan Paul, 1931.
2 Laird, D. A., Personal Inventory, C-3. Hamilton, N.Y.: Hamilton
Republican, 1925. Willoughby, R. R., ‘A Scale of Emotional Maturity’.
J. Soc. Psychol., 1932, 3, 3-30. Scale published by Stanford University
Press, 1931. Heidbreder, E., ‘Measuring Introversion and Extroversion’.
J. Abn. Soc. Psychol., 1926, 21, 120-134. Heidbreder, E., ‘The Normal
Inferiority Complex’. J. Abn. Soc. Psychol., 1927, 22, 243-258.
or numerical scale are equidistant, i.e. that A is as much superior
to B as B is to C, etc. One technique of achieving equivalent
units is to apply Thurstone’s method of attitude scaling to a
long series of statements. As shown in Chap. IX, this enables
us to assign a rational numerical value to each of the statements
we select, and at the same time to eliminate ambiguous or
unsatisfactory statements. The following are three statements,
standardized on a 0 to 8 scale, for assessing the efficiency of
travelling salesmen 1:
(6.9) He is making exceptional progress.
(3.2) He is somewhat in a rut on some of his brand talks.
(5.6) He tends to keep comfortably ahead of his work
schedule.
Any of the statements that are thought to apply to the ratee
are checked, and their average scale value gives his score.
Willoughby’s Emotional Maturity scale 2 is similar.
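The scoring rule just stated (the average scale value of the checked statements) is simple enough to show directly. The three scale values are those quoted above; the function name `checklist_score` and the short statement keys are hypothetical, and returning `None` when nothing is checked is an assumption.

```python
def checklist_score(scale_values, checked):
    """Average scale value of the statements checked for one ratee."""
    if not checked:
        return None  # no statement applied, so no score can be given
    return sum(scale_values[s] for s in checked) / len(checked)


# Scale values quoted in the text for three salesman statements.
scale_values = {
    "exceptional progress": 6.9,
    "in a rut on brand talks": 3.2,
    "ahead of work schedule": 5.6,
}
score = checklist_score(
    scale_values, ["exceptional progress", "ahead of work schedule"]
)
print(score)
```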
A different form of scaling is represented by the Vineland
Social Maturity scale.3 This contains 117 items such as:
Reaches for familiar persons. (4 months.)
Dries own hands. (2½ years.)
Is trusted with money. (5½ years.)
Makes telephone calls. (10½ years.)
Provides for the future. (25 years.)
Each item has been proved to be typical of the average
(American) person of the age indicated. The scale is applied by a
trained examiner who obtains the required information in an
interview with a parent or someone else who knows the subject
well. He checks the items that apply, and works out a Social
Age and Quotient in the same manner as a Binet Mental Age
and I.Q. The subjective element is greatly reduced since the
1 Richardson, M. W., and Kuder, G. F., ‘Making a Rating Scale that
Measures’. Personnel J., 1933, 12, 36-40.
2 Op cit.
3 Doll, E. A., Vineland Social Maturity Scale. Vineland, N.J.: Training
School, Educational Test Bureau, 1936. Cf. Doll, E. A., ‘Preliminary
Standardization of the Vineland Social Maturity Scale’. Amer. J.
Orthopsychiat., 1936, 6, 283-293. It appears to be suitable for application
in Britain; cf. Kellmer Pringle, M. L., ‘Social Maturity and Social
Competence’. Educ. Rev., 1951, 3, 113-128, 183-195.
examiner, by skilful questioning, can discriminate fact from
interpretation in the informant’s statements. It is claimed that
different examiners, even interviewing different informants,
arrive at Social Ages with as good a reliability as .90. The
resulting S.Q.s tend to correlate rather highly with I.Q.s, but
there are good grounds for thinking that they also represent
some aspect of social competence and personality maturity
which is particularly relevant in certifying mentally defective
subjects. A similar scale for assessing altruism among children
is described by Turner.1
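Since the Social Quotient is said to be worked out in the same manner as a Binet I.Q., it can be illustrated with the classical ratio formula, 100 times the derived age divided by the chronological age. The sketch assumes that ratio form; the function name and the ages used are hypothetical, and the rules for arriving at the Social Age itself are not reproduced here.

```python
def social_quotient(social_age_months, chronological_age_months):
    """Ratio quotient: 100 x Social Age / chronological age."""
    return 100 * social_age_months / chronological_age_months


# A child aged 10 years whose checked items give a Social Age of 8 years.
print(social_quotient(8 * 12, 10 * 12))
```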
The Haggerty-Olson-Wickman Behavior Rating Schedules 2
represent another interesting development. A series of graphic
scales for rating thirty-five common traits was applied by
teachers to a group of children who had previously been very
carefully assessed by psychologists for personality
maladjustment. By analysing the numbers of well and poorly adjusted
children who received each particular rating, a maladjustment
index was derived for that rating. The Schedule could now be
applied to any fresh child, and the total indices calculated for
all his ratings in order to show his maladjustment. This has
the advantage that the teachers are not asked to assess
personality disorders as such, but only some of the more acceptable
social, emotional, and other traits. Thus biases and
misinterpretations of traits should be greatly reduced. Nevertheless
there is still considerable subjectivity, since the maladjustment
scores obtained from ratings by two teachers usually correlate
only to about .60. As Olson points out, the same set of ratings
could be standardized against other criteria; a series of scoring
keys could be built up in the manner of the Strong Interest
Blank (cf. p. 164). It would be of interest to try out this
technique in grammar school selection, where it is known that
ordinary personality ratings by primary school teachers are of
little value (cf. p. 25). If teachers were asked to check a series
of concrete statements (similar in form to the Army rating
scales, p. 108) about the work and behaviour of their pupils,
also about health and home environment, it might well be
1 Turner, W. D., ‘Altruism and its Measurement in Children’. J. Abn.
Soc. Psychol., 1948, 43, 502-516.
2 Published by World Book Co., Yonkers, N.Y., 1930. Cf. Olson,
W. C., Problem Tendencies in Children. Minneapolis: University of
Minnesota Press, 1930.
found empirically that some of the statements differentiated
pupils who later had successful and unsuccessful grammar
school careers. Only a proportion of the statements might be
diagnostic, but a really valuable measure might be constructed
from these. Such a research would be lengthy and would have
to be done on very large numbers, rated by many teachers, in
order to yield a reliable scoring key.
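The general idea of an empirically derived scoring key can be sketched as below. The text does not give the formula by which the maladjustment indices were computed, so the index used here (the percentage of children receiving a given rating who belonged to the poorly adjusted criterion group) is purely an assumption for illustration, as are the function names and the data.

```python
from collections import defaultdict


def derive_indices(ratings, poorly_adjusted):
    """For one trait, map each rating step to the percentage of children
    given that step who were in the poorly adjusted criterion group."""
    counts = defaultdict(lambda: [0, 0])  # step -> [poorly adjusted, total]
    for step, bad in zip(ratings, poorly_adjusted):
        counts[step][0] += int(bad)
        counts[step][1] += 1
    return {step: round(100 * bad / total)
            for step, (bad, total) in counts.items()}


def maladjustment_score(child_ratings, key):
    """Total of the indices attached to a fresh child's ratings."""
    return sum(key[trait][step] for trait, step in child_ratings.items())


# Hypothetical criterion sample: eight children rated on one trait.
ratings = [1, 1, 2, 2, 3, 3, 3, 3]
criterion = [False, False, False, True, True, True, True, False]
key = {"trait 1": derive_indices(ratings, criterion)}
print(key)
print(maladjustment_score({"trait 1": 3}, key))
```

The same ratings could be keyed against any other criterion, for example later success in the grammar school, simply by substituting a different criterion list.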
Somewhat similar is A. H. J. Baines’s 1 method of deriving
gradings of the efficiency of civil servants. A number of
supervisors of, say, clerical officers, rate these officers on a dozen
aspects of their work, on 3-step scales, and also give a final
general (5-step) grading. Correlations are calculated for the
sub-scales with one another and with the final grading, and
these show which sub-scales most closely predict general
efficiency. A total score is then based on the general grading
plus the best sub-scales, suitably weighted. Note that the
supervisors are not asked which qualities or aspects they think
relevant; by their own use of the scale they show the ones to
which they attach most importance.
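The first step of this method, finding which sub-scales most closely predict the final grading, might look like the following. The sub-scale names and ratings are hypothetical, and the weighting of the chosen sub-scales is not attempted, since the text does not specify it.

```python
from math import sqrt


def pearson(x, y):
    """Product-moment correlation between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def rank_subscales(subscales, final_grading):
    """Sub-scales ordered by their correlation with the final grading."""
    rs = {name: pearson(values, final_grading)
          for name, values in subscales.items()}
    return sorted(rs.items(), key=lambda pair: pair[1], reverse=True)


# Hypothetical 3-step sub-scale ratings and 5-step gradings of five officers.
subscales = {"accuracy": [1, 2, 3, 2, 3], "speed": [2, 2, 1, 3, 2]}
final_grading = [1, 2, 4, 3, 5]
ranked = rank_subscales(subscales, final_grading)
for name, r in ranked:
    print(f"{name}: {r:.2f}")
```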
The most sophisticated development of derived rating scales
is the Forced Choice type, which is employed, for example, in
U.S. Army officer report forms. Such scales contain several
blocks of items, like the following:
A go-getter who always does a good job.
Cool under all circumstances.
Doesn’t listen to suggestions.
Drives instead of leads.
The rater is instructed to pick out the item in each block
which is most characteristic and that which is least
characteristic of the individual he is rating. The two favourable items
are known, from previous trials, to be equally popular among
raters but to differ in their discriminatory power; one correlates
well, the other badly, with some criterion, say a set of
exceptionally thorough efficiency ratings. Similarly the two
unpopular items differ. Thus a scoring key is available which,
by contrasting the valid and non-valid items, reduces the
liability of the ratings to bias and halo. It is too early to say
whether the advantages of this technique outweigh the obvious
1 Cf. Anstey, E., ‘Staff Reporting in a Government Department’.
Occup. Psychol., 1950, 24, 200-229.
disadvantage that most raters find it highly irritating.1
Probably it has greater promise in the field of self-rating
questionnaires.
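One way in which a forced-choice key might be applied is sketched below. Everything specific here is an assumption: the text does not say which of the four sample items are the valid ones, nor how the official forms were scored. In this sketch a valid favourable item carries +1, a valid unfavourable item -1 and a non-valid item 0, and a block scores the key value of the item marked most characteristic minus that of the item marked least characteristic.

```python
def score_block(most, least, key):
    """Score one block from the items marked most and least characteristic."""
    return key[most] - key[least]


def forced_choice_total(choices, keys):
    """Sum the block scores; choices is a list of (most, least) pairs."""
    return sum(score_block(most, least, key)
               for (most, least), key in zip(choices, keys))


# Hypothetical key for the sample block (validity assignments are invented).
block_key = {
    "go-getter": +1,       # favourable, assumed valid
    "cool": 0,             # favourable, assumed non-valid
    "doesn't listen": -1,  # unfavourable, assumed valid
    "drives": 0,           # unfavourable, assumed non-valid
}
print(score_block("go-getter", "doesn't listen", block_key))
print(score_block("cool", "drives", block_key))
```

A rater who merely wishes to flatter gains nothing by choosing "cool" over "go-getter", since both look equally favourable to him but only one affects the score.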
7. Voting and Guess-Who Techniques. Ratings of pupils by
one another on ordinary rating scales are of little or no value.
Children are even less familiar than educated adults with the
meaning of traits, less able to think of traits quantitatively, less
able to observe objectively; and their suggestibility or
contra-suggestibility to the teacher or psychologist who asks for the
ratings may entirely distort the results. The idea of voting,
however, goes down more readily, and by the age of 9 years or
so they can pick out the two or three in their class who they
think are highest or lowest in various respects. Hartshorne
and May drew up a series of short character sketches, e.g. of
a very selfish, moderately selfish, an average, and an unselfish
child. Each member of the class was asked to guess whom
these represented. With so large a number of raters, a pupil’s
score for selfishness is readily obtained from the number of
times each sketch is assigned to him, and the scores show
satisfactory reliability. Here is an example from a scale used
in a British research on emotional stability with 10 to 12 year
olds 2:

These three people are always happy and enjoying themselves.
It is impossible to annoy them. They never change.

(1) ..............  (2) ..............  (3) ..............

Here are three people who are very changeable. You can
never depend on them. They are offended and
annoyed very easily.

(1) ..............  (2) ..............  (3) ..............
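Scoring such a guess-who form amounts to counting how often each sketch is assigned to each pupil. In the sketch below the weights attached to the two sketches (+1 for the stable sketch, -1 for the changeable one) are an assumption, as are the names and the ballots.

```python
from collections import defaultdict


def guess_who_scores(ballots, weights):
    """Sum, for every pupil named, the weights of the sketches
    assigned to him by his class-mates."""
    scores = defaultdict(int)
    for ballot in ballots:
        for sketch, names in ballot.items():
            for name in names:
                scores[name] += weights[sketch]
    return dict(scores)


weights = {"stable": +1, "changeable": -1}  # assumed weighting
ballots = [
    {"stable": ["Ann", "Bob"], "changeable": ["Cy"]},
    {"stable": ["Ann"], "changeable": ["Cy", "Bob"]},
]
scores = guess_who_scores(ballots, weights)
print(scores)
```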

1 Cf. Travers, R. M. W., ‘A Critical Review of the Validity and Rationale
of the Forced-Choice Technique’. Psychol. Bull., 1951, 48, 62-70. Haier,
D. E., ‘Reply to Travers’ “A Critical Review of the Validity and Rationale
of the Forced-Choice Technique”’. Psychol. Bull., 1951, 48, 421-434.
Recent work indicates that the most reliable results are obtained with
blocks of all-favourable items.
2 Connor, D. V., The Effect of Temperamental Traits upon Intelligence
Test Performance. Ph.D. Thesis, University of London, 1952.
Very similar are the well-known sociometric techniques of
Moreno,1 where each child writes the names of other pupils he
would like to sit next to, or to play with, and so on. The results
are generally used to present a picture of the social structure of
the group rather than to assess the popularity or other traits of
individuals. During the war the so-called nominations method
was often applied, particularly in America, for gauging the
suitability of a recruit, in the eyes of his fellows, for a com­
mission. Among relatively uneducated or unsophisticated
adults this very simple form of rating is as appropriate as it is
among school children. Obviously such techniques may yield
extremely biased judgments ; the reputation of a pupil in the
eyes of his class-mates may seldom accord with that held by his
teachers. Nevertheless it may possess useful validity. Thus
in MacArthur’s study of persistence (cf. p. 14), pupils’ ratings
agreed rather more highly than teachers’ with the composite
results of persistence tests. And follow-up results in the
Services indicate that the summed opinions of a recruit’s
fellows tend to have better predictive value than the
recommendation of a commanding officer.2
8. Ratings W ithin Persons. This refers to the rating or
ranking of a number of traits according to their prominence
within an individual, as contrasted with the ordinary procedure
of rating a number of individuals on a trait. In some ways it
is easier to judge whether a person is most outstanding for, say,
Sociability, Instability, etc., than to arrange numerous
individuals according to degrees of Sociability. Either the
traits may be ranked, or, if they number a dozen or more,
they can be sorted on a 5-, 7-, or 9-step scale (which are the
10% most outstanding traits in X, which the next 20% most
characteristic, and so on?). This method is most useful when
the number of traits is large, the number of individuals small,
or when none of the available raters is acquainted with more
than a few individuals. Burt and Stephenson 3 have shown
how the results can be analysed statistically by calculating

1 Moreno, J. L., ‘Who Shall Survive?’ Nerv. Ment. Dis. Monogr.,
1934, No. 88.
2 Cf. Jenkins, Bibliography.
3 Burt, C. L., ‘Correlations Between Persons’. Brit. J. Psychol., 1937,
28, 59-96. Stephenson, W., ‘The Inverted Factor Technique’. Brit. J.
Psychol., 1936, 26, 344-361.
correlations between different persons (Stephenson’s
‘Q-technique’), as distinct from correlations between different
tests or trait-ratings (‘R-technique’).
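The distinction can be made concrete with a small sketch: in correlating persons, each person's set of trait placements is treated as one variable, and pairs of persons are correlated across traits. The profiles below are hypothetical within-person sortings of five traits, and the function names are not taken from the text.

```python
from math import sqrt


def pearson(x, y):
    """Product-moment correlation between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def correlate_persons(profiles):
    """Correlation between every pair of persons, taken across traits."""
    names = list(profiles)
    return {(p, q): pearson(profiles[p], profiles[q])
            for i, p in enumerate(names) for q in names[i + 1:]}


# Each list gives one person's ranks on the same five traits.
profiles = {"X": [1, 2, 3, 4, 5], "Y": [1, 2, 3, 4, 5], "Z": [5, 4, 3, 2, 1]}
q_correlations = correlate_persons(profiles)
print(q_correlations)
```

Persons X and Y, whose traits stand in the same order of prominence, correlate positively, while Z, with the opposite ordering, correlates negatively with both.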

DEVICES FOR IMPROVING RATINGS

A number of steps have been shown to increase the reliability
of ratings, i.e. the agreement between different raters, though
little is known regarding their effects on validity. We have
considered already : (1) the superiority of relative to absolute
ratings ; (2) of graphic to numerical or letter scales; (3) the
possible advantages of breaking down traits into components.
(4). Choose only straightforward, unambiguous traits and
define them concretely, avoiding as far as possible terms
suggestive of approval or disapproval. Hollingworth and
others have published lists of relatively equivocal and un­
equivocal traits, though obviously much depends on how they
are defined. An interesting study by Stephenson 1 showed
how it is possible to analyse what a trait means to raters. He
got 10 teachers to judge the Reliability of 100 children, and
factor-analysed the correlations between the 10 sets of
judgments. About half the teachers agreed quite closely with one
another, while the other half agreed with each other but less
closely with the first group. Clearly there were two different
conceptions of Reliability. The first group appeared to base
their ratings chiefly on placid, submissive behaviour, whereas
the second type looked for more active and direct evidence of
the trait. Clearly, better definition was called for.
(5). When several traits are to be rated, rate all individuals
on one trait at a time, not each individual on all traits (unless
using within-persons technique). This is supposed to produce
greater independence between traits, i.e. less halo.
(6). However, if a rating form is being used for each individual,
containing numerous traits or items, then the items should be
arranged on the form so that the relatively desirable and
undesirable extremes alternate in random fashion. Otherwise
the rater is liable to go down the page checking the desirable
(or undesirable) answers or grades throughout.
1 Stephenson, W., ‘Introduction to Inverted Factor Analysis, with
Some Applications to Studies in Orexis’. J. Educ. Psychol., 1936, 27,
353-367.
(7). Warn the raters of the nature of halo and encourage them
to avoid it. One way of eliminating it is to sum the ratings of
each individual on all traits and to regard this as a measure of
halo or general popularity, which can then be subtracted from
the separate trait-ratings, or held constant by partial correlation
or factor analysis. Since, however, different traits of which
society approves certainly do overlap to some extent, this will
tend to over-correct. Nevertheless, in a research by the writer,
this device did improve the validity of sets of ratings when
compared with trait-composites. An alternative would be to
get raters to estimate their personal liking for or dislike of each
ratee, in addition to assessing his traits, and to remove the
influence of this from the total ratings. Though this greatly
oversimplifies the nature of halo, it would be of some help.
Hartshorne and May, and Chi,1 suggest that halo differs among
different judges, hence we could estimate the true overlapping
between traits by correlating A’s rating of Trait 1 with B’s
rating of Trait 2, and so on. The differences between such
correlations and the inter-correlations of A’s (or B’s) ratings of
all traits, would provide a measure of halo. The flaw in the
argument is that A’s and B’s biases are only too likely to overlap,
especially if they are both people with a similar relation to the
ratees, say teachers. And even raters with such different
outlooks as teachers, parents, and pupils are liable to be
similarly influenced by liking, or by common confusions about
the meanings of traits. In other words, the correlation between
Traits 1 and 2, when rated by several judges, is made up of:
genuine overlapping between the behaviour included in the
traits, plus confusions and biases common to all, or to
any pair of, raters.
It is reduced by confusions and biases which affect individual
raters only. This is why, in Chap. I, little credence was attached
to factorial studies of ratings. They throw more light on the
linguistic problem of how people interpret traits than they do
on the structure or organization of personality traits as such.
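The first device mentioned under (7), treating each ratee's total over all traits as a measure of halo and removing it from the separate trait-ratings, can be sketched as follows. One liberty is taken and should be noted: the ratee's mean rating is subtracted rather than his raw sum, so that the adjusted figures stay on the scale of a single rating. The names and ratings are hypothetical.

```python
def remove_halo(ratings):
    """Express each trait-rating as a deviation from the ratee's own
    average over all traits (taken here as his halo level)."""
    adjusted = {}
    for ratee, traits in ratings.items():
        halo = sum(traits.values()) / len(traits)
        adjusted[ratee] = {trait: r - halo for trait, r in traits.items()}
    return adjusted


ratings = {
    "X": {"industry": 5, "tact": 3, "initiative": 4},
    "Y": {"industry": 2, "tact": 2, "initiative": 2},
}
adjusted = remove_halo(ratings)
print(adjusted)
```

As the text warns, this is liable to over-correct: ratee Y, rated uniformly low, is left with no information at all after the adjustment, although some of that uniformity may be genuine.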
(8). Nevertheless it is certainly an advantage in any
experimental or practical application of ratings, to obtain independent
judgments from 2 or more judges, since this helps to cancel out
1 Chi, P. L., ‘Statistical Analysis of Personality Ratings’. J. Exper.
Educ., 1937, 5, 229-245.
individual prejudices. There is little point in having more
than 4 or 5, unless the rating scale is very coarse (as in voting
among pupils). By inter-correlating the sets of judgments and
applying the Spearman-Brown prophecy formula, it is easy to
determine how many judges are needed in order that their
combined rating may reach an acceptable level of reliability,
say .9. The writer would suggest, however, that diversity of
judges is more important than number. The ideal would be to
have 2 or more of each type of judge, for example, 2 teachers,
2 psychologist-observers, 2 relatives, and a group of pupils,
and to aim for high agreement between the judges within
type, but lower correlations (representing different viewpoints)
between types.
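The Spearman-Brown calculation referred to under (8) is easily illustrated. If r is the average correlation between single judges, the reliability of the combined rating of k judges is kr / (1 + (k - 1)r), and solving for k gives the number of judges needed to reach a target reliability R, namely R(1 - r) / (r(1 - R)). The function names below are hypothetical, and the small tolerance used before rounding up is an implementation detail to guard against floating-point error.

```python
from math import ceil


def spearman_brown(r, k):
    """Reliability of the pooled rating of k judges, given the
    average inter-judge correlation r."""
    return k * r / (1 + (k - 1) * r)


def judges_needed(r, target):
    """Smallest number of judges whose pooled rating reaches the
    target reliability (requires 0 < r < 1 and 0 < target < 1)."""
    k = target * (1 - r) / (r * (1 - target))
    return ceil(k - 1e-9)  # tolerance avoids rounding 9.0000000001 up to 10


print(spearman_brown(0.5, 4))    # four judges who inter-correlate .5
print(judges_needed(0.5, 0.9))   # judges needed to reach .9
```

With single judges agreeing only at about the .5 level commonly found for ratings, a pooled reliability of .9 calls for more judges than the 4 or 5 suggested as a practical limit, which gives some point to the emphasis on diversity rather than number.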
(9). Raters should be trained in the use of the scale, and if
they are required to use it frequently (for industrial,
educational or other purposes), their distributions, reliability, etc.
should be checked periodically. Normally they can be assured
that the ratees will not know what they say about them, so
that they can be completely candid. But their attitudes to the
person requiring the ratings should also be considered, and their
full co-operation sought. In fact, much the same difficulties
occur here as with self-ratings and personality questionnaires
(cf. Chap. VIII). For example, raters are likely to be defensive
about attributing undesirable traits to their friends. During
the war it was noticed that instructor officers at a training
school would willingly agree that some of their cadets were weak
in efficiency, social adjustment, intelligence, etc. But similar
officers in command of units to which these cadets were posted
would give much higher ratings because, consciously or
unconsciously, they resented the suggestion that their own unit
could contain any inefficient people.
(10). It is usually stated that raters should have had plenty
of opportunity to observe the kind of behaviour they are rating,
and should be well acquainted with the ratees. Obviously there
is some truth in this: an interviewer who merely talks to a
prospective employee for half an hour can hardly assess his
practical skills or the dependability of his character. But the
tendency for an individual to express his personality in
everything he does, also the tendency for closer acquaintanceship to
lead to more rigid and biased opinions and greater halo, should
not be forgotten. Thus the evidence, so far as it goes, actually
suggests that an impartial observer and interviewer (particularly
if he has applied performance tests or analogous exercises which
provoke significant behaviour) can give at least as reliable and
valid ratings as a close friend. Ferguson 1 compared ratings of
travelling salesmen by managers who were acquainted with
them to varying degrees, and did obtain better ratings from the
better acquainted; but all of them, presumably, would be
fairly distant in their relations to the ratees. Slawson 2 found
that a period of observation before rating improved the
reliability of judgments to some extent. But Knight 3 showed that
intimacy or length of acquaintanceship led to more over-rating
and greater halo. Presumably a more superficial knowledge is
also more detached. In Newcomb’s 4 research at a boys’
summer camp, careful records were kept of actual behaviour,
and these were used for checking the accuracy of raters who
had observed some, but not other, kinds of behaviour. The
validity of the ratings on observed traits was represented by
correlations of .54 and .45, while for traits which had not been
observed but were inferred the correlations were .39 and .40.
The latter figures are somewhat lower, but to a statistically
insignificant extent.

JUDGING ABILITY 5

It is dangerous to generalize about the goodness of different
types of raters, since there is seldom any criterion of accuracy
except other ratings. If rater A coincides more closely with
B, C, and D than does rater B with A, C and D, this shows that
the conformity of A’s judgments is higher, not that his intuitive
skill is superior. Moreover, there appear to be large variations,
depending on the particular traits to be judged or the manner
in which the judgments are given, and on whether the ratees
1 Ferguson, L. W., ‘The Value of Acquaintance Ratings in Criterion
Research’. Amer. Psychologist, 1948, 3, 290.
2 Slawson, J., ‘The Reliability of Judgment of Personal Traits’. J. Appl.
Psychol., 1922, 6, 161-171.
3 Knight, F. B., ‘The Effect of the “Acquaintance Factor” upon
Personal Judgments’. J. Educ. Psychol., 1923, 14, 129-142.
4 Op cit., p. 95.
5 An excellent review of this field is given by R. Taft in an as yet
unpublished thesis (University of California, 1950). The generalizations
listed here are partly based on his intensive experimental investigation.
are new acquaintances or old friends.1 Nevertheless, there is
fairly strong evidence that ‘ good ’ judges are not so much
outgoing, socially intelligent people as rational, analytic, in
some respects introverted. (Adams 2 found that the good judges
of self tend to be extraverted ; for the extravert is more
detached about his own personality, but he may be too inter­
ested in others to judge them impartially.) The good rater
tends also to be above average in intelligence and in personality
maturity, and integration. Though in some situations artistic
inclinations seem helpful, there is stronger evidence that
natural scientists are superior to social science or arts students.
They are superior also to psychologists, except in so far as the
judgments involve knowledge of technical terminology. There
is no support for the belief in feminine superiority. Raters are
generally more successful in judging people of similar age, sex,
and cultural background to themselves. Hollingworth and
others provide some evidence that people who are high in a
desirable trait tend to rate it better, and that the reverse holds
for undesirable traits. Thus the most ‘ snobbish ’ are not good
at rating ‘ snobbishness ’. Finally, the degree of confidence
that a rater expresses in his judgments is a very poor criterion
of their accuracy.
People vary also in their judg-ability, i.e. the extent to which
several raters agree about them, and in the extent to which
ratings of their different traits are influenced by, or free from,
halo. The former is sometimes considered to show their open­
ness vs. enigmaticness, the latter their mediocrity vs. indivi­
duality. But there is little evidence to confirm such suppositions.

OTHER METHODS OF EXPRESSING JUDGMENTS OF PERSONALITY

In view of the difficulties of getting useful ratings in education
or industry, where few raters are likely to know large numbers
and where thorough training is seldom possible, we should
enquire into the value of less formal methods, such as the
testimonial and the free personality sketch. The ordinary
1 Cf. Vernon, P. E., ‘Some Characteristics of the Good Judge of
Personality’. J. Soc. Psychol., 1933, 4, 42-58.
2 Adams, H. F., ‘The Good Judge of Personality’. J. Abn. Soc.
Psychol., 1927, 22, 172-181.
testimonial or reference is notoriously superficial and unreliable,
especially when given to the individual himself to pass on to
prospective employers. The trust placed in more confidential
references generally depends on the employer’s knowledge of,
and respect for, the writer. Actually only one careful experiment
seems to have been carried out, and this gave rather promising
results.1 Confidential references are obtained for candidates
for the higher Civil Service from 5 persons, representing their
schools, universities, employers, and private acquaintances.
Sets of reports on 268 candidates were graded by 13 Civil
Service Selection Board staff members, without any further
information about the candidates. The average inter-correlation
between the grades was .67, and their average correlation with
the Final Board decision was .48. Thus predictions based on
testimonials were just about as good as those based on examination
results and objective tests of abilities.
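The ‘average inter-correlation’ quoted above can be illustrated by a minimal sketch, in which every pair of judges is correlated and the coefficients are averaged. The grades below are invented for illustration and are not the Civil Service data; the function names are likewise hypothetical.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Product-moment correlation between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def mean_intercorrelation(grades_by_judge):
    """Average the correlations over every pair of judges."""
    pairs = list(combinations(grades_by_judge, 2))
    return sum(pearson(a, b) for a, b in pairs) / len(pairs)


# Hypothetical grades given by three judges to the same five candidates.
grades = [
    [1, 2, 3, 4, 5],
    [2, 1, 3, 5, 4],
    [1, 3, 2, 4, 5],
]
print(round(mean_intercorrelation(grades), 2))
```

With thirteen judges, as in the experiment described, the same function would average 78 pairwise coefficients.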
If the confidential testimonial has been somewhat under­
rated, the case-study and personality sketch written by a
psychologist or psychiatrist has perhaps been over-valued. It
is based chiefly on interview, supplemented by observation of
behaviour at tests or, in the case of mental patients, in the
wards, and by information from relatives and acquaintances.2
The value of such a study is sometimes judged by its com­
prehensiveness, and by the consistency of the personality
structure that it reveals. But this provides no guarantee that
it is not distorted by the writer’s subjective interpretation of
the complex mass of evidence. We have already referred to
discrepancies between different psychologists or psychiatrists
in discussing the interview (Chap. II). An experiment was
carried out at a War Office Selection Board,3 where pairs of
psychiatrists interviewed candidates independently and
observed them at analogous exercises, finally assessing their
suitability for commissions. Their mean inter-correlation of .65
was distinctly lower than the .86 obtained by military officers
who merely observed the exercises. In the Civil Service Selection
Board follow-up, the psychologist's final judgment correlated
as highly as .87 with the Final Board decision, and his
1 Cf. Vernon, P. E., ‘The Validation of Civil Service Selection Board
Procedures’. Occup. Psychol., 1950, 24, 75-95.
2 For a useful account, and references, cf. Strang, Bibliography.
3 Cf. Vernon and Parry, Bibliography.
validity coefficient with efficiency gradings after 2 years was .49.
But these figures were no higher than those of the non-
psychological staff members. On the other hand, the studies of
vocational guidance conducted by the National Institute of
Industrial Psychology (cf. p. 29) show that psychologists’
judgments based on a free synthesis of interview and other
material possess good, though certainly not perfect, validity.
The following conclusions seem to emerge regarding the best
way of obtaining useful information from associates about a
candidate for employment, or from teachers about a pupil
being considered for promotion to secondary schooling, etc.
A set of ratings should be asked for, probably in the graphic or
questionnaire form described on pp. 107-9. The number of
questions, say ten to twenty with three answers each, should be
kept as few as possible, consistent with covering most of the
relevant aspects of personality. If the various questions can
be validated and keyed against an external criterion of later
success, in the manner of the Haggerty-Olson-Wickman or
forced-choice scales, so much the better. But the main object
of the ratings will usually be to force the judge to consider the
subject from as many angles, and as systematically and
objectively, as possible. Thereafter he should be asked to
write a free personality description, testimonial, or case-study,
explaining and commenting on his ratings, mentioning supporting
evidence, and filling in what seem to him the main
gaps.
VIII
Self-Ratings and Personality Questionnaires
An individual’s written account of his past behaviour,
feelings and wishes obviously constitutes an important
source of information about his personality. G. Allport 1 points
out the value of diaries, creative writings, and other personal
documents in the ‘clinical’ study of the individual personality;
and psychiatrists and clinical psychologists often require their
patients to write autobiographies. The interpretation of such
material is inevitably as subjective as that of oral interview
responses, and it cannot readily be treated quantitatively. It
was hoped, therefore, that self-ratings and the answers to
standard questions would overcome these difficulties.
Ratings by the individual of his own traits may be obtained
by the same techniques as are used in rating others. It is
commonly found that they deviate even more from associates’
ratings than associates’ do from one another, and that most
people overrate themselves considerably on desirable traits, i.e.
they possess a favourable halo towards their own personalities.
Allport finds that university students tend to make themselves
as interesting as possible, for example, overrating their
radicalism, introversion, and emotionality; and Husen 2 shows
that the qualities valued by self-raters vary with their education
and social background. But just as associates’ ratings may be
improved by analytic scales and third-person questionnaires, so
self-ratings have developed into the innumerable personality
inventories and paper-and-pencil tests.
These questionnaires contain anywhere from 10 to 223 items
(or even more in multiple scales), thus covering a wide range of
presumed manifestations of some trait, say introversion-extraversion.
The individual’s score is based on the total
1 Allport, G. W., The Use of Personal Documents in Psychological Science.
New York: Social Science Research Council, 1942.
2 Husen, T., ‘The Popular Conception of Personality as Revealed in
Self-Ratings’. Essays in Psychology Dedicated to D. Katz. Stockholm,
1951.
questions, usually unweighted, which he answers in the introverted
direction. The questions tend to be highly personal and
embarrassing, and we shall see later that so much depends on
the testee’s attitude to the test and his interpretation of the
questions, that the results are of very dubious value.
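The unweighted scoring just described amounts to counting the answers that agree with a key. A minimal sketch follows; the item names, the key and the testee's answers are all hypothetical and are not taken from any published inventory.

```python
def keyed_score(answers, key):
    """Count the answers given in the keyed (e.g. introverted) direction."""
    return sum(1 for item, response in answers.items() if key.get(item) == response)


# Hypothetical key: the response to each item that counts as introverted.
introversion_key = {
    "daydreams": "Yes",
    "enjoys_parties": "No",
    "blushes_easily": "Yes",
}

# Hypothetical answers from one testee.
testee = {
    "daydreams": "Yes",
    "enjoys_parties": "Yes",
    "blushes_easily": "Yes",
}

print(keyed_score(testee, introversion_key))
```

Each keyed answer counts one point, so two testees may reach the same total from quite different items.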
The test items are generally made up in the first place (or
borrowed from other tests) to accord with the author’s conception
of the trait. Thereafter they are always pruned or
standardized by one of the stock item-analysis techniques 1 ;
for a small number of good items is more easily answered and is
likely to discriminate better than a long and miscellaneous
test.2
(a) Internal consistency techniques: these show whether all
the items correlate with the testees’ total scores, i.e. whether
they all measure the same presumed trait reliably. Note that
reliability is used now in the sense of consistency of a testee’s
answers to different questions, not (as in the case of associates’
ratings) as meaning agreement with anybody else’s judgments.
(b) Factor analysis of inter-item correlations again shows
whether all items are measuring the same variable, or whether
they should be sub-divided into two or more sets measuring
distinct traits. Guttman’s ‘scale analysis’ 3 has occasionally
been used to give an even stricter check on the homogeneity or
unidimensionality of the items.
(c) Items are analysed against some external criterion. They
are retained or dropped, or scoring weights are determined, by
their success in differentiating, say, neurotic patients from
normals. The forced-choice technique (cf. p. 112) is promising
in this field, but has so far seldom been applied.4
1 Cf. Vernon, P. E., ‘Indices of Item Consistency and Validity’. Brit. J.
Psychol. Statist. Sec., 1948, 1, 152-166.
2 Popular weeklies sometimes publish short questionnaires claiming to
show whether the reader is an easy person to get on with, or possesses
other traits. Almost certainly these have not been item-analysed,
standardized or validated in any scientific way.
3 Guttman, L., ‘On Festinger’s Evaluation of Scale Analysis’. Psychol.
Bull., 1947, 44, 451-465.
4 In a recent investigation by Gordon, an ordinary and a forced-choice
questionnaire for measuring Ascendancy, Hypersensitivity, Responsibility
and Sociability were compared with associates’ ratings. The mean
‘validity’ coefficients were .34 for the former and .56 for the latter.
Gordon, L. V., ‘Validities of the Forced-Choice and Questionnaire Methods
of Personality Measurement’. J. Appl. Psychol., 1951, 35, 407-412.
(d) Several persons besides the author judge the suitability
of each item. An extension of this is the application of the
Thurstone attitude-scaling technique (cf. Chap. IX).
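Of the four approaches listed, the internal consistency check under (a) lends itself most readily to a short illustration. The sketch below correlates each item with the total of the remaining items (a common variant of the item-total correlation, adopted here as an assumption so that an item is not correlated with itself) and retains items above an arbitrary cut-off. The response matrix and the cut-off of .30 are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Product-moment correlation between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def item_rest_correlations(responses):
    """Correlate each item with the sum of all the other items."""
    n_items = len(responses[0])
    correlations = []
    for i in range(n_items):
        item = [row[i] for row in responses]
        rest = [sum(row) - row[i] for row in responses]
        correlations.append(pearson(item, rest))
    return correlations


# Hypothetical data: six testees (rows) answering four items (columns),
# 1 = answered in the keyed direction, 0 = not.
responses = [
    [1, 1, 1, 0],
    [1, 1, 1, 1],
    [1, 1, 0, 0],
    [0, 1, 0, 1],
    [0, 0, 0, 1],
    [0, 0, 0, 0],
]

CUTOFF = 0.30  # arbitrary illustrative threshold
correlations = item_rest_correlations(responses)
retained = [i for i, r in enumerate(correlations) if r >= CUTOFF]
print(retained)
```

In this invented example the fourth item correlates negatively with the rest and would be pruned, while the first three hang together.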
Most of the hundred or more tests that have been published
are modifications or extensions of three prototypes: Woodworth’s
Personal Data Sheet, Freyd-Heidbreder’s Introversion-
Extraversion test, and Allport’s A-S (Ascendance-Submission)
Reaction Study. Instead of trying to give a comprehensive
list, we shall outline these, and mention briefly others which
have been widely used, or which embody special points of
technique.

TESTS OF PSYCHONEUROTIC TENDENCY AND EMOTIONAL INSTABILITY

The 116 items in Woodworth’s 1 test were originally derived
from medical psychologists’ descriptions of the symptoms of
neurotic patients. The following are some representative
examples :
Do you usually feel well and strong ?
Do you ever walk in your sleep ?
Have you ever had fits of dizziness ?
Did you have a happy childhood ?
Do you know of anybody who is trying to do you harm ?
Does it make you uneasy to cross a bridge over a river ?
Have you ever been afraid of going insane ?
Have any of your family had a drug habit ?

Each question is followed by ‘Yes, No’, one of which is to
be checked. Mathews, Cady,2 and others have adapted the test
for use with children, and Burt 3 publishes a British version
which, however, he recommends as an interview aid rather than
as a quantitative test. Laird’s 4 Personal Inventory B-2
1 Woodworth, R. S., Personal Data Sheet. Chicago: Stoelting, 1920.
2 Mathews, E., ‘A Study of Emotional Stability in Children’. J. Delinq.,
1923, 8, 1-40. Cady, V. M., ‘The Estimation of Juvenile Incorrigibility’.
J. Delinq. Monogr., 1923, No. 2.
3 Burt, C. L., The Subnormal Mind. Oxford University Press, 1935.
4 Laird, D. A., ‘Detecting Abnormal Behavior’. J. Abn. Soc. Psychol.,
1925, 20, 128-141. Inventories published by The Hamilton Republican,
Hamilton, N.J., 1925.
contains similar items, but with multiple-choice (graphic)
responses, e.g.:
Have you (during the past few weeks) been afraid of responsibility ?
avoided it | accepted when forced upon me | did not mind it | liked it | welcomed it
The most widely used pre-war test, Thurstone’s 1 Personality
Schedule, contains 223 items collected from Woodworth, Laird,
and other sources. Percentile norms are available for college
students. Many other shorter and simpler tests were devised
during the war, and used with some success in screening recruits
who might be liable to neurotic breakdown. These included
the National Defence Research Council’s inventory (NDRC
Short Format), the Neuropsychiatric Screening Adjunct (NSA),
and the Cornell Index,2 in America; also the Maudsley Medical
Questionnaire,3 and the Sutton Booklet or Bennett-Slater 4 test
in Britain. The latter is a composite test, in ten sections, whose
items are cleverly disguised. Three sections deal with symptoms
of anxiety, hysteria, and depression; but in about half the
questions a negative, instead of a positive, response indicates
neurotic tendencies, so that the testee who wants to create a
good impression cannot merely check ‘No’ throughout. Four
sections contain lists of various types of annoying situations:
(1) Frustration of self-assertion, e.g. ‘Somebody tells you
how to do your job’.
(2) Personal inadequacy, e.g. ‘You forget what you’re
looking for’.
(3) Dirt or untidiness, e.g. ‘An unmade bed’.
(4) Noise, e.g. ‘The sound of hammering’.
1 Thurstone, L. L., and Thurstone, T. G., ‘A Neurotic Inventory’. J. Soc.
Psychol., 1930, 1, 3-30. Test published by University of Chicago Press, 1929.
2 Cf. Office of Scientific Research and Development, Human Factors in
Military Efficiency. Washington, D.C.: National Defence Research
Council, 1946. Stouffer, S. A., et al., Studies in Social Psychology in World
War II, Vol. IV. Measurement and Prediction. Princeton, N.J.: Princeton
University Press, 1950. Weider, A., Mittelmann, B., et al., ‘The
Cornell Selectee Index’. J. Amer. Med. Assoc., 1944, 124, 224-228. Test
published by Psychological Corporation, New York, 1948.
3 Cf. Eysenck, Bibliography.
4 Bennett, E., and Slater, P., ‘Some Tests for the Discrimination of
Neurotic from Normal Subjects’. Brit. J. Med. Psychol., 1945, 20, 271-282.
Evidence is given to show that neurotics check items of types (2)
and (4) as annoying much more often than normals do, whereas
(1) and (3) affect normals and neurotics alike. Thus scores are
based, unknown to the testee, on the differences between these
sections. Finally, three sections are adapted from Pressey’s
Cross-Out test (cf. p. 175). They contain lists of words where
the testee crosses out anything:
(1) For which people should be blamed, e.g. ‘Flirting,
Speeding’.
(2) Which he has worried about, e.g. ‘Loneliness, Falling’.
(3) In which he is interested, e.g. ‘Football, Comedians’.
Neurotics are likely to give many answers to (1) and (2), but
relatively few to (3).
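The idea of scoring the annoyance sections by differences can be sketched as follows. The text does not give the actual formula used in the Bennett-Slater test, so the simple subtraction below, the section names and the counts are all assumptions made for illustration only.

```python
def annoyance_difference(checked):
    """Difference score from counts of items checked as annoying.

    `checked` maps each (hypothetically named) section to the number of
    items the testee marked. Sections said to separate neurotics from
    normals are weighed against those said to affect both alike.
    """
    discriminating = checked["personal_inadequacy"] + checked["noise"]
    non_discriminating = checked["frustration"] + checked["dirt"]
    return discriminating - non_discriminating


# Hypothetical counts for one testee.
testee_counts = {
    "frustration": 6,
    "personal_inadequacy": 9,
    "dirt": 5,
    "noise": 8,
}
print(annoyance_difference(testee_counts))
```

A testee who simply checks few items throughout gains nothing, since only the contrast between sections enters the score.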

INTROVERSION AND ASCENDANCE TESTS

Freyd-Heidbreder Test.1 Freyd collected fifty-four items
descriptive of the introvert type from Jung’s writings, of which
the following are samples :
Blushes frequently ; is self-conscious.
Day-dreams.
Prefers to read a thing rather than experience it.
Shrinks when facing a crisis.
Is reticent and retiring; does not talk spontaneously.
Is slow in movement.
Keeps in the background on social occasions.
Heidbreder turned these into a self-rating test, where the testee
checks each item +, ? or -. Laird’s 2 Personal Inventory C-2
and various other adaptations are available. Other tests
such as Neymann-Kohlstedt’s and Root’s (described and used
in England by Wyatt and Langdon),3 were constructed so
1 Freyd, M., ‘Introverts and Extroverts’. Psychol. Rev., 1924, 31,
74-87. Heidbreder, E., ‘Measuring Introversion and Extroversion’.
J. Abn. Soc. Psychol., 1926, 21, 120-134.
2 Op. cit.
3 Neymann, C. A., and Kohlstedt, K. D., ‘A New Diagnostic Test for
Introversion-Extroversion’. J. Abn. Soc. Psychol., 1929, 23, 482-487.
Root, A. R., ‘A Short Test of Introversion-Extroversion’. Personnel J.,
1931, 10, 250-253. Wyatt, S., and Langdon, J. N., ‘Fatigue and Boredom
in Repetitive Work’. Industr. Hlth. Res. Board Rep., No. 77. London:
H.M. Stat. Office, 1937.
that items discriminated between schizophrenic and manic-depressive
patients. The assumption that these psychotic
groups represent the extremes of normal introversion and
extraversion is very dubious; in fact, Eysenck offers an
experimental disproof (cf. p. 36). Hence this type and the
Freyd-Heidbreder type of test give very poor correlations with
one another.
A similar scale for schizothymia-cyclothymia, based on
Kretschmer’s work, is published by Scholl.1
Allport's A-S Reaction Study.2 Here the items were made
up to represent concrete manifestations of dominatingness
(ascendance) or submissiveness, and were standardized by
comparing the answers of students who had been rated by
associates as highly ascendant or submissive. The following
are examples; the numbers show the weighted scores for
ascendance:
A salesman takes manifest trouble to show you a quantity of
merchandise; you are not entirely suited; do you find it difficult
to say ‘No’ ?
    Yes, as a rule    -1
    Sometimes          0
    No                +1

If you hold an opinion the reverse of that which a lecturer has
expressed in class, do you usually volunteer your opinion ?
    In class          +8
    After class        0
    Not at all        -8

An alternative form is available for women, and an adaptation
for children has been prepared.

TESTS OF OTHER TRAITS

A test for Inferiority Feelings, based on Adler’s writings,
has been compiled by Heidbreder,3 along the same lines
as her introversion test. Bernreuter 4 published a test of
1 Scholl, R., ‘Untersuchungen über die teilinhaltliche Beachtung von
Farbe und Form bei Erwachsenen und Kindern’. Zs. f. Psychol., 1927,
101, 225-280.
2 Allport, G. W., ‘A Test for Ascendance-Submission’. J. Abn. Soc.
Psychol., 1928, 23, 118-136. Allport, F. H., and Allport, G. W., A-S
Reaction Study. Boston: Houghton Mifflin, 1928.
3 Heidbreder, E., ‘The Normal Inferiority Complex’. J. Abn. Soc.
Psychol., 1927, 22, 243-258.
4 Bernreuter, R. G., ‘The Measurement of Self-Sufficiency’. J. Abn.
Soc. Psychol., 1933, 28, 291-300.
Self-sufficiency vs. Dependence on Others. Maslow 1 has
developed tests of Security-Insecurity and of Self-esteem or
Dominance feeling (the latter for women only), on the basis of
clinical studies of well and poorly adjusted students. Jasper,
and Chant and Myers 2 have tests of Depression-Elation, the
latter being scaled by Thurstone’s technique. Its items range
from:
Everything in the world is against me (Score 0.9) to,
Life could not be better for me (10.7).
Willoughby’s E-M (Emotional Maturity) Scale 3 contains
similarly standardized items, e.g.:
S develops affective difficulty in the presence of a necessity for
precise and realistic thinking, e.g. mathematics. (Score 2)
S organizes and orders his efforts in pursuing his objectives,
evidently regarding systematic method as a means of
achieving them. (7)
This is intended primarily for third-person application, e.g. for
ratings of a patient by a psychiatrist, but can also be used for
self-rating (at a sophisticated level).
Wang’s 4 Persistence test contains items which, in the
opinion of 75 judges, should differentiate the persistent and
non-persistent person. Cason 5 originated the Annoyances
1 Maslow, A. H., et al., ‘A Clinically Derived Test for Measuring
Psychological Security-Insecurity’. J. Gen. Psychol., 1945, 33, 21-41.
Maslow, A. H., ‘A Test for Dominance-Feeling (Self-Esteem) in College
Women’. J. Soc. Psychol., 1940, 12, 255-270. Social Personality Inventory
for College Women. Stanford, Cal.: Stanford University Press, 1942.
2 Jasper, H. H., ‘The Measurement of Depression-Elation and its
Relation to a Measure of Extraversion-Introversion’. J. Abn. Soc. Psychol.,
1930, 25, 307-318. Chant, S. N. F., and Myers, C. R., ‘An Approach to
the Measurement of Mental Health’. Amer. J. Orthopsychiat., 1936, 6,
134-140.
3 Willoughby, R. R., ‘A Scale of Emotional Maturity’. J. Soc. Psychol.,
1932, 3, 3-36. Test published by Stanford University Press, 1931, now
out of print.
4 Wang, C. K. A., ‘A Scale for Measuring Persistence’. J. Soc. Psychol.,
1932, 3, 79-90.
5 Cason, H., ‘An Annoyance Test and Some Research Problems’.
J. Abn. Soc. Psychol., 1930, 25, 224-236.
test (adapted for the Bennett-Slater questionnaire, p. 125). It
lists 217 situations which the testee rates from 3 (extremely
annoying) to 0 (not annoying). The average score can be used
as a measure of Irritability.
Wallen’s 1 test of Food Aversions consists of twenty foods
which are ticked for liking or disliking. Normal adults dislike
an average of one or less, whereas neurotics average three to
five aversions. Eysenck has found the test effective in this
country also.

MULTIPLE TESTS

Tests such as Woodworth’s or Thurstone’s obviously contain
a wide range of symptoms drawn from many different neurotic
or psychotic conditions. It would be theoretically possible for
several testees to give neurotic answers to entirely different
sets of, say, twenty items, and yet get the same score and be
labelled equally neurotic or unstable. Attempts have been
made to classify the Thurstone Schedule items,2 e.g. under
Extravert-Introvert, Physical Disorders, Fantasy, etc., but
these tend to inter-correlate too highly to be accepted as
distinct. In Laird’s B-2 Inventory the items are classified as
Psychasthenoid, Schizophrenoid, and Neurasthenoid.
Cattell 3 published a questionnaire with separate sets of items
for seven syndromes: Neurasthenia, Anxiety Neurosis, Anxiety
Hysteria, Conversion Hysteria, Obsessive-Compulsive, Epileptoid,
and Paranoid. Better known is Hathaway and McKinley’s
Minnesota Multiphasic Personality Inventory (MMPI),4 which
is widely used in mental hospitals in this country as well as
America. Its 550 statements are more varied than usual,
including some dealing with interests and social attitudes.
They are generally presented individually, on separate cards,
1 Wallen, R., ‘Food Aversions of Normal and Neurotic Males’. J. Abn.
Soc. Psychol., 1945, 40, 77-81.
2 Cf. Willoughby, R. R., ‘Some Properties of the Thurstone Personality
Schedule and a Suggested Revision’. J. Soc. Psychol., 1932, 3, 401-424.
3 Cattell, R. B., A Guide to Mental Testing. London: University of
London Press, 1936.
4 Hathaway, S. R., and McKinley, J. C., ‘A Multiphasic Personality
Schedule (Minnesota). I. Construction of the Schedule’. J. Psychol.,
1940, 10, 249-254. Inventory published by Psychological Corporation,
New York, 1942.
and sorted by the patients into ‘True, False, and Cannot Say’
boxes. It may take anywhere from 30 minutes to several hours
to complete. On the basis of the responses of 500 normal
adults (16 to 55 years) and 800 miscellaneous patients, a series
of empirical scoring keys has been developed, so that a profile
is obtained showing relative scores on: Hypochondriasis,
Depression, Hysteria, Psychopathic Deviate, Masculinity-Femininity,
Paranoia, Psychasthenia, Schizophrenia, and Hypomania.
Note that the significance attached to an item depends,
not on its manifest content, but on its correlation with an
external criterion. Other keys can be and are being developed
by various authors, e.g. for differentiating academically unsuccessful
from successful students.1 Four additional scores
provide checks on self over-evaluation, malingering, and other
sources of unreliability. The authors do not claim that the
profile will provide an automatic differential diagnosis of
patients. The MMPI is a clinical instrument which requires
considerable skill in interpretation, and even then is likely to
agree with the psychiatric diagnosis only in 60% of cases
(according to critics the figure is less than 50%). It is more
successful in differentiating abnormal persons in general from
normal. While experienced testers can acquire a remarkably
detailed insight into a personality from the pattern of scores,
this approach is open to all the weaknesses mentioned in
Chaps. I and II.
Another test with a psychiatric background is the Humm-Wadsworth
Temperament Scale.2 It aims to measure seven
‘components’ distinguished by Rosanoff: Normal, Hysteroid,
Manic Cycloid, Depressive Cycloid, Autistic Schizoid, Paranoid,
and Epileptoid. Scoring keys were based on the responses of
groups of patients, criminals, and normals known to be strong
or weak on these components. Some of the 318 items score
for more than one component, while others are not scored at
all, since they did not differentiate between any of the groups;
but they were left in the final form of the test for the sound
reason that the value of an item may be affected by its context.
1 Cf. Gough, H. G., ‘Factors Relating to the Academic Achievement of
High-School Students’. J. Educ. Psychol., 1949, 40, 65-78.
2 Humm, D. G., and Wadsworth, G. W., ‘The Humm-Wadsworth
Temperament Scale’. Amer. J. Psychiat., 1935, 92, 163-200. Scale
published by Humm Personnel Service, Los Angeles, Cal., 1940.
This has the disadvantage of making the test rather long, the
average time for answering being 55 minutes. Another interesting
feature is that the proportion of Yes’s and No’s to the
test as a whole provides a check on the testee’s conscientiousness.
Negativistic persons tend to give an undue number of No’s,
highly suggestible people too many Yes’s. Humm insists that
it is the pattern or profile of scores on all the components,
considered in the light of these checks, which enables the well-trained
tester to diagnose personality trends. He has applied
the test widely in business and industry and quotes striking
instances of correct detection of dishonesty or character
weakness among employees. Other writers, however, quote
only moderate or poor validities.1
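The check on the proportion of Yes’s and No’s described above can be sketched in a few lines. The limits of 25 and 75 per cent used here are arbitrary illustrative values, not those of the Humm-Wadsworth scale, and the function name is hypothetical.

```python
def response_bias(answers, low=0.25, high=0.75):
    """Return the proportion of 'Yes' answers and a rough verdict.

    `low` and `high` are illustrative limits only; a published scale
    would fix such limits from its own standardization data.
    """
    proportion_yes = sum(1 for a in answers if a == "Yes") / len(answers)
    if proportion_yes < low:
        verdict = "excess of No's"
    elif proportion_yes > high:
        verdict = "excess of Yes's"
    else:
        verdict = "acceptable"
    return proportion_yes, verdict


# Hypothetical record: nine No's and a single Yes.
record = ["No"] * 9 + ["Yes"]
print(response_bias(record))
```

A record flagged in this way would not be scored mechanically but referred to the tester for interpretation, in keeping with Humm's insistence on the trained reading of the whole profile.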
The Bernreuter Personality Inventory 2 has achieved enormous
popularity in America, with little justification. It claims to
measure four traits: Neurotic tendency, Introversion-Extraversion,
Dominance-Submissiveness, and Self-Sufficiency. Each
of its 125 items (mostly taken over from Thurstone, Heidbreder
and Allport, etc.) is scored for each trait. For example the
responses to: ‘Do you day-dream frequently ?’ are scored:
            Neurotic   Introversion   Dominance   Self-Sufficiency
Yes            +5           +8            -1              +1
No             -4           -4            +1              -1
Doubtful       -2            0            +2              +2

These weights were obtained empirically by contrasting the
responses of high and low scorers on four established scales:
Thurstone’s, Laird’s C-2, Allport’s, and Bernreuter’s own Self-Sufficiency
test. In this instance the external criteria for item-validation
are highly fallible, and the resulting inventory scores
show odd features. Several experiments have demonstrated
that Neurotic and Introverted are almost identical, correlating
to +.93, and that Dominance is nearly the reverse of both, its
1 Cf. Dorcus, R. M., ‘A Brief Study of the Humm-Wadsworth Temperament
Scale and the Guilford-Martin Personnel Inventory in an Industrial
Situation’. J. Appl. Psychol., 1944, 28, 302-307. Guilford, J. P., and
Lacey, J. I., Printed Classification Tests. Army Air Forces Aviat. Psychol.
Prog. Res. Rep. No. 5. Washington, D.C.: U.S. Government Printing
Office, 1947.
2 Bernreuter, R. G., ‘The Theory and Construction of the Personality
Inventory’. J. Soc. Psychol., 1933, 4, 387-405. Inventory published by
Stanford University Press, 1931.
correlations being -.81 and -.67. Self-Sufficiency is relatively
distinct, though overlapping moderately with Dominance; its
correlations are -.41, -.32, and +.58. Flanagan 1 applied
factor analysis to these scores and showed that the test was in
effect measuring two, not four things. The first, a compound
of Neurotic, Introverted and low Dominant, and Self-Sufficient
scores, seems to represent general Lack of Self-Confidence. A
second and smaller factor may be denoted as Sociability.
Flanagan constructed a fresh set of keys so that responses could
be scored for these two factors.
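The practice of scoring every response on several keys at once, as in the Bernreuter Inventory, can be sketched as follows. The two items and all the weights are invented for the illustration and are not Bernreuter's or Flanagan's published values.

```python
TRAITS = ("neurotic", "introversion", "dominance", "self_sufficiency")

# Hypothetical weights: for each item and response, one weight per trait,
# given in the order of TRAITS.
WEIGHTS = {
    "daydream": {
        "Yes": (2, 1, -1, 0),
        "No": (-2, -1, 1, 0),
        "Doubtful": (0, 0, 0, 1),
    },
    "lead_group": {
        "Yes": (-1, -2, 3, 1),
        "No": (1, 2, -3, -1),
        "Doubtful": (0, 0, 0, 0),
    },
}


def score_profile(answers):
    """Sum the weights of the chosen responses separately for each trait."""
    totals = [0] * len(TRAITS)
    for item, response in answers.items():
        for k, weight in enumerate(WEIGHTS[item][response]):
            totals[k] += weight
    return dict(zip(TRAITS, totals))


# Hypothetical answers from one testee.
profile = score_profile({"daydream": "Yes", "lead_group": "No"})
print(profile)
```

Because the same answers feed every key, scores built in this way are bound to be correlated, which is the overlap that the factor analysis brought to light.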
Bell's Adjustment Inventory 2 is another test widely used at
high school and college level. Its 160 items are grouped under
four headings: Home, Health, Social, and Emotional Adjustment.
Boyd’s Personality Questionnaire 3 is the only one to have
been used at all extensively among British university students.
Its 120 items are classified under twenty headings or traits,
including the following:
Trait                      Sample Question
Obsessional Carefulness    Do you often go over a job again and again
                           to make it just right ?
Worry, Anxiety             Do you brood long over humiliating or
                           unhappy experiences ?
Suspiciousness             Do you sometimes suspect that people are
                           talking about you ?
Self-consciousness         Are you greatly interested in what goes on in
                           your own mind ?

Testees are not told about these traits, and the questions
are so arranged that they are unlikely to guess that six
deal with carefulness, six with worry, etc. Each question is
answered Yes, Yes ?, 0, No ? or No, or omitted; and is
scored 4 to 0. Thus there is a possible range of 24 to 0 for
each trait. Naturally the traits overlap considerably, and a
factor analysis by the writer of the scores of 100 students
1 Flanagan, J. C., Factor Analysis in the Study of Personality. Stanford,
Cal.: Stanford University Press, 1935.
2 Bell, H. M., Adjustment Inventory. Stanford, Cal.: Stanford University
Press, 1934.
3 Boyd, W., ‘A New Personality Test’. Scot. Educ. J., 1939, Sept.
1st-15th, 998-999, 1014-1010, 1024-1025.
indicated that they could be boiled down to three or four
distinct tendencies:
(1) Self-depreciatory and psychoneurotic tendency: a general
factor particularly strong in the scores for Depression, Instability,
Worry, Lack of Self-Control, Shrinking Responsibility,
and Lack of Confidence.
(2) ‘Care-freeness’, most marked in Shrinking Responsibility,
Suggestibility, Inability to Concentrate, Lack of Definite
Interests, and in low scores on Worry, Self-Consciousness,
Emotional Thinking, Dissociation, and Tenseness.
(3) ‘Scrupulousness’, most marked in Obsessional Carefulness,
Acting Readily without Pressure, Suspiciousness, Self-Control,
and Low Instability, Emotional Thinking, and
Inability to Concentrate.
(4) A sex difference factor.
These results indicate the dangers of taking questions at
their face value. Even when multiple tests are devised to
measure traits distinguished by factor analysis, the resulting
scores tend to be far from distinct. An example of this is
provided by the:
Guilford-Martin Temperament Profile Chart.1 In several
publications Guilford has analysed correlations between
typical items from extraversion-introversion tests, and claimed
to break down this trait into separate components. Two of his
tests attempt to measure such factors, namely:

Inventory of Factors STDCR      Inventory of Factors GAMIN
Social Introversion             General Pressure for Overt Activity
Thinking Introversion           Ascendancy in Social Situations
Depression                      Masculinity of Attitudes and Interests
Cycloid Disposition             Inferiority Feelings
Rhathymia (Carefreeness)        Nervous Tenseness and Instability

Actually the correlations between these sub-scores are so
high that, in the absence of validation against external criteria,
they all seem to measure much the same introversion-neuroticism
1 Guilford, J. P., and Guilford, R. B., ‘Personality Factors S, E, and M,
and their Measurement’. J. Psychol., 1936, 2, 109-127. Martin, H. G.,
‘The Construction of the Guilford-Martin Inventory of Factors
G-A-M-I-N’. J. Appl. Psychol., 1945, 29, 298-300. The three inventories
published by Sheridan Supply Co., Beverly Hills, Cal., 1940-1948.

factor as does the Bernreuter Inventory.1 A third test included
in Guilford’s Profile Chart is his Personality Inventory I. This
aims to pick out trouble-makers and paranoid individuals in
business and industry by means of three groups of items
measuring : Objectivity, Agreeableness, and Co-operativeness.
It has been shown to correlate moderately with ratings of
employees,2 but whether it would work equally well as a
selection test, when testees are on the defensive, is not known.
Cattell's 16 P.F. Test.3 Perhaps the most ambitious questionnaire
is the one constructed by Cattell to cover the twelve
‘source traits’ of his personality factor analysis, together with
four additional traits (radical-conservative, self-sufficiency,
will-control, and nervous tension). There are two parallel
forms, each containing 187 items. They are designed for college
student level, but a children’s edition is being prepared. The
amount of overlapping among the sixteen scores is not stated,
and no evidence of validity is so far available.

QUESTIONNAIRES FOR CHILDREN

The reactions of children, particularly under 14 years, to
personal questions are even more unpredictable than those of
adults, and we would strongly deprecate the use of such tests
except in experiments conducted by trained psychologists. In
an attempt to reduce introspectiveness, contrasuggestibility, or
other undesirable attitudes, some American questionnaires
have adopted third-person questions, similar to those of
Guess-Who ratings.
Maller's Character Sketches 4 contains 200 short descriptions;
the testee has to say whether or not he feels or acts like the
person described. For example:
------This person insists on having his own way and likes to
command and rule everybody.
------This person finds it difficult to forget unpleasant
memories and can’t help thinking about them.
1 Cf. Lovell, C., ‘A Study of the Factor Structure of Thirteen Personality
Variables’. Educ. Psychol. Measmt., 1945, 5, 335-350.
2 Cf. Dorcus, op. cit., p. 131.
3 Cattell, R. B., et al., The 16 Personality Factor Questionnaire.
Champaign, Ill.: Institute of Personality and Ability Testing, 1950.
4 Maller, J. B., Character Sketches, 1932; Personality Sketches, 1936.
New York: Teachers College, Columbia University, Bureau of Publications.
Each item is repeated in reverse form, elsewhere in the test, as
a check on consistency; e.g.:
------This person never insists on having his own way and
does not like to command and rule everybody.
All questions have been shown to differentiate significantly
between groups of 308 delinquent or problem cases and 310
normal pupils. They are classified under the following six
headings, which are admitted to overlap to a moderate extent:

Desirable character traits          Personal adjustment (freedom from anxiety)
Self-control and integration        Mental health (freedom from psychotic
                                    or neurotic symptoms)
Social adjustment (extraversion)    Readiness to confide in others

A later edition, Personality Sketches, consists of 100 items,
presented on cards, so that no written response is required.
Pintner’s Aspects of Personality 1 contains thirty-five items
for measuring each of three traits: ascendance-submission,
extraversion-introversion, and emotionality. It is sufficiently
simply worded to be applicable from about 10 to 15 years.
Sanders,2 in Australia, has developed a test for individual
application to 9- to 13-year-old boys, containing third-person
items dealing with Physical and Economic Insecurity, Social
Under-evaluation, and Non-Social Tendencies. The two latter
sections are highly correlated. He finds that delinquents tend
to give high scores on all sections, and that anxious or neurotic
boys surpass normals on the second and third.
Several other inventories and questionnaires, either for
American adolescents or for the long-suffering college student,
are listed and critically reviewed in Buros’s Yearbooks.

BIOGRAPHICAL INVENTORIES

Hollingworth, in the 1920s, studied the predictive value of
responses to application blanks by candidates for employment,
1 Pintner, R., and Forlano, G., ‘Validation of Personality Tests by
Outstanding Characteristics of Pupils’. J. Educ. Psychol., 1939, 30,
25-32. Test published by World Book Co., Yonkers, N.Y., 1938.
2 Sanders, C., ‘Insecurity and Social Maladjustment in Children’.
Brit. J. Educ. Psychol., 1948, 18, 148-155.
and found that a weighted score derived from the most valid
questions might correlate well with subsequent success. Such
validation necessitates very large numbers, thus this technique
was useful in selecting pilots, officers, or other large groups in
the Second World War.1 Inventories were devised containing
a hundred or more multiple-choice questions which covered a
wide range of mainly factual information—educational and
occupational career, financial status, skills and trade experience,
home background, marital record, athletic and leisure activities,
health, etc. The relevance of each answer was determined by
giving the inventory to groups of, say, successful and failing
pilots, and a purely empirical scoring key was developed. The
same inventory, with a different key, could be used for navigators
or other groups. Note that this type of questionnaire does not
set out to measure any specified trait or traits. But it was
certainly the most useful of all the personality measures tried
out in the U.S. Army Air Force. Unfortunately it is applicable
only when the job requirements and the type of applicant
remain constant over a long period, and it would probably
work only with adults of good intelligence. It might well be
tried in the selection of university students.
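
The empirical keying described above can be illustrated with a minimal sketch. Everything in it is hypothetical: the two questions, the answer options, and the tiny criterion groups are invented, and the weighting rule (the difference between the proportions of successful and failing men choosing each answer) is only one simple possibility, not a claim about the procedure actually used in the wartime inventories.

```python
from collections import Counter

def build_key(successful, failing):
    """Weight each (question, answer) pair by how much more often it is
    chosen in the successful group than in the failing group."""
    def proportions(group):
        counts = Counter((q, a) for person in group for q, a in person.items())
        return {pair: n / len(group) for pair, n in counts.items()}

    p_success = proportions(successful)
    p_fail = proportions(failing)
    pairs = set(p_success) | set(p_fail)
    return {pair: p_success.get(pair, 0.0) - p_fail.get(pair, 0.0) for pair in pairs}

def score(applicant, key):
    """Sum the empirical weights of the answers an applicant gave."""
    return sum(key.get((q, a), 0.0) for q, a in applicant.items())

# Invented criterion groups; real keys required very large numbers.
successful = [
    {"sport": "team", "schooling": "secondary"},
    {"sport": "team", "schooling": "college"},
    {"sport": "none", "schooling": "college"},
    {"sport": "team", "schooling": "college"},
]
failing = [
    {"sport": "none", "schooling": "secondary"},
    {"sport": "none", "schooling": "secondary"},
    {"sport": "team", "schooling": "secondary"},
    {"sport": "none", "schooling": "college"},
]

pilot_key = build_key(successful, failing)
applicant = {"sport": "team", "schooling": "college"}
applicant_score = score(applicant, pilot_key)
```

Re-running build_key with different criterion groups (say, successful and failing navigators) would give a different key for the same answer sheets, which is the sense in which the scoring is purely empirical.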

DISCUSSION OF PAPER-AND-PENCIL
PERSONALITY TESTS

The weaknesses of the questionnaires described above seem
so obvious that their use in countries other than America has
been confined almost entirely to a few tentative experimental
investigations. But in America they have not only been used
in hundreds of studies with pupils and students, but are (or at
least used to be) applied regularly in mental hospitals, clinics,
and schools, for detecting problem cases and neurotics, and for
educational or vocational guidance. They are so easily made
up and given to large numbers that it seems to be forgotten
that a count of emotional responses is a very different matter
from a count of right answers to intelligence or educational tests.
However, since the publication of a highly critical article by
1 Cf. Guilford and Lacey, op. cit., p. 131. Vernon and Parry,
Bibliography. Stuit, D. B. (edit.), Personnel Research and Test Development in
the Bureau of Naval Personnel. Princeton, N.J.: Princeton University
Press, 1947.
Ellis, there are signs of greater caution, and of some realization
of the importance of the testees’ attitudes to the tests.
The majority of the questions deal with personal matters
which one might discuss with a sympathetic and trusted friend,
or a psychoanalyst, but would certainly hesitate to commit to
writing for some relatively unknown tester to read. Many
experiments have in fact shown that when people do not have
to give their names they admit to larger numbers of symptoms
of maladjustment.1 Interesting studies by Smith, and Sletto,2
compared items set in a positive or socially acceptable form with
similar ones in negative form, e.g.:
Feels people speak well of him and like him.
Feels people criticize him and dislike him.
It was found, (a) that more people admitted that the positive
form did not apply than that the negative form did apply;
(b) that the correlation between scores on the two types was
not high ; (c) that the negative type showed higher internal
consistency. Presumably this means that negative items all
tend to arouse a suspicious, hostile, or defensive attitude.
Hence testees answer them all alike, regardless of their real
meaning. Positive items, however, are considered more
calmly, and so evoke a greater diversity of response and lower
consistency. Thus the common application of the internal
consistency technique of item analysis tends to produce a
piling up of negative items.3
Many tests, such as Allport’s A-S, try to draw on recollections
of objective behaviour rather than on feelings, and it has been
shown that answers to such items tend to be more stable or less
likely to alter if the test is repeated (although their internal
consistency may be lower). But affective reactions naturally
enter into these just as, in the case of external ratings, halo
continues to operate however objectively the traits are defined.
Although the name of the trait or traits at which a test is aimed
1 Cf. Ellis, Bibliography.
2 Smith, R. B., ‘The Development of an Inventory for the Measurement
of Inferiority Feelings at the High School Level’. Arch. Psychol., 1932,
22, No. 144. Sletto, R. F., ‘A Critical Study of the Criterion of Internal
Consistency in Personality Scale Construction’. Amer. Sociol. Rev., 1936,
1, 61-68.
3 Cf. Willoughby, R. R., and Morse, M. E., ‘Spontaneous Reactions to
a Personality Inventory’. Amer. J. Orthopsychiat., 1936, 6, 562-575.
is usually withheld, and some non-committal title like Inventory
or A-S Study is given, it is obvious that testees will make their
own guesses as to the object of the test, and will answer each
question not so much at its face value as in accordance with
their own interpretation of the object, and with how much
they are willing to reveal.
This explains why, on the one hand, different tests of nominally
the same trait, given on different occasions, often show
remarkably poor agreement, whereas on the other hand tests of
nominally different traits given in the same experiment
(particularly if combined together into a multiple test) tend to
show much too high correlations. In fifty-five studies listed by
Ellis, the median correlation between tests claiming to measure
the same trait was .40. The present writer found a figure not
much lower, namely .34, in fifty-eight studies where introversion,
psychoneurotic tendency and submission-ascendance were
inter-correlated. The far higher overlapping in the Bernreuter
and other multiple tests has already been pointed out.1 This
strong general factor which runs through the responses to
extremely diverse questions in any one testing situation surely
reflects the testee’s halo and general attitude to this situation.
It is not only the subject’s willingness to co-operate that
affects his responses, but also his unconscious resistances.
People literally do not know themselves well enough to answer
many of the questions correctly; their responses are only too
likely to be rationalizations or unwitting self-deceptions.
Alexander 2 has compared questionnaires with psychoanalytic
techniques, pointing out that the psychoanalyst would never
expect to get valid information from direct questions or
introspections, where conscious criticism is at its maximum. Another
important factor is suggestion. The well-known experiments
on testimony show how liable to falsification are our recollections
of emotionally toned experiences, and how easily suggestive
questioning may lead us to accept experiences as our own which
never really occurred. Thus while most testees may be expected,
1 A relevant finding by the Research Branch, Information and Education
Division of the U.S. Army, was that its NSA test of 15 items dealing with
psychosomatic symptoms was as effective in screening neurotic recruits as
a multiple test of over 100 items, which were carefully compiled to cover
all the main aspects of maladjustment.
2 Alexander, F., ‘Evaluation of Statistical and Analytical Methods in
Psychiatry and Psychology’. Amer. J. Orthopsychiat., 1934, 4, 438-448.
wittingly or unwittingly, to disguise their emotional weaknesses
in answering personality questionnaires, others of a more
suggestible type may greatly exaggerate. This was brought out in
a study by Hollingworth,1 who applied the Woodworth inventory
to groups of soldiers in a mental hospital shortly before the
1918 armistice, and to other similar groups shortly after. The
average incidence of neurotic symptoms was about twice as
great in the former, presumably on account of their conscious
or unconscious fear of being returned to active service.
Yet another factor appears to raise scores in the well-
educated and academically minded. It is a remarkable fact
that university students and professional people obtain much
higher average psychoneurotic and introverted scores than do
the relatively uncultured ; not infrequently they are found to
be as unstable as neurotic and psychotic mental hospital
patients. (There is a slight tendency also for the better students
to be more introverted and neurotic, though the evidence on
this point is somewhat contradictory.) Conceivably such
persons are more neurotic than the less educated classes, but
it is just as likely that they are also more self-analytic, more
used to verbalizing their emotional experiences, more willing
to admit to themselves and to the tester the possession of the
symptoms which the tests describe. Thus the explanation of
the high consistency and overlapping of questionnaire tests
probably lies in these various distorting attitude factors. High
scorers are not necessarily the most neurotic, introverted,
submissive, lacking in confidence, nor given to fantasy, shrinking
responsibility, instability, and inferiority feelings; they may
be the more sophisticated and introspective, or the more
suggestible, or the more willing to co-operate with the tester.
The influence of temporary mood—optimism, worry, etc.—
might also be thought to affect test responses. An experiment
with the Bernreuter test by Johnson 2 showed that this does
occur, though only to a very slight extent. Objections are often
raised, again, to the rigid limitation of testees’ responses to Yes,
No, or at most to five steps for each item. It is true that this
worries many educated testees, whose natural reactions to the
1 Hollingworth, H. L., The Psychology of Functional Neuroses. New
York: Appleton, 1920.
2 Johnson, W. B., ‘The Effect of Mood on Personality Traits as Measured
by Bernreuter’. J. Soc. Psychol., 1934, 5, 515-522.
questions are infinitely varied. Eisenberg 1 carried out an
introspective study of what different subjects meant when they
selected a given response to a typical question, and showed that
there were enormous variations. However, this is less serious
than it sounds since variations in interpretation will tend to be
random, and to cancel one another out when total scores are
considered. It is when all the variations take the same
direction, as when the ‘tough-minded’ or the resentful testee
underestimates his instability throughout, that they upset the
test results.
A much less obvious distortion enters whenever testees are
allowed to omit questions, to answer by question mark, or to
give various grades of response. Some people are much more
cautious or non-committal than others. For example, in the
Boyd questionnaire it was found by the writer that the
proportions of extreme responses (definite Yes’s and No’s as contrasted
with Yes?, 0, or No?) ranged from 17% to 98%. Cronbach 2
calls this factor ‘response-set’ and shows that it is fairly
persistent or general from one test to another. (It occurs also,
of course, in associates’ ratings and in the markings of
examinations, where some markers adopt a much wider spread than
others.) In ability tests, as we have seen, it may give some
indication of impulsiveness-caution, but its psychological
significance in personality questionnaires is very dubious. For
the most part it is a quite irrelevant feature of the testees’
interpretations of the test instructions, which nevertheless
operates to raise all high, or lower all low scores. Such tests as
MMPI and Humm-Wadsworth try to provide checks, and
recognize that abnormal sets may invalidate all test scores.
Cronbach recommends the use of forced-choice items in order
to eliminate it.
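
The index of extreme responding mentioned for the Boyd questionnaire can be sketched as follows. The five response labels are taken from the passage; the function name and the two sample answer sheets are invented for illustration and do not reproduce any actual data.

```python
def extreme_response_proportion(responses):
    """Fraction of answered items given a definite 'Yes' or 'No'
    rather than a qualified 'Yes?', '0', or 'No?'."""
    extreme = {"Yes", "No"}
    if not responses:
        raise ValueError("no responses to analyse")
    return sum(r in extreme for r in responses) / len(responses)

# Two invented testees: one non-committal, one decisive.
cautious = ["Yes?", "0", "No?", "Yes?", "No", "0"]
decisive = ["Yes", "No", "No", "Yes", "Yes?", "Yes"]

cautious_index = extreme_response_proportion(cautious)   # 1 of 6 items
decisive_index = extreme_response_proportion(decisive)   # 5 of 6 items
```

Two testees with the same underlying feelings could thus differ widely on this index, and since graded responses are usually scored by their extremity, the difference would carry over into their total scores.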

VALIDITY

However, the constructors of questionnaires can justifiably
answer that speculations about testees’ attitudes and
interpretations do not concern them, provided that the tests work.
1 Eisenberg, P., ‘Individual Interpretation of Psychoneurotic Inventory
Items’. J. Gen. Psychol., 1941, 25, 19-40.
2 Cronbach, L. J., ‘Further Evidence on Response Sets and Test
Design’. Educ. Psychol. Measmt., 1950, 10, 3-31.
The important thing is whether they correlate with other
evidence of the traits. There is no need for us to survey
validation studies in detail, since this has been done by Ellis.
He outlines the results of 880 relevant investigations and
concludes that, with two exceptions, these are generally
unfavourable to personality questionnaires. Perhaps his
standards of ‘favourable’ are unduly high. Thus he regards a
correlation of .4 as only ‘questionably positive’. We would
agree that this level of validity is too low for making predictions
about individuals, but it does indicate that questionnaires
have some value, especially if combined with other kinds of
tests. In just over half of the following 217 investigations the
results were positive or questionably positive:

Comparisons of the scores of behaviour problem
children and of delinquents with normal children:    (24 out of 43)
Comparisons of neurotic and psychotic adults
with normals:                                        (45 out of 75)
Correlations with ratings by associates on the
traits at which the tests are aimed:                 (22 out of 44)
Correlations with other tests of the same traits:    (27 out of 55)
The fourth category consists entirely of comparisons with other
questionnaires; thus it indicates weak reliability rather than
moderate validity. However, in the present writer’s research,
the mean correlation of five questionnaires with trait composites
(chiefly made up of ratings and objective tests) was .45.
More striking than the average result is the wide range of
success and failure of questionnaires, even in similar investigations.
This does not seem to be mainly due to differences in the
value of different questionnaires, although Ellis does show that
the Bell and Bernreuter Inventories, the Thurstone Schedule,
and Woodworth questionnaire tend to be the least successful.
More probably it is a matter of the subjects’ attitudes. Two
studies of the Thurstone Schedule among college students
provide an interesting contrast. Hanna 1 gave it to 179 students
who applied for psychological or vocational guidance at a
college clinic, who, therefore, presumably thought that it
1 Hanna, J. V., ‘Clinical Procedure as a Method of Validating a Measure
of Psychoneurotic Tendency’. J. Abn. Soc. Psychol., 1934, 28, 435-445.
would help them if they answered it really frankly. Independent
estimates of their emotional stability were made by clinical
psychologists and good agreement was found with the Schedule
scores (corresponding to a correlation of over .5). Moran,1 on
the other hand, showed that 41 students who were classified
as maladjusted scored scarcely any higher than 146
well-adjusted students. But here the Schedule (in abridged form)
was taken along with various tests of abilities at the beginning
of the college year, so that many students may have thought
that the authorities would be influenced by the kind of picture
they drew of themselves.
Two exceptions were mentioned to Ellis’s general condemnation.
The first is the Minnesota Multiphasic Inventory. A
considerable majority of studies have shown significant differences
between various abnormal groups and normals, though
the validity of the test for differentiating among different
abnormal types is less well attested. Usually, of course, the
test is administered individually, and patients are more likely
to answer as they would to oral questioning by a psychiatrist.
Secondly, there is no doubt that such inventories as the NDRC,
NSA, Cornell and others were of value in screening abnormal
recruits during the war. The great majority of seventy-one
studies showed significant agreement with subsequent psychiatric
diagnoses. Harris’s 2 study of 2081 naval recruits who
answered NDRC and Cornell (in about 6 minutes) is representative.
Of these 297, or 14.3%, scored above a certain
borderline, and 52 of them were discharged after psychiatric
interview or as a result of referrals during 10 weeks’ training;
another 16, not ‘caught’ by the tests were also discharged.
Thus three-quarters of the discharges were diagnosed, and
11.7% of acceptable recruits were incorrectly picked out.
Similar tests were applied experimentally, but not regularly,
in the British Services. On the whole they were less successful.
Thus Eysenck found some correlation between the Maudsley
Medical Questionnaire and neuroticism among recruits, but
this was smaller than that of the body-sway suggestibility, leg
persistence, dark vision, and other objective tests.
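
The percentages quoted above from Harris's study can be rechecked from the four counts given. The sketch below uses only those counts; the variable names are illustrative. On these figures the quoted 11.7% matches the wrongly flagged men taken as a share of all 2081 recruits, whereas as a share of the 2013 recruits who were not discharged the figure comes to about 12.2%.

```python
recruits = 2081
flagged = 297              # scored above the borderline
flagged_discharged = 52    # flagged and subsequently discharged
missed_discharged = 16     # discharged but not flagged by the tests

discharged = flagged_discharged + missed_discharged    # 68 men
acceptable = recruits - discharged                     # 2013 men
false_positives = flagged - flagged_discharged         # 245 men

flag_rate = flagged / recruits                    # about 0.143
hit_rate = flagged_discharged / discharged        # about 0.765
fp_of_all = false_positives / recruits            # about 0.118
fp_of_acceptable = false_positives / acceptable   # about 0.122
```

The hit rate of roughly three-quarters and the large number of false positives relative to true cases illustrate why such tests served as a preliminary screen before interview rather than as a diagnosis.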
1 Moran, T. F., ‘A Brief Study of the Validity of a Neurotic Inventory’.
J. Appl. Psychol., 1935, 19, 180-188.
2 Harris, D. H., ‘Questionnaire and Interview in Neuropsychiatric
Screening’. J. Appl. Psychol., 1946, 30, 644-648.
Ellis and Conrad point out that such tests always produced
large numbers of ‘false positives’, i.e. normals with high
neuroticism scores. Nevertheless, they did save a considerable
amount of psychiatric interviewing time by picking out most
of the potential neurotics. Much of their good validity could
be attributed to ‘criterion contamination’, that is, to the fact
that they asked questions similar to those asked orally in a
psychiatric interview; indeed often the psychiatrists would
know, and be influenced by, the test scores. Thus the results
were far less favourable in thirty-six studies where tests were
checked against objective criteria such as failure in training.
For example, several tests tried out on USAAF pilots gave very
small or zero correlations with learning to fly. No satisfactory
investigations of their value in predicting breakdown in combat
appear to have been carried out.
In so far as questionnaires do work better in military than
civilian contexts, it is probably due to the tremendous
heterogeneity of samples of recruits, and to better motivation.
Recruits may be more candid, either because of military
discipline, or because they may assume that scores indicating
abnormality will be to their advantage. Further, the better
tests were carefully constructed from items each of which had
been proved to differentiate abnormals from normals, in contrast
to the haphazard collection of items in most civilian tests,
described at the beginning of this chapter, plus the misleading
internal consistency check.1 It is noteworthy that the civilian
tests which on the whole give the best results such as Allport
A-S, MMPI, Humm-Wadsworth, and Character Sketches, also
consist of empirically validated items.
We may conclude then that, despite their extreme weaknesses
and dangers, paper-and-pencil personality tests and questionnaires
should not be entirely condemned. Well-constructed
ones, given under suitable motivating conditions, can be of
value both for experimental research, and in clinical or other
applied psychological work.
1 Cf. the useful discussion by Stuit, op. cit., p. 136.
IX
Measurement of Attitudes and Interests

The techniques which, as we have seen, work rather badly
in the measurement of emotional traits, are considerably
more successful in measuring social attitudes, opinions,
and interests; for example, radicalism vs. conservatism,
nationalism, favourableness or unfavourableness to religion,
to birth control, or to coeducation, liking for particular school
subjects or occupations, etc. The term ‘attitude’ has been
used by psychologists in a great many senses, and there is
no agreed definition.1 But in this context it generally implies
a personality disposition or drive which determines behaviour
towards, or opinions and beliefs about, a certain type of
person, object, situation, institution or concept. It includes
both McDougall’s ‘sentiments’ and the medical psychologist’s
‘complexes’, though it is not necessarily thought to arise
either from innate instincts or from repressed wishes. Often
our attitudes are adopted ready-made, as it were, from our
parents, teachers, or friends, though usually modified by our
own experiences.
Note that, as in the case of personality traits, attitudes may
be expressed either through behaviour or through verbal
statements, and that not infrequently these may seem to be
inconsistent or contradictory. Katz and Allport 2 point out
also that our publicly admitted attitudes may differ considerably
from our deeper and more private feelings. A man’s pro- or
anti-semitism, for example, is exceedingly complex, and could
not be assessed satisfactorily merely by asking him if he liked
Jews, nor only by observing such actions as his willingness to
patronize Jewish shops. It would be desirable to sample both
representative acts and opinions. Actually the majority of
attitude tests sample verbal expressions only, and the extent
1 Cf. Allport, G. W., ‘Attitudes’. A Handbook of Social Psychology
(edit. C. Murchison). Worcester, Mass.: Clark University Press, 1935.
2 Katz, D., and Allport, F. H., Students’ Attitudes. Syracuse, N.Y.:
Craftsman Press, 1931.
to which they penetrate to the ‘private’ level is doubtful.
But they are none the less useful provided that these limitations
are realized, and that they are not assumed to be predictive of
behaviour without further evidence. And we shall see that
quite an amount of evidence has been collected showing that
they do correlate with behaviour, at least to a moderate
extent.
The notion of broad sampling is important in attitude
measurement, both because any single verbal statement may
give an inaccurate index of a person’s more general attitude,
and because of the need to avoid stereotyped value judgments.
If a man is asked straight out how religious he is,
how tolerant to foreigners, etc., his answer will inevitably
be biased by what he regards as socially respectable. Thus
it is better to break down the attitude into a number of
more concrete manifestations, and ask what he did in particular
situations, or what he thinks about specific points (just as
in analytic rating scales). For example, how often does
he go to what church services, what religious books has he
read recently, does he accept such-and-such beliefs? It
does not matter if some of the items seem only doubtfully
relevant to, or partially dependent on, the general attitude.
Statistical analysis will show whether they are too remote to
be included.
Whatever kind of test is adopted, the conception of attitude
measurement necessarily involves a unidimensional variable,
that is a definite object or issue towards which some people are
more favourable than others. This requirement has given rise
to considerable criticism; it is said that people’s attitudes on
any topic show an infinite variety of qualitative differences, and
that they cannot be arranged along a single scale without
distortion. For example, liberals do not merely hold views
intermediate between those of socialists and conservatives.
But the answer to this is that, if other important dimensions
exist, they should be measured separately. Thus if liberals
do differ both from socialists and conservatives in favouring
greater freedom for the individual (or any other doctrine),
then an appropriate test should be devised. As in the wider
sphere of personality, factor analysis is of the greatest assistance
in showing what attitudes are sufficiently unidimensional to be
suitable for measurement, which are distinctive and which
overlap so much that they are better combined.1 On many
issues, shades of opinion are indeed too varied and unpolarized
to be readily measurable. For example, in one research in
London on the value of school visits (to museums, factories,
etc.) it was not found possible to produce a satisfactory test of
children’s attitudes. They almost all liked visits, and though
many recognized various drawbacks, there was no single clear
continuum from pro- to anti-. This is very apt to occur among
children; adults’ attitudes are generally more crystallized. The
most pervasive factors running through the political, social, and
religious opinions of adults are two bipolar tendencies which
contrast: (a) progressive or radical with conservative, and
(b) authoritarian or ‘tough-minded’ with tolerant attitudes.
But numerous sub-factors have been described in the literature.
For example, Cattell claims that most attitudes can be grouped
under McDougall’s list of instincts.

PUBLISHED ATTITUDE TESTS

Many readers may be more concerned with techniques of
attitude test construction, which they can apply to their own
social, educational, or other researches, than with published
tests. However, a few of the better known tests will be
described or listed for illustration.
Thurstone's Scales.2 Thurstone and his collaborators have
produced a whole series of scales of attitudes to the Church,
war, negroes, communism, capital punishment, etc. There are
many other scales similar in form: Peterson’s Attitude to War,
Remmers’s 3 generalized scales for measuring attitudes to any
school subject, or any social institution, etc. (The latter are of
1 Cf. McNemar, Bibliography; Cattell, R. B., Description and Measurement
of Personality. London: Harrap, 1946. Vernon, P. E., ‘A Study
of War Attitudes’. Brit. J. Med. Psychol., 1942, 19, 271-291. Eysenck,
H. J., ‘Social Attitudes’. Current Trends in British Psychology (edit.
C. A. Mace and P. E. Vernon). London: Methuen, 1953.
2 Thurstone, L. L., et al., Scales for the Measurement of Social Attitudes.
Chicago: University of Chicago Press, 1930. Also, Thurstone and Chave,
Bibliography.
3 Peterson, R. C., Scales for Attitude Toward War. Chicago: University
of Chicago Press, 1930. Remmers, H. H., et al., Attitude Scales. Lafayette,
Ind.: Purdue University, Division of Educational Reference, 1934.
Remmers, H. H., and Silance, E. B., ‘Generalized Attitude Scales’.
J. Soc. Psychol., 1934, 5, 298-312.
very dubious value.) In Britain, Jordan ¹ has published a scale
for children’s attitude to French, and numerous other scales
on various educational and social questions are contained in
student M.A. and Ph.D. theses.² Each of these consists of
about twenty statements graded on a 1 to 11 scale from
highly favourable to highly unfavourable, but printed in
random order.
The following examples are from Jordan’s scale:
I think that it takes so long to learn a foreign language
that the attempt is not worth while (2.0);
I like to listen to French talks on the wireless, because I think
it will improve my knowledge of the language (9.5);
I only borrow French books occasionally from the school
or public library (7.6).

The testee ticks the statements with which he agrees and
the median or mean scale value of these gives his attitude
score.
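The scoring rule just described can be sketched in a few lines of Python. This is an illustration only: the function name and the short item labels are hypothetical, and the three scale values are those quoted above from Jordan’s scale (2.0, 9.5 and 7.6).

```python
from statistics import mean, median

def thurstone_score(scale_values, endorsed, use_median=True):
    """Median (or mean) scale value of the statements the testee ticked."""
    chosen = [scale_values[label] for label in endorsed]
    if not chosen:
        raise ValueError("no statements were ticked")
    return median(chosen) if use_median else mean(chosen)

# Scale values of the three example statements (labels are hypothetical).
jordan_items = {
    "attempt not worth while": 2.0,
    "French talks on the wireless": 9.5,
    "borrow French books occasionally": 7.6,
}

# A hypothetical testee who ticks the second and third statements.
ticked = ["French talks on the wireless", "borrow French books occasionally"]
score = thurstone_score(jordan_items, ticked)
```

With only two statements ticked the median and the mean coincide; they differ once three or more statements of uneven scale value are endorsed.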
Multiple-choice Tests of Radicalism-Conservatism and Other
Attitudes. Lentz’s ³ C-R Opinionaire contains fifty statements
such as:
The metric system of weights and measures should be
adopted instead of our present system.
Even in an ideal world there should be protective tariffs.
Conscience is an infallible guide.
Armistice Day should be celebrated with less martial spirit.

Each of these is answered + for agreement or − for disagreement.
They are not graded as in Thurstone-type scales;
instead the testee’s score consists of the number of statements
he answers in the conservative direction (i.e. No in
the first and fourth examples, Yes in the second and third).
Alternatively the multiple-choice or ‘cafeteria’ question may
¹ Jordan, D., ‘The Attitude of Central School Pupils to Certain School
Subjects, and the Correlation between Attitude and Attainment’. Brit.
J. Educ. Psychol., 1941, 11, 28-44.
² Cf. Blackwell, A. M., A List of Researches in Education and Educational
Psychology. London: Newnes, 1950.
³ Lentz, T. F., et al., C-R Opinionaire. St. Louis, Mo.: Washington
University, Character Research Institute, 1935.
be used, as in this example, abbreviated from Vetter’s ¹
questionnaire:
What are your views on HEREDITARY WEALTH?
1. All wealth should revert to the State at death.
2. Taxes should confiscate the bulk, leaving only enough
for support of dependents.
3. Inheritances should be taxed on a rapidly graded scale,
up to about 50% for large fortunes.
4. Very large fortunes should pay a reasonable inheritance
tax, but not so high as to become confiscatory.
5. Individual thrift and initiative should not be damped
by any inheritance taxation.
Here the score is based on the grade of responses to several such
questions.
Likert ² has published tests of Internationalism, Imperialism,
and Attitude to Negroes, and all tests with multiple-choice
answers are sometimes referred to as Likert-type, as contrasted
with Thurstone-type. In point of fact, Likert did not originate
a new type of question, but only a technique of weighting the
graded responses to any question according to their centroids or
z-values. This technique has not been generally adopted
because the simpler method of weighting the responses 5, 4, 3,
2, 1 or +1, 0, −1, etc. gives practically the same results.
Eysenck’s ³ test of anti-semitism combines the Thurstone and
multiple-choice types. Thus the following two statements have
graded responses: Strongly Agree, Agree, Undecided, Disagree,
Strongly Disagree.
The Jews have too much power and influence in this
country (6.9).
The Jews have survived persecution because of the many
admirable qualities they show (1.0).
The first is much more strongly anti-semitic than the second,
and possesses a higher Thurstone scale-value. This is allowed
¹ Vetter, G. B., ‘The Measurement of Social and Political Attitudes and
the Related Personality Factors’. J. Abn. Soc. Psychol., 1930, 25, 149-189.
² Likert, R., ‘A Technique for the Measurement of Attitudes’. Arch.
Psychol., 1932, 22, No. 140.
³ Eysenck, H. J., and Crown, S., ‘An Experimental Study in Opinion-Attitude
Methodology’. Int. J. Opinion Attitude Res., 1949, 3, 47-88.
for by weighting the five responses to the two statements:
2, 3, 4, 5, 6, and 8, 6, 4, 2, 0 respectively (a high score
represents pro-semitism).
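The weighting just described can be sketched as follows. This is a sketch under stated assumptions, not Eysenck and Crown’s own procedure: the weights for the first statement are taken as 2, 3, 4, 5, 6 and for the second as 8, 6, 4, 2, 0, in the order Strongly Agree to Strongly Disagree, and the short item labels and function name are hypothetical.

```python
RESPONSES = ["Strongly Agree", "Agree", "Undecided", "Disagree", "Strongly Disagree"]

# Weights per response, in the order of RESPONSES; a high total is pro-semitic.
# The item labels are hypothetical abbreviations of the two statements.
WEIGHTS = {
    "too much power and influence": [2, 3, 4, 5, 6],  # scale value 6.9
    "admirable qualities": [8, 6, 4, 2, 0],           # scale value 1.0
}

def weighted_score(answers):
    """Sum the weight attached to each chosen response."""
    return sum(WEIGHTS[item][RESPONSES.index(choice)]
               for item, choice in answers.items())

# A hypothetical testee who rejects the first statement and accepts the second.
example = {
    "too much power and influence": "Disagree",
    "admirable qualities": "Agree",
}
total = weighted_score(example)
```

The more extreme statement (scale value 1.0) carries the wider spread of weights, so that agreement or disagreement with it moves the total further than the same response to the milder one.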
Other tests deserving special mention include Neumann,
Kulp and Davidson’s Test of Internationalism, Hartshorne and
May’s ethical attitude scales, Rundquist and Sletto’s studies of
the morale of unemployed men, and the extensive series of
scales applied in the American Army by Stouffer and others.¹
An interesting scale recently constructed by Shoben ² contains
eighty-five items representing domineering, possessive, rejective,
or other undesirable parental attitudes regarding child-upbringing,
which have been proved to differentiate mothers of problem
children from mothers of normal children, e.g.:
A child should be seen and not heard. Strongly Agree,
Mildly Agree, Mildly Disagree, Strongly Disagree.
Parents cannot help it if their children are naughty.
A parent should see to it that his child plays only with the
right kind of child.

ATTITUDE TEST CONSTRUCTION:
CHOICE AND WORDING OF STATEMENTS

No amount of statistical treatment will compensate for
poverty in the initial choice of statements. The scope or
content of the attitude should be defined in detail, and as many
differences as possible tabulated between people thought to be
strong or weak, pro- or anti-. Preference should be given to
differences in behaviour, although, as our illustrations show,
many tests are based almost entirely on beliefs and judgments.
Thurstone advocates collecting statements about the issue from
conversations with people, student essays, newspaper editorials,
etc., in order to avoid the narrowness or lack of variety of
items thought out solely by the author. If the test is intended
¹ Neumann, G. B., Kulp, D. H., and Davidson, H., Test of International
Attitudes. New York: Teachers College, Bureau of Publications, 1926.
Hartshorne and May, Bibliography. Rundquist, E. A., and Sletto, R. P.,
Personality in the Depression. Minneapolis: University of Minnesota
Press, 1936. Stouffer, S. A., et al., The American Soldier. Studies in Social
Psychology in World War II. Princeton, N.J.: Princeton University
Press, 1949-1950.
² Shoben, E. J., ‘The Assessment of Parental Attitudes to Child
Adjustment’. Genet. Psychol. Monogr., 1949, 39, 101-148.
for school pupils, samples of their opinions in their own words
should be collected. Each statement or question must also be
formulated in such a way as to seem natural and credible to the
kind of people for whom the test is intended, so that it may be
readily acceptable to, or rejectable by, them. In a multiple-choice
test, it must be conceivable that at least a few people will agree
with the most pro-, and others with the most anti-, answers.
Wang ¹ gives a useful list of rules for wording statements or
questions. They should be short, simple, and unambiguous.
The following is bad, since it could be taken as representing
support for, or opposition to, birth control:
Birth control legislation is a disgrace to our civilization.
Double-barrelled statements are always ineffective, since some
testees may pay attention to one clause, others to the other.
As in the case of personality questionnaires, some of the more
intelligent and critical testees are likely to object to their
attitudes being strait-jacketed by the form of the item, and
will want to qualify every response. This can be reduced by
careful wording, and by allowing additional space at the bottom
of the test blank for spontaneous comments, in which they can
let off steam; but best of all by having a trial run. That is,
the test should be discussed freely with a small group of typical
testees and suggestions welcomed for reducing ambiguities.
That the distortion or simplification of shades of opinion in
attitude measurement is less serious than many critics think
is shown by an experiment by Stouffer.² He obtained from 238
¹ Wang, C. K. A., ‘Suggested Criteria for Writing Attitude Statements’.
J. Soc. Psychol., 1932, 3, 367-373.
² Stouffer, S. A., ‘Experimental Comparison of the Statistical and a
Case History Technique of Attitude Research’. Publ. Amer. Sociol. Soc.,
1931, 25, 154-156. This experiment could be interpreted the other way
round; i.e. it shows that spontaneous expressions of attitude can be
used to give a reasonably reliable index of the attitude, when assessed by
experienced and impartial judges. James and Tenen raise a number of
legitimate criticisms of questionnaires, particularly when applied to
children, and advocate assessing attitudes by free interview. But they
give no indication as to how the type of interviewer bias to which Rice
drew attention (cf. p. 21) is to be controlled. James, H. E. O., and
Tenen, C., ‘How Adolescents Think of People’. Brit. J. Psychol., 1950,
41, 145-172. Further evidence on the unreliability of attitude assessments
even by skilled interviewers is given by Wedell, C., and Smith, K. U.,
‘Consistency of Interview Methods in Appraisal of Attitudes’. J. Appl.
Psychol., 1951, 35, 392-396.
students anonymous accounts of their own experiences and
opinions about alcohol and prohibition. These completely
unforced expressions of attitude were rated by four independent
judges for their favourableness or unfavourableness to prohibition,
their average agreement being shown by a correlation
of .86. The combined ratings correlated .81 with the scores of
the same students on an ordinary test of Attitude to
Prohibition.
By the ‘acceptability’ of items is meant the proportion of a
typical group of testees who agree with the item, or who check
each response to it. This acceptability must be estimated at
least roughly in advance, since the levels vary markedly in
different types of test. In a Thurstone-type test, statements
must range from very high to very low, or strong pro- to strong
anti-, and include a good proportion of moderate or neutral
shades. In a test like Vetter’s the statement may be non-committal
or neutral, while the responses should range down to
about 10% acceptability in either direction. But in most
multiple-choice tests (Lentz’s, Likert’s, Eysenck’s, etc.) the
statements are around 85 to 65% and 35 to 15%. For if very
extreme items are included, almost all the responses will be No
or Disagree; while if they are too middling, they cannot reveal
any definite variations in attitude.
Rundquist and Sletto ¹ have demonstrated clearly that, in a
test which asks for socially approved or disapproved opinions,
questions which state the disapproved view are not always
answered in the same way as apparently identical questions
stating the approved view. For instance, if 40% of a group
accepts as True:
The Government’s policy is subservient to big business
interests,
it will not usually be found that all the same testees, or even
the same proportion, will answer No or False to:
The Government’s policy is independent of big business
interests.
Much as in personality questionnaires, the unpopular question
or item seems to arouse more emotion and to be answered less
¹ Op. cit., p. 149.
rationally than the same item stated in the reverse way. These
authors recommend including both types, since they touch off
different aspects of the attitude.
Eysenck and Crown ¹ carry the argument further by pointing
out that acceptability depends not only on positive or negative
strength, but also on social stereotypes. For example, these
two statements have almost the same Thurstone scale-values:
The Jews have too much power and influence in this
country.
Jews lack physical courage.
But the former was agreed to by 38% of a group of 250 normal
adults, the latter by only 24%. For there is a common stereotype
that Jews are mercenary, but not that they are cowardly.
One would deduce from these findings that it is desirable to
include a large number of varied items in an attitude test in
order to cancel out the effects of such specific stereotypes.

ITEM ANALYSIS AND SCALING

In constructing a test one should start with at least twice
as many items as one is likely to require for the final form. The
various techniques for picking the best of these, and ensuring
their relevance to a unidimensional attitude, were listed briefly
on p. 123. The simplest is to ask a group of judges to assess
each item, and to eliminate those which are criticized by a
substantial proportion. Judges can help also in defining the
scope of a complex attitude; for example, the consensus of
their opinions can be taken regarding the proportions of
political, religious, sociological, and other items to include in a
test for general radicalism. Thurstone’s ² scaling technique is
an extension of this, whose main object is to place each item
on a scale of equivalent units, according to its favourableness
or unfavourableness. It is based on the psychophysical method
of equal-appearing intervals. Thurstone usually employs
several hundred judges, though reasonably accurate results can
be obtained with as few as twenty-five. Each is given a set
of items, typed on separate slips, and asked to sort them into
eleven piles, the left-most pile, No. 1, containing items considered
most favourable, and the right-most pile, No. 11, the
¹ Op. cit., p. 148. ² Cf. Thurstone and Chave, Bibliography.
most unfavourable. The other piles are intermediate and
should be approximately equally spaced in degree of favourableness,
in each judge’s opinion. It is easier, perhaps, to have
nine piles; then the judges can first sort into three large piles
and later sub-divide each one. The numbers of items assigned
to each pile can be uneven, but no one pile should contain as
many as a quarter of the total.
A cumulative frequency graph is then plotted for each item
(cf. Fig. 5). Say there are 40 judges: 1 puts an item in pile 1,
2 more making 3 in pile 2; 8 call it 3 or higher, and so on. No
one puts it lower than No. 6. The median and quartile positions
are then read off to the first decimal place.¹ Thus the position
opposite the twentieth judge is 4.2, and this is taken as the
scale value of the item. The quartiles fall at 4.9 and 3.3, and
the difference between them, namely 1.6, gives Q, a measure
of the agreement between the judges. If their opinions of its

¹ A scale value will be less than 1 if more than 50% of judges place it in
the first pile. Its median position is then found by extrapolation. The
same applies to statements placed by more than 50% in the last pile.
More often one or other of the quartiles falls outside the graph, in which
case Q is read off from the median and the other quartile instead of from
both quartiles.
scale value vary widely, it is obviously unsatisfactory and
should be eliminated. On a 9-point scale, Q should seldom
exceed 2.0, and an average of less than 1.5 should be
aimed at.
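The reading of the median and Q from the judges’ sortings can be approximated numerically. The sketch below is illustrative only: it assumes that the cumulative count is plotted at each pile number, starts from zero at position 0, and is joined by straight lines, whereas a curve drawn by eye (as in the worked figures above) will give slightly different readings. The function names are hypothetical, and only the first three pile counts follow the example in the text; the rest are invented to make up 40 judges.

```python
def pile_position(counts, fraction):
    """Position on the pile scale below which `fraction` of the judges fall.

    Assumption (not from the source): linear interpolation between the
    cumulative counts plotted at pile numbers 1, 2, ..., starting from
    zero judges at position 0.
    """
    total = sum(counts)
    if total == 0:
        raise ValueError("no judgments recorded")
    target = fraction * total
    cumulative = 0
    for pile, n in enumerate(counts, start=1):
        if n > 0 and cumulative + n >= target:
            return (pile - 1) + (target - cumulative) / n
        cumulative += n
    return float(len(counts))

def scale_value_and_q(counts):
    """Return (median position, interquartile range Q) for one item."""
    lower = pile_position(counts, 0.25)
    middle = pile_position(counts, 0.50)
    upper = pile_position(counts, 0.75)
    return middle, upper - lower

# Judges per pile on a 9-pile sort; piles 4 to 6 are hypothetical figures.
item_counts = [1, 2, 5, 12, 14, 6, 0, 0, 0]
scale_value, q = scale_value_and_q(item_counts)
keep_item = q <= 2.0  # the rough ceiling suggested for a 9-point scale
```

Under this convention an item placed in the first pile by more than half the judges automatically receives a scale value below 1, which corresponds to the extrapolation mentioned in the footnote.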
To make up the final scale, about twenty to thirty statements
are selected which are: (a) fairly evenly spaced throughout
the scale. Thus there might be three items having each value
1+, 2+ . . . 8+; (b) low in Q; (c) heterogeneous in content.
The tester must still use his subjective judgment to ensure that
the scale contains a variety of expressions of attitude. Thurstone
does not mention this point, but he describes an ‘index of
similarity’ or ‘irrelevance’, which would tend to reduce
heterogeneity. It is based on the responses to the statements
of a large group of testees, and resembles a correlation between
each statement and every other. Actually, few attitude testers
have ever adopted this.
It might be thought that the judges’ own attitudes would
affect the results of their scaling, but experiments prove that
this is not so. Possibly it is unwise to apply scales sorted by
adults to children, since the meanings they read into the items
may differ. The Thurstone method is laborious, but it does
yield scales which ‘go down’ well with most subjects. The
fact that they need check only a few statements, instead of
having to give a Yes, No, or graded response to every item, is
an advantage. A major defect in many scales is that the scores
are insufficiently reliable. This may be due to vagueness or
multidimensionality in the attitude itself, or to the small
number of items that subjects check. It is desirable, therefore,
to try out a new scale on a group of one or two hundred people,
and to calculate their scores separately on the odd-numbered
and even-numbered items. These two scores should correlate
to at least .74 if the scale as a whole is to have a corrected split-half
reliability of .85. Note that Thurstone scale units, although
equal-appearing, are not absolute. It is not legitimate to
compare an individual’s score on two or more scales (except
via percentiles); nor can any one score, say 5, be designated as
a neutral attitude. The Guttman type of scale is the most
successful in establishing the point where favourable changes
over to unfavourable.¹
¹ Cf. Guttman, L., ‘On Festinger’s Evaluation of Scale Analysis’.
Psychol. Bull., 1947, 44, 451-465.
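The split-half figures quoted above (.74 between halves, .85 for the whole scale) agree with the usual Spearman-Brown correction, which the text does not name but which is presumably what is meant by a ‘corrected’ reliability. A minimal sketch, with hypothetical function names:

```python
def spearman_brown(r, factor=2.0):
    """Reliability of a test `factor` times as long as one with reliability r."""
    return factor * r / (1 + (factor - 1) * r)

def required_half_correlation(target):
    """Correlation between halves needed for a given full-length reliability."""
    return target / (2 - target)

def lengthening_factor(r_current, r_target):
    """How many times longer the test must be to reach r_target."""
    return r_target * (1 - r_current) / (r_current * (1 - r_target))

full_scale = spearman_brown(0.74)           # odd-even correlation of .74
needed = required_half_correlation(0.85)    # halves needed for .85 overall
```

The third function inverts the same formula and indicates roughly how many more items of similar quality would be needed when a trial scale falls short of the reliability wanted.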

MULTIPLE-CHOICE SCALES

A separate group of external judges is not needed for multiple-choice
scales, unless they be of the type proposed by Eysenck.
Instead, item selection is based on the responses of a group of
200 or so testees to a preliminary draft of the test. The simplest
technique is to tabulate separately the responses of subjects
with the highest, middle, and lowest thirds of scores on the
test as a whole. By combining these, the distribution of
responses for each item in the total group is obtained. As
already mentioned, preference should be given to items whose
distribution is not too bunched. Thus with 5-response items:
Response:                          1    2    3    4    5
Percent Frequency:                18   42   27    9    4
Is better than:                   37   55    7    1    0
By contrasting the upper and lower groups, the most consistent
items can be found, e.g.:
Response:                          1    2    3    4    5
Percent Freq. in Top Third:       30   45   21    4    0
Percent Freq. in Bottom Third:    15   28   39   15    3
The statistical significance of the difference in mean response
can be determined. Here the means are 1.99 and 2.63, and if
there are, say, 70 in each group, the Critical Ratio of 4.1 is
highly significant. Alternatively, a split may be taken at the
response which cuts off as nearly as possible half the combined
groups, in this case between Responses 2 and 3, as this cuts off
118/200. Then the difference between the two percentages
above the split, 75 − 43 = 32, would be regarded as sufficiently
large. A percentage difference of less than 15 would usually
indicate too little agreement with the test as a whole.¹
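The two indices of item consistency described above can be checked with a short calculation. This is a sketch under stated assumptions: the bottom-third frequencies are read as 15, 28, 39, 15, 3 (so that they sum to 100), the Critical Ratio is taken as the difference in means divided by the standard error of that difference with each group’s variance estimated separately, and the function names are hypothetical.

```python
from math import sqrt

def mean_and_variance(percents):
    """Mean and variance of responses 1..k from a percentage distribution."""
    total = sum(percents)
    mean = sum(k * p for k, p in enumerate(percents, start=1)) / total
    variance = sum(p * (k - mean) ** 2
                   for k, p in enumerate(percents, start=1)) / total
    return mean, variance

def critical_ratio(top, bottom, n_top, n_bottom):
    """Difference in mean response over its standard error (assumed formula)."""
    m_top, v_top = mean_and_variance(top)
    m_bot, v_bot = mean_and_variance(bottom)
    return abs(m_top - m_bot) / sqrt(v_top / n_top + v_bot / n_bottom)

def percentage_difference(top, bottom, split):
    """Difference in percent choosing responses 1..split in the two groups."""
    return sum(top[:split]) - sum(bottom[:split])

top_third = [30, 45, 21, 4, 0]
bottom_third = [15, 28, 39, 15, 3]

cr = critical_ratio(top_third, bottom_third, 70, 70)
diff = percentage_difference(top_third, bottom_third, split=2)
```

On these figures the means come to 1.99 and 2.63, the Critical Ratio to about 4.1, and the percentage difference to 32, in agreement with the values given in the text.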
The group can be rescored on the selected items, or, better,
the revised test given to a fresh group, and the split-half
reliability determined as before. With about a dozen items this
will usually be quite high; and if not, the necessary number of
items can readily be calculated. Eysenck claims that his mixed
¹ Note that this technique automatically favours items of middling
acceptability. Cf. Vernon, P. E., ‘Indices of Item Consistency and
Validity’. Brit. J. Psychol. Statist. Sec., 1948, 1, 152-166.
type gives even better coefficients. Here Thurstone sorting
and item analysis are applied. The responses to items with
different scale values may be scored according to the following
table, adapted from Eysenck and Crown’s article.
Scale               Responses
Value        ++    +     0     −     −−
8.3+          8    6     4     2     0
7.7+          7    6     4     2     1
7.1+          7    5     4     3     1
6.5+          6    5     4     3     2
5.9+          5    5     4     3     3
5.3+          5    4     4     4     3
4.7+          5    4     4     4     5
4.1+          3    4     4     4     5
3.5+          3    3     4     5     5
2.9+          2    3     4     5     6
2.3+          1    3     4     5     7
1.7+          1    2     4     6     7
1.0+          0    2     4     6     8
Factor analysis can be applied to test items, as well as to
different attitude tests, in order to ensure reasonable reliability
and unidimensionality 1 ; but this is seldom done because of the
vast number of correlations when there are many items to
analyse. Stouffer 2 and his associates advance strong arguments
for Guttman’s scalogram analysis technique. This is especially
useful for constructing very short unidimensional and reliable
scales. But we will not describe it, since it seems to offer little
practical advantage to compensate for its laboriousness. One
of its weaknesses is that the initial choice of suitable items for
scaling depends entirely on the test-constructor’s skill. Edwards
and Kilpatrick ³ therefore advocate first choosing a suitable
range of items by Thurstone sorting, then pruning by ordinary
item-analysis, and making a final selection by scalogram
technique.
¹ Cf. Eysenck, H. J., ‘Primary Social Attitudes: I. The Organization
and Measurement of Social Attitudes’. Int. J. Opinion Attitude Res.,
1947, 1, 49-84.
² Op. cit., p. 149.
³ Edwards, A. L., and Kilpatrick, F. P., ‘A Technique for the Construction
of Attitude Scales’. J. Appl. Psychol., 1948, 32, 374-384.
In duplicating or printing opinionaire or Thurstone-type
tests, pro- and anti- statements should be arranged in random
order. In multiple-choice tests, the most pro- response is
sometimes put first, sometimes last. This is desirable in
order to prevent testees checking Yes (or Strongly Agree,
No, etc.) throughout without considering each item carefully,
also so as to reduce ‘space errors’. (In any multiple-choice
test there is a tendency to tick the topmost or left-most
response rather than responses printed on the right, or at the
bottom, of a set.)

DISCUSSION

It is probably a mistake to aim at extremely high reliability
in attitude scales, since this may be obtained at the expense
of validity.¹ It will usually arise when the items are too
homogeneous in content, or very numerous. Under these
circumstances the testee can hardly fail to realize that they
all refer to his own radicalism, or other attitude, and he will
be more likely to answer each item according to his stereotype,
or according to what he thinks is socially approved.
The attitude of the subjects to the test and tester is just as
important as it is with personality questionnaires, though
distortion of responses is likely to be less serious because the
subject-matter of most attitude tests is less disturbing. One
would not expect to get results of any value from school­
children until they reach Mental and Educational Ages of at
least 12 years. But it is certainly possible to construct scales
that get across to adults of lowish ability, witness the extensive
practical use made of them with all grades of American recruits
during the war.
No one appears to have tried to validate attitude tests by the
trait composite method. Such evidence as correlations with
associates’ ratings or with indices of behaviour is useful, but
inconclusive, since these criteria are imperfect for the reasons
already given. As mentioned in Chap. VI, there is little correla­
tion between ethical attitude tests and honest conduct. How­
ever, many tests have been shown to differentiate significantly
1 Cf. Kirkpatrick, C., ‘ Assumptions and Methods in Attitude Measure­
ments ’. Amer. Sociol. Rev., 1936, 1, 75-88. See also McNemar's detailed
analysis of the weaknesses of different types of scales.
(albeit with a good deal of overlapping) between groups which
would be expected to possess contrasting attitudes. For
example, Eysenck finds good discrimination on his radicalism
scale between people who vote Conservative, Liberal, and
Labour. On Thurstone’s Church scale the average score of
Roman Catholic students was 2.90, of Jewish students, 5.44.
Watson’s test of fair-mindedness (cf. below) shows social
psychology students in an eastern American university to be
less prejudiced than middle-west parsons. Likert finds large
differences in attitudes to negroes between southern and north-eastern
colleges. Adorno and Frenkel-Brunswik, also Allport
and Kramer,¹ have traced out extremely interesting, and
psychologically plausible, differences between subjects with
high and low scores on tests of racial and religious prejudice
(ethnocentrism). The consistent way in which strong prejudice
connects with factors of upbringing and with the personality
as a whole is very convincing. Many scales have been used
successfully in investigations of the effects of propaganda or
of various types of instruction.²

STUDIES OF GROUP ATTITUDES

Actually it is hardly necessary, in studying differences
between attitudes of groups, to use tests which yield reliable
individual scores; less elaborate techniques are often adequate.³
An enormous amount of work is done in social and applied
psychology on group attitudes which falls outside the scope of
the psychology of personality. Examples are children’s interests
in different school subjects or leisure pursuits at different
ages, employees’ views on sources of satisfaction and dissatisfaction
in their work, the innumerable surveys conducted by
Gallup polls, Mass Observation, and like organizations, not to
speak of newspaper competitions which determine their readers’
most popular film-star or novel, and voting in Parliamentary
¹ Adorno, T. W., and Frenkel-Brunswik, E., et al., The Authoritarian
Personality. New York: Harper, 1950. Allport, G. W., and Kramer,
B. M., ‘Some Roots of Prejudice’. J. Psychol., 1946, 22, 9-39.
² Cf. Lichtenstein, A., ‘Can Attitudes be Taught?’. Johns Hopkins
Univ. Stud. Educ., 1934, No. 21.
³ Cf. Vernon, P. E., ‘The Assessment of Psychological Qualities by
Verbal Methods’. Industr. Hlth. Res. Board Rep., No. 83. London:
H.M. Stat. Office, 1938.
elections. These normally rely on one or two questions
only as indices of majority attitudes, and compensate for the
consequent loss of reliability by the large number of voters.
Nevertheless, the validity or practical significance of the results
is often unsatisfactory, partly because of the difficulties of
getting really representative samples, and partly because of
doubts as to the proper interpretation of responses. Unless the
questions are extremely straightforward, the respondents may
read different meanings into them from those expected by the
investigators.¹ Eysenck points out that the proportion of a
population favouring a given issue may appear to range almost
anywhere from about 10% to 90%, depending on the kind of
question asked and the extent to which it touches off common
stereotypes (cf. p. 152). Cantril ² provides a useful summary of
the large body of research into wording of questions, effects of
interviewer bias, sampling and other errors in opinion-polling;
and McNemar gives a detailed critique of polling methods.
McNemar concludes that the results of such polls would be
considerably more accurate if more use was made of properly
constructed scales and tests, though these could be shorter
and simpler than those required for measuring individual
attitudes.

INDIRECT TESTS OF ATTITUDES

We have already suggested that the questions or statements
in attitude tests should not be too direct. A number of tests
which are still more carefully disguised have been suggested.³
Probably it is because of the obvious difficulties of validation
that they have seldom been used for any practical purpose.
G. B. Watson’s ⁴ early test of fairmindedness is an ingenious
example. Testees are told that it is a ‘Survey of Public
Opinion’, but their prejudice or fairmindedness in political,
¹ An amusing example of this is quoted by McNemar. In one American
survey a remarkable proportion of negroes were found to be opposed to
government control of profits. Closer questioning showed that they
believed that God alone should control prophets.
² Cantril, H., et al., Gauging Public Opinion. Princeton, N.J.: Princeton
University Press, 1944.
³ A useful survey is provided by Campbell, see Bibliography.
⁴ Watson, G. B., ‘The Measurement of Fair-mindedness’. Teachers
College Contr. Educ., 1925, No. 176.
social, and religious matters is brought out in six varied sub-tests.
One of these contains such statements as:
All, Most, Many, Few, No Roman Catholics are superstitious.
An ‘All’ or a ‘None’ answer is taken to show religious
prejudice; the less extreme answers are accepted as a sign of
tolerance. In another sub-test a statement is presented:
In the United States 3% of the people own 60% of the
wealth.
This is followed by several conclusions that might be drawn,
including the following:
The great incomes should be more heavily taxed.
Such a concentration of capital is inevitable if industry is
to be effectively developed.
No conclusion stated here can fairly be drawn.
The testee who checks the first of these as a legitimate inference
gets a mark for socialistic prejudice; the second, capitalistic.
Only if he checks the last does he obtain a mark for fair-mindedness.
Other tests which measure the influence of bias
on logical reasoning are Morgan’s, and the Watson-Glaser
Tests of Critical Thinking.¹ Watson’s use of extreme vs.
moderate opinions is amplified in Thouless’s ² study of degrees
of certainty in religious beliefs. He shows that the distribution
of responses to multiple-choice questions on religious topics,
which are highly charged with prejudice, tends to be bimodal,
since so few individuals take up moderate views. He suggests,
too, that this might provide a measure of irrational thinking.
As in the case of personality questionnaires, however, it is
difficult to prove that this ‘response set’, or extremeness
tendency, is psychologically meaningful.

¹ Morgan, J. J. B., and Morton, J. T., ‘The Distortion of Syllogistic
Reasoning Produced by Personal Convictions’. J. Soc. Psychol., 1944, 20,
39-59. Glaser, E. M., ‘An Experiment in the Development of Critical
Thinking’. Teachers College Contr. Educ., 1941, No. 843.
² Thouless, R. H., ‘The Tendency to Certainty in Religious Belief’.
Brit. J. Psychol., 1935, 26, 16-31.
Distortions of perception and memory have also been employed,
e.g. by Horowitz, and Cattell.¹ Thus an Aussage test
(cf. p. 88) may be given with suggestive questions that allow
scope for colour prejudice to enter. Thematic apperception
(cf. p. 181), doll play, and some other projective techniques have
similarly been applied with material liable to touch off various
attitudes. These do not yield any quantitative scores, but
assessments of the subjects’ responses correlate fairly closely
with attitude scale scores. Perhaps the most promising of recent
tests are Hammond’s,² which take the form of information
tests. A typical item is:
The average weekly wage of workers in the U.S. in 1948
was: (a) $40, (b) $60.
The true answer happens to be $50. Hence the choice of (a)
indicates socialist, (b) capitalist, attitudes. Finally Travers ³
and others have shown that a person’s estimate of the proportion
of people who hold a given attitude correlates positively
with his self-expressed attitude. For example, the radical is
much more likely than the conservative to overestimate the
proportions of radicals in the population.

INTERESTS

Interests are very much the same as attitudes, though their
definition is also a matter of controversy.4 Their subject-matter
is usually more concrete. We are interested in or like
athletics, music, model aeroplanes, etc., whereas we have
favourable or other attitudes to religion, foreigners, etc. But
an interest is just as complex an amalgam of subjective feelings
and objective behaviour tendencies, and interests are at least
as manifold and as difficult to reduce to a few unidimensional
variables. The best-known test, Strong’s Vocational Interest
1 Horowitz, E. L., ‘The Development of Attitude Toward the Negro’.
Arch. Psychol., 1936, 28, No. 194. Cattell, R. B., et al., ‘The Objective
Measurement of Attitudes’. Brit. J. Psychol., 1949, 40, 81-90.
2 Hammond, K. R., ‘Measuring Attitudes by Error-Choice: An
Indirect Method’. J. Abn. Soc. Psychol., 1948, 43, 38-48.
3 Travers, R. M. W., ‘A Study in Judging the Opinions of Groups’.
Arch. Psychol., 1941, 37, No. 266.
4 Cf. Berlyne, D. E., ‘“Interest” as a Psychological Concept’. Brit.
J. Psychol., 1949, 39, 184-195.
Blank, solves the problem of what to measure by taking
different occupations as the criterion. Obviously the number
of these is limitless, and many occupational interests overlap,
positively or negatively. Other testers have therefore adopted
a priori classifications of a limited number of more general
types of interest. Factorial analysis is beginning to provide a
more objective answer.1
Allport and Vernon’s Study of Values is based on the classification
proposed by Spranger in his ‘Types of Men’.2 It is
designed to test an individual’s relative standing on six main
types of value or general interest: theoretical or scientific,
economic or utilitarian, aesthetic or artistic, social or humanitarian,
political or power-seeking, and religious or spiritual. It
employs the forced-choice technique. The testee is told to rank
in order of preference the four answers to such questions as:
If you could influence the educational policies of the public
schools of some city, would you undertake :
(a) to promote the study and performance of drama ;
(b) to develop co-operativeness and the spirit of
service;
(c) to provide additional laboratory facilities;
(d) to promote school savings banks for education in
thrift.
If he puts (c) top he scores 3 for theoretical values, and if he
puts (a) bottom he scores 0 for aesthetic values, and so on. By
summing the marks over all the questions a range
from 0 to 60 is possible on each value. Note that, as in an
attitude test, the questions are rather indirect expressions of
the values, couched in terms of behaviour. Testees are not
told the object of the test since one does not wish to get merely
their own impressions of how religious, artistic, etc., they are.
The items have been analysed in the manner described above,
and shown to yield reasonably consistent scores.
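The ranking-and-summing rule can be made concrete with a short Python sketch. It assumes that the top of four ranked answers earns 3 marks and the bottom 0, as the example of (c) and (a) suggests. The function names are hypothetical, and the assignment of answers (b) and (d) to the social and economic values is an inference from the wording of the item, not something stated in the text.

```python
VALUES = ("theoretical", "economic", "aesthetic",
          "social", "political", "religious")


def score_ranked_item(ranking, option_values):
    """Marks for one item: first-ranked answer gets n-1, last-ranked gets 0.

    ranking       -- answer labels ordered from most to least preferred
    option_values -- mapping from answer label to the value it expresses
    """
    top_marks = len(ranking) - 1
    return {option_values[label]: top_marks - position
            for position, label in enumerate(ranking)}


def values_profile(responses):
    """Sum the item marks into one total per value."""
    totals = dict.fromkeys(VALUES, 0)
    for ranking, option_values in responses:
        for value, marks in score_ranked_item(ranking, option_values).items():
            totals[value] += marks
    return totals


# The school-policy item; the (b) and (d) assignments are assumed.
SCHOOL_ITEM = {"a": "aesthetic", "b": "social",
               "c": "theoretical", "d": "economic"}

# A testee who ranks laboratories first and drama last.
profile = values_profile([(["c", "b", "d", "a"], SCHOOL_ITEM)])
```

Because every four-answer item hands out the same 3 + 2 + 1 + 0 = 6 marks, a high total on one value can only be bought at the expense of others. The profile is therefore relative, which is the point of the forced-choice technique.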
1 Cf. Vernon, P. E., ‘Classifying High-Grade Occupational Interests’.
J. Abn. Soc. Psychol., 1949, 44, 85-96.
2 Allport, G. W., Vernon, P. E., and Lindzey, G., Study of Values (Rev.
ed.). Boston: Houghton Mifflin, 1951. Vernon, P. E., and Allport, G. W.,
‘A Test for Personal Values’. J. Abn. Soc. Psychol., 1931, 26, 231-248.
Spranger, E., Types of Men. Halle: Niemeyer, 1928.
The Kuder Preference Record 1 is similar, but aims to measure
nine more specifically occupational interests, at high school,
college, and adult level: Mechanical, Computational, Scientific,
Persuasive (salesmanship, etc.), Artistic, Literary, Musical,
Social Science, and Clerical. (Other editions include Outdoor,
Sociable, Practical, Theoretical, Smooth Personal Relations,
and Dominant interests, together with a verification or check
score.) It is widely employed by American vocational guidance
centres, and is noteworthy for its ingenious rapid-scoring
device. The testee pricks his preferences through a series
of sheets; and when these are unfolded the numbers of pricks
in printed circles, corresponding to each interest type, are
counted up.
Strong’s Vocational Interest Blank.2 The occupational
interests claimed by a person undergoing vocational guidance
tend to fluctuate, and are often based on entirely mistaken
notions of what the occupations entail. Freyd and others, at
the Carnegie Institute of Technology,3 and later Strong at
Stanford University, hit on the method of recording the testee’s
immediate likes and dislikes for a large number of miscellaneous
items, and then deducing his true interests from the total
pattern of his responses. Strong’s Blank contains some 420
items, including lists of:
Occupations, e.g. actor, advertizer . . . Y.M.C.A. worker.
Amusements, e.g. golf, tennis, chess, pet canaries.
Subjects of study, e.g. algebra, arithmetic . . . zoology.
Miscellaneous activities, e.g. repairing a clock, arguments,
saving money.
Types of people, e.g. optimists, pessimists, foreigners,
cripples, socialists.
Famous people, e.g. Caruso, Edison, Henry Ford, etc., etc.
Each item is followed by L I D (like, indifferent, dislike), or
other simple responses, one of which is to be checked. There is
no time limit, but rapid answering is advised, and the test
1 Kuder, G. F., Kuder Preference Record. Chicago: Science Research
Associates, 1942.
2 Strong, E. K., Vocational Interest Blank. Stanford, Cal.: Stanford
University Press, 1927. Vocational Interests of Men and Women. Stanford,
Cal.: Stanford University Press, 1943.
3 Cf. Fryer, Bibliography.
should not take more than about 35 minutes. The scoring is
purely empirical, being derived from the actual responses of
very large groups of individuals concerned in particular
occupations (artists, architects, doctors, farmers, real estate
salesmen, psychologists, etc.). Several techniques of constructing
scoring keys have been suggested. Strong finds the
following simple one effective. Take as an example the item:
Actor L I D, and its scoring for interest in Personnel Management.
The percentages of a group of personnel managers, and
of members of other vocations in general, who check the
responses are:
                         L      I      D
Personnel Managers      49     38     13
All others              38     35     27
Difference             +11     +3    −14
That is, personnel managers are slightly more apt than most to
say that they like this occupation. The differences are transposed
into somewhat smaller figures, and the final marks for
the item are +2, +1, and −3 respectively. Similarly the scores
are determined for each of the 1200 or so possible responses
for each occupational group. An individual’s score for an
interest is the sum of his + and − marks, which thus indicates
the resemblance of his pattern of interests to the typical
patterns of personnel managers, or artists, etc. Scoring a
single blank by hand against forty occupations takes several
hours; it is therefore usually done by machine, at considerable
expense. If the testee’s score falls within the range of scores
given by the highest-scoring 75% of the personnel managers (or
other) group, he is given an A-rating for that occupation; if it
is within the range of the lowest 25% he is given a B-rating; if
it is lower than the score of any of the original personnel
managers, a C-rating.
A candidate for guidance is advised to enter only those occupations
for which he receives an A-rating.
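The construction of an empirical key and the letter ratings can be sketched as follows. The sketch reads the worked example's percentages as 49/38/13 for personnel managers and 38/35/27 for all others. The text does not say how differences are "transposed into somewhat smaller figures"; dividing by 5 and rounding is an assumption chosen only because it reproduces +2, +1, −3 for this item, and it should not be taken as Strong's formula. All function names, and the quartile cut used for the ratings, are likewise hypothetical.

```python
def build_item_key(group_pct, others_pct, step=5):
    """Turn percentage differences into small integer marks.

    Dividing by `step` and rounding is an assumed compression, not
    Strong's own procedure.
    """
    return {
        response: round((group_pct[response] - others_pct[response]) / step)
        for response in group_pct
    }


def interest_score(key, checked):
    """Sum the marks for the response a testee checked on each item."""
    return sum(key[item][response] for item, response in checked.items())


def letter_rating(score, criterion_scores):
    """A: within the upper 75% of the criterion group's scores;
    B: within its lowest quarter; C: below every criterion member."""
    ordered = sorted(criterion_scores)
    cutoff = ordered[len(ordered) // 4]   # scores below this form the lowest quarter
    if score >= cutoff:
        return "A"
    if score >= ordered[0]:
        return "B"
    return "C"


# The 'Actor' item scored for Personnel Management.
actor_key = build_item_key(
    {"L": 49, "I": 38, "D": 13},   # personnel managers
    {"L": 38, "I": 35, "D": 27},   # all others
)

key = {"Actor": actor_key}
score_if_disliked = interest_score(key, {"Actor": "D"})

# Illustrative (invented) criterion-group scores for the rating step.
managers = [10, 20, 30, 40, 50, 60, 70, 80]
ratings = [letter_rating(s, managers) for s in (35, 15, 5)]
```

With some 420 items and forty or more keys, the same three-way lookup is repeated many thousands of times per blank, which is why hand scoring takes hours.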
Keys for some forty-nine different occupations are available.
Another blank is published for measuring women’s interests in
twenty-six occupations, and a new test has recently been
constructed by Clark 1 for differentiating interests in civilian
and naval skilled trades. The same technique has been applied
1 Clark, K. E., ‘A Vocational Interest Test at the Skilled Trades Level’.
J. Appl. Psychol., 1949, 33, 291-303.
by Garretson 1 to measuring the academic, technical, or
commercial interests of secondary schoolboys. None of these
tests would be of any use in Britain, unless restandardized,
since it is unlikely that the likes and dislikes of Californian
and British vocational groups would be sufficiently similar.
Many of the present items, also, might arouse ridicule (e.g.
Acting as yell-leader, Pursuing bandits in sheriff’s posse, People
with gold teeth, Men who use perfume). The Study of Values
and Kuder Preference Record are also too American in phraseology.
Some adaptations of the former have been used here
experimentally, but are not published. A test of the Strong
type was constructed for Army recruits during the war, with a
view to allocating them to one of six main types of employment.2
The mechanical scores were shown to have considerable
predictive value in picking men for mechanical jobs, but the
test never came into general use, both because the scoring was
too lengthy and because the keys were not very reliable.
Strong considers it desirable to have standardization groups of
about 500 in each occupation for which a key is being prepared.
We found also that interest tests tend to be less effective among
average or dull adults than at professional and business levels;
education and intelligence have a considerable influence on the
responses. For example, the Likes checked by a skilled
instrument mechanic tend to bear a closer resemblance to those
checked by an equally intelligent clerk than to those of a
lower-grade mechanical worker. None the less, a test covering
a few main types of high-grade occupational interests would
be extremely useful in this country, though its preparation and
standardization would be a herculean task.
A weakness which applies to all the tests in this (as in the
previous) chapter is that they are extremely open to faking.
Several experiments have shown that subjects can raise or
lower their scores significantly if they wish to make themselves
out as possessing certain interests. Thus, none of these tests is
very suitable for educational or vocational selection, where an
incentive to fake may operate. But this does not detract from
their value in vocational guidance or other situations where the
1 Garretson, O. K., ‘Relationships between Expressed Preference and
Curricular Abilities of Ninth Grade Boys’. Teachers College Contr. Educ.,
1930, No. 396.
2 Cf. Vernon and Parry, Bibliography.
subjects realize the desirability of truthfulness. Some psycho­
logists would criticize the extreme empiricism of a test which
takes no account of the meaning the questions have for the
subject. Their doubts may be heightened by an experiment
carried out by Burnham and Crawford,1 who obtained a set of
purely chance scores on the Bernreuter Inventory and the
Strong Blank by throwing dice to decide each response. On
applying the keys it was found that the dice had obtained
scores characteristic of a psychoneurotic boy scout master or
journalist!
Nevertheless all the interest tests mentioned do work.2
Correlations around .5 are found between Study of Values scores
and associates’ ratings on these values, which is fairly high
considering the difficulty of explaining the values to the raters.
Large differences are obtained in values profiles between student
groups taking different courses (engineering, arts, theological,
etc.), and correlations of up to .45 are found with some of the
Strong interests,3 in spite of the totally different techniques of the
two tests. Strong has shown that the repeat reliability coefficients
of interest scores in adults are remarkably high,
approximating .80 even over a 20-year period. Naturally
scores are somewhat less stable in adolescents. Persons tested
before taking up their careers, who are not told the results, do
tend spontaneously to enter vocations consonant with these
results; and those who later forsake a vocation are found to
score lower than those who stay in it. The test may also be
useful for predicting occupational success, as in Kelly and
Fiske’s research (cf. p. 26).

OTHER METHODS OF ASSESSING INTERESTS

Many vocational psychologists make use of a blank like
Strong’s, or else a much briefer list, and by going over the
testee’s responses in an interview build up a fairly complete
1 Burnham, P. S., and Crawford, A. B., ‘The Vocational Interests and
Personality Test Scores of a Pair of Dice’. J. Educ. Psychol., 1935, 26,
508-512.
2 Cf. Berdie, Bibliography.
3 Cf. Duffy, E., and Crissy, W. J. E., ‘Evaluative Attitudes as Related
to Vocational Interests and Academic Achievement’. J. Abn. Soc.
Psychol., 1940, 35, 226-245.
picture of his interests. This, of course, does not constitute a
test, and subjective interpretation plays a large part. It might
be thought that by classifying items under a few major types,
and simply adding the numbers of responses under each type,
a useful set of objective scores, or an interest profile, could be
obtained. Thorndike 1 has studied adult interests by such a
test, covering thirty-five topics or fields; and recently Guilford,
Shneidmann, and Zimmermann 2 have published their GSZ
Interest test, which includes twenty items for measuring each
of eighteen main types of vocational and leisure interests. But
this method fails to work well because people are found to vary
so enormously in their standards of marking Like or Dislike.
Some interpret Like far more broadly than others and check
three-quarters or more of the items in this way, whereas others
are more selective and check less than a quarter. To some
extent this may reflect genuine breadth or narrowness of
interests, but often it is merely an irrelevant ‘response set’.
This is the reason why both the Study of Values and the Kuder
Preference Record force the testee to indicate relative preferences.
In so far as he scores high on one value or interest, he must
obtain lower scores on some other interest or interests. Though
no direct allowance is made for response set in the Strong
Blank, the scoring keys probably compensate for most of it
automatically.3
Strong 4 has investigated the items checked by men of various
ages from 15 to 55 years, and on this basis developed a scoring
key for what he calls Interest Maturity. Similarly Furfey and

1 Thorndike, E. L., ‘The Value of Reported Likes and Dislikes for
Various Experiences and Activities as Indications of Personal Traits’.
J. Appl. Psychol., 1936, 20, 285-313.
2 GSZ Interest Survey. Beverly Hills, Cal.: Sheridan Supply Co.,
1948. Cf. Guilford, J. P., Shneidmann, E. S., and Zimmermann, W. S.,
‘The Guilford-Shneidmann-Zimmermann Interest Survey’. J. Consult.
Psychol., 1949, 13, 302-306. Incidentally these authors show that hobbies
often give very poor indications of occupational interests.
3 Tyler, L. E. (private communication) has pointed out to the writer
that most subjects tick a vast majority of Likes, hence in fact most of the
differences between different interest scores are based on the Dislikes that
they choose. Tests of the Strong type do not work well with children, nor
with many adults, because they have not developed sufficiently clear-cut
patterns of Dislikes.
4 Strong, E. K., Change of Interests with Age. Stanford, Cal.: Stanford
University Press, 1931.
Weber 1 constructed tests of emotional maturity or developmental
age for children, which included lists of interests, games,
books, etc., common among boys from 8 to 18 years, together
with Pressey X-O items (cf. p. 175). It was claimed, for
example, that delinquents tend to give responses typical of
normal children younger than themselves. Probably, however,
fashions in interests vary too much for such tests to have much
permanent value.
Wyman 2 devised an indirect test of interests for Terman’s
studies of gifted children, which was based on free word
association responses. Groups of children were taken who,
according to teachers’ ratings, were keenly or weakly interested
in ‘intellectual, social, or activity’ interests. Their responses
to a list of stimulus words were tabulated, and differential
scores developed for each response. Thus when any new child
takes the test, his responses can be scored for resemblance to
those of the intellectual, social, and activity groups. The
technique is exceedingly laborious, and the resulting interest
measures were found to have low validity (probably because
the three standardization groups of 130 each were too small).
Thus the correlations between the scores of fresh groups and
teachers’ ratings were only .54, .35, and .20. Moreover, the
correlations of the three interest scores with one another ranged
from .68 to .80, suggesting that the original criteria for selecting
the groups were far too much affected by halo. The sets of
children were not really distinct, as were Strong’s occupational
groups or Kent-Rosanoff’s mental patients and normals (cf. p. 174).
Nevertheless Wyman’s technique deserves mention because of its
objectivity. It would be very difficult to guess the associations
that score highly, or to fake one’s results. Other objective tests
of interests have already been discussed in Chap. VI.

MASCULINITY-FEMININITY TESTS

Strong provides an additional scoring key based on the
differential likes and dislikes of men and women. As already
1 Furfey, P. H., ‘Developmental Age’. Amer. J. Psychiat., 1928, 3,
149-157. Weber, C. O., ‘Further Tests of the Wells Emotional Age Scale’.
J. Abn. Soc. Psychol., 1932, 27, 65-78.
2 Wyman, J. B., ‘Tests of Intellectual, Social, and Activity Interests’.
Genetic Studies of Genius, Vol. I, by L. M. Terman. Stanford, Cal.:
Stanford University Press, 1925.
mentioned (p. 130) a similarly constructed key is available for
the Minnesota Multiphasic Inventory. Sex differences occur
in many other psychological qualities and abilities, for example
in knowledge of meanings of words drawn from different fields
of interest. Thus Slater’s 1 Selective Vocabulary Test contains
forty words answered better by males, and forty by females.
It can be applied from 13 years up. The most elaborate test is
the Attitude-Interest Analysis (M-F) Test of Terman and Miles,2
containing seven sub-tests:
Word Association (all items have multiple-choice responses);
Inkblot Associations; General Information; Emotional
and Ethical Attitudes; Interests; Opinions;
Introvertive Responses.
Considerable overlapping of the sexes occurs on the test as
a whole, but mean scores are about +50 for men and −70 for
women on a scale from +200 (extreme masculine) to −200
(extreme feminine). Homosexuals do not necessarily score
like the opposite sex, and a tentative additional scoring key for
inversion is provided. Athletes of both sexes tend to obtain
more masculine, and artists more feminine, scores than average;
and numerous other plausible group differences are described.
Thus, although the test, like the Strong Blank, may be criticized
for its blind empiricism and lack of any coherent underlying
psychological theory, it does seem to be a useful research
instrument. The three tests (Strong, MMPI, and M-F) overlap
to a reasonable extent. Shepler 3 finds an average
intercorrelation of .58 in single-sex groups.

1 Slater, P., Selective Vocabulary Test. London: Harrap, 1944.
2 Terman, L. M., and Miles, C. C., Attitude-Interest Analysis Test. New
York: Psychological Corporation, 1933. Sex and Personality. New York:
McGraw-Hill, 1936.
3 Shepler, B. F., ‘A Comparison of Masculinity-Femininity Measures’.
J. Consult. Psychol., 1951, 15, 484-486.
X
Projection Techniques

In Freudian theory, ‘projection’ is the mechanism whereby the
Ego defends itself from unwelcome or repressed wishes
and ideas by attributing them to others, or by projecting them
on to the external world. The aggressive paranoiac, for
example, thinks that everyone is attacking or plotting against
him. Few of the numerous tests which are commonly grouped
under this heading directly involve Freudian projection;
rather they provide a vehicle through which the subject
expresses his personality structure. (Cattell 1 points out how
heterogeneous they are, and suggests that ‘dynamism’ tests
might be a better term.) Thus they are closely linked with the
expression of personality through movement, which we considered
in Chap. IV. Indeed Bell’s useful and comprehensive
summary classifies expressive movement and graphology as
projective techniques. In this chapter, however, most of the
techniques reveal personality through its effects on a subject’s
perception of some stimulus, and on the mental associations,
imaginative or creative activities thereby aroused, rather than
through bodily characteristics or activities. Moreover, they
take account of what we called content, in addition to style, of
expression.
Frank 2 has pointed out that they originate from psycho­
analytic techniques of dream analysis and free association on
the one hand, and from Gestalt psychology, with its emphasis
on the whole as more than the sum of its parts, on the other.
The dream is ‘the royal road to the Unconscious’; and the
creative productions of writers and artists have often been
claimed by psychoanalysts to reveal underlying personality
trends and conflicts. At the same time, no single trait, mech­
anism, or complex manifests itself in isolation from the rest of
1 Cattell, R. B., ‘Projection and the Design of Projective Tests of
Personality’. Char. & Person., 1944, 12, 177-194.
2 Frank, L. K., ‘Projective Methods for the Study of Personality’.
J. Psychol., 1939, 8, 389-413.
the personality structure ; equally no single feature of a person’s
fantasies, nor of his responses to a projection test, should be
considered in isolation. Projection testing arose indeed partly
as a revolt against the search for tests of particular traits,
because it seemed to yield insights into the structure of the
personality as a whole. In effect this means that these methods
are not tests at all. They do not set out to measure specified
variables, any more than the psychoanalyst, say, aims to
measure the strength of his patient’s Oedipus Complex. We
shall see, also, that subjective interpretation by the ‘ tester ’
enters to a great extent, and that attempts to render the
methods more objective—with which this chapter is largely
concerned—have proved disappointing. Another important
difference from orthodox testing is that the stimulus situation
and the subject’s responses are as unconstricted or ‘unstructured’
as possible. This allows maximum scope for individual
differences, and it helps to reduce the self-consciousness and
the critical attitudes which are so prominent in answering
personality questionnaires. Usually certain instructions are
standardized, or certain material is employed, with a view
to stimulating the subject’s fantasies; and these do limit the
situation sufficiently to make possible comparisons between
one subject and another (thus dreams, artistic productions,
and completely free play are not, in fact, very suitable for
projection testing). But the subject’s spontaneous reactions
are observed and recorded in full; he is not forced to
choose between multiple responses made up by the tester,
nor does the tester attend only to a few specified categories
of behaviour.
Within the limitation just mentioned, an enormous variety
of techniques can be, and have been, employed. Indeed
there has been a plethora of new ones in recent years,
and our survey will be confined to those which are best
established. In addition to Bell’s book, an article by Sargent
may be recommended as giving an excellent description and
classification.1
1 Bell and Sargent, see Bibliography. Useful accounts of selected
techniques such as Sentence Completion, Drawing, and Finger-painting,
Mosaics and Bender-Gestalt, together with a discussion of the theory and
applications of projective testing, are contained in: Abt, L. E., and Bellak,
L., Projective Psychology. New York: Knopf, 1950.

FREE WORD ASSOCIATION

Though relatively little used nowadays, this was not only the
first projection test, but also one of the earliest methods for
exploring mental differences—being developed by Galton in
1879. Usually a list of 50 to 100 stimulus words is read out by
the tester; to each word the subject responds with the first
word that comes to mind. He is told not to search about for
particularly apt associations. Many of the associations are
superficial verbal habits—opposites, rhymes, genus-species, etc.
(Jung’s ‘objective’ type); e.g. black-white, father-mother.
But a few stimuli may touch on emotional complexes and lead
to personal (Jung’s ‘subjective’ or ‘egocentric’) responses.
Often these are accompanied by signs of embarrassment,
blocking, or laughter, and by a slow reaction time (2 seconds or
more) or complete failure to respond. These so-called complex
indicators draw attention, as it were, to a sore spot in the
personality, which repays fuller exploration. Some testers go
through the list a second time and ask the subject to reproduce
his original responses. Failures of reproduction are also
considered significant.
The first systematic studies of the diagnostic possibilities
of the test among psychoneurotic patients were carried out by
Jung.1 He also experimented with the psychogalvanic reflex,
whose deflections provide another sign of tension (though
they do not, as Whately Smith 2 claimed, closely parallel
lengthened reaction times). More fruitful is the Luria technique
of recording voluntary and involuntary muscular accompaniments
(cf. p. 54).
Jung’s list of 100 words is often used, since it contains
stimuli likely to evoke many common complexes; Rapaport,
Gill, and Schafer provide an alternative. Kent and Rosanoff’s 3
list was selected for a different purpose and avoids words likely
to ‘call up personal experiences’. Cattell 4 gives a list suitable
1 Jung, C. G., Studies in Word Association. London: Heinemann, 1918.
2 Smith, W. W., The Measurement of Emotion. London: Kegan Paul,
1922.
3 Kent, G. H., and Rosanoff, A. J., ‘A Study of Association in Insanity’.
Amer. J. Insanity, 1910-1911, 67, 37-96, 317-390.
4 Cattell, R. B., A Guide to Mental Testing. London: University of
London Press, 1936.
for application to children, and another by Boyd (unpublished)
is used at several Scottish Child Guidance Clinics. ‘Chain’
or continuous association tests are sometimes preferred, where
the subject is instructed to say everything that comes to
mind in connection with a given stimulus word or words.
Meltzer,1 for example, studied children’s attitudes to their
parents by getting them to ‘think aloud’, first about some
innocuous words like ‘table’ and ‘ball’, then ‘father’ and
‘mother’. He recorded the first ten associations to the
latter.
The clinical or qualitative applications of free association by
the psychoanalyst or psychiatrist lie outside our scope. The
simplest method of scoring responses is by counting the numbers
that fall under various types. Many classifications have been
proposed, but Wells and Murphy 2 show that there are considerable
discrepancies when the same responses are classified
by different testers, and that there seems to be no correlation
between the types to which a subject is prone and his personality
traits or his neurotic or psychotic syndrome. The same
conclusion probably holds for the diagnostic scheme elaborated
by Rapaport, et al. Other measures which have been widely
investigated include the average or median reaction time, or its
dispersion, the total number of complex indicators, and the
average psychogalvanic response. None of these seems to
correlate consistently with tests or ratings of emotionality or
other personality traits. A few more suggestive findings
deserve mention. Cantril 3 showed that persons with high
scores on the Study of Values test react more quickly to
stimulus words connected with their values (cf. also Moore and
Gilliland’s use of ‘aggressive’ words, p. 90). Fisher and
Marrow 4 found that depressed or elated moods, induced in
their subjects by hypnotic suggestion, affected the content and
1 Meltzer, H., ‘Children’s Attitudes to Parents’. Amer. J. Orthopsychiat.,
1935, 5, 244-265.
2 Wells, F. L., ‘Association Type and Personality’. Psychol. Rev.,
1919, 26, 371-376. Murphy, G., ‘Types of Word-Association in Dementia
Praecox, Manic-Depressives, and Normal Persons’. Amer. J. Psychiat.,
1923, 2, 539-571.
3 Cantril, H., ‘General and Specific Attitudes’. Psychol. Monogr., 1932,
42, No. 192.
4 Fisher, V. E., and Marrow, A. J., ‘Experimental Study of Moods’.
Char. & Person., 1934, 2, 201-208.
the speed of response. Meltzer 1 was able to classify children’s
chain associations fairly reliably under such headings as
pleasant vs. unpleasant tone, attachment to parents, level of
socialization, and then to study the kinds of homes in which
healthy and unhealthy attitudes appeared.
An entirely different approach to the quantification of word
associations was put forward by Kent and Rosanoff 2 in 1910-11.
(This was the prototype of the empirical method of standardization,
later adopted by Haggerty, Olson, and Wickman, and by
Bernreuter, Strong, and others in different contexts.) They
tabulated all the responses of 1000 miscellaneous normal persons
to their special 100-word list, and noted the frequency of each
response. When a new subject takes the same test, the frequency
values of all his responses are summed to give a measure
of what is called his idiosyncrasy (a low score) or commonality
(a high score). Alternative simpler forms of scoring are to
count the number of common responses (i.e. the modal or most
frequent responses of the standardization group), or the number
of individual responses (those not listed in the tables owing to
the infrequency of their appearance). Such tables are, of course,
useful only among people similar in language and background to
the standardization group; they quickly get out of date, and
would certainly be unsuitable in Britain. Woodrow and Lowell 3
prepared a new set for children, incidentally using a written
instead of oral form of the test. O’Connor’s 4 tables were based
on the responses of 2000 U.S. industrial workers. Probably
the best plan for anyone using the test now is to score individual
responses by reference to his own group of subjects, that is, to
count for each subject the number of responses given by no
other subject.
The empirically scored word association test does have some
value as a measure of mental abnormality; for Kent and Rosanoff
showed that the average member of their standardization group
gave 7% of individual responses, whereas 247 mental hospital
patients gave an average of 27%. The present writer also
obtained fair correlations between individual responses and
other measures of emotionality in a group of normal students.
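The two forms of scoring described here, summed frequency values against a standardization table and individual responses counted within one's own group, can be sketched in Python. The function names and the tiny three-person 'group' are hypothetical, and the sketch treats each response as an exact string, ignoring the grouping of word variants that real frequency tables involve.

```python
from collections import Counter


def frequency_tables(protocols):
    """Tabulate, for each stimulus word, how often each response was given.

    protocols -- one dict per person, mapping stimulus word to response
    """
    tables = {}
    for protocol in protocols:
        for stimulus, response in protocol.items():
            tables.setdefault(stimulus, Counter())[response] += 1
    return tables


def commonality(protocol, tables):
    """Sum the tabled frequencies of a subject's responses.

    A high total means common associations; a low one, idiosyncrasy.
    Responses absent from the tables contribute nothing.
    """
    return sum(tables.get(stimulus, Counter())[response]
               for stimulus, response in protocol.items())


def individual_response_counts(protocols):
    """For each subject, count the responses given by no other subject."""
    tables = frequency_tables(protocols)
    return [
        sum(1 for stimulus, response in protocol.items()
            if tables[stimulus][response] == 1)
        for protocol in protocols
    ]


# A toy group of three subjects and two stimulus words (invented data).
group = [
    {"table": "chair", "dark": "light"},
    {"table": "chair", "dark": "night"},
    {"table": "cloth", "dark": "light"},
]

tables = frequency_tables(group)
uniques = individual_response_counts(group)
newcomer = commonality({"table": "chair", "dark": "moon"}, tables)
```

Scoring against one's own group in this way avoids dependence on tables that are out of date or foreign, at the cost of scores that are comparable only within that group.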
1 Op. cit. 2 Op. cit.
3 Woodrow, H., and Lowell, F., ‘Children’s Association Frequency
Tables’. Psychol. Monogr., 1916, 22, No. 97.
4 O’Connor, J., Born That Way. Baltimore: Williams & Wilkins, 1928.
Many individual responses, low commonality, slow reaction
time and other complex indicators all overlap to some extent,
bu t their psychological significance is most obscure. Some
writers have identified idiosyncrasy with ‘ autistic thinking
others with introversion or with emotionality, some with
intelligence or originality, others with lack of intelligence, and
so on. B ut for the most part the relations with other tests and
ratings are so inconsistent th at the Kent-Rosanoff technique
appears nowadays to have been abandoned, even by psychia­
trists. Additional reasons for this are the lack of satisfactory
scoring tables, and the length of time required for giving the
test and scoring.
Other applications of empirical techniques are Wyman’s
standardization against interest groups (cf. p. 168), and Kelley
and Krey’s 1 even less successful attempt to measure children’s
character traits through word associations.

PRESSEY X-O OR CROSS-OUT TESTS

Like free association, Pressey’s 2 tests were devised, not as a
measure of any particular trait, but as an exploratory instrument,
to be applied in group form, for revealing the stimuli that
evoke emotional responses. Form A, for adults, contains four
sub-tests, each with twenty-five rows of five words. In the
first sub-test the subject crosses out all words that are unpleasant
to him and encircles the most unpleasant in each row. In the
second, each set of five is preceded by a word in capitals: the
subject crosses out the words associated with this stimulus and
encircles the most closely associated. In the third, things
regarded as wrong are crossed out and encircled, and in the
fourth, things about which the subject has worried or felt
nervous. Form B, for children, contains three similar sub-tests,
calling for reactions of wrong, worry, and like or interest,
respectively. Collins 3 adapted this and published it for
British use, and Bennett and Slater included it in their Sutton
1 Kelley, T. L., and Krey, A. C., Tests and Methods in the Social Sciences.
New York : Scribner, 1934.
2 Pressey, S. L., ‘ A Group Scale for Investigating the Emotions ’. J.
Abn. Soc. Psychol., 1921, 10, 55-64. Tests published by Stoelting, Chicago,
1919.
3 Collins, M., ‘ British Norms for the Pressey X-O Tests ’. Brit. J.
Psychol., 1927, 18, 121-188.
Booklet (cf. p. 125). A later edition, known as Pressey’s
Interest-Attitude Test 1 includes a section on personal
characteristics that the testee admires.
Apart from their possible clinical uses, X-O tests have been
scored in a number of ways. Pressey regarded the total words
crossed out as a measure of ‘ affectivity ’ or ‘ richness in emotional
associations ’. There is no confirmation of this, but scores on
dislikes, worries, and blameworthy actions tend to be high in
neurotics and delinquents.2 In Form A, Sub-test 1, Pressey
chose the five words in each line so that one should refer to
disgust tendencies, one to fears, one to sex, one to suspicions,
and one unemotional or a joker; e.g.:
drunk choke flirt unfair white
Similarly in the fourth sub-test, the words are supposed to
represent special types of abnormality—paranoiac, neurotic,
schizoid, melancholic, and hypochondriacal; e.g.:
injustice noise self-consciousness discouragement germs
There is no evidence that scores based on the numbers of
responses under each of these headings are diagnostic.3
Empirical scoring, analogous to Kent and Rosanoff’s, has
been applied to the encircled words to yield a measure of
‘ idiosyncrasy ’ or abnormality. Pressey’s own lists of the
words most commonly encircled in each line were based on
small and unrepresentative groups, and the scores are defective
in reliability. Collins listed the commonest responses among
1500 11- to 14-year-old British children, and found somewhat
higher idiosyncrasy scores among delinquents than normals.
But, like the free association measure, it fails to give appreciable
and consistent correlations with other measures of emotional
1 Pressey, S. L., and Pressey, L. C., ‘ Development of the Interest-
Attitude Tests ’. J. Appl. Psychol., 1933, 17, 1-18. Test published by
Psychological Corporation, New York, 1938.
2 Cf. Courthial, A., ‘ Emotional Differences of Delinquent and Non-
Delinquent Girls of Normal Intelligence ’. Arch. Psychol., 1931, 20, No.
183. Himmelweit, H. T., and Petrie, A., ‘ The Measurement of Personality
in Children ’. Brit. J. Educ. Psychol., 1951, 21, 9-29. Bennett, E., and
Slater, P., ‘ Some Tests for the Discrimination of Neurotic from Normal
Subjects ’. Brit. J. Med. Psychol., 1945, 20, 271-282.
3 Cf. Flugel, J. C., and Radclyffe, E. J. D., ‘ The Pressey Cross-Out
Test Compared with a Questionnaire ’. Brit. J. Med. Psychol., 1928, 8,
112-181.
traits,1 and has now dropped out of use. The Interest-Attitude
test is more promising, since it consists of words that boys and
girls tend to answer differently according to age, and was
standardized on 4000 cases aged 11 years upwards. Thus it
yields a measure of emotional maturity. Pressey quotes high
correlations with ratings of this trait, but there seems to be no
further confirmation. Durea 2 has prepared a scoring key which
shows some success in differentiating delinquent from normal
children.

OTHER WORD ASSOCIATION TESTS

Word-Connection. Another controlled association test for
adults was devised by Maller and Malamud, and adapted for
British use by Crown.3 Crown’s list consists of fifty stimulus
words each followed by two responses, one commonly given by
normals, one by neurotics; e.g.:
SINK wash
drown
Subjects are given this as a printed group test, and tick what
seem to them the best associations. Their scores are the
numbers of neurotic choices. Moderate correlations have been
obtained with neuroticism in several researches. But scores
appear to be affected by education or socio-economic level,
since normal groups average anywhere from 6.8 (hospital staff)
to 13.9 (unskilled labourers); neurotic groups average 14.7.
The test is very quick and simple to give, and is perhaps one
of the more promising derivatives of word association.
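
Scoring of such a list amounts to counting the ticks that fall on the keyed response. The sketch below assumes a scoring key in the form of a dictionary; only the SINK item comes from the text, and keying "drown" as the neurotic alternative is itself an assumption. The other items are placeholders invented for illustration.

```python
def neurotic_choice_score(choices, neurotic_key):
    """Return the number of items on which the ticked response
    matches the response keyed as the 'neurotic' alternative."""
    return sum(
        1
        for stimulus, ticked in choices.items()
        if neurotic_key.get(stimulus) == ticked
    )


# Hypothetical key. SINK/drown is assumed from the example in the text;
# ITEM_2 and ITEM_3 are placeholders, not items from Crown's list.
neurotic_key = {"SINK": "drown", "ITEM_2": "b", "ITEM_3": "b"}

# One subject's ticks (invented).
choices = {"SINK": "wash", "ITEM_2": "b", "ITEM_3": "b"}

score = neurotic_choice_score(choices, neurotic_key)
```

With the full list of fifty items, the score would fall on the same 0 to 50 scale as the group averages quoted above.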
Incomplete Sentences Tests. One of the difficulties of ordinary
free association is that the single-word responses provide so
1 Cf. Bridges, J. W., and Bridges, K. M. B., ‘ A Psychological Study of
Juvenile Delinquency by Group Methods ’. Genet. Psychol. Monogr.,
1926, 1, 407-506. Flemming, E. G., ‘ The Predictive Value of Certain
Tests of Emotional Stability as Applied to College Freshmen ’. Arch.
Psychol., 1928, 15, No. 96.
2 Durea, M. A., ‘ Personality Characteristics of Juvenile Delinquents ’.
Child Develpm., 1937, 8, 115-128, 257-262.
3 Maller, J. B., Controlled Association Test. Teachers College, Columbia
University, Bureau of Publications, 1934. Malamud, D. I., ‘ Value of the
Maller Controlled Association Test as a Screening Device ’. J. Psychol.,
1946, 21, 37-43. Crown, S., ‘ The Word Connection List as a Diagnostic
Test : Norms and Validation ’. Brit. J. Psychol., 1952, 43, 103-112.
little material to analyse. The very form of the test encourages
subjects to give non-diagnostic associations. Payne 1 suggested
presenting short phrases in printed form, such as :
Other people....     I failed....
If only....          My father....

The subject is instructed to write a few words, the first that
occur to him, after each. This has the great advantage of being
applicable as a group test, and although subjects often do give
superficial responses, and can fairly readily fake if they wish to,
it usually yields a wealth of emotionally toned material.
Numerous lists, ranging from 20 to 100 phrases, have been
published, among the best known being Rohde and Hildreth’s,
Shor’s and Rotter’s.2 Himmelweit and Petrie 3 find similar
tests suitable for use with children of 9 to 14 years. Ordinary
single-word stimuli, but with written sentence responses, were
also used in British Army officer selection during the war.
The responses can, of course, be interpreted clinically, but
in addition several methods of reasonably objective scoring have
been proposed. The simplest is based on the proportion of
unpleasantly toned ideas to pleasant. Rotter classifies
responses as unhealthy or showing conflict, neutral, and positive
or healthy, and lists specimens of each so that others can follow
his scheme. He claims correlations around .70 with emotional
maladjustment. Shor looks for recurrent themes, signs of
resistance or evasion, unusual or atypical associations, and
other variables; while Rohde tries to assess the strength of
various needs. The more elaborate methods are certainly less
reliable and there is less evidence of their validity.4 It will be
remembered that this was the most successful of the projection
1 Payne, A. F., Sentence Completions. New York : New York Guidance
Clinic, 1928.
2 Rohde, A. R., and Hildreth, G., Rohde-Hildreth Sentence Completion
Blank. New York : Psychological Corporation, 1940. Shor, J., ‘ Report
on a Verbal Projective Technique ’. J. Clin. Psychol., 1946, 2, 279-282.
Rotter, J. B., et al., ‘ Validation of the Rotter Incomplete Sentences
Blank for College Students ’. J. Consult. Psychol., 1949, 18, 848-856.
3 Op. cit., p. 176.
4 Cf. Symonds, P. M., ‘ The Sentence Completion Test as a Projective
Technique ’. J. Abn. Soc. Psychol., 1947, 42, 320-829.
tests used in Kelly and Fiske's research on selecting clinical
psychologists (p. 26).

STORY-TELLING TESTS

Raven’s Controlled Projection Test.1 This should be mentioned
briefly because, although it does not attempt to measure anything
and has received no experimental validation, it does
provide a method of exploration suitable for children from
6 years and for adults. In the individual form the subject
does a free drawing and is simultaneously told an incomplete
story about someone similar to himself, and asked a number of
questions: what did he (or she) think about so-and-so, what
did he do next, etc.? The dual task is supposed to produce
more spontaneous responses and more identification with the
person described, though there is no evidence for this. In the
group form the subjects look at a picture containing a person
with whom they can identify themselves, and answer questions
about him in writing. No principles of interpretation are
suggested, but sample drawings and responses for different ages
are reproduced. Foulds 2 lists typical, and often very different,
responses for normal and delinquent boys ; and Kaldegg 3 has
applied the method in an extensive study of differences in
family and other attitudes among English and German children
and students.
Self-Description. Candidates for commissions in the British
Army and for the higher Civil Service are asked to write
two short descriptions of themselves, first as seen by a
good friend, secondly by a candid critic. These are inter­
preted by the psychologist or psychiatrist in conjunction
with material gathered during the interview and with other
projection test responses. No evidence regarding its value is
available.
Several writers have made use of more free forms of verbal
fantasy than Raven, presenting, for example, skeleton stories
1 Raven, J. C., Controlled Projection Test. London : Lewis, 1952.
2 Foulds, G. A., ‘ Characteristic Projection Test Responses of a Group of
Defective Delinquents ’. Brit. J. Psychol., 1950, 40, 124-127.
3 Kaldegg, A., ‘ Responses of German and English Secondary School
Boys to a Projection Test ’. Brit. J. Psychol., 1948, 39, 30-58. ‘ A Study
of German and English Teacher-Training Students by means of Projective
Techniques ’. Ibid., 1951, 42, 56-113.
or beginnings of stories and getting children or adults to
elaborate or complete these.1 These are usually analysed
along the same lines as responses to the Thematic Apperception
test (q.v.).
Griffiths 2 describes the wealth of material obtainable from
5-year-old children by free story-telling, drawings, accounts of
dreams, reactions to simple inkblots, and interviews. She was
more concerned with emotional development and imagination
in this age group than with personality diagnosis.
Literary Productions. It is a truism that the creative writer,
painter, musician, scientist, or philosopher expresses his
personality in the style and content of his productions. But
discussions of this topic are almost wholly subjective and
speculative. Kretschmer’s pyknic (cyclothyme) and asthenic
(schizothyme) writers, etc., are claimed to show distinctive
styles. William James’s tough and tender-minded philosophers
represent a similar dichotomy. Eysenck and Gilmour,3 however,
classified 107 philosophers as materialists or idealists on the
basis of a short attitude-type test, but found no relationship
between these views and any aspect of extraversion-introversion
as measured by Guilford’s questionnaire. Wolff and Arnheim’s
matching experiments (cf. p. 49 f.) have shown some consistency
between literary style and other expressive characteristics. In
one investigation, the present writer found it possible to match
essays produced anonymously by 18 students with sketches
of their personalities based on observation during an hour’s
testing session, with moderate success. F. Allport, Walker,
and Lathers 4 carried out a larger study of several essays
written by a group of students, demonstrating that their style
was generally consistent throughout. This style often seemed
to throw light on the personalities of the writers, but they did
not try to verify this. Probably the uncontrolled literary
1 E.g. Murray, H. A., ‘ Techniques for a Systematic Investigation of
Fantasy ’. J. Psychol., 1936, 3, 115-143. Sargent, H., ‘ An Experimental
Application of Projective Principles to a Paper and Pencil Personality
Test ’. Psychol. Monogr., 1944, 57, No. 265.
2 Griffiths, R., A Study of Imagination in Early Childhood. London :
Kegan Paul, 1935.
3 Eysenck, H. J., and Gilmour, J. S. L., ‘ The Psychology of Philosophers:
A Factorial Study ’. Char. & Person., 1944, 12, 290-298.
4 Allport, F. H., Walker, L., and Lathers, E., ‘ Written Composition and
Characteristics of Personality ’. Arch. Psychol., 1934, 26, No. 173.
production is far too complex to provide a basis for any
scientific approach to personality diagnosis.

THEMATIC APPERCEPTION TEST (T.A.T.)

Binet, Burt, and others have used picture interpretation as
a means of studying intellectual and emotional qualities, but
the test in its present form was first described by Morgan and
Murray.1 They also collected the standard series of 30 photographs
(10 for men, 10 for women, 10 for both). The subject is
shown each picture and asked to make up a story describing
the situation, the events leading up to it, and the outcome,
together with the thoughts and feelings of the characters.
Prompting is allowed. A verbatim record is kept for later
analysis. The session does not usually exceed an hour, though
the subject may not cover all the pictures in this time. A
second interview may be held, or it may be used for discussing
the previous stories and asking the subject himself to help in
interpretation.
Several other sets of pictures have been used; generally ten
are regarded as sufficient. They should be sufficiently ambiguous
in content to give free rein to fantasy, and should show a
variety of incidents. But each of them should portray an
individual of the same sex and about the same age as, or
younger than, the subject, so that he can readily identify
himself and project his own needs and sentiments, frustrations
and conflicts, and resistances, on the ‘ hero ’. The test is widely
used, also, in group written form, pictures being projected by
slides for a few minutes each. Thus it was included in War
Office and Civil Service selection, and by the Office of Strategic
Services. It is very desirable that a standard procedure and
standard pictures be used, so that the results of different
investigators can be compared. Moreover, if the common
themes or ‘ norms ’ of performance are established, interpretation
can more readily be based on any unusual elements (just
as with expressive movements, cf. p. 50). Rapaport, Gill and
Schafer publish such a list of expected responses for the Murray
set. An interesting continental version of T.A.T. is Van
1 Morgan, C. D., and Murray, H. A., ‘ A Method for Investigating
Fantasies : the Thematic Apperception Test ’. Arch. Neurol. & Psychiat.,
1935, 34, 289-806. Test published by Harvard University Press, 1948.

Lennep’s 1 4-Picture test. Jackson’s 2 set of 6 pictures, A Test
of Family Attitudes, appears suitable for use with children in
this country (Fig. 6).

Fig. 6. Reproduction, about two-thirds actual size, of Picture II in Lydia Jackson’s Test.

Unfortunately the approach to interpretation varies widely
with the theoretical background of the psychologist. Those
who follow Murray 3 have adopted his elaborate system of
needs and presses (external forces that the individual regards as
beneficial or harmful). But Rotter, Harrison, Rapaport, and
Wyatt present simpler schemes.4 These are based on the style
1 Van Lennep, D. J., Four Picture Test. The Hague : Nijhoff, 1948.
2 Jackson, L., ‘ Emotional Attitudes Towards the Family of Normal,
Neurotic, and Delinquent Children ’. Brit. J. Psychol., 1950, 41, 35-51,
173-185. Test published by Methuen, London, 1952.
3 Murray, H. A., et al., Explorations in Personality. New York : Oxford
University Press, 1938. Sanford, R. N., et al., ‘ Physique, Personality and
Scholarship ’. Monogr. Soc. Res. Child Develpm., 1943, 8, No. 84. Tomkins,
S. S., The Thematic Apperception Test. New York : Grune and Stratton,
1947.
4 Harrison, R., ‘ The Thematic Apperception and Rorschach Methods
of Personality Investigation in Clinical Practice ’. J. Psychol., 1943, 15,
49-74. Rotter, J. B., ‘ Thematic Apperception Tests : Suggestions for
Administration and Interpretation ’. J. Person., 1946, 15, 70-92. Wyatt,
F., ‘ The Scoring and Analysis of the Thematic Apperception Test ’.
J. Psychol., 1947, 24, 319-830. Rapaport, et al., Bibliography.
or structure—compliance with instructions, language character­
istics, logical coherence, consistency of stories with one another,
realisticness, etc., and on the content of recurrent themes—
predominant emotional tone, social and sexual or other attitudes,
conscious strivings (particularly of the hero), and unconscious
motivations and defences. It is essential for the scorer-interpreter
to be trained and experienced; the amateur clinical
psychologist who uses any old set of pictures and interprets the
responses largely by intuition should certainly be discouraged.
Burt and Sen 1 advocate a still more straightforward
approach, owing little or nothing to abnormal psychology.
They assess the strength of a number of qualities such as Level
of Organization or Coherence, Observation of Details, Verbal
Richness, Imagination, Extraversion-Introversion of the main
themes, Maturity vs. Childishness, Integration vs. Neuroticism.
Symonds,2 on the other hand, claims that the stories mainly
represent repressed drives which may be the very reverse of the
overt personality traits.
Many writers have testified to the value of T.A.T. in mental
hospitals, clinics, etc., especially when used along with other
tests such as Rorschach. But scientific evidence is more
conflicting. Harrison and Rotter 3 have been able to show that
interpretations of the same material by different testers are
reasonably consistent, and have achieved a high degree of
success in matching interpretations with independent case
histories. Bell gives an extensive list of the characteristics of
stories produced by various types of psychotics and neurotics.
Thus it should be genuinely useful for differential diagnosis.
But it is much less successful when used for predicting suitability
of personality for some occupation (Army officer, clinical
psychologist, etc.).4 Harrison 5 was able to estimate the IQs

1 Unpublished memorandum.
2 Symonds, P. M., ‘ Interpreting the Picture Story (TAT) Method ’.
Amer. Psychologist, 1947, 2, 288-289.
3 Harrison, R., and Rotter, J. B., ‘ A Note on the Reliability of the
Thematic Apperception Test ’. J. Abn. Soc. Psychol., 1945, 40, 97-99.
4 Cf. Kelly and Fiske, Bibliography. Guilford, J. P., and Lacey, J. I.,
Printed Classification Tests. Army Air Forces Aviat. Prog. Res. Rep.
No. 5. Washington, D.C. : U.S. Government Printing Office, 1947.
5 Harrison, R., ‘ Studies in the Use and Validity of the Thematic
Apperception Test with Mentally Disordered Patients ’. Char. & Person.,
1940, 9, 122-138.
of 87 cases, with a validity of .78. Sen 1 finds the reliability
of assessments by different scorers on Burt’s qualities to
average only .4; nevertheless such qualities as Observation,
Verbal Ability, Level of Organization and Maturity gave very
promising correlations with follow-up results among high-grade
civil servants. Some attempts have been made to objectify the
test by providing multiple-choice questions or stories to choose
from, but this seems to be a blind alley.

OTHER PICTURE TESTS

Shneidmann’s Make a Picture Story Test (MAPS).2 This
consists of 22 background pictures, and 67 figures, some of
which are arranged on the background by the subject, while he
tells a story about what they are doing. So far this has been
used chiefly with schizophrenic patients, and a series of objectively
scorable characteristics (in addition to qualitative
features) have been found to differentiate them from normals.
Rosenzweig’s Picture-Frustration (P-F) Study.3 Although
this picture test also involves somewhat subjective scoring and
interpretation, its limitation to a few fairly clear-cut personality
trends is an advantage. It is based on Rosenzweig’s classification
of people’s reactions to frustrating situations: some tend
to blame others or the environment (extrapunitive); some find
fault with themselves (intropunitive); while some tend to
minimize or evade the frustration (impunitive). Adults’ and
children’s tests are available, and they can be given in group
form. They consist of two dozen cartoons or line drawings,
in which one person describes a situation or makes a remark
which deprives, disappoints, accuses, harms, or incriminates
another. The subject is instructed to write the first words that
come to mind as the response of the second (frustrated) person.
A number of categories of response are distinguished, and
specimen answers are provided to help in scoring. There is as
1 Op. cit.
2 Shneidmann, E. S., ‘ Schizophrenia and the MAPS Test ’. Genet.
Psychol. Monogr., 1948, 38, 145-223.
3 Rosenzweig, S., ‘ The Picture-Association Method and its Application
in a Study of Reactions to Frustration ’. Char. & Person., 1945, 14, 3-28.
Tests published by S. Rosenzweig, Western State Psychiatric Hospital,
Pittsburgh, 1944. Cf. also Bernard, J., ‘ The Rosenzweig Picture-
Frustration Study ’. J. Psychol., 1949, 28, 325-343.
yet little evidence to prove that the resulting scores do correspond
to typical frustration reactions in everyday life. However,
one of the categories, need-persistence (measured by responses
that stress searching for a solution), correlated positively with
the persistence factor in MacArthur’s research (cf. p. 14).
The Szondi Test.1 This is the most bizarre in our catalogue.
It consists of 6 sets of 8 photographs. In each set of 8 is a
homosexual, a murderer, an epileptic, an hysteric, a catatonic,
a paranoiac, a manic, and a depressive. The subject chooses
2 pictures in each set which he likes most and 2 which he
dislikes. This should be repeated on several occasions. According
to Szondi, it is not those who show these personality
characteristics overtly who like the corresponding pictures, but
those in whom the tendencies are latent (owing to recessive
genes). Repressed, rejected, or sublimated tendencies are
represented by dislikes. Although the chief English-speaking
proponent of the test, Deri,2 has put forward a less esoteric
account of the theoretical basis and uses of the method, there
seems to be no evidence of its value, apart from case-study
material.

THE RORSCHACH INKBLOT TEST 3

This is far and away the most popular projection technique,
and it is probable that more work has been done in standardizing
and developing it into a scientific as well as a clinical instrument,
and in training psychologists to use it properly, than on all the
others put together. On the other hand, there is an unfortunate
tendency for Rorschach testing to become a cult, like psychoanalysis,
with the same tendency to dogmatism, to an elaborate
jargon and to dissenting sects, and the same implications that
only the initiated can understand it, and that it is immune
from ordinary scientific standards of reliability and validity
because it is concerned with the ‘ total ’ personality. More
than 1000 books and articles have been published on it to date.
1 Szondi, L., Szondi Test. New York : Grune and Stratton, 1937.
2 Deri, S., Introduction to the Szondi Test. New York : Grune and
Stratton, 1949.
3 Rorschach, H., Psychodiagnostics. New York : Grune and Stratton,
1942 ; London : Methuen. Blots (4th edit.), same publisher, 1946. A
useful introduction for British readers is Mons, W., Principles and Practice
of the Rorschach Personality Test. London : Faber, 1948.
Since 1937 the Rorschach Research Exchange has been published,
a journal exclusively concerned with the test, though this has
now been expanded into the Journal of Projective Techniques.

Fig. 7. Outline reproduction of Rorschach’s Blot No. III. (Nine-sixteenths original size.)

Many psychologists from Binet onwards have used meaningless
inkblots as material for investigating the imagination.1
Rorschach, a Swiss psychiatrist, tried out a large number of
rather elaborate blots with mental hospital patients before
publishing the standard set of 10, together with his monograph,
Psychodiagnostik, in 1921. Administration is very simple. The
blots (some black-grey-white, some coloured (cf. Fig. 7)) are
presented in turn, and the subject is asked what he sees in each,
what it makes him think of. He is encouraged to produce as
many associations as possible, and to turn the cards round if he
wishes. The examiner records unobtrusively, usually on a
specially prepared form.2 After completing the series, the
1 Cf. Tulchin, S. H., ‘ The pre-Rorschach Use of Inkblot Tests ’.
Rorschach Res. Exchange, 1940, 4, 1-7. Cf. also Hertz, Bibliography.
2 E.g. Klopfer, B., and Davidson, H. H., Record Blank for the Rorschach
Method of Personality Diagnosis. New York : Rorschach Institute, 1939.
responses are discussed in order to elucidate what determined
them, and to which parts of the blots they refer. In Klopfer’s
‘ testing the limits ’, there is a third stage where the tester
probes deeper and prompts responses which were not produced
spontaneously. For a normal production of about thirty
responses, testing usually takes less than half an hour, but there
are enormous variations; some patients give less than ten
responses, some several hundreds. Scoring and interpretation
may take several hours.
Each response is scored under three headings:
I. Mode of apperception: whether the whole blot (W
response), an ordinary detail (D) or unusual
detail (Dd), or based on a white space in the blot
(DS), etc.
II. The determinant: the form or shape (F+ or F−
according to aptness), colour (C), shading or
chiaroscuro (K), or a response implying movement
(M). The relative proportions of colour and movement
responses yield the Erlebnistypus or experience
balance: extratensive (excess of colour), introversive
(excess movement), many of both types
(dilated ambiequal), few or none (constricted).
III. Content: original (O+ or O−), or common response
(P); human (H), animal (A), anatomical, object,
geographical, nature, architecture, plant, etc.
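
The three headings lend themselves to a simple tally. The sketch below is a loose illustration only: it assumes each response has already been given one symbol under each heading, and the `few` cut-off used to separate "few or none" from "many" is a hypothetical parameter, since the text gives no numerical criterion. Nor does it attempt the weighting of different kinds of colour response used in the published manuals.

```python
from collections import Counter


def psychogram(scored_responses):
    """Sum each scoring symbol across all responses.

    Each response is a (location, determinant, content) tuple,
    e.g. ("W", "M", "H"). One symbol per heading is assumed.
    """
    totals = Counter()
    for location, determinant, content in scored_responses:
        totals[location] += 1
        totals[determinant] += 1
        totals[content] += 1
    return totals


def experience_balance(colour, movement, few=2):
    """Classify the colour/movement balance.

    `few` is a hypothetical cut-off, not taken from any manual.
    """
    if colour <= few and movement <= few:
        return "constricted"
    if colour > movement:
        return "extratensive"
    if movement > colour:
        return "introversive"
    return "dilated ambiequal"


# Invented record of five scored responses.
record = [
    ("W", "M", "H"),
    ("D", "F+", "A"),
    ("D", "C", "A"),
    ("Dd", "M", "H"),
    ("W", "M", "A"),
]
totals = psychogram(record)
balance = experience_balance(totals["C"], totals["M"])
```

Because every category total depends on how many responses the subject produces, raw tallies of this kind are hard to compare between subjects without some allowance for productivity.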
Beck and Klopfer,1 the leaders of the main American ‘ sects ’,
suggest a number of more detailed scoring categories. Each of
the main types of response is summed to give the psychogram
or profile, and interpretation is based mainly on these scores
and their inter-relations. The profile epitomizes, as it were,
both the intellect and the affective life of the subject. Thus
intelligence is shown by good form, original and movement
responses; but an excess of W shows a more abstract and
synthetic, of D a more practical, and of Dd a pedantic, kind of
intellect. Colour corresponds roughly to emotionality, and if
the colour responses show little influence of form, this emotionality
is poorly controlled. Movement responses represent
1 Beck, S. J., ‘ Introduction to the Rorschach Method ’. Res. Monogr.
Amer. Orthopsychiat. Ass., 1937, No. 1. Klopfer, B., and Kelley, D. M.,
The Rorschach Technique. Yonkers, N.Y. : World Book Company, 1942.
a rich inner life rather than overt emotion, somewhat like
Jung’s introversion, and the various kinds of shading responses
show sensitivity and anxiety. Some of the content categories
have special significance; thus the commonest responses,
animals, show lack of imagination, anatomical responses,
morbidity,1 etc. Though particular responses are sometimes
explored, e.g. along psychoanalytic lines, content plays a
much smaller part in interpretation than it does in the clinical
application of word association tests or in T.A.T. Indeed many
psychologists regard Rorschach and T.A.T. as complementary,
and prefer to apply both to their subjects. The sequence of
Rorschach responses, and the context in which certain ones
appear, may also help in the interpretation of neurotic trends
and resistances. The significance of any score naturally
depends on the extent to which it deviates above or below the
normal score for other similar people. Numerous tables of
norms have been published, but there are such wide variations
with age and education of the subjects, probably also with the
manner of giving the test, and with productivity or total
number of responses, that these are far from satisfactory.
The above account is much over-simplified, and reference
should be made to Klopfer’s and/or Beck’s manuals. It is
most essential for the Rorschach tester to receive thorough
training, and fortunately this is fairly readily available in
U.S.A. and in England.
Reliability and validity are difficult to investigate, because
Rorschachites insist that no category should be considered in
isolation.2 (The dependence of every category on the total
number of responses, and the skewness of score distributions,
constitute additional snags.) 3 Nevertheless, interpretations are
frequently based on small differences in category scores; for
example, 8 Colour-Form and 1 Form-Colour would be regarded
1 Sandler and Ackner have recently classified types of content by factor
analysis. On comparing their categories with psychotic diagnoses and
symptoms, anatomical responses were found to be strongly associated with
an insecure-aggressive, rather than an hypochondriacal, syndrome.
Sandler, J., and Ackner, B., ‘ Rorschach Content Analysis : An Experimental
Investigation ’. Brit. J. Med. Psychol., 1951, 24, 180-201.
2 Cf. Ainsworth, M. D., ‘ Some Problems of Validation of Projective
Techniques ’. Brit. J. Med. Psychol., 1951, 24, 151-161.
3 Cf. Cronbach, L. J., ‘ Statistical Methods Applied to Rorschach Scores ’.
Psychol. Bull., 1949, 46, 398-429.
as very different from 1 Colour-Form and 8 Form-Colour
responses. Thus it is desirable that such scores should possess
stability and consistency. The ordinary split-half and retest
methods of studying reliability have been used, but are open
to criticism. Perhaps the fairest method is to compare with
the results of a parallel series of blots, two of which are available:
one by Behn-Eschenburg, one by Harrower.1 Meadows 2
obtained correlations averaging only .50 for 25 score categories
between Rorschach and Behn responses of 100 normal persons
and 100 neurotics. They ranged up to .84, but many coefficients
were very low. The reliability of scoring might also be queried,
but several experiments have shown that different experts do
agree very closely. Tables have been published to assist in the
scoring of doubtful responses (good vs. poor forms, normal vs.
rare details, etc.). Other researches indicate that it is scarcely
possible for a subject to fake a good or poor personality, unless,
of course, he has been primed. Temporary mood, however, and
the rapport between the subject and the particular tester do
seem to have some influence.3
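The parallel-series comparison described above amounts to correlating, category by category, the scores the same subjects obtain on two sets of blots. The following is a minimal sketch of that calculation, not a reconstruction of Meadows's procedure; the function names, the category labels, and the five-subject scores are hypothetical and serve only as illustration.

```python
from math import sqrt


def pearson_r(x, y):
    """Product-moment correlation between two equal-length score lists."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two equal-length lists of at least 2 scores")
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("scores must vary for a correlation to be defined")
    return sxy / sqrt(sxx * syy)


def parallel_form_reliabilities(form_a, form_b):
    """Return {category: r} for scores on two parallel series of blots.

    form_a and form_b map a category name to the list of scores of the
    same subjects, in the same order, on each series.
    """
    return {cat: pearson_r(form_a[cat], form_b[cat]) for cat in form_a}


# Hypothetical scores of five subjects on two categories (illustration only).
first_series = {"W": [3, 5, 2, 6, 4], "M": [1, 0, 2, 3, 1]}
second_series = {"W": [4, 5, 1, 6, 3], "M": [0, 2, 1, 1, 3]}

coefficients = parallel_form_reliabilities(first_series, second_series)
average_r = sum(coefficients.values()) / len(coefficients)
```

Averaging the coefficients over categories, as in the last line, gives a single summary figure of the kind quoted in the text, though it conceals how widely the individual categories may differ.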
Claims for the diagnostic significance of the categories are
by no means as fanciful as might appear, since a vast amount of
evidence is available on the trends in different neurotic and
psychotic groups. A typical manic patient, for example, gives
a very different psychogram from a typical depressed case.
Moderate correlations are often obtained between certain
scores and intelligence tests; Hertz 4 quotes higher ones with
neuroticism, though these have not been confirmed. Sen 5 has
shown by factorizing the various category scores that they can
be reduced to three main components which might be termed
associative fluency or productivity, intelligence, and neurotic
tendency. Correlations of around .5 were obtained between
1 Behn-Eschenburg, H., Psychische Schüleruntersuchungen mit dem
Formdeutversuch. Bern: Bircher, 1921. Harrower-Erickson, M. R.,
Psychodiagnostic Inkblots: A Series Parallel to the Rorschach Blots. New
York: Grune and Stratton, 1945.
2 Meadows, A. W., An Investigation of the Rorschach and Behn Tests.
Ph.D. Thesis, University of London, 1951.
3 Cf. Lord, E., ‘Experimentally Induced Variations in Rorschach
Performance’. Psychol. Monogr., 1950, 64, No. 810.
4 Hertz, M. R., ‘The Rorschach Inkblot Test: Historical Summary’.
Psychol. Bull., 1935, 32, 33-66.
5 Sen, A., ‘A Statistical Study of the Rorschach Test’. Brit. J. Psychol.,
Statist. Sec., 1950, 3, 21-39.
the factor scores of students and associates’ ratings on these
traits. This type of piecemeal validation is, of course, entirely
foreign to the manner in which the Rorschach is normally
used, though the fact that it is successful suggests that it
might be turned into a more objective personality test.
Matching methods of validation have been tried, though they, too,
have their disadvantages. Undoubtedly they show that
diagnoses based on the psychogram as a whole (even when
interpretation is done ‘blindly’, without seeing the subjects
themselves) can be identified with case-studies or clinical data
with considerable success.1 Thus the skilled Rorschach tester
can certainly help in the differential diagnosis of mental
patients. He seems, however, to be less able to make valid
predictions about normal persons. Thus the test was of very
little value in selecting clinical psychologists. Nor did any of
the scores correlate with success among U.S. Army Air Force
pilots, but predictions made from the complete psychograms
did yield an appreciable validity coefficient.
Several modifications of technique have been tried. Harrower 2
has worked out a method of group examination, where the blots
are shown by slides and subjects write their responses on
prepared forms. This seems to yield similar material to the
usual method, but rather less complete. Munroe 3 developed
an ‘inspection technique’ for rapid and relatively objective
scoring. The presence or absence of twenty-eight signs of
emotional instability in group records was counted, and these
sign-scores gave rather promising validity in identifying students
whose academic work fell below the level expected from
intelligence tests and previous examinations. Similar is Bühler and
Lefever’s 4 method of scoring for adequacy vs. malfunctioning
1 Cf. Vernon, P. E., ‘The Significance of the Rorschach Test’. Brit.
J. Med. Psychol., 1935, 15, 199-217. Krugman, J. E., ‘A Clinical
Validation of the Rorschach with Problem Children’. Rorschach Res. Exchange,
1942, 6, 61-70. Benjamin, J. D., and Ebaugh, F. G., ‘The Diagnostic
Validity of the Rorschach Test’. Amer. J. Psychiat., 1938, 94, 1163-1178.
2 Harrower-Erickson, M. R., and Steiner, M. E., Large Scale Rorschach
Techniques. Springfield, Ill.: Thomas, 1945.
3 Munroe, R. L., ‘Prediction of the Adjustment and Academic
Performance of College Students by a Modification of the Rorschach Method’.
Appl. Psychol. Monogr., 1945, No. 7.
4 Bühler, C., Bühler, K., and Lefever, W. D., Development of the Basic
Rorschach Score with Manual of Directions. Los Angeles, Cal.: Rorschach
Standardization Studies, No. 1, 1948.
of personality. Unfortunately Munroe’s technique failed to
yield appreciable correlations with college grades in another
experiment by Cronbach,1 with a rather different type of student
population, though there was moderate agreement with
dormitory staff ratings of social adjustment. Harrower also
devised an entirely objective multiple-choice test. The blots
are shown on slides and nine possible responses to each are
presented, some commonly given by normal people, some by
neurotics (these are listed by Eysenck). The subject ticks those
which he thinks appropriate and his score is the number of
neurotic choices. This seems to have very poor validity, but a
more reliable method suggested by Eysenck, where the subjects
rank each set of nine responses, has given small correlations
with neuroticism.

ARTISTIC PRODUCTIONS

Lowenfeld’s Mosaic Test.2 The subject is given a box
containing some 2 to 400 small squares, triangles, and
diamond-shaped pieces, coloured white, black, red, green, blue, and
yellow, together with a wooden tray lined with a sheet of
white paper. He is told to make anything he likes with the
pieces, and is encouraged to go on until he is satisfied with his
construction. A permanent record can be made by drawing
round the pieces before removing them and chalking in their
colours. A few subjects make concrete or representational
patterns, but more commonly they create abstract ones which
are classified into various types: compact, scattered, incoherent,
also according to success or the completeness of the Gestalt, the
use of edge or frame designs, winged and arrow patterns, etc.
Though a considerable amount of data is available on the types
of designs occurring in different psychopathological groups,3 and
among children of different ages, the test can hardly be said
to provide an objective method of diagnosis. Lowenfeld, and
those whom she has trained, can give remarkably penetrating
1 Cronbach, L. J., ‘Studies of the Group Rorschach in Relation to
Success in the College of the University of Chicago’. J. Educ. Psychol.,
1950, 41, 65-82.
2 Lowenfeld, M., ‘The Mosaic Test’. Amer. J. Orthopsychiat., 1949, 19,
537-550. Test material published by Institute of Child Psychology,
London, and Psychological Corporation, New York.
3 Summarized by Bell.
accounts of personality structure, but they are less successful
so far in laying down definite principles of interpretation which
others can follow. Kerr, and Himmelweit and Eysenck 1
quote experiments in which experienced testers successfully
matched Mosaics with personality sketches, or wrote sketches
from the Mosaics which were identified by psychiatrists. The
tester was, however, unable to predict the answers of neurotic
patients to a personality questionnaire. Notable advantages
of the Mosaic Test over other projection tests are that it is quite
brief, and that it can be given many times over. Thus it is particularly
useful for following through personality changes during
psychological treatment.
Drawings and Paintings. Psychological studies of art 2 have
concentrated chiefly on:
(a) Developmental stages. A useful application of such
work is Goodenough’s ‘Draw-a-Man’ performance
test of intelligence;
(b) Cultural influences;
(c) Artistic productions of psychotic patients, e.g.
schizophrenics 3;
(d) The therapeutic value of drawing and painting among
maladjusted children, and the changes that occur
during the course of treatment. Among adult neurotics
also, particularly those who are treated by followers of
Jung, drawings are sometimes analysed in the same
manner as dreams;
(e) Tests, and factorial analysis of types, of artistic taste.
It is very widely assumed in child guidance that the style and
content of drawings provide valuable diagnostic hints. Bell
gives a lengthy table of aspects or elements and the significance
of each for personality which has been alleged by various
authors. But there is an astonishing dearth of evidence other
than case-study material which, without a control group,
1 Kerr, M., ‘The Validity of the Mosaic Test’. Amer. J. Orthopsychiat.,
1939, 9, 232-236. Himmelweit, H. T., and Eysenck, H. J., ‘An
Experimental Analysis of the Mosaic Projection Test’. Brit. J. Med. Psychol.,
1945, 20, 283-294.
2 Cf. Goodenough and Harris, Bibliography.
3 Cf. Anastasi, A., and Foley, J. P., ‘A Survey of the Literature
on Artistic Behavior in the Abnormal’. Psychol. Monogr., 1940, 52,
No. 237.
proves nothing. For example, one cannot claim that such-and-
such a feature is characteristic of ‘anxious’ children, unless
one has also shown that this feature is relatively absent from
drawings of normal children. Even the matching method,
which gave strong support to the validity of T.A.T., Rorschach,
and Mosaics, has been neglected.1 It is far too easy to read
adult associations into children’s productions: for example, to
say that fences represent repressions, dark colours or shading,
anxieties, etc. Distorted human figures, houses with tiny
windows and the like often arise merely from backwardness or
from defective drawing skill. (Some authors recommend
finger-painting as making fewer demands than paint-brush,
crayon, or pencil.) 2 However, Goodenough and others have
proved that there tend to be more incongruities in drawings of a
man by maladjusted and delinquent children than in those of
normals; and such children often score below their Binet
Mental Ages on her test. Buck 3 uses as a test for adults
drawings of a house, a tree, and a person (H-T-P Test), and
accompanies these by a series of questions. His method of
interpretation, largely based on Freudian symbolism, is highly
subjective and lacking in validation.
In conclusion, we do not deny that drawings and paintings
are expressive of personality, nor that they are valuable as an
exploratory tool in clinical treatment. There are also well-
established abnormalities in the productions of various psycho-
pathological groups. But the origins of any feature are so
complex that interpretation is in much the same state as was
graphology 50 years ago.

AESTHETIC APPRECIATION

Many writers on art, such as Herbert Read, have attempted
to classify modes of appreciation, as well as types of artistic
1 Waehner describes a system of analysis of adult drawings and shows
that personality sketches derived from this can be matched very
successfully with sketches based on Rorschach. Waehner, T. S., ‘Interpretation
of Spontaneous Drawings and Paintings’. Genet. Psychol. Monogr., 1946,
33, 3-70.
2 Cf. Napoli, P. J., ‘Finger Painting and Personality Diagnosis’.
Genet. Psychol. Monogr., 1946, 34, 129-231.
3 Buck, J. N., ‘The H-T-P Technique: A Qualitative and Quantitative
Scoring Manual’. J. Clin. Psychol., 1948, 4, 317-396.
production, and have suggested relationships with Jung’s or
other systems of personality types. On the basis of
introspective experiments, Bullough, Valentine, and Myers derived
four main types of response to colours, musical tones, and other
aesthetic stimuli: the objective or technical, the subjective or
emotional, the associative, and the characterizing types. They
did not claim, however, that individuals adhere consistently to
one type, nor did they attribute these differences to personality
traits. This is a problem which is amenable to experimental
investigation, using Burt’s technique of factorizing
correlations between persons. Individuals are asked to rank a set
of pictures or other artistic objects in order of appreciation
(or rate them on an appropriate scale). When their orders
are inter-correlated, a fairly high degree of overlapping is
usually found even between experts and untrained adults and
children. The correlation between each individual’s order
and the average or standard order may be taken as a measure
of his artistic taste. Eysenck 1 finds some consistency in
taste even when measured by such different sets of materials
as reproductions of portraits, landscapes, statues, furniture,
abstract figures (polygons), pieces of poetry, and odours.
But over and above this general factor, he finds that one
or more bipolar factors can be extracted, showing that some
people have a fairly consistent preference for one type of art,
others for another. The most clear-cut of these dichotomies
is that between highly coloured, fairly simple, and
impressionistic artistic styles and more complex, subtle and formal,
or classical styles. Eysenck provides some evidence that
preference for the former type is associated with extraversion,
also with radical vs. conservative attitudes and (inversely)
with age.
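The measure of taste mentioned above, the correlation between an individual's order and the average order, can be sketched as follows. This is an illustrative calculation only, not Burt's or Eysenck's factorial procedure: the function names and the three judges' rankings are hypothetical, and the product-moment formula is applied directly to the ranks.

```python
from math import sqrt


def pearson_r(x, y):
    """Product-moment correlation between two equal-length lists."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two equal-length lists of at least 2 values")
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("values must vary for a correlation to be defined")
    return sxy / sqrt(sxx * syy)


def taste_scores(rankings):
    """Correlate each judge's ranks with the average rank of every item.

    rankings maps a judge to the list of ranks (1 = most liked) he gave
    to the same items, listed in the same order for every judge.
    """
    judges = list(rankings)
    n_items = len(rankings[judges[0]])
    average_order = [
        sum(rankings[j][i] for j in judges) / len(judges)
        for i in range(n_items)
    ]
    return {j: pearson_r(rankings[j], average_order) for j in judges}


# Hypothetical rankings of four pictures by three judges (illustration only).
rankings = {
    "judge_a": [1, 2, 3, 4],
    "judge_b": [1, 2, 3, 4],
    "judge_c": [4, 3, 2, 1],
}
scores = taste_scores(rankings)
```

In this sketch each judge's own ranking enters the average against which he is compared, which slightly inflates his score when the group is small; with a large group, or an average taken from a separate panel of judges, the effect becomes negligible.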
Similar factors have been distinguished in the appreciation of
poetry and music, though there is less evidence of their
connection with personality traits. Burt 2 claims, however, that
persons belonging to his four main temperamental types (cf.
1 Eysenck, H. J., ‘Some Factors in the Appreciation of Poetry, and
their Relation to Temperamental Qualities’. Char. & Person., 1940, 9,
160-167. ‘The General Factor in Aesthetic Judgements’. Brit. J.
Psychol., 1940, 31, 94-102. ‘“Type”-Factors in Aesthetic Judgements’.
Brit. J. Psychol., 1941, 31, 262-270.
2 Burt, C. L., ‘The Factorial Analysis of Emotional Traits’. Char. &
Person., 1939, 7, 238-254, 285-299.
p. 16) tend to show characteristic tastes in all the arts, as
follows:
Unstable extraverts like romantic, emotional, and dramatic
productions, with strong human interest, e.g. Titian
and Rubens, flamboyant Gothic architecture, Wagner
and Strauss, Byron and Shakespeare.
Stable extraverts like more objective and realistic art, for
the associations it arouses rather than the emotions,
e.g. Raphael and Chardin, Handel and Verdi, Johnson
and Macaulay.
Unstable introverts like mystical and impressionistic art,
e.g. El Greco and Blake, early Gothic, Debussy and
Delius, Shelley and Yeats.
Stable introverts like intellectual and formal productions,
e.g. Van Eyck, Vermeer, and Cezanne; Bach;
Wordsworth and Henry James.
In one experiment with postcard reproductions, people judged
as belonging to these types did express liking for 73% of the
corresponding and only 42% of the non-corresponding types of
pictures. The other attributions are plausible, but there is no
published evidence for them so far.

PLAY METHODS

Older theories explained play as resulting from surplus
energy, or as practising the instincts, or as recapitulating the
evolution of the race. Nowadays it is generally regarded as a
medium of self-expression in which wishes, anxieties and
conflicts, and socially unacceptable impulses are worked out.
Thus it is particularly valuable in the psychotherapy of young
children, for whom language is such a difficult means of
communicating emotions. Like artistic productions, however, it
certainly does not provide a straightforward diagnostic tool.
One would expect that fantasy behaviour would often be
compensatory, and that quite different attitudes would be
expressed, e.g. in doll play, from those normally shown to
the family, teacher, or other children. Inevitably,
therefore, subjective interpretation enters in the use of play as
a clue to personality, and there are grave dangers in jumping
to conclusions about children on the basis of casual
observations and a slight acquaintance with psychoanalytic
theories.
A further difficulty in adapting play as a projection technique
is that it does not normally leave a permanent record;
descriptions by an observer or play therapist introduce many possible
sources of error. Researches under Sears 1 at Iowa show that
valuable results from the standpoint of personality theory can
be obtained by the application of exact observational techniques
to various aspects of doll play. An alternative approach is to
employ a standard set of toys with which the child constructs a
scene, or plays out a drama, and to photograph the resulting
constructions. This is the basis of Lowenfeld’s 2 ‘World’ test,
so called because the child portrays, as it were, his view of the
world with miniature people, animals, houses, trees, etc.
Bühler and Kelley have collected and published a set of
materials for this test, which they regard as a test of emotional
disturbance. The method has been found useful also with
adults.3 Moreno’s 4 psychodrama is a more elaborate method of
involving adults in play for diagnostic and therapeutic purposes.
It is far too dependent on the skill of the director-interpreter
to be considered seriously as a diagnostic test. However,
it connects up with the group observational methods of
Chap. VI.
Although play provides such a rich field for personality
investigations among children, and has a very extensive
literature, it seems unlikely to yield any convenient or
practicable tests. We should recall, however, that it does not only
reflect unconscious motives. As described in Chap. VI, valuable
measurements of social and character traits were obtained from
time-sampling and tests based on play. Obviously, too, it
provides an indication of interests. It would take too long to

1 Cf. Pintler, M. H., Phillips, R., and Sears, R. R., ‘Sex Differences in
the Projective Doll Play of Pre-school Children’. J. Psychol., 1946, 21,
73-80.
2 Lowenfeld, M., ‘The Nature and Use of the Lowenfeld World
Technique in Work with Children and Adults’. J. Psychol., 1950, 30, 325-331.
Bühler, C., and Kelley, G., The World Test. A Measurement of Emotional
Disturbance. New York: Psychological Corporation, 1941.
3 Cf. Bolgar, H., and Fischer, L. K., ‘Personality Projection in the
World Test’. Amer. J. Orthopsychiat., 1947, 17, 117-128.
4 Moreno, J. L., Sociodrama: A Method for the Analysis of Social
Conflicts. New York: Beacon House, 1944.

observe and record all a child’s play activities in order to survey
his interests; and the results might have poor predictive value
(for example, early mechanical interests among boys are
notoriously unstable). But the Pressey X-O and Strong Interest
Blank try to cover the same ground at a more sophisticated
level.

SENSE OF HUMOUR TESTS

The mechanism of projection certainly enters into our
appreciation of the comic, and there have been some attempts to
base personality tests on the types of jokes appreciated.
Eysenck 1 found that when people rank jokes or other humorous
material in order of funniness, or rate their appreciation, there
is remarkably little agreement between them. Some persons,
however, consistently tend to rate the simpler sexual and
aggressive jokes more highly, whereas others consistently prefer
more subtle and clever humour. In statistical terms, there is
no clear general factor in the correlations between raters, but
there is a prominent bipolar. Some correlations were found
between the latter factor and extraversion-introversion, as
measured by a personality questionnaire in normal subjects,
and there was a slight tendency for hysterical patients to rate
the sexual type more highly than did dysthymics.
A much more elaborate test has been published by Cattell and
Luborsky,2 where some 200 jokes have been classified by factor
(cluster) analysis into 11 pairs of types; for example:
Carefree, sexual vs. mordant and morose.
Derision of stupidity, etc. vs. stable acceptance.
Disregard of conventions vs. light badinage.
High positive or negative scores on each of these are claimed to
derive from repressed personality tendencies, and certain
correlations have been found with personality-questionnaire
factors. For example, the first type connects with traits akin
to surgency-desurgency. The scheme is probably too elaborate;
1 Eysenck, H. J., ‘The Appreciation of Humour: An Experimental and
Theoretical Study’. Brit. J. Psychol., 1942, 32, 295-309.
2 Cattell, R. B., and Luborsky, L. B., ‘Personality Factors in Response
to Humor’. J. Abn. Soc. Psychol., 1947, 42, 402-421. C-L Humor Test.
Champaign, Ill.: Institute for Personality and Ability Testing, 1950.

for many of the scores have low reliability, and they overlap
considerably. Indeed when they are factorized they appear to
resolve largely into the same bipolar dimension that Eysenck
found. Moreover, the significance of the various kinds of jokes
is dubious. When several judges were asked to classify them
under Cattell’s 11 types, they showed little agreement.
Nevertheless, the test is both genuinely projective and objective, and
clearly merits further development.
XI
Conclusions and Future Developments

In trying to sum up the practical implications of our survey,
it is advisable to distinguish three main situations in which
personality tests or assessments are required: selection,
experimentation, and diagnosis or guidance. The field of
selection is the most straightforward because it is the least
affected by the difficulties of personality theory and the many
unsolved problems discussed in Chap. I. For example, there
is no need to reach agreement as to the main traits or dimensions
of personality. The value of any proposed test or other method
can be determined directly by comparison with some external
criterion such as educational or vocational success and failure.
At the same time, progress along purely empirical lines is
likely to be slow; the choice of suitable methods depends
largely on adequate personality theory, and on advances in the
experimental and diagnostic study of personality. And, as
Vernon and Parry point out, follow-up research and the
discovery of good external criteria are far from easy. The
application of tests for selection is handicapped, too, by the
need for such tests to be short and simple, not dependent on
highly trained testers nor on elaborate apparatus, and so forth.
We therefore naturally ask first how useful are paper-and-
pencil tests, since they can generally be applied to groups of
candidates, and scored, by slightly trained testers, at little
cost. Tests of interests and attitudes such as the Strong Blank
(p. 163), the Kuder Preference Record (p. 168), and the Study of
Values (p. 162) have certainly proved their worth in America,
though nothing similar is immediately available in Britain;
and it is doubtful how far they are suitable for average adults
or for 11- to 14-year-old children, as contrasted with college
students. The forced-choice questionnaire covering a limited
number of main types of interest (like Kuder’s) appears more
promising than Strong’s or any simple check list of interests
and inclinations, because it eliminates response sets (cf. p. 167).
It is also easier to standardize and to score. A grave defect of
such tests is their susceptibility to faking, and they should at
least be supplemented by objective information tests (p. 91).
When a small number of interests is involved, as in selection for
technical schooling, Peel and Lambert’s technique (p. 91) may
be an improvement. Other aptitude tests to some extent reflect
personality and interests; for example, Moss’s Social Intelligence
Test (p. 92) might well be superior to ordinary intelligence tests
in selecting for appointments involving social contacts.
It is difficult to see any use whatever in selection for
miscellaneous personality inventories like the Bernreuter, Bell,
Guilford, Boyd, etc., valuable though it would be to have
measures of such traits as instability, extraversion, and
ascendance. The reasons are fully set out in Chap. VIII. The
only type worth considering would have to be at least as
disguised as the Bennett-Slater (p. 125), and should preferably
employ the forced-choice principle. That is, self-rating items
should be tried out in actual selection situations: a small
number of valid items should be chosen on the basis of
correlations with an external criterion, and combined with invalid
ones in such a way as to defeat the faker or the malingerer.
More factual biographical inventories, perhaps containing a
few relatively innocuous self-description items, are worth
developing but, as pointed out on p. 136, they necessitate large
numbers and a rather stable selection set-up. In the case of
children we advocated (p. 121) the construction of a similar
third-person questionnaire and rating scale for teachers to apply,
whose items would be empirically scored after validation
against, say, secondary school success.
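The empirical choice of items described above, keeping those whose answers correlate with an external criterion, can be sketched as below. This is a minimal illustration under stated assumptions, not a procedure taken from the studies cited: the function names, the cut-off of .30, the item labels, and the six candidates' data are all hypothetical, and answers are assumed to be scored 1 or 0.

```python
from math import sqrt


def pearson_r(x, y):
    """Product-moment correlation; with 0/1 answers this is point-biserial."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two equal-length lists of at least 2 values")
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("values must vary for a correlation to be defined")
    return sxy / sqrt(sxx * syy)


def select_valid_items(responses, criterion, min_r=0.30):
    """Return (validities, keyed): every item's r and those reaching min_r.

    responses maps an item label to the 0/1 answers of all candidates;
    criterion lists the same candidates' criterion scores in the same order.
    The cut-off min_r is an arbitrary illustrative value.
    """
    validities = {
        item: pearson_r(answers, criterion)
        for item, answers in responses.items()
    }
    keyed = {item: r for item, r in validities.items() if abs(r) >= min_r}
    return validities, keyed


# Hypothetical answers of six candidates to two items, with criterion scores.
criterion = [1, 2, 3, 4, 5, 6]
responses = {
    "item_1": [0, 0, 0, 1, 1, 1],
    "item_2": [0, 1, 0, 1, 0, 1],
}
validities, keyed = select_valid_items(responses, criterion)
```

With samples as small as this the coefficients are of course unstable; as the text notes, large numbers and a stable selection set-up are needed, and the keyed items should be checked again on a fresh group before use.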
Few of the well-known projection tests are suitable. They
are too lengthy, much too dependent on the subjective judgment
of the tester-interpreter, and evidence for their validity is poor.
Thus it seems unlikely that the group Thematic Apperception
contributed anything worth while to the selection of British
Army officers or civil servants. Sentence Completion (p. 177)
is most free from these defects, but needs to be put across very
carefully if candid responses are to be obtained. Picture
Frustration (p. 184) might be tried, and group Rorschach scored
by inspection technique would be worth validating against
external criteria. A useful, quick, and objective controlled
association test might be developed from Malamud’s and
Crown’s Word Connection List (p. 177) and Pressey’s
Interest-Attitude test (p. 176), given sufficiently large numbers for the
preparation of up-to-date scoring keys.
Physical and physiological measurements and tests of
behaviour must be approached very cautiously. The
electroencephalograph does indicate certain types of abnormality, and
it is possible that the psychogalvanic reflex or other measures
of autonomic functioning might prove relevant to certain jobs.
Probably, however, they are too variable, and involve too
elaborate recording. Static ataxia (p. 80) is one test of proven
validity, and Sheldon’s somatotypes (p. 37) may be significant.
Even if the correlations of any such measure with the criterion
are low, provided they are stable, it can contribute something
to selection.
Most of the objective tests fall into two classes, described in
Chaps. V and VI respectively. The former are too simple and
specific, for example most of the f, p, and o tests. The latter,
which approximate more or less closely to samples of everyday
life behaviour, are too dependent on the subject’s attitudes and
on the way they are put across, and, therefore, too readily
distorted by his desire to display an acceptable personality.
Nevertheless, there are some exceptions: body sway (p. 80),
manual dexterity and bodily co-ordination (p. 80) are certainly
connected with emotional stability; and the discrepancy
between verbal and spatial-mechanical abilities (p. 74) has
some significance. More hopeful are indirect measures including
handwriting pressure characteristics (p. 60), oral fluency and
the verb/adjective quotient in speech (p. 56), muscular
reactions in the Luria apparatus (p. 54) and, particularly,
measures of tension and breakdown under stress in complex
reaction, co-ordination and learning tests (p. 85). The Q-score
in Porteus Mazes (p. 63) and ‘carefulness’ at group tests
(p. 66) fall under the same heading. Persistence can be
measured in complex performance tests and in resistance-to-
discomfort, but an adequate battery of persistence tests tends
to involve too elaborate arrangements and too much time to be
practicable.
It is a moot point whether a psychological tester’s assessments
of such traits as stability, impulsiveness, persistence, etc., based
on observation of behaviour at apparatus tests (p. 66), would
not be more reliable and valid than objective measures of such
behaviour. Both might be employed. The various group
observation methods developed by War Office Selection Boards
(p. 96) are highly acceptable to employers and candidates but,
being far more complex, they involve more subjective
interpretation on the part of the observer and allow more scope for the
candidate to modify his normal behaviour. Thus they depend
greatly on the skill of the particular observer, and are generally
less valid than might be expected. By contrast, time-sampling
(p. 94) or other techniques of recording specified categories of
behaviour are highly objective, but are too time-consuming and
elaborate to be employed in selection. A compromise might be
worked out, where the situations would be more structured
than in most group exercises used at present, and the recording
scheme more standardized.
Another interesting contrast may be drawn between group
exercises and ordinary ratings by associates. The former make
use of rather artificial situations, which are usually too brief to
provide representative samplings of the candidates’ normal
behaviour (though superior in this respect to the conventional
interview situation). The latter are based on normal behaviour,
usually observed over a considerable period, but in a casual
rather than systematic fashion. The prime defect of ratings is
that they represent crystallized emotional attitudes of the raters
towards the ratees, even when the rating scales attempt to
emphasize observed behaviour. Observers of group exercises,
or psychologists who judge qualitative behaviour at performance
tests, are better able to maintain impartiality in comparing
one candidate with another. Thus there is much to be said for
arranging, where possible, trial periods of a week, a month,
or more, on the job (or in the case of children, at the school) for
which they are being selected. Their performance during this
period should be assessed, not by generalized ratings, but by
careful recording of specified types of behaviour; and these
should be made by a trained observer rather than by an
employer, head teacher, or supervisor. The objection will doubtless
be raised that ‘specified types of behaviour’ are too narrow,
and that an overall picture of the candidate’s personal and
social adjustment to the job is needed. But a great deal of
evidence indicates that a few really reliable measurements of
people provide better predictions of educational and
occupational success than do subjective generalizations about their
personalities as wholes.
Although we have seen that ratings seldom contribute anything
more than do school marks in educational selection, they
can nevertheless be of value in occupational selection. It is
most desirable that each rater should be acquainted with thirty
or more candidates, in order that relative, not absolute,
judgments can be used. Possibly, however, a carefully
constructed forced-choice scale would overcome the distortions
due to variations of standards among raters of small numbers.
If neither of these approaches is practicable, the best alternative
is for the rater to fill in a highly concrete third-person
questionnaire, accompanying this with a qualitative personality sketch
or testimonial, and for the selecting psychologist to interpret
this material so as to reach a final rating which will be
comparable from one candidate to another. The Vineland Social
Maturity scale (p. 110) fits in well with this conception, and
there is a need for other scales along the same lines. Several
other points regarding the technique of rating are made in
Chap. VII, for example, the desirability of getting judgments
from raters with varied outlooks, and of making rather more
use of ratings or nominations by peers, less of ratings by
superiors (teachers or previous employers, etc.).
Finally, the functions of the selection interview should be
severely restricted, for the reasons given in Chap. II, though it
can seldom be dispensed with because of its acceptability to
candidates and employers, and its greater practicability than
that of more objective techniques. It does have value in
marshalling the evidence about a candidate’s previous educational
or occupational career, and it can provide useful indications
of such traits as social aplomb, fluency, physical appearance,
etc. As a diagnostic technique for assessing personality
in general, or for synthesizing data from all sources into a final
judgment of suitability, it is, on the average, extremely unreliable
and invalid, although some interviewers (not necessarily
psychologists or psychiatrists) are much better than others.

EXPERIMENTAL RESEARCH

There is scarcely a method either praised or condemned in
this book which could not be clarified by further investigation.
Results are so variable in the personality field that the repetition,
extension, and interlocking of previous researches would be
more useful than the continued construction of more or less
novel tests. The concentrated effort which has gone into the
improvement of social attitude scales, and made them into a
genuinely useful instrument for sociological and opinion surveys,
might well be applied to the measurement of autonomic
variables, to time-sampling and systematic observation among
older children and adults, to interest classification and measurement,
to the selection and training of good interviewers, to the
development of apparatus tests which will indirectly evoke
really significant personal behaviour, to the improvement of
ratings or measurements ‘within persons’ and of techniques
for the quantitative treatment of patterns of scores, and to a
host of other fruitful problems.
Perhaps the greatest need is for more studies like those of
Hartshorne and May, Eysenck, MacArthur and others, of
particular traits by the trait-composite, syndrome, and factorial
approaches, though unfortunately they have to be extremely
elaborate. This is the only approach which provides satisfactory
criteria of the validity of different tests, rating or other
techniques, and so indicates those that can be most profitably
employed in selection, diagnosis, and guidance. We must rely
on it also for the discovery of the main personality dimensions
or factors, those which cover the greatest amount of variance.
For though it is true that the personalities of a set of individuals
differ in innumerable ways, and that no two persons possess
just the same traits differing only in amount, yet if we could
but measure even three or four of the chief emotional-social
and character dimensions and half a dozen or so of the chief
interests and social attitudes, we would probably be able to
make as accurate predictions about personality as we already
can about abilities by means of a short battery of ability tests.
And we could afford to neglect the diverse shadings in different
persons and dispense with subjective interpretations of the
‘total personality’. The greater the variety of tests or assessments
included in such composites or factors the better. But
it is particularly desirable to make more use than hitherto of
measures based on time-sampling or records of specified categories
of behaviour. This approach can, of course, readily be
combined with the engineering of diagnostic situations, as in
Burt’s and Hartshorne and May’s researches.
From the viewpoint of psychological science, there are two
main defects in the trait-composite approach: that it is
‘cross-sectional’ rather than dynamic, and that it abstracts personality
traits as though they were wholly properties of individuals.
Direct recording of behaviour should help to reconcile
trait-psychology with field theory and the study of social groups and
processes on the one hand, and with ‘longitudinal’ studies of
personality development on the other hand. Although we have
used the framework of trait psychology throughout this book,
there is good reason to hope that some more fruitful system will
eventually replace it. Obviously there are many other types of
contemporary research (into experimental depth psychology,
into the relations between perception and social and personality
factors, etc.) which will help in constructing an adequate
theory of personality, although at present they have little
relevance to problems of personality assessment.

DIAGNOSIS AND GUIDANCE

The testing or assessment of personality for purposes of
diagnosis, guidance, treatment, or control, is much the most
intractable problem. For there is neither an external criterion
of its value (as in selection), nor an internal criterion, the
consistency of the results with one another (as in experimental
research). The success of vocational guidance has indeed been
followed up, but it is not possible to attribute this to any
particular element in the procedure. Agreement with psychiatric
diagnoses among mental patients also appears promising, but
actually leads to such variable results that it is hardly possible
to dissociate the test from the tester. One clinical psychologist
does well with interviewing or with Rorschach, another with
Thematic Apperception or drawings, another with deterioration
tests or expressive movements, and so on. Hence in order to
prove the worth of these or other methods, the diagnostician
has to rely chiefly on evidence provided by the selector or the
experimentalist. Unfortunately he is much too apt to trust to
the face-validity of his case-studies and to his own experience;
these may be supremely good in individual instances and worthless
in others. The lack of any generally accepted framework
for the description of personality creates further difficulties;
the graphologist, Rorschachite, Freudian, or the average
non-medical vocational psychologist, talk in quite different
languages. That is why we have stressed making the maximum
use of a small number of operationally defined composite
variables, in spite of their apparent narrowness.
The diagnostician is often less limited by time and cost than
the selector, and may well spend a dozen hours on a single case
if he applies Rorschach, T.A.T., Minnesota Multiphasic, and a
biographical interview. Kelly and Fiske’s results throw the
gravest doubts on this intensive approach, and justify some
resort to instruments which have received more objective
validation, even though they seem less helpful in providing
insights. Any of the techniques mentioned earlier in this
chapter under selection could play a useful part in guidance,
since any which have been successfully validated provide the
clinician with checks on his intuitions. In addition, personality
inventories are more worth while than in selection, because of
the generally better motivation of candidates for guidance.
Incidentally there is no need for questionnaires to be as long as
MMPI or Strong’s Blank, even if these are at present the best of
their kind. Needless to say, their results must be interpreted
with caution, but at the same time the psychologist should
remember that his main object in using them is to correct
biases in his own subjective summing-up of the personality he
is studying. Another step which would be salutary in a
vocational, psychiatric, or other clinic with a large staff,
would be for different members to carry out the interviewing
and the favoured projection or other tests involving subjective
judgment, instead of expecting a single psychologist to improve
his diagnoses by applying several techniques. Finally, there is
no evidence that psychiatric interviewers or psychologists who
make use of the concepts of abnormal psychology are any
more consistent or capable of making more valid educational
or vocational predictions than those trained in normal and
applied psychology. However valuable the medical approach
may be in therapy, the bulk of research shows that it has little
to contribute to selection or guidance.
Thus we conclude as we started, that the testing or assessment
of human personality is fraught with so many difficulties (it is
more complex indeed than any other problem in individual
psychology) that even the application of the highest psychological
skill and technical accomplishment cannot be expected
to bring about rapid success.
SHORT BIBLIOGRAPHY OF SUGGESTED READING
Allport, G. W., Personality: A Psychological Interpretation. New York:
Holt, 1937.
Allport, G. W., and Vernon, P. E., Studies in Expressive Movement. New
York: Macmillan, 1932.
Bell, J. E., Projective Techniques. New York: Longmans Green, 1948.
Buros, O. K., The Nineteen-Forty Mental Measurements Yearbook.
Highland Park, N.J., 1941. The Third Mental Measurements Yearbook.
New Brunswick, N.J.: Rutgers University Press, 1949.
Burt, C. L., ‘Personality, a Symposium. I. The Assessment of
Personality.’ Brit. J. Educ. Psychol., 1945, 15, 107-121.
Cattell, R. B., An Introduction to Personality Study. London: Hutchinson,
1950.
Eysenck, H. J., Dimensions of Personality. London: Kegan Paul,
1947.
Hartshorne, H., and May, M. A., Studies in Deceit. Studies in Service and
Self-Control. Studies in the Organization of Character. New York:
Macmillan, 1928-1930.
Fryer, D., The Measurement of Interests. New York: Holt, 1931.
Hollingworth, H. L., Vocational Psychology and Character Analysis.
New York: Appleton, 1929.
Hunt, J. McV. (ed.), Personality and the Behavior Disorders. Chapters by
Jones, Maller, White, Sheldon, Lewin, Hunt and Cofer. New York:
Ronald Press, 1944.
Kelly, E. L., and Fiske, D. W., The Prediction of Performance in Clinical
Psychology. Ann Arbor, Mich.: University of Michigan Press,
1951.
Paterson, D. G., Physique and Intellect. New York: Appleton-Century,
1930.
Rapaport, D., Gill, M., and Schafer, R., Diagnostic Psychological Testing.
Chicago, Ill.: Year Book Publishers, 1945-1946.
Stagner, R., Psychology of Personality. New York: McGraw-Hill, 2nd
edit., 1948.
Strang, R., Counseling Technics in College and Secondary School. New York:
Harper, 2nd edit., 1949.
Symonds, P. M., Diagnosing Personality and Conduct. New York:
Appleton-Century, 1931.
Thurstone, L. L., and Chave, E. J., The Measurement of Attitude. Chicago,
Ill.: University of Chicago Press, 1929.
Vernon, P. E., and Parry, J. B., Personnel Selection in the British Forces.
London: University of London Press, 1949.

ARTICLES IN THE PSYCHOLOGICAL BULLETIN

Arrington, R. E., ‘Time Sampling in Studies of Social Behavior’. 1943,
40, 81-124.
Berdie, R. F., ‘Factors Related to Vocational Interests’. 1944, 41,
137-157.
Campbell, D. T., ‘The Indirect Assessment of Social Attitudes’. 1950,
47, 15-38.
Ellis, A., ‘The Validity of Personality Questionnaires’. 1946, 43, 385-440.
Ellis, A., and Conrad, H. S., ‘The Validity of Personality Inventories in
Military Practice’. 1948, 45, 385-426.
Frank, J. D., ‘Recent Studies of the Level of Aspiration’. 1941, 38,
218-226.
Goodenough, F. L., and Harris, D. B., ‘Studies in the Psychology of
Children’s Drawings’. II. 1928-1949. 1950, 47, 369-433.
Hertz, M. R., ‘Rorschach: Twenty Years After’. 1942, 39, 529-572.
Jenkins, W. O., ‘A Review of Leadership Studies with Particular Reference
to Military Problems’. 1947, 44, 54-79.
McNemar, Q., ‘Opinion-Attitude Methodology’. 1946, 43, 289-374.
Orlansky, H., ‘Infant Care and Personality’. 1949, 46, 1-48.
Ryans, D. G., ‘The Measurement of Persistence: An Historical Review’.
1939, 36, 715-739.
Sanford, F. H., ‘Speech and Personality’. 1942, 39, 811-845.
Sargent, H., ‘Projective Methods: Their Origins, Theory, and Application
in Personality Research’. 1945, 42, 257-293.
Vernon, P. E., ‘The Matching Method Applied to Investigations of
Personality’. 1936, 33, 149-177.
INDEX OF AUTHORS
Abt, L. E., 171
Ackner, B., 188
Adams, H. F., 119
Adler, A., 38, 127
Adorno, T. W., 77, 158
Ainsworth, M. D., 188
Albino, R. C., 55
Alexander, F., 188
Alexander, W. P., 14
Allport, F. H., 124, 127, 131, 137, 143, 144, 180
Allport, G. W., 2, 4, 6, 18, 50-2, 56-7, 60-1, 69, 71, 122, 124, 127,
    131, 137, 143, 144, 158, 162, 207
Anastasi, A., 192
Anstey, E., 112
Arnheim, R., 49, 58, 180
Arrington, R. E., 94, 208
Ash, P., 25

Babcock, H., 72
Baier, D. E., 113
Baines, A. H. J., 112
Balken, E. R., 57
Ball, R. J., 86-7
Banks, C., 7, 15, 19
Beck, S. J., 187-8
Behn-Eschenburg, H., 189
Bell, H. M., 132, 141, 200
Bell, J. E., 57, 59, 170-1, 183, 191-2, 207
Bellak, L., 171
Bender, L., 61
Benjamin, J. D., 190
Bennett, E., 125-6, 129, 175-6, 200
Berdie, R. F., 166, 208
Berg, E. A., 74
Berlyne, D. E., 161
Berman, L., 40
Bernard, J., 184
Bernreuter, R. G., 127, 131-2, 134, 139, 141, 166, 174, 200
Bernstein, E., 76
Biesheuvel, S., 66
Binet, A., 88, 181, 186
Blackwell, A. M., 147
Bobbitt, J. M., 24
Bobertag, O., 58
Bolgar, H., 196
Bousfield, W. A., 71
Boyd, W., 132-3, 140, 173, 200
Bradshaw, F. F., 107
Bridges, J. W., 177
Bridges, K. M. B., 109, 177
Brody, M. B., 72
Brogden, H. E., 14
Bronner, A. F., 40
Brown, W., 87
Brown, W. M., 67
Buck, J. N., 193
Buhler, C., 14, 61, 190-1, 196
Bühler, K., 190-1
Bullough, E., 194
Burnham, P. S., 166
Buros, O. K., 135, 207
Burt, C. L., 7, 10, 15-16, 19, 29-30, 36, 39, 47, 48, 62, 68, 93, 96-7,
    103, 114-15, 124, 181, 183-4, 194-5, 204, 207
Busemann, A., 57

Cady, V. M., 93, 124
Cameron, D. C., 24
Campbell, D. T., 159-61, 208
Cantril, H., 24, 56, 59, 159, 173
Carter, L., 5
Cason, H., 128
Cattell, R. B., 2, 6, 11-13, 15-16, 19, 41, 70, 72, 76, 78, 86, 90-1,
    129, 134, 146, 161, 170, 172-3, 197-8, 207
Chant, S. N. F., 128
Chapman, J. C., 72
Chave, E. J., 146-7, 152, 207
Chi, P. L., 116
Child, I. L., 37
Clark, K. E., 164
Clarke, A. D. B., 55
Cleeton, G. U., 52
Cockett, R., 70
Cofer, C. N., 207
Collins, M., 175-6
Connor, D. V., 79, 113
Conrad, H. S., 141-3, 208
Courthial, A., 176
Crawford, A. B., 166
Crissy, W. J. E., 166
Cronbach, L. J., 19, 140, 188, 191
Crown, S., 148-9, 152, 156, 177, 200
Culpin, M., 85-6
Cunningham, E. M., 94

Darling, R. P., 41
Darrow, C. W., 43-4
Darwin, Charles, 46
Davidson, H. H., 149, 186
Davies, D. T., 39
Davis, R. C., 53
Deri, S., 185
Dispensa, J., 41
Doll, E. A., 71, 110-11
Dorcus, R. M., 131, 134
Downey, J. E., 68, 81
Duffy, E., 55, 166
Dunbar, H. F., 38
Durea, M. A., 177
Dusenbury, D., 55

Earl, C. J. C., 74
Ebaugh, F. G., 190
Edwards, A. L., 156
Eisenberg, P., 140
Ellis, A., 137-8, 141-3, 208
Enke, W., 53
Estes, S. G., 52
Eysenck, H. J., 3, 10-12, 14-15, 17-18, 3^-7, 39, 41, 52, 59, 70,
    76, 79-80, 84, 87-9, 103, 125, 127, 142, 146, 148-9, 151-2, 155-6,
    158-9, 180, 191-2, 194, 197-8, 204, 207

Farmer, E., 85-6
Faterson, H. F., 38
Fay, P. J., 56
Fearing, F., 24
Ferguson, G. A., 77
Ferguson, L. W., 88, 118
Fernald, G. G., 83
Fischer, L. K., 196
Fisher, S., 78
Fisher, V. E., 178
Fiske, D. W., 15, 26-9, 87, 160, 179, 188, 206, 207
Flanagan, J. C., 182
Flemming, E. G., 177
Flugel, J. C., 176
Foley, J. P., 192
Forlano, G., 185
Foulds, G. A., 66, 179
Frank, J. D., 88, 208
Frank, L. K., 170
Fraser, J. Munro, 99
Freeman, G. L., 86
French, J. W., 84
Frenkel, E., 14
Frenkel-Brunswick, E., 77, 158
Freud, Sigmund, 4, 15, 50, 170-1, 193
Freyd, M., 62, 107, 124, 126, 168
Fruchter, B., 67
Fryer, D., 90-1, 168, 207
Furchtgott, 40
Furfey, P. H., 167-8

Gahagan, L., 49
Galton, Francis, 34, 172
Gardner, D. E. M., 96
Gardner, J. W., 89
Garretson, O. K., 165
Gilchrist, J. C., 40
Gill, M., 73, 172-8, 181-2, 207
Gilliland, A. R., 89-90, 173
Gilmour, J. S. L., 180
Glaser, E. M., 160
Goldman, F., 15
Goldstein, K., 74
Goodenough, F. L., 62, 94, 192-8, 208
Gordon, L. V., 123
Griffiths, R., 180
Guertin, W. H., 78
Guilford, J. P., 27, 66, 79, 91, 104, 131, 133-4, 136, 167, 180, 188,
    200
Guilford, R. B., 133-4
Guttman, L., 128, 154, 156
Haggerty, M. E., 28, 111, 121, 174
Hammond, K. R., 161
Hanfmann, E., 74
Hanna, J. V., 141
Harris, D. B., 192, 208
Harris, D. H., 142
Harrison, R., 182-8
Harrower-Erickson, M. R., 189-91
Hartog, P., 24
Hartshorne, H., 5, 7, 14, 84, 93-4, 118, 110, 149, 204, 207
Harvey, O. L., 58
Hathaway, S. R., 129-80
Haythorn, W., 5
Healy, W., 40
Heath, L. L., 43
Heath, S. R., 80
Heidbreder, E., 109, 124, 120-7, 131
Heist, A. B., 90
Henle, M., 50
Herrington, L. P., 41
Hertz, M. R., 180, 189, 208
Hildreth, G., 178
Hill, A. B., 24
Hill, D., 44
Himmelweit, H. T., 20, 52, 07, 09, 74, 88-9, 170, 178, 192
Hollingworth, H. L., 23, 115, 119, 135-0, 139, 207
Horowitz, E. L., 101
Howell, M., 5
Howells, T. H., 84
Hubbell, M. B., 50
Hull, C. L., 58, 80, 104
Hull, H. L., 84-0
Humm, D. G., 130-1, 140, 143
Hunt, J. M., 79
Hunt, J. McV., 72, 88, 89, 207
Hunt, T., 92
Hunt, W. A., 28
Huntley, C. W., 53
Husén, T., 122

Jackson, L., 182
Jacobson, E., 53^4
Jacoby, H. J., 57
Jaensch, E., 17, 42
James, H. E. O., 150
James, William, 180
Jasper, H. H., 70, 128
Jastak, J., 73
Jenkins, W. O., 81, 114, 208
Johnson, W. B., 50, 189
Jones, E. S., 207
Jones, H. E., 38
Jordan, D., 147
Jung, C. G., 15, 10, 17, 18, 120, 172, 188, 192-3

Kaldegg, A., 179
Katz, D., 144
Kehr, T., 80
Kelley, D. M., 187
Kelley, G., 190
Kelley, T. L., 175
Kelly, E. L., 20-9, 100, 179, 183, 200, 207
Kellmer-Pringle, M. L., 110
Kenna, J. C., 75-0
Kent, G. H., 23, 108, 172, 174-0
Kerr, M., 192
Kilpatrick, F. P., 150
Kirkpatrick, C., 157
Klages, L., 58
Klopfer, B., 180-8
Knight, F. B., 52, 118
Knower, F. H., 55
Kohlstedt, K. D., 120
Kounin, J. S., 77
Kramer, B. M., 158
Kretschmer, E., 10, 17, 85-8, 127, 180
Krey, A. C., 175
Krout, M. H., 51
Krugman, J. E., 190
Kuder, G. F., 22, 110, 103, 105, 107, 199
Kulp, D. H., 149

Lacey, J. I., 00, 91, 131, 130, 183
Laird, D. A., 109, 124-6, 129, 131
Lambert, C. M., 91-2, 200
Land, A. H., 58
Landis, C., 40, 48, 101
Langdon, J. N., 120
Lankes, W., 70
Lathers, E., 180
Lavater, J. C., 32
Lefever, W. D., 190-1
Lentz, T. F., 147, 151
Levy, L., 72
Lewin, K., 77, 89, 96, 207
Lichtenstein, A., 158
Likert, R., 148, 151, 158
Lindzey, G., 162
Lippitt, R., 96
Lombroso, C., 33
Lord, E., 189
Lovell, C., 134
Lowell, F., 174
Lowenfeld, M., 191-2, 196
Luborsky, L. B., 197-8
Luchins, A. S., 77
Ludgate, K. E., 82
Luria, A. R., 54-5, 103, 172, 201
Lurie, L. A., 40, 103

MacArthur, R. S., 14, 84-5, 114, 185, 204
Magson, E. H., 24
Malamud, D. I., 177, 200
Maller, J. B., 14, 134-5, 177, 207
Manson, G. E., 67
Marrow, A. J., 173
Marston, L. R., 109
Martin, H. L., 27, 133
Maslow, A. H., 128
Masserman, J. H., 57
Mather, V. G., 55-6
Mathews, E., 124
May, M. A., 5, 7, 14, 84, 93-4, 113, 116, 149, 204, 207
McClelland, W., 25, 106
McDougall, W., 4, 42, 79, 85-6, 145, 146
McKinley, J. C., 129-30
McNemar, Q., 146, 159, 208
Meadows, A. W., 209
Meister, R. K., 87
Meltzer, H., 173-4
Middleton, W. C., 56
Miles, C. C., 34, 169
Mira, E., 61
Mittelmann, B., 125
Mons, W., 185
Montgomery, R. B., 58
Moore, H. T., 89-90, 173
Moore, T. V., 10
Moran, T. F., 142
Moreno, J. L., 114, 196
Morgan, C. D., 181
Morgan, J. J. B., 84-6, 160
Morgenthaler, W., 34
Morse, M. E., 187
Morton, J. T., 160
Moses, P. J., 55
Moss, F. A., 92, 200
Munroe, R., 59, 190-1
Murdock, K., 34
Murphy, G., 173
Murphy, L. B., 96
Murray, H. A., 52, 180-2
Myers, C. R., 128
Myers, C. S., 194

Napoli, P. J., 193
Neilon, P., 4
Neumann, G. B., 149
Newcomb, T. M., 95-6, 103, 118
Newman, S. H., 24, 55-6
Neymann, C. A., 126
North, R. D., 16
Notcutt, B., 69, 76

O’Connor, J., 79, 174
O’Connor, N., 71, 80
Office of Scientific Research and Development, 125
Office of Strategic Services Staff, 97, 181
Oldfield, R. C., 20
Oliver, J. A., 77
Olson, W. C., 23, 94-5, 111, 121, 174
Omwake, K. T., 92
Orlansky, H., 3, 208

Parry, J. B., 1, 9, 20, 24, 25, 30, 66, 91, 97, 136, 165, 199, 207
Parten, M., 94-5
Pascal, G. R., 61
Paterson, D. G., 32, 207
Payne, A. F., 178
Pear, T. H., 55
Pearson, K., 34
Peel, E. A., 91-2, 200
Peterson, R. C., 146
Petrie, A., 52, 70, 176, 178
Phelps, L. W., 48
Phillips, R., 196
Piaget, J., 56
Pinard, J. W., 76
Pintler, M. H., 196
Pintner, R., 47, 185
Porteous, S. D., 62-3, 66, 74, 108, 201
Powers, E., 59
Prell, D. B., 3
Pressey, L. C., 176
Pressey, S. L., 126, 168, 175-7, 197, 200-1

Raath, M. J., 18, 15
Rabin, A. I., 73
Radclyffe, E. J. D., 176
Rand, H. A., 59
Rapaport, D., 73, 172-3, 181-2, 207
Raubenheimer, A. S., 93
Raven, J. C., 74, 179
Read, H., 193
Remmers, H. H., 146-7
Rethlingshafer, D., 78
Rey, A., 86-7
Reyburn, H. A., 13, 15
Rhodes, E. C., 24
Rice, S. A., 21, 48, 150
Richardson, M. W., 22, 110
Rimoldi, H. J. A., 69
Roback, A. A., 35
Rogers, C. A., 56, 70-1
Rohde, A. R., 178
Rokeach, M., 77
Root, A. R., 126
Rorschach, H., 23, 27-8, 183, 185-90, 193, 200, 205-6
Rosanoff, A. J., 23, 130, 168, 172, 174-6
Rosenzweig, S., 184-5
Rotter, J. B., 89, 178, 182-3
Rundquist, E. A., 149, 151-2
Ryans, D. G., 14, 84-5, 208

Sanders, C., 135
Sandler, J., 188
Sanford, F. H., 56, 208
Sanford, R. N., 13, 41, 182
Sargent, H., 171, 180, 208
Saudek, R., 58
Saunders, D. R., 11
Schafer, R., 73, 172-3, 181-2, 207
Scheerer, M., 74
Scholl, R., 127
Schwegler, R. A., 69
Sears, R. R., 4, 196
Sedgwick, C. H. W., 71
Sen, A., 183-4, 189-90
Shakespeare, William, 35
Sheldon, W. H., 34, 37-8, 201, 207
Shepler, B. F., 169
Shipley, W. C., 72
Shneidmann, E. S., 167, 184
Shoben, E. J., 149
Shor, J., 178
Silance, E. B., 146-7
Slater, E., 39
Slater, P., 39, 125-6, 129, 169, 175-6, 200
Slawson, J., 118
Sletto, R. F., 137, 149, 151-2
Smith, H. C., 37
Smith, K. U., 150
Smith, M., 39, 79, 85-6
Smith, R. B., 137
Smith, W. W., 43, 172
Spearman, C. E., 68, 74
Spranger, E., 17, 18, 162
Stagner, R., 6, 207
Staines, R. G., 75-6
Stein Lewinson, T., 59
Steiner, M. E., 190
Stephenson, W., 17, 19, 70, 114-5
Stevens, S. S., 37-8
Stevenson, I., 28
Stouffer, S. A., 125, 149-51, 156
Strang, R., 120, 207
Strong, E. K., 23, 27, 111, 161-9, 174, 197, 199, 206
Studman, G. L., 70
Stuit, D. B., 25, 136, 143
Sullivan, L. R., 34
Summerfield, A., 26, 67
Suttell, B. J., 61
Symonds, P. M., 92-3, 104, 178, 183, 207
Szondi, L., 185

Taft, R., 118
Tenen, C., 150
Terman, L. M., 34, 71, 168, 169
Theiss, H., 58
Thiesen, J. W., 87
Thomas, D. S., 94-5
Thompson, J. R., 71
Thomson, G. H., 71
Thorndike, E. L., 5, 107
Thornton, G. R., 47, 85
Thouless, R. H., 100
Thurstone, L. L., 12, 70, 81, 110, 124-5, 128-9, 181, 141-2, 14&-54,
    150-7, 207
Thurstone, T. G., 125
Tizard, J., 71, 80
Tomkins, S. S., 182
Travers, R. M. W., 113, 101
Tulchin, S. H., 180
Turner, W. D., 111
Tyler, L. E., 107

Uhrbrock, R. S., 47

Valentine, C. W., 194
Vance, J. G., 50
Van Lennep, D. J., 181-2
Vernon, M. D., 00
Vernon, P. E., 0, 9, 11, 14, 15, 18, 20, 24, 25, 28-50, 48-52, 57, 00-1,
    03, 00, 09-70, 91, 97-9, 105, 119, 120, 123, 180, 140, 155, 158, 102,
    105, 190, 199, 207, 208
Vetter, G. B., 148, 151
Vigotsky, L. S., 74
Viola, G., 30
Voelker, P. F., 93

Wadsworth, G. W., 130-1, 140, 148
Waehner, T. S., 59, 193
Walker, K. F., 75-0
Walker, L., 180
Wallen, R., 129
Walter, W. G., 44
Walton, R. D., 79
Wang, C. K. A., 128, 150
Watson, G. B., 158-00
Watson, H., 19
Watterson, D., 44
Watts, A. F., 102
Webb, E., 13, 15, 24, 08, 108
Weber, C. O., 108
Wechsler, D., 72
Wedell, C., 150
Weider, A., 125
Weigl, E., 74
Wells, F. L., 173
Wenger, M. A., 41
Werner, H., 77-8
White, R. K., 90
White, R. W., 207
Wickman, E. K., 23, 111, 121, 174
Williams, D. J., 24
Willoughby, R. R., 109-10, 128-9, 187
Wilson, A. T. M., 39
Wilson, N. A. B., 99
Wittenborn, J. R., 10
Wittson, C. L., 28
Wolff, C., 83
Wolff, W., 52-3, 58, 180
Woodrow, H., 174
Woodworth, R. S., 124-5, 129, 139, 141
Wyatt, F., 182-8
Wyatt, S., 120
Wyman, J. B., 108, 175
Wynn Jones, L., 08

Zangwill, O. L., 80-7
Zimmermann, W. S., 107
INDEX OF SUBJECTS
Abilities indicative of personality, 25-6, 47, 62-7, 69-75, 88-4, 88,
    91-3, 201
Ability to judge personality, 25, 80, 47, 53, 98, 100, 118-19, 206
Abstraction, 72-5
Achievement Quotient, 72
Acquaintanceship, effects on ratings, 118
Age, personality and, 14, 72, 167-8, 194
Aggressiveness, 6-9, 11, 84, 89-90, 96, 178
Allport’s A-S Reaction Study, 124, 127, 131, 187-8, 143
Allport-Vernon-Lindzey Study of Values, 162, 165-7, 173, 199
Altruism, 98-4, 111
American Council on Education rating scale, 107
Analytic rating scales, 108-9
Annoyances test, 125-6, 128-9
Anti-semitism, 144, 148-9, 152
Army officers, selection of, 1, 24-5, 31, 88, 97-9, 107, 114, 136, 178-9,
    181, 183, 200
Artistic productions and style, 17, 50, 61, 170-1
Artistic tastes, 192-5
Ascendance-submission, 13, 16, 41, 56, 65, 124, 131-2, 135, 138-9, 200
Asthenic-pyknic types, 86-7, 39, 41, 53, 55-6, 69, 80, 180
Astrology, 10, 82-3
Attitude-Interest Analysis (M-F) test, 169
Attitude scaling, 110, 124, 128, 152-4, 156
Attitude test construction, 146, 149-57
Attitude tests, indirect, 159-61
Attitudes, measurement of, 2, 59, 71, 88, 90, 129, 144-62, 204
Attitudes, nature of, 144-5, 161
Attitudes to personality tests, 62, 78, 88-4, 89-90, 98, 117, 128,
    187-40, 142-3, 157, 165-6, 171, 201, 206
Aussage test, 88, 161
Autobiographies, 122
Autokinetic effect, 87
Autonomic balance, 41, 201, 204

Babcock deterioration tests, 72
Ball and Field test, 61
Bell’s Adjustment Inventory, 132, 141, 200
Bernreuter Personality Inventory, 181-2, 184, 188-9, 141, 166, 174, 200
Binet progressive lines and weights, 88
Binet-Simon, see Stanford-Binet.
Biographical inventories, 135-6, 200
Birth order, 38
Blonde and brunette traits, 82-3
Body-sway test, 80, 87, 142, 201
Boyd’s Personality Questionnaire, 132-3, 140, 200
Brain characteristics and personality, 85
Brain injury, 72, 74
Buck’s H-T-P test, 198

Case-study methods, 20, 80, 49, 120-1, 192-8, 205
Cattell’s 16 P-F test, 184
Caution, see Impulsiveness.
Character, see Moral Character.
Character Education Inquiry, 98-4
Cheating tests, 93-4
Chevreuil pendulum, 87
Child Guidance Clinics, 1, 78
Child upbringing, attitudes to, 149
Chirognomy, 32-3
Civil Service selection, 24, 28-81, 98-9, 120-1, 179, 181, 184, 200
Clinical interpretation of personality, 6-7, 17-19, 27-8, 30, 122,
    130, 171-3, 178, 183, 204-6
Clinical psychologists, selection of, 26-8, 166, 179, 183, 190
C.M.S. (Cursive Miniature Situation) test, 86
Concept formation, 72-5, 77
Cornell Selectee Index, 125, 142
Correlations between persons, 19, 114-5, 194
Criminals, see Delinquents.
Criminals, physical features of, 33
Criterion analysis, 10-12, 36
Cross-validation, 23
Cyclothyme-schizothyme, 13, 16-17, 36, 127, 180

Delinquents, personality characteristics of, 1, 7, 10, 19, 40, 44, 66,
    74, 83, 86, 93, 96, 130, 135, 141, 168, 176-7, 179, 193
Dependability, 2, 12-16, 65, 76, 101, 115, 117
Deterioration, 72-5, 205
Diagnostic tests, 73, 130-1, 173, 183, 190
Digit memory tests, 72-3
Disposition rigidity, 76
Dominance, see Ascendance.
Dotting machine, 66, 85-6, 103
Downey temperament tests, see Will-temperament.
Draw-a-Man test, 192-3
Drawings and paintings, 52, 171, 179-80, 192-3, 205
Dream analysis, 20, 170-1, 192

Educational selection and guidance, 1, 10-11, 14, 16, 20-1, 24-6, 67,
    84, 91-2, 100, 111-2, 119-21, 180, 136, 165, 199-206
Efficiency ratings, 107-8, 112, 117
Eidetic imagery types, 42
Electroencephalograph, 44-5, 201
Emotional instability, 3, 11, 13-15, 89-40, 44, 53-4, 57-8, 63, 65, 71,
    74, 79, 85-7, 96, 113, 122, 129, 135, 139, 142, 178-5, 178, 187,
    190, 196, 200-1
Emotionality, see Emotional Instability.
Emotional maturity, 109-10, 128, 167-8, 177, 183
Emotions from facial expression, 46-7
Emotions from gestures, 51
Emotions from voice, 55
Endocrine glands, 2, 33-^1, 39-42
Error scores, 66-7
Ethical knowledge and judgment, 92-3, 149, 157
Ethnocentrism, 77, 158
Excitability, 3, 55
Expressive movements, 20, 22, 45-68, 98, 101, 103, 170, 180, 205
Expressive vs. adaptive behaviour, 50-1
Extraversion-introversion, 6, 13, 15-17, 35-7, 44, 56, 58, 63, 69-71,
    74, 76, 79, 92, 95-6, 109, 119, 122, 124, 126-7, 129, 131-5, 138-9,
    175, 180, 183, 188, 194-5, 197, 200

Face-validity, 98, 100, 205
Facial expression, 46-50, 63, 101
Factor analysis, 6, 11-16, 18, 22, 36-7, 41, 68-9, 76, 81, 85, 94, 109,
    115-16, 128, 132-4, 145-6, 156, 162, 189, 192, 194, 197-8, 204
Finger-painting, 171, 193
Flexibility, 68, 75-8, 80-1
Fluency, 56, 64-5, 68-71, 189, 201, 203
Food aversions test, 129
Forced choice technique, 112-13, 121, 123, 140, 162, 199-200, 203
Form vs. colour dominance, 80
Freyd-Heidbreder’s Introversion test, 109, 124, 126, 131
Functional autonomy, 4

Gallup polls, 158-9
George Washington Social Intelligence test, 92, 200
Gestures, 50-3
Goldstein-Scheerer tests, 74
Gottschaldt figures, 80
Graphic rating scales, 107-8
Graphology, see Handwriting.
Group observation methods, 27, 29, 83, 96-100, 103, 196, 201-2
Guess-Who ratings, 113, 134
Guilford-Martin Personality Inventories, 27, 133-4, 200

Haggerty-Olson-Wickman Behavior Schedules, 23, 111-12, 121, 174
Halo effect, 5, 13, 53, 102, 108, 115-19, 122, 137-8, 108
Handwriting and personality, 9, 19, 49-53, 50-01, 69, 78, 81, 170, 193,
    205
Handwriting pressure, 60-1, 201
Head size and shape, 32-5
Height and personality, 34-5
Hereditary factors in personality, 2-3, 81-2
Honesty, 5-6, 14, 93-4, 137
Humm-Wadsworth Temperament Scale, 130-1, 140, 143
Humour, sense of, 11, 47, 197-8
Hypnotizability, 87
Hysteric and dysthymic patients, 10, 13, 15, 36, 41, 57, 69-70, 74,
    79, 84, 87, 89, 125-6, 129-30, 197

Ideographic vs. nomothetic approaches, 17-18
Impulsiveness, 11, 13, 41, 56, 63, 66-7, 81, 87, 140, 201
Incompleted tasks, memory for, 78
Indirect tests, 62-7, 98, 159-61, 201
Industrial executives, selection of, 99
Inferiority feelings, 38, 109, 127
Information tests of attitudes, 161
Information tests of interests, 91-2, 200
Information tests of moral knowledge, 92-3
Inkblots, 70, 88, 180, 186
Inkblots, see also Rorschach.
Integrate-disintegrate types, 17, 42, 80
Interests, measurement of, 2, 16, 25, 90-2, 129, 144, 158, 161-9,
    196-7, 204
Interests, nature of, 161
Internal consistency, 22-3, 123, 137, 139, 143
Internationalism, 148-9, see also Ethnocentrism.
Interviews, 19-20, 31, 57, 62, 96-100, 117, 120-1, 124, 143, 150,
    166-7, 202-3, 205-6
Item analysis, 123-^, 137, 151-6, 162

Kent-Rosanoff word association, 23, 168, 172, 174-6
Kohs Blocks, 74
Kuder Preference Record, 163, 165, 199

Laird’s Personal Inventories B2, 124-5, 129
Laird’s Personal Inventories C2, 126, 131
Laird’s Personal Inventories C3, 109
Language, see Voice.
Leaderless group tests, 97-8, 100
Leadership, 1, 4-6, 11, 34, 81, 90, 95-7, 106-7
Lefthandedness, 39
Level of aspiration, 88-9
Lie detection tests, 42
Literary productions and style, 17, 49-50, 52, 61, 170, 180-1
Lowenfeld Mosaics test, 171, 191-8
Lowenfeld World test, 196
Luria apparatus, 54-5, 103, 172, 201

Make a Picture Story test, 184
Maladjusted children, 1, 20, 39 ^ 0, 61, 71, 73, 79-80, 95, 111, 182,
    134-6, 141, 149, 192-3
Maladjusted college students, 59, 128, 182-3, 130 7, 139, 141-2, 190-1
Maller’s Character and Personality Sketches, 134-5, 143
Manual dexterity, see Psychomotor tests.
Masculinity-femininity, 16, 84, 40, 180, 168-9
Mass Observation, 90, 158-9
Matching technique, 4, 19, 48-9, 52, 50, 58-9, 68, 188, 190, 192-5
Maudsley Medical Questionnaire, 125, 142
Measurement of personality, 7-19
Memory, 72, 161
Memory for Designs test, 61, 78
Mental deficiency, 71, 74, 77, 80, 111
Merit ratings, 103, 105, 119. See also Efficiency ratings.
Merrill-Palmer test, 62
Miller Analogies, 27
Miniature and real-life situation tests, 7-9, 14, 68, 83-100, 201
Minnesota Multiphasic Personality Inventory, 129-80, 140, 142-3, 169, 206
Mirror drawing, 75, 87
Moral character, 2, 5, 13-14, 47-8, 71, 93-4, 175, 204
Morgan and Hull’s maze test, 84, 86
Morphological indices, 36-7
Motor tests, see Psychomotor.
Muscle tension, 42, 53-5, 201
Myokinetic diagnosis, 61

National Institute of Industrial Psychology, 80, 121
NDRC Short Form at, 125, 142
Nervous habits, 51, 95-6
Neuropsychiatric Screening Adjunct, 125, 188, 142
Neuroticism, 10-15, 55, 79-80, 84-5, 181-4, 188-9, 142-8, 177, 188, 189, 191
Neurotics, personality characteristics of, 1, 4, 7, 12, 89, 52, 59, 61, 66, 71-5, 85-7, 124-6, 129-80, 186, 185-6, 189, 141, 172-4, 176-7, 188, 188-9, 191-2, 205
Neurovoltmeter, 54
Neymann-Kohlstedt introversion test, 126
Nominations, 114, 208
Nordic vs. Mediterranean types, 83, 76

Objective tests, 7-10, 12, 14, 18-19, 21, 28, 27-81, 68-94, 98, 102-8, 142, 161, 168, 171, 198, 201, 208
Observational methods, 94-100, 204
O’Connor Tweezer Dexterity test, 79-80
Only children, 89
Opinion surveys, 158-9, 204
Oscillation, see Variability.
Overstatement tests, 93

Palmistry, 32-4
Pattern analysis, 19, 204
Peptic ulcer, 89
Perceptual tests, 17, 79-81, 161, 205
Performance tests, 62-6, 74, 100, 118, 201
Perseveration, 68, 75-8, 87, 201
Persistence, 6, 11, 13-14, 65, 78, 88-5, 98-4, 128, 142, 144, 185, 201
Personal documents, 122
Personality, 2-7, 170-1, 205
Personality inventories, see Questionnaires.
Personality testing, difficulties of, 2, 6-7, 206
Personality testing, uses of, 1-2, 199, 206
Personal tempo, 69
Philosophic views, 86, 180
Photographs, judgments based on, 46-9, 56
Phrenology, 32-5
Physical signs of personality, 20, 32-45, 66, 201
Physiognomy, 10, 82-8, 52, 58
Physiological tests, 40-4, 201
Physique and personality, 17, 84-8, 41-2, 45, 56, 64
Pilots, selection of, 88, 44, 66, 91, 148, 190
Pintner’s Aspects of Personality, 135
Play techniques, 94-6, 100, 161, 171, 195-7
Porteus Mazes test, 62-8, 66, 74, 103, 201
Prejudice, 11, 21, 77, 145, 158-60
Pressey Interest-Attitude test, 170-7, 200-1
Pressey X-O tests, 126, 168, 175-7, 197
Progressive Matrices test, 73-4
Progressive vs. orthodox schooling, 96
Projection, 170-1
Projective techniques, 20, 22-3, 27-30, 96-7, 161, 170-98, 200-1, 206-6
Propaganda, effects of, 158
Psychiatric diagnoses, 11, 22, 24-5, 28, 73, 120-1, 130, 143, 173, 190, 203, 205-6
Psychoanalysis, 3-4, 137-8, 170-1, 173, 185, 188, 196, 205
Psychogalvanic reflex, 41-5, 55, 87, 90, 172-3, 201
Psychomotor tests, 66, 68, 75-6, 78-80, 86, 88, 201
Psychopaths, 44, 130
Psychotic patients, 10, 15-17, 28, 28, 35-6, 52, 56, 59, 70-5, 78-9, 86, 126, 129-30, 141, 168, 173-4, 176, 183-4, 189, 192, 205
Pupil record cards, 20, 103, 105, 121
Pyknic, see Asthenic.

Q (Quality) score, 63, 66, 201
Q-technique, 17, 115
Questionnaire tests, 8, 10, 12, 15, 23, 38, 59, 96-7, 122-44, 150-2, 160, 171, 192, 197, 206
Questionnaires, third-person, 109, 122, 203

Radicalism-conservatism, 122, 184, 144-8, 152, 157-8, 160-1, 194
Rail-walking test, 80, 86, 201
Ranking methods, 47-8, 108-4
Raters, training of, 106, 115-17, 119
Rating methods, 4-5, 9-10, 12-16, 22, 24, 52, 93, 98, 101-22, 140, 202-4
Rating scales, 8, 20, 62-5, 100-18
Raven’s Controlled Projection test, 179
Reaction times, 12, 69, 78, 172
Reliability of character, see Dependability.
Reliability, statistical, 21-8, 31, 95, 108-9, 123, 154-7, 166, 188-9
Religious attitudes, 92, 144-6, 149, 157-8, 160
Response sets, 140, 160, 167
Rey’s learning test, 86
Rigidity, 75-8, 89
Rorschach inkblots, 23, 27-8, 183, 185-91, 193, 200, 205-6
Rosenzweig’s Picture-Frustration tests, 184-5, 200

Scale analysis, 123, 154, 156
Scaling of marks and ratings, 105-6
Schemata, 101-3
Schizothyme, see Cyclothyme.
School teachers’ judgments, 8, 24-6, 105, 111-12, 114-16, 121, 203
Screening tests for recruits, 125, 142-3
Selectivity, effects on correlations, 26-7, 29
Self-confidence, 11, 88-9, 128, 182, 139
Self-description test, 179
Self-ratings, 122, 128
Sensory defects, 88-9
Sensory tests, 68, 75-6, 79
Sentence Completion test, 27, 171, 177-9, 200
Sex differences, 10, 40, 58, 89, 119, 133, 169
Shipley-Hartford test, 72
Situations, see Group observation methods.
Sociability, 2, 5, 11, 15-16, 86-7, 41, 47, 92, 94-5, 132
Social attitudes, see Attitudes.
Social factors, effects on personality, 4-5, 7, 84, 205
Social intelligence, 92, 200
Social maturity, 109-11
Sociodrama, 196
Sociometry, 29, 96, 114
Somatotypes, 37-8, 201
Sorting tests, 74-5
Spearman-Brown prophecy formula, 117
Speech, see Voice.
Speed, 68-70, 81
Standards, variations of in rating, 104-12, 117, 203
Stanford-Binet test, 61, 78-^1
Statistical techniques in personality study, 19
Stereotypes, 17, 46, 48-9, 56, 102, 152, 157
Story-telling tests, 52, 179-80
Stress tests, 66, 85-7, 103, 201
Strong Vocational Interest Blank, 28, 27, 111, 161-9, 174, 197, 199, 206
Suggestibility, 77, 80, 87-8
Suggestion in personality questionnaires, 188-9
Surgency, 18, 15-16, 56, 70-1, 197
Sutton Booklet, 125, 129, 175-6, 200
Sympathetic nervous system, 41
Syndromes, 10-11, 17-18, 204
Synthetic-analytic types, 80
Szondi test, 185

T.A.T., see Thematic Apperception.
Temperament, 2-3, 33, 3 5 ^ il, 68, 76, 81-2
Terman-Merrill test, 73
Testimonials, 20, 119-21, 203
Tetanoid-Basedowoid types, 42
Thematic Apperception test, 27-8, 161, 180-4, 188, 193, 200, 205-6
Thurstone’s attitude scales, 145-54, 156-7
Thurstone’s Personality Schedule, 125, 129, 131, 141-2
Time-sampling, 23, 94-6, 102-3, 196, 202, 204
Track Tracer, 79
Trait-composite technique, 7-11, 16-18, 22, 45, 63, 90-1, 94, 116, 141, 157, 204-6
Traits of personality, 4-16, 101-8, 115-16, 144, 171, 199, 204-5
Trend tests, 66
Types of personality and temperament, 16-18, 35-40, 80, 194-5
Type/Token ratio, 57

Unconscious mechanisms, 8-5, 9, 39, 51, 53, 59, 101, 139, 144, 170-2, 183, 185, 188, 193, 195-6, 206

Validation of personality tests, 9-10, 21-2, 31, 58, 98, 141-3, 199, 204-5
Values, types of, 17, 162
Variability, 68, 73-4, 78-81, 85-6, 201
Verb-Adjective ratio, 57, 201
Vigotsky Blocks, 74
Vineland Social Maturity Scale, 71, 110, 203
Visual-Motor Gestalt test, 61, 171
Vocational selection and guidance, 1, 14, 16, 18-21, 23-6, 28-30, 38, 48, 84, 91, 97-100, 119-21, 131, 134-6, 163-6, 183, 190, 199-206
Voice and personality, 9, 19, 49-50, 52-3, 55-7, 63, 65, 69, 201

War Office Selection Boards, 97, 120, 202
Watson-Glaser test of Critical Thinking, 160
Watson’s test of Fairmindedness, 158-60
Wechsler-Bellevue test, 72-3
Weigl Sorting test, 74
Willoughby’s Emotional Maturity Scale, 109, 128
Will-temperament tests, 68, 81
Woodworth’s Personal Data Sheet, 124-5, 129, 139, 141
Word association, 20, 23, 42-4, 54-5, 70, 90, 168-9, 172-5, 177-8, 188
Word Connection test, 177, 200
