Test Measurements of Humor: January 2014
Willibald Ruch
University of Zurich
person’s humor profile, to relate these differences to other phenomena (like personality or
health), and to document changes in humor due to interventions. The question of what instrument
to use arises. Is it one for all research questions? Is it the one with the highest number of
subscales, or the most recent one, the one with the most sophisticated name? Several formal- and
content-related factors determine the choice of which instrument to use for what purposes. This
entry first discusses the criteria used to determine if a test is psychometrically sound. Second,
currently used humor instruments (joke/cartoon tests and questionnaires) are presented.
Formal Criteria
Formal criteria refer to the construction and documentation of a test and to the psychometric
properties. A well-documented test will contain information on the nature of the concepts to be
measured (e.g., how sparse vs. elaborated the variable definition is and if it is based on a theory),
the type of construction procedure employed (e.g., factor analytic, empirical, rational), how
elaborate the construction stage was (e.g., how were the items generated, how many samples
were used, was there an item analysis), and the psychometric properties.
evaluation of the quality of the test measurement. According to Gustav A. Lienert and Ulrich
degree to which the measurement is free from error variance, i.e., how accurately the test
measures. Validity of a test is the extent to which it measures or predicts some criterion of
interest. Sufficient objectivity is the necessary precondition for high reliability; and high
Reliability varies between zero (unreliability) and one (perfect reliability), and should
exceed .60 for group assessment and .80 for individual assessment. Reliability can be estimated
through internal consistency (i.e., the intercorrelations of all scale items; a frequently reported
measure for continuous measures is Cronbach’s Alpha), the parallel-test method (correlation of
two parallel tests), and the retest method (stability; correlation of the same test across two points
in time).
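The internal-consistency and retest estimates described above can be illustrated with a minimal Python sketch. The ratings are invented for illustration and the function names are hypothetical; the alpha computation uses the standard formula based on the item variances and the variance of the summed score.

```python
from math import sqrt
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a list of respondents, each a list of k item ratings."""
    k = len(scores[0])
    if k < 2:
        raise ValueError("alpha needs at least two items")
    # Sample variance of each item across respondents.
    item_vars = [variance([row[i] for row in scores]) for i in range(k)]
    # Sample variance of the summed scale score.
    total_var = variance([sum(row) for row in scores])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)


def retest_correlation(time1, time2):
    """Pearson correlation of the same scale scores at two points in time."""
    n = len(time1)
    m1, m2 = sum(time1) / n, sum(time2) / n
    cov = sum((a - m1) * (b - m2) for a, b in zip(time1, time2))
    ss1 = sum((a - m1) ** 2 for a in time1)
    ss2 = sum((b - m2) ** 2 for b in time2)
    return cov / sqrt(ss1 * ss2)


# Hypothetical data: five respondents rating three items on a five-point scale.
ratings = [[4, 5, 4], [2, 3, 2], [5, 5, 4], [1, 2, 2], [3, 3, 3]]
alpha = cronbach_alpha(ratings)

# Hypothetical scale totals for the same respondents at two measurement occasions.
stability = retest_correlation([13, 7, 14, 5, 9], [12, 8, 14, 6, 9])

print(f"alpha = {alpha:.2f}, retest r = {stability:.2f}")
```

Either coefficient can then be compared with the guideline values of .60 for group assessment and .80 for individual assessment. With only five invented respondents the estimates are purely illustrative.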
Validity entails several aspects, of which construct and content validity will be discussed
in more detail. Frederic Lord and Melvin R. Novick distinguish between empirical and
theoretical validity, i.e., relations of a measurement with observable variables versus latent
validity and indicates the efficiency of a test in terms of measuring what it was designed to
measure. It comprises convergent (high correlations with scales that measure the same or a
similar construct) and discriminant validity (low correlations with scales that measure a
analysis, which traditionally involves comparing correlation matrices of at least two measures
and traits. Modern statistical approaches to conduct MTMM analyses include structural equation
Content validity has been described as the extent to which a test represents the criterion
or construct to be measured. Content validity can be ensured by (a) defining the scope of the
criterion or construct of interest and (b) obtaining expert ratings of the representativeness of the
test items according to the definition. In humor research, tests employing jokes and cartoons to
assess humor appreciation or production are inherently content valid, as they obtain direct ratings
of the criterion at hand (like the funniness of a joke or writing a humorous cartoon caption).
The Measurements
The different kinds of humor assessment tools can be grouped into eight categories: 1. Informal
surveys, joke telling techniques, or diary methods; 2. Joke and cartoon tests; 3. Questionnaires,
self-report scales; 4. Peer-reports; 5. State measures; 6. Children's humor tests; 7. Humor scales in
general instruments; and 8. Miscellaneous and unclassified. More than 60 instruments have been
developed that fall into one of these categories. Meanwhile, more than two dozen new measures
have been constructed. A survey of the various instruments allowed some conclusions about them,
most of which are still valid today. One was that over the entire span covered, the instruments
often purported to measure “sense of humor” even when the methods used or the contents
diverged largely (questionnaires, jokes/cartoons tests) and zero correlations can be expected
Until the 1980s, joke and cartoon tests were the most frequent; more recently,
questionnaires have predominated. Little effort has been invested in peer-evaluation
techniques or experimental assessments. Also, most instruments are for adults and few are
applicable to children. Many instruments are trait-oriented and thus not well suited for measuring
change (e.g., as needed in intervention studies). Another observation was that the same labels do
not necessarily imply the same concepts (as in nonsense, which may stand for harmless jokes or
ones that do not resolve incongruity), and scales with different labels might still measure the
same construct. Also, there has been little interest in multiple operationalizations of the same
construct to determine convergent validity. This would allow determining how much method
variance and how much content variance are in the measures. Another observation is that very
often an instrument was designed for one study only, and only a couple of tests were published
with a company (e.g., the IPAT Humor Test of Personality). While work was devoted to
constructing scales, comparatively little effort was spent on working on the concepts.
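The idea of using multiple operationalizations of the same construct to separate content variance from method variance can be sketched in a few lines of Python. This is a simplified illustration of the multitrait-multimethod logic, not a full analysis: the scores are invented, the choice of traits and methods is only an example, and the function names are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equally long score lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return cov / sqrt(sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y))


def mtmm_blocks(measures):
    """Sort pairwise correlations into the three MTMM comparison types.

    measures maps (trait, method) to a list of scores, one per respondent.
    """
    blocks = {
        "convergent": {},                # same trait, different method
        "heterotrait_monomethod": {},    # different trait, same method
        "heterotrait_heteromethod": {},  # different trait, different method
    }
    keys = sorted(measures)
    for i, a in enumerate(keys):
        for b in keys[i + 1:]:
            r = pearson(measures[a], measures[b])
            if a[0] == b[0]:
                blocks["convergent"][(a, b)] = r
            elif a[1] == b[1]:
                blocks["heterotrait_monomethod"][(a, b)] = r
            else:
                blocks["heterotrait_heteromethod"][(a, b)] = r
    return blocks


# Hypothetical scores for five respondents: two traits, each measured by two methods.
scores = {
    ("cheerfulness", "self"): [1, 2, 3, 4, 5],
    ("cheerfulness", "peer"): [2, 1, 4, 3, 5],
    ("seriousness", "self"): [2, 5, 3, 1, 4],
    ("seriousness", "peer"): [3, 5, 2, 1, 4],
}
blocks = mtmm_blocks(scores)

for name, pairs in blocks.items():
    for (a, b), r in pairs.items():
        print(f"{name}: {a} x {b}: r = {r:.2f}")
```

In this logic, convergent correlations that clearly exceed both kinds of heterotrait correlations speak for convergent and discriminant validity, whereas sizable heterotrait-monomethod correlations point to shared method variance.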
The selection of current instruments for the assessment of humor traits and states in children and
adults that follows is not comprehensive, but contains measurements of humor with adequate
psychometric properties.
respectively. In addition, scores for the total funniness and total aversiveness of humor,
(with removed variance of structure) can be assessed. There are different versions of the 3
WD, including a short version (30 items, plus five “warming up” items). Many studies across
♦ The Escala de Apreciación del Humor (EAHU; Humor Appreciation Scale) measures six
and nonsense), and four are content-related (sexual, black, woman disparagement, and man
disparagement). The test comprises 32 items, which are rated on funniness and aversiveness
on unipolar five-point scales.
♦ The Cartoon Punch Line Production Test (CPPT) is a measure of quantity or fluency
(i.e., number of produced punch lines) and quality or origence (i.e., peer-rated funniness and
originality) of humor production. As many funny captions as possible are written in response
30-minute time limit (the short form, or CPPT-k, features six cartoons and a 15-minute time
limit). Besides quality and quantity of humor production, wit and imagination of the
participant are assessed as well. The instrument was tested for construct validity.
Questionnaires
♦ The Humorous Behavior Q-Sort Deck (HBQD) assesses 10 humor styles located on five
bipolar dimensions, namely socially warm vs. cold, reflective vs. boorish, competent vs. inept,
earthy vs. repressed, and benign vs. mean-spirited. It consists of 100 humorous behaviors,
which are ranked on bipolar seven-point scales according to their typicality using a Q-sort
answers and thus results in ipsative scoring (i.e., an intraindividual ranking of the items).
♦ The Humor Styles Questionnaire (HSQ) measures four everyday functions of humor
(affiliative, self-enhancing, aggressive, and self-defeating). The HSQ has 32 items with a
unipolar seven-point answer format. There is initial evidence for its construct validity and a
large body of studies showing predictive validity. Recently, a children’s version of the HSQ
♦ The Sense of Humor Scale (SHS) measures playful vs. serious attitude, positive vs.
negative mood, and six facets of sense of humor (enjoyment of humor, laughter, verbal
humor, finding humor in everyday life, laughing at yourself, and humor under stress), which
can be combined into one “humor quotient”. The instrument has 40 items with a bipolar
bad mood as traits and states using a four-point answer format. The trait version (STCI-T)
assesses the temperamental basis of humor and comes in different versions (a short, standard,
and long form with 30, 60, and 106 items, respectively). The STCI-T has a well-studied
content and construct validity. The state version (STCI-S) has 30 items and instructions for
different time spans (now, last week, last month, in general) and is suited for pre/post
comparisons. In addition, a peer-report version and a version for children and adolescents of
with a total of 240 items in a bipolar five-point answer format. Humor (playfulness) forms
one of these strengths (assessed with 10 items), and is related to the virtues of temperance,
Willibald Ruch
Sonja Heintz
See also 3 WD Humor Test; Appreciation of Humor; Cheerfulness, Seriousness and Humor;
Further Readings
the content and structure of humor: Construction of a new scale. Humor: International Journal
Casu, G., & Gremigni, P. (2012). Humor measurement. In Gremigni, P. (Ed). Humor and
Cattell, R. B., & Tollefson, D. L. (1966). The IPAT Humor Test of Personality.
Craik, K. H., Lampert, M. D., & Nelson, A. J. (1996). Sense of humor and styles of
doi:10.1515/humr.1996.9.3-4.273
Köhler, G., & Ruch, W. (1995). On the assessment of 'wit': The Cartoon Punch line
Köhler, G., & Ruch, W. (1996). Sources of variance in current sense of humor
inventories: How much substance, how much method variance? Humor: International Journal of
Lienert, G. A., & Raatz, U. (1998). Testaufbau und Testanalyse [Test construction and
Lord, F. M., & Novick, M. R. (2008). Statistical theories of mental test scores. Reading,
MA: Addison-Wesley.
Martin, R. A., Puhlik-Doris, P., Larsen, G., Gray, J., & Weir, K. (2003). Individual
differences in uses of humor and their relation to psychological well-being: Development of the
Peterson, C., & Seligman, M. E. P. (2004). Character strengths and virtues: A handbook
Ruch, W., & Köhler, G. (1999). The measurement of state and trait cheerfulness. In I.
Mervielde, I. Deary, F. De Fruyt, & F. Ostendorf (Eds.), Personality Psychology in Europe (Vol.