The Mini-IPIP Scales: Tiny-yet-Effective Measures of The Big Five Factors of Personality



Article  in  Psychological Assessment · July 2006


DOI: 10.1037/1040-3590.18.2.192 · Source: PubMed


All content following this page was uploaded by Frederick L Oswald on 04 June 2014.



Psychological Assessment Copyright 2006 by the American Psychological Association
2006, Vol. 18, No. 2, 192–203 1040-3590/06/$12.00 DOI: 10.1037/1040-3590.18.2.192

The Mini-IPIP Scales: Tiny-Yet-Effective Measures of the Big Five Factors of Personality
M. Brent Donnellan, Frederick L. Oswald, Brendan M. Baird, and Richard E. Lucas
Michigan State University

The Mini-IPIP, a 20-item short form of the 50-item International Personality Item Pool—Five-Factor
Model measure (Goldberg, 1999), was developed and validated across five studies. The Mini-IPIP scales,
with four items per Big Five trait, had consistent and acceptable internal consistencies across five studies
(α at or well above .60), similar coverage of facets as other broad Big Five measures (Study 2), and
test-retest correlations that were quite similar to the parent measure across intervals of a few weeks
(Study 4) and several months (Study 5). Moreover, the Mini-IPIP scales showed a comparable pattern of
convergent, discriminant, and criterion-related validity (Studies 2–5) with other Big Five measures.
Collectively, these results indicate that the Mini-IPIP is a psychometrically acceptable and practically
useful short measure of the Big Five factors of personality.

Keywords: personality assessment, Big Five, International Personality Item Pool

Investigators often want to measure a wide range of constructs in their research; however, completing a large packet of questionnaires can be a boring or irritating task for participants. This might end up producing transient measurement errors (e.g., Schmidt, Le, & Ilies, 2003) because participants are in a negative mood, or because they respond carelessly due to frustration with the length of the assessment. Moreover, to the extent that it is even mildly unpleasant to participate in research, long questionnaires may increase the likelihood that participants will decide not to complete the study, will drop out of subsequent data collections in longitudinal studies, or will refuse to take part in future studies. Given these kinds of practical concerns, researchers often create shorter forms of longer assessment instruments (Stanton, Sinar, Balzer, & Smith, 2002). However, Smith, McCarthy, and Anderson (2000) cautioned that many well-intentioned researchers commit several “sins” in the process of shortening these forms. For instance, although the content is derived from the parent measure, researchers sometimes fail to evaluate whether or not the short form has comparable amounts of reliability and validity and whether psychometric findings for the short form will generalize across independent samples. It is perhaps understandable that these sins are committed given that guidelines concerning the development of short forms are in their infancy (see Marsh, Ellis, Parada, Richards, & Heubeck, 2005; Smith et al., 2000; Stanton et al., 2002); nonetheless, using short forms with poor or poorly understood psychometric properties will likely slow down scientific progress.

In this paper, we describe the process we used to develop and validate a short measure of the Big Five factors of personality (see John & Srivastava, 1999): Extraversion, Agreeableness, Conscientiousness, Neuroticism, and Intellect/Imagination (or Openness).¹ The Big Five serve as the dominant model of personality structure in trait psychology (e.g., Funder, 2001; but see Block, 1995), and in clinical psychology there is growing interest in using the Big Five to understand Axis I disorders (e.g., Trull & Sher, 1994), Axis II disorders (Costa & Widiger, 2002; Durrett & Trull, 2005; Widiger, 2005; Wiggins, 2003), as well as substance abuse and antisocial behavior (e.g., Ball, 2005; Miller & Lynam, 2001). For example, the premise of the edited book by Costa and Widiger (2002) is that Axis II personality disorders represent constellations of extreme variants of these five broad dimensions of personality. In a related vein, it has often been proposed that basic personality traits like Neuroticism serve as a diathesis for the development of clinically significant impairments (e.g., Krueger, Caspi, & Moffitt, 2000). The increased interest in integrating models of normal personality with clinical research on psychopathology (e.g., Durrett & Trull, 2005) represents an important bridge between clinical psychology and personality psychology. Thus, with the increasing interest in personality traits and the Big Five, we anticipate a growing need for short measures of these constructs (e.g., Gosling, Rentfrow, & Swann, 2003).

¹ There is sometimes disagreement over the fifth factor in the Big Five literature. Inventories derived from lexical studies have traditionally labeled the fifth factor Intellect/Imagination, whereas inventories derived from questionnaire items have used the label Openness or Openness to Experience. For the sake of clarity we refer to the fifth factor as Intellect/Imagination when we are generally describing this dimension of personality and when discussing measures that explicitly use that label. We restrict the use of the term Openness to our discussion of inventories that use that label. However, many authors use the terms Openness and Intellect/Imagination more or less interchangeably (e.g., John & Srivastava, 1999).

M. Brent Donnellan, Frederick L. Oswald, Brendan M. Baird, and Richard E. Lucas, Department of Psychology, Michigan State University.
Data in Study 1 were collected with generous financial support from the College Board and the cooperation of multiple colleges and universities. Data in Study 5 were supported by the MSU Intramural Research Grants Program Award to Richard E. Lucas. We acknowledge helpful assistance and comments from Neal Schmitt regarding Study 1. John Johnson graciously provided us with the IPIP-NEO items used in Study 2. We thank Samuel Gosling and Lewis Goldberg for constructive comments on this article. The first and second authors contributed equally to this article. Authorship was decided by last name.
Correspondence concerning this article should be addressed to M. Brent Donnellan, Department of Psychology, Michigan State University, East Lansing, MI, 48823. E-mail: [email protected] or to Frederick L. Oswald, Department of Psychology, Michigan State University, East Lansing, MI, 48823. E-mail: [email protected]

Why Another Short Big Five Measure?

So why develop another short Big Five measure? This is a reasonable question to ask given that there are several relatively short Big Five measures with good psychometric characteristics, including the 60-item NEO Five-Factor Inventory (NEO-FFI; Costa & McCrae, 1992), the 50-item International Personality Item Pool – Five Factor Model (IPIP-FFM; Goldberg, 1999), the 44-item Big Five Inventory (BFI; John & Srivastava, 1999), and the 40-item Big Five Mini-Markers (Saucier, 1994). These inventories may still be too long, however, particularly in studies where participants will be completing a considerable number of items (e.g., large-scale panel studies) or whenever use of participants’ time must be very brief (e.g., experience sampling studies). In fact, these sorts of concerns motivated Gosling et al. (2003) to develop the Ten-Item Personality Inventory (TIPI) measure of the Big Five. The creation of the TIPI is encouraging because it suggests it is possible to measure the Big Five with very few items. However, several considerations motivated us to construct our own short Big Five measure rather than rely on the TIPI.

First and foremost, we suspected that a slightly longer measure of the Big Five would be more practically useful than the TIPI, while still reaping practical benefits over the other short-form personality measures just described. Simply put, we believed that it would be too difficult to obtain adequate internal consistencies and reasonable content or construct breadth with only two items per scale (see Saucier & Goldberg, 2002, for a similar argument). Indeed, three of the five TIPI scales had internal consistency coefficients at or below .50 as reported in Gosling et al. (2003). Moreover, the TIPI might pose serious problems in structural equation modeling contexts because having only two indicators per latent factor can lead to estimation problems and limited modeling flexibility (e.g., Bollen, 1989; Kenny, 1979; Kline, 2004; Little, Lindenberger, & Nesselroade, 1999). This concern over the small number of items on the TIPI also applies to exploratory factor analytic contexts where it is often recommended that each common factor has at least three or four primary indicators (e.g., Fabrigar, Wegener, MacCallum, & Strahan, 1999; Floyd & Widaman, 1995; Guadagnoli & Velicer, 1988). Thus, the TIPI faces some serious limitations in latent variable applications.

Second, we specifically wanted our short form to have empirically distinct scales; in other words, we wanted the correlations between the separate scales to be very low. Critics of the Big Five point out nontrivial empirical overlap between Big Five scales that are conceptually orthogonal (see Block, 1995; Funder, 2001). Fortunately, Saucier (2002) demonstrated that this overlap was largely an accidental by-product of scale development techniques that place a premium on maximizing reliability and illustrated that it is possible to create orthogonal and valid measures of the Big Five. The scale intercorrelations for the TIPI were in fact low (e.g., an average correlation of .20 with a maximum of .36 in Study 2 by Gosling et al., 2003); however, this figure might be attenuated because of the relatively low reliabilities of the scales. We sought to develop a measure with even lower intercorrelations, with more items than the TIPI to increase reliability as well as content and construct breadth, but with far fewer items than Saucier’s explicitly orthogonal scales (Saucier, 2002). Thus, we wanted to create short Big Five scales with very low intercorrelations to maximize efficiency in multivariate prediction research. Moreover, empirically distinct scales prevent interpretational problems caused by multicollinearity, which can have “devastating effects on regression statistics to the extent of rendering them useless, even highly misleading” (Pedhazur, 1997, p. 295). Although complex items with content that relates to multiple Big Five constructs can prove to be useful, we wanted our short form to focus on items that were relatively pure indicators of only one of the Big Five constructs.

A third and final consideration is that we were interested in shortening an existing measure so that we could follow many of the guidelines that do exist for developing short forms (Smith et al., 2000; see Marsh et al., 2005, for examples of these procedures). Specifically, we investigated whether the short-form scales have levels of reliability and criterion-related validity that are adequate and similar to the original inventory. Moreover, we wanted to learn about the practical costs to reliability and validity that occur when researchers remove a large number of items from the parent inventory even when following recent “best practice” guidelines.

To summarize, we set out to construct a short inventory of the Big Five with the objective of producing scales that were efficient predictors of meaningful outcomes in psychological research. We elected to shorten the 50-item IPIP-FFM because it is frequently used in personality research and because it is publicly available to researchers on the IPIP website at no cost (http://ipip.ori.org/newQform50b5.htm). We had originally planned to create a 15-item measure but were persuaded by recommendations in Saucier and Goldberg (2002, pp. 43–44) that four items serve as “a practical minimum” for scale length. Thus, we opted to create a 20-item measure, with four items per Big Five scale, that we dubbed the Mini-IPIP. As an additional consideration, Saucier and Goldberg (2002) stressed the need for balanced scales that have equal numbers of positively and negatively keyed items. For each Big Five scale, we therefore strived to select 2 items keyed in the negative direction and 2 items keyed in the positive direction, and we were able to meet this goal for all scales except for the Intellect/Imagination scale, which comprised 3 negatively keyed items and 1 positively keyed item.

Overview of the Present Studies

We developed and evaluated the 20-item Mini-IPIP inventory across five independent and diverse samples. The 20-item short form was derived in Studies 1 and 2. Study 1 details the development of the Mini-IPIP using a very large sample of students from multiple colleges and universities (N = 2,663). We used Study 2 to fine-tune the item selection for the Mini-IPIP and to examine how well the Mini-IPIP relates to Big Five facets, the IPIP-FFM, and the TIPI using a sample (undergraduates in psychology courses) and sample size (N = 329) that is similar to those typically used in psychological research. That is, we wanted to know how the Mini-IPIP related to facets of the Big Five as well

as other broad measures of the Big Five. Study 3 was conducted to replicate findings for the psychometric properties of the Mini-IPIP on an independent sample to ensure that the desirable properties found in the items selected for the measure were not a result of capitalizing on chance. Study 3 also provided data on how well the Mini-IPIP related to an alternative Big Five measure and to several criterion measures. Studies 4 and 5 were based on datasets not originally designed to validate the short scales. Nonetheless, these datasets provided short-term (Study 4) and longer-term (Study 5) test–retest reliability information for the Mini-IPIP as well as more criterion-related validity evidence. Moreover, Study 5 provided data on how well the Mini-IPIP correlated with informant-reports of personality. As a package, these five studies reflect the simple yet powerful technique of replication to demonstrate that the 20-item Mini-IPIP is a reasonable replacement for the parent inventory in situations where the length of the personality assessment is a major concern.

Study 1: Development of the 20-item Mini-IPIP Inventory

Sample and Measures

Complete IPIP-FFM data were available from 2,663 freshman undergraduate students across 10 colleges and universities involved in a large-scale project designed to study how well individual difference measures worked as predictors of academic performance. The substantive work from that project is detailed elsewhere (Oswald, Schmitt, Kim, Ramsay, & Gillespie, 2004). Regarding the demographic breakdown, 97% of the sample was either 18 or 19 years of age, 64% of the sample was female, 96% were U.S. citizens, and 94% indicated that English was their first language. The racial composition for this sample was 55% Caucasian, 25% African American, 6% Hispanic, 7% Asian, and 7% other ethnicities. Participants completed the 50-item IPIP-FFM measure (Goldberg, 1999) by indicating, on a 5-point scale, how well each statement described them. Descriptive statistics for each of the 10-item subscales are displayed in Table 1.

Results and Discussion

Method for Creating the Mini-IPIP Scales

Although there are many approaches to selecting items for short forms, we elected to use a simple approach that can be readily implemented with available statistical packages. We conducted a principal axis exploratory factor analysis (EFA) with a varimax rotation on the 50-item IPIP-FFM dataset. We forced a five-factor solution and exported the factor-loading matrix to a spreadsheet program. As expected, all of the items loaded appropriately on their corresponding factors; however, some loaded more highly than others, and some had large cross-loadings (i.e., nontrivial loadings on multiple factors). In the next step, for each IPIP-FFM item, we calculated the difference between its loading on its primary factor and the average of the absolute “off” factor loadings. We labeled this the “discrimination score” for each item. For example, the first IPIP-FFM item, “I am the life of the party,” primarily loads on the Extraversion factor, so we subtracted the average of the absolute value of its loadings on the other factors (the average of the absolute value of the loadings on the Agreeableness, Conscientiousness, Neuroticism, and Intellect/Imagination factors = .19) from its primary loading on the Extraversion factor (loading = .64) to yield a discrimination score of .45.

After calculating discrimination scores for each item, we selected the two positively and two negatively keyed items from each scale that had the largest discrimination scores. (Regarding the latter, recall that one of our goals in creating the Mini-IPIP was to create scales that were empirically distinct from one another.) It is important to note that we were limited in item selection for the Neuroticism and Intellect/Imagination scales because the IPIP-FFM contains only two negatively keyed Neuroticism items and three negatively keyed Intellect/Imagination items.

We then subjected our initial 20-item pool to a separate EFA on the sample in Study 2 (to be described subsequently). Our goal was to see if we could recover a five-factor structure with reasonable factor loadings for each of the items. We were able to get a clear five-factor solution, but we had to make modifications to two of our original scales because some items did not have particularly strong factor loadings. First, for the Conscientiousness scale, the initially selected item “I follow a schedule” (Original IPIP-FFM item #43; discrimination score = .38) had a relatively low loading in the new sample (.33). Thus, we replaced it with the item “I like order” (Original IPIP-FFM item #33; discrimination score = .37) because this item had the next highest discrimination score of the positively keyed items.
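
The discrimination-score procedure described above can be sketched in a few lines of code. This is only an illustration under stated assumptions, not the authors' spreadsheet: the function names `discrimination_scores` and `pick_items` are hypothetical, the primary loading is taken in absolute value so that negatively keyed items can be ranked alongside positively keyed ones (the text does not say how negative primary loadings were handled), and the four individual off-factor loadings in the example are invented so that their mean absolute value matches the reported .19.

```python
def discrimination_scores(loadings, primary):
    """Return one discrimination score per item.

    loadings: list of rows, one row of rotated factor loadings per item.
    primary:  for each item, the column index of its intended factor.
    """
    scores = []
    for row, p in zip(loadings, primary):
        # Absolute loadings on every factor except the item's own.
        off = [abs(x) for j, x in enumerate(row) if j != p]
        # Assumption: absolute primary loading, so that negatively keyed
        # items are scored on the same footing as positively keyed ones.
        scores.append(abs(row[p]) - sum(off) / len(off))
    return scores


def pick_items(scores, primary, keyed_positive, factor, per_key=2):
    """Indices of the top `per_key` positively keyed and top `per_key`
    negatively keyed items (by discrimination score) for one factor."""
    chosen = []
    for key in (True, False):
        idx = [i for i in range(len(scores))
               if primary[i] == factor and keyed_positive[i] == key]
        idx.sort(key=lambda i: scores[i], reverse=True)
        chosen.extend(idx[:per_key])
    return chosen


# "I am the life of the party": the primary loading (.64) and the mean
# absolute off-factor loading (.19) come from the text; the four individual
# off-factor values below are hypothetical.
example = [[0.64, 0.25, -0.10, -0.21, 0.20]]
score = discrimination_scores(example, primary=[0])[0]
print(f"{score:.2f}")  # 0.45
```

Ranking within keying direction, as `pick_items` does, mirrors the stated goal of balanced scales; where a factor has fewer than two items of one keying, the function simply returns the items that exist.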

Table 1
Descriptive Statistics for the IPIP-FFM and Mini-IPIP (Studies 1 and 2)

                          Convergent      IPIP-FFM (50 items)      Mini-IPIP (20 items)
Scale                     validity (r)    α      Mean    SD        α      Mean    SD

Study 1 (N = 2,663)
  Extraversion               .93         .87     3.36    .77      .77     3.28    .90
  Agreeableness              .89         .80     4.00    .57      .70     4.01    .69
  Conscientiousness          .90         .80     3.57    .61      .69     3.42    .78
  Neuroticism                .92         .85     2.72    .73      .68     2.54    .80
  Intellect/Imagination      .85         .80     3.63    .58      .65     3.70    .73
Study 2 (N = 329)
  Extraversion               .94         .91     3.50    .81      .82     3.45    .90
  Agreeableness              .91         .80     4.10    .54      .75     4.15    .64
  Conscientiousness          .90         .81     3.49    .63      .75     3.40    .86
  Neuroticism                .93         .89     2.84    .83      .70     2.62    .83
  Intellect/Imagination      .83         .78     3.68    .57      .70     3.74    .76

Second, for the Intellect/Imagination scale, we had to replace the items “I have excellent ideas” (Original IPIP-FFM item #25; discrimination score = .40) and “I use difficult words” (Original IPIP-FFM item #40; discrimination score = .42) because neither loaded on the primary factor in the new sample above .30. The Intellect/Imagination scale was the only instance where we deviated from the procedure of selecting items solely on the basis of the discrimination scores. We deviated from this procedure because we wanted to select items with content that seemed distinct from those items that appeared to measure general intelligence and even narcissism. We first selected the item “I have a vivid imagination” (Original IPIP-FFM item #15; discrimination score = .38) because it was the positively keyed item with the fourth highest discrimination score. The item “I am full of ideas” (Original IPIP-FFM item #50; discrimination score = .40) had a higher discrimination index, but we eliminated it in favor of the former item because the former item seemed like a purer measure of imagination. We then selected the last negatively keyed Intellect/Imagination item, “I do not have a good imagination” (Original IPIP-FFM item #30; discrimination score = .26) because the remaining positively keyed items that had decent discrimination scores and primary factor loadings also seemed to tap general intelligence and/or narcissism (i.e., “I have a rich vocabulary” and “I am quick to understand things”). Appendix A lists the 20 items selected for the Mini-IPIP scales, where the item numbers refer to the order of the items in the version of the 50-item IPIP-FFM inventory available at the IPIP website.

Basic Psychometric Properties of the Reduced Scales

Table 1 presents means, standard deviations, and coefficient alphas for the Mini-IPIP scales. As seen in Table 1, the Mini-IPIP scales had acceptable reliability, especially in light of their reduced length. These coefficients ranged from .65 for Intellect/Imagination to .77 for Extraversion. Table 1 also shows the convergent correlations between the Mini-IPIP scales and the 10-item “parent” IPIP-FFM scales. These were high, ranging from .85 for Intellect/Imagination to .93 for Extraversion. These correlations are inflated by the fact that each pair of scales contains four identical items, but from the practical standpoint of comparing the shorter with the longer scale, they are still informative. That said, we computed the associations between the Mini-IPIP scales and the 6 items not included in the scale, and these correlations were also high (Extraversion = .78, Agreeableness = .67, Conscientiousness = .67, Neuroticism = .76, and Intellect/Imagination = .56).

Regarding discriminant validity, the scale intercorrelations were successfully reduced for the Mini-IPIP scales as compared to the IPIP-FFM scales. Specifically, the average absolute scale intercorrelation for the Mini-IPIP was r = .13 (SD = .08, Range = .02–.24) compared to an average absolute scale intercorrelation for the full IPIP-FFM of r = .20 (SD = .08, Range = .07–.35). We repeated these analyses correcting the intercorrelations for attenuation due to measurement error, and the results were similar (Mini-IPIP: average r = .18, SD = .11, Range = .02–.34; compared to the IPIP-FFM: average r = .25, SD = .10, Range = .08–.41).

Study 2: An Examination of the Content Validity of the IPIP-FFM and Mini-IPIP Scales

Sample

This sample consists of 329 undergraduate students enrolled in psychology courses at a large public research university in Michigan who participated in exchange for course credit or extra credit during the Fall Semester of 2005. Data were collected over the Internet using a web-based interface. Participants were primarily female (73.3%) and either first-year or second-year college students (33.3% and 35.5% respectively). Four additional participants’ data were discarded because they failed to answer “Yes” to the final statement: “I answered all of these questions honestly.”

IPIP-FFM and Mini-IPIP Measures

IPIP-FFM and Mini-IPIP. The bottom half of Table 1 presents descriptive statistics and convergent scale intercorrelations for the 50-item IPIP-FFM and the 20-item Mini-IPIP for Study 2. In general, the values of these coefficients are close to those in Study 1. The average absolute scale intercorrelation for the Mini-IPIP scales was r = .14 (SD = .08; maximum = .30 for Agreeableness and Intellect/Imagination) compared to r = .21 for the IPIP-FFM scales (SD = .11; maximum = .36 for Agreeableness and Intellect/Imagination). We repeated these analyses correcting the intercorrelations for attenuation due to measurement error, and the results were similar (Mini-IPIP: average r = .19, SD = .11; IPIP-FFM: average r = .25, SD = .13).

Comparison Measures

The IPIP-NEO. Participants completed the 120-item IPIP measure developed by Johnson (2000) to measure the same facets of the Big Five that are assessed by the NEO-PI (Costa & McCrae, 1992). The IPIP-NEO includes items that measure 30 facets of the Big Five (4 items per facet and 6 facets per factor) on a 5-point scale. Internal consistencies for the facet scales are reported in Table 2. Scores for the Big Five were calculated by taking the average of all of the respective facet-level items for each of the Big Five: Extraversion: M = 3.53, SD = .51, α = .87; Agreeableness: M = 3.75, SD = .46, α = .84; Conscientiousness: M = 3.62, SD = .54, α = .89; Neuroticism: M = 2.77, SD = .61, α = .90; Openness: M = 3.40, SD = .44, α = .78. Perhaps with the exception of Openness, these Big Five scores demonstrated reasonable convergent validity with the IPIP-FFM (r = .75, .61, .75, .79, .55, for Extraversion, Agreeableness, Conscientiousness, Neuroticism, and Openness with Intellect/Imagination, respectively) and the Mini-IPIP (r = .70, .52, .63, .73, .55, for Extraversion, Agreeableness, Conscientiousness, Neuroticism, and Openness with Intellect/Imagination, respectively).

Ten-Item Personality Inventory (TIPI). Participants completed the TIPI using a 7-point scale, with the following results: Extraversion: M = 4.98, SD = 1.45, α = .70 (r for the 2 items = .56); Agreeableness: M = 5.38, SD = 1.10, α = .32 (r for the 2 items = .22); Conscientiousness: M = 5.55, SD = 1.17, α = .43

(r for the 2 items = .29); Neuroticism: M = 3.17, SD = 1.51, α = .69 (r for the 2 items = .54); Openness: M = 5.51, SD = 1.07, α = .41 (r for the 2 items = .27). With the possible exception of Agreeableness and Openness, the TIPI demonstrated reasonable convergent validity with the IPIP-NEO (r = .65, .46, .69, .67, .48, for Extraversion, Agreeableness, Conscientiousness, Neuroticism, and Openness, respectively), the IPIP-FFM (r = .79, .41, .70, .74, .52, for Extraversion, Agreeableness, Conscientiousness, Neuroticism, and Openness with Intellect/Imagination, respectively), and the Mini-IPIP (r = .75, .33, .63, .73, .46, for Extraversion, Agreeableness, Conscientiousness, Neuroticism, and Openness with Intellect/Imagination, respectively).

Results and Discussion

Content Validity of the IPIP-FFM, Mini-IPIP, and TIPI Scales

Table 2 displays the correlations between the Big Five scales assessed by the IPIP-FFM, Mini-IPIP, and TIPI and the corresponding facets of that factor as assessed by the IPIP-NEO. For instance, Table 2 shows how these three measures of Extraversion correlate with the six Extraversion facets of Friendliness, Gregariousness, Assertiveness, Activity Level, Excitement Seeking, and Cheerfulness. Generally speaking, the pattern of correlations was very similar across all three measures. More specific to our interests, the content coverage of the Mini-IPIP was very close to the coverage of the IPIP-FFM. For instance, across the 30 facets there were only 2 cases where the difference in magnitude between the correlations for the IPIP-FFM and for the Mini-IPIP was greater than |.12|. This occurred for the facets of Achievement Striving (difference = .16) and Self-Efficacy (difference = .17).

Table 2
Big Five Facet Coverage by Measures of the Big Five Factors (Study 2)

IPIP-NEO facet (α)            IPIP-FFM Scale   Mini-IPIP Scale   TIPI Scale

Extraversion
  Friendliness (.73)               .67              .60             .60
  Gregariousness (.70)             .69              .70             .57
  Assertiveness (.85)              .52              .45             .46
  Activity Level (.65)             .24              .23             .20
  Excitement Seeking (.63)         .45              .44             .40
  Cheerfulness (.78)               .44              .39             .38
Agreeableness
  Trust (.85)                      .32              .22             .38
  Morality (.75)                   .48              .41             .33
  Altruism (.66)                   .68              .64             .36
  Cooperation (.72)                .34              .25             .39
  Modesty (.70)                    .06              .05             .04
  Sympathy (.68)                   .52              .52             .21
Conscientiousness
  Self-Efficacy (.79)              .45              .28             .38
  Orderliness (.82)                .72              .74             .57
  Dutifulness (.56)                .42              .33             .43
  Achievement Striving (.79)       .52              .36             .54
  Self-Discipline (.69)            .62              .50             .54
  Cautiousness (.85)               .37              .31             .40
Neuroticism
  Anxiety (.77)                    .72              .61             .57
  Anger (.83)                      .60              .53             .54
  Depression (.84)                 .65              .65             .56
  Self-Consciousness (.72)         .28              .28             .22
  Immoderation (.66)               .31              .28             .26
  Vulnerability (.74)              .68              .64             .63
Openness
  Imagination (.68)                .45              .47             .32
  Artistic Interest (.66)          .40              .43             .35
  Emotionality (.55)               .16              .15             .19
  Adventurousness (.60)            .26              .26             .40
  Intellect (.67)                  .59              .53             .32
  Liberalism (.66)                 .05              .10             .13

Note. N = 329. Numbers inside parentheses are the alpha reliability estimates for the 4-item facet scales of the 120-item IPIP-NEO. TIPI = Ten Item Personality Inventory (Gosling, Rentfrow, & Swann, 2003).

Study 3: Replication of the Psychometric Properties and Evidence of the Criterion-Related Validity of the Mini-IPIP Scales

Sample

This sample consists of 300 undergraduate students enrolled in psychology courses at a large public research university in Michigan who participated in exchange for course credit or extra credit during the Spring Semester of 2005. Data were collected over the Internet using a web-based interface. Participants were primarily female (78.7%) and identified themselves as European American (63.3%), Asian American (5.0%), African American (2.3%), Latino/a (2.0%), or self-reported they were a member of some “Other” ethnic group (27.3%). Data from three additional participants were discarded because they failed to answer “Yes” to the final statement: “I answered all of these questions honestly.”

IPIP-FFM and Mini-IPIP. Participants completed the 50-item IPIP-FFM measure, with the following results: Extraversion: M = 3.48, SD = .73, α = .90; Agreeableness: M = 4.02, SD = .52, α = .83; Conscientiousness: M = 3.53, SD = .60, α = .83; Neuroticism: M = 2.87, SD = .76, α = .89; Intellect/Imagination: M = 3.55, SD = .52, α = .79. Statistics for the 20-item Mini-IPIP were then calculated from the relevant subset of items within this measure: Extraversion: M = 3.46, SD = .82, α = .82; Agreeableness: M = 4.06, SD = .61, α = .77; Conscientiousness: M = 3.48, SD = .76, α = .74; Neuroticism: M = 2.70, SD = .61, α = .78; Intellect/Imagination: M = 3.64, SD = .67, α = .70. Once again, there was good convergence between the IPIP-FFM and the Mini-IPIP for each of the Big Five (r = .95, .88, .92, .93, .85, for Extraversion, Agreeableness, Conscientiousness, Neuroticism, and Intellect/Imagination, respectively).

Convergent Validity and Criterion-Related Validity Measures

The Big Five Inventory (BFI). Participants completed this 44-item measure, which also assesses the Big Five traits (John & Srivastava, 1999), using a 5-point scale: Extraversion (8 items): M = 3.43, SD = .72, α = .86; Agreeableness (9 items): M = 3.82, SD = .56, α = .79; Conscientiousness (9 items): M = 3.63, SD = .60, α = .83; Neuroticism (8 items): M = 2.93, SD = .73, α = .84; Openness (10 items): M = 3.50, SD = .57, α = .79. The BFI demonstrated good convergent validity with the IPIP-FFM (r =

.84, .64, .73, .86, .74 for Extraversion, Agreeableness, Conscien- Keeping these caveats in mind, we tested the structure of the
tiousness, Neuroticism, and Openness with Intellect/Imagination, Mini-IPIP using a CFA model. The fit of the CFA model was
respectively) and the Mini-IPIP (r = .81, .49, .66, .80, .68 for rejected by the chi-square test of exact fit (which is sensitive to
Extraversion, Agreeableness, Conscientiousness, Neuroticism, and even trivial misspecifications with increasing sample sizes); how-
Openness with Intellect/Imagination, respectively). ever, the fit of the model was within reason based on the RMSEA
Self-esteem. Participants completed the 10-item Rosenberg value (χ² = 359.30, df = 160, p < .05; χ²/df = 2.25; CFI = 0.88;
Self-Esteem Scale using a 5-point scale (Rosenberg, 1965; M = RMSEA = 0.07, p close fit < .05). Relevant coefficients from this
3.84, SD = .66, α = .88). Robins, Tracy, Trzesniewski, Potter, and CFA model are reported in Table 3. The only potentially low
Gosling (2001) presented evidence that the Big Five are linked loading was for the Neuroticism item “I seldom feel blue” (stan-
with global self-esteem. Specifically, they reported modest to dardized loading = .39). We also examined the model modifica-
strong correlations between self-esteem and Neuroticism and be- tion indices and found suggestions that modeling several cross-
tween self-esteem and Extraversion in an extremely large Internet loadings for specific items would improve model fit. However,
sample (see also Schmitt & Allik, 2005). They also reported small given that many parameter estimates obtained through model
to modest positive correlations between self-esteem and the three modification fail to replicate (e.g., MacCallum, 1986) and the fact
other Big Five factors. Thus, we expected a similar pattern of that we found reasonable fit without the modifications, we did not
associations with the Big Five measures in Study 3. One partici- choose to model secondary factor loadings post hoc. Our results
pant did not complete the self-esteem items, so the sample size was suggest that the Mini-IPIP is at least as useful as other Big Five
299 for this scale. measures for CFA models.
Behavioral inhibition/behavioral approach (BIS/BAS). Re-
spondents completed the original Carver and White (1994) mea-
sures using a 5-point scale. The Behavioral Inhibition Scale (BIS) Criterion-Related Validity for the Big Five Measures
measures individual differences in the sensitivity to the behavioral
Table 4 displays the correlations between the Big Five measures
avoidance or inhibition system (7 items: M ⫽ 3.67, SD ⫽ .62, ␣ ⫽
and the three criterion measures used in Study 3. Patterns of
.79; sample item: “Criticism or scolding hurts me quite a bit”),
criterion-related validities were similar across all Big Five mea-
whereas the Behavioral Approach scale (BAS) measures individ-
sures, with the exception of the Agreeableness-BIS relationship
ual differences in the sensitivity to the behavioral approach system
being positive for the IPIP-FFM and Mini-IPIP and near-zero for
(13 items: M ⫽ 3.67, SD ⫽ .50, ␣ ⫽ .85; sample item: “It would
the BFI. Differences between the IPIP-FFM and the Mini-IPIP
excite me to win a contest”). The behavioral inhibition system is
scales were not substantial from a practical standpoint. As a final
believed to underlie Neuroticism primarily (Carver & White, 1994;
comparison between measures, we conducted a series of regression
Watson, Wiese, Vaidya, & Tellegen, 1999), whereas the behav-
analyses to examine the predictive validity of each of the Big Five
ioral approach system is believed to underlie Extraversion (Lucas,
measures as a set. For each analysis, we regressed each of the three
Diener, Grob, Suh, & Shao, 2000; Watson et al., 1999). Thus, the
criteria on a given set of Big Five scales and recorded the multiple
Neuroticism and Extraversion scales of the Big Five measures
R values (see the bottom of Table 4). In general, the multiple R
should relate to the BIS and BAS scales, respectively, along these
values were very similar across the three Big Five measures.
same lines.

Results and Discussion Study 4: Short-Term Retest Reliability and Additional


Evidence for Criterion-Related Validity of the Mini-IPIP
Confirmatory Factor Analysis of the 4-Item IPIP Scales Related to Psychopathology
We conducted a confirmatory factor analysis (CFA) on the Sample
structure of the Mini-IPIP using the program AMOS 5.0 (Ar-
buckle, 2003; variance-covariance matrix available from the first Test-retest data were available from 216 undergraduate students
author). We restricted these analyses to those 296 participants with enrolled in psychology courses at a large research university in
complete item level data. A recent CFA of the IPIP-FFM did not Michigan who participated in exchange for course credit or extra
show good model fit at the item level (Lim & Ployhart, 2006), and credit during the Spring Semester of 2005. Data were collected
indeed, the fit of the CFA model for IPIP-FFM for this sample was over the Internet using a web-based interface. The interval between
certainly not ideal (␹2 ⫽ 2,822.68, df ⫽ 1,165, p ⬍ .05; ␹2/df ⫽ the Time 1 and Time 2 assessments was approximately three
2.42; CFI ⫽ .74; RMSEA ⫽ .07, p close fit ⬍ .05; parameter weeks. All psychopathology-related criterion-related measures
estimates for this model are available from the first author). Past were administered at Time 2. Information on gender and racial/
published CFA models of Big Five inventories have had to esti- ethnic group membership was not collected, but the participant
mate secondary loadings to obtain even remotely satisfactory pool was the same as Study 2 and Study 3, and therefore demo-
indices of overall model fit (e.g., Church & Burke, 1994; McCrae, graphic characteristics are likely to be quite similar. These data
Zonderman, Costa, Bond, & Paunonen, 1996). Thus, it might not were initially collected as part of a different study investigating
be possible to obtain reasonable fit from a CFA perspective on response biases in daily self reports.
many or even most Big Five inventories. Much of the model misfit IPIP-FFM and Mini-IPIP. Participants completed the 50-item
in omnibus inventories arises from the fact that, “most items and IPIP-FFM measure at Time 1, with the following descriptive
tests tend to have substantial relations with at least two factors statistics: Extraversion: M ⫽ 3.41, SD ⫽ .73, ␣ ⫽ .88; Agreeable-
rather than with only one” (Goldberg, 1993. p. 186). ness: M ⫽ 4.11, SD ⫽ .52, ␣ ⫽ .80; Conscientiousness: M ⫽ 3.51,

Table 3
Confirmatory Factor Analysis of the Mini-IPIP (Study 3)

Standardized Loading (rows are Mini-IPIP item numbers)

Item   Extraversion   Agreeableness   Conscientiousness   Neuroticism   Intellect/Imagination
1      .68
6      .76
11     .74
16     .75
2                     .76
7                     .56
12                    .75
17                    .72
3                                     .65
8                                     .67
13                                    .59
18                                    .67
4                                                         .80
9                                                         .58
14                                                        .80
19                                                        .39
5                                                                       .68
10                                                                      .50
15                                                                      .52
20                                                                      .72

Correlations between Latent Variables

                          1.      2.      3.      4.      5.
1. Extraversion           1.00
2. Agreeableness          .30     1.00
3. Neuroticism            −.21    −.18    1.00
4. Conscientiousness      .09     .12     −.22    1.00
5. Intellect/Imagination  .35     .40     −.18    .02     1.00

Note. N = 296. Model fit results: χ² = 359.30, df = 160, p < .05; χ²/df = 2.25; CFI = 0.88; RMSEA = 0.07, p close fit < .05.
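
As a rough check on the model fit results in the note to Table 3, the χ²/df ratio and the RMSEA point estimate can be recomputed from χ², df, and N. The sketch below is not part of the original analysis, which was run in AMOS 5.0. The function names are illustrative, and the code assumes the common single-group formula RMSEA = sqrt(max(χ² − df, 0) / (df × (N − 1))); software-specific conventions and rounding of the published inputs can produce small discrepancies.

```python
import math


def chi_square_ratio(chi2: float, df: int) -> float:
    """Return the normed chi-square, i.e. chi2 divided by its degrees of freedom."""
    return chi2 / df


def rmsea(chi2: float, df: int, n: int) -> float:
    """Return the RMSEA point estimate for a single-group model.

    Assumes the formula sqrt(max(chi2 - df, 0) / (df * (n - 1))),
    which is floored at zero when chi2 is smaller than df.
    """
    return math.sqrt(max(chi2 - df, 0.0) / (df * (n - 1)))


# Inputs taken from the note to Table 3 (Mini-IPIP CFA, Study 3, N = 296).
mini_ratio = chi_square_ratio(359.30, 160)
mini_rmsea = rmsea(359.30, 160, 296)

print(f"chi2/df = {mini_ratio:.2f}")
print(f"RMSEA   = {mini_rmsea:.3f}")
```

Under these assumptions the ratio reproduces the reported χ²/df, and the RMSEA estimate falls between .06 and .07, close to the reported value.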

SD = .58, α = .77; Neuroticism: M = 2.85, SD = .80, α = .89; Intellect/Imagination: M = 3.64, SD = .56, α = .79. The 20-item Mini-IPIP was then calculated from the relevant subset of items within this measure: Extraversion: M = 3.39, SD = .87, α = .81; Agreeableness: M = 4.16, SD = .61, α = .69; Conscientiousness: M = 3.42, SD = .73, α = .60; Neuroticism: M = 2.67, SD = .85, α = .76; Intellect/Imagination: M = 3.72, SD = .75, α = .70. There was good convergence between the IPIP-FFM and the Mini-IPIP for each of the Big Five at Time 1 (r = .94, .88, .89, .94, .87, for Extraversion, Agreeableness, Conscientiousness, Neuroticism, and Intellect/Imagination, respectively). Participants completed the 50-item IPIP-FFM measure at Time 2: Extraversion: M = 3.46, SD = .77, α = .90; Agreeableness: M = 4.11, SD = .54, α = .81; Conscientiousness: M = 3.58, SD = .59, α = .77; Neuroticism: M = 2.97, SD = .83, α = .90; Intellect/Imagination: M = 3.75, SD = .59, α = .81. As before, the 20-item Mini-IPIP was then calculated from the relevant subset of items within this measure: Extraversion: M = 3.47, SD = .90, α = .83; Agreeableness: M = 4.20, SD = .63, α = .72; Conscientiousness: M = 3.52, SD = .76, α = .63; Neuroticism: M = 2.78, SD = .85, α = .75; Intellect/Imagination: M = 3.77, SD = .75, α = .67. There was good convergence between the IPIP-FFM and the Mini-IPIP for each of the Big Five at Time 2 (r = .94, .89, .88, .94, .87, for Extraversion, Agreeableness, Conscientiousness, Neuroticism, and Intellect/Imagination, respectively).

Measures of Psychopathology-Relevant Criteria at Time 2

Anxiety. Participants completed the 20-item Spielberger Trait Anxiety Scale (Spielberger, 1983) using a 4-point scale (M = 2.08, SD = .51, α = .92).

Depression. Participants completed the recently developed 10-item short form of the Center for Epidemiological Studies Depression scale using a 4-point scale (Cole, Rabin, Smith, & Kaufman, 2004; M = 1.78, SD = .48, α = .80).

Aggression/hostility. Participants completed the 29-item Buss-Perry Aggression Questionnaire using a 5-point scale (Buss & Perry, 1992; M = 2.21, SD = .57, α = .91). We computed a total score by taking the average of all of the items.

Psychological entitlement. Participants completed the 9-item inventory developed by Campbell, Bonacci, Shelton, Exline, and Bushman (2004) using a 5-point scale (M = 2.45, SD = .75, α = .87). This measure assesses the degree to which individuals feel more deserving than others, an individual difference that in past

research has been associated with unpleasant interpersonal behaviors, such as taking candy that was intended to be given to young children (Campbell et al., 2004). As to be expected, this scale was positively correlated with the Aggression/Hostility scale in these data (r = .32).

Table 4
Criterion-Related Validity for Big-Five Measures for Predicting Self-Esteem and Behavioral Approach/Avoidance (Study 3)

                          Criteria
Big Five Scale            Self-Esteem   BIS     BAS
Extraversion
  BFI                     .36           −.14    .50
  IPIP-FFM                .38           −.12    .52
  Mini-IPIP               .35           −.13    .50
Agreeableness
  BFI                     .37           .06     .11
  IPIP-FFM                .36           .21     .28
  Mini-IPIP               .24           .26     .23
Conscientiousness
  BFI                     .38           −.02    .13
  IPIP-FFM                .30           −.03    .14
  Mini-IPIP               .26           −.07    .04
Neuroticism
  BFI                     −.55          .57     −.08
  IPIP-FFM                −.59          .51     −.02
  Mini-IPIP               −.60          .42     −.03
Imagination/Intellect
  BFI                     .23           −.11    .35
  IPIP-FFM                .37           −.12    .43
  Mini-IPIP               .28           −.14    .29
R
  BFI                     .67           .64     .55
  IPIP-FFM                .72           .64     .60
  Mini-IPIP               .68           .57     .54

Note. N = 299 to 300. BFI = Big Five Inventory (John & Srivastava, 1999), BIS = Behavioral Inhibition Scale (Carver & White, 1994), BAS = Behavioral Approach System (Carver & White, 1994).

Results and Discussion

Short-Term Retest Correlations

The short-term retest correlations for the IPIP-FFM scales were high: r = .89, .72, .79, .87, and .83, for Extraversion, Agreeableness, Conscientiousness, Neuroticism, and Intellect/Imagination, respectively. The retest correlations for Mini-IPIP scales were also high: r = .87, .62, .75, .80, and .77, for Extraversion, Agreeableness, Conscientiousness, Neuroticism, and Intellect/Imagination, respectively.

Criterion-Related Validity

Table 5 displays the validity coefficients for the Big Five assessed at Time 1 predicting scores on criterion measures assessed at Time 2, approximately three weeks later. In general, the coefficients were similar for the IPIP-FFM and Mini-IPIP scales, further indicating that the Mini-IPIP displays a similar pattern of criterion-related validity. Lastly, the total amount of variance explained by the Big Five was quite similar for the IPIP-FFM scales and the Mini-IPIP scales, as shown by the multiple Rs.

Table 5
Criterion-Related Validity for Big-Five Measures Predicting Psychopathology-Relevant Outcomes (Study 4)

                           Criteria at Time 2 (≈3 Weeks Later)
Big Five Scale at Time 1   Anxiety   Depression   Hostility/aggression   Psychological entitlement
Extraversion
  IPIP-FFM                 −.30      −.24         −.07                   .22
  Mini-IPIP                −.22      −.20         −.06                   .19
Agreeableness
  IPIP-FFM                 −.17      −.16         −.33                   −.16
  Mini-IPIP                −.18      −.15         −.26                   −.12
Conscientiousness
  IPIP-FFM                 −.20      −.22         −.22                   .02
  Mini-IPIP                −.19      −.16         −.21                   .03
Neuroticism
  IPIP-FFM                 .80       .54          .48                    .03
  Mini-IPIP                .77       .53          .49                    .01
Imagination/Intellect
  IPIP-FFM                 −.14      −.15         .01                    .07
  Mini-IPIP                −.07      −.12         .03                    .06
R
  IPIP                     .82       .58          .59                    .34
  Mini-IPIP                .79       .56          .56                    .28

Note. N = 216.

Study 5: The Mini-IPIP Scales’ Longer-Term Stability, Associations With Informant Reports, and Further Evidence of Criterion-Related Validity Related to Affect and Life Satisfaction

Method

Participants

Data for this study come from Study 3 in Baird, Le, and Lucas (2006). The present analyses are unrelated to the aims of Baird et al., and the most substantial overlap is that retest correlations for the IPIP are also contained in that report. Nonetheless, this dataset is particularly important for evaluating the Mini-IPIP because it provides longitudinal data collected over a six- to nine-month period in addition to an informant report of personality. Retest data were available from 148 undergraduates (78% women) who were recruited for a study of personality and affect that took place over the nine months of the school year at a large public university in Michigan. Data analyzed here are based on questionnaire assessments that were completed at both time points.

Each participant was asked to provide the names of friends or family members who could serve as informants. Informants rated participants’ personality using the IPIP-FFM items and were given the choice of responding via web site or a paper-and-pencil questionnaire. Informants who completed the questionnaire were entered into a drawing for $50. There were an average of 3.08 reports per participant (SD = 1.06), and we obtained at least one informant report for 133 participants (90% of the 148). For individuals with more than one informant report (91% of the 133), we averaged item responses across informants before creating aggregate scales.

Personality Measures

IPIP-FFM and Mini-IPIP. Participants completed the 50-item IPIP-FFM measure at Time 1, with the following descriptive statistics: Extraversion: M = 3.58, SD = .83, α = .91; Agreeableness: M = 4.25, SD = .50, α = .81; Conscientiousness: M = 3.61,

SD = .64, α = .82; Neuroticism: M = 2.99, SD = .87, α = .91; Intellect/Imagination: M = 3.86, SD = .52, α = .79. Statistics for the 20-item Mini-IPIP were then calculated from the relevant subset of items within this measure: Extraversion: M = 3.55, SD = .94, α = .83; Agreeableness: M = 4.32, SD = .57, α = .72; Conscientiousness: M = 3.49, SD = .84, α = .73; Neuroticism: M = 2.79, SD = .98, α = .83; Intellect/Imagination: M = 3.98, SD = .67, α = .69. There was good convergence between the IPIP-FFM and the Mini-IPIP for each of the Big Five at Time 1 (r = .94, .87, .91, .96, .86, for Extraversion, Agreeableness, Conscientiousness, Neuroticism, and Intellect/Imagination, respectively). Participants completed the 50-item IPIP-FFM measure at Time 2, with the following descriptive statistics: Extraversion: M = 3.59, SD = .83, α = .92; Agreeableness: M = 4.25, SD = .54, α = .87; Conscientiousness: M = 3.61, SD = .67, α = .85; Neuroticism: M = 2.93, SD = .83, α = .90; Intellect/Imagination: M = 3.87, SD = .53, α = .80. Again, statistics for the 20-item Mini-IPIP were then calculated from the relevant subset of items within this measure: Extraversion: M = 3.59, SD = .94, α = .87; Agreeableness: M = 4.30, SD = .56, α = .77; Conscientiousness: M = 3.52, SD = .85, α = .76; Neuroticism: M = 2.74, SD = .90, α = .78; Intellect/Imagination: M = 3.94, SD = .67, α = .70. There was good convergence between the IPIP-FFM and the Mini-IPIP for each of the Big Five at Time 2 (r = .95, .91, .92, .96, .86, for Extraversion, Agreeableness, Conscientiousness, Neuroticism, and Intellect/Imagination, respectively).

Informant reports. Informants rated participants on the 50-item IPIP-FFM scales, with the following descriptive statistics: Extraversion: M = 3.70, SD = .64, α = .92; Agreeableness: M = 4.15, SD = .49, α = .90; Conscientiousness: M = 3.62, SD = .60, α = .91; Neuroticism: M = 2.85, SD = .61, α = .90; Intellect/Imagination: M = 3.91, SD = .43, α = .86. Note that reliabilities are relatively high because the “items” are actually aggregated reports across multiple informants.

Criterion Variables at Time 2

Positive affect. Participants completed the 8-item scale from the Intensity and Time Affect Survey (ITAS; Diener, Smith, & Fujita, 1995) using a 5-point scale (M = 3.85, SD = .65, α = .87).

Negative affect. Participants completed the 16-item scale from the ITAS (M = 2.21, SD = .70, α = .93).

Life satisfaction. Participants completed the 5-item Satisfaction with Life Scale (Diener, Emmons, Larsen, & Griffin, 1985) using a 5-point scale (M = 3.61, SD = .98, α = .86).

Results and Discussion

Longer-Term Retest Correlations

The longer-term retest correlations for the IPIP-FFM scales were very high: r = .88, .79, .81, .82, and .82, for Extraversion, Agreeableness, Conscientiousness, Neuroticism, and Intellect/Imagination, respectively. The retest correlations for Mini-IPIP scales were also high: r = .86, .68, .77, .82, .75, for Extraversion, Agreeableness, Conscientiousness, Neuroticism, and Intellect/Imagination, respectively. Thus, the longer-term retest reliability of Mini-IPIP scales resembles the coefficients of the original IPIP-FFM scales.

Associations With Informant Reports

The IPIP-FFM scales at Time 1 were associated with informant reports of the Big Five: r = .52, .31, .51, .34, and .35, for Extraversion, Agreeableness, Conscientiousness, Neuroticism, and Intellect/Imagination, respectively. Likewise, the Mini-IPIP scales at Time 1 were associated with informant reports of the Big Five: r = .53, .30, .47, .37, and .26, for Extraversion, Agreeableness, Conscientiousness, Neuroticism, and Intellect/Imagination, respectively. Thus, the Mini-IPIP scales had similar properties as the IPIP-FFM scales in terms of the associations between self reports and informant reports of personality. Note that these correlations do not approach unity, nor were they expected to, given that self-other correlations for personality traits are often in the range of .30 to .60 (e.g., Funder, 1999).

Criterion-Related Validity

Table 6 displays the correlations between the Big Five assessed at Time 1 and criterion measures assessed at Time 2. As in the previous studies, the coefficients were similar for the IPIP-FFM and Mini-IPIP scales, further indicating that the Mini-IPIP scales have a similar pattern of criterion-related validity as the 10-item scales. Likewise, the total amount of variance explained by the Big Five for each of the criterion variables was quite similar for the IPIP-FFM scales and the Mini-IPIP scales, as indicated by the multiple Rs.

Table 6
Criterion-Related Validity for Big-Five Measures Predicting Affect and Life Satisfaction (Study 5)

                           Criteria at Time 2 (≈ 6–9 Months Later)
Big Five Scale at Time 1   Positive Affect   Negative Affect   Life Satisfaction
Extraversion
  IPIP                     .31               −.39              .29
  Mini-IPIP                .31               −.34              .28
Agreeableness
  IPIP                     .50               −.22              .27
  Mini-IPIP                .43               −.09              .21
Conscientiousness
  IPIP                     .25               −.14              .21
  Mini-IPIP                .16               −.16              .15
Neuroticism
  IPIP                     −.27              .63               −.35
  Mini-IPIP                −.29              .61               −.40
Imagination/Intellect
  IPIP                     .16               .01               .06
  Mini-IPIP                .06               −.03              .02
R
  IPIP                     .57               .67               .47
  Mini-IPIP                .56               .65               .49

Note. N = 148.

General Discussion

Sometimes researchers need to use short assessments. This simple fact may raise concerns from reviewers of manuscripts and grant proposals and will aggravate purists who prefer the original measure to that of a shorter form. Unfortunately, there is no way

to escape many practical constraints such as the limited patience or attention span of research participants, the fixed time period allowed for testing, and financial limits for conducting a study. To meet the practical need for short forms of broad personality traits, we attempted to create a brief measure of the Big Five in a scientifically appropriate manner by reducing the 50-item public domain IPIP-FFM inventory (Goldberg, 1999) to 20 items. The series of studies just presented give us several reasons to have confidence that we created a practically useful brief measure of the Big Five.

First, results from all five studies indicated that the Mini-IPIP scales had respectable internal consistencies given their length and content breadth. The alphas were at or well above .60 across all five studies. Second, the Mini-IPIP scales tapped nearly the same Big Five facet content as the IPIP-FFM scales as demonstrated in Study 2. That is, when we correlated the Mini-IPIP and the IPIP-FFM scales with a separate IPIP measure assessing the facets of the Big Five, we obtained a very similar pattern of associations. To be sure, brief scales may not capture all facets of the Big Five with equal fidelity; however, our four-item scales did not seem remarkably deficient when compared to their parent scales. The broad conclusion that we draw from Table 2 is that researchers who opt to assess the Big Five broadly can do so with relatively few items. Indeed, the TIPI scales showed a similar pattern of content coverage as the Mini-IPIP and IPIP-FFM scales. This attests to the usefulness of the TIPI for those situations where even 20 items are too many.

Moreover, the retest correlations for the Mini-IPIP scales were quite similar to the IPIP-FFM scales across intervals of a few weeks (Study 4) and several months (Study 5). Finally, the Mini-IPIP scales showed a similar pattern of convergent and criterion-related validity as compared to the 10-item IPIP-FFM scales (Studies 2–5). It is important to note that researchers might have to tolerate some reduction in validity because smaller scales have somewhat lower internal consistency reliability. Even given this possible tradeoff, the Mini-IPIP reliability coefficients were high and the validity coefficients were very similar to the IPIP-FFM scales. These pieces of evidence suggest that researchers will not sacrifice much predictive validity when using these brief scales, at least with respect to the family of criterion variables that we assessed having to do with psychological distress and well-being. On balance, any slight reduction in criterion-validity has to be weighed against the practical benefits of reducing the total number of items in the IPIP-FFM scale from 50 to 20 items (a 60% reduction).

In light of these considerations, we believe that the Mini-IPIP has much to recommend to researchers who need a brief assessment of the Big Five. Lim and Ployhart (2006) recently demonstrated that the IPIP-FFM was a very suitable replacement for the widely used NEO-FFI (Costa & McCrae, 1992) and concluded that “researchers could readily benefit from having this free alternative to the existing proprietary instruments when conducting personality-related research” (p. 50). By extension, especially in light of our study findings, it appears that the Mini-IPIP also may prove to be a very useful, efficient, and ultimately economical instrument given that it is in public domain.

In addition to creating a practically useful tool, we gained several insights into the construction of short forms that are likely to have broad applicability to researchers who opt to shorten existing measures. Foremost, we concur with Smith et al. (2000) in that researchers should begin the process of constructing short forms based on an empirically refined “parent” instrument (see also Marsh et al., 2005). The 50-item IPIP-FFM inventory is psychometrically strong in terms of the internal consistencies of its scales and its factor structure (at least from the perspective of exploratory factor analysis). These properties greatly facilitated the creation of the short form. Conversely, if the parent instrument has severe flaws, then it will be nearly impossible to create a reasonable short form from it. In addition, the content validity of the parent form places the upper limit on the content validity of the shorter form.

Second, short-form development is greatly enhanced when researchers can proceed in a theoretically informed manner. For example, by drawing on the perspective that the five factors of personality are in fact orthogonal (see Saucier, 2002), we were able to develop an additional rationale for selecting items for our short form. That is, we selected items for scales that were relatively “pure” indicators of one and only one of the Big Five. Without this basic conceptual guidance, we would have had more difficulty in the process of developing the Mini-IPIP. Moreover, a clear idea of the content span of the Big Five factors aided our selection of items, which was indicated by the distribution of the IPIP-FFM items across the facets of each Big Five factor. We have thus provided specific examples of what may be an obvious underlying principle: Developing an appropriate short form requires theoretical and conceptual understanding of the constructs under investigation.

Lastly, replication is a crucial aspect of good short form development. Indeed, Smith et al. (2000) emphasized the need for independent replication in the development of short forms, and we closely followed their guideline. We started the process with a large dataset to ensure that the factor loading matrix was reasonably stable. We then used a second independent dataset to adjust the item selection and replicated our results using three other datasets. This process increased our confidence in the Mini-IPIP, because we found that the adequacy of the short form was a repeatable phenomenon across samples and contexts. We encourage other short-form developers to rely on this method of replication when constructing their inventories. Likewise, we suggest that users of short forms demand this kind of evidence when selecting these sorts of measures for their studies.

Despite our enthusiasm for the Mini-IPIP, a few limitations and caveats should be noted. The most important issue concerns the samples that were used to develop this measure. All of our analyses were based on college-student samples, and future research should extend these results to other populations. We have no reason to suspect that the results will not generalize to other groups, but this is an empirical matter to be tested. Second, rather than examine the subset of Mini-IPIP items as we did, future research should administer the smaller scales by themselves to additional samples to examine the properties of the Mini-IPIP. We had argued that transient errors could be reduced by administering shorter forms, and thus the very brief inventory may reduce respondent fatigue and content redundancy, thereby increasing the reliability and validity of the measure compared with the results reported here. This potential benefit might be countered by the fact that participants may need more items to get in a suitable mindset

for responding to items, and in this sense fewer items may lead to greater measurement error.

Finally, there is content overlap between some of the items in the Mini-IPIP scales. This reduces the scales’ construct breadth in the service of increased internal consistency. This is perhaps an inevitable consequence of our attempt to balance several potentially competing considerations: short scale length, high internal consistency reliabilities, and empirical distinctiveness between scales. As noted by Saucier and Goldberg (2002), “scale construction can serve any of many possible masters, but these masters can lead us in divergent directions” (p. 53). Although we believe that the Mini-IPIP has achieved a workable balance among these concerns, we do acknowledge that the construct breadth of the Mini-IPIP scales is a potential limitation of the measure.

All told, our studies indicate that the Mini-IPIP is a useful tool for researchers needing a very short assessment of the Big Five. Our bottom line is that the 20-item Mini-IPIP is nearly as good as the longer 50-item IPIP-FFM parent instrument in terms of both reliability and validity. This is, of course, our judgment based on the empirical findings presented here, and we stress that test users must consider the Mini-IPIP within their own context of research or application, and they should consider the many costs and benefits of using short forms when deciding whether to use the Mini-IPIP or its parent inventory. When making this decision, we believe it is useful to adopt a practical perspective, namely that of research participants who are often faced with the task of actually completing a long series of questionnaires. Our experience is that many study participants hold the view that the shorter the testing time, the better the study. We learned that it is possible to make very effective measures of broad constructs with relatively few items. As such, we suspect that many instruments might be longer than necessary and therefore could be successfully shortened by taking an approach similar to ours. In closing, we note that shortening the length of instruments may lead to subtle improvements in the experience and motivation of those participating in psychological research, one outcome that could yield big dividends for psychological science.

References

Arbuckle, J. L. (2003). AMOS 5.0 update to the AMOS user’s guide. Chicago: Small Waters.
Baird, B. M., Le, K., & Lucas, R. E. (2006). On the nature of intraindividual personality variability: Reliability, validity, and associations with well-being. Journal of Personality and Social Psychology, 90, 512–527.
Ball, S. A. (2005). Personality traits, problems, and disorders: Clinical applications to substance use disorders. Journal of Research in Personality, 39, 84–102.
Block, J. (1995). A contrarian view of the five-factor approach to personality description. Psychological Bulletin, 117, 187–215.
Bollen, K. A. (1989). Structural equations with latent variables. New York: Wiley.
Buss, A. H., & Perry, M. (1992). The aggression questionnaire. Journal of Personality and Social Psychology, 63, 452–459.
Campbell, W. K., Bonacci, A. M., Shelton, J., Exline, J. J., & Bushman, B. J. (2004). Psychological entitlement: Interpersonal consequences and validation of a self-report measure. Journal of Personality Assessment, 83, 29–45.
Carver, C. S., & White, T. L. (1994). Behavioral inhibition, behavioral activation, and affective responses to impending reward and punishment: The BIS/BAS scales. Journal of Personality and Social Psychology, 67, 319–333.
Church, A. T., & Burke, P. J. (1994). Exploratory and confirmatory tests of the Big Five and Tellegen’s three and four-dimensional models. Journal of Personality and Social Psychology, 66, 93–114.
Cole, J. C., Rabin, A. S., Smith, T. L., & Kaufman, A. S. (2004). Development and validation of a Rasch-derived CES-D short form. Psychological Assessment, 16, 360–372.
Costa, P. T., Jr., & McCrae, R. R. (1992). NEO-PI-R professional manual. Odessa, FL: Psychological Assessment Resources.
Costa, P. T., Jr., & Widiger, T. A. (2002). Personality disorders and the Five-Factor Model of personality (2nd ed.). Washington, DC: American Psychological Association.
Diener, E., Emmons, R. A., Larsen, R. J., & Griffin, S. (1985). The satisfaction with life scale. Journal of Personality Assessment, 49, 71–75.
Diener, E., Smith, H., & Fujita, F. (1995). The personality structure of affect. Journal of Personality and Social Psychology, 69, 130–141.
Durrett, C., & Trull, T. J. (2005). An evaluation of evaluative personality terms: A comparison of the Big Seven and Five-Factor model in predicting psychopathology. Psychological Assessment, 17, 359–368.
Fabrigar, L. R., Wegener, D. T., MacCallum, R. C., & Strahan, E. J. (1999). Evaluating the use of exploratory factor analysis in psychological research. Psychological Methods, 4, 272–299.
Floyd, F. J., & Widaman, K. F. (1995). Factor analysis in the development and refinement of clinical assessment instruments. Psychological Assessment, 7, 286–299.
Funder, D. C. (1999). Personality judgment: A realistic approach to person perception. San Diego: Academic Press.
Funder, D. C. (2001). Personality. Annual Review of Psychology, 52, 197–221.
Goldberg, L. R. (1993). The structure of personality traits: Vertical and horizontal aspects. In D. C. Funder, R. D. Parke, C. Tomlinson-Keasey, & K. Widaman (Eds.), Studying lives through time: Personality and development (pp. 169–188). Washington, DC: American Psychological Association.
Goldberg, L. R. (1999). A broad-bandwidth, public-domain, personality inventory measuring the lower-level facets of several five-factor models. In I. Mervielde, I. J. Deary, F. De Fruyt, & F. Ostendorf (Eds.), Personality psychology in Europe (Vol. 7, pp. 7–28). Tilburg, The Netherlands: Tilburg University Press.
Gosling, S. D., Rentfrow, P. J., & Swann, W. B. (2003). A very brief measure of the Big-Five personality domains. Journal of Research in Personality, 37, 504–528.
Guadagnoli, E., & Velicer, W. F. (1988). Relation of sample size to the stability of component patterns. Psychological Bulletin, 103, 265–275.
John, O. P., & Srivastava, S. (1999). The Big Five trait taxonomy: History, measurement, and theoretical perspectives. In L. A. Pervin & O. P. John (Eds.), Handbook of personality: Theory and research (2nd ed., pp. 102–138). New York: Guilford Press.
Johnson, J. A. (2000). Developing a short form of the IPIP-NEO: A report to HGW Consulting. Unpublished manuscript. Department of Psychology, University of Pennsylvania, DuBois, PA.
Kenny, D. A. (1979). Correlation and causality. New York: Wiley.
Kline, R. B. (2004). Principles and practice of structural equation modeling (2nd ed.). New York: Guilford Press.
Krueger, R. F., Caspi, A., & Moffitt, T. E. (2000). Epidemiological personology: The unifying role of personality in population-based research on problem behaviors. Journal of Personality, 68, 967–998.
Lim, B. C., & Ployhart, R. E. (2006). Assessing the convergent and discriminant validity of Goldberg’s International Personality Item Pool: A multitrait-multimethod examination. Organizational Research Methods, 9, 29–54.
Little, T. D., Lindenberger, U., & Nesselroade, J. R. (1999). On selecting
MINI-IPIP SCALES 203

indicators for multivariate measurement and modeling latent variables: Saucier, G. (2002). Orthogonal markers for orthogonal factors: The case of
When “good” indicators are bad and “bad” indicators are good. Psycho- the Big Five. Journal of Research in Personality, 36, 1–31.
logical Methods, 4, 192–211. Saucier, G., & Goldberg, L. R. (2002). Assessing the Big Five: Applica-
Lucas, R. E., Diener, E., Grob, A. Suh, E. M., & Shao, L. (2000). tions of 10 psychometric criteria to the development of marker scales. In
Cross-cultural evidence for the fundamental features of extraversion. B. De Raad & M. Perugini (Eds.), Big Five assessment (pp. 29 –58).
Journal of Personality and Social Psychology, 79, 452– 468. Seattle. WA: Hogrefe and Huber Publishers.
MacCallum, R. C. (1986). Specification searches in covariance structure Schmidt, F. L., Le, H., & Ilies, R. (2003). Beyond alpha: An empirical
modeling. Psychological Bulletin, 100, 107–120. examination of the effects of different sources of measurement error on
Marsh, H. W., Ellis, L. A., Parada, R. H., Richards, G., & Heubeck, B. G. reliability estimates for measures of individual differences constructs.
(2005). A short version of the Self Description Questionnaire II: Opera- Psychological Methods, 8, 206 –224.
tionalizing criteria for short-form evaluation with new applications of Schmitt, D. P., & Allik, J. (2005). Simultaneous administration of the
confirmatory factor analyses. Psychological Assessment, 17, 81–102. Rosenberg Self-Esteem scale in 53 nations: Exploring the universal and
McCrae, R. R., Zonderman, A. B., Costa, P. T. Bond, M. H., & Paunonen, culture-specific features of global self-esteem. Journal of Personality
S. V. (1996). Evaluating the replicability of factors in the revised NEO and Social Psychology, 89, 623– 642.
personality inventory: Confirmatory factor analysis versus Procrustes Smith, G. T., McCarthy, D. M., & Anderson, K. G. (2000). On the sins of
rotation. Journal of Personality and Social Psychology, 70, 552–566. short-form development. Psychological Assessment, 12, 102–111.
Miller, J. D., & Lynam, D. (2001). Structural models of personality and Spielberger, C. D. (1983). Manual for the State-Trait Anxiety Inventory.
their relation to antisocial behavior: A meta-analytic review. Criminol- Palo Alto, CA: Consulting Psychologists Press.
ogy, 39, 765–798. Stanton, J. M., Sinar, E. F., Balzer, W. K., & Smith, P. C. (2002). Issues
Oswald, F. L., Schmitt, N., Kim, B. H., Ramsay, L. J., & Gillespie, M. A. and strategies for reducing the length of self-report scales. Personnel
(2004). Developing a biodata measure and situational judgment inven- Psychology, 55, 167–194.
tory as predictors of college student performance. Journal of Applied Trull, T. J., & Sher, K. J. (1994). Relationship between the five-factor
Psychology, 89, 187–207. model of personality and Axis I disorders in a non-clincal sample.
Pedhazur, E. J. (1997). Multiple regression in behavioral research: Expla- Journal of Abnormal Psychology, 103, 350 –360.
nation and prediction. New York: Harcourt College. Watson, D., Wiese, D., Vaidya, J., & Tellegen, A. (1999). The two general
Robins, R. W., Tracy, J. L., Trzesniewski, K., Potter, J., & Gosling, S. D. activation systems of affect: Structural findings, evolutionary consider-
(2001). Personality correlates of self-esteem. Journal of Research in ations, and psychobiological evidence. Journal of Personality and Social
Personality, 35, 463– 482. Psychology, 76, 820 – 838.
Rosenberg, M. (1965). Society and the adolescent self-image. Princeton, Widiger, T. A. (2005). Five factor model of personality disorder: Integrat-
NJ: Princeton University Press. ing science and practice. Journal of Research in Personality, 39, 67– 83.
Saucier, G. (1994). Mini-markers: A brief version of Goldberg’s unipolar Wiggins, J. S. (2003). Paradigms of personality assessment. New York:
Big-Five markers. Journal of Personality Assessment, 63, 506 –516. Guilford Press.

Appendix

20-Item Mini-IPIP

Item  Factor  Text                                                        Original Item Number
1     E       Am the life of the party.                                   1
2     A       Sympathize with others’ feelings.                           17
3     C       Get chores done right away.                                 23
4     N       Have frequent mood swings.                                  39
5     I       Have a vivid imagination.                                   15
6     E       Don’t talk a lot. (R)                                       6
7     A       Am not interested in other people’s problems. (R)           22
8     C       Often forget to put things back in their proper place. (R)  28
9     N       Am relaxed most of the time. (R)                            9
10    I       Am not interested in abstract ideas. (R)                    20
11    E       Talk to a lot of different people at parties.               31
12    A       Feel others’ emotions.                                      42
13    C       Like order.                                                 33
14    N       Get upset easily.                                           29
15    I       Have difficulty understanding abstract ideas. (R)           10
16    E       Keep in the background. (R)                                 16
17    A       Am not really interested in others. (R)                     32
18    C       Make a mess of things. (R)                                  18
19    N       Seldom feel blue. (R)                                       19
20    I       Do not have a good imagination. (R)                         30

Note. E = Extraversion; A = Agreeableness; C = Conscientiousness; N = Neuroticism; I = Intellect/Imagination; (R) = Reverse Scored Item. Original 50-item IPIP-FFM available at http://ipip.ori.org/newQform50b5.htm.
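
The scoring key in the table above can be sketched in code. The following is a minimal illustration, not the authors' scoring procedure: the factor assignments and reverse-keyed items are transcribed from the appendix, while the 1-to-5 response range, the use of item means as scale scores, and all names (`FACTOR_ITEMS`, `REVERSED`, `score_mini_ipip`) are assumptions made for this example.

```python
from statistics import mean

# Factor membership by Mini-IPIP item number, transcribed from the appendix.
FACTOR_ITEMS = {
    "E": [1, 6, 11, 16],   # Extraversion
    "A": [2, 7, 12, 17],   # Agreeableness
    "C": [3, 8, 13, 18],   # Conscientiousness
    "N": [4, 9, 14, 19],   # Neuroticism
    "I": [5, 10, 15, 20],  # Intellect/Imagination
}

# Items marked (R) in the appendix.
REVERSED = {6, 7, 8, 9, 10, 15, 16, 17, 18, 19, 20}

# Assumption for this sketch: responses are integers on a 1-to-5 scale.
SCALE_MIN, SCALE_MAX = 1, 5


def score_mini_ipip(responses):
    """Return one score per factor from a dict {item number: rating}.

    Reverse-keyed items are flipped (1 <-> 5, 2 <-> 4), and each scale
    score is the mean of its four keyed items (an assumed convention).
    """
    missing = set(range(1, 21)) - set(responses)
    if missing:
        raise ValueError(f"missing items: {sorted(missing)}")
    for item, value in responses.items():
        if not SCALE_MIN <= value <= SCALE_MAX:
            raise ValueError(f"item {item}: rating {value} out of range")

    def keyed(item):
        value = responses[item]
        return SCALE_MIN + SCALE_MAX - value if item in REVERSED else value

    return {
        factor: mean([keyed(item) for item in items])
        for factor, items in FACTOR_ITEMS.items()
    }


# Usage: a respondent who answers 5 to every item.
all_fives = {item: 5 for item in range(1, 21)}
scores = score_mini_ipip(all_fives)
```

Because four of the five scales balance two forward-keyed and two reverse-keyed items, uniform responding lands at the scale midpoint on those scales; Intellect/Imagination has three reverse-keyed items, so the same response pattern yields a lower score there.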

Received June 28, 2005
Revision received January 19, 2006
Accepted January 20, 2006
