Respondent rationale for neither agreeing nor disagreeing: Person and item
contributors to middle category endorsement intent on Likert personality indicators
John T. Kulas a,*, Alicia A. Stachowski b
a Department of Psychology, Saint Cloud State University, St. Cloud, MN 56301, USA
b Department of Psychology, University of Wisconsin – Stout, Menomonie, WI, USA
Article info
Article history: Available online 28 February 2013
Keywords: Personality assessment; Middle category; Likert; Self-report

Abstract
The current study examines intentions behind middle category endorsement in personality assessment, and investigates person and item antecedents to these intentions. Participants verbally explained their responses to 100 personality items and completed personality, self-concept clarity, and cognitive ability measures. Talked through items were scaled with respect to clarity, complexity, and need for contextualization. Verbal protocols suggest that the predominant respondent orientation when selecting the Likert middle category is it depends. Candidate item and person antecedents indicate that middle category endorsement intentions are more closely attributable to item rather than respondent characteristics. These findings suggest that consecutive integer scoring algorithms may result in personality scale attenuation – particularly with instruments that contain indicators reflecting an ambiguous or unspecified context.
© 2013 Elsevier Inc. All rights reserved.
0092-6566/$ - see front matter © 2013 Elsevier Inc. All rights reserved.
http://dx.doi.org/10.1016/j.jrp.2013.01.014
J.T. Kulas, A.A. Stachowski / Journal of Research in Personality 47 (2013) 254–262 255
development of social–motivational–cognitive models of the psychological inventory response process.

1.1. Reasons to endorse the middle category
Several reasons have been offered for a respondent choosing to select neither agree nor disagree on psychological inventories. We grossly characterize these orientations as middle category use (the respondent can be characterized as possessing relatively moderate construct/indicator standing) and misuse (the respondent is not reasonably characterized as possessing moderate construct/indicator standing). Shaw and Wright (1967) initially noted three possible response orientations in the context of attitude measurement. First, respondents may, in fact, hold no attitude regarding an attitude object. Second, participants may be balanced in their evaluation of an attitude object. Lastly, a participant’s attitude may not be clearly defined.
DuBois and Burns (1975) similarly argue that respondents may endorse the middle category because of either ambivalence or indifference. The former indicates an inability to decide whether to agree or disagree, and the latter reflects simply not caring either way. Yet an additional possibility is that respondents may select the middle category if they believe that they are not sufficiently informed to take a valenced position (DuBois & Burns, 1975; Kulas et al., 2008). Stone (2004) stated that he could see no value for a middle category given that such a response “can reflect a decision not to prefer either end, a lack of information by which to choose, or an unwillingness to commit to a definitive response” (p. 212). Respondents, however, do prefer that such a category be offered (McDonald, 2004).
For the above reasons, researchers and assessment specialists have long grappled with what to do with middle category endorsements. The key consideration here lies in the proper attribution of endorsement intent (e.g., use or misuse). However, without a descriptive and explanatory framework describing the mechanisms and process of assessment response, the attribution of any endorsement to valid versus invalid sources is difficult. Although such a model is lacking in the psychological assessment domain, initial steps toward such a framework have been presented via stage models of the general survey response process (e.g., Schwarz & Oyserman, 2001; Shulruf, Hattie, & Dixon, 2008; Tourangeau & Rasinski, 1988).

1.1.1. The survey response process
As noted by Krosnick (1991), survey respondents are frequently asked to put forth substantial cognitive effort for very little reward. To accommodate this observation, he developed satisficing theory, which builds on Tourangeau and Rasinski’s (1988) question–answering process-model. Tourangeau and Rasinski’s (1988) model posits that respondents proceed through four steps (i.e., understanding and interpreting the question, retrieving information from memory, consolidating information to form a judgment, and reporting the judgment) when encountering survey or questionnaire items. Krosnick added to this model by specifying two meta-processes: optimizing and satisficing. These two processes correspond to making an effortful, thoughtful, and accurate decision, which requires proceeding through all four decision-making steps (optimizing), or taking cognitive shortcuts in the form of skimming some steps or skipping them altogether (satisficing; Krosnick, 1999).
We consider the psychological inventory response process to represent a specific form of survey response, and therefore build our nascent perspective on middle category response orientations primarily from within this survey research literature. Although relatively more emphasis has been placed on response processes within this literature (than, for example, the psychological assessment literature), there remains a void regarding specific process elaboration.¹ What is truly missing in the field of psychological inventory assessment is a descriptive model of the response process that encompasses context, motives, capabilities, and (particularly) social–cognitive forces contributing to response selection (e.g., addressing the rather large black box between stages 1 and 4 in Tourangeau and Rasinski’s (1988) model). The construction of such a model is beyond the scope of the current paper,² but we hope that our methodology and results are viewed as possible tools to be leveraged toward developing a framework that informs a more comprehensive understanding of the psychological inventory response process.

¹ Note here that testing is deemed to be qualitatively different from administration of an inventory. Ericsson and Simon (1993), for example, noted several differences inherent in the cognitive processes required for response to ability versus personality assessment items. Specifically, the ability-testing application is thought to evoke a more carefully sequenced set of cognitive operations than is the personality application. Inventory respondents can pursue relatively simpler cognitive processes such as acquiescing or choosing the most socially desirable alternative – these options are not commonly available in knowledge or ability testing applications.
² Krosnick’s work on satisficing theory is an important contribution in this domain, but it does not satisfy the need to describe process and predict response. The lack of such a guiding model is perhaps most glaringly irksome in the literature(s) on faking and social desirability, where little progress has been made in the understanding of these response orientations beyond well-established documentation of elevated scores on certain items (e.g., Edwards, 1957), with certain people (e.g., Crowne & Marlowe, 1960), or in certain contexts (e.g., Mueller-Hanson, Heggestad, & Thornton, 2003).

1.2. The current study
The primary focus of our investigation is in determining whether or not there is a predominant response orientation for those who endorse the middle category (neither agree nor disagree) in personality assessment. If so, is this general orientation better characterized as use or misuse? Secondly, we seek to determine whether response orientations toward the middle category can be reasonably expected to differ based on item and/or person characteristics (e.g., are there some conditions that are more likely to elicit a true moderate orientation?). Our expectation is that the primary use of the middle category will, in fact, be a misuse. Kulas and Stachowski (2009) documented a primary middle-category orientation of it depends, meaning that in some situations respondents would agree, while in others the respondent would disagree. This orientation was more likely (across respondents and items) than was a moderate or average trait standing orientation. The current study employs an open-ended categorization of response orientations, and specific hypotheses (such as prediction of a targeted response orientation) are, therefore, not attempted. Rather, we simply anticipate a predominant non-moderate endorsement orientation (e.g., misuse):

Hypothesis 1. The predominant middle category orientation does not reflect moderate trait standing.

Of possibly greater interest are the specific reasons that respondents endorse the middle category with different intentions. While we can make predictions concerning the broad likelihood of middle category use and misuse based on prior research, we do not have sufficient theoretical or empirical bases for predicting the specific reasons that underlie different intentions with selection of this category. Empirically, Hernández et al. (2004) documented inter-individual differences, whereas Kulas et al. (2008) and Kulas and Stachowski (2009) documented strong item associations with middle category endorsements. As such, we investigate both person and item antecedents in the present study.

1.2.1. Person antecedents
Our framework to identify person antecedents is taken primarily from Krosnick (1991), who proposed that both ability and personality traits may be related to individual effort exerted during the survey response process. The process involved in determining whether one agrees or disagrees with a personality statement is more cognitively taxing than opting out of this process (and thereby selecting the middle category with corresponding misuse intent; e.g., I don’t know, I don’t care, I’m not willing to say). Consistent with Krosnick’s perspective, Warwick and Lininger (1975) identified intellectual satisfaction as a motive to respond accurately. Relatedly, Krosnick et al. (2002) noted that respondents higher in cognitive skill exhibited less attraction to a no-opinion endorsement than those with lower cognitive skill. Collectively, it is feasible that participants of higher cognitive capacity would be less likely to endorse the middle response for reasons other than moderate trait standing. We therefore retain a measure of cognitive ability as an inter-individual candidate for middle category endorsement intent.
The second person characteristic identified by Krosnick (1991) is personality. Hernández et al. (2004), who also investigated whether the middle category may be differentially used and interpreted, identified two categories/classes of 16PF respondents. One class endorsed the middle category with greater frequency and one with less frequency. They found that people who were more reserved and less outgoing (e.g., low on Warmth) used the middle category with relatively greater frequency – perhaps because they had difficulty in openly sharing information about themselves. They documented a similar positive association for the Socially Bold, stating that those individuals were perhaps more likely to ignore instrument directions to avoid endorsing the middle option, and used the category regardless. Emotionally stable, apprehensive, and insecure individuals avoided the middle category whereas those respondents scoring high on the Impression Management scale selected it more frequently, perhaps to avoid a negative social image.
In addition to the above cognitive ability and personality dimension candidates, we also retain self-concept clarity as a possible inter-individual antecedent to middle category use and intent. Self-concept clarity concerns the extent to which someone has both clearly defined and stable self-beliefs (Campbell et al., 1996). Similar to our rationale for a negative association between intelligence and middle category (mis)use, we acknowledge that there may be a negative relationship between self-concept clarity and middle category use. Durand and Lambert (1988) note that with more knowledge of a topic, fewer no opinion response selections are observed. Expanding on this, the more clearly defined attitudes one has about oneself, the fewer no opinion (and by extension) non-valid neither agree nor disagree responses one may be expected to select.
Although the above identified person antecedents will be investigated, it is anticipated that item characteristics are more closely associated with both gross frequency of middle category endorsement as well as differential endorsement intent than are respondent characteristics. It is important here to note that the primary empirical source of inter-individual antecedents – the Hernández et al. (2004) investigation – was focused on 16PF scales and responses. The response options on this inventory are: a, ?, and c, which are considered ordered responses by Hernández et al., but differ semantically as well as in graded degree from the common Likert application.³ Inter-individual difference constructs will be investigated, but the expectation is that middle category response orientations are relatively stable across persons (whereas they exhibit relatively more variability across items).

³ Corresponding labels for a and c differ across 16PF items – most commonly, a is accompanied by the label true and c is accompanied by the label false. However, some items across the assessment have a and c correspond to (for example) different sentence continuations, rarely versus often, or sometimes versus never. Furthermore, instructions direct respondents to avoid selecting ? and these middle category endorsements are in fact omitted from scale definitions. With the above in mind, we propose that 16PF item response scales are (for the majority of items) better characterized as forced choice options rather than graded responses.

1.2.2. Item antecedents
In addition to the above noted candidate inter-individual antecedents of middle category use and/or misuse, a few investigations have documented inter-item characteristics of interest. Kulas et al. (2008) replaced personality item verbs with less commonly encountered synonyms and recorded increased middle category endorsements for these less clear or more complex items. The implication is that middle category endorsements may be the preferred response option for respondents who have difficulty comprehending an item. Kulas and Stachowski (2009) documented middle category endorsement intent, but only investigated four possibilities: not applicable, uncertain, average, or it depends. Results here again implicate item clarity as an antecedent, with it depends being the predominant intent with middle category endorsement (of the limited candidate possibilities). They suggested that contextualization of items (e.g., editing an item stem of “I get along with people” to “I get along with people at work”) may minimize the number of it depends middle category orientations elicited from an item. Collectively, based on past empirical findings, author suggestions, and consideration of Krosnick’s (1991) effort component, we retain and explore candidate item characteristics of clarity, need for contextualization, and item length/complexity.
Given the above discussion, we offer the following hypotheses:

Hypothesis 2. Person characteristics are only moderately related to middle category use and misuse.

Hypothesis 3. Item characteristics are strongly related to middle category use and misuse.

2. Method
Our study design utilized a talk aloud methodology, whereby participants responded to a series of personality items using a common 5-point Likert-rating scale while simultaneously explaining their selection to each question (i.e., their thought process). Here, we asked respondents to verbalize anything that came to mind as they provided responses. This procedure is most typically encountered in item pretesting contexts (e.g., DeMaio & Rothgeb, 1996) although our interest was in categorizing the verbalizations themselves. In total, participants responded to four psychological assessments: (1) a five-factor model (FFM) personality inventory, (2) a cognitive ability test, (3) a measure of self-concept clarity, and (4) the focal 100-item computer-administered personality assessment (each described below in greater detail).

2.1. Participants
One hundred and twenty-two undergraduate students from a large Midwestern university were given course extra credit for participation.

2.2. Measures

2.2.1. Cognitive ability
The Wonderlic Personnel Test (WPT) is a timed (12 min), 50-item cognitive ability measure that yielded a current sample mean of 19.76 and a standard deviation of 6.15. For comparison, a 2003 job applicant normative sample (N = 109,729) was centered on a WPT mean of 20.3 (SD = 7.0; Wonderlic, 2007). The test manual reports test–retest coefficients ranging from .82 to .94 and split-half coefficients ranging from .88 to .94 (Wonderlic, 2002).

2.2.2. Self-concept clarity
A 12-item measure developed by Campbell et al. (1996) captures the extent to which ambiguity or clarity exists in one’s conceptualization of self. The authors indicate that the response scale “ranges from 1 (strongly disagree) to 5 (strongly agree)” (p. 145). For the current investigation, the administered response scale was: strongly disagree, disagree, neither agree nor disagree, agree, and strongly agree. An example (reverse scored) item is, “I spend a lot of time wondering about what kind of person I really am”.

2.2.3. Personality
The NEO-FFI (Costa & McCrae, 1992) assesses individual differences across five dimensions: Neuroticism (α = .84), Extraversion (α = .79), Openness to Experience (α = .72), Agreeableness (α = .72), and Conscientiousness (α = .81). The instrument contains 60 items (12 items per dimension) with response options consisting of: strongly disagree, disagree, neutral, agree, and strongly agree.

2.2.4. Focal personality items
The personality items of focal interest were obtained from the International Personality Item Pool (IPIP). The IPIP is an on-line item bank that at the time of assessment contained 2413 indicators (these items can be found at http://ipip.ori.org; Goldberg, 1999). Each participant provided responses to a random sample of 100 of these items (upon completion of the experiment 2391 items were administered between 1 and 14 times). Example items include, “I am always prepared,” and “Pay attention to details”. Responses were provided using a 5-point Likert scale: strongly disagree, disagree, neither agree nor disagree, agree, and strongly agree.
Each of the 2391 administered items was rated (by graduate students and the authors) along dimensions of clarity and need for contextualization. All item ratings were made independent of knowledge of participant response patterns across the items.

2.2.5. Clarity
Clarity was assessed via direct ratings of the extent to which item content was clearly or unclearly expressed along an 8-point continuum ranging from (−4) extremely unclear to (4) extremely clear. Five graduate student or author raters provided clarity estimates (interrater reliability [Spearman–Brown corrected intraclass coefficient] = .54; cf., Ebel, 1951). Most items were rated as being relatively clear (M = 2.37; SD = .77). An example high clarity item is, “Am often late to work” (M = 4.0). An example of an item exhibiting low clarity is, “Dislike tastes that I usually like” (M = 1.6).

2.2.6. Need for contextualization
An item’s need for contextualization was characterized as the extent to which a respondent’s answer would differ or depend on the context within which the question is asked. Six graduate student or author raters were instructed that a respondent’s answers to some personality items may depend greatly on context while others are likely not dependent on context. Ratings were made along a 6-point scale ranging from (1) context is not at all important to (6) context is extremely important (M = 2.30; SD = .64; interrater reliability = .57). An example item rated as needing contextualization is, “Act comfortably with others” (M = 4.7). An example of an item rated low for need for contextualization is, “Often feed homeless people at my front door” (M = 1.0).

2.2.7. Length
In addition to the above item characteristics of primary interest (e.g., these were explicitly documented [clarity] or directly implicated [item contextualization] in previous research), a simple character count of item length was also investigated as a rough proxy of item complexity (M = 34.21; SD = 15.86).

2.3. Procedure
All participants were tested individually. Following general instructions and signing consent forms, participants completed counterbalanced orderings of the WPT, the NEO-FFI, the self-concept clarity measure, and the 100-item computer administered personality assessment. Apart from the timed WPT, all measures were completed alone (e.g., without the presence of a researcher).
The computer program E-prime v1.1 (Schneider, Eschman, & Zuccolotto, 2002) was used to record response categories and latencies on one computing machine. Each item presentation was preceded by a ready screen that audibly stated the item number and prompted the respondents to indicate when they were prepared to see and respond to the next question. Item presentations began with the cursor set at a location in the middle of the computer screen (320 pixels from the left, 240 pixels from the top) and response options located 165 pixels distant from the center start point. Responses were selected using the computer’s mouse. A microphone was suspended from the ceiling and was connected to a cassette tape recorder. These audio tapes provided records of the respondents’ verbalizations across items.

2.3.1. Response coding
Verbalizations were transcribed and each statement was then printed and represented on a strip of paper. One graduate and two undergraduate research assistants unfamiliar with the purpose of the experiment met and placed the strips of paper (responses) into piles based on what they deemed to be primary recurring themes. These research assistants were different individuals than the research assistants who administered the experiment. In total, there were 11 categorization meetings, each of which consisted of 1–2 h of placing strips of paper into categories, naming and renaming categories, and creating new categories if needed. Ultimately 22 substantive categories were identified that the students felt did a good job of capturing responses across the random sampling of approximately 2000 responses that they reviewed. In addition to these 22 substantive categories, there were responses with no elaboration (e.g., the respondent simply mentioned a response option such as strongly agree) and a category of responses that did not fit neatly into any identified category (these were primarily filled with typographical errors or the sentence structures were incomprehensible). This resulted in a total of 29 possible response categories (e.g., primary respondent orientations).
These categories were provided short descriptive definitions to be used for purposes of statement coding (e.g., each of the transcribed statements was to be placed into one most appropriate category). The 29 categories were then grouped by the authors into nine broader response themes in an attempt to help our undergraduate and graduate student coders identify response orientations with greater ease (the nine broad response themes, 29 specific response categories, and a brief description of each category are presented in Appendix A). Student coders were next trained on each response theme and category and given a definition sheet prior to providing categorical ratings for statements. The coders were naïve to the purpose of the coding project. Indeed, upon debriefing, it was revealed that the coders were unaware that response endorsements were recorded as well as spoken reactions to the item stems. As an additional safeguard for coding accuracy, all transcribed statements were coded by two raters. These coders met once per week to resolve disagreements regarding coded category. All codes, therefore, reflect eventual agreement.
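The interrater reliabilities reported for the clarity and need for contextualization ratings (Sections 2.2.5 and 2.2.6) are described as Spearman–Brown corrected coefficients. As a minimal sketch of what such a correction does, the Python below steps a single-rater reliability up to the reliability of the mean of k raters and inverts that relation. The function names are hypothetical, and treating the reported .54 (five raters) and .57 (six raters) as stepped-up values for the pooled raters is an assumption, not something the text states outright.

```python
def spearman_brown(single_rater_r: float, k: float) -> float:
    """Reliability of the mean of k raters, given the single-rater reliability."""
    if k <= 0:
        raise ValueError("k must be positive")
    return k * single_rater_r / (1 + (k - 1) * single_rater_r)


def implied_single_rater(pooled_r: float, k: float) -> float:
    """Invert the Spearman-Brown formula: single-rater reliability
    implied by a pooled (k-rater) reliability."""
    if k <= 0:
        raise ValueError("k must be positive")
    return pooled_r / (k - (k - 1) * pooled_r)


if __name__ == "__main__":
    # Assumed inputs: the pooled reliabilities and rater counts reported above.
    for label, pooled, k in [("clarity", 0.54, 5), ("contextualization", 0.57, 6)]:
        print(label, round(implied_single_rater(pooled, k), 3))
```

Under that assumption, the implied single-rater reliabilities are roughly .19 (clarity) and .18 (need for contextualization), which would suggest that individual raters agreed only modestly and that pooling across raters carries most of the reported reliability.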
Table 1
Response code percentages as a function of category endorsement (all percentages reflect within column comparisons).
Code Strongly agree (%) Agree (%) Neither agree nor disagree (%) Disagree (%) Strongly disagree (%)
1. Repeats response scale with no further elaboration
1.1. “Strongly agree” 10
1.2. “Agree” 1 14
1.3. “Neither agree nor disagree” 5
1.4. “Disagree” 14 1
1.5. “Strongly disagree” 10
1.6. “Yes” 6 4
1.7. “No” 1 5 4
2. Frequency
2.1. To a certain extent/Once in a while 2 3 1 1
2.2. Most of the time/Normally/Usually 3 5 1 4 2
2.3. “Always”/“Never” 5 2 1 6
2.4. Sometimes/Situational/It depends/Conflicted 3 13 35 9 2
3. Emotion/attitude
3.1. Empathetic 2 1 1 2 2
3.2. Optimism 4 2 1 1 1
3.3. Agitated/Emotional 6 2 1 3 6
4. Hesitation/confidence
4.1. Changes mind 1 1 3 2 1
4.2. Indecisive/Confused by question 1 7 13 8 4
4.3. “I don’t know” 1 5 1 1
4.4. Confidence/“Definitely” 10 3 2 7
5. Self versus other orientation
5.1. Individualism 2 1 1 1 2
5.2. Compares self to others 3 3 2 2 2
5.3. Self-worth 3 2 2 2 2
5.4. Self-focus/Self-awareness/Introspection 21 19 9 25 25
5.5. Self-presentation 2 1 1 2 2
5.6. Interpersonal relationships 6 5 3 5 4
5.7. Preference expressed 6 5 2 5 7
6. Motivation/Try to/Attempt 3 5 3 3 3
7. Neutral 7
8. “Not really” 1 2
9. Other (does not fit into any of the above) 2 1 1 1 2
Note: Percentages were rounded to integer values. The absence of a percentage value indicates a frequency of less than 0.50%. Response code categories enclosed by double quotation marks indicate that the category is defined by explicit respondent articulation of the word(s) within the quotation marks.
Table 2
Correlations of number of middle category endorsements, cognitive ability, self-concept clarity, and FFM standing.
Variable                          1.      2.      3.      4.      5.      6.      7.      8.
1. Middle category endorsement    –
2. WPT                            .09     –
3. Self-concept clarity           .09     .02     –
4. Neuroticism                    .27     .01     .53     –
5. Extraversion                   .15     .07     .29     .49     –
6. Openness to experience         .02     .02     .20     .13     .10     –
7. Agreeableness                  .11     .09     .14     .40     .39     .02     –
8. Conscientiousness              .11     .00     .25     .41     .35     .24     .26     –
M                                 15.68   19.68   3.32    2.89    3.55    3.48    3.58    3.68
SD                                7.50    6.18    .74     .66     .54     .53     .47     .50
Table 3
Tendency toward middle category response orientations as a function of cognitive ability, self-concept clarity, and FFM
standing.
Table 5
Multinomial regression coefficients for item characteristics predicting response rationale.
Response rationale               B         SE      OR      95% CI for OR
Neither agree nor disagree
  Intercept                      .344      .925
  Clarity                        −.620**   .173    .538    .383–.756
  Need for contextualization     −.426     .249    .653    .401–1.064
  Length                         .002      .010    1.002   .982–1.022
Indecisive/Confused
  Intercept                      1.656*    .619
  Clarity                        −.540**   .123    .583    .458–.741
  Need for contextualization     −.750**   .169    .472    .339–.658
  Length                         .012      .006    1.012   1.000–1.025
I don’t know
  Intercept                      1.807*    .904
  Clarity                        −.647**   .174    .524    .373–.736
  Need for contextualization     −.938**   .256    .392    .237–.647
  Length                         −.003     .010    .997    .977–1.017
Self-focused/Introspection
  Intercept                      1.713*    .714
  Clarity                        −.332*    .146    .717    .538–.956
  Need for contextualization     −.934**   .196    .393    .268–.577
  Length                         −.002     .008    .998    .983–1.014
Neutral
  Intercept                      .241      .819
  Clarity                        −.269     .167    .764    .551–1.059
  Need for contextualization     −.528*    .219    .590    .384–.907
  Length                         .014      .008    1.014   .998–1.030
Notes: These data are compared to a referent “Situational/It Depends” response rationale. OR = Odds Ratio, CI = confidence interval. * p < .05. ** p < .01. Nagelkerke R² = .098. N = 837.

Fig. 2. Probability of middle category orientation as a function of an item’s “need for contextualization”. (Plot not reproduced; x-axis: Need for Contextualization, 1 to 5; y-axis: Probability; series: Neither (1.3), It depends (2.4), Confused (4.2), I don’t know (4.3), Self-aware (5.4), Neutral (7).)

Fig. 3. Probability of middle category orientation as a function of item length. (Plot not reproduced; x-axis: Number of Characters, 10 to 100; y-axis: Probability; series: Neither (1.3), It depends (2.4), Confused (4.2), I don’t know (4.3), Self-aware (5.4), Neutral (7).)
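The odds ratios and confidence intervals in Table 5 can be recovered from the B and SE columns, because in a multinomial logit model OR = exp(B) and a Wald interval is exp(B ± z·SE). The sketch below assumes z = 1.96 for a 95% interval and takes the clarity coefficient for the Neither agree nor disagree rationale as B = −.620, the sign implied by its odds ratio of .538; the function name is hypothetical.

```python
import math


def odds_ratio_ci(b: float, se: float, z: float = 1.96) -> tuple[float, float, float]:
    """Return (odds ratio, lower bound, upper bound) for a logit
    coefficient b with standard error se, using a Wald interval."""
    return math.exp(b), math.exp(b - z * se), math.exp(b + z * se)


if __name__ == "__main__":
    # Assumed inputs: clarity row, "Neither agree nor disagree" block of Table 5.
    odds_ratio, lower, upper = odds_ratio_ci(-0.620, 0.173)
    print(f"OR = {odds_ratio:.3f}, 95% CI = {lower:.3f} to {upper:.3f}")
```

The result agrees with the tabled values (.538, .383–.756) to within rounding of the published coefficients. Read against the Situational/It depends referent, an odds ratio below 1 for clarity indicates that the given rationale becomes less likely, relative to it depends, as rated item clarity increases.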
items had an oddity component to them that may reasonably be
reflects a strong introspective perspective or an expression of an awareness of one’s own traits, strengths, or weaknesses. This response orientation was the most common categorical intent for all other response alternatives (strongly agree, agree, disagree, and strongly disagree) which is perhaps comforting in the context of personality assessment (e.g., this implies optimizing via proceeding through stages 2 and 3 of Tourangeau and Rasinski’s (1988) model). The finding that this orientation is less common for middle category endorsements in general, but relatively more common for middle category endorsing introverts is not entirely surprising. Indeed, introverted individuals would be expected to have relatively more self-focused orientations across all response categories (including the middle-response option) than would less introverted respondents.
Regarding the broader idea of a descriptive model of the self-report response process, our results (as currently presented) are limited with regard to potential for process insight, although we believe our methodology holds promise for building such a model in the future. For the current investigation, we chose to code only the primary respondent orientation within each verbalization. Several respondent verbalizations, in fact, would have been better described as possessing several different orientations (in most circumstances, there was an apparent hierarchy of orientations which permitted our coding procedure). Therefore, although we only documented predominant response orientations in the current study, our data contains much more qualitative complexity, and could be revisited if the scope of a project included collective orientations descriptive of self-report response. Although we believe this form of information is of potential use to researchers interested in creating such a descriptive model, it is also, of course, limited by focus on conscious processes, respondents’ willingness to divulge accurate information, and self-awareness. Such qualitative talk aloud data should, therefore, be considered a potentially helpful tool in the building of a model of psychological inventory response, but would need to be accompanied by theory and methods that accommodate (for example) unconscious processes and differing respondent motives.
We believe the primary contribution of our study lies in the application of this novel talk aloud methodology to document predominant response orientations across endorsement options (Ta-
sonably expected to have a more situationally-specific orientation (e.g., their moderate trait expression is, in a real sense, conditional). This above possibility has not been found, however (e.g., extreme individuals avoiding situational middle-category orientations and moderate individuals engaging in situational endorsements on trait-relevant indicators; see Kulas & Stachowski, 2009). Instead, the Situational/it depends orientation should truly be considered a misuse, as these endorsements have been consistently associated with cross-construct item characteristics (as contrasted with, for example, inter-individual person characteristics).
Our primary recommendation is for assessment specialists to investigate the number of middle category endorsements that are elicited per item, and target the items that yield the largest numbers of middle category endorsements for revision – primarily regarding item clarity or need for contextualization. Additionally, the current results suggest that middle category endorsements should only be cautiously included in scale construction algorithms. Hernández et al. (2004) also note the danger of simple summated algorithms for scale construction across their classes of respondents. The roughly 50% of middle category endorsements that were best characterized as situational or confused misuse orientations are, of course, yielded from our particular sample and instrument (we note here that although other candidate item and person antecedents could have been included in our investigation, our response orientation coding procedure should be considered fairly comprehensive regarding possible response orientations). Inclusion of situational or confused moderate responses will attenuate scale scores – the current investigation suggests that this effect will be more likely with instruments in need of item revision regarding clarity or context.

Appendix A. Brief response code definitions
(1) Repeats response scale/yes–no
1.1. Strongly agree                 Verbalizes only “strongly agree”
1.2. Agree                          Verbalizes only “agree”
1.3. Neither agree nor disagree     Verbalizes only “neither agree nor disagree”
1.4. Disagree Verbalizes only ‘‘disagree’’
ble 1). Here we note that our use versus misuse characterization
1.5. Strongly disagree Verbalizes only ‘‘strongly disagree’’
of middle category endorsements can be applied to other scaled
1.6. Yes Verbalizes only ‘‘yes’’
endorsements as well. For example, processes and thoughts associ-
1.7. No Verbalizes only ‘‘no’’
ated with acquiescence or extreme response orientations could
potentially be uncovered via similar methodologies. Additionally, (2) How Frequently
it is well established that response category thresholds (for in- You. . .
stance, between agreeing and strongly agreeing) differ across 2.1. To a certain Answer is qualified – to a certain
respondents. It is possible that qualitative talk aloud data of a sim- extent extent or occasional
ilar nature to ours could provide some guidance regarding ratio- 2.2. Most of the time/ Indicates frequent occurrence of act/
nale as well as item and person antecedents to inter-individual normally behavior/thought
category threshold differences. 2.3. ‘‘Always’’/’’Never’’ Explicitly states ‘‘always’’ or ‘‘never’’
in response
2.4. Situational/it Indicates that answer would differ
5. Summary and future directions depends depending on the situation
(4) Hesitance/confidence
  4.1. Change mind                      Changes mind about initial answer after thinking about it
  4.2. Indecisive/confused by question  Confused by or hesitant to respond to question
  4.3. “I Don’t Know”                   Explicitly states that they “do not know” in their response
  4.4. Confidence/“Definitely”          Gives a confident answer, sure of the response, sassy
(5) Self-orientation/other orientation
  5.1. Individualism                    Expresses individualistic traits or characteristics in response
  5.2. Compare self to others           Directly compares self to other people or person in response
  5.3. Self-worth                       Expresses confidence [or lack of] in the self
  5.4. Self-focus/self awareness        Expresses an awareness of personal traits/limitations; introspective
  5.5. Self-presentation                Attempts to portray themselves in a particular manner
  5.6. Interpersonal                    References interpersonal relationships in response
  5.7. Preference                       Indicates a preference or affinity for certain objects or situations
(6) Motivation/try to/attempt           Describes things that they try to do or want to do
(7) Neutral                             Expresses neutrality regarding the question
(8) “Not Really”                        Explicitly states “not really” in response
(9) Doesn’t fit anywhere                Participant’s response doesn’t fit within any of the above categories

References

Campbell, J. D., Trapnell, P. D., Heine, S. J., Katz, I. M., Lavallee, L. F., & Lehman, D. R. (1996). Self-concept clarity: Measurement, personality correlates, and cultural boundaries. Journal of Personality and Social Psychology, 70, 141–156. http://dx.doi.org/10.1037/0022-3514.70.6.1114.
Clark, L. A., & Watson, D. (1995). Constructing validity: Basic issues in objective scale development. Psychological Assessment, 7, 309–319. http://dx.doi.org/10.1037/1040-3590.7.3.309.
Costa, P. T., & McCrae, R. R. (1992). Revised NEO Personality Inventory (NEO-PI-R) and NEO Five-Factor Inventory (NEO-FFI). Professional Manual. Odessa, FL: Psychological Assessment Resources.
Crowne, D. P., & Marlowe, D. (1960). A new scale of social desirability independent of psychopathology. Journal of Consulting Psychology, 24, 349–354. http://dx.doi.org/10.1037/h0047358.
DeMaio, T. J., & Rothgeb, J. M. (1996). Cognitive interviewing techniques: In the lab and in the field. In N. Schwarz & S. Sudman (Eds.), Answering questions (pp. 177–196). San Francisco, CA: Jossey-Bass.
Donnay, D. A. C., Morris, M. L., Schaubhut, N. A., & Thompson, R. C. (2005). Strong Interest Inventory manual: Research, development, and strategies for interpretation. Mountain View, CA: CPP Inc.
DuBois, B., & Burns, J. A. (1975). An analysis of the meaning of the question mark response category in attitude scales. Educational & Psychological Measurement, 35, 869–884. http://dx.doi.org/10.1177/001316447503500414.
Durand, R. M., & Lambert, Z. V. (1988). Don’t know responses in surveys: Analyses and interpretational consequences. Journal of Business Research, 16, 169–188. http://dx.doi.org/10.1016/0148-2963(88)90040-9.
Ebel, R. L. (1951). Estimation of the reliability of ratings. Psychometrika, 16, 407–424. http://dx.doi.org/10.1007/BF02288803.
Edwards, A. L. (1957). The social desirability variable in personality assessment and research. New York: Dryden.
Ericsson, K. A., & Simon, H. A. (1993). Protocol analysis: Verbal reports as data. Cambridge, MA: The MIT Press.
Goldberg, L. R. (1999). A broad-bandwidth, public domain, personality inventory measuring the lower-level facets of several five-factor models. In I. Mervielde, I. Deary, F. De Fruyt, & F. Ostendorf (Eds.), Personality psychology in Europe (Vol. 7, pp. 7–28). Tilburg, The Netherlands: Tilburg University Press.
Hernández, A., Drasgow, F., & González-Romá, V. (2004). Investigating the functioning of the middle category by means of a mixed-measurement model. Journal of Applied Psychology, 89, 687–699. http://dx.doi.org/10.1037/0021-9010.89.4.687.
Hofacker, C. F. (1984). Categorical judgment scaling with ordinal assumptions. Multivariate Behavioral Research, 19, 91–106.
Klopfer, F. J., & Madden, T. M. (1980). The middlemost choice on attitude items: Ambivalence, neutrality, or uncertainty? Personality and Social Psychology Bulletin, 6, 97–101. http://dx.doi.org/10.1177/014616728061014.
Krosnick, J. A. (1991). Response strategies for coping with the cognitive demands of attitude measures in surveys. Applied Cognitive Psychology, 5, 213–236.
Krosnick, J. A. (1999). Survey research. Annual Review of Psychology, 50, 537–567.
Krosnick, J. A., Holbrook, A. L., Berent, M. K., Carson, R. T., Hanemann, W. M., Kopp, R. J., et al. (2002). The impact of “no opinion” response options on data quality: Non-attitude reduction or an invitation to satisfice? Public Opinion Quarterly, 66, 371–403. http://dx.doi.org/10.1086/341394.
Kulas, J. T., & Stachowski, A. A. (2009). Middle category endorsement in Likert-type response scales: Associated item characteristics, response latency, and intended meaning. Journal of Research in Personality, 43, 489–493. http://dx.doi.org/10.1016/j.jrp.2008.12.005.
Kulas, J. T., Stachowski, A. A., & Haynes, B. A. (2008). Middle response functioning in Likert responses to personality items. Journal of Business and Psychology, 22, 251–260. http://dx.doi.org/10.1007/s10869-008-9064-2.
McDonald, J. L. (2004). The optimal number of categories for numerical rating scales. Unpublished doctoral dissertation, University of Denver.
Mueller-Hanson, R., Heggestad, E. D., & Thornton, G. C. (2003). Faking and selection: Considering the use of personality from select-in and select-out perspectives. Journal of Applied Psychology, 88, 348–355. http://dx.doi.org/10.1037/0021-9010.88.2.348.
Nowlis, S. M., Kahn, B. E., & Dhar, R. (2002). Coping with ambivalence: The effect of removing a neutral option on consumer attitude and preference judgments. Journal of Consumer Research, 29, 319–334.
Presser, S., & Schuman, H. (1980). The measurement of a middle position in attitude surveys. Public Opinion Quarterly, 44, 70–85.
Preston, C. C., & Colman, A. M. (2000). Optimal number of response categories in rating scales: Reliability, validity, discriminating power, and respondent preferences. Acta Psychologica, 104, 1–15. http://dx.doi.org/10.1016/S0001-6918(99)00050-5.
Ryan, M. (1980). The Likert scale’s midpoint in communications research. Journalism Quarterly, 57, 305–313.
Schneider, W., Eschman, A., & Zuccolotto, A. (2002). E-prime user’s guide. Pittsburgh: Psychology Software Tools, Inc.
Schwarz, N., & Oyserman, D. (2001). Asking questions about behavior: Cognition, communication and questionnaire construction. American Journal of Evaluation, 22, 127–160.
Shaw, M. E., & Wright, J. M. (1967). Scales for the measurement of attitudes. NY: McGraw-Hill.
Shulruf, B., Hattie, J., & Dixon, R. (2008). Factors affecting responses to Likert type questionnaires: Introduction of the ImpExp, a new comprehensive model. Social Psychology of Education: An International Journal, 11, 59–78. http://dx.doi.org/10.1007/s11218-007-9035-x.
Stone, M. H. (2004). Substantive scale construction. In E. V. Smith, Jr. & R. M. Smith (Eds.), Introduction to Rasch measurement (pp. 201–225). Maple Grove, MN: JAM Press.
Tourangeau, R., & Rasinski, K. A. (1988). Cognitive processes underlying context effects in attitude measurement. Psychological Bulletin, 103, 299–314. http://dx.doi.org/10.1037/0033-2909.103.3.299.
Warwick, D. P., & Lininger, L. A. (1975). Psychology of reasoning: Structure and content. Cambridge, MA: Harvard University Press.
Wonderlic (2002). Wonderlic personnel test & scholastic level exam user’s manual. Libertyville, IL: Wonderlic, Inc.
Wonderlic (2007). Wonderlic personnel test normative report. Libertyville, IL: Wonderlic, Inc.