The Measurement of Happiness: Development of The Memorial University of Newfoundland Scale of Happiness (MUNSH)


Journal of Gerontology

1980, Vol. 35, No. 6, 906-912


Downloaded from http://geronj.oxfordjournals.org/ at York University Libraries on November 12, 2014


Albert Kozma, PhD, and M. J. Stones, PhD2
Items of the Affect Balance Scale, the Life Satisfaction Index-Z and the Philadelphia Geriatric Center
Scale together with 22 new items were used in the construction of a happiness scale for the elderly. Items
were initially administered to 301 subjects from urban, rural, and institutional settings and correlated
with ratings of happiness. A new scale consisting of 24 items was cross-validated on an additional 297
subjects. Test-retest reliability scores were obtained on 56 subjects. Results indicated that the new
scale was a better predictor of "avowed happiness" in both validation and cross-validation samples than
the existing scales used for comparison. Moreover, the new scale's test-retest reliability was within an
acceptable range for this type of scale.
Key Words: Elderly, Human, Well-being, Happiness scale

1 Support for this project came from the Social Sciences and Humanities Research Council of Canada, Grant No. 410-78-0377. Reprint requests should be addressed to the first author.
2 Dept. of Psychology, Memorial Univ. of Newfoundland, St. John's, Newfoundland, Canada. A1C 5S7.

The present article is addressed to the question of measurement in the area of "mental health" or "psychological well-being" among the elderly. Because of the diversity of involvement in this gerontological sub-area (e.g., behavioral, biological, medical, and social), several meanings have become attached to the mental health concept (Jahoda, 1958). Among those utilized by psychosocial gerontologists, happiness (Bradburn, 1969), life-satisfaction (Wood et al., 1969), and morale (Lawton, 1972) are favored constructs. Stones and Kozma (1980a) considered these three constructs from historical, rational, and empirical perspectives. From all three stances, happiness emerged as the construct of choice to best represent the mental health concept in psychosocial gerontology. Accordingly, its measurement rather than that of life-satisfaction or morale became the focus of our work.

The measurement of happiness presents imposing problems. At the heart of the issue lies the dichotomy between the subjective nature of the construct and the requirement of objectivity for any measurement technique. Because the subjective state of another person cannot be gauged directly, any form of objective measurement must be indirect. The basic problem, therefore, reduces to one of selecting the indirect index that represents the inner condition most appropriately and most consistently.

Four types of measurement have achieved prominence in the psychological literature. These are ratings by "expert" judges, behavioral assessment, self-appraisal, and performances on specially developed scales or tests. Each type of measurement has associated errors. For the expert judge and behavioral assessment methods, errors may arise from the qualities of (a) the assessor, (b) the assessee, and (c) the range of situations sampled. Assessor errors arise for reasons too numerous to detail fully here (but primarily concern the fallibility of the assessor as an expert observer and interpreter of behavior) and can be augmented by an overly narrow range of situations under which the subject's behavior is observed. Extent of assessor error can be checked to some degree by interjudge reliability estimates. Error originating with the assessee arises because the same behavior may reflect either current inner condition or habitual style of self-expression, or both. A smile upon greeting an acquaintance may indicate feelings
of happiness occasioned by that particular individual or the habitual adoption of a friendly posture. Assessee errors may prove especially serious when inter-individual comparisons are required or when the assessor is little acquainted with the subjects.

Self-appraisal as a measure of internal state is free from the measurement errors outlined in the preceding section and has been used quite successfully as a measure of happiness in the form of avowed happiness ratings (Bradburn, 1969, among others). Scales which are based on such self-appraisal have several advantages over their criterion measures. The most frequently cited ones include the greater reliability of scales and the lower susceptibility of scales to conscious or unconscious distortion. An important, but much underestimated, feature of scales lies in their potential for generating models for the phenomena they measure.

A prime example of such an effort is contained in the work of Bradburn (1969). In his earlier work Bradburn used self-ratings of happiness; later he developed the Affect Balance Scale (ABS) and from its internal structure derived a model of happiness that has proved very influential. Essentially, Bradburn (1969) suggested that happiness is a function of the balance between equally but oppositely weighted independent structural components (i.e., positive affect and negative affect). Later, Beiser (1974) added a third component that reflects a dispositional feature of happiness.

Although the ABS has proved to be a popular measure of happiness, a number of disadvantages are associated with its use. Firstly, the range of item sampling may be overly restrictive. Secondly, the weightings assigned to the component subscales may not be as applicable to elderly populations (Kozma & Stones, 1978; Moriwaki, 1974). Thirdly, the test-retest reliabilities are unacceptably low with intervals longer than a few days (Bradburn, 1969). Finally, the scale fails to include item content relevant to the dispositional aspects of happiness. For these reasons the present authors undertook to develop, validate, and cross-validate a measure of happiness for use with elderly populations. The scale is to be known as the Memorial University of Newfoundland Scale of Happiness (MUNSH).

The study was divided into three phases. During Phase I the authors sampled randomly from three elderly populations on the island portion of the Province of Newfoundland: urban, rural, and institutional. An array of items from various sources (including some newly devised) was presented to all subjects and, in addition, avowed happiness ratings were obtained. All items were correlated with self-appraisal (i.e., avowed happiness ratings) and only those that displayed a high degree of relationship were retained for inclusion in the scale. By this procedure, it was hoped to ensure that all scale items were relevant to the happiness construct. During Phase II the reduced item set was administered to fresh random samples of subjects for the purpose of checking whether the scale scores bore the same relationships to self-appraisal as in Phase I. Finally, during Phase III the scale was readministered to a sub-sample of subjects in order to provide a check on the test-retest reliability of the scale.

Phase I: Validation

METHOD

Subjects. — Prior to the commencement of the investigation a map of the island portion of Newfoundland was divided into regions, and each region subdivided into specific locales. By randomly sampling from locales within each region, representative samples from the whole geographic area could be obtained. During Phase I samples were obtained from urban, rural, and institutional settings. Urban and rural settings were defined as locales with populations in excess of or less than 3,000, respectively. Institutions were defined as the larger residential homes for senior citizens exceeding a population of 25. The sample size from urban, rural, and institutional settings was 104, 100, and 97, respectively.

Once urban and rural locales were selected, lists of elderly adults between 65 and 95 years of age living in these places were obtained from parish priests, public health nurses, and other service agencies. From these lists a sample proportionate to the total population of elderly in the area was randomly selected to achieve our required number of subjects. In rural areas the lists were exhausted; in urban ones names were pulled from a hat using a non-replacement procedure. For the institutional sample lists of all residents considered to be physically and
mentally capable of completing our test battery were obtained from the administrators. Where possible all subjects on these lists were included in the study.

Apparatus and materials. — The test battery consisted of the 21-item Philadelphia Geriatric Center Morale Scale (PGC), the 11-item Life Satisfaction Index-Z (LSI-Z), and the 10-item Affect Balance Scale (ABS) with its positive affect (PAS) and negative affect (NAS) subscales. Also included were 30 newly constructed items of the ABS-type, but with 10 items sampling longer-term affective states rather than current affect. Avowed happiness ratings were obtained by use of 7-rung ladders, where the top rung represented the state of being very happy and the bottom rung represented great unhappiness.

Procedure. — Once selected for inclusion in the study, the subject was approached by a member of the research team and the overall issues of the project were briefly described. Prior to administration of the test battery two types of avowed happiness ratings were obtained (i.e., avowed happiness "at this moment in time" or AVH and AVH30 which tapped happiness "over the past month"). Order of presentation of the avowed happiness ratings was varied across subjects. The test battery then was administered with a systematically varied presentation sequence. All items were presented orally, as pilot research had suggested reading difficulties in a substantial minority of subjects, and were scored according to dichotomous categories (i.e., yes/no).

RESULTS AND DISCUSSION

The data were subjected to a set of preliminary analyses. Correlation coefficients were computed between the two avowed happiness ratings to ascertain the amount of common variance associated with their use. Correlations were obtained for each of the three samples as well as the total subject population. The obtained values ranged from .71 to .74 and were thus considered sufficiently high to combine the two avowed happiness ratings into a single general index (AVHT). Correlations between AVHT and its two components ranged from .91 to .93 and indicate that this general measure adequately represents its two components. Accordingly, AVHT was utilized as our criterion measure during subsequent stages of scale construction.

Subsequent analyses deal with the development of the MUNSH and its relation to existing scales of psychological well-being. To reduce the size of the item pool and maintain its predictive power only items significantly correlated with AVHT were retained (r ≥ .28, p < .005). Of the 27 items meeting this criterion seven were of the positive affect type (i.e., from the positive affect subscale of the ABS or the newly constructed items assessing positive affect), five of the negative affect type, eight from the PGC, and seven from the LSI-Z. A further reduction of three items was necessary to maintain a balance between items for which an affirmative response is scored in the positive direction and items for which an affirmative response is scored in the negative direction, the basic distinction between positive and negative subscales of the ABS. Within this balanced scale the distinction was maintained between the "general emotive experience" type items predominating in the LSI-Z and PGC scales (as I look back on life I am fairly satisfied) and the more specific affective state items of the ABS (are you in high spirits at this time?) in order to sample content relevant both to specific affective experiences and the more general emotive experiences. This distinction is discussed extensively by Stones and Kozma (1980a, b). The final form of the MUNSH contained five positive affect type items (PA), five negative affect type items (NA), seven items of general positive experience (PE), and seven items of general negative experience (NE).

In order to examine the predictive powers of the MUNSH, it was entered into a regression analysis with the other three scales to predict AVHT. A step-wise multiple regression procedure which adds the predictors one at a time in the order of their contribution to the multiple correlation was carried out. The results of the procedure are presented in Table 1. The results indicate that the asymptote of predictive accuracy was reached only after all four scales were entered but that using the MUNSH alone produces essentially as accurate a prediction as using all four predictors. The independent contributions from the PGC, while statistically significant, accounted for only 1.6% additional AVHT variance. Contributions from the other scales failed to reach statistical significance.

Table 1. A Stepwise Regression Analysis of MUNSH, ABS, LSI-Z and PGC Scores to Predict AVHT.

Scales   Multiple R   Simple R   Beta    Standard Error of B   F
MUNSH    .667         .667       .783    .033                  49.07*
PGC      .679         .500       -.219   .031                  7.17*
ABS      .681         .546       .081    .044                  1.56
LSI-Z    .681         .486       .014    .040                  0.05
*p < .01

Differences between the predictive powers of each of the four scales were assessed by comparing their zero-order correlation with AVHT. Tests of significance on the difference between two correlation coefficients for correlated samples (Ferguson, 1959) were carried out and the obtained "t" values are recorded in Table 2. These results show the MUNSH to be significantly better at predicting AVHT than the ABS, LSI-Z, and PGC. On the other hand, the last three scales do not appear to differ in their predictive powers.

Table 2. "t" Values Reflecting Scale Differences in the Magnitude of their Correlation with AVHT.

Scales   PGC    ABS    MUNSH
LSI      .33    1.30   5.62
PGC             1.05   7.13
ABS                    3.81
*p < .005

Before reporting on the cross-validation results several other issues merit discussion. These include the internal consistency of the MUNSH and the scale's predictive power with our sub-populations. The internal consistency of the ABS, PGC, LSI-Z, and MUNSH was assessed by computing standardized alpha coefficients for each of the scales. The computations also were carried out for the two subscales of the ABS since the low intercorrelations between PAS and NAS would tend to lead to a low alpha value for the total scale. Since one-half of the items of the ABS and the MUNSH depict negative affect/experiences, scores on these items were recoded before alpha coefficients could be computed (2 = 0, 1 = 1, 0 = 2). The following standardized alpha coefficients were obtained: PAS, .495; NAS, .640; ABS, .591; PGC, .775; LSI-Z, .624; MUNSH, .858. Of the four scales only the MUNSH has a coefficient above .80 and meets minimum criteria for consistency. The low values obtained for the ABS, especially its PAS subscale, make it risky to use it as a measure of psychological well-being.

The predictive power of the MUNSH for our three sub-groups was assessed by obtaining correlations between AVHT and MUNSH scores for urban, rural, and institutional subjects. The respective values for the three sub-groups were .580, .735, and .703. Since the coefficient is significantly lower for urban than for rural and institutional subjects (p < .05) we attempted to devise separate scales for each of our sub-populations. However, subscales specifically derived for each subject sample led to disappointingly low increments in accounted AVHT variance (0.3%, 1.8%, and 2.3% respectively for the urban, rural, and institutional samples). Consequently, it was decided to report only on the construction of our general scale in the cross-validation phase of the investigation.

Phase II: Cross-Validation

METHOD

Subjects. — New subject samples were used during Phase II. The selection procedure followed the guidelines described in Phase I in order to ensure representative sampling. Sample sizes were 97 urban, 100 rural, and 100 institutional residents. The ages of our subjects ranged from 65 to 95 years.

Apparatus and materials. — All subjects were presented with the 24-item MUNSH developed during the validation phase of our study. As with any true cross-validation study only the final set of items obtained during validation was included in this phase. The MUNSH is presented in Table 3. In addition to our scale, subjects were administered the LSI-Z. The LSI-Z was employed for the following reasons: (1) its ability to predict AVHT during the validation phase was as good as that of the PGC and the ABS; (2) it was shorter than the PGC; and (3) it had a higher internal consistency coefficient than the ABS. Happiness ratings for "at this moment in time" and "over
the past month" were obtained from everyone. To obtain the two sets of happiness ratings, the 7-point scale described in the validation phase was used.

Table 3. The MUNSH: Instructions, Items, and Scoring.

We would like to ask you some questions about how things have been going. Please answer "yes" if a statement is true for you and "no" if it does not apply to you. In the past months have you been feeling:
(1) On top of the world? (PA)
(2) In high spirits? (PA)
(3) Particularly content with your life? (PA)
(4) Lucky? (PA)
(5) Bored? (NA)
(6) Very lonely or remote from other people? (NA)
(7) Depressed or very unhappy? (NA)
(8) Flustered because you didn't know what was expected of you? (NA)
(9) Bitter about the way your life has turned out? (NA)
(10) Generally satisfied with the way your life has turned out? (PA)

The next 14 questions have to do with more general life experiences.
(11) This is the dreariest time of my life. (NE)
(12) I am just as happy as when I was younger. (PE)
(13) Most of the things I do are boring or monotonous. (NE)
(14) The things I do are as interesting to me as they ever were. (PE)
(15) As I look back on my life, I am fairly well satisfied. (PE)
(16) Things are getting worse as I get older. (NE)
(17) How much do you feel lonely? (NE)
(18) Little things bother me more this year. (NE)
(19) If you could live where you wanted, where would you live? (PE)
(20) I sometimes feel that life isn't worth living. (NE)
(21) I am as happy now as I was when I was younger. (PE)
(22) Life is hard for me most of the time. (NE)
(23) How satisfied are you with your life today? (PE)
(24) My health is the same or better than most people's my age. (PE)

Scoring: Yes = 2; Don't Know = 1; No = 0. Item 19: Present Location = 2; Other Location = 0. Item 23: Satisfied = 2; Not Satisfied = 0. MUNSH Total = PA - NA + PE - NE.

Procedure. — The procedure was similar to that described in Phase I. The major difference lay in the administration of the restricted items comprising our scale. It is this difference that makes Phase II a cross-validation and not a replication.

RESULTS AND DISCUSSION

For reasons advanced earlier the two happiness ratings were combined into a single AVHT criterion score. Correlations between AVHT and its two components ranged from .90 to .93 for the various subgroups. Correlations between the MUNSH and AVHT were obtained for total, rural, urban, and institutional subjects to determine the MUNSH's effectiveness in predicting our criterion measure. The respective values for the four groups were .616, .735, .564, and .605. While these values are slightly lower than those of the validation phase, a statistical comparison of correlation coefficients for independent samples failed to reach significance between the two phases for any of the subgroups (Z values ranged from .15 to 1.17, p > .20).

Correlation coefficients between AVHT and MUNSH, AVHT and LSI-Z, and MUNSH and LSI-Z scores were calculated for the total subject sample. The respective values were .62, .50, and .76. A correlated samples procedure, used to compare the first two coefficients, revealed a significant difference between them (t = 3.79; p < .001). Thus, the MUNSH was significantly better at predicting AVHT than the LSI-Z.

The final question to be considered was the internal consistency of the MUNSH during cross-validation. The computation of Cronbach's Alpha led to a standardized coefficient of .853. This value is almost identical to the one obtained during validation and reflects an acceptable consistency for our scale.

Cross-validation results are consistent with the validation data in pointing to the superiority of the MUNSH as a predictor of AVHT. The MUNSH's internal consistency remained within an acceptable range, its predictive powers remained significantly greater than those of the LSI-Z, and there was no appreciable loss in the prediction of AVHT from validation to cross-validation phases.

Phase III: Test-retest Reliability

METHOD

Subjects. — Thirty-two subjects from Phase I and 23 subjects from Phase II were randomly selected from the institutional sample for re-evaluation.

Apparatus and materials. — All subjects were re-administered the MUNSH, the LSI-Z and the two happiness ratings. In addition, the
subjects from Phase I were presented with all PGC items and all ABS items. The latter two tests had been administered to Phase I subjects. By readministering them, a comparison of the test-retest reliability of the MUNSH with the PGC and the ABS became possible.

Procedure. — The administration procedure was similar to the one used during the prior relevant phase of the study (described in Phase I and II). The test-retest intervals ranged from six months to one year.

RESULTS AND DISCUSSION

The test-retest reliability coefficients were as follows: MUNSH, .70; AVHT, .57; LSI-Z, .35; PGC, .36; ABS, .27. The value associated with the MUNSH exceeds the values of all but that for AVHT (p < .05).

It is noteworthy that both AVHT and MUNSH scores remained more stable over a long period of time than did ABS, PGC and LSI-Z scores. Thus the MUNSH scores are more consistent with changes in the criterion than those of the other scales. One could, of course, argue that a test-retest coefficient of .70 is not very high. It is, however, much better than one which is only .36 or less. While these data do not entirely rule out Bradburn's (1969) argument that happiness is characterized by constantly shifting values, they suggest that the shifts may not be as great as initially believed. Part of the low test-retest scores reported by Bradburn may be due to the low internal consistency of his scale rather than to fluctuations in happiness alone.

GENERAL DISCUSSION

The present endeavors can be evaluated both in respect to practical and methodological contributions to gerontology. Practically, the MUNSH provides a carefully validated and cross-validated measure of psychological well-being in three sub-populations of elderly. In comparison with other scales in common use the MUNSH was a better predictor of the criterion measure, was the only scale with an acceptable internal consistency coefficient, and had the greatest temporal stability. The MUNSH, therefore, would appear to be a good measure of psychological well-being.

Despite these impressive results it is necessary to interject a note of caution at this point. It is quite possible that there may exist significant differences between elderly adults in Newfoundland and those in major cities on mainland Canada and in the United States. Although over 75% of our urban sample came from cities exceeding a population of 15,000, none of our cities approached populations of 200,000. Accordingly, plans are underway to administer the scale to elderly adults in London, Ontario; Toronto, Ontario; and Buffalo, New York. A brief report will be published as soon as these data become available.

Techniques of scale construction distinguish the MUNSH from earlier endeavors and may explain both its superior prediction of the criterion and its greater reliability. Final item selection proceeded on empirical grounds to ensure that all items, irrespective of their initial source, were related to a single construct. This procedure, we believe, had two positive consequences. Firstly, it led to the inclusion of general experience items of the PGC type and an increase in the test-retest reliability over that of the ABS. Secondly, it produced an internally consistent scale and thus a better predictor of the criterion.

Finally, we attribute the superior performance of the MUNSH to its correspondence to what we consider to be the best model of happiness. Like Bradburn, we believe that the construct is best assessed by subtracting negative from positive affective experiences. A balanced scale in which negative and positive items are equally represented, therefore, should be a better measure than one in which the two components are unequally represented. Unequal representation is present both in the LSI-Z and the PGC. The ABS, of course, equally reflects positive and negative affect but appears to suffer from inadequate item sampling. In the construction of the MUNSH we have tried to make use of the strengths of all three scales.

Future efforts will be directed towards ascertaining whether items on the MUNSH correlate more highly with AVHT than with such related constructs as Zest, Anomie, and Adjustment. High correlations with AVHT and low correlations with other constructs would provide additional support for our belief that our scale measures a more precisely defined construct than is measured by such scales as the LSI-Z and the PGC.
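
The scoring rule printed with Table 3 is simple arithmetic, and a short sketch may make it concrete. The Python below is illustrative only and is not part of the original study. The names `SUBSCALES` and `munsh_total` are hypothetical, the subscale membership is read off the (PA), (NA), (PE), and (NE) tags in Table 3, and responses are assumed to be already coded 0, 1, or 2 (including the special codings for items 19 and 23).

```python
from typing import Mapping

# Subscale membership, read off the (PA)/(NA)/(PE)/(NE) tags in Table 3.
# The dictionary name and layout are illustrative assumptions.
SUBSCALES = {
    "PA": (1, 2, 3, 4, 10),
    "NA": (5, 6, 7, 8, 9),
    "PE": (12, 14, 15, 19, 21, 23, 24),
    "NE": (11, 13, 16, 17, 18, 20, 22),
}


def munsh_total(item_scores: Mapping[int, int]) -> int:
    """Return PA - NA + PE - NE for one respondent.

    item_scores maps each item number (1-24) to its coded score:
    Yes = 2, Don't Know = 1, No = 0 (items 19 and 23 use the
    two-category codings given under Table 3).
    """
    missing = sorted(set(range(1, 25)) - set(item_scores))
    if missing:
        raise ValueError(f"missing item scores: {missing}")
    for item in range(1, 25):
        if item_scores[item] not in (0, 1, 2):
            raise ValueError(f"item {item} has invalid score {item_scores[item]}")

    # Sum the coded scores within each subscale, then combine them.
    sums = {
        name: sum(item_scores[i] for i in items)
        for name, items in SUBSCALES.items()
    }
    return sums["PA"] - sums["NA"] + sums["PE"] - sums["NE"]


# A respondent who endorses every positive item and no negative item.
positive_items = SUBSCALES["PA"] + SUBSCALES["PE"]
happiest = {i: (2 if i in positive_items else 0) for i in range(1, 25)}
print(munsh_total(happiest))
```

Under this rule totals range from -24 to +24, because the ten affect items and fourteen experience items are split evenly between positive and negative content.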
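
The correlated-samples comparison reported in Phase II (t = 3.79) can be checked with a few lines of arithmetic. The sketch below assumes that the procedure cited from Ferguson (1959) is the familiar t-test for two dependent correlations that share one variable (often attributed to Hotelling); the article does not print the formula, so this is an assumption, and the function name `dependent_r_t` is hypothetical. The inputs are the three correlations reported for the total cross-validation sample and N = 297 (97 + 100 + 100).

```python
import math


def dependent_r_t(r_xy: float, r_xz: float, r_yz: float, n: int):
    """t-test for the difference between r_xy and r_xz, two correlations
    measured on the same n subjects and sharing the variable x.

    Assumed form (not printed in the article):
        t = (r_xy - r_xz) * sqrt((n - 3)(1 + r_yz) / (2 * D)),
        D = 1 - r_xy^2 - r_xz^2 - r_yz^2 + 2 * r_xy * r_xz * r_yz,
    with n - 3 degrees of freedom.
    """
    det = 1 - r_xy**2 - r_xz**2 - r_yz**2 + 2 * r_xy * r_xz * r_yz
    t = (r_xy - r_xz) * math.sqrt((n - 3) * (1 + r_yz) / (2 * det))
    return t, n - 3


# x = AVHT, y = MUNSH, z = LSI-Z; values from the Phase II results.
t, df = dependent_r_t(r_xy=0.62, r_xz=0.50, r_yz=0.76, n=297)
print(round(t, 2), df)  # 3.79 294
```

With the rounded correlations from the text the assumed formula reproduces the reported t to two decimals, which lends some support to the assumption but does not confirm which formula the authors actually used.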

REFERENCES

Beiser, M. Components and correlates of mental well-being. Journal of Health and Social Behaviour, 1974, 15, 320-327.
Bradburn, N. M. The structure of psychological well-being. Aldine Publ. Co., Chicago, 1969.
Ferguson, G. A. Statistical analysis in psychology and education. McGraw & Hill, Toronto, 1959.
Jahoda, M. Current concepts of positive mental health. Basic Books, New York, 1958.
Kozma, A., & Stones, M. J. Some research issues and findings in the assessment of well-being in the elderly. Canadian Psychological Review, 1978, 19, 241-249.
Lawton, M. P. The dimensions of morale. In D. Kent, R. Kastenbaum, & S. Sherwood (Eds.), Research planning and action for the elderly. Behavioural Publ., New York, 1972.
Moriwaki, S. Y. The Affect Balance Scale: A validity study with aged sample. Journal of Gerontology, 1974, 29, 73-78.
Stones, M. J., & Kozma, A. Issues relating usage and conceptualization of mental health constructs employed by gerontologists. International Journal of Aging and Human Development, 1980. (a, in press)
Stones, M. J., & Kozma, A. The components of happiness: Implications for retirement counselling. Canadian Counsellor, 1980. (b, in press)
Wood, V., Wyllie, M. L., & Sheafor, B. An analysis of short self-report measure of life satisfaction. Journal of Gerontology, 1969, 24, 465-469.
