
 

Source: Management & Marketing - Craiova, issue 1/2013, pages 38-46 (www.ceeol.com).
SCALE VALIDITY IN EXPLORATORY STAGES OF
RESEARCH
PhD Patricea Elena BERTEA
Romanian Academy Iaşi, Romania
Email: [email protected]
Professor PhD Adriana ZAIŢ
University “A. I. Cuza”, Iaşi, Romania
Email: [email protected]

Abstract:
Scale development assumes that certain steps are to be taken in order to obtain a valid measurement instrument. Most researchers jump to the confirmatory stage and avoid exploratory measures. However, exploratory methods used in the first stages of scale development are recommended in order to avoid later problems regarding the validity of the scale. Before conducting reliability analysis and factor analysis, exploratory methods can be applied. The main purpose of this paper is to draw attention to alternative methods for scale validation that should be used in the exploratory phase. The role of these methods is to improve the validity of results in the subsequent confirmatory phases of research. The Lawshe (1975) content validity ratio and the Q-sorting procedure for testing construct validity are applied in the process of developing a scale for perceived risk.

Keywords: scale development, content validity, q-sorting, perceived risk

Introduction

The main purpose of this paper is to draw attention to alternative methods for scale validation that should be used in the exploratory phase. The role of these methods is to improve the validity of results in the later, confirmatory phases of research. The methods are exemplified on a scale that aims to measure perceived risk in e-commerce.
Scale development has become an important research area since several seminal works (Cronbach, 1951; Nunnally, 1967; Churchill, 1979). The use of scales in Management and Marketing research has become common, since both fields deal with studies of latent variables. Thus, the methodology of Psychology is now successfully employed by researchers in these areas.

An important aspect of scale development is assessing validity. Validity refers to the ability of a construct to measure what it is supposed to measure (Goodwin, 2009). When assessing the validity of a scale we are actually looking at how accurate the scale is (Groth-Marnat, 2009). Establishing the validity of a scale is rather difficult, especially when we are dealing with psychological variables. The main issue is that such variables are not observable and the researcher has to identify the underlying latent variables by constructing measurement instruments. Validation of measurement instruments ensures that the inferences and conclusions drawn in a study are actually valid (Schultz & Whitney, 2004).

Another issue when talking about validity is that it should not be confounded with reliability. Reliability, which is usually measured using the Cronbach alpha coefficient, refers to the consistency of the measurement. A clarifying perspective is given by Campbell and Fiske (1959), who explain that reliability is the agreement between two attempts to measure the same underlying construct through similar methods, while validity refers to the same issue, but with totally different methods. Cronbach alpha measures a particular type of reliability, internal consistency, and offers information on how the items that form a scale correlate with each other. An accepted level of internal consistency has to be at least 0.7, but not higher than 0.9 (Cronbach, 1951), since higher values indicate that some items might be redundant inside the scale. Alwin (2007) considers that Cronbach alpha should be used mainly as an internal consistency measure that shows how "a set of items hangs together to form a scale", and that other approaches should be employed in assessing reliability. Among these, Alwin (2007) mentions using multi-trait multi-method/confirmatory factor analysis to measure reliability. As far as validity is concerned, Alwin (2007) explains that "a reliable measure is not necessarily a valid one".
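As an illustration of the internal-consistency coefficient discussed above, the following minimal Python sketch computes Cronbach's alpha from a respondents-by-items score matrix. It is not part of the original study; the response matrix and item set are hypothetical.

import numpy as np

def cronbach_alpha(item_scores):
    """Cronbach's alpha for a respondents-by-items score matrix."""
    k = item_scores.shape[1]                         # number of items
    item_vars = item_scores.var(axis=0, ddof=1)      # sample variance of each item
    total_var = item_scores.sum(axis=1).var(ddof=1)  # variance of the summed scale score
    return (k / (k - 1)) * (1 - item_vars.sum() / total_var)

# Hypothetical 5-point Likert responses: rows are respondents, columns are items.
responses = np.array([
    [4, 5, 4, 4],
    [2, 2, 3, 2],
    [5, 4, 5, 5],
    [3, 3, 2, 3],
    [4, 4, 4, 5],
])

alpha = cronbach_alpha(responses)
print(f"Cronbach's alpha = {alpha:.2f}")
# Rule of thumb from the text: roughly 0.7 to 0.9 is an acceptable range;
# values above 0.9 may signal redundant items.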

Types of validity
There are different types of validity that researchers should look into when developing a scale. Specialists talk about three types of validity: criterion validity, content validity and construct validity.

Criterion validity
Criterion validity stands for how well an instrument measures a variable in comparison with another instrument or a predictor. There are two types of criterion validity: concurrent and predictive validity.
Concurrent validity assumes there is another construct that measures the same variable, a construct considered to be a benchmark in the research domain. To have concurrent validity for a construct, there must be a high correlation with the benchmark construct. Researchers can also choose as benchmark a totally opposed variable, in which case a low correlation is expected in order to have good concurrent validity. Usually, to test for concurrent validity researchers apply two different instruments measuring the same variable on the same sample, with the condition that one of the instruments is a standard in the domain, with previously tested psychometric characteristics.
Predictive validity refers to the ability of a measurement instrument to predict future attitudes or behaviors. Establishing predictive validity means that data is collected twice, at different moments in time, so as to check whether the scale predicted a certain event. In this case there is also a need to correlate the variable we are trying to measure with another variable that is used as a criterion.

Content validity
Content validity refers to a correct definition of the domain of the latent variable that one intends to measure. Another important aspect is the identification of the possible facets of the construct. Thus, when we want to measure a latent variable it is important to introduce into the construct all possible items which could capture the essence of the variable (Haynes, et al., 1995). For instance, if we include items that have no connection with the variable, we generate measurement errors, while if we exclude relevant items, we will have exclusion errors (Straub, et al., 2004).
Content validity assumes two stages (Lynn, 1986): the development stage and the judgement-quantification stage. The first stage implies the use of qualitative methods such as interviews, focus groups and, of course, an intensive review of the literature. The second stage, which is intended to quantify the validity of a scale, requires that a panel of experts evaluate the scale's items according to the domain of the construct.
Although methods have been developed for the second stage, most researchers appeal to literature review and other qualitative methods to assure the content validity of the scale. This qualitative type of validation is more or less prone to subjective influences coming from the researchers. Yet, this approach is intensively used and there are few who reach for alternative quantitative methods. Nevertheless, using a more empirical method with a quantitative foundation adds scientific value to our research and prevents validation problems from affecting further results.

Content validity measures
There are several ways to test content validity using a quantitative approach.
Lawshe (1975) developed a quantitative measure for assessing content validity called the content validity ratio (CVR). The content validity ratio offers information about item-level validity. The procedure consists in using a panel of experts to rate items according to their relevance for the domain of the scale. Each item of a scale is rated on a 3-point rating system (1 – the item is irrelevant, 2 – the item is important, but not essential, 3 – the item is essential). For each item a CVR is computed, which is basically the proportion of experts that considered the item important or essential for the content of the scale. There is also the possibility of having an overall measure for the content validity of the scale. This is called an index and it is computed as the mean of the items' CVR values.
Another quantitative measure was proposed by Waltz & Bausell (1981) and it is called the Content Validity Index (CVI). The difference between this measure and the previous one (Lawshe, 1975) is that experts rate items on a 4-point rating scale with slightly different anchors (1 – not relevant, 2 – somewhat relevant, 3 – quite relevant and 4 – very relevant). The index is actually the percentage of experts that rate an item as quite relevant or very relevant. A total index per scale can also be computed. According to Waltz et al. (2010), the CVI per scale is recommended when there are only two experts involved in the judgment stage. When more than two judges are involved, Waltz et al. (2010) recommend using an alpha coefficient that quantifies the extent of agreement between the experts.
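To make the computation concrete, here is a minimal Python sketch of the item-level CVI just described (the share of experts rating an item 3 or 4 on the 4-point relevance scale). The expert ratings and item names are hypothetical, and the scale-level summary is computed here simply as the mean of the item CVIs, which is only one possible convention.

from statistics import mean

# Hypothetical relevance ratings from six experts on the 4-point scale described above
# (1 = not relevant, 2 = somewhat relevant, 3 = quite relevant, 4 = very relevant).
ratings = {
    "item_1": [4, 3, 4, 3, 4, 3],
    "item_2": [2, 3, 1, 2, 3, 2],
    "item_3": [4, 4, 3, 4, 4, 4],
}

def item_cvi(expert_ratings):
    """Proportion of experts rating the item as quite or very relevant (3 or 4)."""
    return sum(1 for r in expert_ratings if r >= 3) / len(expert_ratings)

item_cvis = {item: item_cvi(r) for item, r in ratings.items()}
scale_cvi = mean(item_cvis.values())  # scale-level summary: mean of the item CVIs

for item, cvi in item_cvis.items():
    print(f"{item}: CVI = {cvi:.2f}")
print(f"Scale-level CVI (mean of item CVIs) = {scale_cvi:.2f}")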

Construct validity
Construct validity refers more to the measurement of the variable. The issue is whether the items chosen to build up a construct interact in such a manner that the researcher can capture the essence of the latent variable to be measured. Content validity must be assessed prior to construct validity. Construct validity implies the use of more quantitatively oriented analyses.
It is important to make the distinction between internal validity and construct validity. The first refers to assuring a methodology that enables the research to rule out alternative explanations for the dependent variables, while construct validity is more concerned with the choice of the instrument and its ability to capture the latent variable. Internal validity becomes a problem in experimental studies, where each experimental group has to follow the same methodology in order to correctly isolate the effect.
Construct validity has three components: convergent, discriminant and nomological validity. Convergent validity and discriminant validity refer to the way the construct relates to other constructs. Convergent validity tests whether the items of a scale correlate more highly among themselves and have significantly higher loadings. Convergent validity can also be assessed by checking the correlation between the instrument and other instruments that are meant to measure the same latent variable. Discriminant validity assumes that items should correlate more highly among themselves than with items from other constructs that are theoretically supposed not to correlate. Nomological validity tests whether the construct has the same relationships with other variables that have been previously tested and confirmed in other studies.
Construct validity can be tested during the early stages of research using the Q-sorting procedure. The main idea of the analysis is to separate items into constructs according to their specific domain. The procedure is closest to measuring discriminant validity. There are two ways in which it can be done (Storey, et al., 2000):
• Exploratory, when respondents are given the items and asked to group them and identify category labels for each group of items.
• Confirmatory, when the categories are already labeled and respondents are asked to classify each item into one category.
Q-sorting is applied to experts and other persons of interest for the research. It helps eliminate items that do not discriminate well between categories.

Research methodology
The present study presents two alternative methods for assessing scale validity: the content validity ratio and the Q-sorting procedure. Both procedures were applied on a scale that measures perceived risk in e-commerce.
For building up the construct for perceived risk in e-commerce we followed the methodology used by Jacoby and Kaplan (1972). They divided perceived risk into six dimensions: financial, performance, time, social, psychological and physical. We did not use the same dimensions as listed above, since Jacoby and Kaplan (1972) did research on products.
We aimed to study the perceived risk of the Internet as an alternative shopping channel. As a consequence, the dimensions needed to be restated. In order to do that, we investigated the work of Featherman and Pavlou (2003) together with Crespo, et al. (2009). In the end we defined six dimensions of perceived risk in e-commerce: financial, security/privacy, psychological, social, time/delivery and product risk. Each dimension was identified through a number of items ranging from 3 to 8, which were extracted from the literature review and in-depth interviews (Table 1).

Table 1
Dimensions of perceived risk in e-commerce

Product risk:
• I believe that online shopping is risky because I cannot examine the product.
• If I choose to buy online I do not have the certainty that the product will be of good quality.
• I believe online shopping is risky because I cannot touch the product before buying it.
• I cannot be sure that a product bought online has the characteristics advertised on the website.
• I believe that a product bought online will not perform as well as one bought from a bricks and mortar store.
• If I buy a product online I risk not being given the guarantee.

Financial risk:
• I do not trust online payment.
• When I pay online there is an increased probability of losing the money on my credit card.
• Using online payment there is a chance I pay more due to hidden fees.
• There is a low probability of losing money for a product ordered on the Internet if I pay on delivery.
• I believe that paying by credit card is a secure payment method.
• There are high chances of losing money when paying online for a product.
• Online shopping means potential money loss due to possible Internet frauds.
• The risk of losing money when buying online is the same whether I pay by credit card or on delivery.

Security/privacy risk:
• If I buy online there is a high risk that my personal data would be used without my consent.
• There is a high chance that hackers take over my personal account from an e-shop.
• If I decide to buy products online I risk losing control over my personal data.

Time/delivery risk:
• If I do my shopping online, there is a high risk that I receive a different product than the one I ordered.
• When I buy online I am sure that I will receive exactly the product I ordered.
• If I buy online there are low chances that my product would have a delivery delay.
• When I buy online, I am not sure that the e-shop will respect the promised deadline.

Social risk:
• There is a small chance that my friends will change their opinion about me because of me using the Internet to do shopping.
• If I buy online I am taking the risk that my friends will change their opinion about me.
• Online shopping is positively seen by my family.
• My friends do not approve of online shopping.

Psychological risk:
• Online shopping does not fit my self-image.
• Online shopping is not compatible with my self-image.
• Online shopping gives me a state of stress because it does not fit with my self-image.
• Online shopping fits me well.

In order to apply the two methods we had to conduct two separate studies, for which we developed two questionnaires.

Methodology for the content validity ratio
For the content validity ratio we followed the methodology explained by Lawshe (1975). We introduced all the items, grouped by type of risk, and interviewed six experts who were asked to rate each item as "1 = Irrelevant, 2 = Important, but not essential, 3 = Essential" for measuring a certain type of perceived risk (Table 2).

Table 2
CVR questionnaire example

Product risk item: I believe that a product bought online will not perform as well as one bought from a bricks and mortar store.
Rating options: Irrelevant / Important, but not essential / Essential

Methodology for the Q-sorting procedure
For the Q-sorting study we developed a questionnaire in which we included all the items measuring perceived risk, without showing which item belongs to which type of perceived risk. Respondents had to classify the items into six categories: social, psychological, financial, security, product and delivery risk (Table 3).

Table 3
Q-sorting questionnaire example

Risk item: Online shopping gives me a state of stress because it does not fit with my self-image.
Risk type (choose one): Social / Financial / Psychological / Security / Delivery / Product

As a quantitative indicator for the Q-sorting procedure we used the correct classification percent, which is the percentage of respondents who correctly classified an item (Straub, et al., 2004).

Results

Content Validity Ratio
To calculate the content validity ratio we used the methodology described by Lawshe (1975), which indicates that all items should be analyzed by a group of experts, each expert rating the item as: 1 = Irrelevant, 2 = Important, but not essential, 3 = Essential. The formula to calculate the ratio is:

CVR = (n - I) / N

where:
n – number of experts who considered the item to be "Essential" or "Important, but not essential";
I – number of experts who considered the item "Irrelevant";
N – total number of experts.

The logic behind the formula is that the more experts rate an item as important or essential, the more we can consider that item as being part of the construct. Thus, we can attain content validity of the construct. As one can easily see, the formula gives a negative result when fewer than 50% of the experts rate the item as essential or important but not essential, and a null result when exactly 50% rate it as irrelevant.
A panel formed of six experts rated the items according to the Lawshe (1975) specifications. After analyzing the data, we identified 7 items which presented serious problems, their CVR values being negative, which suggests that more than 50% of the experts found these items to be irrelevant (Table 4).
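Before presenting the results, a minimal Python sketch of this screening step is given here. It applies the CVR formula above to hypothetical ratings from a six-expert panel (the item names and ratings are illustrative, not the study's data) and flags items with a negative CVR as candidates for removal, mirroring the screening reported in Table 4.

# Hypothetical ratings from a panel of six experts on the 3-point scale above
# (1 = irrelevant, 2 = important but not essential, 3 = essential); illustrative only.
expert_ratings = {
    "product_risk_item": [3, 3, 2, 3, 2, 3],
    "social_risk_item": [1, 1, 2, 1, 1, 1],
    "psychological_risk_item": [1, 2, 1, 1, 2, 1],
}

def content_validity_ratio(ratings):
    """CVR = (n - I) / N, the formula used in the paper."""
    n = sum(1 for r in ratings if r >= 2)  # rated essential or important but not essential
    i = sum(1 for r in ratings if r == 1)  # rated irrelevant
    return (n - i) / len(ratings)

for item, ratings in expert_ratings.items():
    cvr = content_validity_ratio(ratings)
    flag = "  <- negative CVR, candidate for removal" if cvr < 0 else ""
    print(f"{item}: CVR = {cvr:+.2f}{flag}")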

Table 4
CVR values

Product risk – I believe that a product bought online will not perform as well as one bought from a bricks and mortar store. CVR = -0.67
Social risk – There is a small chance that my friends will change their opinion about me because of me using the Internet to do shopping. CVR = -0.67
Social risk – If I buy online I am taking the risk that my friends will change their opinion about me. CVR = -0.67
Psychological risk – Online shopping does not fit my self-image. CVR = -0.67
Psychological risk – Online shopping gives me a state of stress because it does not fit with my self-image. CVR = -0.33
Psychological risk – Online shopping does not fit with my self-image. CVR = -0.67
Psychological risk – Online shopping suits my self-image. CVR = -0.33

These results suggest that the 7 items should be removed from the construct before advancing the research.

Q-sorting
In order to calculate the percent of correct classification, we identified the frequency of respondents that checked the correct category for each item. Three items obtained a 100% correct classification, 22 items had percentages higher than 70%, but 4 items had lower percentages. We considered items with a low classification percentage those below 60% (Table 5).
Taking into account that more than 80% of all 26 items were correctly classified, we can consider that the scale has a good level of discriminant validity. However, it is important to further analyze those items that were not correctly recognized as belonging to a certain category of risk.

Table 5
Q-sorting results (items with low classification)

Psychological risk – Online shopping does not fit my self-image. Correct classification: 0.52
Security/privacy risk – There is a high chance that hackers take over my personal account from an e-shop. Correct classification: 0.59
Time/delivery risk – If I do my shopping online, there is a high risk that I receive a different product than the one I ordered. Correct classification: 0.52
Time/delivery risk – When I buy online I am sure that I will receive exactly the product I ordered. Correct classification: 0.22
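The correct-classification percent reported in Table 5 is straightforward to compute. The following Python sketch illustrates it on hypothetical items, category labels and respondent sorts, not the study's data, using the 60% cut-off mentioned above.

# Hypothetical Q-sort data: each respondent assigns every item to one of the six
# risk categories; the intended category of each item is known only to the researcher.
intended = {
    "hackers_account": "security",
    "wrong_product": "delivery",
    "self_image": "psychological",
}

sorts = [  # one dict per respondent: item -> chosen category
    {"hackers_account": "security", "wrong_product": "delivery", "self_image": "psychological"},
    {"hackers_account": "security", "wrong_product": "product", "self_image": "social"},
    {"hackers_account": "financial", "wrong_product": "delivery", "self_image": "social"},
]

def correct_classification(item):
    """Share of respondents who placed the item in its intended risk category."""
    hits = sum(1 for s in sorts if s[item] == intended[item])
    return hits / len(sorts)

for item in intended:
    rate = correct_classification(item)
    flag = "  <- below the 60% threshold, review item" if rate < 0.60 else ""
    print(f"{item}: correct classification = {rate:.2f}{flag}")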

Conclusions
There is only one item that presented problems in both procedures, the one belonging to psychological risk. However, the objective of the research was not to see whether there are items with problems in both cases, but to identify items that do not hold up to validity testing. Thus, the CVR was measured to test for content validity, while Q-sorting was applied to test for construct validity, more specifically the discriminant validity of items.
Both alternative methods revealed items with significant problems, items that should be removed in the next stages of the study or refined in order to express a certain type of risk more clearly.
The major implications of this research rest in the importance of correctly developing a measurement instrument for a latent variable. There is a need for applying alternative methods to test scale validity, especially when we develop a whole new construct and use qualitative methods such as in-depth interviews or focus groups, but also when we want to use a scale that was previously developed but never used on a certain sample. The concern for applying these types of methods should exist whenever the aim is to raise the quality of a research study; it shows a thorough investigation of all possible issues which may affect scale validity.
Further research should concentrate on establishing how these methods can improve convergent validity, discriminant validity and nomological validity. Moreover, it could be useful to examine who are the most appropriate respondents for each method: whether we have to use only experts, or we could also use non-experts, that is, ordinary consumers. An interesting approach would be to compare results coming from two different samples and to see whether the type of respondents is an issue. The problem is, however, that the expert sample will always be smaller than the consumer one, and it is difficult to obtain representativeness.
The value of this research lies in the revival of rather isolated methods of scale validation that can prove highly useful in the exploratory phases of research. The content validity ratio and Q-sorting are less employed, so we wanted to introduce them and raise researchers' interest in these alternative methods.

REFERENCES

Alwin, D. (2007), Margins of error: A study of reliability in survey measurement, Wiley.
Campbell, D. & Fiske, D. (1959), 'Convergent and discriminant validation by the multitrait-multimethod matrix', Psychological Bulletin 56(2), 81-105.
Churchill Jr., G. A. (1979), 'A Paradigm for Developing Better Measures of Marketing Constructs', Journal of Marketing Research 16(1), 64-73.
Crespo, Á. H., del Bosque, I. R. & de los Salmones Sánchez, M. M. G. (2009), 'The Influence of Perceived Risk on Internet Shopping Behavior: A Multidimensional Perspective', Journal of Risk Research 12(2), 259-277.
Cronbach, L. (1951), 'Coefficient alpha and the internal structure of tests', Psychometrika 16(3), 297-334.

Featherman, M. S. & Pavlou, P. A. (2003), 'Predicting e-services adoption: a perceived risk facets perspective', International Journal of Human-Computer Studies 59(4), 451-474.
Goodwin, C. (2009), Research in psychology: Methods and design, Wiley.
Groth-Marnat, G. (2009), Handbook of psychological assessment, Wiley.
Gwet, K. (2001), Handbook of inter-rater reliability.
Haynes, S., Richard, D. & Kubany, E. (1995), 'Content validity in psychological assessment: A functional approach to concepts and methods', Psychological Assessment 7(3), 238-247.
Jacoby, J. & Kaplan, L. B. (1972), 'The Components of Perceived Risk', in M. Venkatesan, ed., Proceedings, Third Annual Conference, College Park, MD, Association for Consumer Research, 382-393.
Lawshe, C. (1975), 'A quantitative approach to content validity', Personnel Psychology 28(4), 563-575.
Lynn, M. (1986), 'Determination and quantification of content validity', Nursing Research.
Mitchell, V.-W. (1999), 'Consumer Perceived Risk: Conceptualisations and Models', European Journal of Marketing 33, 163-195.
Nunnally, J. (1967), Psychometric theory, Tata McGraw-Hill.
Storey, V., Straub, D., Stewart, K. & Welke, R. (2000), 'A conceptual investigation of the e-commerce industry', Communications of the ACM 43(7), 117-123.
Straub, D., Boudreau, M. & Gefen, D. (2004), 'Validation guidelines for IS positivist research', Communications of the Association for Information Systems 13(24), 380-427.
Waltz, C. & Bausell, R. (1981), Nursing research: Design, statistics, and computer
analysis, FA Davis Company.
Waltz, C.; Strickland, O. & Lenz, E. (2010), Measurement in nursing and health
research, Springer Publishing Company.

This paper is supported by the Sectoral Operational Programme Human Resources Development (SOP
HRD), financed from the European Social Fund and by the Romanian Government under the contract
number POSDRU/89/1.5/S/56815.

