
Research Design

Defining, Measuring, and Manipulating Variables (Module 5)


Reliability and Validity (Module 6)
Defining Variables – Another aspect
Operational Definition
• In our research some variables are fairly easy to define, manipulate, and measure.
• For example, if a researcher is studying the effects of exercise on blood pressure, she can
find the relation by:
• Manipulating the amount of exercise (the length of each session) and observing the results, or
• Manipulating the intensity of the exercise (and monitoring target heart rates).
• She can also periodically measure blood pressure during the course of the study with a
machine built to measure it consistently and accurately.
• Does the fact that a machine exists to take this measurement mean that the measurement
is always accurate? The answer is Yes and No.
(We shall discuss this issue in Module 6 when we address measurement error.)
• Now let us suppose that a researcher wants to study hunger, depression, or aggression in a
group of patients.
• Measuring these constructs is not as straightforward as measuring blood pressure.
• One researcher's definition of what it means to be hungry may differ vastly from another
researcher's. Even patients may define hunger in different ways.
• The solution is for the researcher to define hunger explicitly. This definition is called an
operational definition: the criteria the researcher uses to measure or manipulate the variable.
• In other words, the investigator might define hunger in terms of specific, observable criteria, such as not
having eaten for 12 hours.
• Thus one operational definition of hunger could be that simple: hunger occurs when 12 hours
have passed with no food intake (a minimal code sketch of this appears at the end of this section).
• Researchers must operationally define all variables: those measured (dependent variables) and
those manipulated (independent variables).
• As another example, if a researcher says he measured anxiety in his study,
• the question becomes: how did he operationally define anxiety? There are several
possibilities:
• Anxiety can be defined as the number of nervous actions displayed in a 1-hour time period, or
• as a person's score on a GSR (Galvanic Skin Response) machine or on the Taylor Manifest
Anxiety Scale.
Some measures are better than others, better meaning more reliable and valid (concepts we discuss in
Module 6).
• Once other investigators understand how a researcher has operationally defined a
variable, they can replicate the study if they so desire.
• They can better understand the study and whether it has problems, and they can better
design their own studies based on how the variables were operationally defined.
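To make the idea concrete, here is a minimal sketch (an illustration of the 12-hour criterion above, not code from the module) showing how an operational definition turns a fuzzy concept into something measurable:

```python
# Operational definition of hunger (illustrative): "hunger" is operationalized
# as 12 or more hours having passed since the last food intake.
HUNGER_THRESHOLD_HOURS = 12  # the criterion chosen by the researcher

def is_hungry(hours_since_last_meal: float) -> bool:
    """Return True if the participant meets the operational definition of hunger."""
    return hours_since_last_meal >= HUNGER_THRESHOLD_HOURS

print(is_hungry(13.5))  # True  - meets the 12-hour criterion
print(is_hungry(4.0))   # False - does not meet the criterion
```

Any other researcher reading this definition can apply exactly the same criterion, which is what makes replication possible.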

Exercise
1. You are a teacher in a school and want to send a team of three students to an
interschool poetry competition. You have to select the best three students in the school.
Now operationally define BEST THREE STUDENTS.
2. You are observing how many people abide by road rules by checking the number of
people who stop at a traffic signal. How will you operationally define "stopping at the
traffic signal" – braking, braking for a minimum time, the car rolling slowly, etc.?
Properties of Measurement
• After operationally defining the independent and dependent variables, the next step is to consider
the level of measurement of the dependent variable.
• There are four levels of measurement, each based on the characteristics, or properties, of
the data, which can be:
• Identity,
• magnitude,
• equal unit size, and
• Absolute zero.
• Identity – numbers are allocated purely for the sake of identification.
• The numbers cannot be used in mathematical operations; they are assigned only
to convey a particular meaning.
• For instance, we can assign 1 to male and 2 to female in our study.
• Similarly, if participants in a study have different political affiliations, they receive
different scores (numbers).
• Magnitude – a variable can have both identity and magnitude.
• This means the numbers have an inherent order from smaller to larger, for example position
in class, level of education, or rank in an organization.
• Variables having identity and magnitude are measured on an ordinal scale.

• Equal intervals – also called equal unit size; the difference between adjacent numbers
anywhere on the scale is the same.
• In most business research, many variables are treated as having equal intervals: the
difference between any two adjacent units is the same.
• For instance, the difference between 4 and 5 is the same as the difference
between 76 and 77, i.e. 1.
• Variables with the properties of identity, magnitude, and equal intervals are measured on an
interval scale.
• Absolute/true zero – means that a response of zero represents the absence of the
property being measured (e.g., no money, no behavior, none correct).
• In other words, it is a property of measurement in which assigning a score of zero indicates
an absence of the variable being measured.
• However, a temperature of 0 (Celsius or Fahrenheit) is not an absolute zero: temperature is
still present, so we cannot say there is "no temperature."

http://www.mnestudies.com/research/scales-measurement
Scales of Measurement
• The level, or scale, of measurement depends on the properties of the data.
There are four scales of measurement:
• nominal,
• ordinal,
• interval, and
• ratio.
• Each of these scales has one or more of the properties described in the previous section.
• We discuss the scales in order, from the one with the fewest properties to the one with the
most, that is, from the least to the most sophisticated.
• As we see in later modules, it is important to establish the scale of data measurement in
order to determine the appropriate statistical test to use when analyzing the data.
Nominal Scale
• From the statistical point of view, it is the lowest measurement level.
• A nominal scale is assigned to items that are divided into categories without any
order or structure.
• For example, colors do not have any inherent order.
• We can have five colors, such as red, blue, orange, green, and yellow, and could number them 1
to 5, 5 to 1, or in any mixed order; the numbers are assigned to the colors purely for
identification.
• Numbering them in ascending or descending order does not mean that the colors have an order.
• The number gives us the identity of the category assigned.
• The only mathematical operation we can perform with nominal data is to count.
• Another example from research activities is a YES/NO scale, which is nominal.
• It has no order, and there is no distance between YES and NO.
Ordinal Scale
• Next up the list is the ordinal scale.
• An ordinal scale is a ranking of responses, for instance ranking cyclists at the end of a race
in positions 1, 2, and 3.
• Note that these are ranks: the time gap between positions 1 and 2 may well not be the same as
between 2 and 3, so the distance between points is not the same, but an order is
present.
• When responses have an order but the distances between them are not necessarily the
same, the items are measured on an ordinal scale.
• Therefore an ordinal scale lets the researcher interpret gross order but not the relative
positional distances.
• This is similar to the three top positions in a class – differences exist, but they are not equal.
• Ordinal Scale variables have the property of Identity and Magnitude.
• The numbers represent a quality being measured (identity) and can tell us whether a
case has more of the quality measured or less of the quality measured than another case
(magnitude). The distance between scale points is not equal. Ranked preferences are
presented as an example of ordinal scales encountered in everyday life.
Interval Scale
• A typical survey rating scale is treated as an interval scale.
• Example: when respondents rate satisfaction on a 5-point scale (Strongly Agree, Agree,
Neutral, Disagree, Strongly Disagree), an interval scale is being used.
• It is called an interval scale because it is assumed to have equal distance between each
of the scale elements, i.e. the distance between Strongly Agree and Agree is assumed to
be the same as between Agree and Neutral.
• We can interpret differences in the distance along the scale.
• Interval scale is different from Ordinal scale where we can only talk about differences in
order, not differences in the degree of order i.e. the distance between responses.
Properties of Interval Scales
• Interval scales have the properties of:
• Identity
• Magnitude
• Equal distance
• Variables that satisfy these properties are measured on an interval scale. The equal
distance between scale points tells us how many units greater than, or less than,
one case is from another: the distance between 25 and 35 means the same as
the distance between 65 and 75.
Ratio Scale
• A Ratio Scale is at the top level of Measurement.
• The factor which clearly defines a ratio scale is that it has a true zero point.
• The simplest examples of a ratio scale are measurements of length (disregarding any
philosophical points about defining zero length) or money.
• Having zero length or zero money means that there is no length and no money, but zero
temperature is not an absolute zero, since temperature is still present.
• Ratio scales of measurement have all of the properties of the abstract number system.
Properties of Ratio Scale
• Identity
• Magnitude
• Equal distance
• Absolute/true zero
• These properties allow us to apply all possible mathematical operations: addition,
subtraction, multiplication, and division. The absolute/true zero allows us to say how
many times greater one case is than another. Variables having all of
the above-mentioned numerical properties fall on a ratio scale.
http://www.mnestudies.com/research/types-measurement-scales
Last lecture
• Operational Definition of Anxiety
• Non-verbal measures – could be facial expressions
• Physiological measures – blood pressure, respiration rate, etc. (these are
measurable)
• Scale of Measurement
• Zip code, P.O. Box number – Nominal scale
• Large, medium, and small eggs – Ordinal scale
• Reaction time – Ratio scale
• SAT score – Interval scale
• Class rank – Ordinal scale
• Football jersey number – Nominal scale
• Miles per gallon – Ratio scale
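As a rough illustration (my own sketch, not part of the lecture), the recap examples above could be encoded in pandas so that each variable only supports the operations that are meaningful at its level of measurement:

```python
# Illustrative encoding of the four scales of measurement in pandas.
import pandas as pd

df = pd.DataFrame({
    "jersey_no":  [7, 10, 23],                  # nominal: identity only
    "egg_size":   ["small", "large", "medium"], # ordinal: identity + magnitude
    "sat_score":  [1100, 1250, 1400],           # interval: equal units, no true zero
    "reaction_s": [0.32, 0.45, 0.29],           # ratio: true zero, all arithmetic valid
})

# Nominal: an unordered category; counting is the only sensible operation.
df["jersey_no"] = df["jersey_no"].astype("category")
print(df["jersey_no"].value_counts())

# Ordinal: an ordered category supports comparisons but not arithmetic.
df["egg_size"] = pd.Categorical(df["egg_size"],
                                categories=["small", "medium", "large"],
                                ordered=True)
print(df["egg_size"] > "small")

# Interval: differences are meaningful; ratios are not (no absolute zero).
print(df["sat_score"].diff())

# Ratio: statements such as "twice as fast" are meaningful.
print(df["reaction_s"].max() / df["reaction_s"].min())
```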
Types of Measures
• After the scales of measurement, we now consider types of measures.
• Types of measures fall into the following categories (these relate to human
behavior):
• Self-Report Measures
• Tests
• Behavioral Measures
• Physical Measures
Types of Measures
• Self-Report Measures: Self-report measures are typically administered as
questionnaires or interviews to measure how people report that they act, think, or feel.
• Thus self-report measures aid in collecting data on behavioral, cognitive, and affective
events (Leary, 2001).
• Behavioral self-report measures typically ask people to report how often they do
something, such as how often they eat a certain food, eat out at a restaurant, or go to the
gym – just statements of fact about oneself.
• Cognitive self-report measures ask individuals to report what they think about
something, such as what they think about the canteen service – this involves judgement.
• Affective self-report measures ask individuals to report how they feel about
something. Questions concerning emotional reactions such as happiness, depression,
anxiety, or stress lie in the affective domain. Many psychological tests are affective self-
report measures; these tests also fit into the category of tests described
in the next section.
• Tests: Tests are measurement instruments used to assess individual differences in various
content areas.
• Psychologists frequently use two types of tests:
1. Personality tests and
2. Ability tests.
• Many personality tests are also affective self-report measures; they are designed to
measure aspects of an individual's personality and feelings about certain things.
• Ability tests, however, are not self-report measures and generally fall into two
categories: aptitude tests and achievement tests. Aptitude tests measure an individual's
potential to do something, whereas achievement tests measure an individual's competence
in an area.

• Behavioral measures involve carefully observing and recording behavior.
• Behavioral measures are often referred to as observational measures because they involve
observing what a participant does.
• Behavioral measures can be applied to almost anything a person or an animal does, for example
the way men and women carry their bags, or how many people follow road signs.
• The observations can be:
• Direct (while the participant is engaging in the behavior) or
• Indirect
• In direct observations, the participants may become self-conscious and react in an
unnatural way, which affects the results. This response of participants is called "reactivity".
• Observers may hide themselves, or use a more indirect means of collecting the data (such
as videotape) – this is the indirect way.
• Using an unobtrusive means of collecting data reduces reactivity, that is, participants
reacting in an unnatural way to being observed.
• Physical Measures - Physical measures are usually taken by means of equipment.
• Weight is measured with a scale, and blood pressure and temperature are each measured with a
dedicated apparatus.
• Physical measures are much more objective than behavioral measures.
• A physical measure is not simply an observation. Instead, it is a measure of a physical
activity that takes place in the brain or body.
• This is not to say that physical measures are problem free.
• Keep in mind that humans are still responsible for running the equipment that takes the
measures and ultimately for interpreting the data provided by the measuring instrument.
Thus even when using physical measures, a researcher needs to be concerned with the
accuracy of the data.
• Self-Report Measures – subjective or objective?
• Tests – subjective or objective?
• Behavioral Measures – subjective or objective?
• Physical Measures – subjective or objective?
Meaning of Reliability (Module 6)
• Reliability: An indication of the consistency or stability of a measuring instrument.
• In other words, the measuring instrument must measure exactly the same way every time it is
used.
• This consistency means that an individual should obtain a similar score each time he or she uses the
measuring instrument.
• For example, a bathroom scale needs to be reliable, that is, it needs to measure the same way
every time an individual uses it,
otherwise it is useless as a measuring instrument.
Error in Measurement
• Consider some of the problems with the four types of measures discussed in the previous
module (i.e., self-report, tests, behavioral, and physical).
• Some problems, known as method errors, stem from the experimenter and the testing
situation. For example,
• Does the individual taking the measures know how to use the measuring instrument
properly?
• Is the measuring equipment working correctly?
• Other problems, known as trait errors, stem from the participants.
• Were the participants being truthful?
• Did they feel well on the day of the test?
• Both types of problems can lead to measurement error. In fact, a measurement is a
combination of the true score and an error score.
• The true score is what the score on the measuring instrument would be if there were no
error.
• The error score is any measurement error (method or trait) (Leary, 2001; Salkind, 1997).
• The following formula represents the observed score on a measure, that is, the score
recorded for a participant on the measuring instrument used.
• The observed score is the sum of the true score and the measurement error.
Observed score = True score + Measurement error
• The observed score becomes increasingly reliable (more consistent) as we minimize error
and thus obtain a more accurate estimate of the true score (see the sketch after this list).
• True scores should not vary much over time, but error scores can vary tremendously from
one testing session to another.
• How then can we minimize error in measurement?
• We can make sure that all the problems related to the four types of measures are
minimized.
• These problems include those in recording or scoring data (method error) and those in
understanding instructions, motivation, fatigue, and the testing environment (trait error).
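A minimal simulation of the formula above (assumed, illustrative numbers only): the smaller the measurement error, and the more measurements we average, the closer the observed scores sit to the true scores.

```python
# Sketch of Observed score = True score + Measurement error.
import numpy as np

rng = np.random.default_rng(42)
true_scores = rng.normal(loc=100, scale=15, size=500)   # stable trait values

def observe(true, error_sd, repeats=1):
    """Average of `repeats` noisy measurements of the same true scores."""
    noise = rng.normal(loc=0, scale=error_sd, size=(repeats, true.size))
    return (true + noise).mean(axis=0)

for error_sd in (15, 5):                # large vs. small measurement error
    for repeats in (1, 5):              # single measurement vs. average of five
        observed = observe(true_scores, error_sd, repeats)
        mae = np.mean(np.abs(observed - true_scores))
        print(f"error SD={error_sd:>2}, repeats={repeats}: "
              f"mean |observed - true| = {mae:.1f}")
```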
Types of Reliability
• There are four types of reliability:
• Test/retest reliability,
• Alternate-forms reliability,
• Split-half reliability, and
• Interrater reliability.
• Each type provides a measure of consistency, but they are used in different situations.
• Test/retest reliability: administering the same test on a second occasion and correlating the
two sets of scores is known as test/retest reliability.
• If the test is reliable, we expect each individual's results to be similar, that is, the
resulting correlation coefficient will be high (close to 1.00).
• This measure of reliability assesses the stability of a test over time.
• However, that is the ideal case. Some error will be present in each measurement, so the
correlation coefficient will not be 1.00 in most cases, but we expect it to be 0.80 or
higher.
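A minimal sketch (hypothetical scores) of computing test/retest reliability as a Pearson correlation:

```python
# Test/retest reliability: correlate the same individuals' scores from
# two administrations of the same test. Scores below are hypothetical.
from scipy.stats import pearsonr

session_1 = [78, 85, 62, 90, 71, 88, 95, 67]   # first administration
session_2 = [80, 83, 65, 92, 69, 85, 96, 70]   # same people, second administration

r, _ = pearsonr(session_1, session_2)
print(f"test/retest r = {r:.2f}")              # we would hope for r >= 0.80
```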
• Alternate-forms reliability – using alternate forms of the testing instrument and
correlating the performance of individuals on the two different forms.
• In this case the tests taken at times 1 and 2 are different but equivalent or parallel (hence
the terms equivalent-forms reliability and parallel-forms reliability are also used).
• For example:
1. We want to find the reliability of a test of mathematics comprehension, so we
create a set of 100 questions that measure that construct. We then randomly split
the questions into two sets of 50 (set A and set B) and administer them
to the same group of students about a week apart, e.g. one set of 50 questions on
Monday and the other 50 questions to the same students on Friday or the next Monday,
and then correlate the results (see the sketch below).
2. We have prepared 100 different samples of the same material. We test 50 samples in one
run and the other 50 on the same machine after some time. (This way we also confirm
calibration.)
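A small sketch of the first example above (assumed item pool and random seed): randomly splitting a 100-question pool into two 50-question forms that could then be administered to the same students a week apart and correlated.

```python
# Alternate-forms reliability, step 1: randomly split a 100-question pool
# into two equivalent 50-question forms (illustrative only).
import numpy as np

rng = np.random.default_rng(7)
item_ids = np.arange(1, 101)            # question numbers 1..100
shuffled = rng.permutation(item_ids)

form_a = np.sort(shuffled[:50])         # administered, e.g., on Monday
form_b = np.sort(shuffled[50:])         # administered to the same students later

print("Form A items:", form_a[:10], "...")
print("Form B items:", form_b[:10], "...")
# Each student's total score on form A and form B would then be correlated,
# exactly as in the test/retest sketch above.
```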
• Split-half reliability – a third means of establishing reliability is to split the items
on the test into equivalent halves and correlate scores on one half of the items with
scores on the other half.
• Split-half reliability gives a measure of the equivalence of the content of the test, but
not of its stability over time, as test/retest and alternate-forms reliability do.
• The biggest problem with split-half reliability is determining how to divide the items so
that the two halves are in fact equivalent.
• For example, it would not be advisable to correlate scores on multiple-choice questions
with scores on short-answer or essay questions.
• What is typically recommended is to correlate scores on even-numbered items with scores
on odd-numbered items, as sketched below.
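A sketch of the recommended odd/even split, using made-up item scores: each student's total on the odd-numbered items is correlated with his or her total on the even-numbered items.

```python
# Split-half reliability via the odd/even split. Item scores are hypothetical:
# rows = students, columns = items 1..10 (1 = correct, 0 = incorrect).
import numpy as np
from scipy.stats import pearsonr

items = np.array([
    [1, 1, 0, 1, 1, 0, 1, 1, 1, 0],
    [0, 1, 0, 0, 1, 0, 1, 0, 0, 1],
    [1, 1, 1, 1, 1, 1, 0, 1, 1, 1],
    [0, 0, 1, 0, 0, 1, 0, 0, 1, 0],
    [1, 0, 1, 1, 0, 1, 1, 1, 0, 1],
])

odd_total  = items[:, 0::2].sum(axis=1)   # items 1, 3, 5, 7, 9
even_total = items[:, 1::2].sum(axis=1)   # items 2, 4, 6, 8, 10

r, _ = pearsonr(odd_total, even_total)
print(f"split-half r = {r:.2f}")
```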

• Interrater reliability – this measures the reliability of observers rather than tests.
• It is a measure of consistency that assesses the agreement of observations made by two or
more raters or judges.
• Let us say that you are observing play behavior in children. Rather than simply making
observations on your own, it is advisable to have several independent observers collect
data.
• The observers all watch the children playing but independently count the number and
types of play behaviors they observe.
• Once the data are collected, interrater reliability needs to be established by examining the
percentage of agreement among the raters.
• If the raters' data are reliable, then the percentage of agreement should be high.
• If the raters are not paying close attention to what they are doing or if the measuring scale
devised for the various play behaviors is unclear, the percentage of agreement among
observers will not be high.
• Although interrater reliability can be measured using a correlation coefficient, a quick
estimate is the percentage agreement: interrater reliability (%) = (number of agreements /
number of possible agreements) × 100.
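A minimal sketch of that percentage-agreement estimate, with hypothetical codings of the same play episodes by two observers:

```python
# Percentage agreement between two observers coding the same episodes of
# children's play behavior (codings are hypothetical).
rater_1 = ["solitary", "parallel", "cooperative", "parallel", "solitary", "cooperative"]
rater_2 = ["solitary", "parallel", "cooperative", "solitary", "solitary", "cooperative"]

agreements = sum(a == b for a, b in zip(rater_1, rater_2))
percent_agreement = 100 * agreements / len(rater_1)
print(f"interrater agreement = {percent_agreement:.0f}%")   # 5/6 ≈ 83%
```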
VALIDITY
• In addition to being reliable, measures must also be valid.
• Validity refers to whether a (statistical or scientific) study is able to draw
conclusions that are in agreement with statistical and scientific laws.
• In other words, a conclusion drawn from a given data set after experimentation is
said to be scientifically valid if it follows from the data and relies on mathematical and
statistical laws.
• There are several types of validity.
• Like reliability, validity is measured by the use of correlation coefficients.
• For instance, if researchers developed a new test to measure some construct (such as
depression), they might establish the validity of the test by correlating scores on the new
test with scores on an already established measure of depression; as with reliability,
we would expect the correlation to be positive (see the sketch below).
• Coefficients as low as 0.20 or 0.30 may establish the validity of a measure (Anastasi &
Urbina, 1997).
• In brief, this means that the results are most likely not due to chance.
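As a rough sketch (made-up scores), validating a new depression measure against an established one amounts to checking that the two sets of scores correlate positively:

```python
# Validation by correlation: scores on a new depression test vs. scores on
# an already established depression measure (hypothetical data).
from scipy.stats import pearsonr

new_test    = [12, 25, 7, 30, 18, 22, 5, 27]
established = [14, 22, 9, 33, 15, 24, 6, 29]

r, p = pearsonr(new_test, established)
print(f"validity coefficient r = {r:.2f} (p = {p:.3f})")
# A clearly positive r supports the validity of the new test; coefficients as
# low as 0.20-0.30 may be accepted as evidence (Anastasi & Urbina, 1997).
```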
Content validity:
• A systematic examination of the test content to determine whether it covers a
representative sample of the domain of behaviors to be measured assesses content
validity.
• This type of validity is important to make sure that the test or questionnaire that is
prepared actually covers all aspects of the variable that is being studied. If the test is too
narrow, then it will not predict what it claims.
• In other words, a test with content validity has items that satisfactorily assess the content
being examined.
• To determine whether a test has content validity, researchers consult experts in the area
being tested. (In fact, this is a challenge for EM students who prepare questionnaires.)

Face Validity
• Face validity simply addresses whether or not a test looks valid on its surface. Does it
appear to be an adequate measure of the conceptual variable? It is often confused with
content validity.
• Face validity is based on appearance alone.
Criterion validity:
• The extent to which a measuring instrument accurately predicts behavior or ability in a
given area establishes criterion validity.
• Two types of criterion validity may be used, depending on whether the test is used to
estimate present performance (concurrent validity) or to predict future performance
(predictive validity).
• The SAT and GRE are examples of tests that have predictive validity because
performance on the tests correlates with later performance in college and graduate school,
respectively.
• The tests can be used with some degree of accuracy to predict future behavior.
• A test used to determine whether someone qualifies as a pilot is a measure of concurrent
validity. The test is estimating the person’s ability at the present time, not attempting to
predict future outcomes.
• Thus concurrent validation is used for the diagnosis of existing status rather than the
prediction of future outcomes.
Construct Validity
• Construct validity is considered by many to be the most important type of
validity.
• The construct validity of a test assesses the extent to which a measuring instrument
accurately measures a theoretical construct or trait that it is designed to measure.
• Some examples of theoretical constructs or traits are verbal fluency, neuroticism,
depression, anxiety, intelligence, and scholastic aptitude.
• One means of establishing construct validity is by correlating performance on the test
with performance on a test for which construct validity has already been determined.
• Thus performance on a newly developed intelligence test might be correlated with
performance on an existing intelligence test for which construct validity has been
previously established
• Another example comes from metallurgy/materials science, where a new material is being
developed: a test characterizing the new material can be correlated with an existing materials
test for which construct validity has been previously established.
• Another means of establishing construct validity is to show that the scores on the new test
differ across people with different levels of the trait being measured.
• For example, if a new test is designed to measure depression, you can compare scores on
the test for those known to be suffering from depression with scores for those not
suffering from depression.
• The new measure has construct validity if it measures the construct of depression
accurately.
1. Content and construct validity
2. Face validity
3. A test that measures something other than what it claims to measure – its validity would
have to be established through experiments on those other constructs.
4. This is a concern about the validity of the test, because it does not measure what it is
supposed to measure.
