Module 2 - Measurement Fundamentals

Uploaded by

Sheikh Badshah

Level A: Basic Psychometrics

Module 2: Measurement Fundamentals

Professor Muhammad Kamal Uddin

(PhD, Kyushu University, Japan)
Department of Psychology
University of Dhaka, Bangladesh
Phone: 01713456644 Email:[email protected]
What is Psychometrics?
Psychometrics is a field of study concerned with the theory and
technique of psychological measurement. As defined by the US
National Council on Measurement in Education (NCME),
psychometrics refers to psychological measurement. Generally, it
refers to the field in psychology and education that is devoted to
testing, measurement, assessment, and related activities.

Psychological Assessment

It can be defined as the gathering and integration of psychology-related data for the purpose of making a psychological evaluation, accomplished through the use of tools such as tests, interviews, case studies, behavioral observation, and specially designed apparatuses and measurement procedures.

Tools of Psychological Assessment
1. Psychological test
2. Interview
3. Portfolio
4. Case History Data
5. Behavioral Observation
6. Role-Play Tests
7. Computers as Tools
8. Other Tools

A Taxonomy of Psychological Assessment
Psychological Test
According to Cronbach (1960) – “A psychological test is a systematic
procedure for comparing the behavior of two or more people”.
According to Anstey (1966) – “Psychological tests can be defined as devices
and techniques for the quantitative assessment of psychological attribute of an
individual”.
According to Anastasi (1969) – “A psychological test is an objective and
standardized measure of a sample of behavior”.

According to Gregory (1996) – “A test is a standardized procedure for sampling behavior and describing it with categories or score”.
According to Anastasi and Urbina (1997) – “A psychological test is essentially an objective and standardized measure of a sample of behavior”.
Definition of Standardized Tests
Standardized tests are constructed by test construction specialists, usually with
the assistance of curriculum experts, teachers, and school administrators. They
may be used to determine a student’s level of performance relative to (a) the
performance of other students of similar age and grade or (b) a criterion, such
as state academic standards, or the new Common Core State Standards. When
standardized tests are used to compare performance to students across the
country, they are called standardized norm-referenced tests, and when they are
used to determine whether performance meets or exceeds criteria like state
standards, they are called standardized criterion-referenced tests.
Norm-Referenced Tests (NRTs) and Criterion-Referenced Tests (CRTs)
• NRTs are typically standardized tests developed by commercial test publishers or some state education agencies (e.g., the SAT). They are designed to enable us to compare the performance of students who currently take the test with a sample of students who completed the test in the past. That earlier sample is called a norm group (or normative group or sample). NRTs tend to measure broad educational goals and are usually lengthy (hours long in duration).

• CRTs may be standardized or teacher-made and enable a different kind of comparison. Compared to NRTs, CRTs are typically shorter in length and narrower in focus. Instead of comparing current student performance to other students, CRTs enable comparisons to an absolute standard or criterion. CRTs help us to determine what a student can or cannot do. Rather than stating that “Marie is above average,” CRTs enable us to make judgments about a student’s (or group of students’) level of proficiency or mastery over a skill or set of skills (e.g., “Marie is able to spell the words in the third grade spelling list with greater than 80% accuracy”). For this reason, and because they are shorter than NRTs, scores from a CRT are more likely to be useful for instructional decision making than scores from an NRT would be.
Definition of Measurement
• Measurement is the assignment of numerals to objects or events according to rules (Stevens, 1946).
• Measurement is a process that involves three components (an object of measurement, a set of numbers, and a system of rules) that serve to assign numbers to attributes or magnitudes of the variable being measured.
• The rules are the specific procedures used to transform qualities of attributes into numbers (Camilli, Cizek, & Lugg, 2001; Nunnally & Bernstein, 1994; Yanai, 2003).
• An educational or psychological test is a measuring device, and as such it involves rules (e.g., specific items, administration, and scoring instructions) for assigning numbers to an individual’s performance that are interpreted as reflecting characteristics of the individual.
Properties of Numbers

• Property of Identity
• Property of Order
• Property of Quantity
• The Number 0

Four Levels of Measurement
1. Nominal Scale: a scale in which the numbers or letters assigned to an object serve only as labels for identification or classification, e.g. gender (Male = 1, Female = 2).
2. Ordinal Scale: a scale that arranges objects or alternatives according to their magnitude in an ordered relationship, e.g. academic status (Freshman = 1, Sophomore = 2, Junior = 3, etc.).
3. Interval Scale: a scale that arranges objects according to their magnitude and distinguishes this ordered arrangement in units of equal intervals, but does not have a natural zero representing absence of the given attribute, e.g. the Celsius temperature scale (40 °C is not twice as hot as 20 °C).
4. Ratio Scale: a scale that has absolute rather than relative quantities and an absolute (natural) zero where there is an absence of a given attribute, e.g. income, age.
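The interval versus ratio distinction can be checked numerically. The sketch below is only an illustration (the helper name `celsius_to_kelvin` is not from the slides): it compares the ratio of two Celsius readings with the ratio of the same temperatures on the Kelvin scale, which has a true zero, while the difference between the two temperatures is the same on both scales.

```python
# Illustrative sketch: ratios are only meaningful on a scale with a true zero.

def celsius_to_kelvin(celsius):
    """Convert a Celsius reading (interval scale) to Kelvin (ratio scale)."""
    return celsius + 273.15

low_c, high_c = 20.0, 40.0

# Ratio of the raw Celsius numbers (not a meaningful "twice as hot" statement).
ratio_celsius = high_c / low_c
# Ratio of the same two temperatures on the Kelvin scale (true zero).
ratio_kelvin = celsius_to_kelvin(high_c) / celsius_to_kelvin(low_c)

# Differences, in contrast, agree on both scales (equal intervals).
diff_celsius = high_c - low_c
diff_kelvin = celsius_to_kelvin(high_c) - celsius_to_kelvin(low_c)

print(f"Ratio of Celsius readings: {ratio_celsius:.2f}")
print(f"Ratio of Kelvin readings:  {ratio_kelvin:.2f}")
print(f"Difference (Celsius): {diff_celsius:.1f}, difference (Kelvin): {diff_kelvin:.1f}")
```

The Celsius numbers give a ratio of 2, but the Kelvin ratio is only about 1.07, which is why ratio statements are reserved for ratio scales.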
Association Between Property of Number and Level of Measurement

| Property of Number | Nominal | Ordinal | Interval | Ratio |
|---|---|---|---|---|
| Identity | ✓ | ✓ | ✓ | ✓ |
| Order | | ✓ | ✓ | ✓ |
| Quantity | | | ✓ | ✓ |
| Absolute Zero | | | | ✓ |
| Example | Sex | Class Rank | Temperature | Distance |
Why Does Level of Measurement Matter?

• It helps you decide which mathematical operations and statistical analyses are appropriate for the values that were assigned.
• It helps you decide how to interpret the data from that variable.
Mathematical Operations
• With nominal level data, the only applicable operations are “equal to” (=) and “not equal to” (≠).
• With ordinal level data, one can also apply “greater than” (>) and “less than” (<).
• With interval level data, addition and subtraction also become meaningful, so differences between scores can be compared. However, because interval level scores do not have an absolute or true zero, one cannot make accurate statements about relative magnitude or create ratios.
• With ratio level data, all the basic mathematical operations (addition, subtraction, multiplication, and division) are meaningful, so one can make accurate statements about relative magnitude and create ratios.
Statistical Analyses
Descriptive Statistics
• With nominal level data, the only applicable measure of central tendency is the mode. No common measure of variability is applicable. One can describe the categories and the count (frequency distribution) in each category.
• If ordinal scales are used, raw data can be analyzed using the median and range (plus the mode and frequency distribution).
• If interval or ratio scales are used, raw data can be analyzed using the mean, median, mode, range, variance, and standard deviation.
Inferential Statistics
• Nominal and ordinal data are amenable to nonparametric statistics, whereas interval and ratio data can be analyzed using parametric statistics. Parametric tests are more powerful, meaning that they are more sensitive in detecting true differences between groups.
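As a rough illustration of matching descriptive statistics to the level of measurement, the sketch below uses Python's standard `statistics` module. The three small data sets are made up for the example and are not from the slides.

```python
import statistics
from collections import Counter

# Hypothetical data, one variable per level of measurement.
gender = [1, 2, 2, 1, 2, 2]                # nominal codes: 1 = Male, 2 = Female
satisfaction = [1, 3, 4, 4, 5, 2, 4]       # ordinal: 1 (very unsatisfied) to 5 (very satisfied)
scores = [4, 5, 3, 8, 8, 6, 6, 7, 9, 10]   # interval/ratio scores

# Nominal: mode and frequency distribution only.
nominal_mode = statistics.mode(gender)
nominal_counts = Counter(gender)

# Ordinal: median and range (the mode and counts remain usable).
ordinal_median = statistics.median(satisfaction)
ordinal_range = (min(satisfaction), max(satisfaction))

# Interval/ratio: mean, variance, and standard deviation become meaningful.
score_mean = statistics.mean(scores)
score_variance = statistics.variance(scores)   # sample variance, N - 1 in the denominator
score_sd = statistics.stdev(scores)

print("Nominal  -> mode:", nominal_mode, "counts:", dict(nominal_counts))
print("Ordinal  -> median:", ordinal_median, "range:", ordinal_range)
print(f"Interval -> mean: {score_mean:.2f}, variance: {score_variance:.2f}, SD: {score_sd:.2f}")
```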
Types of Correlations
| Variables | Nominal | Ordinal | Interval/Ratio (Scale) |
|---|---|---|---|
| Nominal | Clustered bar graph; Chi-square; Phi (φ) or Cramer's V | Clustered bar graph; Chi-square; Phi (φ) or Cramer's V | Scatter plot, bar chart, or error-bar chart; Point-biserial correlation |
| Ordinal | | Scatter plot or clustered bar chart; Spearman’s Rho or Kendall’s Tau | Recode; Scatter plot; Pearson point-biserial, Spearman’s Rho, or Kendall’s Tau |
| Interval/Ratio (Scale) | | | Scatter plot; Pearson product-moment correlation |

Types of Correlations
| No. | Correlation | Level of Measurement |
|---|---|---|
| 1 | Phi, contingency | Both variables nominal |
| 2 | Spearman rank order, Kendall’s tau | Both variables ordinal |
| 3 | Pearson product moment | Both variables interval |
| 4 | Pearson point biserial | One variable interval, one variable (naturally) dichotomous/binary |
| 5 | Pearson biserial | One variable interval, one variable artificially dichotomous/binary |
| 6 | Polychoric | Both variables ordinal with underlying continuities |
| 7 | Tetrachoric | Both variables artificially dichotomous |
Types of Graphs
| Type | Level of Measurement |
|---|---|
| Bar Chart | Nominal; must be organized into categories |
| Pie Chart | Nominal, ordinal, interval, or ratio. However, it is not practical to use a pie chart when there are more than five or six possible values for a variable. |
| Histogram/Frequency Polygon | Ordinal, interval, or ratio level data. Most often used with ratio or interval level data |
| Line Chart | Interval and ratio data |
| Single System Design | Interval and ratio data |

Types of Graphs

[Figure: ABA design. Vertical axis: low self-esteem (0 to 40); horizontal axis: 1 to 14, divided into Baseline, Intervention, and Baseline phases.]
Definition of Variable

Any characteristic which is subject to change and can have more than one value, such as age, intelligence, motivation, gender, etc.

Types of Variables Based on How They are Measured
Exercise
1. The number of participants in the current batch.
2. A person’s weight is measured on which kind of scale?
3. Salaries of college professors.
4. What is the level of measurement of an IQ score?
5. Variables such as gender or ethnicity are measured on which scale?
6. Where do you live?
7. What is your political preference?
8. What is your hair color?
9. What is your social class?
10. Number of students present in the class
11. Time it takes to get to school
12. Number of red marbles in a jar
13. Distance traveled between classes
14. Students’ grade level
15. Height of students in class
16. Weight of students in class
17. Number of heads when flipping three coins
18. Temperature measured on the Kelvin scale

Exercise
19. How satisfied are you with the training program?
• Very Unsatisfied – 1
• Unsatisfied – 2
• Neutral – 3
• Satisfied – 4
• Very Satisfied – 5
20. How would you rate our app?
• Excellent
• Very Good
• Good
• Bad
• Poor
21. In medical practice, burns are commonly described as
• First-Degree
• Second-Degree
• Third-Degree
Main Differences Between Discrete Versus Continuous Variables
| Basis of Comparison | Discrete Variable | Continuous Variable |
|---|---|---|
| Meaning | A variable with a limited number of values which are isolated | A variable with an unlimited number of values within a range |
| Values | Countable | Measurable |
| Range of specified number | Complete or whole | Incomplete |
| Represented by | Lone points on a graph | Linked points |
| Classification | Do not overlap | Overlapping |
| Assumes | Separate or distinct value | Any value within a range |

Types of Variables Based on the Role They Play in a Study

Independent, Input, Covariate, Exploratory, Organismic, Predictor, Exogenous, Manipulated, Treatment, Explanatory

Dependent, Output, Outcome, Effect, Criterion, Endogenous, Test, Response

Extraneous, Confounding, Intervening, Mediating, Moderating, Interaction, Control or Constant, Dummy or Indicator

Assignment

1. Prepare a psychometric test or questionnaire including demographics to demonstrate categorical (nominal and ordinal), continuous (interval and ratio), and discrete (either continuous or categorical) variables based on how they are measured and label them graphically.

2. Explain all kinds of variables based on the role they play in a study and present them pictorially.
Errors in Measurement

1. Random Error
2. Systematic Error
What Is Random Error?
Any factors that randomly affect measurement of the variable
across the sample. For instance, each person’s mood can
inflate or deflate performance on any occasion. Random error
adds variability to the data but does not affect average
performance for the group.

Random Error

[Figure: frequency distributions of X with and without random error.]

Notice that random error doesn’t affect the average, only the variability around the average.
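| Participants | Observed Score = X | True Score = T | Error = E |
|---|---|---|---|
| 1 | 7 | 6 | 1 |
| 2 | 5 | 5 | 0 |
| 3 | 8 | 7 | 1 |
| 4 | 8 | 7 | 1 |
| 5 | 7 | 8 | -1 |
| 6 | 6 | 5 | 1 |
| 7 | 5 | 6 | -1 |
| 8 | 3 | 5 | -2 |
| 9 | 5 | 6 | -1 |
| 10 | 6 | 5 | 1 |
| Mean | 6.00 | 6.00 | 0.00 |
| Variance | 2.44 | 1.11 | 1.33 |

Variance T/Variance X = 0.45
Squared Correlation (X, T) = 0.45
Correlation (X, T) = 0.67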
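The summary rows of this table can be reproduced with a few lines of Python. This is a minimal sketch using the ten scores above and the sample (N − 1) formulas; the helper names are illustrative. It also shows that the observed variance splits into true and error variance because T and E are uncorrelated in these data.

```python
import math

# Scores from the random-error table above.
X = [7, 5, 8, 8, 7, 6, 5, 3, 5, 6]   # observed scores
T = [6, 5, 7, 7, 8, 5, 6, 5, 6, 5]   # true scores
E = [x - t for x, t in zip(X, T)]     # error = observed - true

def mean(values):
    return sum(values) / len(values)

def covariance(a, b):
    """Sample covariance with N - 1 in the denominator."""
    ma, mb = mean(a), mean(b)
    return sum((x - ma) * (y - mb) for x, y in zip(a, b)) / (len(a) - 1)

def variance(values):
    return covariance(values, values)

def correlation(a, b):
    return covariance(a, b) / math.sqrt(variance(a) * variance(b))

var_ratio = variance(T) / variance(X)   # true variance / observed variance
r_xt = correlation(X, T)

print(f"Var(X) = {variance(X):.2f}, Var(T) = {variance(T):.2f}, Var(E) = {variance(E):.2f}")
print(f"Cov(T, E) = {covariance(T, E):.2f}")
print(f"Var(T)/Var(X) = {var_ratio:.2f}, r(X,T) = {r_xt:.2f}, r(X,T)^2 = {r_xt ** 2:.2f}")
```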
What Is Systematic Error/Bias?
Any factors that systematically affect measurement of the variable across the sample. For instance, a leading question like “Do you favor eliminating the wasteful excess in the military budget?” will tend to push responses systematically in one direction, because the loaded wording invites endorsement. Systematic error does affect average performance for the group.

Systematic Error

[Figure: frequency distributions of X with and without systematic error.]

Notice that systematic error does affect the average; we call this a bias.
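| Participants | Observed Score = X | True Score = T | Error = E |
|---|---|---|---|
| 1 | 7 | 6 | 1 |
| 2 | 5 | 4 | 1 |
| 3 | 8 | 7 | 1 |
| 4 | 8 | 7 | 1 |
| 5 | 7 | 6 | 1 |
| 6 | 6 | 5 | 1 |
| 7 | 5 | 4 | 1 |
| 8 | 3 | 2 | 1 |
| 9 | 5 | 4 | 1 |
| 10 | 6 | 5 | 1 |
| Mean | 6.00 | 5.00 | 1.00 |
| Variance | 2.44 | 2.44 | 0.00 |

Variance T/Variance X = 1.00
Squared Correlation (X, T) = 1.00
Correlation (X, T) = 1.00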
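Variance, Covariance, and Correlation

$$\text{Variance: } \sigma_X^2 = \frac{\sum (X - \bar{X})^2}{N - 1} = \frac{\sum (X - \bar{X})(X - \bar{X})}{N - 1}$$

$$\text{Covariance: } \sigma_{XY} = \frac{\sum (X - \bar{X})(Y - \bar{Y})}{N - 1}$$

$$\text{Correlation: } \rho_{XY} = \frac{\sigma_{XY}}{\sigma_X \sigma_Y}$$
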
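Computing Variance

| Student | $X$ | $X - \bar{X}$ | $(X - \bar{X})^2$ |
|---|---|---|---|
| 1 | 4 | -2.6 | 6.76 |
| 2 | 5 | -1.6 | 2.56 |
| 3 | 3 | -3.6 | 12.96 |
| 4 | 8 | 1.4 | 1.96 |
| 5 | 8 | 1.4 | 1.96 |
| 6 | 6 | -0.6 | 0.36 |
| 7 | 6 | -0.6 | 0.36 |
| 8 | 7 | 0.4 | 0.16 |
| 9 | 9 | 2.4 | 5.76 |
| 10 | 10 | 3.4 | 11.56 |
| Sum | 66.00 | 0.00 | 44.40 |
| Mean | 6.60 | 0.00 | |

N = 10
Variance = (Sum of Squared Deviations)/(N − 1) = 44.40/9 = 4.93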
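Computing Covariance

| Student | $X$ | $X - \bar{X}$ | $Y$ | $Y - \bar{Y}$ | $(X - \bar{X})(Y - \bar{Y})$ |
|---|---|---|---|---|---|
| 1 | 4 | -2.6 | 20 | 2.6 | -6.76 |
| 2 | 5 | -1.6 | 19 | 1.6 | -2.56 |
| 3 | 3 | -3.6 | 21 | 3.6 | -12.96 |
| 4 | 8 | 1.4 | 16 | -1.4 | -1.96 |
| 5 | 8 | 1.4 | 16 | -1.4 | -1.96 |
| 6 | 6 | -0.6 | 18 | 0.6 | -0.36 |
| 7 | 6 | -0.6 | 18 | 0.6 | -0.36 |
| 8 | 7 | 0.4 | 17 | -0.4 | -0.16 |
| 9 | 9 | 2.4 | 15 | -2.4 | -5.76 |
| 10 | 10 | 3.4 | 14 | -3.4 | -11.56 |
| Sum | 66.00 | 0.00 | 174.00 | 0.00 | -44.40 |
| Mean | 6.60 | 0.00 | 17.40 | 0.00 | |

Covariance = (-44.40/9) = -4.93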
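The two worked tables can be reproduced directly from the deviation-score definitions. The sketch below is a plain-Python illustration (helper names are illustrative, not from the slides) that uses the same X and Y values and the N − 1 denominator.

```python
# Data from the "Computing Variance" and "Computing Covariance" tables.
X = [4, 5, 3, 8, 8, 6, 6, 7, 9, 10]
Y = [20, 19, 21, 16, 16, 18, 18, 17, 15, 14]

def mean(values):
    return sum(values) / len(values)

def sum_of_cross_products(a, b):
    """Sum of (a - mean_a) * (b - mean_b) over all cases."""
    ma, mb = mean(a), mean(b)
    return sum((x - ma) * (y - mb) for x, y in zip(a, b))

n = len(X)
ss_x = sum_of_cross_products(X, X)    # sum of squared deviations of X
sp_xy = sum_of_cross_products(X, Y)   # sum of cross-products of X and Y

variance_x = ss_x / (n - 1)
covariance_xy = sp_xy / (n - 1)

print(f"Sum of squared deviations = {ss_x:.2f}, variance = {variance_x:.2f}")
print(f"Sum of cross-products = {sp_xy:.2f}, covariance = {covariance_xy:.2f}")
```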
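Computing Correlation

| Participants | X = Height (cm) | Y = Weight (gm) | X = Height (inch) | Y = Weight (dag) |
|---|---|---|---|---|
| 1 | 7 | 6 | 2.76 | 0.60 |
| 2 | 5 | 5 | 1.97 | 0.50 |
| 3 | 8 | 7 | 3.15 | 0.70 |
| 4 | 8 | 7 | 3.15 | 0.70 |
| 5 | 7 | 8 | 2.76 | 0.80 |
| 6 | 6 | 5 | 2.36 | 0.50 |
| 7 | 5 | 6 | 1.97 | 0.60 |
| 8 | 3 | 5 | 1.18 | 0.50 |
| 9 | 5 | 6 | 1.97 | 0.60 |
| 10 | 6 | 5 | 2.36 | 0.50 |
| Mean | 6 | 6 | 2.36 | 0.60 |

Covariance: 1.11 (cm, gm); 0.04 (inch, dag)
Correlation: 0.67 (cm, gm); 0.67 (inch, dag)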
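The point of this table is that covariance depends on the units of measurement while correlation does not. A minimal sketch, assuming the conversions 1 inch = 2.54 cm and 1 decagram = 10 g and using the N − 1 formulas from the earlier slides:

```python
import math

# Height and weight scores from the table above.
height_cm = [7, 5, 8, 8, 7, 6, 5, 3, 5, 6]
weight_g = [6, 5, 7, 7, 8, 5, 6, 5, 6, 5]

# Same measurements re-expressed in other units.
height_in = [h / 2.54 for h in height_cm]   # centimetres -> inches
weight_dag = [w / 10 for w in weight_g]     # grams -> decagrams

def mean(values):
    return sum(values) / len(values)

def covariance(a, b):
    """Sample covariance with N - 1 in the denominator."""
    ma, mb = mean(a), mean(b)
    return sum((x - ma) * (y - mb) for x, y in zip(a, b)) / (len(a) - 1)

def correlation(a, b):
    return covariance(a, b) / math.sqrt(covariance(a, a) * covariance(b, b))

cov_metric = covariance(height_cm, weight_g)
cov_converted = covariance(height_in, weight_dag)
r_metric = correlation(height_cm, weight_g)
r_converted = correlation(height_in, weight_dag)

print(f"Covariance:  {cov_metric:.2f} (cm, g) vs {cov_converted:.2f} (inch, dag)")
print(f"Correlation: {r_metric:.2f} (cm, g) vs {r_converted:.2f} (inch, dag)")
```

Rescaling either variable rescales the covariance by the same factor, but the standard deviations in the denominator of the correlation rescale too, so the correlation is unchanged.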
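$$\mathrm{COV}_{XY} = \frac{\sum (X - \bar{X})(Y - \bar{Y})}{N - 1} = \frac{\sum XY - \dfrac{\sum X \sum Y}{N}}{N - 1}$$

$$\mathrm{COV}_{XX} = \frac{\sum (X - \bar{X})(X - \bar{X})}{N - 1} = \frac{\sum XX - \dfrac{\sum X \sum X}{N}}{N - 1}$$

$$\mathrm{Var}_X = \frac{\sum (X - \bar{X})^2}{N - 1} = \frac{\sum X^2 - \dfrac{(\sum X)^2}{N}}{N - 1}$$

$$\mathrm{Var}_Y = \frac{\sum (Y - \bar{Y})^2}{N - 1} = \frac{\sum Y^2 - \dfrac{(\sum Y)^2}{N}}{N - 1}$$

• Often multiple items are combined in order to create a composite score.
• The variance of the composite is a combination of the variances and covariances of the items creating it.
• The general Variance Sum Law states that if X and Y are random variables:

$$\sigma^2_{X \pm Y} = \sigma^2_X + \sigma^2_Y \pm 2\sigma_{XY}$$
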
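Variance Sum Law

$$\sigma^2_{X \pm Y} = \frac{\sum [(X \pm Y) - (\bar{X} \pm \bar{Y})]^2}{N - 1}$$

$$\sigma^2_{X \pm Y} = \frac{\sum [(X - \bar{X}) \pm (Y - \bar{Y})]^2}{N - 1}$$

$$\sigma^2_{X \pm Y} = \frac{\sum (X - \bar{X})^2}{N - 1} + \frac{\sum (Y - \bar{Y})^2}{N - 1} \pm \frac{2\sum (X - \bar{X})(Y - \bar{Y})}{N - 1}$$

$$\sigma^2_{X \pm Y} = \sigma^2_X + \sigma^2_Y \pm 2\sigma_{XY}$$

$$\sigma^2_{X \pm Y} = \sigma^2_X + \sigma^2_Y \pm 2\rho_{XY}\sigma_X\sigma_Y, \qquad \text{since } \rho_{XY} = \frac{\sigma_{XY}}{\sigma_X \sigma_Y}$$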
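$$\sigma^2_{X+Y} = \sigma^2_X + \sigma^2_Y + 2\sigma_{XY}$$

| Individual | X (Sleep) | Y (Awake) | X + Y |
|---|---|---|---|
| 1 | 4 | 20 | 24 |
| 2 | 5 | 19 | 24 |
| 3 | 3 | 21 | 24 |
| 4 | 8 | 16 | 24 |
| 5 | 8 | 16 | 24 |
| 6 | 6 | 18 | 24 |
| 7 | 6 | 18 | 24 |
| 8 | 7 | 17 | 24 |
| 9 | 9 | 15 | 24 |
| 10 | 10 | 14 | 24 |
| Variance | 4.93 | 4.93 | 0 |

Covariance (X, Y) = -4.93
Variance (X+Y) = 4.93 + 4.93 + 2(-4.93) = 0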
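$$\sigma^2_{X-Y} = \sigma^2_X + \sigma^2_Y - 2\sigma_{XY}$$

| Individual | X (Sleep) | Y (Awake) | X − Y |
|---|---|---|---|
| 1 | 4 | 20 | -16 |
| 2 | 5 | 19 | -14 |
| 3 | 3 | 21 | -18 |
| 4 | 8 | 16 | -8 |
| 5 | 8 | 16 | -8 |
| 6 | 6 | 18 | -12 |
| 7 | 6 | 18 | -12 |
| 8 | 7 | 17 | -10 |
| 9 | 9 | 15 | -6 |
| 10 | 10 | 14 | -4 |
| Variance | 4.93 | 4.93 | 19.73 |

Covariance (X, Y) = -4.93
Variance (X-Y) = 4.93 + 4.93 - 2(-4.93) = 19.73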
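$$\sigma^2_{X+Y} = \sigma^2_X + \sigma^2_Y + 2\sigma_{XY}$$

| Individual | X (Sleep) | Y (Study) | X + Y |
|---|---|---|---|
| 1 | 5 | 4 | 9 |
| 2 | 6 | 5 | 11 |
| 3 | 7 | 4 | 11 |
| 4 | 8 | 4 | 12 |
| 5 | 6 | 5 | 11 |
| 6 | 7 | 5 | 12 |
| 7 | 6 | 5 | 11 |
| 8 | 7 | 4 | 11 |
| 9 | 8 | 5 | 13 |
| 10 | 6 | 4 | 10 |
| Variance | 0.93 | 0.28 | 1.21 |

Covariance (X, Y) = 0.00
Variance (X+Y) = 0.93 + 0.28 + 2(0.00) = 1.21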
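$$\sigma^2_{X-Y} = \sigma^2_X + \sigma^2_Y - 2\sigma_{XY}$$

| Individual | X (Sleep) | Y (Study) | X − Y |
|---|---|---|---|
| 1 | 5 | 4 | 1 |
| 2 | 6 | 5 | 1 |
| 3 | 7 | 4 | 3 |
| 4 | 8 | 4 | 4 |
| 5 | 6 | 5 | 1 |
| 6 | 7 | 5 | 2 |
| 7 | 6 | 5 | 1 |
| 8 | 7 | 4 | 3 |
| 9 | 8 | 5 | 3 |
| 10 | 6 | 4 | 2 |
| Variance | 0.93 | 0.28 | 1.21 |

Covariance (X, Y) = 0.00
Variance (X-Y) = 0.93 + 0.28 - 2(0.00) = 1.21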
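The four worked tables can be verified in one pass. The sketch below (plain Python, illustrative helper names) computes the variance of the sum and difference scores directly and compares them with the value predicted by the Variance Sum Law for both data sets.

```python
def mean(values):
    return sum(values) / len(values)

def covariance(a, b):
    """Sample covariance with N - 1 in the denominator."""
    ma, mb = mean(a), mean(b)
    return sum((x - ma) * (y - mb) for x, y in zip(a, b)) / (len(a) - 1)

def variance(values):
    return covariance(values, values)

def check_variance_sum_law(xs, ys):
    """Return (direct, predicted) variances for X + Y and for X - Y."""
    direct_sum = variance([x + y for x, y in zip(xs, ys)])
    direct_diff = variance([x - y for x, y in zip(xs, ys)])
    predicted_sum = variance(xs) + variance(ys) + 2 * covariance(xs, ys)
    predicted_diff = variance(xs) + variance(ys) - 2 * covariance(xs, ys)
    return (direct_sum, predicted_sum), (direct_diff, predicted_diff)

# Data set 1: hours asleep and awake (perfectly negatively related).
sleep = [4, 5, 3, 8, 8, 6, 6, 7, 9, 10]
awake = [20, 19, 21, 16, 16, 18, 18, 17, 15, 14]

# Data set 2: hours of sleep and study (zero covariance).
sleep_2 = [5, 6, 7, 8, 6, 7, 6, 7, 8, 6]
study = [4, 5, 4, 4, 5, 5, 5, 4, 5, 4]

sum_1, diff_1 = check_variance_sum_law(sleep, awake)
sum_2, diff_2 = check_variance_sum_law(sleep_2, study)

print(f"Sleep/Awake: Var(X+Y) = {sum_1[0]:.2f}, Var(X-Y) = {diff_1[0]:.2f}")
print(f"Sleep/Study: Var(X+Y) = {sum_2[0]:.2f}, Var(X-Y) = {diff_2[0]:.2f}")
```

When the covariance is zero, the sum and the difference have the same variance; when it is negative, the difference score is far more variable than the sum.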
Theories of Measurement

1. Classical Test Theory (CTT)

2. Item Response Theory (IRT)
3. Generalizability Theory (GT)
Classical Test Theory

Lord and Novick (1968) say “The classical test theory model is
based on a particular, mathematically convenient and conceptually
useful, definition of true score and on certain basic assumptions
concerning the relationships among true and error scores”.
Classical Test Theory

X T  E
Observed True Random
= +
Score Score Error

57
Classical test theory also assumes that (a) the distribution of observed scores
that a person may have under repeated independent testing is normal and (b)
the standard deviation of the normal distribution, referred to as standard error
of measurement (SEM), is the same for all persons taking the test.

Under these assumptions, the figure represents the (hypothetical) normal distribution of observed scores for repeated measurements of one person with the same test. The mean of this distribution is, in fact, the person’s true score (T = 20) and the standard deviation is the standard error of measurement (SEM = 2).

[Figure annotation: $[X - 2(\mathrm{SEM})] < T < [X + 2(\mathrm{SEM})]$]
Assumptions in CTT
The classical measurement model, which asserts that an observed score, X, results from the summation of a true score, T, plus error, E, starts with common assumptions about items and their relationships to the latent variable and sources of error:

The observed score (X) of a person is the sum of the true part (T) and the error part (E).

$$X = T + E$$

Assumptions
1. The amount of error associated with individual items varies randomly. The error associated with individual items has a mean of zero when aggregated across a large number of people. Thus, items’ means tend to be unaffected by error when a large number of respondents complete the items.

$$\mu_E = 0 \quad \ldots(1)$$

The expected value of the errors E, which are deviations from T, is zero; they are deviation scores from the mean.

Assumptions…
2. Measurement errors between two items of the same scale are uncorrelated. That is, one item’s error term is not correlated with another item’s error term (i.e., the assumption of local independence); the only routes linking items always pass through the latent variable, never through any error term.

$$\rho_{E_1 E_2} = 0 \quad \ldots(2)$$

Assumptions…
3. Error terms are not correlated with the true score of the latent variable. For example, with a high value of T, the E is not systematically lower or higher. So, the assumption is:

$$\rho_{TE} = 0 \quad \ldots(3)$$

Important Deduction from the CTT
1. The observed variance of scores in a sample or population equals the true score variance of the sample/population plus the error variance. The total observed variance in a test/questionnaire consists of the sum of the true variances and the error variances. Because the correlation between the T’s and E’s is zero, no covariance term between T and E has to be added.

$$\sigma^2_X = \sigma^2_T + \sigma^2_E \quad \ldots(4)$$

Note. $\mathrm{Cov}(T, E) = 0 \quad \ldots(3)$

Important Deduction from the CTT…
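2. The covariance of observed score with true score is just the variance of true score

$$\mathrm{Cov}(X, T) = \mathrm{Cov}(T + E, T) = \sigma^2_T + \mathrm{Cov}(T, E) = \sigma^2_T + 0 = \sigma^2_T \quad \ldots(5)$$

3. The correlation of observed score with true score is

$$\rho_{XT} = \frac{\mathrm{Cov}(X, T)}{\sigma_X \sigma_T} = \frac{\sigma^2_T}{\sigma_X \sigma_T} = \frac{\sigma_T}{\sigma_X} \quad \ldots(6)$$

4. The correlation of observed score with error score is

$$\rho_{XE} = \frac{\mathrm{Cov}(X, E)}{\sigma_X \sigma_E} = \frac{\mathrm{Var}(E)}{\sigma_X \sigma_E} = \frac{\sigma_E \sigma_E}{\sigma_X \sigma_E} = \frac{\sigma_E}{\sigma_X} \quad \ldots(7)$$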
Theoretical Definition of Reliability
1. Reliability can be defined as the ratio of the true score to the observed score.

$$R = \frac{T}{X} \quad \ldots(1)$$

2. Reliability is generally defined as the ratio of the true score variance to the observed score variance.

$$\text{Reliability} = \frac{\sigma^2_T}{\sigma^2_X} \quad \ldots(2)$$

3. Reliability is the squared correlation between true score and observed score.

$$\text{Reliability} = \rho^2_{XT} \quad \ldots(3)$$

$$\therefore \text{Reliability } (\rho_{XX'}) = \frac{\sigma^2_T}{\sigma^2_X} = \rho^2_{XT} \quad \ldots(4)$$
Theoretical Definition of Reliability…
You may be wondering how we can compute a reliability coefficient if we don’t know the true scores of all the test takers. Fortunately, the answer is simple. There is another definition of reliability/precision that is mathematically equivalent to the formula that uses true score variance and observed score variance to calculate reliability. That definition is as follows: Reliability/precision is equal to the correlation between the observed scores on two parallel tests (Crocker & Algina, 1986).

$$\text{Reliability} = \rho_{XX'}$$
Theoretical Definition of Reliability…
The theoretical reliability coefficient is not practical; we do not know each
person’s true score. So, we cannot compute reliability. If we can’t compute
reliability, perhaps the best we can do is to estimate it.

Test-Retest Reliability
Alternate/Equivalent/Parallel Forms Reliability
Internal Consistency Reliability
Inter-Rater /Inter-Scorer /Scorer Reliability
Each traditional estimation method – test-retest, parallel (equivalent) forms,
and internal consistency – defines reliability somewhat differently; none is
isomorphic with the theoretical definition.
Test-Retest Reliability
The test-retest method for estimating reliability involves
administering the same test to the same group of individuals on
two different occasions and then correlating the two sets of
scores. When using this method, the reliability coefficient
indicates the degree of stability (consistency) of examinees'
scores over time and is also known as the coefficient of stability.

Now, to see how repeatable or consistent an observation is, we can measure it twice.
If we can't compute reliability, perhaps the best we can do is to estimate it. Maybe we can get an estimate of the variability of the true scores. How do we do that? Remember our two observations, X1 and X2? We assume that these two observations would be related to each other to the degree that they share true scores. So, let’s calculate the correlation between X1 and X2. Here’s a simple formula for the correlation:

$$\text{Correlation} = \frac{\text{Covariance}(X_1, X_2)}{SD_{X_1} \times SD_{X_2}}$$

Because X1 = T + E1 and X2 = T + E2 with uncorrelated errors, the covariance of X1 and X2 equals the variance of T; and if both observations have the same standard deviation, the denominator equals the variance of X:

$$\text{Correlation} = \frac{\text{Variance}(T)}{\text{Variance}(X)}$$

$$\therefore \text{Correlation} = \text{Reliability} \qquad \left[\because \frac{\text{Variance}(T)}{\text{Variance}(X)} = \text{Reliability}\right]$$
Alternate/Equivalent/Parallel Forms Reliability
The theoretical reliability coefficient is not practical; we do not know each person’s
true score. Nevertheless, we can estimate the theoretical coefficient with the sample
correlation between scores on two parallel tests. Assume that X and X′ are two
strictly parallel tests (for simplicity) – that is, tests with equal means, variances,
covariances with each other, and equal covariances with any outside measure. The
Pearson product-moment correlation between parallel tests produces an estimate of
the theoretical reliability coefficient:

$$r_{XX'} = \frac{\mathrm{Cov}(X, X')}{S_X S_{X'}} = \frac{\mathrm{Cov}(T + E,\; T + E')}{S^2_X} = \frac{S^2_T}{S^2_X} = \text{Reliability}$$

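The claim that the correlation between two parallel measurements recovers the ratio of true score variance to observed score variance can be illustrated with a small simulation. This is only a sketch under assumed values that are not from the slides: normally distributed true scores with SD = 2, independent normal errors with SD = 1 on each form, and a fixed random seed.

```python
import math
import random

rng = random.Random(42)   # fixed seed so the run is repeatable
n_people = 20000
true_sd, error_sd = 2.0, 1.0   # assumed values for the illustration

# Each person has one true score and two parallel observations of it.
true_scores = [rng.gauss(50, true_sd) for _ in range(n_people)]
form_1 = [t + rng.gauss(0, error_sd) for t in true_scores]
form_2 = [t + rng.gauss(0, error_sd) for t in true_scores]

def mean(values):
    return sum(values) / len(values)

def covariance(a, b):
    ma, mb = mean(a), mean(b)
    return sum((x - ma) * (y - mb) for x, y in zip(a, b)) / (len(a) - 1)

def correlation(a, b):
    return covariance(a, b) / math.sqrt(covariance(a, a) * covariance(b, b))

# Theoretical reliability: Var(T) / Var(X) = Var(T) / (Var(T) + Var(E)).
theoretical_reliability = true_sd ** 2 / (true_sd ** 2 + error_sd ** 2)
observed_correlation = correlation(form_1, form_2)

print(f"Theoretical reliability:     {theoretical_reliability:.3f}")
print(f"Correlation between forms:   {observed_correlation:.3f}")
```

With a large sample the correlation between the two forms should land close to the theoretical value of 0.80, even though the true scores are never used to compute it.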
• Kuder-Richardson 20 (KR-20):
• a special case of alpha
• applies only to dichotomous items

$$\mathrm{KR20} = \frac{k}{k - 1}\left(\frac{s^2_{Total} - \sum pq}{s^2_{Total}}\right)$$

where k is the number of items and pq is the variance of each dichotomous item: the proportion of individuals who pass the item (p) multiplied by the proportion who fail it (q).
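Calculate the Kuder-Richardson 20 reliability coefficient for the following scores on an achievement test, where 1 indicates a right answer and 0 a wrong answer.

| Item | Examinee A | Examinee B | Examinee C | Examinee D | Examinee E | p | q | pq |
|---|---|---|---|---|---|---|---|---|
| 1 | 1 | 1 | 0 | 1 | 1 | 0.8 | 0.2 | 0.16 |
| 2 | 1 | 0 | 0 | 0 | 0 | 0.2 | 0.8 | 0.16 |
| 3 | 1 | 1 | 1 | 1 | 1 | 1 | 0 | 0.00 |
| 4 | 1 | 1 | 1 | 0 | 0 | 0.6 | 0.4 | 0.24 |
| 5 | 1 | 0 | 1 | 1 | 0 | 0.6 | 0.4 | 0.24 |
| Total Score | 5 | 3 | 3 | 3 | 2 | | | 0.80 |

Mean of total scores = 3.2; variance of total scores = 1.2

$$S^2_{Total} = 1.2, \qquad \sum pq = 0.8$$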
2 

 X -X 
2

N σ 2  pq
2 2 p  Item Facility
X - 2 X X   X
2 
N q  Item Difficulty
2
X 2 2 X X X
2   
N N N
2
 X 0 or 1
X  ΣX  X
2   2  X   X X 2
N  N  N
2
NX
σ 2  p  2p 2 
N
σ 2  p  2p 2  p 2
σ 2  p(1  p) σ 2  pq
80
$$\alpha = \frac{k}{k - 1}\left(\frac{\sum s_{ij}}{s^2_{Total}}\right)$$

• $s^2_{Total}$ is the composite variance (if items were summed)
• $s_{ij}$ is the covariance between the ith and jth items, where i is not equal to j
• k is the number of items

| Items | Ria | Zia | Pia | Variance |
|---|---|---|---|---|
| 1 | 6 | 5 | 4 | 1.00 |
| 2 | 6 | 4 | 5 | 1.00 |
| 3 | 5 | 3 | 3 | 1.33 |
| 4 | 4 | 4 | 4 | .00 |
| 5 | 4 | 5 | 4 | .33 |
| Total | 25 | 21 | 20 | |

Variance of Total Scores = 7.0
Total of item variances = 3.67
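The slide gives the two ingredients but not the coefficient itself. The sketch below completes the calculation using the variance form of coefficient alpha, α = k/(k − 1) × (s²_Total − Σs²_i)/s²_Total, with N − 1 variances throughout; the variable names are illustrative.

```python
# Ratings from the table above: one row per item, one column per respondent (Ria, Zia, Pia).
items = [
    [6, 5, 4],
    [6, 4, 5],
    [5, 3, 3],
    [4, 4, 4],
    [4, 5, 4],
]

def sample_variance(values):
    """Variance with N - 1 in the denominator."""
    m = sum(values) / len(values)
    return sum((v - m) ** 2 for v in values) / (len(values) - 1)

k = len(items)
item_variances = [sample_variance(item) for item in items]
sum_item_variances = sum(item_variances)

# Total score per respondent (column sums) and its variance.
totals = [sum(column) for column in zip(*items)]
var_total = sample_variance(totals)

alpha = (k / (k - 1)) * (var_total - sum_item_variances) / var_total

print("Item variances:", [round(v, 2) for v in item_variances])
print(f"Totals: {totals}, variance of totals = {var_total:.2f}")
print(f"Sum of item variances = {sum_item_variances:.2f}, alpha = {alpha:.2f}")
```

For these three respondents the coefficient works out to about 0.60.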
A 3-item (X, Y, Z) psychological test is administered to 60 participants in Basic Psychometrics and the following variance-covariance matrix is obtained:

| | X | Y | Z |
|---|---|---|---|
| X | 55.83 | 29.52 | 30.33 |
| Y | 29.52 | 17.49 | 16.15 |
| Z | 30.33 | 16.15 | 29.06 |

The sum of all the item variances is 102.38
The sum of all the item covariances is 152.03
The variance of the Total Test Scores is 254.41

$$S^2_{Total} = 55.83 + 17.49 + 29.06 + 2(29.52 + 30.33 + 16.15)$$

$$\alpha = \frac{k}{k - 1}\left(\frac{s^2_{Total} - \sum s^2_i}{s^2_{Total}}\right) = \frac{3}{3 - 1}\left(\frac{254.41 - 102.38}{254.41}\right) = .8964$$

$$\alpha = \frac{k}{k - 1}\left(\frac{\sum s_{ij}}{s^2_{Total}}\right) = \frac{3}{3 - 1}\left(\frac{152.03}{254.41}\right) = .8964$$

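Both forms of alpha can be computed straight from the variance-covariance matrix. The sketch below is an illustration only; note that the matrix entries as printed (rounded to two decimals) sum to 254.38 rather than 254.41, presumably because the slide's totals come from unrounded values, so the result agrees with .8964 only to about three decimals.

```python
# Variance-covariance matrix for items X, Y, Z as printed on the slide.
cov_matrix = [
    [55.83, 29.52, 30.33],
    [29.52, 17.49, 16.15],
    [30.33, 16.15, 29.06],
]

k = len(cov_matrix)

# Diagonal entries are the item variances.
sum_item_variances = sum(cov_matrix[i][i] for i in range(k))

# Off-diagonal entries (both triangles) are the item covariances.
sum_item_covariances = sum(
    cov_matrix[i][j] for i in range(k) for j in range(k) if i != j
)

# The variance of the summed score is the sum of every entry in the matrix.
var_total = sum(sum(row) for row in cov_matrix)

alpha_variance_form = (k / (k - 1)) * (var_total - sum_item_variances) / var_total
alpha_covariance_form = (k / (k - 1)) * sum_item_covariances / var_total

print(f"Sum of variances = {sum_item_variances:.2f}")
print(f"Sum of covariances = {sum_item_covariances:.2f}")
print(f"Total variance = {var_total:.2f}")
print(f"Alpha (variance form) = {alpha_variance_form:.4f}")
print(f"Alpha (covariance form) = {alpha_covariance_form:.4f}")
```

The two forms are algebraically identical because the total variance minus the item variances is exactly the sum of the covariances.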
Models of Measurement

Just as routine tests to check for any violations of normality should be carried out, so equally should tests for
assumptions in reliability estimation be applied. Despite the seemingly obscure labels given to the models, all
are connected by four underlying and easily-described properties of a scale (e.g., see Graham, 2006). These
properties are as follows:
i) the extent to which each item measures the same underlying trait (unidimensionality);
ii) whether the true scores for different items have the same mean (sensitivity);
iii) whether the true scores for different items have the same variance; and
iv) whether the error variance is the same for each item.
Thus the degree to which one assumes either constancy or variability of properties ii) to iv) is what distinguishes
the essentially tau-equivalent from parallel or congeneric models.
Models of Measurement…

In CTT, measures of the same thing (e.g., items, subtests, or tests) can be classified by their levels of similarity.
In this section, I define four levels of similarity: parallel, tau-equivalent, essentially tau-equivalent, and
congeneric. Note that these levels are hierarchical in the sense that the highest level (parallel) requires the most
similarity, whereas levels lower in the hierarchy allow for less similarity in test properties. For example, parallel
measures must have equal true score variances, whereas congeneric measures do not require this. One useful
way of thinking about these levels is in terms of the relationships between the true scores of pairs of measures
(Komaroff, 1997). In CTT, the basic relationship between the true scores on two measures (ti and tj) is

t i a ij  bij t j
Models of Measurement…

A. Congeneric Model (Least Restrictive)

B. Essentially Tau-equivalent (More Restrictive)
C. Tau-equivalent Model (Even More Restrictive)
D. Parallel Model (Most Restrictive)
Models of Measurement…
A. Parallel Model (Most Restrictive)
B. Tau-equivalent Model (Less Restrictive)
C. Essentially Tau-equivalent (Even Less Restrictive)
D. Congeneric Model (Least Restrictive)
$$X_k = \eta + E \quad \ldots(A)$$

$$X_k = \eta + E_k \quad \ldots(B)$$

$$X_k = \eta + a_k + E_k \quad \ldots(C)$$

$$X_k = \lambda_k \eta + a_k + E_k \quad \ldots(D)$$
Assumptions in CTT Measurement Models

Congeneric Model

X k λ k η  a k   E k ...( A )
X is an observed score linearly related to a single latent trait η ,
λ is the slope representing units /scales of measurement,
a is the intercept representing origin of scale, E is the residual (error term), and subscript k is the item
in question.
The least restrictive measurement model referred to as congeneric model assumes
that a group of observed items
1.measure the same latent trait
2.can measure the latent trait on different scales/units (the slopes λk can be

different)
3.can measure the latent trait with different degrees of precision (dissimilar scale
origins—the intercepts ak can be different)

4.can measure the latent trait with different amounts of error (Var(Ek) can be

different).

Factor Analysis (typically) uses the Congeneric Measurement Model (Raykov, 1997a).

91
Essentially Tau-equivalent Model

$$X_k = \eta + a_k + E_k \quad \ldots(B)$$

A more restricted case of congeneric measures, referred to as essentially tau-equivalent measures, occurs when only the first condition is in place:
1. λ1 = λ2 = λ3 (all slopes are equal).

As variables that differ by a constant have equal variances, one can say that essentially tau-equivalent measures have equal true score variances but unequal error variances.

Tau-equivalent Model
$$X_k = \eta + E_k \quad \ldots(C)$$

An even more restricted case of congeneric measures, referred to as tau-equivalent measures, occurs when only the first two conditions are in place:
1. λ1 = λ2 = λ3 (all slopes are equal); and
2. a1 = a2 = a3 (all intercepts are equal).

Parallel Model

$$X_k = \eta + E \quad \ldots(D)$$

The most restricted case of congeneric measures, referred to as parallel measures, occurs when the following three conditions are in place:
1. λ1 = λ2 = λ3 (all slopes are equal);
2. a1 = a2 = a3 (all intercepts are equal); and
3. Var(E1) = Var(E2) = Var(E3) (all error variances are equal).

Thus, parallel measures have the same units of measurement, scale origins, and error variances.

Setting a Confidence Interval
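$$\mathrm{Var}_X = \mathrm{Var}_T + \mathrm{Var}_E$$

$$\mathrm{Var}_E = \mathrm{Var}_X - \mathrm{Var}_T$$

$$SD_E = \sqrt{\mathrm{Var}_X - \mathrm{Var}_T} = \sqrt{\mathrm{Var}_X\left(1 - \frac{\mathrm{Var}_T}{\mathrm{Var}_X}\right)}$$

$$SD_E = SD_X\sqrt{1 - R}$$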
Setting a Confidence Interval…
FIVE Steps:
1. Estimate True Score (T) = (R)(X), where R = Reliability and X = Observed score
2. Calculate the Standard Error of Measurement (SE): $SE = SD_X\sqrt{1 - R}$
3. Find the z value associated with the confidence level
4. Multiply SE by z
5. Confidence Interval (CI) = T ± (SE × z)
Setting a Confidence Interval…
Suppose a score of 120 on an IQ test is obtained, and the test
has a reliability of 0.95 and a standard deviation of 12.
Calculate the 95% confidence Interval of the estimated true
score.

Setting a Confidence Interval…
• The estimated true score is 0.95 × 120 = 114, and the standard error of measurement is SE = 12 × √(1 − 0.95) = 2.68.
• Then, if I want to be 95% confident that the true score will fall within a certain range, the z value associated with the 95% confidence interval (1.96) is multiplied by the SE (1.96 × 2.68 = 5.25). This value is then added to and subtracted from the estimated true score.
• This leads to the following two equations:
• 114 + 5.25 = 119.25 and
• 114 – 5.25 = 108.75

Setting a Confidence Interval …

Thus, I can be 95% confident that if the test were administered to the test taker 100 times, 95 times out of 100 the true score would fall between 108.75 and 119.25.

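The five steps can be collected into a short function. This is a minimal sketch that follows the slides' simplified estimate T = R × X (a regression-based estimate would also use the group mean, which the example does not give); the function name `true_score_interval` is illustrative.

```python
import math

def true_score_interval(observed, reliability, sd, z=1.96):
    """Follow the five steps: estimate T, compute SE, scale by z, build the interval."""
    estimated_true = reliability * observed        # Step 1: T = R * X (slides' formula)
    se = sd * math.sqrt(1 - reliability)           # Step 2: SE = SD_X * sqrt(1 - R)
    margin = z * se                                # Steps 3 and 4: z value times SE
    lower, upper = estimated_true - margin, estimated_true + margin   # Step 5
    return estimated_true, se, (lower, upper)

# IQ example from the slides: X = 120, R = 0.95, SD = 12, 95% confidence (z = 1.96).
estimated_true, se, (lower, upper) = true_score_interval(120, 0.95, 12)

print(f"Estimated true score = {estimated_true:.2f}")
print(f"Standard error of measurement = {se:.2f}")
print(f"95% CI = ({lower:.2f}, {upper:.2f})")
```

Carrying full precision gives limits of about 108.74 and 119.26; the slide's 108.75 and 119.25 come from rounding SE to 2.68 before multiplying by 1.96.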
THANK YOU

Cooper&Schindler Chap7
100% (1)
Cooper&Schindler Chap7
23 pages
Test and Measurement
No ratings yet
Test and Measurement
23 pages
PSYCHOLOGICAL ASSESSMENT
No ratings yet
PSYCHOLOGICAL ASSESSMENT
6 pages
e423bca7-9177-4d09-baaa-9b4c03c5563c
No ratings yet
e423bca7-9177-4d09-baaa-9b4c03c5563c
13 pages
Measurement Levels and Stat Tool
No ratings yet
Measurement Levels and Stat Tool
50 pages
Basic Concepts of Quantitative Research
100% (1)
Basic Concepts of Quantitative Research
38 pages
Module 3: Principles of Psychological Testing: Central Luzon State University
No ratings yet
Module 3: Principles of Psychological Testing: Central Luzon State University
14 pages
Statistics and Probability: 11 - Athena Reporter #2
No ratings yet
Statistics and Probability: 11 - Athena Reporter #2
31 pages
3 Assumptions Stat
No ratings yet
3 Assumptions Stat
66 pages
REM 1 Concept Measurement 1 1 1
No ratings yet
REM 1 Concept Measurement 1 1 1
24 pages
Psychological Assessment (Midterms)
No ratings yet
Psychological Assessment (Midterms)
20 pages
Unit 2 Measurement & Scaling
No ratings yet
Unit 2 Measurement & Scaling
91 pages
PSYTEST
No ratings yet
PSYTEST
33 pages
Measurement and Scaling
No ratings yet
Measurement and Scaling
40 pages
Module 3 - Measurement and Scaling
No ratings yet
Module 3 - Measurement and Scaling
65 pages
Measurement and Scaling Techniques
No ratings yet
Measurement and Scaling Techniques
16 pages
BBA 4 RM Unit 4
No ratings yet
BBA 4 RM Unit 4
99 pages
Guidance Reviewer
No ratings yet
Guidance Reviewer
46 pages
Study Guide For Mid Term
No ratings yet
Study Guide For Mid Term
9 pages
PsychAss Reviewer
No ratings yet
PsychAss Reviewer
18 pages
Measurement and Scaling Techniques: Presented By: Sem Shaikh
No ratings yet
Measurement and Scaling Techniques: Presented By: Sem Shaikh
18 pages
Research Methodology on 8-2-19
No ratings yet
Research Methodology on 8-2-19
55 pages
Chapter 3 A Statistical Refresher
No ratings yet
Chapter 3 A Statistical Refresher
8 pages
Lesson 1 Introduction To Statistics
No ratings yet
Lesson 1 Introduction To Statistics
44 pages
CC02 - PA - Norma and Basic Statistics
No ratings yet
CC02 - PA - Norma and Basic Statistics
7 pages
Unit 3 BRM PDF
No ratings yet
Unit 3 BRM PDF
6 pages
PSYCHOLOGICAL ASSESSMENT CHAPTER 3-6 (Summary)
No ratings yet
PSYCHOLOGICAL ASSESSMENT CHAPTER 3-6 (Summary)
11 pages
TOPIC 2
No ratings yet
TOPIC 2
15 pages
01 PPT Psych. Statistics
No ratings yet
01 PPT Psych. Statistics
32 pages
Measurement: Quantifying The Dependent Variable
No ratings yet
Measurement: Quantifying The Dependent Variable
17 pages
Introduction To Non Parametric Methods Through R Software
From Everand
Introduction To Non Parametric Methods Through R Software
Editor IJSMI
No ratings yet
Overall Descriptive Statistics
No ratings yet
Overall Descriptive Statistics
127 pages
8438 Ecap792 Data Science Toolbox
No ratings yet
8438 Ecap792 Data Science Toolbox
317 pages
Practical Research 2 Week 1 q2
0% (1)
Practical Research 2 Week 1 q2
9 pages
R Mcqs
100% (1)
R Mcqs
34 pages
Block 1 FEG 2 Unit 1
No ratings yet
Block 1 FEG 2 Unit 1
10 pages
MODULE 1-Introduction To Statistical Concept
No ratings yet
MODULE 1-Introduction To Statistical Concept
46 pages
Elementary Statistical Concepts
No ratings yet
Elementary Statistical Concepts
1 page
Attitude Scales - Rating Scales To Measure Data PDF
No ratings yet
Attitude Scales - Rating Scales To Measure Data PDF
10 pages
RMPS Mock Exam (Quiz Questions)
No ratings yet
RMPS Mock Exam (Quiz Questions)
11 pages
Activity Sheet Week 3 Variables - ZAPATA
No ratings yet
Activity Sheet Week 3 Variables - ZAPATA
3 pages
EBS 234 Assessment in Basic Schools
No ratings yet
EBS 234 Assessment in Basic Schools
92 pages
Course Code: Caec 3A Course Title: College: Authors: Title of The Learning Resource
No ratings yet
Course Code: Caec 3A Course Title: College: Authors: Title of The Learning Resource
30 pages
Math2101Stat 2 2
No ratings yet
Math2101Stat 2 2
23 pages
Cronbachs Alpha PDF
No ratings yet
Cronbachs Alpha PDF
8 pages
Variables
No ratings yet
Variables
11 pages
Chapter-13 Measurement and Scaling Concepts
No ratings yet
Chapter-13 Measurement and Scaling Concepts
14 pages
MSW Master of Social Work
No ratings yet
MSW Master of Social Work
62 pages
A Comparative Study of Categorical Variable Encoding Techniques
No ratings yet
A Comparative Study of Categorical Variable Encoding Techniques
4 pages
CH 7
No ratings yet
CH 7
26 pages
The Variables in Research
100% (2)
The Variables in Research
4 pages
Statistics
No ratings yet
Statistics
18 pages
University Updates: Financial Management
No ratings yet
University Updates: Financial Management
10 pages
Chapter 11 Quantitative Data
No ratings yet
Chapter 11 Quantitative Data
25 pages
Statistics Quarter1 Week 3
No ratings yet
Statistics Quarter1 Week 3
9 pages
DCM Year Ii Sem. Ii Course Content Outline Teacher Guide
No ratings yet
DCM Year Ii Sem. Ii Course Content Outline Teacher Guide
20 pages
Hair 4e IM Ch07
No ratings yet
Hair 4e IM Ch07
25 pages
The Mann-Whitney U-Test - : Analysis of 2-Between-Group Data With A Quantitative Response Variable
No ratings yet
The Mann-Whitney U-Test - : Analysis of 2-Between-Group Data With A Quantitative Response Variable
4 pages
SPSS Advance Statistics Session 1 RCD DR Muhammad Khan Asif
No ratings yet
SPSS Advance Statistics Session 1 RCD DR Muhammad Khan Asif
55 pages