Mean, Mode Median

Uploaded by

Pardeep Kumar
Copyright
© © All Rights Reserved

UNIT 16 INTERPRETING TEST RESULTS

Structure

16.1 Introduction
16.2 Objectives
16.3 Meaning of Statistics
16.3.1 Descriptive Statistics
16.3.2 Inferential Statistics
16.4 Need for Statistics
16.5 Meaning and Need of Central Tendency
16.5.1 Mean, Median and Mode
16.5.2 Using Mean, Median and Mode
16.6 Meaning of Variability
16.6.1 Measures of Variability
16.6.2 Mean Deviation and Standard Deviation
16.6.3 Derived Scores
16.7 Correlation: Meaning and Interpretation
16.8 Let Us Sum Up
16.9 Answers to Check Your Progress
16.10 Suggested Readings

16.1 INTRODUCTION

If you evaluate a large number of students, say 1000, with the help of an
achievement test, you will have 1000 different numerical scores (known as
raw scores). From this long list of numbers alone, you cannot draw any
conclusion about the achievement of your students or about the progress of the
class as a whole. In order to make these 'raw scores' meaningful, you have to
arrange them in a systematic manner to enable yourself to say something about
the overall achievement of the students as a group. The process of organizing
data, in order to draw meaningful conclusions, is known as 'statistics'.

Statistics is a set of procedures used for describing, synthesizing, analyzing and
interpreting quantitative data or observations. These procedures enable us to
express a large number of numerical scores in terms of single numbers which
help us to compare groups of students on one or several variables or traits. A
basic knowledge of statistics is essential for every practising teacher so as to
understand how data are selected, gathered, organized and analyzed, and
what conclusions can be validly drawn from the analysis of the data.

In order to be successful as a teacher, you have to motivate your students to
learn, identify their strengths and weaknesses, and adjust your teaching
methods according to the capabilities and motivational levels of your students.
You have to evaluate the progress of your students periodically to ensure that
proper learning takes place and instructional objectives are successfully
realized. You also have to inform students of their progress to motivate them.

All these responsibilities make it essential for you as a teacher to
have a basic knowledge of statistics and its uses in the teaching-learning and
evaluation process. We shall discuss analyzing and interpreting data in this
unit. Though you have studied basic statistics in your preservice education
programme, we want to bring those concepts to the forefront so that other
units in this course can be understood properly. Moreover, you should be able
to interpret and use data in your classroom.

16.2 OBJECTIVES

After studying this unit, you should be able to:

• differentiate between descriptive and inferential statistics,
• discuss the role of educational statistics in the teaching-learning process and
  evaluation,
• compute mean, median and mode for a given set of scores and explain their
  uses,
• explain the meaning of 'variability' and name some measures of
  variability,
• compute mean deviation and standard deviation for a given set of scores
  and interpret them,
• differentiate between mean deviation and standard deviation with the help
  of examples,
• compute deviation scores, Z-scores and T-scores for a given set of raw
  scores, and discuss their uses by giving suitable examples, and
• compute the correlation coefficient for given data and interpret the results.

16.3 MEANING OF STATISTICS


Statistics has a long history. In the beginning, it was perhaps used by ancient
chiefs who counted the number of effective warriors they had or the number of
warriors needed to defeat the enemy. Later on, statistics was used by kings or
rulers to estimate judiciously how much they could collect as taxes from the
people they ruled. Thereafter, it was used to report death rates and birth rates and
to make a study of natural resources. These uses of statistics indicate that the
term has been derived from the word 'state' or perhaps from 'state arithmetic'.
They also indicate that, in earlier times, it was put to a descriptive use.

The concept of probability emerged when gamblers asked mathematicians to
develop principles that would improve the chances of winning at cards and
dice. Later on, their principles were used in astronomy, psychology, and
education. In social and behavioural sciences, statistics is used as a set of
methods for describing and interpreting quantitative information, including
techniques for organizing and summarizing data and for making
generalizations and inferences. It is an effective tool in the hands of a teacher
to study the behaviour of his/her students.

16.3.1 Descriptive Statistics

Generally, statistics is related to the study of characteristics of a population.
Most of the time the population is so large that it is difficult or impossible to
study every individual unit or member of the population. Therefore, what the
statistician does is to draw a small part of the population as a sample, study it
carefully and infer from it the characteristics of the population as a whole.
These types of studies are normally called sample surveys or sample studies.
Thus, a sample study has two steps:

i) selection and study/analysis of the sample,
ii) drawing inferences from the sample about the population.

It is with the first step that 'descriptive statistics' is concerned. Descriptive
statistics is used to describe the sample in hand. By using descriptive statistics,
data are described or summarized in terms of a small number of indices. For
example, the performance level of a group of 200 students is indicated in terms
of a 'mean score', a single index. Similarly, the variability of scores may be
expressed in terms of 'mean deviation' or 'standard deviation', which are single
indices that express the nature of large data. Such indices, computed for samples,
are known as 'statistics'. But, when these are used for the whole population,
these are known as 'parameters'.

Generally, four different types of descriptive statistics are used:

i) measures of central tendency,
ii) measures of variability,
iii) measures of relative position, and
iv) measures of association/relationship.

The measures of central tendency such as mean, median and mode are used to
determine the 'typical' or average score for a group, whereas the measures of
variability, such as standard deviation, indicate how the scores are spread about
the central or typical value. On the other hand, the measures of relative
position show how an individual performs in relation to other persons in the
group and the measures of association indicate how two sets of scores for the
same group of persons are related. These measures are used to study the scores
based on samples drawn from large populations and indicate the characteristics
of the samples.

16.3.2 Inferential Statistics

As mentioned in an earlier section of this unit, sample values such as mean,
mode, standard deviation, etc., are known as 'statistics' and the corresponding
values for the population are referred to as 'parameters'. The major function of a
statistical inquiry is to infer population characteristics from the knowledge of
sample characteristics. Alternatively, we can say that we wish to estimate
'parameters' from 'statistics'. If the mean score based on a sample is 60, the
question of interest is whether the mean score for the population (parameter) is
different from it or not. The part of statistics dealing with drawing of
inferences about the population from the known values for the sample is
referred to as 'inferential statistics'. Inferential statistics involves
procedures used for estimating or predicting the properties of the population
when sample characteristics are known.

You may use the ideas of statistical inference very frequently. For example, if
you want to make a statement about the mean IQ in a complete population of
students in a particular institution, you draw a representative sample,
administer an intelligence test to the students included in the sample and then
compute the mean IQ of the sample. Thereafter, you may use this sample mean
for estimating the mean IQ of the population. The procedures of statistical
inference involve the use of many concepts such as 'standard error',
'sampling distribution' and 'levels of significance' which are beyond the scope
of the present discussion.
The procedure of statistical inference also involves establishing the
accuracy of sample statistics, such as the mean, as estimates of the population
parameters.

16.4 NEED FOR STATISTICS

Study of statistics is an essential component of most of the courses in
Education and Psychology. This underlines the significance of statistics for
proper understanding of basic concepts and principles of these disciplines.
Even in the teacher training programmes, statistics is taught as a compulsory
component. While studying literature on research in teaching, teachers come
across tables, charts, graphs and figures representing data. In order to grasp
the essential elements of such literature, every teacher must have a basic
knowledge of statistics.
Educators and social scientists collect several kinds of data with the help of
educational and psychological measurement devices. In order to make a correct
analysis and interpretation of those data, you require some knowledge of
statistics. For example, if you have obtained a set of scores after administering
a test, you may be interested: (i) to have an idea of the 'typical' performance
level of your students, for which you may use an appropriate
measure of central tendency such as the mean; (ii) to know how test scores are
distributed around the 'typical' value, for which you may use measures of variability;
(iii) to use tables, graphs or diagrams to portray clearly the nature of your class
as far as performance on the given test is concerned; (iv) to transform your raw
scores into more meaningful forms known as derived scores so as to make a
better interpretation of students' performance, information you may also use
for improving your teaching method; and (v) to use statistical
methods in guiding and counseling your students for further studies or for
employment purposes.

Check Your Progress 1

Notes: a) Write your answer in the space given below.

b) Compare your answer with that given at the end of the unit.

1. Define statistics, descriptive statistics and inferential statistics.

.........................................................................
.........................................................................
.........................................................................
16.5 MEANING AND NEED OF CENTRAL TENDENCY
It is commonly observed that measurements based on large groups or samples
have wide variation. If you measure the heights of plants of a certain type in a
garden, you will observe that heights vary widely from very small (short) plants to
very tall plants. However, 'very short' and 'very tall' plants are relatively small
in number and the majority of heights are spread around the 'average' value, some
being closer while others are farther. Similarly, if you consider the heights
of 15-year-old boys, you will observe that exceptionally short and
exceptionally tall boys are small in number. The heights of a large majority are
concentrated around the mean. This 'tendency' is common to many kinds of
measurements.

The tendency of scores or measurements to centre or concentrate around the
central/mean value is known as 'central tendency'. The central tendency helps
in describing huge data in terms of fewer 'typical' scores such as mean,
median and mode. The property of central tendency is very useful in statistical
analysis, especially in descriptive and inferential statistics. It enables us to
describe the overall level of performance of a group on a psychological test.
Whenever performance of two groups is compared, comparison is made in
terms of measures of central tendency. For example, if the mean score of group A
on a vocabulary test is higher than that of group B, it may be safely concluded
that students of group A are better than those of group B on the trait being
measured by the test. The 'central tendency' is, thus, a very important concept.

16.5.1 Mean, Median and Mode

As mentioned earlier, mean, median and mode are measures of central
tendency. Now, we shall discuss how these measures are defined and
computed. The number resulting from the computation of a measure of central
tendency represents the overall performance level of a group as a whole. The
three most frequently used indices of central tendency, viz., mean, median and
mode, are appropriate for different scales of measurement. Since most
measurements in Education and Psychology represent an interval scale, the
mean is the most frequently used measure of central tendency. Therefore, we
start our discussion with the mean.

Mean: The mean is the arithmetic average of the scores. It is calculated by
adding up all the raw scores and dividing the sum by the number of scores.
For example, if 10 students are given a test and the scores they obtain are as
follows:

10, 12, 13, 15, 16, 16, 17, 20, 20, 21

Then the mean (M) of these scores may be computed as:

$$M = \frac{10 + 12 + 13 + 15 + 16 + 16 + 17 + 20 + 20 + 21}{10} = \frac{160}{10} = 16$$


In general, if the scores of N students are represented as $X_1, X_2, X_3, \ldots, X_N$,
then the mean may be obtained as:

$$M = \frac{X_1 + X_2 + X_3 + \cdots + X_N}{N}$$

Symbolically, this expression may be written as:

$$M = \frac{\sum X}{N}$$

where $\sum X$ indicates the sum of scores and N represents the number of
measures or scores or students. The computational procedure for the mean of
grouped data is slightly more complex, but the basic idea is the same. All methods of
computation, in effect, add the scores and divide their sum by their number.
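
The computation can be sketched in a few lines of Python. This is only an illustration: the function name `mean` is our own choice, and the data are the ten scores tabulated in Section 16.6.2.

```python
def mean(scores):
    """Arithmetic mean: the sum of the scores divided by their number (M = sum(X) / N)."""
    return sum(scores) / len(scores)


# The ten raw scores tabulated in Section 16.6.2.
scores = [10, 12, 13, 15, 16, 16, 17, 20, 20, 21]

print(sum(scores))   # 160
print(mean(scores))  # 16.0
```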

By the very nature of the way in which the mean is computed, it is based on each
and every score. If some of the scores in the set are very small or very large as
compared to the rest, the mean is drastically shifted towards the extreme
scores. However, in general, the mean is a preferred measure of central tendency.
It is a more precise and stable index as compared to median and mode. If a
number of samples of the same size are selected from a population, the means
of those samples will be closer to one another in comparison to the modes or
the medians.

In the above example, if the mean is subtracted from each of the scores and the
resulting differences are added algebraically, the result will be zero. In the
following table, column (1) shows the raw scores (X), column (2) shows the
numbers obtained after subtracting the mean from the raw scores (x), and column (3)
shows the squares (x²) of these differences.

(1) Raw Score X    (2) Deviation x = X - M    (3) Squared Deviation x²
      10                    -6                          36
      12                    -4                          16
      13                    -3                           9
      15                    -1                           1
      16                     0                           0
      16                     0                           0
      17                    +1                           1
      20                    +4                          16
      20                    +4                          16
      21                    +5                          25
   ΣX = 160               Σx = 0                    Σx² = 120

It can be observed that the sum of entries in column (2) is zero, and that of
those in column (3) is 120.
These results lead to certain important conclusions. Column (2) shows the
deviation of each raw score from the mean. These numbers are called
'deviation scores'. The deviation scores may be negative as well as positive
and indicate how much each raw score is above or below the mean. As shown
in the example, the sum of the deviation scores around the mean is zero. This
leads to a new definition of mean which states that mean is "that point about
which the sum of deviations is zero". Here, the sum of deviations on one side
of the mean equals the sum of deviations on the other side. The mean,
therefore, acts as a "fulcrum" of the distribution.

Column (3) shows each deviation squared, and the sum of these squares is 120. It
can be easily shown that the sum of squared deviations about any point other than
the mean will always be more than 120. This leads to still another definition of mean as
"that point in the distribution about which the sum of squares of deviations is
at a minimum". These properties of mean help in further study of statistics.
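
Both properties can be checked numerically. The sketch below is illustrative only: the helper name `sum_sq_dev` and the trial points 15 and 17 are our own choices, and the data are the ten scores tabulated in Section 16.6.2.

```python
scores = [10, 12, 13, 15, 16, 16, 17, 20, 20, 21]
m = sum(scores) / len(scores)  # mean of the scores, 16.0


def sum_sq_dev(point):
    """Sum of squared deviations of the scores about an arbitrary point."""
    return sum((x - point) ** 2 for x in scores)


deviations = [x - m for x in scores]

print(sum(deviations))   # 0.0   -> deviations about the mean cancel out
print(sum_sq_dev(m))     # 120.0 -> sum of squares about the mean
print(sum_sq_dev(15))    # 130   -> larger about any other point
print(sum_sq_dev(17))    # 130
```

The two trial points give 130 because moving a distance d away from the mean adds N × d² (here 10 × 1²) to the sum of squares.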

Median: In order to understand the calculation of median, let us consider the
scores of 11 students in a class which are given below:

If we arrange these scores in increasing order of magnitude, we get the
following order:

It can be observed that 21 is such a point which divides the distribution into
two equal parts because 5 scores are below it and 5 scores are above it. Such a
point on the scale of measurement, above and below which an equal number of
cases or scores fall, is known as the 'median'. It should be noted that the median is not
necessarily one of the given scores. It is a point on the scale which divides the
group into two equal halves.

The concept of median may be compared with the median of a triangle which
divides the triangle into two equal parts by area. Here, area is comparable to
the size of the group. If the number of scores or students is odd, the median is
the middle score as shown in the above example. On the other hand, when the
number of scores is even, the median is the mean of the middle two scores. If
in the above example the last score 27 is deleted, the median (Md) may be
computed as follows:

If two or more scores in the distribution are repeated and the repetition takes
place near the median, the median is computed by the method of
interpolation. For example, in the following set of scores the score 15 occurs
thrice near the median.

13, 14, 15, 15, 15, 16, 16, 17

By definition, the median of the above scores should fall between the second and
third occurrences of 15. The real limits of the score 15 are 14.5 and 15.5, that is,
the score 15 represents the interval 14.5 - 15.5. All three repeated scores
of 15 cover this range and are distributed uniformly over it. This means that 2
of the 3 scores of 15 will occupy 2/3 of the interval 14.5 - 15.5, which has
a width of 1 score point. This shows that a width of 0.67 (= 1 x 2/3) is covered by
two repeated scores of 15. The median, therefore, is given by:

$$Md = 14.5 + 0.67 = 15.17$$

The logic may be extended to the computation of median for grouped data.
Consider the following distribution:

Score (X)    Frequency (f)    Cumulative Frequency (F)
   19             1                    40
   18             3                    39
   17             7                    36
   16            12                    29
   15             8                    17
   14             5                     9
   13             4                     4

In this case, the first column presents raw scores, second column frequencies
and the third column cumulative frequencies. The cumulative frequency of a
score means the total number of scores falling below its upper real limit. For
example, the cumulative frequency of the score 16 is 29. This means that 29
scores (out of 40) fall below 16.5.

In finding the median of the given distribution, we are interested in the point
below which 20 scores (half of 40) fall. We can see that 17 scores have been
covered up to the upper real limit (15.5) of the score 15. This means that our
median (Md) is more than 15.5, or we can say that

Md = 15.5 + something

As 17 scores have been covered up to 15.5, we are interested in the remaining
3 scores which fall above it. These 3 scores fall in the score 16, which has a
frequency of 12 scores evenly distributed in it. That is, these 12 scores cover
the score range of 1 point. As these are evenly distributed, 3 of them will
cover

$$\frac{1 \times 3}{12} = 0.25 \text{ points}$$

Therefore, the median is given by:

$$Md = 15.5 + 0.25 = 15.75$$

If we analyse the procedure given above we can say that:

$$Md = 15.5 + \frac{\frac{40}{2} - 17}{12} \times 1$$

Here, 40 is the size of the group (N), 17 is the cumulative frequency (F) of the
score next lower than the score which contains the median, 12 is the frequency (f)
of the score containing the median and 1 is the class interval (i) of the score that
contains the median. Therefore, the general equation for the median may be written
as:

$$Md = L + \frac{\frac{N}{2} - F}{f} \times i$$

where L is the real lower limit of the score that contains the median.
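
The interpolation procedure can be sketched in Python as follows. This is a minimal illustration, not a general-purpose routine: the function name `interpolated_median` is our own, and the code assumes integer scores, so that each score stands for an interval of width i = 1 with real limits score ± 0.5.

```python
def interpolated_median(freq):
    """Median by linear interpolation, Md = L + ((N/2 - F) / f) * i.

    `freq` maps each integer score to its frequency. Every score is assumed
    to represent the unit interval (score - 0.5, score + 0.5), so i = 1.
    """
    n = sum(freq.values())
    half = n / 2
    below = 0  # F: number of scores below the current score's lower real limit
    for score in sorted(freq):
        f = freq[score]
        if below + f >= half:      # this score contains the median
            lower = score - 0.5    # L: real lower limit
            return lower + (half - below) / f * 1
        below += f


# Frequency distribution from the table above (N = 40).
table = {13: 4, 14: 5, 15: 8, 16: 12, 17: 7, 18: 3, 19: 1}
print(interpolated_median(table))           # 15.75

# The eight scores 13, 14, 15, 15, 15, 16, 16, 17 with 15 repeated thrice.
ties = {13: 1, 14: 1, 15: 3, 16: 2, 17: 1}
print(round(interpolated_median(ties), 2))  # 15.17
```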

It should be noted that the median is only the mid-point of the scores and its
computation does not take into account each and every score. Unlike the mean,
it is not affected by extremely high or extremely low scores. Two quite
different sets of scores may have the same median. If one or more of the scores
at the upper or lower ends of the distribution are changed, the median is not
affected. It is an appropriate measure of central tendency when data represent
an ordinal scale, that is, when the data are available in the form of ranks,
ordered on some continuum in a series ranging from lowest to highest
according to the characteristic we wish to measure. However, for a distribution
representing an interval scale but having some extreme scores, the median may be a
more appropriate measure of central tendency than the mean.

Mode: Mode is the third measure of central tendency, which is used when data
represent a nominal scale. It is defined as that value of score which occurs
most frequently. For example, in the scores

13, 14, 15, 15, 15, 16, 16, 17

the score 15 occurs thrice and 16 occurs twice, and all other scores 13, 14 and
17 occur only once each. Therefore, by definition 15 is the mode of this set of
scores. When data have been arranged into a frequency distribution, the
mode is the mid-point of the class interval having the highest frequency.
Sometimes a distribution may have two modes. If in a set of scores two scores
have equal and highest frequency, then both of these scores are modes. Such a
distribution is known as a bimodal distribution. Similarly, there can be
multi-modal distributions also.
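
A short Python sketch of this definition is given below. The function name `modes` is our own, and the second data set is a hypothetical one made up only to show a bimodal case.

```python
from collections import Counter


def modes(scores):
    """Return every score that occurs with the highest frequency."""
    counts = Counter(scores)
    top = max(counts.values())
    return sorted(score for score, count in counts.items() if count == top)


# 15 occurs thrice, 16 twice, the rest once: a single mode.
print(modes([13, 14, 15, 15, 15, 16, 16, 17]))  # [15]

# Hypothetical set in which 15 and 16 both occur twice: a bimodal distribution.
print(modes([13, 14, 15, 15, 16, 16, 17]))      # [15, 16]
```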

16.5.2 Using Mean, Median and Mode

The three measures of central tendency are used in different situations. The
essential difference between mean and median is that while the mean is based on
all the raw scores, the median is a point on the scale dividing the number of
persons or scores into two equal halves. The following data will clarify this
point:

Sl. No.    Scores                 Mean    Median
1.         10, 12, 14, 16, 18     14      14

In the above data, it can be observed that the mean is affected by a change of scores
at the extremes while the median is not. Therefore, in a distribution where extreme
scores exist, the median is the appropriate measure of central tendency rather than
the mean. In the distributions (2) and (3), median is the appropriate measure of
central tendency while in distribution (1) mean represents the central tendency.
Mode is seldom used. Its computation is easy, but it is highly unstable and
may change with a minor shift in the frequencies from one interval to another.
However, there are situations in which only the mode can be used. For example,
if a shoe company wants to know which size of shoe it should produce more, it
would use mode as a measure of central tendency. The most frequently sold
size of shoes is the mode.
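
The effect of an extreme score can be illustrated with Python's standard `statistics` module. Distribution (1) is taken from the table above; the second list is a hypothetical variant of it, made up for this illustration by replacing the highest score 18 with 90.

```python
import statistics

dist_1 = [10, 12, 14, 16, 18]    # distribution (1) from the table
extreme = [10, 12, 14, 16, 90]   # hypothetical: highest score replaced by 90

for name, data in (("distribution (1)", dist_1), ("with extreme score", extreme)):
    print(name, "mean =", statistics.mean(data), "median =", statistics.median(data))
```

The mean rises from 14 to 28.4 while the median stays at 14, which is why the median is preferred when extreme scores are present.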

Check Your Progress 2

Notes: a) Write your answers in the space given below.

b) Compare your answers with those given at the end of the unit.

1. State the meaning of central tendency. What is the
difference between mean, median and mode?
.........................................................................
........................................................................
........................................................................
2. Compute mean, median and mode of the following
distribution of scores:

12, 18, 15, 17, 16, 17, 12, 11, 18, 20, 19, 15, 16, 17, 14, 17,
13, 13, 11, 19, 18, 17, 14, 20, 15, 17, 16, 14, 19, 20.

........................................................................
........................................................................
........................................................................

16.6 MEANING OF VARIABILITY

It can be easily seen that a measure of central tendency is not sufficient to
describe a distribution of scores completely. Let us consider the following two
sets of data:

Set A: 44  46  48  50  52  54  56

Set B: 20  30  40  50  60  70  80

It may be observed that the mean and median of both the sets are the same (50),
but the two distributions are obviously different from each other. In set A
scores are very close to the mean while in set B, scores are spread apart in
relation to the mean. This shows that in addition to central tendency, it is also
important to know how scores are spread about the central value. The
tendency of the scores to spread about the central value is known as
'variability'. It can be seen that the variability of set B is more than that of set
A. Thus, you will agree that there is a need for having measures that indicate
how scores are spread out.

16.6.1 Measures of Variability

There are several statistics that serve this purpose. These are: range, mean
deviation, standard deviation and quartiles. Range is simply the difference
between the highest and the lowest scores in a distribution. In set A, the range is
56 - 44 = 12, while in set B it is 80 - 20 = 60. Though the range can be quickly
calculated, it is a highly unstable measure, like the mode. However, it gives a rough
estimate of variability in a given situation. Among other measures, standard
deviation is the most useful. Since mean deviation and standard deviation are
closely related, we will discuss them in detail.
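
The comparison of sets A and B can be reproduced with a short Python sketch. The helper name `describe` is our own; it simply returns the mean, the median and the range of a set of scores.

```python
import statistics


def describe(scores):
    """Return (mean, median, range) for a list of scores."""
    return (statistics.mean(scores),
            statistics.median(scores),
            max(scores) - min(scores))


set_a = [44, 46, 48, 50, 52, 54, 56]
set_b = [20, 30, 40, 50, 60, 70, 80]

print("Set A:", describe(set_a))  # same mean and median as set B, range 12
print("Set B:", describe(set_b))  # same mean and median as set A, range 60
```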

16.6.2 Mean Deviation and Standard Deviation

While discussing the mean in Section 16.5.1, we defined the deviation score x
obtained by subtracting the mean from each of the raw scores. That is:

x = X - M, where X is a raw score.

We also showed that the sum of deviation scores for given data is zero. If we
consider all the deviation scores as positive, their sum will not be zero. Let us
reproduce the earlier example in the form of the given table. It can be observed
that when deviation scores are added irrespective of their algebraic signs, their
sum is 28. We obtain a measure of variability by dividing 28 by 10, which is
2.8. As this measure is the mean of absolute values of deviations from the
mean, it is known as the 'Mean Deviation' (MD).

Raw Score    Deviation Score    Absolute Value of x    Squared Deviation
    X               x                   |x|                    x²
   10              -6                    6                     36
   12              -4                    4                     16
   13              -3                    3                      9
   15              -1                    1                      1
   16               0                    0                      0
   16               0                    0                      0
   17              +1                    1                      1
   20              +4                    4                     16
   20              +4                    4                     16
   21              +5                    5                     25
ΣX = 160         Σx = 0             Σ|x| = 28              Σx² = 120
M = 16
Symbolically,

$$MD = \frac{\sum |x|}{N}$$

This statistic is also known as 'Average Deviation'. It is no longer widely
used in statistical work as it has been replaced by the Standard Deviation.

In the given table, if we find the average of squared deviations, given in the
last column, we get:

$$\frac{\sum x^2}{N} = \frac{120}{10} = 12$$

Thereafter, if we compute the square root of this number, we have:

$$\sqrt{12} = 3.46$$

The number 3.46 is the Standard Deviation (SD) of the given set of scores. It is
computed by finding the square root of the mean of squared deviations of
raw scores from the mean. Symbolically,

$$SD = \sqrt{\frac{\sum x^2}{N}}$$

in which x = X - M.

The square of SD is known as the 'Variance' of the scores. In the present example
12 is the variance. We can also compute SD with the help of raw scores by
using the following formula:

$$SD = \sqrt{\frac{\sum X^2}{N} - \left(\frac{\sum X}{N}\right)^2}$$

where $\sum X^2$ is the sum of squared raw scores and $\sum X$ is the sum of raw scores.
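
The following Python sketch recomputes the mean deviation, variance and standard deviation for the ten scores in the table, and checks that the deviation-score formula and a raw-score formula agree. The variable names are our own. The raw-score formula is written here in the form "mean of the squares minus the square of the mean", which is one of several equivalent ways of writing it.

```python
import math

scores = [10, 12, 13, 15, 16, 16, 17, 20, 20, 21]
n = len(scores)
m = sum(scores) / n                                  # mean, 16.0

mean_dev = sum(abs(x - m) for x in scores) / n       # MD = sum(|x|) / N
variance = sum((x - m) ** 2 for x in scores) / n     # sum(x^2) / N
sd = math.sqrt(variance)                             # SD = sqrt(variance)

# Same SD from raw scores only (no deviation scores needed).
sd_raw = math.sqrt(sum(x * x for x in scores) / n - (sum(scores) / n) ** 2)

print(mean_dev)          # 2.8
print(variance)          # 12.0
print(round(sd, 2))      # 3.46
print(round(sd_raw, 2))  # 3.46
```

Note that these formulas divide by N, as the unit does. Python's `statistics.pstdev` follows the same convention, while `statistics.stdev` divides by N - 1 and would give a slightly larger value.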

The SD is the most frequently used measure of variability, especially when
data are available in an interval or a ratio scale. It takes into account every raw
score, and hence, is the most stable measure of variability. Along with the mean, it
describes a distribution fairly well. If the distribution of scores is
approximately normal, the SD shows interesting features. In this case, about 99.7% of
the raw scores fall within the points 3 standard deviations below and 3 standard
deviations above the mean. In the above example, if the scores were approximately
normally distributed, about 99.7% of the raw scores would lie within the limits

16 ± 3 x 3.46 or 16 ± 10.38

or from 5.62 to 26.38.

In the same way, about 95% and 68% of cases fall within the limits 16 ± 2 x 3.46 and
16 ± 1 x 3.46 respectively. Symbolically, we can say that

about 99.7% of cases fall within M ± 3 SD
about 95% of cases fall within M ± 2 SD
about 68% of cases fall within M ± 1 SD
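
The limits for the worked example can be tabulated with a small Python sketch. The function name `sd_limits` is our own, and the values M = 16 and SD = 3.46 are the rounded figures used in the text.

```python
def sd_limits(m, sd, k):
    """Return the limits (M - k*SD, M + k*SD), rounded to two decimals."""
    return (round(m - k * sd, 2), round(m + k * sd, 2))


m, sd = 16, 3.46  # mean and (rounded) SD of the ten scores

for k in (1, 2, 3):
    print(f"M ± {k} SD:", sd_limits(m, sd, k))
# M ± 1 SD: (12.54, 19.46)
# M ± 2 SD: (9.08, 22.92)
# M ± 3 SD: (5.62, 26.38)
```

These percentages describe a normal distribution; a small set of ten scores will match them only roughly.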

16.6.3 Derived Scores


The deviation score x = X - M defined earlier is obtained by simple subtraction
of the mean from a given raw score. This is a 'derived score'. Its mean is zero
and its SD is the same as that of the raw scores. Any score obtained after applying
simple arithmetical operations on raw scores is known as a derived score. The
utility of squared deviation scores in computing variance and SD has already
been explained. We will discuss here in detail two other types of derived
scores: Z-scores and T-scores.

Z-Scores: We have seen earlier that deviation scores may be both negative and
positive. All the raw scores falling above the mean have positive deviation
scores and those falling below the mean have negative deviation scores. When
each of the deviation scores is divided by the standard deviation, we obtain a
different kind of derived score, commonly known as Z-scores or standard
scores. We define the Z-score as follows:

$$Z = \frac{X - M}{SD}$$

As Z-scores are obtained by dividing deviation scores by a positive quantity
(SD), their algebraic signs are the same as those of the deviation scores. We
reproduce the table given in Section 16.6.2 in a different form:

Raw Score    Deviation Score    Z-Score
   10              -6           -1.734
   12              -4           -1.156
   13              -3           -0.867
   15              -1           -0.289
   16               0            0
   16               0            0
   17              +1           +0.289
   20              +4           +1.156
   20              +4           +1.156
   21              +5           +1.445
As mentioned earlier, the mean and SD of the raw scores are 16 and 3.46
respectively. The Z-scores may be obtained by dividing the deviation scores
(in the second column) by 3.46, which is the SD of the scores.

It can be observed that Z-scores are both positive and negative. If we find the
algebraic sum of Z-scores, we will get zero. An important characteristic of
Z-scores is that their mean is always zero and their SD is always 1. This enhances
the utility of Z-scores in comparing scores given on different scales. If two
different sets of scores are converted into Z-scores, they become comparable.
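
These properties can be verified with a short Python sketch. The function name `z_scores` is our own. Because the code uses the unrounded SD (3.4641...) rather than 3.46, its Z-scores differ from the tabulated ones in the third decimal place (for example -1.732 instead of -1.734).

```python
import math


def z_scores(scores):
    """Convert raw scores to Z-scores: Z = (X - M) / SD, with SD computed over N."""
    n = len(scores)
    m = sum(scores) / n
    sd = math.sqrt(sum((x - m) ** 2 for x in scores) / n)
    return [(x - m) / sd for x in scores]


scores = [10, 12, 13, 15, 16, 16, 17, 20, 20, 21]
z = z_scores(scores)

print([round(v, 3) for v in z])

z_mean = sum(z) / len(z)
z_sd = math.sqrt(sum((v - z_mean) ** 2 for v in z) / len(z))
print(round(z_mean, 6), round(z_sd, 6))  # mean is 0 and SD is 1, up to rounding error
```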

Suppose a teacher administers tests of Mathematics and English to three
students, A, B and C, and computes means and standard deviations. The mean
scores are found to be 60 and 50 with standard deviations 12 and 10
respectively. The following table shows the raw marks and Z-scores of the three
students A, B and C in these tests:

Student    Mathematics    English    Z-score (Maths)    Z-score (English)
A              72            60          +1.00               +1.00
B              58            58          -0.17               +0.80
C              60            72           0.00               +2.20
The Table reveals that:
i) the performance level of student A is at the same level in Mathematics and
English as indicated by Z-scores which is not indicated by raw scores.
ii) the performance level of student B is better in English than in Mathematics
as indicated by Z-scores, while raw scores apparently tell a different story.
iii) the performance level of student C is better in English as indicated by Z-
scores as well as raw scores.
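
The Z-scores in the table can be recomputed with the sketch below. The function name `z_score` and the dictionary layout are our own; the means (60 and 50) and standard deviations (12 and 10) are those stated above.

```python
def z_score(x, mean, sd):
    """Z = (X - M) / SD for a single raw score."""
    return (x - mean) / sd


# Raw marks as (Mathematics, English) for each student.
marks = {"A": (72, 60), "B": (58, 58), "C": (60, 72)}

for student, (maths, english) in marks.items():
    z_maths = round(z_score(maths, 60, 12), 2)
    z_english = round(z_score(english, 50, 10), 2)
    print(student, z_maths, z_english)
# A 1.0 1.0
# B -0.17 0.8
# C 0.0 2.2
```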

T-Scores: Z-Scores are simple and meaningful. But, they suffer from certain
limitations. You have to work with negative score and decimal fraction while
using Z-scores. The average teacher is not competent enough to face such a
situation. In order to simplify the use and interpretation of scores, you can
further convert Z-scores into new type of standard scores known as T-scores.
T-scores are standard scores having a mean of 50 with an SD of 10 points. For
converting a raw score to T-score, it has to be converted to Z-score first. After
this, each Z-score is multiplied by 10 and 50 is added to the product to obtain a
T-score. Symbolically
Interpreting Test Results
These standard scores can be used to interpret and compare raw scores without
encountering negative signs and decimal fractions. This can be easily
understood not only by teachers, but also by students and parents.
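
The conversion described above (multiply the Z-score by 10 and add 50) can be sketched in Python as follows. This is only an illustration: the function names `t_score` and `raw_to_t` are assumptions made here, and the sample marks are those of the three students in the Mathematics and English example.

```python
def t_score(z):
    """Convert a Z-score to a T-score (mean 50, SD 10)."""
    return 10 * z + 50


def raw_to_t(raw, mean, sd):
    """Convert a raw score to a T-score by way of its Z-score."""
    z = (raw - mean) / sd
    return t_score(z)


# Mathematics marks from the example (mean 60, SD 12).
for student, raw in {"A": 72, "B": 58, "C": 60}.items():
    print(f"{student}: raw = {raw}, T = {raw_to_t(raw, 60, 12):.1f}")
```

A score exactly at the mean maps to T = 50, and each SD above or below the mean shifts the T-score by 10 points. T-scores are usually rounded to the nearest whole number before being reported.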

Check Your Progress 3

Notes: a) Write your answers in the space given below.

b) Compare your answers with those given at the end of the unit.

1. State the meaning of variability. Compute the standard deviation
of the following distribution of scores:

12, 18, 15, 17, 16, 17, 12, 11, 18, 20, 19, 15, 16, 17, 14, 17,
13, 13, 11, 19, 18, 17, 14, 20, 15, 17, 16, 14, 19, 20.

........................................................................
........................................................................
........................................................................
........................................................................

2. Convert the scores 12, 15, 18, 20 in the above distribution
into Z-scores.

........................................................................
........................................................................
.........................................................................

16.7 CORRELATION: MEANING AND INTERPRETATION

You might have observed that students scoring high in mathematics normally
tend to score high in science also. If you measure the heights and weights of
15-year-old children, you will observe that taller children tend to be heavier.
In the first case you can say that achievement scores in science and mathematics
vary together. Similarly, it can be said that the heights and weights of children
vary together. There are many variables in nature that vary together. You can
also say that such variables covary. When two variables covary, they are said
to be correlated, and the underlying phenomenon is known as 'correlation'.

Two variables are said to be correlated if a change in one is accompanied by a
change in the other. Variables may vary in the same direction or in opposite
directions. If both variables increase or decrease together, the correlation is
said to be positive. If, on the other hand, the two variables vary in opposite
directions, one increasing while the other decreases, the correlation is said
to be negative. It should be noted that correlation does not necessarily
indicate a cause-and-effect relationship. For example, a positive correlation
between height and weight does not mean that certain students are heavier than
others because they are taller. Correlation simply means that the two variables
covary. The cause of covariation may be some other factor affecting both
variables.

The degree of correlation between two variables is measured in terms of the
'correlation coefficient', which is the mean of the cross products of Z-scores
on the two variables under consideration. If X and Y denote the raw scores on
the two variables and x and y denote their deviations from the respective means
$M_x$ and $M_y$, the data may be placed as given in the following table:

$$M_x = \frac{15}{5} = 3, \qquad \sigma_x = 1.095 \approx 1.1$$

(r denotes the correlation coefficient)

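The formula

$$r = \frac{\sum Z_x Z_y}{N}$$

may be simplified by substituting $Z_x = \frac{x}{\sigma_x}$ and $Z_y = \frac{y}{\sigma_y}$.

This gives the formula

$$r = \frac{\sum xy}{N \sigma_x \sigma_y}$$

If we substitute $\sigma_x = \sqrt{\frac{\sum x^2}{N}}$ and $\sigma_y = \sqrt{\frac{\sum y^2}{N}}$,

we have

$$r = \frac{\sum xy}{\sqrt{\sum x^2 \sum y^2}}$$

which is approximately the same except for rounding errors.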


If we substitute $x = X - M_x$ and $y = Y - M_y$ and simplify in terms of X and
Y, we have the following formula for calculating r:

$$r = \frac{N \sum XY - \sum X \sum Y}{\sqrt{\left[N \sum X^2 - \left(\sum X\right)^2\right]\left[N \sum Y^2 - \left(\sum Y\right)^2\right]}}$$

This equation can be used to calculate the correlation coefficient directly
from raw scores.
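
The three ways of computing r discussed in this section (the mean of cross products of Z-scores, the deviation form, and the raw-score form) should agree on any data set. The sketch below checks this in Python. The function names and the five pairs of scores are hypothetical, chosen only so that the arithmetic is easy to follow by hand; they are not the data of the unit's own example.

```python
from math import sqrt


def r_from_z_scores(xs, ys):
    """r as the mean of the cross products of Z-scores."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sx = sqrt(sum((x - mx) ** 2 for x in xs) / n)  # SD with divisor N
    sy = sqrt(sum((y - my) ** 2 for y in ys) / n)
    return sum(((x - mx) / sx) * ((y - my) / sy) for x, y in zip(xs, ys)) / n


def r_from_deviations(xs, ys):
    """r from deviations x = X - Mx and y = Y - My."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sum_xy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sum_x2 = sum((x - mx) ** 2 for x in xs)
    sum_y2 = sum((y - my) ** 2 for y in ys)
    return sum_xy / sqrt(sum_x2 * sum_y2)


def r_from_raw_scores(xs, ys):
    """r computed directly from raw scores."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    return numerator / denominator


# Hypothetical scores, for illustration only.
X = [2, 4, 6, 8, 10]
Y = [1, 3, 2, 5, 4]

for method in (r_from_z_scores, r_from_deviations, r_from_raw_scores):
    print(method.__name__, round(method(X, Y), 3))  # each prints 0.8
```

The raw-score form avoids computing the means and deviations first, which is why it is convenient for hand calculation. The Z-score form makes the definition of r most transparent.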

The correlation coefficient may assume both positive and negative values
ranging from -1 to +1. When the value of the correlation coefficient is +1 or
-1, it indicates perfect positive or perfect negative correlation respectively.

The correlation coefficient is interpreted in many ways, but the simplest
interpretation is in terms of variance. For this purpose, we define the
'Coefficient of Determination', which is obtained by squaring the given
correlation coefficient.

Thus

$$\text{Coefficient of Determination} = r^2$$

The value of $r^2$ gives the proportion of variance in one variable which is
accounted for by variation in the other variable. In the above example,

$$r^2 = 0.0256$$

This means that only 2.56% of the variance in one variable is accounted for by
variance in the other variable. Similarly, an r = .70 indicates that the
coefficient of determination is $(.70)^2$ or .49, showing that 49% of the
variance is common to the variables being correlated.
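
As a small illustration of this interpretation, the Python sketch below squares a correlation coefficient and expresses the result as a percentage of shared variance. The function name `coefficient_of_determination` is an assumption made here for readability; the value r = .70 is the one used in the text.

```python
def coefficient_of_determination(r):
    """Return r squared, the proportion of variance common to both variables."""
    if not -1 <= r <= 1:
        raise ValueError("a correlation coefficient must lie between -1 and +1")
    return r ** 2


# r = .70 is taken from the text; r = -.70 shows that the sign does not matter.
for r in (0.70, -0.70):
    shared = coefficient_of_determination(r)
    print(f"r = {r:+.2f}: {shared:.2%} of the variance is common")
```

Because the coefficient is squared, a negative correlation of a given size accounts for the same proportion of variance as a positive one.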

Check Your Progress 4

Notes: a) Write your answers in the space given below.

b) Compare your answers with those given at the end of the unit.

1. Define correlation and mention its range.

........................................................................
.........................................................................
2. Compute the correlation coefficient between the variables X
and Y in the following data and interpret the result.

X: 95, 90, 85, 80, 75, 70, 65, 60, 55
Y: 76, 78, 77, 71, 75, 79, 73, 72, 74
........................................................................
.........................................................................
16.8 LET US SUM UP

Statistics is a set of procedures for describing, synthesizing, analysing and
interpreting quantitative data. The part of statistics used for organising,
summarizing and describing quantitative information is known as
'descriptive statistics', while methods for making inferences about large
groups of individuals on the basis of the study of small samples constitute
'inferential statistics'.

The tendency of measurements to centre around a typical value is
known as 'central tendency'. The mean, median and mode are the
measures of central tendency. The measures of variability, such as the range
and standard deviation, indicate how scores are spread about the measure of
central tendency.

Derived scores are linear transformations of raw scores and are helpful in
interpreting and comparing test performance. The main derived scores are
deviation scores, Z-scores and T-scores.

When two variables vary together, they are said to be correlated. The
correlation may be negative or positive, and the coefficient of correlation
may vary from -1 to +1. The simplest way of interpreting correlation is in
terms of its square, which is termed the 'coefficient of determination'.

16.9 ANSWERS TO CHECK YOUR PROGRESS

Answer to Check Your Progress 1

1. Statistics is the study of techniques for organising, summarizing, describing
and interpreting quantitative data for making generalizations and inferences
from them. The part of statistics that deals with organizing, summarizing
and describing data is referred to as 'Descriptive Statistics', while the other
part, dealing with interpreting and drawing inferences, is termed
'Inferential Statistics'.

Answers to Check Your Progress 2

1. The characteristic of data by virtue of which they tend to concentrate
around a central value is known as 'central tendency'.

Mean is obtained by adding together all the separate observations or scores
and dividing this total by the number of observations.

Median is that measure of central tendency which appears in the middle of
an ordered sequence of values. It is a point on a measuring scale
above and below which fifty percent of the scores lie.

Mode is the most frequently occurring score, or the typical or most commonly
observed value, in a set of data.

2. For the given data Mean = 16, Median = 16.5, Mode = 17.

Answers to Check Your Progress 3

1. The tendency of measurements to spread or scatter about the measures of
central tendency is known as variability.

For the given data, standard deviation = 2.633

2. The Z-scores of the given raw scores are respectively:

-1.52, -0.38, +0.76, +1.52
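
These answers can be checked with a few lines of Python. The sketch below is one possible check, assuming that the standard deviation is computed with N (not N - 1) in the denominator, which is what the standard library function `statistics.pstdev` does. The variable and function names are illustrative.

```python
from statistics import mean, pstdev

# The 30 scores given in Check Your Progress 3.
scores = [12, 18, 15, 17, 16, 17, 12, 11, 18, 20, 19, 15, 16, 17, 14,
          17, 13, 13, 11, 19, 18, 17, 14, 20, 15, 17, 16, 14, 19, 20]

m = mean(scores)     # arithmetic mean of the distribution
sd = pstdev(scores)  # standard deviation with divisor N


def z_score(raw, mean_value, sd_value):
    """Return the deviation of a raw score from the mean in SD units."""
    return (raw - mean_value) / sd_value


print("Mean =", m)             # 16
print("SD   =", round(sd, 3))  # 2.633
for raw in (12, 15, 18, 20):
    # Z-scores: -1.52, -0.38, 0.76, 1.52
    print(raw, "->", round(z_score(raw, m, sd), 2))
```

Scores equally far above and below the mean (here 12 and 20) receive Z-scores of equal size and opposite sign.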


Answers to Check Your Progress 4

1. Two variables are said to be correlated when they vary together in the same
or opposite directions. The underlying phenomenon is known as
correlation. The coefficient of correlation may vary from -1 to +1.

2. The correlation coefficient between the given sets of scores is +0.433. The
coefficient of determination $r^2$ = 0.1878, which means that 18.78% of the
variance is common to both the variables.

16.10 SUGGESTED READINGS

1. Downie, N.M. and Heath, R.W. (1965): Basic Statistical Methods. Second
Edition, Harper and Row: New York.
2. Garrett, H.E. (1973): Statistics in Psychology and Education. Vakils, Feffer
and Simons Pvt. Ltd.: Bombay.
3. Guilford, J.P. (1965): Fundamental Statistics in Psychology and
Education. McGraw Hill Book Co.: New York.
4. Gay, L.R. (1992): Educational Research. Fourth Edition, Macmillan
Publishing Company.
5. McCall, R.B. (1980): Fundamental Statistics for Psychology. Third
Edition, Harcourt Brace Jovanovich Inc.: New York.
