
STATISTICS IN PSYCHOLOGY (MPC-006)

TUTOR MARKED ASSIGNMENT (TMA)

Course Code: MPC-006
Assignment Code: MPC-006/AST/TMA/2020-2021
Marks: 100

NOTE: All questions are compulsory. The answers are to be written in your own words. Do not copy from the course material or any other source.

SECTION A

Q1. Compare between parametric and nonparametric statistics. Discuss in detail any
two nonparametric techniques.

Ans. Parametric Statistics

Parametric statistics assumes a normal distribution for the population under study. The term refers to those statistical techniques that have been developed on the assumption that the data are of a certain type: in particular, the measure should be on an interval scale and the scores should be drawn from a normal distribution (Stratton and Hayes, 1999).

There are certain basic assumptions of parametric statistics. The first is that the population from which the sample is drawn is normally distributed. A normal distribution spreads symmetrically over the continuum from –3 SD to +3 SD and is unimodal, since its mean, median and mode coincide. If the samples come from different populations, those populations are assumed to have equal variances (homogeneity of variance). The samples are independent in their selection, and every item in the population has an equal chance of being selected for the sample. This reflects the random nature of the sample, which also helps to avoid experimenter bias.

In view of the above assumptions, parametric statistics is regarded as more reliable and authentic than nonparametric statistics. It is more powerful in establishing the statistical significance of effects and of differences among variables. Parametric statistics is more appropriate and reliable with large samples, where it promises greater accuracy of results, and the data analysed are usually on an interval scale.

However, along with its many advantages, some disadvantages of parametric statistics have also been noted. It is bound by the rigid assumption of normal distribution, which narrows the scope of its usage. With small samples a normal distribution cannot be assumed, and so parametric statistics cannot be used. Further, computation in parametric statistics tends to be lengthy and complex because of the large samples and numerical calculations involved. The t-test, F-test and r-test are some of the major parametric statistics used for data analysis.
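For illustration, here is a minimal sketch (not part of the assignment answer; it assumes Python with NumPy and SciPy is available) that runs a parametric test (independent-samples t-test) and a nonparametric test (Mann-Whitney U) on the same two sets of scores, using the Group A and Group B data from Q2 below:

import numpy as np
from scipy import stats

group_a = np.array([34, 43, 22, 66, 44, 34, 44, 77, 77, 33])
group_b = np.array([34, 22, 33, 44, 65, 67, 43, 35, 57, 87])

# Parametric: independent-samples t-test (assumes normality and equal variances)
t_stat, t_p = stats.ttest_ind(group_a, group_b)

# Nonparametric: Mann-Whitney U test (needs only ordinal-level data)
u_stat, u_p = stats.mannwhitneyu(group_a, group_b, alternative="two-sided")

print(f"t = {t_stat:.2f}, p = {t_p:.3f}")
print(f"U = {u_stat:.2f}, p = {u_p:.3f}")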

Q2. The scores obtained by three groups of students on Self Concept Scale are given
below. Compute ANOVA for the same.

Ans. k = 3 (i.e., 3 groups), n = 10 (i.e., each group has 10 cases), N = 30 (i.e., the total number of units across all groups)

Null hypothesis H0: µ1 = µ2 = µ3

Thus,

GROUP A              GROUP B              GROUP C
X1        X1²        X2        X2²        X3        X3²
34        1156       34        1156       34        1156
43        1849       22        484        33        1089
22        484        33        1089       22        484
66        4356       44        1936       58        3364
44        1936       65        4225       56        3136
34        1156       67        4489       54        2916
44        1936       43        1849       56        3136
77        5929       35        1225       66        4356
77        5929       57        3249       77        5929
33        1089       87        7569       78        6084
∑X1 = 474            ∑X2 = 487            ∑X3 = 534
∑X1² = 25820         ∑X2² = 27271         ∑X3² = 31650
n1 = 10              n2 = 10              n3 = 10
M1 = 47.4            M2 = 48.7            M3 = 53.4

Step 1: Correction term

C = (∑X1 + ∑X2 + ∑X3)² / (n1 + n2 + n3)
  = (474 + 487 + 534)² / 30
  = (1495)² / 30
  = 2235025 / 30
  = 74500.8

Step 2: Total sum of squares

SST = ∑X² – C
    = (25820 + 27271 + 31650) – 74500.8
    = 84741 – 74500.8
    = 10240.2
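The hand computation above can be checked with a short sketch (assumed, not part of the assignment answer; it uses Python with NumPy and SciPy) that reproduces the correction term and SST and then obtains the full one-way ANOVA F-ratio directly:

import numpy as np
from scipy import stats

group_a = [34, 43, 22, 66, 44, 34, 44, 77, 77, 33]
group_b = [34, 22, 33, 44, 65, 67, 43, 35, 57, 87]
group_c = [34, 33, 22, 58, 56, 54, 56, 66, 77, 78]

scores = np.concatenate([group_a, group_b, group_c])
N = scores.size                            # 30

C = scores.sum() ** 2 / N                  # correction term, about 74500.8
ss_total = (scores ** 2).sum() - C         # SST, about 10240.2
print(f"C = {C:.1f}, SST = {ss_total:.1f}")

# SciPy computes the full one-way ANOVA F-ratio directly from the raw scores
f_stat, p_value = stats.f_oneway(group_a, group_b, group_c)
print(f"F = {f_stat:.3f}, p = {p_value:.3f}")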

Q3. Describe hypothesis testing with a focus on errors in hypothesis testing.

Ans. Hypothesis testing has a vital role in psychological measurements. By hypothesis, we mean the tentative answer to any question. Hypothesis testing is a systematic procedure for
deciding whether the results of a research study, which examines a sample, support a
particular theory or practical innovation, which applies to a population. Hypothesis testing is
the central theme in most psychology research.

At first, hypothesis testing involves grasping ideas that may seem to make little sense. Real-life psychology research involves samples of many individuals; at the same time, there are studies that involve a single individual.

The Core Logic of Hypothesis Testing

There is a standard kind of reasoning researchers use for any hypothesis testing problem. Consider, for example, a hypothetical study in which a single, randomly selected baby is given a specially purified vitamin thought to speed up development, and we record the age at which the baby starts to walk. Ordinarily, among the population of babies that are not given the specially purified vitamin, the chance of a baby's starting to walk at age 8 months or earlier would be less than 2%. Thus, walking at 8 months or earlier is highly unlikely among such babies. But what if
the randomly selected sample of one baby in our study does start walking by 8 months? If the
specially purified vitamin had no effect on this particular baby’s walking age (which means
that the baby’s walking age should be similar to that of babies that were not given the
vitamin), it is highly unlikely (less than a 2% chance) that the particular baby we selected at
random would start walking by 8 months. So, if the baby in our study does in fact start
walking by 8 months, that allows us to reject the idea that the specially purified vitamin has
no effect. And if we reject the idea that the specially purified vitamin has no effect, then we
must also accept the idea that the specially purified vitamin does have an effect. Using the
same reasoning, if the baby starts walking by 8 months, we can reject the idea that this baby
comes from a population of babies with a mean walking age of 14 months. We, therefore,
conclude that babies given the specially purified vitamin will start to walk before 14 months.
Our explanation for the baby’s early-walking age in the study is that the specially purified
vitamin speeded up the baby’s development.

The researchers first spelled out what would have to happen for them to conclude that the
special purification procedure makes a difference. Having laid this out in advance, the
researchers could then go on to carry out their study. In this example, carrying out the study
means giving the specially purified vitamin to a randomly selected baby and watching to see
how early that baby walks. Suppose the result of the study is that the baby starts walking
before 8 months. The researchers would then conclude that it is unlikely the specially purified
vitamin makes no difference, and thus, also conclude that it does make a difference.
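The decision rule underlying this reasoning can be sketched in a few lines of Python (an illustrative assumption: the "less than 2%" figure from the example is treated as the probability of the observed result under the null hypothesis, compared against the conventional 5% significance level):

p_under_null = 0.02   # chance of walking by 8 months if the vitamin has no effect
alpha = 0.05          # conventional significance level

if p_under_null < alpha:
    print("Reject the null hypothesis: the vitamin appears to have an effect.")
else:
    print("Fail to reject the null hypothesis.")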

SECTION B

Q1. Describe the concept and importance of normal probability curve.

Ans. The Normal Probability Curve is the ideal symmetrical frequency curve. It is supposed
to be based on the data of a population. In it, the measures are concentrated closely around
the centre and taper off from this central point or top to the left and right. There are very few
measures at the low score end of the scale; an increasing number up to a maximum at the
middle position; and a symmetrical falling-off towards the high score end of the scale. The
curve exhibits almost perfect bilateral symmetry: it is symmetrical about its central altitude (ordinate), which divides it into two parts similar in shape and equal in area. The curve,
which is also called Normal Curve, is a bell-shaped figure. It is very useful in psychological
and educational measurements. It is shown in Fig. 3.1:

[Figure: Normal Probability Curve, a bell-shaped curve with the mean at the centre; the baseline is marked in SD units from –3 to +3 and in PE units from –4PE to +4PE, with 50% of the cases on either side of the mean and about 68.26% lying within ±1 SD.]

Fig.: Normal Probability Curve

Normal probability curve is the frequency polygon of any normal distribution.

Theoretical Base of the Normal Probability Curve: The Normal Probability Curve is based upon the law of probability (governing the various games of chance) discovered by the French mathematician Abraham De Moivre (1667-1754), who developed its mathematical equation and graphical representation in the eighteenth century.

The law of probability, and the normal curve that illustrates it, are based upon the law of chance, i.e. the probable occurrence of certain events. It can be represented by a bell-shaped curve with definite characteristics.

Characteristics or Properties of the Normal Probability Curve (NPC): The characteristics of the normal probability curve are as follows:

(1) The Normal Curve is Symmetrical: The normal probability curve is symmetrical
around its vertical axis called ordinate. The symmetry about the ordinate at the central point
of the curve implies that the size, shape and slope of the curve on one side of the curve are
identical to that of the other. In other words, the left and right halves about the central point are mirror images, as shown in the figure below.

[Figure: Normal Probability Curve, with Mean = Median = Mode (M = Md = Mo) at the central point.]

Fig.: Normal Probability Curve

(2) The Normal Curve is Unimodal: Since there is only one maximum point in the
curve, thus, the normal probability curve is unimodal, i.e. it has only one mode.
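As a brief check (an assumed illustration, not from the course material), the familiar areas under the normal probability curve can be verified with SciPy's standard normal distribution:

from scipy.stats import norm

for k in (1, 2, 3):
    area = norm.cdf(k) - norm.cdf(-k)    # proportion of cases within +/- k SD
    print(f"Within +/-{k} SD: {area:.4f}")
# prints roughly 0.6827, 0.9545, 0.9973 (the familiar 68.26%, 95.44% and 99.73%)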

Q2. Using Pearson’s product moment correlation for the following data:

Ans:-

Data 1 (x)    Data 2 (y)    x²    y²    xy


10 2 100 4 20
7 1 49 1 7
5 9 25 81 45
6 4 36 16 24
3 9 9 81 27
6 5 36 25 30
8 4 64 16 32
2 9 4 81 18
9 5 81 25 45
10 4 100 16 40
∑x = 66    ∑y = 52    ∑x² = 504    ∑y² = 346    ∑xy = 288
r = [N∑xy – (∑x)(∑y)] / √{[N∑x² – (∑x)²][N∑y² – (∑y)²]}

  = [10(288) – (66)(52)] / √{[10(504) – (66)²][10(346) – (52)²]}

  = (2880 – 3432) / √[(5040 – 4356)(3460 – 2704)]

  = –552 / √(684 × 756)

  = –552 / 719.1

r = –0.77

The negative value of r indicates that the two sets of scores move in opposite directions.
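The hand computation can be verified with a minimal sketch (assumed, not part of the original answer; it uses NumPy and SciPy) on the same ten pairs of scores:

import numpy as np
from scipy import stats

x = np.array([10, 7, 5, 6, 3, 6, 8, 2, 9, 10])
y = np.array([2, 1, 9, 4, 9, 5, 4, 9, 5, 4])

r, p = stats.pearsonr(x, y)
print(f"r = {r:.2f}, p = {p:.3f}")   # r is about -0.77, confirming the negative correlation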
Q3. With the help of Mann Whitney U test find if significant difference exists between
the scores obtained on Organisational Commitment Scale obtained by public and
private bank employees.

Ans:-

Public Bank Employees    Rank (R1)        Private Bank Employees    Rank (R2)
10                       1                34                        8
12                       2                54                        12
21                       3                56                        13
23                       5                43                        10
34                       8.5              32                        3.5
45                       10               23                        1
32                       7                34                        8
23                       5                32                        3.5
34                       8.5              33                        6
23                       5                44                        11
                                          32                        3.5
                                          34                        8
                                          32                        3.5
n1 = 10                  ∑R1 = 55         n2 = 13                   ∑R2 = 91

U = n1n2 + n1(n1 + 1)/2 – R1 = (10)(13) + (10)(11)/2 – 55 = 130 + 55 – 55 = 130

U′ = n1n2 + n2(n2 + 1)/2 – R2 = (10)(13) + (13)(14)/2 – 91 = 130 + 91 – 91 = 130
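For comparison, the following sketch (an assumption: the raw scores are as tabled above) applies SciPy's Mann-Whitney U test. Note that SciPy ranks the two samples jointly, which is the standard procedure, so its U value need not match the hand computation above, where each group was ranked separately:

from scipy import stats

public_bank  = [10, 12, 21, 23, 34, 45, 32, 23, 34, 23]
private_bank = [34, 54, 56, 43, 32, 23, 34, 32, 33, 44, 32, 34, 32]

u_stat, p_value = stats.mannwhitneyu(public_bank, private_bank,
                                     alternative="two-sided")
print(f"U = {u_stat:.1f}, p = {p_value:.3f}")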
Q4. Explain Two-way Analysis of Variance with a focus on its merits and demerits.

Ans. In two-way analysis of variance, two independent variables are taken simultaneously. There are two main effects and one interactional, or joint, effect on the dependent variable. In such a condition, we have to use analysis of variance in two ways, i.e. vertically as well as horizontally, or in other words apply ANOVA column- and row-wise. Suppose we are interested in studying intelligence, i.e. the I.Q. level of boys and girls studying in class VIII, in relation to their level of socio-economic status (SES). In such a condition, we have the following 3 × 2 design, as shown in the table.

Illustration:

Table: SES, Intelligence and Gender factors

                  Levels of SES
Groups       High       Average       Low       Total
Boys         MHB        MAB           MLB       MB
Girls        MHG        MAG           MLG       MG
Total        MH         MA            ML        M

In the table above,

M: Mean of intelligence scores.

MHB, MAB, & MLB: Mean of intelligence scores of boys belonging to different levels of SES, i.e. high, average and low respectively.

MHG, MAG, & MLG: Mean of intelligence scores of girls belonging to different levels of SES respectively.

MH, MA, & ML: Mean of the intelligence scores of students belonging to different levels of SES respectively.

MB , MG : Mean of the intelligence scores of boys and girls respectively.


From the above 3 × 2 contingency table, it is clear that, first, we have to study the significant difference in the means column-wise or vertically, i.e. to compare the intelligence level of the students belonging to different categories of socio-economic status (High, Average and Low).

Secondly, we have to study the significant difference in the means row-wise or horizontally, i.e. to compare the intelligence level of the boys and girls.

Then we have to study the interactional or joint effect of sex and socio-economic status on
intelligence level, i.e. we have to compare the significant difference in the cell means of
columns and rows.

Since we have more than two groups, and since we wish to study the independent as well as the interaction effect of the two variables, viz. socio-economic status and sex, on the dependent variable, viz. intelligence in terms of I.Q., we have to use two-way analysis of variance, i.e. apply analysis of variance column- and row-wise.
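A schematic sketch of such an analysis (the 3 × 2 design comes from the table above, but the I.Q. scores are hypothetical) using the statsmodels formula interface might look like this:

import pandas as pd
import statsmodels.api as sm
from statsmodels.formula.api import ols

data = pd.DataFrame({
    "iq":     [102, 98, 110, 95, 101, 99, 104, 97, 93, 108, 100, 96],
    "gender": ["Boy"] * 6 + ["Girl"] * 6,
    "ses":    ["High", "Average", "Low", "High", "Average", "Low"] * 2,
})

# Two main effects (ses, gender) and their interaction (ses:gender)
model = ols("iq ~ C(ses) + C(gender) + C(ses):C(gender)", data=data).fit()
print(sm.stats.anova_lm(model, typ=2))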

Q5. Compute Chi-square for the following data:

Ans:-

Phase of adolescence        Achievement Motivation Scores
                            High         Low          Total
Early adolescents           34           43           77
Late adolescents            45           44           89
Total                       79           87           166

We take the null hypothesis that the two populations (early and late adolescents) are homogeneous with respect to achievement motivation scores. The expected frequencies are:

E11 = (79 × 77) / 166 = 6083 / 166 = 36.64        E21 = (87 × 77) / 166 = 6699 / 166 = 40.36

E12 = (79 × 89) / 166 = 7031 / 166 = 42.36        E22 = (87 × 89) / 166 = 7743 / 166 = 46.64

SECTION C
Q1. Tabulation

Ans. Tabulation is the process of summarising classified or grouped data in the form of a
table so that it is easily understood and an investigator is quickly able to locate the desired
information. A table is a systematic arrangement of classified data in columns and rows.
Thus, a statistical table makes it possible for the investigator to present a huge mass of data in
a detailed and orderly form. It facilitates comparison and often reveals certain patterns in data
which are otherwise not obvious. Classification and tabulation, as a matter of fact, are not two distinct processes; they go together.
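As a small illustration (hypothetical categories, not from the course material), tabulation of classified data into a two-way frequency table can be done with pandas:

import pandas as pd

raw = pd.DataFrame({
    "gender": ["Boy", "Girl", "Boy", "Girl", "Boy", "Girl", "Boy", "Girl"],
    "result": ["Pass", "Pass", "Fail", "Pass", "Pass", "Fail", "Pass", "Pass"],
})

# Classified observations summarised into a two-way frequency table with totals
table = pd.crosstab(raw["gender"], raw["result"], margins=True)
print(table)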

Q2. Interval estimation

Ans. Statistical inference is that branch of statistics, which is concerned with using
probability concept to deal with uncertainty in decision-making. The field of statistical
inference has had a fruitful development since the latter half of the 19th century.

It refers to the process of selecting and using a sample statistic to draw inferences about a population parameter on the basis of a subset of the population, i.e. the sample drawn from it.
Statistical inference treats two different classes of problems:

• Hypothesis testing, i.e. to test some hypothesis about parent population from which
the sample is drawn.

• Estimation, i.e. to use the ‘statistics’ obtained from the sample as an estimate of the unknown ‘parameter’ of the population from which the sample is drawn. A point estimate gives a single value for the parameter, whereas an interval estimate gives a range of values within which the parameter is expected to lie with a stated level of confidence.
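As an illustration of interval estimation (an assumed sketch reusing the Group A scores from Section A, Q2), a 95% confidence interval for a population mean can be computed with SciPy's t distribution:

import numpy as np
from scipy import stats

sample = np.array([34, 43, 22, 66, 44, 34, 44, 77, 77, 33])

mean = sample.mean()
sem = stats.sem(sample)            # standard error of the mean

low, high = stats.t.interval(0.95, df=sample.size - 1, loc=mean, scale=sem)
print(f"95% confidence interval for the mean: ({low:.2f}, {high:.2f})")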

Q3. Level of significance

Ans. The level of significance refers to the degree of significance with which we accept or reject
a particular hypothesis. Since 100 per cent accuracy is not possible in taking a decision over the
acceptance or rejection of a hypothesis, we have to take the decision at a particular level of
confidence which would speak of the probability of one being correct or wrong in accepting or
rejecting a hypothesis. In most of the cases of hypothesis testing, such a confidence is fixed at 5
per cent level, which implies that our decisions would be correct to the extent of 95 per cent. For
greater precision, however, such a confidence may be fixed at the 1 per cent level, which would imply that the decision would be correct to the extent of 99 per cent. This level is usually denoted by the symbol α (alpha), which represents the probability of committing a Type I error (i.e. rejecting a null hypothesis that is true). The level of confidence (or significance) is always fixed in advance, before applying the test procedures. It is important to note that if no level of significance is given, then we take α = 0.05.
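A short sketch (assumed illustration) relates the significance level α to the critical value of z for a two-tailed test:

from scipy.stats import norm

for alpha in (0.05, 0.01):
    z_crit = norm.ppf(1 - alpha / 2)          # two-tailed critical value
    print(f"alpha = {alpha}: reject H0 if |z| > {z_crit:.2f}")
# alpha = 0.05 gives 1.96, alpha = 0.01 gives 2.58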

Q4. Direction of correlation

Ans. The direction of the relationship is an important aspect of the description of a relationship. If two variables are correlated, then the relationship is either positive or negative. The absence of a relationship indicates "zero correlation".

Positive Correlation
Positive correlation indicates that as the value of one variable increases, the value of the other variable also increases; likewise, as the value of one variable decreases, the value of the other variable also decreases. This means that both variables move in the same direction. For example:

• As intelligence (IQ) increases, the marks obtained also increase.

• As income increases, expenditure also increases.

Negative Correlation

Negative correlation indicates that as the value of one variable increases, the value of the other variable decreases; conversely, as the value of one variable decreases, the value of the other variable increases. This means that the two variables move in opposite directions.

Q5. Partial correlation

Ans. A partial correlation between two variables is one that partials out, or nullifies, the effect of a third variable (or a number of other variables) upon both the variables being correlated. Suppose two variables, A and B, are closely related. The correlation between them, controlled for the influence of one or more other variables, is known as a partial correlation. In other words, when it is assumed that some other variable is influencing the correlation between A and B, the influence of this variable (or variables) is partialled out of both A and B. Hence, a partial correlation can be considered a correlation between two sets of residuals. In the simplest case, the correlation between A and B partialled out for C is represented as rAB.C, read as the correlation between A and B with C partialled out. The correlation between A and B can be partialled out for more variables as well.
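A compact sketch (with assumed, hypothetical zero-order correlations) shows how the first-order partial correlation rAB.C is computed from the three zero-order correlations:

import math

r_ab, r_ac, r_bc = 0.60, 0.40, 0.50   # assumed zero-order correlations

r_ab_c = (r_ab - r_ac * r_bc) / math.sqrt((1 - r_ac ** 2) * (1 - r_bc ** 2))
print(f"r_AB.C = {r_ab_c:.3f}")       # correlation of A and B with C partialled out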

Q6. Multiple regression

Ans:- When we have multiple predictors rather than a single predictor variable, the regression carried out is called multiple regression.

So we have a dependent variable and a set of independent variables. Suppose we have X1, X2, X3, ..., Xk as k independent variables and Y as a dependent variable; then the regression equation for the sample can be written as:

Y = a + b1X1 + b2X2 + ... + bkXk        (eq. 32)

The same equation for the population can be written as:

Y = α + β1X1 + β2X2 + ... + βkXk        (eq. 33)
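A minimal sketch (hypothetical data; two predictors are assumed purely for illustration) of fitting the sample regression equation with NumPy least squares:

import numpy as np

x1 = np.array([2, 4, 6, 8, 10, 12])
x2 = np.array([1, 3, 2, 5, 4, 6])
y  = np.array([5, 9, 11, 17, 18, 24])

X = np.column_stack([np.ones_like(x1), x1, x2])   # intercept, X1, X2
coeffs, *_ = np.linalg.lstsq(X, y, rcond=None)
a, b1, b2 = coeffs
print(f"Y = {a:.2f} + {b1:.2f} X1 + {b2:.2f} X2")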

Q7. Skewness and Kurtosis

Ans. The term ‘skewness’ means lack of symmetry, i.e. if the distribution of data is not symmetrical, it is called a skewed distribution. Any measure of skewness indicates the difference between the manner in which items are distributed in a particular distribution and the manner in which they would be distributed in a symmetrical (or normal) distribution. If skewness is positive, the frequencies in the distribution are spread out over a greater range of values on the high-value end of the curve (the right-hand side) than on the low-value end. If the curve is normal, the spread is the same on both sides of the centre point, and the mean, median and mode have the same value.

Kurtosis is the measure of the shape of a frequency curve. It is a Greek word, which means
bulginess. While skewness signifies the extent of asymmetry, kurtosis measures the degree of
peakedness of a frequency distribution. Karl Pearson classified curves into three types on the
basis of the shape of their peaks. These are mesokurtic, leptokurtic and platykurtic.
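A brief sketch (reusing the Group A scores from Section A, Q2 as assumed data) computes both measures with SciPy:

import numpy as np
from scipy import stats

scores = np.array([34, 43, 22, 66, 44, 34, 44, 77, 77, 33])

print(f"Skewness: {stats.skew(scores):.3f}")      # > 0 indicates a longer right tail
print(f"Kurtosis: {stats.kurtosis(scores):.3f}")  # excess kurtosis; 0 corresponds to mesokurtic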

Q8. Levels of measurement

Ans:- When deciding which statistical test to use, it is important to identify the level of
measurement associated with the dependent variable of interest. Generally, for the use of a
parametric test, a minimum of interval level measurement is required. Non-parametric
techniques can be used with all levels of measurement, and are most frequently associated
with nominal and ordinal level data.

(i) Nominal Data: The first level of measurement is nominal, or categorical. Nominal scales are composed of two or more mutually exclusive, named categories with no implied ordering: yes or no, male or female. Data are placed in one of the categories, and the numbers in each category are counted (also known as frequencies). The key to nominal level measurement is that no meaningful numerical values are assigned to the categories.

(ii) Ordinal Data: The second level of measurement, which is also frequently associated
with non-parametric statistics, is the ordinal scale (also known as rank-order). Ordinal level
measurement gives us a quantitative ‘order’ of variables, in mutually exclusive categories,
but no indication as to the value of the differences between the positions (squash ladders,
army ranks). As such, the difference between positions in the ordered scale cannot be
assumed to be equal. Examples of ordinal scales in health science research include pain
scales, stress scales and functional scales.

Q9. Measures of dispersion

Ans:- A numerical measure that can be used to throw some light on the scatter or the homogeneity of data is called a measure of dispersion. It is of two types:

Absolute Measure: A measure of variation or dispersion expressed in terms of the original units of the data is referred to as an absolute measure. It is obtained through the following methods (a brief computational sketch follows the list):

(a) Range (R)

(b) Average of Mean Deviation (M.D.)

(c) Quartile Deviation or Semi-Inter-quartile Range (Q.D.)

(d) Standard Deviation (S.D.)
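Here is the brief computational sketch referred to above (reusing the Group A scores from Section A, Q2 as assumed data):

import numpy as np

x = np.array([34, 43, 22, 66, 44, 34, 44, 77, 77, 33])

data_range = x.max() - x.min()              # Range (R)
mean_dev   = np.abs(x - x.mean()).mean()    # Mean Deviation (M.D.)
q1, q3     = np.percentile(x, [25, 75])
quart_dev  = (q3 - q1) / 2                  # Quartile Deviation (Q.D.)
std_dev    = x.std(ddof=1)                  # Standard Deviation (S.D.)

print(data_range, round(mean_dev, 2), round(quart_dev, 2), round(std_dev, 2))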

Q10. Kendall’s tau


Ans. Kendall’s tau is another useful measure of correlation and serves as an alternative to Spearman’s rho. This correlation procedure was developed by Kendall (1938). Kendall’s tau is based on an analysis of two sets of ranks, X and Y, and is symbolised by τ, a lowercase Greek letter tau. The parameter (population value) is symbolised as τ, and the statistic computed on the sample is usually symbolised as t. The range of tau is from –1.00 to +1.00. The interpretation of tau is based on the sign and the value of the coefficient: a tau value closer to ±1.00 indicates a stronger relationship, a positive value of tau indicates a positive relationship, and vice versa. It should be noted that Kendall’s Coefficient of Concordance is a different statistic and should not be confused with Kendall’s tau.
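A small sketch (with assumed ranks) computes Kendall's tau with SciPy:

from scipy.stats import kendalltau

x_ranks = [1, 2, 3, 4, 5, 6, 7, 8]
y_ranks = [2, 1, 4, 3, 6, 5, 8, 7]

tau, p_value = kendalltau(x_ranks, y_ranks)
print(f"tau = {tau:.3f}, p = {p_value:.3f}")   # a positive tau indicates a positive relationship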
