Probability Distribution
SC PSYCHOLOGY
CALICUT UNIVERSITY
Prepared by
SAJITHA.K.S
SEMESTER III
Total 01 to 21 60
Question Paper setter has to give equal importance to both theory and problems in sections B
and C.
Objectives
Module 1: Distribution Theory- Binomial, Poisson and Normal Distributions, Mean and Variance
(without derivations), Numerical Problems, Fitting, Importance of Normal Distribution, standard
normal distribution, simple problems using standard normal tables, Central Limit Theorem
(Concepts only)
25 Hours
Module 2: Methods of Sampling- Random Sampling, Simple Random Sampling, Stratified,
Systematic and Cluster Sampling, Non Random sampling, Subjective sampling, Judgment
sampling and convenience sampling
20 Hours
Module 3: Fundamentals of Testing- Type-I & Type-II Errors, Critical Region, Level of
Significance, Power, p value, Tests of Significance
Module 4: Large Sample Tests – Test of a Single Mean, Equality of Two Means, Test of a
Single Proportion, and Equality of Two Proportions
10 Hours
Module 5: Small Sample tests-Test of a Single Mean, Paired and Unpaired t-Test,
Chi-Square Test of Variance, F-Test for the Equality of Variance, Tests of Correlation
20 Hours
References
1. Gupta, S.P. Statistical Methods. New Delhi: Sultan Chand and Sons.
2. Gupta, S.C., & Kapoor, V.K. Fundamentals of Applied Statistics. New Delhi: Sultan
Chand and Sons.
3. Garrett, H.E., & Woodworth, R.S. Statistics in Psychology and Education. Bombay:
Vakils, Feffer and Simons Ltd.
4. Mood, A.M., Graybill, F.A., & Boes, D.C. Introduction to the Theory of Statistics. 3rd
Edition, International Edition.
5. Mukhopadhyay, P. Mathematical Statistics. Calcutta: New Central Book Agency
(P) Ltd.
MODULE 1
DISTRIBUTION THEORY
INTRODUCTION
We have seen that a random variable is a variable subject to random variation, so
that it can take on different values, each with an associated probability. A probability
distribution links each value of the random variable, and hence each outcome of the random
experiment, to its probability of occurrence.
Probability distribution
Binomial distribution
The binomial distribution gives the probability of each possible number of successes in an
experiment or survey that is repeated a fixed number of times, where every trial has only two
possible outcomes: success or failure.
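The binomial probabilities, together with the mean np and variance npq listed in the syllabus, can be sketched as below. This is a minimal illustration only: the function name binomial_pmf and the values n = 10, p = 0.5 (ten tosses of a fair coin) are assumptions chosen for the example, not taken from the notes.

```python
from math import comb

def binomial_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p): C(n, k) * p^k * q^(n - k), with q = 1 - p."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

# Hypothetical example: 10 tosses of a fair coin, "success" = head.
n, p = 10, 0.5
mean = n * p                  # mean of the binomial distribution: np
variance = n * p * (1 - p)    # variance of the binomial distribution: npq

prob_exactly_3 = binomial_pmf(3, n, p)                             # exactly 3 heads
prob_at_most_3 = sum(binomial_pmf(k, n, p) for k in range(0, 4))   # 0, 1, 2 or 3 heads

print(prob_exactly_3, prob_at_most_3)
print(mean, variance)
```

Summing the probabilities over k = 0, 1, ..., n gives 1, which is a quick check that the formula has been applied correctly.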
Poisson Distribution
It was first introduced by Simeon Denis Poisson. The Poisson distribution is a tool that
helps to predict the probability of a certain number of events occurring when you know how
often the events occur on average. It gives us the probability of a given number of events
happening in a fixed interval of time.
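A Poisson probability can be computed directly from the standard formula P(X = k) = e^(-λ) λ^k / k!, where λ is the average number of events per interval; both the mean and the variance of the distribution equal λ. The sketch below is only an illustration: the function name poisson_pmf and the value λ = 2 are assumptions made for the example.

```python
from math import exp, factorial

def poisson_pmf(k, lam):
    """P(X = k) for X ~ Poisson(lam): e^(-lam) * lam^k / k!."""
    return exp(-lam) * lam**k / factorial(k)

# Hypothetical example: on average 2 events occur per interval.
lam = 2.0
prob_none = poisson_pmf(0, lam)                                # no event in the interval
prob_at_most_2 = sum(poisson_pmf(k, lam) for k in range(3))    # 0, 1 or 2 events
prob_more_than_2 = 1 - prob_at_most_2                          # complement rule

print(prob_none, prob_at_most_2, prob_more_than_2)
```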
Standard Normal Distribution
The standard normal distribution, also called the z-distribution, is a special normal distribution where
the mean is 0 and the SD is 1.
Any normal distribution can be standardized by converting its values into z-scores. Z-scores tell you how many
standard deviations from the mean each value lies.
Importance of z scores:
The absolute value of a z-score indicates how far the score lies from the mean,
measured in standard deviations. The sign indicates whether the score lies above
(positive) or below (negative) the mean.
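Standardizing a value and reading a probability from the standard normal distribution can be sketched as follows. The example uses Python's statistics.NormalDist in place of a printed standard normal table; the function name z_score and the figures (a score of 130 in a distribution with mean 100 and SD 15) are hypothetical.

```python
from statistics import NormalDist

def z_score(x, mean, sd):
    """Number of standard deviations by which x lies above (+) or below (-) the mean."""
    return (x - mean) / sd

std_normal = NormalDist(mu=0, sigma=1)   # the standard normal (z) distribution

# Hypothetical example: a score of 130 where the mean is 100 and the SD is 15.
z = z_score(130, mean=100, sd=15)

p_below = std_normal.cdf(z)                          # P(Z <= z), area to the left of z
p_above = 1 - p_below                                # P(Z > z), area to the right of z
p_within_1sd = std_normal.cdf(1) - std_normal.cdf(-1)   # P(-1 <= Z <= 1)

print(z, p_below, p_above, p_within_1sd)
```

The values returned by cdf correspond to the areas that would otherwise be looked up in a standard normal table.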
MODULE 2
METHODS OF SAMPLING
1. The set of all units from which the required information has to be collected is called the
population.
2. The selected representative subset of the population is called the sample.
3. Sampling is the process of selecting this subset, on which we focus our study.
Advantages of sampling
1. Less time
2. Less cost
3. Only technique if the testing process is destructive
4. Only technique when there is practical infeasibility.
5. Only technique when the population is infinite
6. Enough reliability of inference based on sampling
7. Quality of data collected
Disadvantages of sampling
1. Chances of bias
2. Difficulties in selecting truly representative samples
3. Inadequate knowledge in the subject
4. Changeability of units
5. Impossibility of sampling
TECHNIQUES OF SAMPLING
Merits:-
1. Lack of bias
2. Simplicity
3. Less knowledge required
Demerits:-
1. Time consuming
2. Cost involved
3. Difficulty in obtaining a complete list of the population
Demerits
1. Greater risk
2. Assumes size of the population can be determined
Stratified sampling has two main disadvantages: it may require more administrative effort than a
simple random sample, and the analysis is computationally more complex.
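The three random sampling techniques named in the syllabus (simple random, systematic and stratified) can be sketched in a few lines. Everything here is an assumption made for illustration: the function names, the population of 100 numbered units, the two strata, and the use of proportional allocation for the stratified sample.

```python
import random

def simple_random_sample(population, n, rng):
    """Every unit has the same chance of selection; drawn without replacement."""
    return rng.sample(population, n)

def systematic_sample(population, n, rng):
    """Pick a random start in the first interval, then take every k-th unit."""
    k = len(population) // n      # sampling interval
    start = rng.randrange(k)      # random start between 0 and k - 1
    return [population[start + i * k] for i in range(n)]

def stratified_sample(strata, fraction, rng):
    """Draw a simple random sample of the same fraction from every stratum."""
    sample = []
    for units in strata.values():
        size = max(1, round(len(units) * fraction))
        sample.extend(rng.sample(units, size))
    return sample

rng = random.Random(42)           # fixed seed so the example is repeatable
population = list(range(1, 101))  # hypothetical population: units numbered 1 to 100
strata = {"group_a": population[:60], "group_b": population[60:]}  # hypothetical strata

srs = simple_random_sample(population, 10, rng)
systematic = systematic_sample(population, 10, rng)
stratified = stratified_sample(strata, 0.1, rng)

print(srs)
print(systematic)
print(stratified)
```

Note that the systematic sample needs the population size in advance in order to fix the interval k, and the stratified sample needs the units to be sorted into strata before selection.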
a) Convenience sampling:- Under this, the choice of the sample is left completely
to the convenience of the interviewer.
b) Purposive/judgement sampling: Purposive sampling, also known as judgmental,
selective, or subjective sampling, is a form of non-probability sampling in which
researchers rely on their own judgment when choosing members of the population to
participate in their surveys.
c) Quota sampling: Under this, we divide the population into different subpopulations,
from each of which we select units according to our convenience.
d) Snowball sampling:- Under this, existing study subjects recruit future subjects from
among their acquaintances.
• Sampling and nonsampling errors
Sampling error is the error that occurs because the selected sample does not perfectly
represent the population of interest.
Errors that arise from sources other than sampling while conducting survey activities are
known as nonsampling errors. Important nonsampling errors are
• Faulty planning
• Errors in response
• Errors in the design of survey
• Errors in compilation
• Publication errors
MODULE 3
FUNDAMENTALS OF TESTING
Test of Hypothesis
A good hypothesis should be:
• Clear
• Testable
• Related to the existing body of theory
• Logically unified and comprehensive
• Capable of verification
• Operationalisable
• A null hypothesis is the hypothesis that is tested for possible rejection under
the assumption that it is true.
• The hypothesis that is tested against the null hypothesis is called the alternative
hypothesis.
• Type I error:- It occurs when a researcher rejects a null hypothesis that is actually true.
• Type II error:- It occurs when a researcher fails to reject a null hypothesis that is
actually false.
Level of significance
The upper limit for the probability of a Type I error, fixed by the researcher, is called the level of
significance.
Critical region
In a test procedure we calculate a test statistic on which we base our decision. The range
of variation of this statistic is divided into two regions, the acceptance region and the rejection
region. If the computed value of the test statistic falls in the rejection region, we reject the
null hypothesis. The rejection region is also known as the critical region.
Critical value
The value of the test statistic which separates the rejection region from the acceptance
region is called the critical value.
• In a one tailed test, the rejection region is located in only one tail, which may be
either the right or the left.
• A two tailed test is one in which the rejection region is located in both tails: we reject the
null hypothesis if the computed value of the test statistic is significantly greater than the
upper critical value or significantly lower than the lower critical value.
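For a Z test, the critical values and the resulting decision rule for one tailed and two tailed tests can be sketched as below. The sketch uses statistics.NormalDist instead of a printed table; the function names z_critical and reject_null, the 5% level of significance and the computed value z = 1.8 are all assumptions chosen for illustration.

```python
from statistics import NormalDist

def z_critical(alpha, tails):
    """Positive critical value of Z for level of significance alpha."""
    std_normal = NormalDist()
    if tails == 2:
        # alpha is split equally between the two tails
        return std_normal.inv_cdf(1 - alpha / 2)
    # the whole of alpha lies in a single tail
    return std_normal.inv_cdf(1 - alpha)

def reject_null(z, alpha, tail):
    """Return True if the computed z falls in the critical (rejection) region."""
    if tail == "two":
        return abs(z) > z_critical(alpha, 2)
    if tail == "right":
        return z > z_critical(alpha, 1)
    if tail == "left":
        return z < -z_critical(alpha, 1)
    raise ValueError("tail must be 'two', 'right' or 'left'")

two_tailed_cv = z_critical(0.05, 2)   # about 1.96
one_tailed_cv = z_critical(0.05, 1)   # about 1.645

# Hypothetical computed test statistic
z = 1.8
print(two_tailed_cv, one_tailed_cv)
print(reject_null(z, 0.05, "right"), reject_null(z, 0.05, "two"))
```

The same computed value can lead to rejection in a one tailed test but not in a two tailed test, because the two tailed critical value lies further from zero.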
Power of a test
The probability of rejecting the null hypothesis when the alternative hypothesis is true is called
the power of a test. It equals 1 − β, where β is the probability of a Type II error.
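As an illustration of how power relates to the Type II error, the sketch below computes the power of a right tailed Z test of H0: μ = μ0 against a specific alternative μ = μ1 (with μ1 > μ0), assuming the population SD is known. The function name and the numerical values are hypothetical and are not taken from the notes.

```python
from math import sqrt
from statistics import NormalDist

def power_right_tailed_z(mu0, mu1, sigma, n, alpha):
    """Power of a right tailed Z test of H0: mu = mu0 when the true mean is mu1."""
    std_normal = NormalDist()
    z_alpha = std_normal.inv_cdf(1 - alpha)      # critical value of Z
    shift = (mu1 - mu0) / (sigma / sqrt(n))      # true difference in standard-error units
    beta = std_normal.cdf(z_alpha - shift)       # P(Type II error): failing to reject H0
    return 1 - beta                              # power = 1 - beta

# Hypothetical example: H0 mean 100, true mean 105, SD 15, sample size 36, alpha = 0.05
power = power_right_tailed_z(mu0=100, mu1=105, sigma=15, n=36, alpha=0.05)
power_larger_sample = power_right_tailed_z(mu0=100, mu1=105, sigma=15, n=100, alpha=0.05)

print(power, power_larger_sample)
```

Under these assumptions, increasing the sample size raises the power, since the standard error shrinks while the critical value stays the same.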
Sampling Distribution
The sample mean is a random variable, as its value changes from one sample to another.
Therefore, the sample mean has a probability distribution, known as the
sampling distribution of the sample mean.
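The idea of a sampling distribution can be made concrete with a small simulation: draw many samples from the same population, record each sample mean, and look at how those means are distributed. The population used here (the six faces of a fair die), the sample size of 30 and the 5000 repetitions are assumptions chosen only for illustration.

```python
import random
from math import sqrt
from statistics import mean, pstdev, stdev

rng = random.Random(0)                    # fixed seed so the example is repeatable
population_values = [1, 2, 3, 4, 5, 6]    # hypothetical population: a fair die
n = 30                                    # size of each sample
repetitions = 5000                        # number of samples drawn

# Draw many samples (with replacement) and record the mean of each one.
sample_means = [mean(rng.choices(population_values, k=n)) for _ in range(repetitions)]

population_mean = mean(population_values)      # 3.5
population_sd = pstdev(population_values)      # population SD of the die values
standard_error = population_sd / sqrt(n)       # theoretical SD of the sample mean

print(mean(sample_means), population_mean)     # centre of the sampling distribution
print(stdev(sample_means), standard_error)     # spread of the sampling distribution
```

The sample means cluster around the population mean with a spread close to σ/√n, and their histogram is approximately bell shaped even though the die itself is not normally distributed, which is the content of the Central Limit Theorem listed in Module 1.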
T distribution
The t distribution, also known as Student's t-distribution, is a type of probability distribution that
is similar to the normal distribution in its bell shape but has heavier tails.
F distribution
The F-distribution is a particular parametrization of the beta prime distribution, which is also
called the beta distribution of the second kind. The letter F is used to represent a test statistic
that follows the F distribution.
MODULE 4
LARGE SAMPLE TESTS
Meaning
A test is said to be a large sample test if the sample size is greater than 30.
The following is the test procedure for any simple null hypothesis tested
against a simple alternative.
This test is intended to test whether the hypothesized value of the population mean is true, that
is, to test whether the difference between the sample mean and the population mean is
significant or is only due to sampling fluctuations. For this type of problem we can
apply the Z test or the t test. The test procedure is
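➢ Compute the test statistic Z = (x̄ − μ) / (σ/√n), where σ is the population SD. For the t test,
σ is replaced by the sample SD s, giving t = (x̄ − μ) / (s/√n). Here
• x̄ = Mean of Sample
• μ = Mean of Population
• n = Number of Observations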
➢ For the Z test, the degrees of freedom are infinite, while for the t test they are n-1.
➢ Then get the table value of the test statistic for the degrees of freedom and the level of
significance.
➢ Finally take a decision either to accept or to reject the hypothesis originally set in step
I.
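The large sample test of a single mean can be sketched as follows, using the standard statistic Z = (x̄ − μ) / (σ/√n). The function name one_sample_z and all the numbers are hypothetical, and statistics.NormalDist is used in place of the printed table to obtain the critical value and the p value.

```python
from math import sqrt
from statistics import NormalDist

def one_sample_z(sample_mean, mu0, sigma, n):
    """Z statistic for testing H0: population mean = mu0."""
    standard_error = sigma / sqrt(n)
    return (sample_mean - mu0) / standard_error

# Hypothetical data: sample mean 52 from n = 36 observations,
# hypothesized population mean 50, population SD 6.
z = one_sample_z(sample_mean=52, mu0=50, sigma=6, n=36)

alpha = 0.05
critical_value = NormalDist().inv_cdf(1 - alpha / 2)   # two tailed "table value"
reject_h0 = abs(z) > critical_value                    # decision on H0
p_value = 2 * (1 - NormalDist().cdf(abs(z)))           # two tailed p value

print(z, critical_value, reject_h0, p_value)
```

Rejecting when |Z| exceeds the critical value and rejecting when the p value is below the level of significance are two ways of stating the same decision.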
Procedure (test of equality of two means)
➢ Set up a null hypothesis that there is no significant difference between the two means,
i.e. µ1 = µ2.
➢ When the population SD is known or when the samples are large, the test applied is the Z test;
otherwise the t test is applied.
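A minimal sketch of the large sample Z test for the equality of two means is given below, assuming two independent samples and using the sample SDs as estimates of the population SDs. The function name two_sample_z and the figures are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def two_sample_z(mean1, mean2, sd1, sd2, n1, n2):
    """Z statistic for testing H0: mu1 = mu2 with two independent large samples."""
    standard_error = sqrt(sd1**2 / n1 + sd2**2 / n2)   # SE of the difference of means
    return (mean1 - mean2) / standard_error

# Hypothetical data for two independent groups
z = two_sample_z(mean1=68, mean2=65, sd1=8, sd2=6, n1=64, n2=36)

alpha = 0.05
critical_value = NormalDist().inv_cdf(1 - alpha / 2)   # two tailed critical value
reject_h0 = abs(z) > critical_value

print(z, critical_value, reject_h0)
```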
Here the null hypothesis is that there is no significant difference between the sample proportion
and the population proportion, i.e. P = P0. We apply the Z test.
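The test of a single proportion can be sketched with the standard statistic Z = (p − P0) / √(P0 Q0 / n), where p is the sample proportion and Q0 = 1 − P0. The function name one_proportion_z and the data (60 successes in 100 trials, P0 = 0.5) are assumptions made for illustration.

```python
from math import sqrt
from statistics import NormalDist

def one_proportion_z(successes, n, p0):
    """Z statistic for testing H0: population proportion = p0."""
    p_hat = successes / n                      # sample proportion
    standard_error = sqrt(p0 * (1 - p0) / n)   # SE computed under H0
    return (p_hat - p0) / standard_error

# Hypothetical data: 60 successes in 100 trials, H0: P = 0.5
z = one_proportion_z(successes=60, n=100, p0=0.5)

alpha = 0.05
critical_value = NormalDist().inv_cdf(1 - alpha / 2)   # two tailed critical value
reject_h0 = abs(z) > critical_value

print(z, critical_value, reject_h0)
```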
MODULE 5
SMALL SAMPLE TESTS
Meaning
A test is said to be a small sample test if the sample size is less than 30.
Test of a single mean: The single mean (or one-sample) t-test is used to compare the mean of
a variable in a sample of data to a (hypothesized) mean in the population from which our sample data are drawn.
This is important because we seldom have access to data for an entire population.
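For a small sample, the one-sample t statistic t = (x̄ − μ) / (s/√n) with n − 1 degrees of freedom can be computed as in the sketch below. The function name one_sample_t and the eight scores are hypothetical. The Python standard library does not provide t tables, so the computed value would still be compared with the table value for the stated degrees of freedom.

```python
from math import sqrt
from statistics import mean, stdev

def one_sample_t(sample, mu0):
    """Return (t statistic, degrees of freedom) for testing H0: mean = mu0."""
    n = len(sample)
    sample_sd = stdev(sample)                  # sample SD, with n - 1 in the denominator
    t = (mean(sample) - mu0) / (sample_sd / sqrt(n))
    return t, n - 1

# Hypothetical scores of 8 subjects; hypothesized population mean 11.5
scores = [12, 15, 11, 14, 13, 16, 10, 13]
t, df = one_sample_t(scores, mu0=11.5)

print(t, df)   # compare |t| with the t table value for df degrees of freedom
```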
Test of equality of two population means: The two-sample t-test
is used to determine if two population means are equal. A common application is to test if a new process or
Paired t test: The paired sample t-test, sometimes called the dependent sample t-test, is a
statistical procedure used to determine whether the mean difference between two sets of observations is zero. In
a paired sample t-test, each subject or entity is measured twice, resulting in pairs of observations.
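Because each subject is measured twice, the paired t-test reduces to a one-sample t-test on the differences, testing whether their mean is zero. The sketch below assumes hypothetical before and after scores for five subjects; the function name paired_t is also an assumption.

```python
from math import sqrt
from statistics import mean, stdev

def paired_t(before, after):
    """Return (t statistic, degrees of freedom) for H0: mean difference = 0."""
    differences = [a - b for b, a in zip(before, after)]   # one difference per subject
    n = len(differences)
    t = mean(differences) / (stdev(differences) / sqrt(n))
    return t, n - 1

# Hypothetical scores of the same 5 subjects measured twice
before = [10, 12, 9, 11, 13]
after = [12, 13, 11, 14, 13]
t, df = paired_t(before, after)

print(t, df)   # compare |t| with the t table value for df degrees of freedom
```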
Unpaired t test : An unpaired t-test (also known as an independent t-test) is a statistical procedure that
compares the averages/means of two independent or unrelated groups to determine if there is a significant
The chi-square test for variance is a statistical procedure, which assumes a normally distributed population, with a chi-square-distributed test
statistic that is used for determining whether the variance of a variable obtained from a particular sample has the
In statistics, an F-test of equality of variances is a test for the null hypothesis that two normal
Test of correlation
A correlation test is used to evaluate the association between two or more variables. For instance, if we are
interested in knowing whether there is a relationship between the heights of fathers and sons, a correlation