Statistics For Management 2 Marks
Devi AP/DOMS
UNIT I INTRODUCTION
Basic definitions and rules for probability, conditional probability, independence of events,
Bayes' theorem, and random variables. Probability distributions: Binomial, Poisson, Uniform
and Normal distributions.
Hypothesis testing: one-sample and two-sample tests for means and proportions of large
samples (z-test), one-sample and two-sample tests for means of small samples (t-test), F-test for
two sample standard deviations. ANOVA: one-way and two-way.
Chi-square test for single sample standard deviation. Chi-square tests for independence of
attributes and goodness of fit. Sign test for paired data. Rank sum test. Kolmogorov-Smirnov
test for goodness of fit, comparing two populations. Mann-Whitney U test and Kruskal-Wallis
test. One-sample run test.
Mailam Engineering College , Mailam
Unit I
1. Define Probability
Probability is a measure of the degree of uncertainty of an event and, as a corollary, of its
certainty.
Put simply, it is the chance of occurrence of a certain event, expressed
quantitatively.
A random experiment is any experiment whose outcome cannot be predicted or determined in advance. E.g., tossing a
coin or throwing a die.
The set of all possible outcomes of an experiment is called a sample space. E.g., when a coin is tossed, the
result is either head or tail. Let 1 denote head and 0 denote tail. The points 0 and 1 on a straight line
are called sample points or event points.
A sample space whose elements are finite, or infinite but countable, is called a discrete sample
space. E.g., if we toss a coin as many times as required for one head to turn up, then the
sample points are the sequences S1 = (1), S2 = (0,1), S3 = (0,0,1), etc.
A sample space whose elements are infinite and uncountable, or which assumes all the values on the real
line R or on an interval of R, is called a continuous sample space. E.g., all the points on a line.
A sub-collection of a number of sample points under a definite rule or law is called an event.
E.g., in throwing a die, getting an even-numbered face.
An event consisting of only one sample point of a sample space is called a simple event. E.g., let
a die be rolled once and let A be the event that face number 5 turns up; then A is a simple
event.
9. Define / What is meant by a compound event?
When an event is decomposable into a number of simple events, then it is called a compound
event. E.g., the sum of the two numbers shown by the upper face of the two dice is seven in the
simultaneous throw of the two unbiased dice.
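The compound event above can be checked by enumeration. The following is a minimal Python sketch, assuming two fair six-sided dice; the variable names are illustrative and not part of the notes.

```python
from fractions import Fraction
from itertools import product

# All 36 equally likely outcomes of throwing two fair dice.
sample_space = list(product(range(1, 7), repeat=2))

# Compound event: the two upper faces sum to seven.
# It decomposes into the simple events (1,6), (2,5), ..., (6,1).
sum_is_seven = [pair for pair in sample_space if sum(pair) == 7]

p_seven = Fraction(len(sum_is_seven), len(sample_space))
print(sum_is_seven)
print(p_seven)
```

The event consists of six simple events out of 36, so its probability is 6/36 = 1/6.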
The total number of all the possible outcomes of an experiment is called the exhaustive number of cases. E.g., when a die is thrown, any one
of the 6 faces may turn up and therefore there are 6 possible outcomes.
If the happening of one event excludes the happening of the other events, then the events are
mutually exclusive.
If the happening of one event is not affected or influenced by the other events,
then they are independent events.
Events are said to be equally likely if there is no reason to expect any one in preference to
the others. E.g., in throwing a die, all the 6 faces (1, 2, 3, 4, 5, and 6) are equally likely to occur.
The total number of events in a population exhausts the population, so they are known as
collectively exhaustive events.
If in an experiment all possible outcomes have equal chances of occurrence, then such events
are said to be equally probable events. E.g., in throwing a die, all 6 faces have equal chances to occur.
If the happening of any one event does not depend upon the happening of any other event, the events are called
independent events.
The events which are not independent are called dependent events.
18. Name a few discrete probability distributions.
Probability histogram
Mass distribution
Continuous distribution
Cumulative probability
$= 0.0149\left[1 + 4.2 + \frac{17.64}{2}\right]$
$= 0.0149\,[1 + 4.2 + 8.82]$
$= 0.0149 \times 14.02$
$P(X \le 2) \approx 0.2089$
$P(X = x) = \frac{e^{-m} m^x}{x!}, \quad x = 0, 1, 2, \ldots$
25. Two cards are drawn from a deck of 52 cards. Calculate the probability that the draw
includes an ace and a ten.
26. The average number of traffic accidents on a certain section of highway is two per week.
Assume that the number of accidents follows a Poisson distribution with parameter α. Find the
probability of at most three accidents on this section of highway during a 2-week period.
The mean is two accidents per week, so for a 2-week period α = 2 × 2 = 4.
$P(X = x) = \frac{e^{-\alpha} \alpha^x}{x!}$
At most three:
$P(X \le 3) = P(X = 0) + P(X = 1) + P(X = 2) + P(X = 3)$
$= e^{-4}\left[\frac{4^0}{0!} + \frac{4^1}{1!} + \frac{4^2}{2!} + \frac{4^3}{3!}\right]$
$= 0.018316\,[1 + 4 + 8 + 10.667]$
$= 0.018316 \times 23.667$
$P(X \le 3) \approx 0.4335$
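The cumulative Poisson probability in question 26 can be checked numerically. This is a minimal sketch, assuming the mean scales with the length of the period (two per week, hence 4 for a two-week period); the helper names `poisson_pmf` and `poisson_cdf` are illustrative.

```python
import math

def poisson_pmf(x, mean):
    """P(X = x) for a Poisson variable with the given mean."""
    return math.exp(-mean) * mean ** x / math.factorial(x)

def poisson_cdf(k, mean):
    """P(X <= k), summing the pmf from 0 to k."""
    return sum(poisson_pmf(x, mean) for x in range(k + 1))

weekly_mean = 2                    # accidents per week (from the problem)
weeks = 2                          # length of the period in weeks
period_mean = weekly_mean * weeks  # assumed mean for the whole period

p_at_most_three = poisson_cdf(3, period_mean)
print(round(p_at_most_three, 4))
```

With a mean of 4 the function gives about 0.4335; with a mean of 2 (a single week) the same function gives about 0.8571.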
27. State Bayes' theorem, or the rule of inverse probability.
$P(E_i \mid A) = \frac{P(E_i)\,P(A \mid E_i)}{\sum_{j=1}^{n} P(E_j)\,P(A \mid E_j)}$
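28. Find the probability of getting a total of 5 at least once in three tosses of a pair of fair
dice.
A total of 5 arises from 4 of the 36 outcomes, (1,4), (2,3), (3,2) and (4,1), so p = 4/36 = 1/9 and q = 8/9.
P(no total of 5 in 3 tosses) = 3C0 (1/9)^0 (8/9)^3
= 512/729 = 0.7023
P(total of 5 at least once) = 1 − 0.7023
= 0.2977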
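The probability in question 28 can be verified with exact fractions. A minimal sketch, assuming fair dice and independent tosses; the variable names are illustrative.

```python
from fractions import Fraction
from itertools import product

# Probability that one toss of a pair of fair dice totals 5.
outcomes = list(product(range(1, 7), repeat=2))
p = Fraction(sum(1 for a, b in outcomes if a + b == 5), len(outcomes))
q = 1 - p

# P(at least one total of 5 in 3 independent tosses) = 1 - q^3.
p_at_least_once = 1 - q ** 3
print(p, p_at_least_once, float(p_at_least_once))
```

Four of the 36 outcomes total 5, so p = 1/9, and the probability of at least one such total in three tosses is 217/729, about 0.2977.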
Probability of an impossible event is 0
The probability of occurrence of any event lies between 0 and 1, both
inclusive.
Probability of an event = (Total number of favorable cases) / (Total number of equally likely cases)
30. Define Bayes' theorem.
Let $A_1, A_2, \ldots, A_n$ be a collection of events such that $P(A_i) > 0$ for all i,
$P\left(\bigcup_{i=1}^{n} A_i\right) = 1$ and $A_i \cap A_j = \emptyset$ for i ≠ j. Also let B be an event such that P(B) > 0. Then
$P(A_i \mid B) = \frac{P(A_i)\,P(B \mid A_i)}{\sum_{j=1}^{n} P(A_j)\,P(B \mid A_j)}$
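Bayes' theorem as stated above can be sketched in a few lines of Python. The function name `bayes_posteriors` and the machine data are hypothetical, chosen only to illustrate the formula.

```python
def bayes_posteriors(priors, likelihoods):
    """Return P(A_i | B) for each i, given P(A_i) and P(B | A_i)."""
    joint = [p * l for p, l in zip(priors, likelihoods)]
    total = sum(joint)  # P(B), by the total-probability rule
    return [j / total for j in joint]

# Hypothetical data: three machines produce 50%, 30% and 20% of output,
# with defect rates of 2%, 3% and 5%.
priors = [0.5, 0.3, 0.2]
likelihoods = [0.02, 0.03, 0.05]
posteriors = bayes_posteriors(priors, likelihoods)
print([round(p, 4) for p in posteriors])
```

Here B is the event "the item is defective", and each posterior is the probability that a defective item came from the corresponding machine; the posteriors sum to 1.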
31. Define / What is meant by the Binomial distribution?
It is a discrete probability distribution which is obtained when the probability p of the
happening of an event is the same in all the trials and there are only two possible outcomes in each trial.
E.g., the probability of getting a head, when a coin is tossed a number of times, must remain
the same in each toss, i.e., p = ½.
The binomial probability of "r" successes in "n" trials
is given by $P(r) = {}^{n}C_{r}\, q^{n-r} p^{r}$
$\text{Mean} = np = \frac{\sum fx}{\sum f}$ (i.e. $\sum f = N$)
Variance = npq
p + q = 1, q = 1 − p
Expected frequency of r successes in N sets of n trials $= N \cdot {}^{n}C_{r}\, q^{n-r} p^{r}$
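The binomial formula, mean and variance can be evaluated directly. A minimal sketch with a hypothetical case of 10 tosses of a fair coin; `binomial_pmf` is an illustrative helper name.

```python
import math

def binomial_pmf(r, n, p):
    """P(r) = nCr * q^(n-r) * p^r, with q = 1 - p."""
    q = 1 - p
    return math.comb(n, r) * q ** (n - r) * p ** r

n, p = 10, 0.5  # hypothetical: 10 tosses of a fair coin
q = 1 - p
probabilities = [binomial_pmf(r, n, p) for r in range(n + 1)]

mean = n * p          # np
variance = n * p * q  # npq
print(round(binomial_pmf(5, n, p), 4), mean, variance)
```

The probabilities for r = 0, 1, ..., n sum to 1, and multiplying each by N gives the expected frequencies.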
32. Define / What is meant by the Poisson distribution? Write the probability mass function of the
Poisson distribution.
A random variable "X" is said to have a Poisson distribution if
it takes only non-negative integer values and if its distribution is given by
$P(X = x) = \frac{e^{-m} m^x}{x!}$ (i.e. x = 0, 1, 2, 3, ...)
Mean = m, Variance = m, S.D. = $\sqrt{m}$
Expected frequency $= \frac{N e^{-\lambda} \lambda^x}{x!}$ (or) $\frac{N e^{-m} m^x}{x!}$, x = 0, 1, 2, 3, ...
33. Explain the Normal distribution.
The normal distribution is an approximation to the binomial distribution, whether or not "p = q".
As "n" becomes large, the binomial distribution tends to the form of a continuous curve, and
the normal distribution may be expressed through the following formula:
$P(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{(x - \mu)^2}{2\sigma^2}}$
$Z = \frac{x - \mu}{\sigma}$
where x denotes the value of the continuous random variable, σ denotes the standard
deviation, μ denotes the mean of the random variable, e denotes the mathematical
constant approximated by 2.7183, and π denotes the mathematical constant approximated by
3.1416.
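The density and the standard normal variate Z can be computed with the Python standard library. A minimal sketch; the mean of 50 and standard deviation of 10 are hypothetical, and `normal_cdf` is an added convenience based on the error function rather than something given in the notes.

```python
import math

def normal_pdf(x, mu, sigma):
    """Density of the normal distribution with mean mu and s.d. sigma."""
    coeff = 1 / (sigma * math.sqrt(2 * math.pi))
    return coeff * math.exp(-((x - mu) ** 2) / (2 * sigma ** 2))

def z_score(x, mu, sigma):
    """Standard normal variate Z = (x - mu) / sigma."""
    return (x - mu) / sigma

def normal_cdf(x, mu, sigma):
    """P(X <= x), via the error function from the standard library."""
    return 0.5 * (1 + math.erf(z_score(x, mu, sigma) / math.sqrt(2)))

mu, sigma = 50, 10  # hypothetical mean and standard deviation
print(z_score(65, mu, sigma))
print(round(normal_cdf(65, mu, sigma), 4))
```

For x = 65 the sketch gives Z = 1.5, and the cumulative probability agrees with the usual normal table value of about 0.9332.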
36. Give two examples of categorical variables.
1. Type of climate: hot or cold.
2. Favorite ice cream flavor: vanilla or strawberry.
$\text{Mean} = \frac{a + b}{2}, \quad \text{Variance} = \frac{(b - a)^2}{12}$
b) Stratified probability sampling:
(Random selection is made not from the heterogeneous universe as a whole but from its homogeneous strata.)
c) Systematic sampling:-
One unit is selected at random from the universe and the other units are taken at a
specified interval from the selected unit.
d) Cluster sampling:-
The universe is divided into some recognizable sub-groups, which are called clusters.
e) Multi-stage sampling:-
Sample units are selected in 2 or 3 or 4 stages.
f) Area sampling:-
Lists or registers are used as the sampling frame. The plus and minus points of cluster
sampling are also applicable to area sampling.
II) Non-probability sampling:-
The organizers of the inquiry purposively choose particular units of the universe
to constitute a sample, on the basis that the small mass they select out of a huge one will
be typical or representative of the whole.
a) Convenience sampling:-
The researcher chooses the sampling units on the basis of convenience or accessibility.
b) Judgment sampling:-
It is the selection of universe items by means of the expert
judgment of specialists in the subject.
c) Quota sampling:-
It uses the principle of stratification. The bases for stratification in consumer surveys are
commonly demographic.
d) Panel sampling:-
The initial samples are drawn on a random basis and information from them is collected
on a regular basis. It makes it possible to select and quickly contact such well-balanced samples
and to obtain a relatively high response rate, even by mail.
e) Snowball sampling:-
It relies on referrals from initial subjects to generate additional subjects.
2. Sampling errors:-
Sampling errors, or variations among sample statistics, are due to differences between each
sample and the population and among several samples.
o Biased sampling error,
o Unbiased sampling error.
4. Sampling distribution:-
It is the probability distribution, under repeated sampling of the population of a given statistic.
5. Sampling distribution of the mean ($\bar{x}$):-
The probability distribution of all possible values of the sample mean $\bar{x}$ is called the sampling
distribution of $\bar{x}$.
Confidence interval for the population proportion for a large sample: $p \pm Z_{\alpha/2} \sqrt{\frac{pq}{n}}$
9. Central limit theorem:-
When sampling is done from a population with mean $\mu$ and finite standard deviation $\sigma$, the
sampling distribution of the sample mean $\bar{x}$ will tend to a normal distribution with mean $\mu$ and
standard deviation $\frac{\sigma}{\sqrt{n}}$ as the sample size $n$ becomes large.
$\bar{x} \sim N\left(\mu, \frac{\sigma^2}{n}\right)$, i.e. $Z = \frac{\bar{x} - \mu}{\sigma/\sqrt{n}}$
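The large-sample confidence interval for a proportion and the standard deviation of the sample mean can be computed as follows. A minimal sketch: the function names, the sample figures, and the default z = 1.96 (the two-tailed value for 95% confidence) are assumptions made for illustration.

```python
import math

def proportion_confidence_interval(p, n, z=1.96):
    """Large-sample interval p +/- z * sqrt(p*q/n), with q = 1 - p."""
    margin = z * math.sqrt(p * (1 - p) / n)
    return p - margin, p + margin

def standard_error_of_mean(sigma, n):
    """Standard deviation of the sample mean: sigma / sqrt(n)."""
    return sigma / math.sqrt(n)

# Hypothetical sample: 40% of 400 respondents favour a product.
low, high = proportion_confidence_interval(0.4, 400)
print(round(low, 3), round(high, 3))

# Hypothetical population s.d. of 12 and sample size 36.
print(standard_error_of_mean(12, 36))
```

Quadrupling the sample size halves the standard error, which is why the sampling distribution of the mean tightens around μ as n grows.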
10. Estimation:-
Estimation comprises the techniques and methods by which population parameters are estimated
from sample studies.
a) Unbiasedness
b) Efficiency,
c) Consistency,
d) Sufficiency.
$n = \left(\frac{\sigma Z}{E}\right)^2$ (or) $n = \frac{Z^2 pq}{E^2}$
Unit-3
1. Hypothesis:-
A hypothesis is a statement about the population parameter.
2. Tests of hypothesis:-
It is a procedure that helps us to ascertain the likelihood of hypothesized population parameter
being correct by making use of the sample statistic.
5. Type 1 error:-
It is the error of rejecting the null hypothesis $H_0$ when it is true.
6. Type 2 error:-
It is the error of accepting the null hypothesis $H_0$ when it is false.
7. Level of significance:-
The level of significance is the maximum probability of making a type 1 error and it is denoted
by $\alpha$.
A test of a statistical hypothesis where the alternative hypothesis is one-sided is called a
one-tailed test.
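12. Large samples (n ≥ 30) (Z-test) (use the one-tailed or two-tailed table value as appropriate)
a) Test of significance of a single mean:-
$Z = \frac{\bar{x} - \mu}{S.E.(\bar{x})}$, $S.E.(\bar{x}) = \frac{\sigma}{\sqrt{n}}$ (or) $\frac{s}{\sqrt{n}}$
b) Single proportion:-
$Z = \frac{p - P}{S.E.(p)}$, $S.E.(p) = \sqrt{\frac{PQ}{n}}$, $Q = 1 - P$
$S.E.(S_1 - S_2) = \sqrt{\frac{S_1^2}{2 n_1} + \frac{S_2^2}{2 n_2}}$ (population standard deviations unknown)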
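13. Small samples (n < 30) (t-test) (refer to the Student's t table)
$t = \frac{\bar{x} - \mu}{S.E.(\bar{x})}$, $S.E.(\bar{x}) = \frac{S}{\sqrt{n - 1}}$
Here $S.E.(\bar{x}) = \frac{S}{\sqrt{n-1}}$ applies when S is computed with divisor n; when S is computed with divisor n − 1, as below, use $S.E.(\bar{x}) = \frac{S}{\sqrt{n}}$.
$S^2 = \frac{1}{n - 1} \sum (x - \bar{x})^2$, $S = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}}$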
$t = \frac{\bar{D}}{S.E.(\bar{D})}$, $\bar{D} = \frac{\sum D}{n}$
$S = \sqrt{\frac{\sum (D - \bar{D})^2}{n - 1}}$, $S.E.(\bar{D}) = \frac{S}{\sqrt{n}}$
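The paired-difference t statistic can be sketched as follows, using the mean difference, the standard deviation of the differences with divisor n − 1, and the standard error S/√n. The function name and the before/after scores are hypothetical.

```python
import math

def paired_t(before, after):
    """t = D_bar / S.E.(D_bar) for paired observations."""
    d = [a - b for b, a in zip(before, after)]
    n = len(d)
    d_bar = sum(d) / n
    s = math.sqrt(sum((x - d_bar) ** 2 for x in d) / (n - 1))
    return d_bar / (s / math.sqrt(n))

# Hypothetical scores of five trainees before and after a course.
before = [10, 12, 9, 11, 13]
after = [12, 15, 10, 12, 16]
t = paired_t(before, after)
print(round(t, 3))
```

The calculated t is compared with the Student's t table value at n − 1 degrees of freedom (4 in this example).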
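c) Difference between two means:-
$t = \frac{\bar{x}_1 - \bar{x}_2}{S.E.(\bar{x}_1 - \bar{x}_2)}$, $S.E.(\bar{x}_1 - \bar{x}_2) = s \times \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}$
$S_1^2 = \frac{\sum (x - \bar{x})^2}{n_1 - 1}$, $S_2^2 = \frac{\sum (y - \bar{y})^2}{n_2 - 1}$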
$SSC = \frac{\sum_j T_j^2}{r} - C.F$
$SST = \sum_i \sum_j x_{ij}^2 - C.F$
$SSE = SST - SSC$
II) Two way classification:-
Correction factor $(C.F) = \frac{T^2}{rc}$
where $T = \sum_i \sum_j x_{ij}$
$SSC = \frac{\sum_j T_j^2}{r} - C.F$
$SSR = \frac{\sum_i T_i^2}{c} - C.F$
$SST = \sum_i \sum_j x_{ij}^2 - C.F$
$SSE = SST - (SSC + SSR)$
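The two-way sums of squares can be computed directly from a table with one observation per cell. A minimal sketch, assuming r rows and c columns with T_i as row totals and T_j as column totals; the function name and the data are hypothetical.

```python
def two_way_sums_of_squares(table):
    """Return (SSC, SSR, SST, SSE) for an r x c table, one value per cell."""
    r, c = len(table), len(table[0])
    grand_total = sum(sum(row) for row in table)    # T
    cf = grand_total ** 2 / (r * c)                 # correction factor
    row_totals = [sum(row) for row in table]        # T_i
    col_totals = [sum(col) for col in zip(*table)]  # T_j
    ssc = sum(t ** 2 for t in col_totals) / r - cf
    ssr = sum(t ** 2 for t in row_totals) / c - cf
    sst = sum(x ** 2 for row in table for x in row) - cf
    sse = sst - (ssc + ssr)
    return ssc, ssr, sst, sse

# Hypothetical data: 3 rows (blocks) by 3 columns (treatments).
data = [[4, 6, 8],
        [5, 8, 9],
        [6, 7, 10]]
ssc, ssr, sst, sse = two_way_sums_of_squares(data)
print(ssc, round(ssr, 4), sst, round(sse, 4))
```

For this table C.F = 63²/9 = 441, giving SSC = 24, SSR = 14/3, SST = 30 and SSE = 4/3.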
21. Explain the procedure for testing the comparison of two sample proportions.
Procedure:
1. Hypothesis:
Null hypothesis: the two proportions are equal, i.e. $P_1 = P_2$.
Alternate hypothesis: the two proportions are unequal, i.e. $P_1 \ne P_2$.
2. Test statistic
$Z = \frac{p_1 - p_2}{S.E.(p_1 - p_2)}$
$S.E.(p_1 - p_2) = \sqrt{\frac{p_1 q_1}{n_1} + \frac{p_2 q_2}{n_2}}$, where q = 1 − p
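The test statistic for two sample proportions can be sketched as follows, using the standard error given above. The function name, the sample counts, and the two-tailed 5% critical value of 1.96 are assumptions for illustration.

```python
import math

def two_proportion_z(x1, n1, x2, n2):
    """Z = (p1 - p2) / S.E.(p1 - p2), using the unpooled standard error."""
    p1, p2 = x1 / n1, x2 / n2
    se = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    return (p1 - p2) / se

# Hypothetical samples: 120 of 400 and 90 of 400 prefer a brand.
z = two_proportion_z(120, 400, 90, 400)
print(round(z, 3))
reject_at_5_percent = abs(z) > 1.96  # two-tailed critical value
print(reject_at_5_percent)
```

Here |Z| exceeds 1.96, so the null hypothesis of equal proportions would be rejected at the 5% level for these hypothetical figures.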
UNIT-4
The population sampled is continuous and symmetrical.
If n is less than 30 (n < 30), we use the Binomial (or) Poisson distribution:
$P(X = r) = {}^{n}C_{r}\, p^{r} q^{n-r}$, $P(X = x) = \frac{e^{-\alpha} \alpha^{x}}{x!}$
If n is greater than or equal to 30 (n ≥ 30), we use the normal distribution:
$Z = \frac{x - nQ}{\sqrt{nQ(1 - Q)}}$
For the table value, check whether the test is one-tailed or two-tailed.
a) Mann-Whitney U test
It is used to determine whether two independent samples have been drawn from
populations with same distribution.
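Arrange the combined data in ascending order and rank them.
$U = n_1 n_2 + \frac{n_1 (n_1 + 1)}{2} - R_1$
$U = n_1 n_2 + \frac{n_2 (n_2 + 1)}{2} - R_2$
where $R_1$ and $R_2$ are the sums of the ranks of the first and second samples.
$\mu_U = \frac{n_1 n_2}{2}$, $\sigma_U^2 = \frac{n_1 n_2 (n_1 + n_2 + 1)}{12}$
$Z = \frac{U - \mu_U}{\sigma_U}$
For the table value, check whether the test is one-tailed or two-tailed.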
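The Mann-Whitney calculation can be sketched in Python: rank the combined data, sum the ranks of the first sample, then form U and Z. This minimal sketch assumes there are no tied values; the function name and the two groups are hypothetical.

```python
import math

def mann_whitney_z(sample1, sample2):
    """Return (U, Z) using U = n1*n2 + n1(n1+1)/2 - R1 (no ties assumed)."""
    n1, n2 = len(sample1), len(sample2)
    combined = sorted(sample1 + sample2)
    rank = {value: i + 1 for i, value in enumerate(combined)}  # ranks 1..n1+n2
    r1 = sum(rank[v] for v in sample1)                         # rank sum R1
    u = n1 * n2 + n1 * (n1 + 1) / 2 - r1
    mu_u = n1 * n2 / 2
    sigma_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    return u, (u - mu_u) / sigma_u

# Hypothetical scores from two independent groups (all values distinct).
group_a = [12, 15, 18, 21, 30]
group_b = [10, 14, 16, 19, 22, 27]
u, z = mann_whitney_z(group_a, group_b)
print(u, round(z, 3))
```

With these figures R1 = 31, so U = 30 + 15 − 31 = 14 against a mean of 15, and Z is close to zero. Tied values would need average ranks, which this sketch does not handle.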
$\mu_V = \frac{2 n_1 n_2}{n_1 + n_2} + 1$, V = number of runs
$\sigma_V^2 = \frac{2 n_1 n_2 (2 n_1 n_2 - n_1 - n_2)}{(n_1 + n_2)^2 (n_1 + n_2 - 1)}$
For the table value, check whether the test is one-tailed or two-tailed.
5. Run above and run below test:-
The test counts the runs of values falling above and below the median of the sample.
$Z = \frac{V - \mu_V}{\sigma_V}$
$\mu_V = \frac{2 n_1 n_2}{n_1 + n_2} + 1$, V = number of runs
$\sigma_V^2 = \frac{2 n_1 n_2 (2 n_1 n_2 - n_1 - n_2)}{(n_1 + n_2)^2 (n_1 + n_2 - 1)}$
For the table value, check whether the test is one-tailed or two-tailed.
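The runs statistic can be computed from a sequence of two symbols. A minimal sketch, assuming the standard variance with the product (n1 + n2)²(n1 + n2 − 1) in the denominator; the function name and the sequence of A (above the median) and B (below the median) are hypothetical.

```python
import math

def runs_test_z(sequence):
    """Return (V, mu_V, Z) for a sequence made of two kinds of symbols."""
    symbols = sorted(set(sequence))
    n1 = sequence.count(symbols[0])
    n2 = sequence.count(symbols[1])
    # A new run starts whenever a symbol differs from the previous one.
    v = 1 + sum(1 for a, b in zip(sequence, sequence[1:]) if a != b)
    mu_v = 2 * n1 * n2 / (n1 + n2) + 1
    var_v = (2 * n1 * n2 * (2 * n1 * n2 - n1 - n2)
             / ((n1 + n2) ** 2 * (n1 + n2 - 1)))
    return v, mu_v, (v - mu_v) / math.sqrt(var_v)

# Hypothetical sequence: A = above the median, B = below the median.
v, mu_v, z = runs_test_z("AAAABBBBAABBAB")
print(v, mu_v, round(z, 3))
```

The sequence has 7 values of each kind and 6 runs against an expected 8, giving Z of about −1.113, which is then compared with the normal table value.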
$D_n = \max |F_e - F_o|$
8. What are the types of variables used in the goodness of fit chi-square test?
i)observed frequency
ii)expected frequency
1.Rank correlation
2.Chi-square test.
13. Distinguish between the Mann-Whitney test and the Kruskal-Wallis test.
The Mann-Whitney test compares two groups, while the Kruskal-Wallis test compares three
or more groups.
The Kruskal-Wallis test is a non-parametric test which is used to compare three or more
groups of sample data.
Procedure:
$U = n_1 n_2 + \frac{n_1 (n_1 + 1)}{2} - R_1$
$\mu_U = \frac{n_1 n_2}{2}$, $\sigma_U^2 = \frac{n_1 n_2 (n_1 + n_2 + 1)}{12}$
$Z = \frac{U - \mu_U}{\sigma_U}$
5. Fix the level of significance at 5%.
6. Find the table value from the chi-square table.
7. Compare the calculated value with the table value and state the decision: whether the null
hypothesis is accepted or rejected.
Unit-5
1. What is meant by correlation analysis?
Correlation analysis is a statistical technique used to describe not only the degree of
relationship between the variables, but also the direction of influences.
2. Define Correlation.
Correlation analysis attempts to determine the degree of relationship between two
variables.
According to A.M.Tuttle “Correlation is an analysis of the co-variation between two or
more variables.”
Scatter diagram
Karl Pearson's coefficient of correlation or covariance method
Spearman’s Rank Correlation method
Two way frequency method
Concurrent deviation method
The coefficient of determination is the square of the coefficient of correlation, i.e. $r^2$, where r is the coefficient of
correlation.
Simple, partial & multiple correlations
When only two variables are studied, it is a problem of simple correlation.
When three or more variables are studied, it is a problem of either multiple or partial
correlation.
6. Regression analysis:
Regression is the measure of the average relationship between 2 or more
variables in terms of the original units of the data.
E.g., if we know that advertising and sales are correlated, we can find the expected
amount of sales for a given advertising expenditure, or the required amount of expenditure
for attaining a given amount of sales.
Simple Linear Regression: A regression using only one predictor is called a simple
regression.
Multiple Regressions: Where there are two or more predictors, multiple regression
analysis is employed.
$r = \frac{\sum d_x d_y - \frac{\sum d_x \sum d_y}{n}}{\sqrt{\sum d_x^2 - \frac{(\sum d_x)^2}{n}}\;\sqrt{\sum d_y^2 - \frac{(\sum d_y)^2}{n}}}$
b) Spearman's Rank correlation:-
$r = 1 - \frac{6 \sum D^2}{n(n^2 - 1)}$
When ranks are repeated,
$r = 1 - \frac{6\left(\sum D^2 + \frac{1}{12}(m_1^3 - m_1) + \frac{1}{12}(m_2^3 - m_2) + \cdots\right)}{n(n^2 - 1)}$
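The rank correlation formula for untied ranks can be evaluated as follows. A minimal sketch; the function name and the two judges' rankings are hypothetical, and the correction for repeated ranks is not included.

```python
def spearman_rank_correlation(rank_x, rank_y):
    """r = 1 - 6 * sum(D^2) / (n (n^2 - 1)) for untied ranks."""
    n = len(rank_x)
    sum_d2 = sum((rx - ry) ** 2 for rx, ry in zip(rank_x, rank_y))
    return 1 - 6 * sum_d2 / (n * (n ** 2 - 1))

# Hypothetical ranks given by two judges to five contestants.
judge_1 = [1, 2, 3, 4, 5]
judge_2 = [2, 1, 4, 3, 5]
r = spearman_rank_correlation(judge_1, judge_2)
print(r)
```

Here the rank differences D are −1, 1, −1, 1 and 0, so ΣD² = 4 and r = 1 − 24/120 = 0.8.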
c) Coefficient of concurrent deviation:-
$r_c = \pm\sqrt{\pm\frac{2c - n}{n}}$ (short-term oscillation)
d) Coefficient of determination:-
Coefficient of determination $= r^2$
$b_{yx} = \frac{\sum xy - \frac{\sum x \sum y}{n}}{\sum x^2 - \frac{(\sum x)^2}{n}}$
Coefficient of regression:-
$r = \sqrt{b_{yx} \times b_{xy}}$
$a = \frac{\sum y - c \sum x^2}{N}$,
$b = \frac{\sum xy}{\sum x^2}$,
$c = \frac{N \sum x^2 y - (\sum x^2)(\sum y)}{N \sum x^4 - (\sum x^2)^2}$
Correlation:
1. Correlation measures the degree of relationship between 2 variables.
2. It cannot be used for grouped frequency distribution.
Regression:
1. It is a mathematical measure of the average relationship between 2 or more variables.
2. It can be used for units of data.