BA4101 - Statistics - For - Management - Revised
1 K2
n=5, mean = 148 1
median = ((n+1)/2)th term = 160, mode = most repeated value = 170
1
5 If P(A/B) = 0.2 and P(B)=0.4 then find P(A ∩ B).
P(A∩B) = P(A/B)xP(B) 1 K2
1
P(A∩B) = 0.2 x 0.4 = 0.08 1
6 What is the mean and variance of Binomial distribution?
mean= np, 1 1 K1
variance = npq 1
7 When do you say two events are independent?
P(A∩B) = P(A)x P(B) 1 K1
2
8 What is a random experiment?
An experiment is said to be a random experiment if its outcome cannot be predicted 1 K1
2
with certainty.
9 Define a sample space.
The set of all possible outcomes of an experiment is called the sample space. It is 1 K1
2
denoted by 'S' and its number of elements is n(S).
10 List the properties of Normal distribution.
(i) The mean of a normally distributed population lies at the centre of the normal
curve
(ii) The mean, median and mode of the normal distribution coincide. 1 K1
(iii) The curve is asymptotic to the x-axis on both sides. 2
(iv) The normal curve is bell shaped and symmetrical about the vertical line through
the centre
5 What is point estimate?
A point estimate is a single number, which is used to estimate an unknown population
parameter. It is the common way of expressing an estimate. In other words, the point 2 K1
2
estimate does not give any idea about the reliability or precision of the method of
estimation used.
6 An automobile repair shop has taken a random sample of 40 services and found that the average
service time on an automobile is 130 min with a standard deviation of 26 min. Find the
standard error of the mean. 2 K2
n = 40, x̄ = 130, s = 26 1
S.E = s/√n = 26/√40 = 4.11 1
7 Define Strata.
Groups within a population formed in such a way that each group is relatively 2
2 K1
homogeneous but wider variability exists among the separate groups.
1 Define Hypothesis
A Hypothesis is a proposition or a statement that we would like to verify whether it is
2 3 K1
true or not.
2 Differentiate one tailed and two tailed test.
One tail test: A statistical hypothesis test in which the alternative hypothesis is
1
specified such that only one direction of the possible values is considered.
Two tail test: A statistical hypothesis test in which the alternative hypothesis is 3 K2
specified such that it includes both the higher and the lower values of a parameter than 1
the value specified in the null hypothesis.
3 Define critical region.
The set of values of the test statistic that will cause us to reject the null hypothesis. 3 K1
2
4 Define Type I and Type II errors.
Type I Error: An error caused by rejecting a null hypothesis that is true. 1 3 K1
Type II Error: An error caused by accepting a null hypothesis that is false. 1
5 List out the different applications of t-test.
The t-test is applied for small samples, i.e. n < 30. 1
It is used to test whether there is any significant difference between sample mean and 3 K1
1
population mean.
6 What is Null Hypothesis and Alternative Hypothesis?
A hypothesis of no difference is called the null hypothesis. It is denoted by H0. 1 3 K1
The alternative hypothesis is the opposite of the null hypothesis. It is denoted by H1. 1
7 Write down the formula to test the difference of two means in case of large sample.
Z = |x̄1 - x̄2| / √(s1²/n1 + s2²/n2) 1
where 3 K1
x̄1 and x̄2 are the sample means, n1 and n2 are the sample sizes, and s1 and s2 are the sample 1
standard deviations.
8 What is ANOVA? Why is ANOVA helpful?
Analysis of Variance (ANOVA) is a statistical technique used to test the equality of
1 3 K1
three or more sample means.
It is helpful to test the homogeneity of several means. 1
9 What is level of significance?
The maximum probability of committing a Type I error that we are prepared to risk is called the level of significance. 1 3 K1
A Type I error is rejecting H0 when H0 is true. 1
10 Give the test statistic for testing the significance of the difference between the variances of two small
samples.
S1² = Σ(x1 - x̄1)² / (n1 - 1), S2² = Σ(x2 - x̄2)² / (n2 - 1) 3 K1
1
F test: F = S2² / S1² (the larger variance is taken in the numerator) 1
11 Write the table of one-way ANOVA.
Source            Sum of squares   DoF   Mean square        F ratio
Between samples   SSR              R-1   MSR = SSR/(R-1)    F = MSR/MSE (or) MSE/MSR 3 K1
Within samples    SSE              N-R   MSE = SSE/(N-R) 2
12 State the basic principles of Experimental Design.
Replication, Randomization and Local control. 2 3 K1
13 What are the assumptions based on ANOVA?
The observations are independent; The samples are drawn from normal population. 1 3 K1
Various treatment and environmental effects are additive in nature.
1
14 Define Degrees of freedom.
The number of elements that can be chosen freely is called Degrees of freedom. 2 3 K1
15 Define test statistic.
The value of z or t calculated for a sample statistic such as the sample mean or the 3 K1
sample proportion is known as test statistic.
4 How do you find the degrees of freedom in the case of a chi-square test?
Goodness of fit: degrees of freedom = n-1 1 4 K1
Contingency table: degrees of freedom = (r-1)(c-1) 1
5 Explain the term “run” with an example.
A sequence of identical occurrences that may be preceded and followed by different
1
occurrences. 4 K2
Example: The response from 10 school boys for the question “ whether the boys prefer
1
their home near to school” - y y y n n y n y y
6 Kruskal Wallis test is the non parametric ANOVA. Why?
The Kruskal-Wallis test is used to test whether 3 or more populations are identical. It is a non
4 K2
parametric test, whereas ANOVA is a parametric test used to compare more than 2 2
samples.
7 What are the major advantages of non parametric methods over parametric methods?
No assumptions are required, more suitable when ranked or scaled. 1 4 K1
Do not take much time, simple calculations. 1
8 Define the Kolmogorov-Smirnov test.
A non parametric test that is concerned with the degrees of agreement between a set of 4 K1
observed ranks and a theoretical frequency distribution. 2
9 Two HR managers A and B ranked 5 candidates for a new position. Their rankings of the
candidates are shown below. Compute the Spearman’s Rank Correlation
Candidate name: Nancy Linda Oviya John Mary
Rank by A: 2 1 3 5 4 4 K2
Rank by B: 1 3 4 5 2
Σd² = 10, r = 1 – 6Σd² / (n(n² - 1)) = 1 – 60/120 1
r = 0.5, positive correlation 1
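The rank-correlation arithmetic above can be checked with a short sketch. The function name `spearman_rho` is illustrative only, and the formula assumes there are no tied ranks, as is the case here.

```python
def spearman_rho(rank_a, rank_b):
    """Spearman's rank correlation, r = 1 - 6*sum(d^2) / (n(n^2 - 1)), for untied ranks."""
    n = len(rank_a)
    sum_d2 = sum((a - b) ** 2 for a, b in zip(rank_a, rank_b))
    return 1 - 6 * sum_d2 / (n * (n * n - 1))

# Ranks given by managers A and B to Nancy, Linda, Oviya, John, Mary
rank_a = [2, 1, 3, 5, 4]
rank_b = [1, 3, 4, 5, 2]

rho = spearman_rho(rank_a, rank_b)
print(rho)
```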
10 Write the formula for chi – Square test of single standard deviation.
4 K1
Chi square = Σ ((Oi –Ei)2 / Ei) 2
11 Define one sample Runs test
A non parametric test used for determining whether the items in a sample were selected 4 K1
2
randomly.
12 Write the standard error of the U statistic of the Mann Whitney U test.
4 K1
σU = √(n1 n2 (n1 + n2 + 1) / 12) 2
13 A chi square value is never negative. Give reason.
4 K2
Differences between observed and expected frequencies are squared. 1
14 A contingency table for a chi square test has 5 columns and 4 rows . how many degrees of
freedom should be used?
4 K2
DoF = (r-1) x (c-1) = (4-1) x (5-1) 1
DoF = 12 1
15 Disadvantages of non parametric test.
They are based on a limited amount of information and do not make use of all the available 2 4 K2
information, so a good deal of information is lost.
3 How will you test the accuracy of regression equation?
If one of the regression coefficients is greater than unity the other must be less than
1 5 K2
unity.
Both the regression lines pass through the point (x̄, ȳ), where x̄ and ȳ are the means of x and y. 1
4 What is non sense correlation?
Correlating irrelevant things is a non sense correlation. When r =0, the correlation 5 K1
2
between x and y is said to be non sense correlation.
5 Find the mean values of the variables X & Y from the following regression
lines 2y – x = 50 ; 3y – 2x = 10
Solve the two equations simultaneously: x = 2y – 50, so 3y – 2(2y – 50) = 10 1 5 K2
x̄ = 130, ȳ = 90 1
6 Why do we use multiple regressions instead of simple regression in estimating a
dependent variable? 5 K2
When we deal with more than one independent variable we use multiple regression. 2
7 What are the changes or variations involved in time series analysis?
Secular trend, seasonal variations, cyclical variations and irregular variations. 2 5 K1
8 What is meant by forecasting errors?
Mean absolute deviation MAD = Σ|yi - ŷi| / n 1 5 K1
where yi = actual value, ŷi = fitted value 1
9 State the major application of Time series analysis.
Analysis of Time series is helpful in economics, Business and science 1 5 K1
It helps in understanding past behavior of a variable 1
10 What are the methods for measuring cyclical variations?
Residual method, Direct method, Reference cycle analysis method, harmonic analysis 2 5 K1
method.
11 Give some examples of time series.
sales, production, import , export, population over the last 5 or 10 years. 2 5 K2
12 What is regression coefficient?
bxy = (nΣxy – ΣxΣy) / (nΣy² – (Σy)²), 1 5 K1
byx = (nΣxy – ΣxΣy) / (nΣx² – (Σx)²) 1
13 What are the uses of Regression analysis?
It can be useful to all natural, social and physical sciences where the data are in
1 5 K1
functional relationship.
It helps in prediction and it can estimate the values of unknown quantities. 1
14 What are the time series forecast error measures?
Root mean square error (RMSE) 1
5 K1
Mean absolute Deviation (MAD) 1
Mean Absolute Percentage Error (MAPE)
15 List the Merits and demerits of least Square method.
Merits: this is mathematical method and is completely objective in character. This
method gives trend value for the entire time period. 5 K1
1
Demerits: Time consuming, Tedious. The type of curve to be fitted has to be selected
carefully.
1
UNIT I – INTRODUCTION
1 A sales representative can convert a customer into a potential buyer with a probability of 70%.
If he is able to meet 10 customers in a day, find the probability of converting (i) at least one
customer (ii) not even a single customer (iii) exactly one customer
12 1 K2
Binomial distribution: P(X = x) = nCx p^x q^(n-x), with n = 10, p = 0.7, q = 0.3
4
(i) P(X≥1)= 0.9999
(ii) P(X=0)= 0.0000059 4
(iii) P(X =1) = 0.0001378 4
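The three binomial probabilities above can be reproduced with a minimal sketch using only the standard library. The helper name `binom_pmf` is an illustrative choice, not part of the answer key.

```python
from math import comb

def binom_pmf(x, n, p):
    """P(X = x) for a binomial distribution with n trials and success probability p."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)

n, p = 10, 0.7                      # 10 customers, 70% conversion probability

p_none = binom_pmf(0, n, p)         # (ii) not even a single customer
p_at_least_one = 1 - p_none         # (i) at least one customer
p_exactly_one = binom_pmf(1, n, p)  # (iii) exactly one customer

print(round(p_at_least_one, 4), p_none, p_exactly_one)
```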
2 In a bolt factory machines A, B and C manufacture respectively 25%, 35% and
40% of the total output. Of their output 5%, 4% and 2% are defective bolts. A bolt is drawn
at random from the product and found to be defective. What are the probabilities 12
that it was manufactured by machines A, B and C?
K2
Formula (Bayes' theorem) and assumptions 1
3
P(E1/A)= 25/69 3
P(E2/A)= 28/69 3
P(E3/A)= 16/69 3
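The posterior probabilities above follow from Bayes' theorem, which can be sketched with exact fractions. The function name `posterior` is hypothetical, and the priors and defect rates are taken directly from the question.

```python
from fractions import Fraction

def posterior(priors, likelihoods):
    """Bayes' theorem: P(Ei | A) = P(Ei) P(A | Ei) / sum_j P(Ej) P(A | Ej)."""
    joint = [p * l for p, l in zip(priors, likelihoods)]
    total = sum(joint)              # P(A): overall probability of a defective bolt
    return [j / total for j in joint]

priors = [Fraction(25, 100), Fraction(35, 100), Fraction(40, 100)]    # machines A, B, C
defect_rates = [Fraction(5, 100), Fraction(4, 100), Fraction(2, 100)]

post = posterior(priors, defect_rates)
print(post)
```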
3 In a test of 2000 electric bulbs it was found that the life of a particular make was
normally distributed with an average life of 2040 hours and SD of 60 hours.
12
Estimate the number of bulbs likely to burn for (i) more than 2150 hours (ii) less
than 1950 hours (iii) more than 1920 hours but less than 2160 hours 1 K2
Normal distribution: Formula and given data N=2000 3
P(X>2150) = 0.0336, N= 2000 x 0.0336= 67 3
P(X<1950)= 0.0668, N = 134 3
P(1920<X<2100) = 0.8185, N= 1637 3
4 Marks of PG class are recorded and the head of the department need to do an
analysis based on the student’s performance on the particular paper for the
improvement of result. The data are grouped in the following table. Find the
mean and median. Also find the mark which maximum students obtained. 12
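Marks: 20-30 30-40 40-50 50-60 60-70
No. of students: 3 5 20 10 5 1 K2
Assumed mean A = 45, Σfd = 90, n = 43
Mean = A + Σfd/n = 45 + 90/43 = 47.093
Median = l1 + (l2 - l1)(m - c)/f = 40 + 10(21.5 - 8)/20 = 46.75 ≈ 47
Mode = l1 + [(f1 - f0) i] / [(f1 - f0) + (f1 - f2)] = 40 + (15 x 10)/25 = 46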
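As a cross-check of the grouped-data formulas, the sketch below evaluates mean, median and mode for the frequency table in the question. Variable names are illustrative; with these frequencies the formulas evaluate to a mean of about 47.09, a median of 46.75 and a mode of 46.

```python
lower = [20, 30, 40, 50, 60]   # lower class limits
freq = [3, 5, 20, 10, 5]       # class frequencies
width = 10                     # class interval i
n = sum(freq)

# Mean from class midpoints
mean = sum(f * (l + width / 2) for l, f in zip(lower, freq)) / n

# Median: first class whose cumulative frequency reaches n/2
cum = 0
for l, f in zip(lower, freq):
    if cum + f >= n / 2:
        median = l + (n / 2 - cum) / f * width
        break
    cum += f

# Mode: class with the highest frequency, using its neighbours f0 and f2
k = freq.index(max(freq))
f1 = freq[k]
f0 = freq[k - 1] if k > 0 else 0
f2 = freq[k + 1] if k < len(freq) - 1 else 0
mode = lower[k] + (f1 - f0) * width / ((f1 - f0) + (f1 - f2))

print(round(mean, 3), median, mode)
```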
5 Four coins were tossed 150 times and the following results were obtained
No of heads: 0 1 2 3 4
12
Observed frequency: 28 62 46 10 4
Fit a binomial distribution.
Mean = Σfx / Σf = 200/150 = 1.33 1 K2
Mean = np with n = 4 coins, therefore p = 0.3333, q = 0.6667
N = 150, expected frequency = N . nCx p^x q^(n-x)
No of heads: 0 1 2 3 4
Expected frequency: 29.63 59.26 44.44 14.81 1.85
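The fitting step can be sketched as follows, assuming n = 4 because four coins are tossed, so that p = mean/4. Names such as `n_coins` are illustrative.

```python
from math import comb

observed = [28, 62, 46, 10, 4]   # frequencies for 0, 1, 2, 3, 4 heads
n_coins = 4                      # four coins per toss
N = sum(observed)                # 150 tosses

mean = sum(x * f for x, f in enumerate(observed)) / N
p = mean / n_coins               # from mean = n p
q = 1 - p

# Expected frequencies N * nCx * p^x * q^(n-x)
expected = [N * comb(n_coins, x) * p**x * q ** (n_coins - x)
            for x in range(n_coins + 1)]
print([round(e, 2) for e in expected])
```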
6 The ages of a sample of 8 faculty members selected from the school of business
administration are shown below. Compute average age, mode, median age and
S.D 12
Faculty: 1 2 3 4 5 6 7 8
Age: 42 30 73 50 51 37 42 59 2 K2
Mean = Σx/n = 384/8 = 48 4
Median = ((n+1)/2)th term = (42 + 50)/2 = 46 4
Mode = 42, S.D = √(Σ(x - x̄)²/n) = 12.63 4
1 Ten individuals are chosen at random from a population and their heights are
found to be in inches. 63,63,66,67,68,69,70,70,71,71. Test whether the mean height in the
universe is 66 inches. 12 3 K2
Hypothesis H0: µ = 66; n = 10, µ = 66, x̄ = 67.8 2
Σ(x - x̄)² = 81.6, S² = Σ(x - x̄)² / (n - 1) = 9.07, S = 3.01 4
t = |x̄ - µ| / (S/√n) = 1.8 / 0.952 = 1.89 4
T.V = 2.262, Degrees of freedom = 9, LOS = 5%. Since 1.89 < 2.262, accept Ho 2
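The one-sample t statistic can be recomputed from the raw heights with a short sketch. The critical value 2.262 (9 degrees of freedom, 5% two-tailed) is taken from the answer above rather than computed, and the variable names are illustrative.

```python
from math import sqrt

heights = [63, 63, 66, 67, 68, 69, 70, 70, 71, 71]
mu0 = 66                       # hypothesised population mean
n = len(heights)

xbar = sum(heights) / n
ss = sum((x - xbar) ** 2 for x in heights)   # sum of squared deviations
s2 = ss / (n - 1)                            # unbiased sample variance
t = abs(xbar - mu0) / sqrt(s2 / n)

t_critical = 2.262             # table value quoted in the answer key
print(round(t, 2), "accept H0" if t < t_critical else "reject H0")
```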
2 The average number of articles produced by two machines per day are 200 and
250 with standard deviations 20 and 25 respectively on the basis of records of 25
days of production. Can you regard both the machines as equally efficient at the 1% 12
level of significance?
3 K2
Hypothesis, Level of significance = 1%, Given values: x̄1 = 200, x̄2 = 250, s1 = 20,
s2 = 25, n1 = 25, n2 = 25
2
S² = (n1 s1² + n2 s2²) / (n1 + n2 - 2) = 533.85, S = 23.11 4
t = |x̄1 - x̄2| / (S √(1/n1 + 1/n2)) = 7.65 4
T.V = 2.576, Reject Ho 2
3 Time taken by workers in performing a job are given below
Method I: 20 16 26 27 23 22 -
Method II: 27 33 42 35 32 34 38 12
Test at 1% level of significance whether there is any significant difference between the
variance of time distribution. 3 K2
Father X 65 63 67 64 68 62 70 66 68 67 69 71
12
Son Y 68 66 68 65 69 66 68 65 71 67 68 70
2
R1 = 93, n1 = 8, n2 = 10
2
U = n1 n2 + n1(n1 + 1)/2 – R1 = 23, µ = n1 n2 / 2 = 40, σ = √(n1 n2 (n1 + n2 + 1) / 12) = 11.25
6
|Z| = |U - µ|/σ = 1.51, T.V = 1.96, Accept Ho 2
4 Use sign test to see if there is a difference between the number of days until
collection of an account receivable before and after collection policy. Use 0.05 level
of significance 12
Before: 30 28 34 35 40 42 33 38 34 45 28 27 25 41 36
After: 34 29 33 32 47 43 40 42 37 44 27 33 30 38 36 4 K3
Signs of (Before - After): n = 14 (one tie dropped), no of + = 5, no of - = 9, p = 5/14 = 0.357, q = 0.643, 2
Hypothesis, LoS = 5% 2
Standard error = σp = √(pq/n) = 0.13 4
Limits of acceptance region = (P Ho ± 1.96 σp) = 0.5 ± 1.96 x 0.13 = (0.24, 0.75). Accept Ho 4
5 Discuss the association between general and mathematical abilities for 1000 school boys.
                       General Ability
Mathematical Ability   Good   Fair   Poor
Good                   44     22     4
Fair                   265    257    178
Poor                   41     91     98
12 4 K3
Calculate expected values, E = (row total x column total) / N. 2
Hypothesis, LoS = 5% 2
Chi square = Σ ((Oi – Ei)² / Ei) ≈ 69.14 4
DoF = (r-1)(c-1) = 4, T.V = 9.488. Reject Ho 4
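The chi-square statistic for the 3 x 3 ability table can be recomputed with a minimal sketch. The table value 9.488 used for comparison is the standard 5% critical value for 4 degrees of freedom, and the variable names are illustrative.

```python
# Rows: mathematical ability (Good, Fair, Poor); columns: general ability (Good, Fair, Poor)
observed = [
    [44, 22, 4],
    [265, 257, 178],
    [41, 91, 98],
]

row_tot = [sum(row) for row in observed]
col_tot = [sum(col) for col in zip(*observed)]
N = sum(row_tot)

chi2 = 0.0
for i, row in enumerate(observed):
    for j, o in enumerate(row):
        e = row_tot[i] * col_tot[j] / N      # expected frequency under independence
        chi2 += (o - e) ** 2 / e

dof = (len(observed) - 1) * (len(observed[0]) - 1)
critical_5pct = 9.488                        # chi-square table value for 4 d.o.f.
print(round(chi2, 2), dof, "reject H0" if chi2 > critical_5pct else "accept H0")
```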
6 The marks secured in History test of the students in public school and Primary
schools are given below
Primary: 73 75 83 77 72 69 56 80 68 60 84
Public: 70 78 79 81 65 63 74 83 67 76 88 61 64 71 86 48 12
Test the Hypothesis at 0.05 LoS that there is no significant difference in the
performance of test against the alternative that the performance are significantly
different. 4 K3
Hypothesis, LOS = 5% 2
Ascending order arranging and ranking 2
R1 =154.5, n1= 11, n2=16 2
U = n1 n2 + n1(n1 + 1)/2 – R1 = 87.5, µ = n1 n2 / 2 = 88, σ = √(n1 n2 (n1 + n2 + 1) / 12) = 20.26
4
|Z| = |U - µ|/σ = 0.02, T.V = 1.96, Accept Ho 2
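The ranking and U-statistic steps above can be sketched in a few lines. The helper `average_ranks` is an illustrative name; it gives tied values the mean of the ranks they occupy, and no tie correction is applied to σ, matching the answer key.

```python
from math import sqrt

def average_ranks(values):
    """Map each distinct value to its average rank in the pooled, sorted sample."""
    s = sorted(values)
    ranks, i = {}, 0
    while i < len(s):
        j = i
        while j < len(s) and s[j] == s[i]:
            j += 1
        ranks[s[i]] = (i + 1 + j) / 2   # mean of 1-based positions i+1 .. j
        i = j
    return ranks

primary = [73, 75, 83, 77, 72, 69, 56, 80, 68, 60, 84]
public = [70, 78, 79, 81, 65, 63, 74, 83, 67, 76, 88, 61, 64, 71, 86, 48]

ranks = average_ranks(primary + public)
n1, n2 = len(primary), len(public)
r1 = sum(ranks[v] for v in primary)          # rank sum of the primary-school sample

u = n1 * n2 + n1 * (n1 + 1) / 2 - r1
mu_u = n1 * n2 / 2
sigma_u = sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
z = (u - mu_u) / sigma_u
print(r1, u, round(sigma_u, 2), round(z, 2))
```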
5 Using the method calculate seasonal variations from the data given below
Year I Quarter II Quarter III Quarter IV Quarter
2011 72 68 80 70
2012 76 70 82 74
2013 74 66 84 80
2014 76 74 84 78
5 K2
2014 78 74 86 82
UNIT I – INTRODUCTION
1 A manufacturing firm produces steel pipes in three plants with daily production
of 500, 1000 and 2000 units respectively. According to past experience it is known that the
fractions of defective output are 0.005, 0.008 and 0.010. If a pipe is selected from a day's 20
total production and found to be defective, what is the probability that it came
from plant I, plant II and plant III?
1 K2
Formula (Bayes' theorem) and assumptions
2
P(A1 ∩ B) = P(A1) P(B/A1) = 0.0007
4
P(A2 ∩ B) = P(A2) P(B/A2) = 0.0023
4
P(A3 ∩ B) = P(A3) P(B/A3) = 0.0057
4
P(B) = 0.0087; P(I plant) = 0.0820, P(II plant) = 0.2623, P(III plant) = 0.6557
6
2 Global Green company has average yearly sales per outlet of Rs. 25 lakhs with a
S.D of Rs. 5 lakhs. If the sales follow a normal distribution, find
(i) Probability of sales greater than Rs. 30 lakhs
(ii) Probability of sales between Rs. 20 lakhs and Rs. 30 lakhs.
20
(iii) Given that the company owns 1000 outlets, the number of outlets with sales
less than Rs. 15 lakhs. 1 K2
µ = 25, σ = 5, z = (x - µ)/σ 3
P(x ≥ 30 ) = p(z ≥ 1) = 0.1587 5
P (20 ≤ x ≤ 30 ) = P (-1 ≤ z ≤ 1 ) = 0.6826 5
P (X < 15) = P(z < -2) = 0.0228. No of outlets = 23 7
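The normal-table lookups above can be checked with the error function from the standard library. The helper name `norm_cdf` is illustrative; it evaluates the cumulative distribution directly instead of reading a z-table, so the last digit may differ slightly from table values.

```python
from math import erf, sqrt

def norm_cdf(x, mu, sigma):
    """P(X <= x) for a normal distribution with mean mu and standard deviation sigma."""
    return 0.5 * (1 + erf((x - mu) / (sigma * sqrt(2))))

mu, sigma = 25, 5          # sales in Rs. lakhs
outlets = 1000

p_gt_30 = 1 - norm_cdf(30, mu, sigma)                          # (i)
p_20_to_30 = norm_cdf(30, mu, sigma) - norm_cdf(20, mu, sigma)  # (ii)
p_lt_15 = norm_cdf(15, mu, sigma)                              # (iii)
n_lt_15 = outlets * p_lt_15

print(round(p_gt_30, 4), round(p_20_to_30, 4), round(p_lt_15, 4), round(n_lt_15))
```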
UNIT II – SAMPLING DISTRIBUTION AND ESTIMATION
1 a) A movie maker sampled 55 fans who viewed his masterpiece movie and asked
them whether they planned to see it again. Only 10 of them believed that the
movie was worthy of a second look. Find the standard error of the proportion of
fans who will view the film a second time. Construct a 90% confidence interval
for this proportion. 2 K3
20
b) The manager of a shop selling beverages wants to estimate the actual amount
of beverages in one litre bottles from a nationally known manufacturer. As per
manufacturer’s specification the standard deviation of the volume of the
beverage is 0.02 litre . The average amount of beverage per one litre bottle is
found to be 0.995 litre on checking 50 bottles. Set up 99 % confidence interval,
estimate of the true population average amount of beverage in a one liter bottle.
Check whether the manufacturer is genuine in filling the beverage.
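(a) n = 55, p = 10/55 = 0.18, q = 0.82 3
SE = √(pq/n) = 0.052 4
Z at 90% = 1.645, confidence interval for P = p ± 1.645 SE = (0.095, 0.266) 3
(b) n = 50, x̄ = 0.995, σ = 0.02, SE = σ/√n = 0.0028 4
Z at 99% = 2.58, confidence limits = x̄ ± 2.58 SE = (0.988, 1.002). The interval contains 1 litre, so the manufacturer's claim is not rejected. 6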
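Both interval estimates can be sketched with two small helpers. The function names are hypothetical, and the z values 1.645 and 2.58 are the table values quoted above. Because the code keeps p = 10/55 unrounded, part (a) comes out near (0.096, 0.267) rather than the rounded-p figures.

```python
from math import sqrt

def proportion_ci(successes, n, z):
    """Large-sample confidence interval for a proportion: p +/- z * sqrt(pq/n)."""
    p = successes / n
    se = sqrt(p * (1 - p) / n)
    return p - z * se, p + z * se

def mean_ci(xbar, sigma, n, z):
    """Confidence interval for a mean with known sigma: xbar +/- z * sigma/sqrt(n)."""
    se = sigma / sqrt(n)
    return xbar - z * se, xbar + z * se

ci_a = proportion_ci(10, 55, 1.645)       # (a) 90% interval for the proportion
ci_b = mean_ci(0.995, 0.02, 50, 2.58)     # (b) 99% interval for the mean fill volume

print([round(v, 3) for v in ci_a], [round(v, 3) for v in ci_b])
print("1 litre inside interval:", ci_b[0] < 1.0 < ci_b[1])
```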
2 a)Television advertisers mistakenly believe that most viewers understand most
of the advertisement that they see and hear . In this connection, a research study
covering 2300 viewers above the age of 20 years was taken. Each viewer looked at
30 second television advertisement or a part of it. It was found that 1914 viewers
misunderstood either the entire advertisement or a part of it. Determine 95%
confidence interval for the proportion of all viewers that will misunderstand all
or part of the television advertisement used in this study.
1 Four doctors each test 4 treatments for a certain disease and observe the number of
days each patient takes to recover. The recovery times in number of days are given as
follows.
          Treatment
Doctor    A    B    C    D
1         10   14   19   20
2         11   51   17   21
3         9    12   16   19
4         8    13   17   20
20 3 K2
Carry out an analysis of variance at the 5% level of significance to test whether the recovery times differ with
respect to (i) Treatment (ii) Doctor
Hypothesis, CF = T²/N = 277²/16 = 4795.56 4
TSS = 1477.44, SSR (between doctors) = 321.69 4
SSC (between treatments) = 380.69, SSE = 775.06 4
F ratio = 1.47 (between treatments), F = 1.24 (between doctors) 4
T.V = F(3,9) = 3.86 (both cases)
4
(i) Accept H0 (ii) Accept H0
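The sums of squares and F ratios above can be reproduced with a minimal two-way ANOVA sketch (one observation per cell, no interaction term). Variable names are illustrative; rows are doctors and columns are treatments, and the value 51 is used exactly as printed in the table.

```python
# Rows: doctors 1-4; columns: treatments A-D (recovery time in days)
data = [
    [10, 14, 19, 20],
    [11, 51, 17, 21],
    [9, 12, 16, 19],
    [8, 13, 17, 20],
]

r, c = len(data), len(data[0])
N = r * c
T = sum(sum(row) for row in data)

cf = T * T / N                                          # correction factor
tss = sum(x * x for row in data for x in row) - cf      # total sum of squares
ssr = sum(sum(row) ** 2 for row in data) / c - cf       # between doctors (rows)
ssc = sum(sum(col) ** 2 for col in zip(*data)) / r - cf # between treatments (columns)
sse = tss - ssr - ssc                                   # residual

mse = sse / ((r - 1) * (c - 1))
f_treatment = (ssc / (c - 1)) / mse
f_doctor = (ssr / (r - 1)) / mse

print(round(cf, 2), round(tss, 2), round(ssr, 2), round(ssc, 2), round(sse, 2))
print(round(f_treatment, 2), round(f_doctor, 2))
```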
2 To study the performance of 3 detergents and three different water
temperatures, the following whiteness readings were obtained using specially
designed equipment.
Perform two way ANOVA at 5% level.
                    Detergents
Water Temperature   A    B    C
Cold                57   55   67
Warm                49   52   68
Hot                 54   46   58
20 3 K2
1 A department store has three sales counters. The manager wants to compare the
sales of the three stores over a six day week. The relevant data is as follows:
Counter A sales: 78 62 71 58 73
Counter B sales: 76 85 77 90 87
Counter C sales: 74 79 60 75 80
Compare the equality of mean sales in all the three counters using Kruskal Wallis 20
method.
Hypothesis, LOS 5%, n= n1+n2+n3 =15 4 4 K3
Arrange in ascending order and ranking 4
R1 = 23, R2 = 59, R3 = 38 4
H = 12 / (n(n+1)) [R1²/n1 + R2²/n2 + R3²/n3] – 3(n+1) = 6.54 6
T.V = 5.991 Reject H0 2
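The rank sums and the H statistic for the three counters can be checked with a short sketch. The function name `kruskal_wallis_h` is illustrative; tied values would share their average rank, and no tie correction is applied to H, which matches the formula used above.

```python
def kruskal_wallis_h(groups):
    """Return (rank sums, H) with H = 12/(n(n+1)) * sum(Ri^2/ni) - 3(n+1)."""
    pooled = sorted(v for g in groups for v in g)
    rank, i = {}, 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j] == pooled[i]:
            j += 1
        rank[pooled[i]] = (i + 1 + j) / 2    # average rank for tied values
        i = j
    n = len(pooled)
    rank_sums = [sum(rank[v] for v in g) for g in groups]
    h = 12 / (n * (n + 1)) * sum(R * R / len(g) for R, g in zip(rank_sums, groups)) - 3 * (n + 1)
    return rank_sums, h

counter_a = [78, 62, 71, 58, 73]
counter_b = [76, 85, 77, 90, 87]
counter_c = [74, 79, 60, 75, 80]

rank_sums, h = kruskal_wallis_h([counter_a, counter_b, counter_c])
critical_5pct = 5.991                        # chi-square table value, 2 d.o.f.
print(rank_sums, round(h, 2), "reject H0" if h > critical_5pct else "accept H0")
```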
2 A dietician wants to test three types of diet plans 1,2,3. He selected a
homogenous group of 23 persons and placed them into 3 sub groups. Each
subgroup trying a different diet plan. Each plan was tried for a period of 30
days. The following observations of weight losses in pounds were recorded for
members of each group
after this period of 30 days. Determine at 0.05 LoS whether there is a difference in 20
weight reducing effects or not.
Diet plan 1: 4.0 3.8 3.7 6.2 5.6 4.2
4 K3
Diet plan 2: 3.6 5.2 2.8 3.0 3.8 5.0 3.9 5.5
Diet plan 3: 6.5 7.2 5.9 5.5 6.8 7.7 8.0 8.2 7.0
Hypothesis, LOS 5%, n= n1+n2+n3 =23 4
Arrange in ascending order and ranking 4
R1 = 56.5, R2 = 52, R3 = 167.5 4
H = 12 / (n(n+1)) [R1²/n1 + R2²/n2 + R3²/n3] – 3(n+1) = (12/552)(532.04 + 338 + 3117.36) – 72 = 14.68 6
T.V = 5.991 Reject H0 2
UNIT V – CORRELATION, REGRESSION AND TIME SERIES ANALYSIS
1 Mr. X owns a small company that manufactures portable message tables. Since
he started the company, the number of tables he has sold is represented by this
time series:
Year: 1987 1989 1990 1991 1992 1993 1996
Tables: 140 144 160 152 168 176 180
Find the linear equation using least square method that describes the trend in the number of tables sold
by him. Also calculate the trend value for the
year 1988 20 5 K2
With x = Year - 1991: Σy = na + bΣx, Σxy = aΣx + bΣx², 4
a= 159.29, b = 4.955 4
Trend value for 1988 = 144.425 2
Trend values : 139.45, 149.37, 154.34, 159.29, 164.25, 169.21, 184.09 6
Graph 4
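The normal equations above can be solved directly with a small sketch. It assumes the origin x = Year - 1991, which reproduces the stated values of a and b; the variable names are illustrative.

```python
years = [1987, 1989, 1990, 1991, 1992, 1993, 1996]
tables = [140, 144, 160, 152, 168, 176, 180]

origin = 1991                         # assumed coding: x = year - 1991
x = [yr - origin for yr in years]
n = len(x)

sx, sy = sum(x), sum(tables)
sxx = sum(v * v for v in x)
sxy = sum(v * y for v, y in zip(x, tables))

# Solve  sum(y) = n a + b sum(x)  and  sum(xy) = a sum(x) + b sum(x^2)
b = (n * sxy - sx * sy) / (n * sxx - sx * sx)
a = (sy - b * sx) / n

trend_1988 = a + b * (1988 - origin)
trend = [round(a + b * v, 2) for v in x]
print(round(a, 2), round(b, 3), round(trend_1988, 2), trend)
```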
2 The manager of a company wants to analyze about the sales of a particular brand
of television and he wants to forecast the sales of the television in future. Data
from 1985 to 1989 was given. Fit a parabola of second degree to the following and
20
analyse using the curve.
Year: 1985 1986 1987 1988 1989
Sales: 16 18 19 20 24 5 K3
With x = Year - 1987: a = (Σy – cΣx²)/n, b = Σxy/Σx², c = (Σx²y – aΣx²)/Σx⁴ 4
a = 18.83, b = 1.8, c = 0.286 6
Trend values: y = a + bx + cx² = 16.37, 17.31, 18.83, 20.91, 23.57 6
Graph 4