
S MM Unit I QB With Answers

The document discusses statistical methods and modelling. It provides examples and solutions related to sampling and estimation theory including probability calculations, the central limit theorem, and properties of estimators. Specific questions addressed include finding the probability of a sample mean falling in a given range, the hypergeometric and binomial distributions, constructing the distribution of sample means, and calculating the mean and variance of a random variable.

Uploaded by

prakaash A S

Panimalar Engineering College- Question Bank with answer 23MA1205-STATISTICAL METHODS AND MODELLING

Department: CSBS
Subject: 23MA1205-STATISTICAL METHODS AND MODELLING

UNIT I – SAMPLING AND ESTIMATION THEORY

PART-A (Questions and answers)

1) How many different samples of size n = 4 can be chosen from a finite population of size (a) N = 35 (b) N = 15?
Solution:
We know that the number of distinct samples of size n from a population of size N is $\binom{N}{n}$.
(a) $\binom{35}{4} = 52360$, i.e. 52,360 different samples can be chosen from a population of size 35.
(b) $\binom{15}{4} = 1365$, i.e. 1,365 different samples can be chosen from a population of size 15.
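As a quick check of the two counts above, here is a minimal Python sketch. The helper name `count_samples` is an illustrative assumption; it simply wraps the standard-library `math.comb`.

```python
from math import comb


def count_samples(population_size: int, sample_size: int) -> int:
    """Number of distinct samples drawn without replacement, ignoring order."""
    return comb(population_size, sample_size)


samples_from_35 = count_samples(35, 4)
samples_from_15 = count_samples(15, 4)
print(samples_from_35, samples_from_15)
```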
2) State the Central Limit Theorem.
Statement: If $\bar{x}$ is the mean of a random sample of size n taken from a population having the mean μ and the finite variance σ², then
$z = \dfrac{\bar{x} - \mu}{\sigma/\sqrt{n}}$
is a random variable whose distribution function approaches that of the standard normal distribution as n → ∞.
3) Find the value of the finite population correction factor for n = 10 and N = 1,000 in finding the variance.
Solution:
We know that the finite population correction factor in finding the variance is $\dfrac{N-n}{N-1}$. Here n = 10, N = 1,000.
∴ Correction factor $= \dfrac{1000-10}{1000-1} = 0.991$
4) Define maximum likelihood estimator.
A statistic $\hat{\theta}(X_1, \ldots, X_n)$ is a maximum likelihood estimator of θ if, for each sample $X_1, \ldots, X_n$, $\hat{\theta}(X_1, \ldots, X_n)$ is a value of the parameter that maximizes the likelihood function $L(\theta \mid X_1, \ldots, X_n)$.


5) Name the characteristics of a good estimator.
The characteristics of a good estimator are (i) Unbiasedness (ii) Efficiency (iii) Consistency and (iv) Sufficiency.

1) i) Car mufflers are constructed by nearly automatic machines. One manufacturer finds that, for any type of car muffler, the time for a person to set up and complete a production run has a normal distribution with mean 1.82 hours and standard deviation 1.20 hours. What is the probability that the sample mean of the next 40 runs will be from 1.65 to 2.04 hours?
Solution:
Given μ = 1.82 hrs, σ = 1.20 hrs, n = 40.
Let the random variable X denote the production time of a car muffler in hours, and let
$Z = \dfrac{\bar{x} - \mu}{\sigma/\sqrt{n}}$
be the standard normal variate.
Required probability
$= P[1.65 \le \bar{x} \le 2.04]$
$= P\left[\dfrac{1.65 - 1.82}{1.2/\sqrt{40}} \le Z \le \dfrac{2.04 - 1.82}{1.2/\sqrt{40}}\right]$ (by the central limit theorem)
$= P[-0.90 \le Z \le 1.16]$
$= P[-0.90 \le Z \le 0] + P[0 \le Z \le 1.16]$
$= 0.3159 + 0.3770$
$= 0.6929$.
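The table-based calculation can be cross-checked numerically. The sketch below is only an illustration using Python's standard-library `statistics.NormalDist`; the variable names are assumptions. Because it does not round the z-values to two decimals, its result differs slightly from the table answer in the third decimal place.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 1.82, 1.20, 40
standard_error = sigma / sqrt(n)  # standard deviation of the sample mean

# z-values corresponding to the limits 1.65 and 2.04 on the sample mean
z_low = (1.65 - mu) / standard_error
z_high = (2.04 - mu) / standard_error

# Sampling distribution of the mean: normal with mean mu and s.d. sigma/sqrt(n)
sampling_dist = NormalDist(mu, standard_error)
prob = sampling_dist.cdf(2.04) - sampling_dist.cdf(1.65)

print(round(z_low, 2), round(z_high, 2), round(prob, 4))
```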
1)ii) An Internet-based company that sells discount accessories for cell phones
often ships an excessive number of defective products. The company needs
better control of quality. Suppose it has 100 identical car chargers on hand
but that 25 are defective. If the company decides to randomly select 10 of
these items, what is the probability that 2 of the 10 will be defective? Find it by using
(a) the formula for the hypergeometric distribution (without replacement)
(b) the formula for the binomial distribution as an approximation (with replacement)
Solution:
Given x = 2, n = 10, a = 25, N = 100.
(a) We know that the hypergeometric distribution is
$h(x, n, a, N) = \dfrac{\binom{a}{x}\binom{N-a}{n-x}}{\binom{N}{n}}$
$\therefore h(2, 10, 25, 100) = \dfrac{\binom{25}{2}\binom{75}{8}}{\binom{100}{10}} = 0.2924$

(b) We know that the binomial distribution is
$b(x, n, p) = \binom{n}{x} p^{x} q^{n-x}, \quad x = 0, 1, 2, \ldots, n$
Here p = probability of success = 25/100 = 0.25 and q = 1 − p = 0.75.
$\therefore b(2, 10, 0.25) = \binom{10}{2} (0.25)^{2} (0.75)^{10-2} = 0.2816$
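Both probabilities can be reproduced with a few lines of Python. This is a minimal sketch; the function names `hypergeometric_pmf` and `binomial_pmf` are illustrative assumptions built on `math.comb`.

```python
from math import comb


def hypergeometric_pmf(x: int, n: int, a: int, N: int) -> float:
    """P(x defectives in a sample of n) when a of the N items are defective."""
    return comb(a, x) * comb(N - a, n - x) / comb(N, n)


def binomial_pmf(x: int, n: int, p: float) -> float:
    """P(x successes in n independent trials with success probability p)."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)


h = hypergeometric_pmf(2, 10, 25, 100)  # sampling without replacement
b = binomial_pmf(2, 10, 25 / 100)       # binomial approximation
print(round(h, 4), round(b, 4))
```

The two values are close because the sample (10 items) is small relative to the lot (100 items).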

2) Given an infinite population whose distribution is given by

x      1    2    3    4    5
f(x)   0.2  0.2  0.2  0.2  0.2
List all the possible samples of size 2 and use this list to construct the distribution of $\bar{X}$ for random samples of size 2 from the given population.
Solution:
Each of the two observations can take any of the 5 values (1, 2, 3, 4, 5), so there are 5^2 = 25 possible samples of size 2, each with probability 0.2 × 0.2 = 1/25.
We know that the sample mean of a sample $(x_i, x_j)$ of size 2 is $\bar{x}_k = \dfrac{x_i + x_j}{2}$.

Sample  Mean    Sample  Mean    Sample  Mean    Sample  Mean    Sample  Mean
(1, 1)  1       (2, 1)  1.5     (3, 1)  2       (4, 1)  2.5     (5, 1)  3
(1, 2)  1.5     (2, 2)  2       (3, 2)  2.5     (4, 2)  3       (5, 2)  3.5
(1, 3)  2       (2, 3)  2.5     (3, 3)  3       (4, 3)  3.5     (5, 3)  4
(1, 4)  2.5     (2, 4)  3       (3, 4)  3.5     (4, 4)  4       (5, 4)  4.5
(1, 5)  3       (2, 5)  3.5     (3, 5)  4       (4, 5)  4.5     (5, 5)  5

The distribution of $\bar{x}_k$ is given by the pairs $(\bar{x}_k, p_k)$:

x̄_k   1     1.5   2     2.5   3     3.5   4     4.5   5
p_k    1/25  2/25  3/25  4/25  5/25  4/25  3/25  2/25  1/25
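The enumeration above can also be generated programmatically. The following sketch is an illustration only: it lists all ordered pairs with `itertools.product` and relies on the fact that every value has the same probability 0.2, so each pair has probability 1/25. Note that `Fraction` prints 5/25 in lowest terms as 1/5.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

values = [1, 2, 3, 4, 5]

# All ordered samples of size 2 (sampling from an infinite population)
samples = list(product(values, repeat=2))

# Count how many samples give each sample mean
mean_counts = Counter(Fraction(xi + xj, 2) for xi, xj in samples)

# Each sample has probability 1/25 because all five values are equally likely
distribution = {
    mean: Fraction(count, len(samples))
    for mean, count in sorted(mean_counts.items())
}

for mean, prob in distribution.items():
    print(float(mean), prob)
```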

3) i) Take 30 slips of paper and label five each −4 and 4, four each −3 and 3,
three each −2 and 2, and two each −1, 0 and 1.
(a) If each slip of paper has the same probability of being drawn, find the
probability of getting −4, −3, −2, −1, 0, 1, 2, 3, 4 and find the mean and the
variance of this distribution
Solution:
Given: out of the 30 slips of paper, 5 slips are labelled −4 and another 5 are labelled 4; −3 and 3 are each on 4 slips; −2 and 2 are each on 3 slips; and −1, 0 and 1 are each on 2 slips. Since each slip is equally likely to be drawn, the probabilities are:
x -4 -3 -2 -1 0 1 2 3 4
p(x ) 5/30 4/30 3/30 2/30 2/30 2/30 3/30 4/30 5/30


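We know that the mean is $\bar{x} = \dfrac{\sum_{i=1}^{n} x_i}{n} = \dfrac{0}{30} = 0$

We know that the variance is $\sigma^2 = \dfrac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n} = \dfrac{260}{30} = \dfrac{26}{3}$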
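The mean and variance of the slip distribution can be verified exactly with rational arithmetic. This is a minimal sketch; the dictionary `slip_counts` just restates the slip counts given in the question.

```python
from fractions import Fraction

# Number of slips carrying each label, as stated in the question
slip_counts = {-4: 5, -3: 4, -2: 3, -1: 2, 0: 2, 1: 2, 2: 3, 3: 4, 4: 5}
total_slips = sum(slip_counts.values())

# Each slip is equally likely, so p(x) = (slips labelled x) / 30
pmf = {x: Fraction(count, total_slips) for x, count in slip_counts.items()}

mean = sum(x * p for x, p in pmf.items())
variance = sum((x - mean) ** 2 * p for x, p in pmf.items())

print(total_slips, mean, variance)
```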
3) ii) The dean of a college wants to use the mean of a random sample to estimate the average amount of time students take to get from one class to the next, and she wants to be able to assert with 99% confidence that the error is at most 0.25 minute. If it can be presumed from experience that σ = 1.40 minutes, how large a sample will she have to take?
Solution: Given E = 0.25, σ = 1.40, 1 − α = 0.99 → α = 0.01 → α/2 = 0.005
∴ $z_{\alpha/2} = z_{0.005} = 2.576$ (from the last line of the t-table)
We know that $n = \left(\dfrac{z_{\alpha/2}\,\sigma}{E}\right)^2 = \left(\dfrac{2.576 \times 1.40}{0.25}\right)^2 = 208.098$
Since n must be a whole number and a sample of 208 would give an error slightly above 0.25, round up to n = 209.
∴ The dean will have to take a random sample of at least 209 observations.
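The sample-size formula $n = (z_{\alpha/2}\,\sigma / E)^2$ is easy to script. The sketch below is illustrative; the function name `required_sample_size` is an assumption. It rounds up with `math.ceil`, because rounding 208.098 down would leave the error marginally above the 0.25-minute target.

```python
from math import ceil


def required_sample_size(z: float, sigma: float, max_error: float) -> int:
    """Smallest whole n for which z * sigma / sqrt(n) <= max_error."""
    return ceil((z * sigma / max_error) ** 2)


n_required = required_sample_size(z=2.576, sigma=1.40, max_error=0.25)
print(n_required)
```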

4) i) An industrial engineer intends to use the mean of a random sample of size n = 150 to estimate the average mechanical aptitude (as measured by a certain test) of assembly line workers in a large industry. If, on the basis of experience, the engineer can assume that σ = 6.2 for such data, what can he assert with probability 0.99 about the maximum size of his error?
Solution: Given n = 150, σ = 6.2, 1 − α = 0.99 → α = 0.01 → α/2 = 0.005
∴ $z_{\alpha/2} = z_{0.005} = 2.576$ (from the last line of the t-table)
We know that $E = z_{\alpha/2} \times \dfrac{\sigma}{\sqrt{n}} = 2.576 \times \dfrac{6.2}{\sqrt{150}} = 1.3040$
∴ The engineer can assert with probability 0.99 that his error will be at most 1.304.
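The maximum-error formula $E = z_{\alpha/2}\,\sigma/\sqrt{n}$ can be checked with a short sketch; the helper name `max_error` is an illustrative assumption.

```python
from math import sqrt


def max_error(z: float, sigma: float, n: int) -> float:
    """Maximum error of the sample mean at the confidence level implied by z."""
    return z * sigma / sqrt(n)


error_bound = max_error(z=2.576, sigma=6.2, n=150)
print(round(error_bound, 4))
```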
4) ii) Explain the characteristics of a good estimator.
Solution:
The important properties or characteristics of a good estimator are (i) Unbiasedness (ii) Efficiency (iii) Consistency and (iv) Sufficiency.
i) Unbiasedness
An estimator $\bar{x}$ is said to be an unbiased estimator of μ if $E(\bar{x}) = \mu$.
ii) Efficiency
An estimator is said to be efficient if it has a relatively smaller variance than other estimators of the same parameter.
iii) Consistency
An estimator is said to be consistent if its probability of being close to the parameter increases as the sample size increases.
E.g. as the sample size n increases, the S.D. of $\bar{x}$ decreases, and hence the probability that $\bar{x}$ is very close to its expected value μ increases.
iv) Sufficiency
An estimator is said to be sufficient if it contains all the information in the data about the parameter it estimates.

5)i) One process of making green gasoline takes biomass in the form of sucrose
and converts it into gasoline using catalytic reactions. At one step in a pilot
plant process, the output includes carbon chains of length 3. Fifteen runs
with same catalyst produced the yields (gal) 5.57 5.76 4.18 4.64 7.02
6.62 6.33 7.24 5.57 7.89 4.67 7.24 6.43 5.59 5.39 Treating the
yields as a random sample from a normal population,
(a) Obtain the maximum likelihood estimates of the mean yield and the
variance.
(b) Obtain the maximum likelihood estimate of the coefficient of variation
σ/μ.
Solution:
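(a) We know that the maximum likelihood estimators of μ and σ² are
$\hat{\mu} = \bar{x} = \dfrac{\sum_{i=1}^{n} x_i}{n}$ and $\hat{\sigma}^2 = \dfrac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n}$.
The worked values below use the following nine observations, which differ from the fifteen yields listed in the question:
Given $x_1 = 123, x_2 = 106, x_3 = 114, x_4 = 128, x_5 = 113, x_6 = 109, x_7 = 120, x_8 = 102, x_9 = 111$ → n = 9
∴ $\hat{\mu} = \dfrac{1026}{9} = 114$ and $\hat{\sigma}^2 = \dfrac{\sum_{i=1}^{9} (x_i - 114)^2}{9} = 61.7778$ → $\hat{\sigma} = 7.860$

(b) We know that the MLE of the coefficient of variation σ/μ is $\dfrac{\hat{\sigma}}{\hat{\mu}} = \dfrac{7.86}{114} = 0.0689$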
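The estimators in (a) and (b) can be applied to any sample with a few lines of Python. The sketch below is illustrative (the function name `normal_mle` is an assumption): it first reproduces the nine-observation figures used in the worked solution, then applies the same formulas to the fifteen yields listed in the question, whose results are printed rather than quoted here.

```python
from math import sqrt


def normal_mle(data):
    """MLEs of the mean and variance for a normal sample (variance divides by n)."""
    n = len(data)
    mu_hat = sum(data) / n
    var_hat = sum((x - mu_hat) ** 2 for x in data) / n
    return mu_hat, var_hat


# Nine observations used in the worked solution
worked = [123, 106, 114, 128, 113, 109, 120, 102, 111]
mu_hat, var_hat = normal_mle(worked)
cv_hat = sqrt(var_hat) / mu_hat  # MLE of sigma/mu by the invariance property

# Fifteen yields (gal) listed in the question
yields = [5.57, 5.76, 4.18, 4.64, 7.02, 6.62, 6.33, 7.24,
          5.57, 7.89, 4.67, 7.24, 6.43, 5.59, 5.39]
mu_yield, var_yield = normal_mle(yields)
cv_yield = sqrt(var_yield) / mu_yield

print(mu_hat, round(var_hat, 4), round(cv_hat, 4))
print(round(mu_yield, 4), round(var_yield, 4), round(cv_yield, 4))
```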
5)ii) The number of defective hard drives produced daily by a production line
can be modeled as a Poisson distribution. The counts for ten days are
7 3 1 2 4 1 2 3 1 2
Obtain the maximum likelihood estimate of the probability of 0 or 1
defectives
Solution:
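We know that the Poisson distribution is $p(x) = \dfrac{e^{-\lambda} \lambda^{x}}{x!}, \quad x = 0, 1, \ldots$
By theorem, the maximum likelihood estimator of λ is $\hat{\lambda} = \bar{x} = \dfrac{\sum_{i=1}^{n} x_i}{n}$
Given $x_1 = 7, x_2 = 3, x_3 = 1, x_4 = 2, x_5 = 4, x_6 = 1, x_7 = 2, x_8 = 3, x_9 = 1, x_{10} = 2$ → n = 10
$\hat{\lambda} = \dfrac{26}{10} = 2.6$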


Let the r.v. X denote the no. of defective hard drives.
∴ Maximum likelihood estimate of p(x = 0 or 1)
$= p(x = 0) + p(x = 1)$
$= e^{-\hat{\lambda}} + e^{-\hat{\lambda}}\,\hat{\lambda}$
$= e^{-2.6} + e^{-2.6} \times 2.6$
$= 0.267$
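The Poisson calculation can be reproduced numerically. This is a minimal sketch with an assumed helper name `poisson_pmf`; it plugs the estimate of λ into the Poisson probabilities, which is justified by the invariance property of maximum likelihood estimators.

```python
from math import exp, factorial

daily_counts = [7, 3, 1, 2, 4, 1, 2, 3, 1, 2]

# MLE of lambda for a Poisson sample is the sample mean
lam_hat = sum(daily_counts) / len(daily_counts)


def poisson_pmf(x: int, lam: float) -> float:
    """Poisson probability of exactly x events when the mean is lam."""
    return exp(-lam) * lam**x / factorial(x)


# MLE of P(X = 0 or 1), obtained by plugging in lam_hat
p_0_or_1 = poisson_pmf(0, lam_hat) + poisson_pmf(1, lam_hat)
print(lam_hat, round(p_0_or_1, 3))
```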

UNIT II – LINEAR STATISTICAL MODELS

PART-A (Questions and answers)

1) Define statistical modeling.
Solution:
Statistical modeling is the process of applying statistical analysis to a
dataset. A statistical model is a mathematical representation (or
mathematical model) of observed data.
2) Write the normal equations for multiple regression with r = 2.
Solution:
The normal equations for multiple regression with r = 2 are
$\sum y = n b_0 + b_1 \sum x_1 + b_2 \sum x_2$
$\sum x_1 y = b_0 \sum x_1 + b_1 \sum x_1^2 + b_2 \sum x_1 x_2$
$\sum x_2 y = b_0 \sum x_2 + b_1 \sum x_2 x_1 + b_2 \sum x_2^2$
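To show how the three normal equations are used, here is a hedged, pure-Python sketch that builds them from data and solves for $b_0$, $b_1$, $b_2$ by Gaussian elimination. The function names and the small data set are hypothetical: the data are constructed to lie exactly on the plane y = 1 + 2x₁ + 3x₂ so that the recovered coefficients are easy to check.

```python
def solve_linear_system(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        tail = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x


def fit_plane(x1, x2, y):
    """Form the normal equations for y = b0 + b1*x1 + b2*x2 and solve them."""
    n = len(y)
    s1, s2, sy = sum(x1), sum(x2), sum(y)
    s11 = sum(a * a for a in x1)
    s22 = sum(b * b for b in x2)
    s12 = sum(a * b for a, b in zip(x1, x2))
    s1y = sum(a * c for a, c in zip(x1, y))
    s2y = sum(b * c for b, c in zip(x2, y))
    A = [[n, s1, s2], [s1, s11, s12], [s2, s12, s22]]
    return solve_linear_system(A, [sy, s1y, s2y])


# Hypothetical data lying exactly on y = 1 + 2*x1 + 3*x2
x1 = [1, 2, 3, 4, 1, 2]
x2 = [5, 5, 10, 10, 15, 20]
y = [1 + 2 * a + 3 * b for a, b in zip(x1, x2)]

b0, b1, b2 = fit_plane(x1, x2, y)
print(round(b0, 6), round(b1, 6), round(b2, 6))
```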
3) Define Correlation and Multiple Correlation.
Solution:
Correlation measures the strength and direction of the relationship between two variables, whereas multiple correlation is a statistical technique used to measure the strength of the relationship between a dependent variable and two or more independent variables taken together.
4) Define Regression and Multiple Regression.
Simple linear regression fits a straight line that relates one dependent variable to one independent variable, whereas multiple linear regression fits a linear function that relates one dependent variable, like ice cream sales, to two or more independent variables, such as temperature and advertising spend.
5) Compare RBD with CRD.
RBD (randomized block design) is more efficient and accurate when compared to CRD (completely randomized design). The chance of error in RBD is comparatively less. Flexibility is also very high in RBD, and thus any number of treatments and any number of replications can be used.
6) What are the advantages of completely randomized block design?
Advantages of completely randomized designs:
1. Complete flexibility is allowed: any number of treatments and replicates may be used.
2. Relatively easy statistical analysis, even with variable replicates and variable experimental errors for different treatments.

Unit II – Linear Statistical Models

PART-A (2 Marks)
S.No.  Question  [Std. of Question Cognitive Level, Course Outcome]
1. Given $\sum x = 142.3$, $\sum y = 166.8$, $\sum x^2 = 2{,}085.31$, $\sum xy = 2{,}434.69$ and $\sum y^2 = 2{,}897.80$. Calculate the sample correlation coefficient. [Cognitive level: 5, Course outcome: CO2]
2. Write the normal equations for multiple regression with r = 2. [Cognitive level: 1, Course outcome: CO2]
3. Define Correlation and Multiple Correlation. [Cognitive level: 2, Course outcome: CO2]
4. Define Regression and Multiple Regression. [Cognitive level: 2]
5. Compare RBD with CRD. [Cognitive level: 2]
6. Write down the ANOVA table for Randomized Block Design. [Cognitive level: 1, Course outcome: CO2]
7. What are the advantages of completely randomized block design? [Cognitive level: 1, Course outcome: CO2]

Part – B (16 marks)

S.No.  Question  [Std. of Question Cognitive Level, Course Outcome, Mark]
1. The following are measurements of the air velocity and evaporation coefficients of burning fuel droplets in an impulse engine: [Cognitive level: 4, Course outcome: CO2, Marks: 16]

Air velocity (cm/s) x                20    60    100   140   180   220   260   300   340   380
Evaporation coefficient (mm²/s) y    0.18  0.37  0.35  0.78  0.56  0.75  1.18  1.36  1.17  1.65

Fit a straight line to these data by the method of least squares, and use it to estimate the evaporation coefficient of a droplet when the air velocity is 190 cm/s.
2. The following are the numbers of minutes it took 10 mechanics to assemble a piece of machinery in the morning, x, and in the late afternoon, y: [Cognitive level: 5, Course outcome: CO2, Marks: 16]
x   11.1  10.3  12.0  15.1  13.7  18.5  17.3  14.2  14.8  15.3
y   10.9  14.2  13.8  21.5  13.2  21.1  16.4  19.3  17.4  19.0
Calculate r.
3. Heavy metals can inhibit the biological treatment of waste in municipal treatment plants. Monthly measurements were made at a state-of-the-art treatment plant of the chromium (mg/l) in both the influent and effluent. [Cognitive level: 4, Course outcome: CO2, Marks: 16]
influent   250  290  270  100  300  410  110  130  1100
effluent   19   10   17   11   70   60   18   30   180
(a) Make a scatter plot.
(b) Make a scatter plot after taking the natural logarithm of both variables.
(c) Calculate the correlation coefficient, r, in part (a) and part (b).
(d) Comment on the appropriateness of r in each case.

4. The following are data on the number of twists required to break a certain kind of forged alloy bar and the percentages of two alloying elements present in the metal: [Cognitive level: 5, Course outcome: CO2, Marks: 16]

Number of twists   Percentage of element A   Percentage of element B
y                  x1                        x2
41                 1                         5
49                 2                         5
69                 3                         5
65                 4                         5
40                 1                         10
50                 2                         10
58                 3                         10
57                 4                         10
31                 1                         15
36                 2                         15
44                 3                         15
57                 4                         15
19                 1                         20
31                 2                         20
33                 3                         20
43                 4                         20
Fit a least squares regression plane and use its equation to estimate the number of twists required to break one of the bars when $x_1 = 2.5$ and $x_2 = 12$.
5. The following are the average weekly losses of worker-hours due to accidents in 10 industrial plants before and after a certain safety program was put into operation: [Cognitive level: 4, Course outcome: CO2, Marks: 16]
Before   45  73  46  124  33  57  83  34  26  17
After    36  60  44  119  35  51  77  29  24  11
For these data concerning losses of worker-hours before and after safety programs in 10 industrial plants, calculate $r_S$.

6. A panel of 8 judges was asked to rate each of 3 models developed by engineering students on the likelihood that these models can be practically implemented to harness the controlled fusion energy. Their ratings (in the form of judgmental probabilities) are as follows: [Cognitive level: 5, Course outcome: CO2, Marks: 16]

Model \ Judge   A     B     C     D     E     F     G     H
X               0.21  0.42  0.53  0.26  0.85  0.28  0.12  0.90
Y               0.24  0.48  0.46  0.51  0.76  0.35  0.30  0.65
Z               0.38  0.06  0.24  0.42  0.92  0.19  0.22  0.78
Calculate the rank correlation coefficient, $r_S$,
(a) using Models X and Y;
(b) using Models Y and Z.
1. The following are the number of mistakes made in 5 successive days for 4 technicians working for a photographic laboratory: [Cognitive level: 5, Course outcome: CO2, Marks: 16]
Technician I   Technician II   Technician III   Technician IV
6              14              10               9
14             9               12               12
10             12              7                8
8              10              15               10
11             14              11               11
Test at the level of significance α = 0.01 whether the differences among the 4 sample means can be attributed to chance.

2. An experiment was designed to study the performance of 4 different detergents for cleaning fuel injectors. The following "cleanness" readings were obtained with specially designed equipment for 12 tanks of gas distributed over 3 different models of engines: [Cognitive level: 5, Course outcome: CO2, Marks: 16]
              Engine 1   Engine 2   Engine 3   Total
Detergent A   45         43         51         139
Detergent B   47         46         52         145
Detergent C   48         50         55         153
Detergent D   42         37         49         128
Total         182        176        207        565
Looking at the detergents as treatments and the engines as blocks, obtain the appropriate analysis of variance table and test at the 0.01 level of significance whether there are differences in the detergents or in the engines.
