PS With R Lab Record Exp

This document describes experiments involving probability distributions using the R programming language. There are 12 listed experiments involving the binomial, Poisson, and normal distributions. The experiments include: 1) Calculating probabilities for the binomial distribution using R commands like dbinom() and finding the probability of getting a sum of 7 when throwing dice. 2) Fitting binomial and Poisson distributions to sample data using R commands like dpois() and calculating expected frequencies. 3) Finding probabilities for the normal distribution like the number of students weighing between 138-148 pounds using pnorm(), and calculating expected counts.

Uploaded by

Arun Kumar

AR - 18 - B.Tech.

II Year II Semester

Aditya Institute of Technology and Management, Tekkali-532201 12

Probability and Statistics with R Programming

(For CSE & IT)

Course Code: 18BSL203        External Marks: 60
Credits: 1.5                 Internal Marks: 40
List of Experiments

1. Write the commands on R console to calculate the probability for Binomial distribution
functions.
2. Write the commands on R console to calculate the probability for Poisson’s
distribution functions.
3. Write the commands on R console to calculate the probability for Normal distribution
functions.
4. Write the commands on R console to calculate probability of sample means using
central limit theorem.
5. Write the commands on R console to calculate confidence intervals for proportions and
means.
6. Write the commands on R console to perform z-test for testing the Null Hypothesis for
single proportion and difference of proportions at α level of significance.
7. Write the commands on R console to perform z-test for testing the Null Hypothesis for
single mean and difference of means at α level of significance.
8. Write the commands on R console for the following:

Perform t-test for testing the Null Hypothesis for single mean and difference of
means at α level of significance.

9. Write the commands on R console for the following: Perform F-test for testing the
Null Hypothesis for variance at α level of significance.
10. Write the commands on R console for the following:
a) Perform χ²-test for testing the goodness of fit.
b) Perform χ²-test for independence of attributes.
11. Write the commands on R console for the following:
a) Perform ANOVA (one way classification) to test on the basis of sample
observations whether the means of 3 or more populations are equal or not.
b) Perform ANOVA (two way classification) to test on the basis of sample
observations whether the means of 3 or more populations are equal or not based
on two different factors.
12. Write the commands on R console for the following:
a) Analyze the correlation between two variables using Karl Pearson’s coefficient of
correlation
b) Calculate the regression equations for two and three variables.

Experiment I
Binomial Distribution
Problem 1: Two dice are thrown five times. Find the probability of getting 7 as the sum (i) at least
once, i.e., P(X>=1), (ii) exactly two times, i.e., P(X=2), (iii) more than once but fewer than five times, i.e., P(1<X<5).
Aim:- To find the probabilities of getting 7 as the sum (i) at least once, P(X>=1), (ii) exactly two times,
P(X=2), (iii) more than once but fewer than five times, P(1<X<5).
Formula:-
The probability mass function of the Binomial distribution is $P(X = x) = \binom{n}{x} p^x q^{n-x}$,
x = 0,1,2,3,...,n, p,q>0, q = 1 - p, where n = no. of trials/experiments, x = no. of successes, p = probability of
success, q = probability of failure. Here a success is a sum of 7, which occurs in 6 of the 36 equally likely outcomes of two dice, so p = 1/6.
Calculation and R commands:
# To calculate the probabilities of binomial distribution for x~b(n,p)
# i) To calculate p(x>=1) for x~b(5,1/6)
1-pbinom(0,size = 5,prob=1/6)
# (Or) as p(x>1) + p(x=1)
pb1<-pbinom(q=1,size=5,prob=1/6,lower.tail=F)
pb2<-dbinom(x=1,size=5,prob=1/6)
pb<-pb1+pb2
pb
# ii) To calculate p(x=2) for x~b(5,1/6)
dbinom(2, size = 5, prob=1/6)
# iii) To calculate p(1<x<5) for x~b(5,1/6), i.e. p(x=2)+p(x=3)+p(x=4)
x<-dbinom(x=2, size = 5, prob = 1/6)
x
y<-dbinom(x=3, size = 5, prob = 1/6)
y
z<-dbinom(x=4, size = 5, prob = 1/6)
z
x+y+z
# (or)
sum(dbinom(2:4,size=5,prob=1/6))
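As a cross-check on the commands above, the probability mass function can also be evaluated directly from the formula with choose() and compared with the built-in functions. This is only a sketch; the helper name binom_pmf is hypothetical and is not part of the lab record.

```r
# Hypothetical helper: binomial pmf written out from the formula C(n, x) p^x q^(n-x)
binom_pmf <- function(x, n, p) {
  choose(n, x) * p^x * (1 - p)^(n - x)
}

n <- 5       # number of throws of the two dice
p <- 1/6     # P(sum = 7) = 6/36

p_at_least_once <- 1 - binom_pmf(0, n, p)      # P(X >= 1)
p_exactly_two   <- binom_pmf(2, n, p)          # P(X = 2)
p_between       <- sum(binom_pmf(2:4, n, p))   # P(1 < X < 5)

# Each comparison should report TRUE if the formula agrees with dbinom()/pbinom()
all.equal(p_at_least_once, 1 - pbinom(0, size = n, prob = p))
all.equal(p_exactly_two, dbinom(2, size = n, prob = p))
all.equal(p_between, sum(dbinom(2:4, size = n, prob = p)))
```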

Problem 2: Seven coins are tossed and the number of heads is noted. The experiment is repeated
128 times with the following data:
No. of heads: 0 1 2 3 4 5 6 7
Frequencies: 7 6 19 35 30 23 7 1
Fit the binomial distribution assuming
i) The coin is unbiased
ii) Nature of coin is unknown
Aim:- To fit the binomial distribution for the above data when (i) the coin is unbiased (ii)
nature of coin is unknown
Formula:-
The probability mass function of the Binomial distribution is $P(X = x) = \binom{n}{x} p^x q^{n-x}$,
x = 0,1,2,3,...,n, p,q>0
The expected frequencies are obtained by using the formula $E(x) = N \cdot P(X = x)$,
where $\text{mean} = \frac{\sum f x}{\sum f}$ and $N = \sum f$
Calculation and R commands:-
# To calculate the fitting of binomial distribution of the given data by assuming that
#case (i): The coin is unbiased(i.e.,p=q=1/2)
>x<-c(0,1,2,3,4,5,6,7)
>x
>f<-c(7,6,19,35,30,23,7,1)
>f
>p<-1/2
>p
>N<-sum(f)
>N
>pb<-dbinom(x,7,p)
>pb
#to calculate the expected frequencies of the given data
>ef<-N*pb
>ef
>rf<-round(ef)
>rf
# To calculate fitting of binomial distribution of the data by assume that
#case (ii): When the nature of coin is unknown
>x<-c(0,1,2,3,4,5,6,7)
>x
>f<-c(7,6,19,35,30,23,7,1)
>f
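>n<-7   # number of coins tossed in each trial (length(x) would give 8, the number of classes)
>n
>N<-sum(f)
>mu<-sum(x*f)/sum(f)
>mu
>p<-mu/n   # for a binomial distribution mean = n*p, so p = mean/n
>p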
# to calculate binomial probabilities for x~b(n,p)
>pb<-dbinom(x,7,p)
>pb
# to calculate expected frequencies out of 128 trials
>ef<-N*pb
>ef
>rf<-round(ef)
>rf
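The fitted frequencies are easier to judge when placed next to the observed ones. The sketch below tabulates both cases in one data frame; the column names are illustrative choices, and p for case (ii) is estimated as mean/n with n = 7 coins.

```r
x <- 0:7                                   # number of heads
f <- c(7, 6, 19, 35, 30, 23, 7, 1)         # observed frequencies
N <- sum(f)                                # total number of repetitions
n <- 7                                     # coins tossed per repetition

p_hat <- sum(x * f) / (N * n)              # case (ii): p estimated as mean / n

fit <- data.frame(
  heads      = x,
  observed   = f,
  exp_fair   = round(N * dbinom(x, n, 1/2), 2),    # case (i): unbiased coin
  exp_fitted = round(N * dbinom(x, n, p_hat), 2)   # case (ii): p estimated from the data
)
fit
```

Because the binomial probabilities over x = 0,...,7 sum to 1, the unrounded expected frequencies in each column add up to N.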
Experiment II

Poisson distribution
Problem 1: The probability that an individual suffers a bad reaction from a certain injection
is 0.001. Determine the probability that out of 2000 individuals (i) exactly 3 individuals (ii)
more than 2 individuals (iii) none (iv) more than one individual suffer a bad reaction.
Aim:- To determine the probability that out of 2000 individuals (i) exactly 3 individuals (ii)
more than 2 individuals (iii) none (iv) more than one individual suffer a bad reaction.
Formula:-
The probability mass function of the Poisson distribution is $P(X = x) = \frac{e^{-\lambda} \lambda^x}{x!}$, x = 0,1,2,3,...,
$\lambda > 0$, where $\lambda$ is the mean of the Poisson distribution and is obtained by $\lambda = np$
Calculation & R-commands:-
>n<- 2000
>n
>p<- 0.001
>p
>lambda<- n*p
>lambda
# to calculate the probabilities that out of 2000 individuals (i) p(x=3), (ii) p(x>2),
# (iii) p(x=0), (iv) p(x>1) for X~P(lambda)
(i) >p1<-dpois(x=3, lambda=2)
>p1
(ii) >p2 <-ppois(q=2, lambda = 2, lower.tail = F)
>p2
(iii) >p3<- dpois(x=0, lambda=2)
>p3
(iv) >p4<-ppois(q=1, lambda = 2, lower.tail = F)
>p4
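Using the Poisson distribution here amounts to approximating Binomial(n = 2000, p = 0.001) by Poisson(lambda = np = 2). The following comparison is an illustrative sketch, not part of the original record, showing how close the two sets of answers are for this problem.

```r
n <- 2000
p <- 0.001
lambda <- n * p

# Poisson answers (as computed above) next to the exact binomial ones
comparison <- data.frame(
  event    = c("P(X = 3)", "P(X > 2)", "P(X = 0)", "P(X > 1)"),
  poisson  = c(dpois(3, lambda),
               ppois(2, lambda, lower.tail = FALSE),
               dpois(0, lambda),
               ppois(1, lambda, lower.tail = FALSE)),
  binomial = c(dbinom(3, n, p),
               pbinom(2, n, p, lower.tail = FALSE),
               dbinom(0, n, p),
               pbinom(1, n, p, lower.tail = FALSE))
)
comparison$abs_diff <- abs(comparison$poisson - comparison$binomial)
comparison
```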
Problem 2: Fit the Poisson distribution to the following data
x: 0 1 2 3 4 5 6 7 8
f: 56 156 132 92 37 22 4 0 1
Aim:- To fit the Poisson distribution to the above data.
Formula:-
The probability mass function of the Poisson distribution is $P(X = x) = \frac{e^{-\lambda} \lambda^x}{x!}$, x = 0,1,2,3,...,
$\lambda > 0$, where $\lambda$ is the mean of the Poisson distribution and is obtained by $\lambda = \frac{\sum f x}{\sum f}$
The expected frequencies are obtained by using the formula $E(x) = N \cdot P(X = x)$,
where $\text{mean} = \frac{\sum f x}{\sum f}$ and $N = \sum f$
Calculation & R - commands:-
# To fit the Poisson distribution to the given data.
>x<-c(0,1,2,3,4,5,6,7,8)
>x
>f<-c(56,156,132,92,37,22,4,0,1)
>f
>n<-length(x)
>n
>N<-sum(f)
>N
>mu<-sum(x*f)/sum(f)
>mu
>lambda<-mu
>lambda
#To calculate the probabilities of Poisson distribution for X~p(lambda)
>pb<-dpois(x, lambda)
>pb
#To calculate the expected frequencies
>ef<-N*pb
>ef
>rf<-round(ef)
>rf
Experiment - III
Normal distribution
Problem 1: Suppose the weights of 900 male students are normally distributed with mean µ =
140 pounds and standard deviation 10 pounds. Find the number of students whose weights are
(i) between 138 and 148 pounds (ii) more than 152 pounds.
Aim:- To find the number of students whose weights are (i) between 138 and 148 pounds (ii)
more than 152 pounds.
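Formula:-
The probability density function of the Normal distribution is $f(x) = \frac{1}{\sigma\sqrt{2\pi}} e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2}$, $-\infty < x < \infty$,
$-\infty < \mu < \infty$, $\sigma > 0$
The expected number is obtained by using the formula N*P(a<X<b), where N = total frequency and P(a<X<b) is the probability of the required range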
Calculation & R – Command:-
#To calculate P(138<x<148) for X~N(140,10)
>mu<-140
>mu
>sd<-10
>sd
>N<-900
>N
(i) >p1<-pnorm(138, mean = 140, sd=10,lower.tail = T)
>p1
>p2<-pnorm(148, mean = 140, sd=10,lower.tail = F)
>p2
>p<-1-(p1+p2)
>p
>ens<-N*p
>ens
Or
>p<-pnorm(148, mean = 140, sd=10)-pnorm(138, mean = 140, sd=10)
>p
>ens<-p*900
>ens
# To calculate p(x>152) for x~N(140,10)
(ii) >p<-pnorm(152, mean = 140, sd=10,lower.tail = F)
>p
>ens<-p*900
>ens
Problem 2: If the masses of 300 students are normally distributed with mean 68 kgs and standard
deviation 3 kgs, how many students have masses (i) greater than 72 kgs (ii) less than or equal
to 64 kgs (iii) between 65 and 71 kgs?
Aim:- To find the number of students who have masses (i) greater than 72 kgs (ii) less than or equal
to 64 kgs (iii) between 65 and 71 kgs.
Formula:-
The probability density function of the Normal distribution is $f(x) = \frac{1}{\sigma\sqrt{2\pi}} e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2}$, $-\infty < x < \infty$,
$-\infty < \mu < \infty$, $\sigma > 0$
The expected number is obtained by using the formula N*P(a<X<b), where N = total frequency and P(a<X<b) is the probability of the required range
Calculation & R – Command:-
# To calculate the probabilities of normal distribution for x~N(mu,sigma)
>mu<- 68
>mu
>sigma<-3
>sigma
#To calculate p(x>72) for x~N(68,3)
(i) >pb1<-pnorm(72, mean = 68, sd=3,lower.tail = F)
>pb1
#To calculate p(x<=64) for x~N(68,3)
(ii) >pb2<-pnorm(64, mean = 68, sd=3,lower.tail = T)
>pb2
#To calculate p(65<x<71) for x~N(68,3)
(iii) >pb3<-pnorm(65, mean = 68, sd=3,lower.tail = T)
>pb3
>pb4<-pnorm(71, mean = 68, sd=3,lower.tail = F)
>pb4
>pb<-1-(pb3+pb4)
>pb
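The problem asks for the number of students, so each probability still has to be multiplied by N = 300, as the formula N*P states. A minimal sketch of that last step follows; the variable names are illustrative.

```r
N     <- 300   # number of students
mu    <- 68    # mean mass in kg
sigma <- 3     # standard deviation in kg

p_gt72  <- pnorm(72, mean = mu, sd = sigma, lower.tail = FALSE)                 # P(X > 72)
p_le64  <- pnorm(64, mean = mu, sd = sigma)                                     # P(X <= 64)
p_65_71 <- pnorm(71, mean = mu, sd = sigma) - pnorm(65, mean = mu, sd = sigma)  # P(65 < X < 71)

# Expected number of students in each group, rounded to whole students
expected <- N * c(gt72 = p_gt72, le64 = p_le64, between_65_71 = p_65_71)
round(expected)
```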

Experiment - IV
Central limit theorem:
Problem 1: A random sample of size 64 is taken from a normal population with µ = 51.4 and
σ = 6.8. What is the probability that the mean of the sample will (i) exceed 52.9 (ii) fall
between 50.5 and 52.3 (iii) be less than 50.6.

Aim:- To find the probabilities that the mean of the sample will (i) exceed 52.9 (ii) fall
between 50.5 and 52.3 (iii) be less than 50.6.

Formula:- (Central limit theorem)

If $\bar{x}$ is the mean of a sample of size n drawn from a population with mean µ and sd σ, then the
standardized sample mean $z = \frac{\bar{x} - \mu}{\sigma/\sqrt{n}}$ is a random variable whose distribution approaches that
of the standard normal distribution N(z;0,1) as n→∞.

Calculation & R – commands:-


#to calculate the probabilities that the mean of the sample will (i) exceed 52.9 (ii) fall between
50.5 and 52.3 (iii) be less than 50.6.

>n<-64

>n

>mu<-51.4
>mu
>sigma<-6.8
>sigma
>se<-sigma/sqrt(n)
#to calculate p(xbar>52.9) for xbar~N(51.4, 6.8/sqrt(64))
(i) >p1<-pnorm(52.9,51.4, se, lower.tail = F)
>p1
#to calculate p(50.5<xbar<52.3) for xbar~N(51.4, 6.8/sqrt(64))
(ii) >p2<-pnorm(52.3,51.4, se, lower.tail = T) - pnorm(50.5,51.4, se, lower.tail = T)
>p2
#to calculate p(xbar<50.6) for xbar~N(51.4, 6.8/sqrt(64))
(iii) >p3<-pnorm(50.6,51.4, se, lower.tail = T)
>p3
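The claim that the sample mean has standard deviation sigma/sqrt(n) can be checked by simulation. The sketch below draws many samples of size 64 and compares the simulated proportion of means above 52.9 with the value from pnorm(); the seed and the number of repetitions are arbitrary choices made for illustration.

```r
set.seed(1)            # arbitrary seed, only for reproducibility
n     <- 64
mu    <- 51.4
sigma <- 6.8
reps  <- 20000         # number of simulated samples (an arbitrary choice)

# Each row is one sample of size n; rowMeans() gives the simulated sample means
samples <- matrix(rnorm(reps * n, mean = mu, sd = sigma), nrow = reps)
xbar    <- rowMeans(samples)

se        <- sigma / sqrt(n)
theory    <- pnorm(52.9, mu, se, lower.tail = FALSE)   # P(xbar > 52.9) from the formula
simulated <- mean(xbar > 52.9)                         # proportion observed in the simulation
c(theory = theory, simulated = simulated)
c(se = se, sd_of_simulated_means = sd(xbar))
```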

Problem 2:- Determine the expected number of random samples having their means (i) between
22.39 and 22.41 (ii) greater than 22.42 (iii) less than 22.37 (iv) less than 22.38 or more than
22.41 for the following data: N = 300, n = 36, σ = 0.48, µ = 22.4.
Aim:- To determine the expected number of random samples having their means (i) between
22.39 and 22.41 (ii) greater than 22.42 (iii) less than 22.37 (iv) less than 22.38 or more than
22.41
Formula:- (Central limit theorem)
If $\bar{x}$ is the mean of a sample of size n drawn from a population with mean µ and SD σ, then the
standardized sample mean $z = \frac{\bar{x} - \mu}{\sigma/\sqrt{n}}$ is a random variable whose distribution approaches that
of the standard normal distribution N(z;0,1) as n→∞
The expected number of samples is obtained by using the formula N*P, where P is the probability that the sample mean falls in the required range
Calculation &R – commands:-
# to calculate the probabilities of sample mean for X~N(mu, sigma/sqrt(n))
>N<-300
>N
>n<-36
>n
>mu<-22.4
>mu
>sigma<-0.48
>sigma
>se<-sigma/sqrt(n)
>se
# to calculate p(22.39<xbar<22.41) for x~N(22.4,0.48)
(i) >p1<-pnorm(22.41,22.4,se, lower.tail=T)-pnorm(22.39,22.4,se, lower.tail=T)
>p1
#to calculate the expected number of random samples having their means between 22.39 and 22.41.
>ens1<-p1*N
>ens1
# to calculate p(xbar>22.42) for x~N(22.4,0.48)
(ii) >p2<-pnorm(22.42,22.4,se,lower.tail = F)
>p2
#to calculate the expected number of random samples having their means greater than 22.42
>ens2<-p2*N
>ens2
# to calculate p(xbar<22.37) for x~N(22.4,0.48)
(iii) >p3<-pnorm(22.37,22.4,se,lower.tail = T)
>p3
#to calculate the expected number of random samples having their means less than 22.37
>ens3<-p3*N
>ens3
# to calculate p(xbar<22.38 or xbar>22.41) for x~N(22.4,0.48)
(iv) >p4<-1-( pnorm(22.41,22.4,se)-pnorm(22.38,22.4,se))
>p4
(or )
>p4<-pnorm(22.38,22.4,se, lower.tail=T)+pnorm(22.41,22.4,se, lower.tail=F)
>p4
#to calculate the expected number of random samples having their means less than 22.38 or greater than 22.41
>ens4<-p4*N
>ens4

Experiment – V
Confidence Intervals:
Problem 1: In a random sample of 100 packages shipped by air freight, 13 had some damage.
Construct a 95% confidence interval for the true proportion of damaged packages.
Aim:- To construct a 95% confidence interval for the true proportion of damaged packages
Formula:-
The confidence interval for a single proportion is $\left(p - z_{\alpha/2}\sqrt{pq/n},\; p + z_{\alpha/2}\sqrt{pq/n}\right)$, where p =
sample proportion = x/n, q = 1 - p, $z_{\alpha/2}$ = critical value at α level, $1 - \alpha = 95\%$, $\alpha = 5\%$, $\alpha/2 = 0.025$, $1 - \alpha/2 = 0.975$
Calculation & R – command:-
# to calculate 95% confidence interval for the true proportion of the damage package
>n<-100
>n
>x<-13
>x
>p<-13/100
>p
>q<-1-p
>q
>alpha<-0.05
>alpha
>zalphaby2<-round(qnorm(1-alpha/2),2)
>zalphaby2
>se<-zalphaby2*sqrt((p*q)/n)
>se
>lower_limit<-p-se
>lower_limit
>upper_limit<-p+se
>upper_limit
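The same steps can be wrapped in a small function so that other samples or confidence levels can be tried without retyping the commands. This is a sketch of the large-sample interval given in the formula above; the name prop_ci is hypothetical, and the critical value is left unrounded, so the limits may differ from the record's in the third or fourth decimal place.

```r
# Hypothetical helper: large-sample confidence interval for a single proportion
prop_ci <- function(x, n, conf.level = 0.95) {
  p <- x / n
  z <- qnorm(1 - (1 - conf.level) / 2)      # unrounded critical value z_(alpha/2)
  margin <- z * sqrt(p * (1 - p) / n)       # critical value times standard error
  c(lower = p - margin, upper = p + margin)
}

prop_ci(13, 100)          # 95% interval for the damaged-package problem
prop_ci(13, 100, 0.99)    # a wider interval at 99% confidence
```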

Problem 2: We assume that the sample mean is 5, the standard deviation is 2, and the sample
size is 20. We will use a 95% confidence level and wish to find the confidence interval of the
population mean.
Aim:- To construct 95% confidence interval for the true mean
Formula:-
The confidence interval for a single mean is $\left(\bar{x} - z_{\alpha/2}\frac{\sigma}{\sqrt{n}},\; \bar{x} + z_{\alpha/2}\frac{\sigma}{\sqrt{n}}\right)$, where $\bar{x}$ = sample mean,
$z_{\alpha/2}$ = critical value at α level, $1 - \alpha = 95\%$, $\alpha = 5\%$, $\alpha/2 = 0.025$, $1 - \alpha/2 = 0.975$
Calculation& R – commands:-
#to construct 95% confidence interval for the true mean
>xbar<- 5
>xbar
>sigma <- 2
>sigma
>n <- 20
>n
>zalphaby2<-round(qnorm(1 - (1 - 0.95)/2), 2)
>zalphaby2
>se<-zalphaby2*sigma/sqrt(n)
>se
>c(xbar-se, xbar + se)
>lower_limit <- xbar-se
>lower_limit
>upper_limit<- xbar + se
>upper_limit

Problem 3: In a large city A, 20% of a random sample of 900 school children had defective
eye-sight. In other large city B, 15% of sample of 1600 children had the same defect. Obtain
95% confidence limits for the difference in the population proportions.
Aim:- To obtain 95% confidence limits for the difference in the population proportions
Formula:- The confidence interval for difference of proportions is given by
$\left((p_1 - p_2) - z_{\alpha/2}\sqrt{\frac{p_1 q_1}{n_1} + \frac{p_2 q_2}{n_2}},\; (p_1 - p_2) + z_{\alpha/2}\sqrt{\frac{p_1 q_1}{n_1} + \frac{p_2 q_2}{n_2}}\right)$, where p1 = x1/n1 and p2 = x2/n2 and
$z_{\alpha/2}$ = critical value at α level, $1 - \alpha = 95\%$, $\alpha = 5\%$, $\alpha/2 = 0.025$, $1 - \alpha/2 = 0.975$


Calculation and R – commands:
# to obtain 95% confidence limits for the difference in the population proportions
>n1<-900
>n1
>n2<-1600
>n2
>x1<-180
>x1
>x2<-240
>x2
>p1<-180/900
>p1
>p2<-240/1600
>p2
>q1<-1-p1
>q1
>q2<-1-p2
>q2
>a<-(p1*q1)/n1
>a
>b<-(p2*q2)/n2
>b
>se<-sqrt(a+b)
>se
>zalphaby2<-round(qnorm(1 - (1 - 0.95)/2), 2)
>zalphaby2
>lower_limit<-(p1-p2)-zalphaby2*se
>lower_limit
>upper_limit<-(p1-p2)+zalphaby2*se
>upper_limit

Problem 4: In a certain factory there are two independent processes manufacturing the same
item. The average weight in a sample of 250 items produced from one process is found to be
120 ozs with a sd of 12 ozs. While the corresponding figures in a sample of 400 items from
the other process are 124 and sd of 14 ozs. Obtain the 99% confidence limits for the difference
in the average weights of items produced by the two process respectively.

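Aim:- To obtain 99% confidence limits for the difference in the average weights of items
produced by the two processes

Formula:- The confidence interval for difference of means is
$\left((\bar{x}_1 - \bar{x}_2) - z_{\alpha/2}\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}},\; (\bar{x}_1 - \bar{x}_2) + z_{\alpha/2}\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}\right)$,
where $\bar{x}_1, \bar{x}_2$ = sample means, $\sigma_1, \sigma_2$ = standard deviations, $n_1, n_2$ = sample sizes, $z_{\alpha/2}$ = critical value at α level, $1 - \alpha = 99\%$, $\alpha = 1\%$, $\alpha/2 = 0.005$, $1 - \alpha/2 = 0.995$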
Calculation & R – commands:
#to obtain 99% confidence limits for the difference in the average weights of items produced
by the two processes
>n1<-250
>n1
>n2<-400
>n2
>sigma1<-12
> sigma1
>sigma2<-14
> sigma2
>x1bar<-120
>x1bar
>x2bar<-124
>x2bar
>zalphaby2<-round(qnorm(1 - (1 - 0.99)/2), 2)
>zalphaby2
>a<-(sigma1)^2/n1
>a
>b<-(sigma2)^2/n2
>b
>se<-sqrt(a+b)
>se
>lower_limit<-(x1bar-x2bar)-zalphaby2*se
>lower_limit
>upper_limit<-(x1bar-x2bar)+zalphaby2*se
>upper_limit
Experiment – VI
Test for proportions:
Problem 1: A manufacturer claimed that at least 95% of the equipment which he supplied to a
factory conformed to specifications. An examination of a sample of 200 pieces of equipment
revealed that 18 were faulty. Test his claim at 5% level of significance.
Aim:- To test the claim that at least 95% of the equipment which the manufacturer supplied to
a factory conformed to specifications at 5% level of significance
Formula:-
H0: At least 95% of the equipment supplied to the factory conformed to specifications
(P = 0.95)
H1: Less than 95% of the equipment supplied to the factory conformed to specifications
(P < 0.95) (left-tailed test)
Test statistic $Z = \frac{p - P}{\sqrt{PQ/n}} \sim N(0,1)$, where p = x/n, Q = 1 - P, x = no. of good items, n = total no. of items.
Here x = 200 - 18 = 182, so p = 182/200 = 0.91.

Calculation & R – commands:


# Test his claim at 5% level of significance
>Test<-function(p, P, n){
(p-P)/(sqrt(P*(1-P)/n))
}
>ZStatistic<-Test(0.91, 0.95, 200)
>ZStatistic
>Zalpha<-round(qnorm(0.95),2)
>Zalpha
>Calcz<-abs(ZStatistic)
>Calcz
>if(Calcz < Zalpha){
print("Null hypothesis is accepted")
}else{
print("Null Hypothesis is rejected")
}
Alternative Calculation and R – command:-
>p<-0.91
>p
>P<-0.95
>P
>n<-200
>n
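# prop.test() expects the count of good items (200 - 18 = 182), not the proportion 0.91
prop.test(x = 182, n = 200, p = 0.95, alternative = "less", conf.level = 0.95, correct = F)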
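The decision can also be stated through a p-value instead of a comparison with the critical value. The claim of "at least 95%" is contradicted only when the sample proportion falls well below 0.95, so the sketch below assumes a left-tailed test; the variable names are illustrative.

```r
p_hat <- (200 - 18) / 200     # sample proportion of good items
P0    <- 0.95                 # proportion claimed under H0
n     <- 200

z <- (p_hat - P0) / sqrt(P0 * (1 - P0) / n)   # same statistic as Test(0.91, 0.95, 200)
p_value <- pnorm(z)                           # left-tailed: P(Z <= z)

z
p_value
p_value < 0.05                # TRUE means H0 is rejected at the 5% level
```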
Problem 2: Random samples of 400 men and 600 women were asked whether they would like
to have a flyover near their residence. 200 men and 325 women were in favour of the proposal.
Test the hypothesis that proportions of men and women in favour of the proposal are same,
at 5% level of significance.
Aim:- To test the hypothesis that proportions of men and women in favour of the proposal are
same at 5% level of significance
Formula:-
H0: P1=P2 i.e. there is no significant difference between the proportions of men and women in
favour of the proposal
H1: P1 not equal to P2 i.e. there is a significant difference between the proportions of men
and women in favour of the proposal (two-tailed test)
Test statistic $Z = \frac{p_1 - p_2}{\sqrt{\frac{p_1 q_1}{n_1} + \frac{p_2 q_2}{n_2}}} \sim N(0,1)$, p1 = x1/n1, p2 = x2/n2, q1 = 1 - p1, q2 = 1 - p2

Calculation & R – commands:


z.testprop = function(p1, p2, n1,n2){
z = (p1 -p2) / (sqrt((p1*(1-p1))/n1 + (p2*(1-p2))/n2))
}
>zstatistic<-z.testprop(200/400,325/600,400,600)
>zstatistic
>calcz<-abs(zstatistic)
>calcz
>zalphaby2<-round(qnorm(1-(1-0.95)/2),2)
>zalphaby2
if(calcz < zalphaby2){
print("Null hypothesis is accepted")
}else{
print("Null Hypothesis is rejected")
}
Alternative Calculation and R- command:-
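No commands are given under this heading in the record. One possibility, offered here as an assumption and not as the author's solution, is prop.test() from R's stats package. It pools the two sample proportions when estimating the standard error, so the square root of its X-squared value differs slightly from the unpooled z statistic computed above.

```r
x <- c(200, 325)    # number in favour: men, women
n <- c(400, 600)    # sample sizes: men, women

res <- prop.test(x, n, alternative = "two.sided", conf.level = 0.95, correct = FALSE)
res

z_pooled <- sqrt(unname(res$statistic))   # magnitude of the pooled z statistic
z_pooled
res$p.value > 0.05                        # TRUE means H0 is not rejected at the 5% level
```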

Experiment – VII
Test for means:
Problem 1: A sample of 400 items is taken from a population whose standard deviation is 10.
The mean of the sample is 40. Test whether the sample has come from a population with mean
38. Also, calculate the 95% confidence interval for the population mean.
Aim:- To test whether the sample has come from a population with mean 38. Also, to
calculate the 95% confidence interval for the population mean.
Formula:
H0:µ=38 i.e., the sample has come from a population with mean 38
H1: µ not equal to 38 i.e., the sample has not come from a population with mean 38 (two-tailed
test)
Test statistic $Z = \frac{\bar{x} - \mu}{\sigma/\sqrt{n}} \sim N(0,1)$, $\bar{x}$ = sample mean, σ = population standard deviation, n = sample size

Calculation and R – command:-
>Test<-function(xbar, mu, sd, n){
(xbar-mu)/(sd/sqrt(n))
}
>ZStatistic<-Test(40, 38, 10, 400)
>ZStatistic
>Zalphaby2<-round(qnorm(1-(1-0.95)/2),2)
>Zalphaby2
if(abs(ZStatistic) < Zalphaby2){
print("Null Hypothesis is accepted")
}else{
print("Null Hypothesis is rejected")
}
#calculation for confidence interval
>xbar<-40
>xbar
>sigma<-10
>sigma
>n<-400
>n
>Zalphaby2<-round(qnorm(1-(1-0.95)/2),2)
>Zalphaby2
>se<- Zalphaby2*sigma/sqrt(n)
>se
>lower_limit<-xbar-se
>lower_limit
>upper_limit<-xbar+se
>upper_limit
Alternative Calculation and R – command:-
>x<-40
>x
>n<-400
>n
>mu<-38
>mu
>sd<-10
>sd
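# z.test() is not part of base R. Assuming the BSDA package is installed,
# its zsum.test() function accepts summary statistics instead of raw data:
library(BSDA)
zsum.test(mean.x = 40, sigma.x = 10, n.x = 400, mu = 38, conf.level = 0.95)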
Problem 2: The mean yield of wheat from district A was 210 pounds with standard deviation
10 pounds per acre from a sample of 100 plots. In another district the mean yield was 220
pounds with S.D. 12 pounds from a sample of 150 plots. Assuming that the S.D. of yield in
the entire state was 11 pounds. Test whether there is any significant difference between the
mean yield of crops in the two districts.
Aim:-To test whether there is any significant difference between the mean yield of crops in
the two districts
Formula:-
H0: µ1 = µ2 i.e. there is no significant difference between the mean yield of crops in the two
districts
H1: µ1 not = µ2 i.e. there is a significant difference between the mean yield of crops in the
two districts (two-tailed test)
Test statistic $Z = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}} \sim N(0,1)$, where $\bar{x}_1, \bar{x}_2$ are sample means and n1, n2 are sample sizes.
Here the S.D. of the entire state is used for both districts, so σ1 = σ2 = 11.

Calculation and R – command:-


x1bar=210
x2bar=220
SD=11
n1=100
n2=150
z.testsam = function(x1bar, x2bar, var.a, var.b,n1,n2){
z = (x1bar -x2bar) / (sqrt(var.a/n1 + var.b/n2))
}
z<-z.testsam(210,220,121,121,100,150)
calcz<-abs(z)
zalphaby2<-round(qnorm(1-(1-0.95)/2),2)
zalphaby2
if(calcz < zalphaby2){
print("Null hypothesis is accepted")
}else{
print("Null Hypothesis is rejected")
}
Experiment – VIII
t-test for means
Problem 1: A machinist is making engine parts with axle diameters of 0.700 inch. A random
sample of 10 parts shows a mean diameter of 0.742 inch with a standard deviation of 0.040
inch. Compute the statistic you would use to test whether the work is meeting the specification
at 5% level of significance.
Aim:- To test whether the work is meeting the specifications at 5% level of significance
Formula:-
H0: µ=µ0 i.e. the mean diameter of the work is meeting the specification (µ0 = 0.700)
H1: µ not = µ0 i.e. the mean diameter of the work is not meeting the specification (two-tailed test)
Test statistic $t = \frac{\bar{x} - \mu_0}{s/\sqrt{n-1}} \sim t_{n-1}$, where $\bar{x} = \frac{1}{n}\sum x_i$ and $s^2 = \frac{1}{n}\sum (x_i - \bar{x})^2$.
Equivalently, $t = \frac{\bar{x} - \mu_0}{S/\sqrt{n}}$ with $S^2 = \frac{1}{n-1}\sum (x_i - \bar{x})^2$.

Calculation and R – commands:-
xbar=0.742
n=10
mu=0.700
sd=0.040
t_test<-function(xbar,mu,n,sd){
(xbar-mu)/(sd/sqrt(n-1))
}
tStatistic<-t_test(0.742,0.700,10,0.040)
talphaby2<-round(qt(1-(1-0.95)/2,9,lower.tail = T),2)
talphaby2
tStatistic
if(abs(tStatistic) < talphaby2){
print("Null Hypothesis is accepted")

}else{
print("Null Hypothesis is rejected")
}
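The same conclusion can be read from a two-tailed p-value obtained with pt(). This is a minimal sketch using the summary figures from the problem; it keeps the sqrt(n - 1) form of the statistic used above, which treats 0.040 as the standard deviation computed with divisor n.

```r
xbar <- 0.742   # sample mean diameter
mu0  <- 0.700   # specified diameter
s    <- 0.040   # sample standard deviation (divisor n assumed)
n    <- 10

t_stat  <- (xbar - mu0) / (s / sqrt(n - 1))                        # same statistic as t_test() above
p_value <- 2 * pt(abs(t_stat), df = n - 1, lower.tail = FALSE)     # two-tailed p-value

t_stat
p_value
p_value < 0.05     # TRUE means H0 is rejected at the 5% level
```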
Problem 2 : Below are given the gain in weights (in lbs) of pigs fed on two diets A and B
DIET A: 25 32 30 34 24 14 32 24 30 31 35 25 -- -- --
DIET B: 44 34 22 10 47 31 40 30 32 35 18 21 35 29 22
Test, if the two diets differ significantly as regards their effect on increase in weight.
Aim:- To test the two diets differ significantly as regards their effect on increase in mean
weight
Formula:-
H0: µ1=µ2 i.e. the two diets do not differ significantly as regards their effect on increase in
mean weight
H1: µ1 not = µ2 (two-tailed test)
Test statistic $t = \frac{\bar{x}_1 - \bar{x}_2}{s\sqrt{\frac{1}{n_1} + \frac{1}{n_2}}} \sim t_{n_1+n_2-2}$, where $\bar{x}_1 = \frac{1}{n_1}\sum x_i$, $\bar{x}_2 = \frac{1}{n_2}\sum y_j$ and
$s^2 = \frac{1}{n_1+n_2-2}\left(\sum (x_i - \bar{x}_1)^2 + \sum (y_j - \bar{x}_2)^2\right)$
Calculation and R – commands:-
DietA<- c(25,32,30,34,24,14,32,24,30,31,35,25)
DietA
DietB<- c(44,34,22,10,47,31,40,30,32,35,18,21,35, 29,22)
DietB
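# var.equal = TRUE gives the pooled-variance test of the formula (the default is Welch's test)
t.test(DietA, DietB, var.equal = TRUE, conf.level = .95)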
Another Solution
DietA<- c(25,32,30,34,24,14,32,24,30,31,35,25)
DietA
DietB<- c(44,34,22,10,47,31,40,30,32,35,18,21,35, 29,22)
DietB
mu1<-mean(DietA)
mu1
mu2<-mean(DietB)
mu2
n1<-length(DietA)
n1
n2<-length(DietB)
n2
var<-((sum((DietA-mu1)^2))+sum((DietB-mu2)^2))/(n1+n2-2)
var
t_test<-function(mu1,mu2,n1,n2,var)
{
(mu1-mu2)/sqrt(var*(1/n1+1/n2))
}
tcritical<-round(qt(1-(1-0.95)/2,n1+n2-2),2)
tStatistic<-t_test(mu1,mu2,n1,n2,var)
if(abs(tStatistic) < tcritical)
{
print("Null hypothesis is accepted")
}else{
print("Null hypothesis is rejected")
}

Problem 3: A random sample of 10 boys had the following I.Qs: 70,120,110,101, 88, 83, 95,
98, 107, and 100. Do these data support the assumption of a population mean I.Q. of 100.
Aim:- To test whether the data support the assumption of a population mean I.Q of 100
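Formula:-
H0: µ=µ0 i.e. the data support the assumption of a population mean I.Q. of 100 (µ0 = 100)
H1: µ not = µ0 (two-tailed test)
Test statistic $t = \frac{\bar{x} - \mu_0}{S/\sqrt{n}} \sim t_{n-1}$, where $\bar{x} = \frac{1}{n}\sum x_i$ and $S^2 = \frac{1}{n-1}\sum (x_i - \bar{x})^2$
(S is the value returned by sd() in R)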

Calculation and R – command:-
x1<-c(70,120,110,101, 88, 83, 95, 98, 107, 100)
xbar<-mean(x1)
xbar
mu=100
mu
n=length(x1)
n
sd<-sd(x1)
sd
t_test<-function(xbar,mu,n,sd){
(xbar-mu)/(sd/sqrt(n))
}
tStatistic1<-t_test(xbar,mu,n,sd)
talpha<-round(qt(1-(1-0.95)/2,n-1,lower.tail = T),5)
talpha
tStatistic1
tStatistic1<-abs(tStatistic1)
if(tStatistic1 < talpha){
print("Null Hypothesis is accepted")

}else{
print("Null Hypothesis is rejected")
}
Another Solution
t.test(x1, y = NULL,alternative = c("two.sided"),mu = 100, paired = FALSE, var.equal =
FALSE,conf.level = 0.95)

Experiment XII
Correlation and Regression
Problem 1: Find the correlation coefficient for the following data
x:124 100 105 112 102 93 99 115 123 104 99 113 121 103 101
y:80 100 102 91 100 111 109 100 89 104 111 102 98 111 123
Aim:- To find the correlation coefficient for the following data
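Formula:-
The coefficient of correlation between X and Y is

$r_{xy} = \frac{\sum (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum (x_i - \bar{x})^2 \sum (y_i - \bar{y})^2}}$

Or (1) $r = \frac{\text{covariance of } x, y}{\sigma_x \sigma_y}$, (2) $r = \frac{\sum XY}{N \sigma_x \sigma_y}$, (3) $r = \frac{\sum XY}{\sqrt{\sum X^2 \sum Y^2}}$,
where $X = (x - \bar{x})$, $Y = (y - \bar{y})$ and $\bar{x}$, $\bar{y}$ are the means of the series x and y.

$\sigma_x$ = standard deviation of series x.

$\sigma_y$ = standard deviation of series y.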

Calculation and R – commands:-

x<-c(124,100,105,112,102,93,99,115,123,104,99,113,121,103,101)
y<-c(80,100,102,91,100,111,109,100,89,104,111,102,98,111,123)
> cov(x,y)
> cor(x,y)
> plot(x,y)
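To connect cor() with the formula, the coefficient can be computed step by step from the deviations about the means. This is a sketch for checking the formula, not a replacement for the commands above.

```r
x <- c(124,100,105,112,102,93,99,115,123,104,99,113,121,103,101)
y <- c(80,100,102,91,100,111,109,100,89,104,111,102,98,111,123)

dx <- x - mean(x)    # deviations X = x - xbar
dy <- y - mean(y)    # deviations Y = y - ybar

r_formula <- sum(dx * dy) / sqrt(sum(dx^2) * sum(dy^2))   # form (3) of the formula
r_cov     <- cov(x, y) / (sd(x) * sd(y))                  # form (1): covariance over product of SDs

# All three values should agree
c(formula = r_formula, from_cov = r_cov, builtin = cor(x, y))
```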

Problem 2: Calculate the regression lines from the following data:


x:22 26 29 30 31 31 34 35
y:20 20 21 29 27 24 27 31
Aim:- To calculate the regression lines from the following data
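Formula:-
In this method we take deviations from the assumed mean instead of the arithmetic mean. The
regression line of X on Y is

$X - \bar{X} = r\frac{\sigma_x}{\sigma_y}(Y - \bar{Y})$

We can find the value of $r\frac{\sigma_x}{\sigma_y}$ by applying the following formula:

$r\frac{\sigma_x}{\sigma_y} = \frac{\sum dx\,dy - \frac{\sum dx \sum dy}{N}}{\sum dy^2 - \frac{(\sum dy)^2}{N}}$, $dx = X - A$, $dy = Y - B$, where A and B are the assumed means of X and Y.

The regression equation of Y on X is $Y - \bar{Y} = r\frac{\sigma_y}{\sigma_x}(X - \bar{X})$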
Calculation and R – command:-
R-commands:
x<-c(22,26,29,30,31,31,34,35)
y<-c(20,20,21,29,27,24,27,31)
lm1<-lm(y~x)
lm1
plot(lm1)
plot(x,y)
abline(lm1,col=2)
> lm(x~y)   # regression line of x on y
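The slopes reported by lm() can be checked against the regression coefficients in the formula, r*sd(y)/sd(x) for y on x and r*sd(x)/sd(y) for x on y. The sketch below is illustrative; the names b_yx and b_xy are chosen here and do not appear in the record.

```r
x <- c(22,26,29,30,31,31,34,35)
y <- c(20,20,21,29,27,24,27,31)

r    <- cor(x, y)
b_yx <- r * sd(y) / sd(x)    # regression coefficient of y on x
b_xy <- r * sd(x) / sd(y)    # regression coefficient of x on y

fit_yx <- lm(y ~ x)          # regression line of y on x
fit_xy <- lm(x ~ y)          # regression line of x on y

c(b_yx = b_yx, lm_slope = unname(coef(fit_yx)["x"]))
c(b_xy = b_xy, lm_slope = unname(coef(fit_xy)["y"]))
b_yx * b_xy                  # product of the two coefficients equals r^2
```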
