
Experiment 6

The document outlines an experiment focused on understanding Poisson and Normal distributions using R functions. It includes procedures for calculating probabilities, visualizing distributions, and solving problems related to defective products and job completion times. The document also provides code examples and results for various calculations and visualizations in R.

Uploaded by

Harsh Umredkar

Experiment – 6

Normal distribution, Poisson distribution


Aim: To understand Poisson distribution and Normal distribution using R functions
Introduction:
A discrete distribution is one in which the data can only take on certain values, for example integers.
For a discrete distribution, probabilities can be assigned to the values in the distribution. These
distributions model the probabilities of random variables that can have discrete values as outcomes.
Example: Binomial distribution, Poisson distribution

The Poisson distribution is a discrete probability distribution that gives the probability that a given
number of events will take place within a fixed time period, when the events occur independently at a
constant average rate λ. It is widely used in the field of finance. The notation is X ~ Pois(λ), where
λ > 0. The pmf is given by the following formula:

$$P(X = x) = \frac{e^{-\lambda}\,\lambda^{x}}{x!}, \quad x = 0, 1, 2, \dots$$
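
As a quick check of the formula, the sketch below evaluates the pmf by hand and compares it with R's built-in dpois(). The helper name pois_pmf and the value λ = 0.4 are chosen here for illustration only.

```r
# Hand-written Poisson pmf (illustrative helper, not part of base R)
pois_pmf <- function(x, lambda) {
  exp(-lambda) * lambda^x / factorial(x)
}

lambda <- 0.4   # assumed value, for illustration
x <- 0:5

manual  <- pois_pmf(x, lambda)
builtin <- dpois(x, lambda)

# The two columns should agree up to floating-point error
print(data.frame(x, manual, builtin))
print(all.equal(manual, builtin))
```
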
Procedure:
• Import the data set
• Determine the probabilities of the random variable using Poisson distribution in R
• Visualize the probability distribution using R functions

Problem:
A manufacturer of pins knows that 2% of his products are defective. He sells the pins in boxes of 20.
In a consignment of 1000 boxes, find the number of boxes containing (i) at least 2 defective pins,
(ii) exactly 2 defective pins, (iii) at most 2 defective pins. Also (iv) plot the distribution and find
(v) E(X) and (vi) the variance of X.
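
Strictly speaking, the number of defective pins in a box of 20 is binomial with n = 20 and p = 0.02; the Poisson distribution with λ = np = 0.4 serves as an approximation to it. The sketch below compares the two for small counts. The variable names are illustrative.

```r
# Binomial model for one box versus its Poisson approximation
n <- 20
p <- 0.02
lambda <- n * p            # Poisson parameter, 0.4

k <- 0:4
exact       <- dbinom(k, size = n, prob = p)
pois_approx <- dpois(k, lambda)

# Differences are small because n is moderately large and p is small
print(round(data.frame(k, exact, pois_approx, diff = exact - pois_approx), 5))
```
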

Codes and Results:


#Poisson Distribution
# number of trials
m=20
m

## [1] 20

# probability of success
ps=0.02
# poisson parameter
lambda=m*ps
lambda
## [1] 0.4
#at least 2 defectives
p1=sum(dpois(2:m,lambda))
p1

## [1] 0.06155194

# (i) number of boxes containing at least 2 defectives


round(1000*p1)

## [1] 62

#exactly 2 defectives
p2=dpois(2,lambda)
p2

## [1] 0.0536256

# (ii) number of boxes containing exactly 2 defectives


round(1000*p2)

## [1] 54

#at most 2 defectives


p3=sum(dpois(0:2, lambda))
p3

## [1] 0.9920737

# (iii) number of boxes containing at most 2 defectives


round(1000*p3)

## [1] 992

# (iv) plot the distribution


x1=0:m
px1=dpois(x1,lambda)
plot(x1,px1,type="h",xlab="values of x",ylab="Probability distribution of x",main="Poisson distribution")
#(v) E(x)
Ex1=weighted.mean(x1,px1)
Ex1

## [1] 0.4

# (vi) variance of x
Varx1=weighted.mean(x1*x1,px1)-(weighted.mean(x1 ,px1))^2
Varx1

## [1] 0.4
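
The sums over dpois() used above can also be written with the cumulative distribution function ppois(), which does not need the upper tail to be cut off at m. For a Poisson variable the mean and the variance both equal λ, which is why both results above come out as 0.4. A sketch assuming the same λ = 0.4 and 1000 boxes:

```r
lambda <- 0.4
boxes  <- 1000

p_at_least_2 <- ppois(1, lambda, lower.tail = FALSE)  # P(X >= 2) = P(X > 1)
p_exactly_2  <- dpois(2, lambda)                      # P(X = 2)
p_at_most_2  <- ppois(2, lambda)                      # P(X <= 2)

# Expected number of boxes in each category
counts <- round(boxes * c(at_least_2 = p_at_least_2,
                          exactly_2  = p_exactly_2,
                          at_most_2  = p_at_most_2))
print(counts)
```
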

The qpois() function calculates the quantiles of a Poisson distribution. It takes two main arguments:
the probability at which to evaluate the quantile and the mean of the Poisson distribution.
For example, to calculate the 95th percentile of a Poisson distribution with a mean of 2.5, you can use
the following code:
qpois(0.95, 2.5)
## [1] 5
The rpois() function generates random numbers from a Poisson distribution. It takes two arguments:
the number of values to generate and the mean of the Poisson distribution.
rpois(n, lambda)
where n is the number of random numbers needed and lambda is the mean per interval. The values are
random, so the output differs from run to run.
rpois(2, 3)
##[1] 4 1
rpois(6, 6)
##[1] 9 2 9 9 1 10
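
Because rpois() draws random numbers, a seed is needed to make a run repeatable. The sketch below fixes a seed (the value 123 is arbitrary), checks that the sample mean and variance are both close to λ, and confirms the qpois() result with ppois().

```r
set.seed(123)                      # arbitrary seed, for repeatability only
draws <- rpois(n = 1000, lambda = 3)

# For a Poisson sample both should be close to lambda = 3
print(mean(draws))
print(var(draws))

# The 95th percentile is the smallest k with P(X <= k) >= 0.95
q95 <- qpois(0.95, lambda = 2.5)
print(ppois(q95 - 1, lambda = 2.5))  # below 0.95
print(ppois(q95, lambda = 2.5))      # at least 0.95
```
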
Normal Distribution
The normal distribution is a continuous probability distribution defined by its probability density
function. Let f(x) be the probability density function of a continuous random variable X. Any density
must satisfy

$$f(x) \ge 0 \ \text{for all } x \in (-\infty, \infty) \quad \text{and} \quad \int_{-\infty}^{\infty} f(x)\,dx = 1$$

The probability density function of the normal (Gaussian) distribution is given by

$$f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{(x-\mu)^{2}}{2\sigma^{2}}}$$

Where,
x is the variable
μ is the mean
σ is the standard deviation
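
A sketch that codes the usual normal density by hand, with μ and σ as defined above, and compares it with dnorm(). The helper name norm_pdf and the values μ = 20, σ = 5 are assumptions made for illustration. The numerical integral runs over μ ± 10σ, since the area beyond that range is negligible.

```r
# Hand-written normal density (illustrative helper, not part of base R)
norm_pdf <- function(x, mu, sigma) {
  exp(-(x - mu)^2 / (2 * sigma^2)) / (sigma * sqrt(2 * pi))
}

mu    <- 20   # assumed mean
sigma <- 5    # assumed standard deviation
x <- seq(0, 40, by = 5)

# Should print TRUE: the formula matches dnorm()
print(all.equal(norm_pdf(x, mu, sigma), dnorm(x, mean = mu, sd = sigma)))

# Total area under the curve, which should be very close to 1
area <- integrate(norm_pdf, lower = mu - 10 * sigma, upper = mu + 10 * sigma,
                  mu = mu, sigma = sigma)
print(area$value)
```
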

Procedure:
• Generating the data set
• Determine the probabilities of the random variable using Normal distribution in R
• Visualize the probability distribution using R functions

dnorm(x, mean = 0, sd = 1, log = FALSE)


pnorm(q, mean = 0, sd = 1, lower.tail = TRUE, log.p = FALSE)
qnorm(p, mean = 0, sd = 1, lower.tail = TRUE, log.p = FALSE)
rnorm(n, mean = 0, sd = 1)
Arguments
x, q         vector of quantiles.
p            vector of probabilities.
n            number of observations. If length(n) > 1, the length is taken to be the number required.
mean         vector of means.
sd           vector of standard deviations.
log, log.p   logical; if TRUE, probabilities p are given as log(p).
lower.tail   logical; if TRUE (default), probabilities are P[X ≤ x]; otherwise, P[X > x].
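
The effect of the lower.tail and log arguments can be seen in a short sketch. The values q = 25, mean 20 and sd 5 are chosen for illustration.

```r
q     <- 25
mu    <- 20
sigma <- 5

lower <- pnorm(q, mean = mu, sd = sigma)                      # P[X <= 25]
upper <- pnorm(q, mean = mu, sd = sigma, lower.tail = FALSE)  # P[X > 25]
print(lower + upper)   # the two tails add up to 1

# log = TRUE returns the logarithm of the density
log_d <- dnorm(q, mean = mu, sd = sigma, log = TRUE)
print(all.equal(exp(log_d), dnorm(q, mean = mu, sd = sigma)))

# qnorm() inverts pnorm(): it recovers q from the lower-tail probability
print(qnorm(lower, mean = mu, sd = sigma))
```
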

Problem:
A company finds that the time taken by one of its engineers to complete a repair job has a normal
distribution with mean 20 minutes and standard deviation 5 minutes. State what proportion of jobs take:
i. Less than 15 minutes
ii. More than 25 minutes
iii. Between 15 and 25 minutes
iv. Plot the distribution
v. Table the distribution
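
The limits 15 and 25 lie exactly one standard deviation below and above the mean, so the same probabilities can be read from the standard normal distribution after standardising with z = (x − μ)/σ. A sketch with illustrative variable names:

```r
mu    <- 20
sigma <- 5

z15 <- (15 - mu) / sigma   # -1
z25 <- (25 - mu) / sigma   #  1

# Standardised and unstandardised calls give the same probability
print(all.equal(pnorm(z15), pnorm(15, mean = mu, sd = sigma)))

# Proportion within one standard deviation of the mean
within_1sd <- pnorm(z25) - pnorm(z15)
print(within_1sd)
```
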

Code and Results:


# Generating the data x
x=seq(0,40)
x

## [1] 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24
## [26] 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40

# find the density function of x


y=dnorm(x,mean=20,sd=5)
y

## [1] 2.676605e-05 5.838939e-05 1.223804e-04 2.464438e-04 4.768176e-04


## [6] 8.863697e-04 1.583090e-03 2.716594e-03 4.478906e-03 7.094919e-03
## [11] 1.079819e-02 1.579003e-02 2.218417e-02 2.994549e-02 3.883721e-02
## [16] 4.839414e-02 5.793831e-02 6.664492e-02 7.365403e-02 7.820854e-02
## [21] 7.978846e-02 7.820854e-02 7.365403e-02 6.664492e-02 5.793831e-02
## [26] 4.839414e-02 3.883721e-02 2.994549e-02 2.218417e-02 1.579003e-02
## [31] 1.079819e-02 7.094919e-03 4.478906e-03 2.716594e-03 1.583090e-03
## [36] 8.863697e-04 4.768176e-04 2.464438e-04 1.223804e-04 5.838939e-05
## [41] 2.676605e-05

# plot the normal distribution curve


plot(x,y,type='l')
# Proportion of jobs taking less than 15 minutes
p1=pnorm(15,mean=20,sd=5)
p1
## [1] 0.1586553

x2=seq(0,15)
x2

## [1] 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15

y2=dnorm(x2,mean=20,sd=5)
y2

## [1] 2.676605e-05 5.838939e-05 1.223804e-04 2.464438e-04 4.768176e-04


## [6] 8.863697e-04 1.583090e-03 2.716594e-03 4.478906e-03 7.094919e-03
## [11] 1.079819e-02 1.579003e-02 2.218417e-02 2.994549e-02 3.883721e-02
## [16] 4.839414e-02

polygon(c(0,x2,15),c(0,y2,0),col='yellow')

# Proportion of jobs taking more than 25 minutes
# (full upper tail, not truncated at x = 40)

p2=pnorm(25,mean=20,sd=5,lower.tail=FALSE)
p2

## [1] 0.1586553

x1=seq(25,40)
x1

## [1] 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40

y1=dnorm(x1,mean=20,sd=5)
y1

## [1] 4.839414e-02 3.883721e-02 2.994549e-02 2.218417e-02 1.579003e-02
## [6] 1.079819e-02 7.094919e-03 4.478906e-03 2.716594e-03 1.583090e-03
## [11] 8.863697e-04 4.768176e-04 2.464438e-04 1.223804e-04 5.838939e-05
## [16] 2.676605e-05

polygon(c(25,x1,40),c(0,y1,0),col='red')

# Proportion of jobs taking between 15 and 25 minutes

p3=pnorm(25,mean=20,sd=5)-pnorm(15,mean=20,sd=5)
p3

## [1] 0.6826895

x3=seq(15,25)
x3

## [1] 15 16 17 18 19 20 21 22 23 24 25

y3=dnorm(x3,mean=20,sd=5)
y3
## [1] 0.04839414 0.05793831 0.06664492 0.07365403 0.07820854 0.07978846
## [7] 0.07820854 0.07365403 0.06664492 0.05793831 0.04839414
polygon(c(15,x3,25),c(0,y3,0),col='green')

# Probability distribution
data.frame(p1,p2,p3)

##          p1        p2        p3
## 1 0.1586553 0.1586553 0.6826895
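
Item (v) of the problem asks for a table of the distribution. One possible reading, sketched below, tabulates the density and the cumulative probability on a grid of job times. The 5-minute grid and the name tab are assumptions made here, not part of the original exercise.

```r
# Tabulate density and cumulative probability on a coarse grid of job times
x <- seq(0, 40, by = 5)
tab <- data.frame(
  x          = x,
  density    = dnorm(x, mean = 20, sd = 5),
  cumulative = pnorm(x, mean = 20, sd = 5)
)
print(round(tab, 4))
```
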

The function qnorm(), which comes standard with R, does the opposite of pnorm(): given an area (a
probability), it finds the boundary value that determines this area.
For example, suppose you want to find the 85th percentile of a normal distribution whose mean is 70
and whose standard deviation is 3.
qnorm(0.85,mean=70,sd=3)
## [1] 73.1093
The value 73.1093 is indeed the 85th percentile, in the sense that 85% of the values in a population
that is normally distributed with mean 70 and standard deviation 3 will lie below 73.1093. In other
words, if you were to pick a random member X from such a population, then
P(X < 73.1093) = 0.85
This can be checked by plugging 73.1093 into pnorm(), which returns a value very close to 0.85:
pnorm(73.1093,mean=70,sd=3)
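
The round trip between qnorm() and pnorm() described above can be sketched as follows, using the same mean 70 and standard deviation 3. The variable names are illustrative.

```r
p <- 0.85
q <- qnorm(p, mean = 70, sd = 3)      # boundary value for an area of 0.85
back <- pnorm(q, mean = 70, sd = 3)   # area below that boundary

print(q)
print(all.equal(back, p))             # pnorm() undoes qnorm()

# The rounded value 73.1093 gives an area very close to, but not exactly, 0.85
print(pnorm(73.1093, mean = 70, sd = 3))
```
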
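The R function rnorm() generates a vector of random numbers from a normal distribution. With the
default arguments mean = 0 and sd = 1 it draws from the standard normal distribution, and the values
differ from run to run.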
random_values <- rnorm(n = 25)
random_values
##[1] -0.53499657 -0.74971670 -0.22026809 1.54708162 0.39232568 -0.86422146
##[7] 0.36944164 0.49069040 -0.69812587 -0.13187081 -0.27770614 0.51801381
##[13] 0.57531833 0.11970355 0.84234079 1.28128005 0.92639749 1.20121957
##[19] 0.21812995 -0.02332619 1.06100364 0.55529338 0.45422197 -0.17755005
##[25] -1.44717709
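
As a sketch of how rnorm() ties back to the repair-job problem, the code below simulates job times with mean 20 and standard deviation 5 and compares the simulated proportion below 15 minutes with the value from pnorm(). The seed 42 and the sample size 10000 are arbitrary choices.

```r
set.seed(42)                                  # arbitrary seed, for repeatability only
times <- rnorm(n = 10000, mean = 20, sd = 5)  # simulated job times in minutes

# Sample mean and standard deviation should be close to 20 and 5
print(mean(times))
print(sd(times))

# Simulated versus theoretical proportion of jobs under 15 minutes
print(mean(times < 15))
print(pnorm(15, mean = 20, sd = 5))
```
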

Conclusion: Poisson distribution and Normal distribution have been explored using R.
