C7 - DSC551 - R Programming

DSC551: Programming for Data Science (R Programming)
7. Basic Simulation in R

Lecturer, Department of Statistics


2024-10-01

Asmui Rahim, DSC551:R , Oct 2024


Introduction
Simulation is an important topic in statistics and in many other areas
where randomness needs to be introduced.
Implement a statistical procedure that requires random number generation or
sampling.
Simulate a system, using random number generators to model random
inputs.
R comes with a set of pseudo-random number generators that allow us to simulate from
well-known probability distributions such as the Binomial, Poisson and Normal
distributions.

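Each built-in distribution has four functions, distinguished by a one-letter prefix
(for example rnorm, dnorm, pnorm and qnorm for the Normal distribution):
r for random number generation.
d for density.
p for cumulative distribution function.
q for quantile function (inverse cumulative distribution function).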
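
To illustrate the four prefixes, here is a minimal sketch using the standard Normal distribution (mean 0 and standard deviation 1, the defaults of these functions). The variable name round_trip is introduced only for this illustration.

```r
# Standard normal distribution: mean 0, standard deviation 1 (the defaults)
dnorm(0)       # density at 0, equal to 1 / sqrt(2 * pi)
pnorm(1.96)    # cumulative probability P(X <= 1.96), about 0.975
qnorm(0.975)   # quantile with 97.5% of the area below it, about 1.96
rnorm(3)       # three random draws; values differ from run to run

# q is the inverse of p, so this recovers the original value 1.96
round_trip <- qnorm(pnorm(1.96))
round_trip
```

The same pattern applies to the other distributions in these slides: runif/dunif/punif/qunif, rbinom/dbinom/pbinom/qbinom and rpois/dpois/ppois/qpois.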

Generating random numbers

Uniform distribution
When you roll a fair die, the possible outcomes are 1 to 6. These outcomes are
equally likely, and that is the basis of the uniform distribution.
Parameters needed: minimum (min) and maximum (max).
Example: The number of bouquets sold daily at a flower shop is uniformly
distributed with a maximum of 40 and a minimum of 10.
Generate random numbers of sales for any 20 days.
round(runif(20, min = 10, max = 40))
[1] 23 11 33 15 25 15 39 17 25 30 33 10 31 39 37 15 21 22 26 23
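
One caveat about the example above: rounding a continuous uniform draw makes the endpoints 10 and 40 roughly half as likely as the values in between, because only a half-width interval rounds to each endpoint. If every whole number from 10 to 40 should be equally likely (an assumption about the intended model, not something the example states), sample() is one alternative. The seed value 551 is arbitrary, and the names sales and rounded are used only for this sketch.

```r
set.seed(551)  # arbitrary seed, chosen only so the sketch is repeatable

# Draw 20 daily sales counts, each integer from 10 to 40 equally likely
sales <- sample(10:40, size = 20, replace = TRUE)
sales

# Rounded runif() draws hit an endpoint about half as often as an interior value
rounded <- round(runif(100000, min = 10, max = 40))
mean(rounded == 10)  # about 0.5 / 30
mean(rounded == 25)  # about 1 / 30
```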

Binomial distribution
A binomial experiment satisfies the following four requirements:
1. There must be a fixed number of trials.
2. Each trial can have only two outcomes. These outcomes can be considered as
either success or failure.
3. The outcomes of each trial must be independent of one another.
4. The probability of a success must remain the same for each trial.
Parameters needed: number of trials (size) and probability of success (prob).

Example: A survey from a group project found that 30% of the students receive their
spending money from part-time jobs. Generate 10 random counts of students who
receive their spending money from part-time jobs, where each count comes from a
random selection of 5 students.
rbinom(10, size = 5, prob = 0.3)
[1] 1 4 2 1 4 1 3 1 1 0
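
As a sanity check on the example above, the simulated counts can be compared with theory. A minimal sketch, assuming a larger sample of 10000 draws and an arbitrary seed (both chosen only for illustration):

```r
set.seed(551)  # arbitrary seed for repeatability

# Theoretical probability of each possible count (0 to 5 students)
probs <- dbinom(0:5, size = 5, prob = 0.3)
round(probs, 4)

# The expected count is size * prob = 5 * 0.3 = 1.5
counts <- rbinom(10000, size = 5, prob = 0.3)
mean(counts)   # should be close to 1.5
```

With only 10 draws, as in the example, the sample mean can differ noticeably from 1.5. The agreement improves as the number of draws grows.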

Poisson Distribution
Can be used when events or items are distributed over a given period of time, area
or volume.
Examples, assuming that each event occurs randomly:
1. The number of emergency calls received by an ambulance in an hour.
2. The number of vehicles approaching the highway toll in a five-minute interval.
3. The number of typing errors per page.
4. The number of bacteria in a given culture.
Parameter needed: the average rate at which the event occurs (lambda).

Example: A washing machine in a self-service laundry breaks down an average of 3
times per month. Suppose the number of breakdowns follows a Poisson distribution.
Generate random numbers of breakdowns for any 12 months.
rpois(12, lambda = 3)
[1] 2 3 2 3 1 4 1 3 1 5 1 3
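
The washing machine example can also be examined with the d and p functions. This is a minimal sketch under the same assumption of lambda = 3. The seed and the variable names are chosen only for illustration.

```r
set.seed(551)  # arbitrary seed for repeatability

# Probability of no breakdowns in a month: exp(-3), about 0.0498
p_none <- dpois(0, lambda = 3)

# Probability of more than 5 breakdowns in a month
p_more_than_5 <- ppois(5, lambda = 3, lower.tail = FALSE)

# For a Poisson distribution, the mean and variance both equal lambda
breakdowns <- rpois(10000, lambda = 3)
mean(breakdowns)   # close to 3
var(breakdowns)    # close to 3
```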

Normal Distribution
Approximates the behaviour of many naturally occurring measurements.
Characteristics of the normal distribution:
1. A normal distribution curve is bell-shaped.
2. The mean, median and mode are equal and are located at the centre of the
distribution.
3. Symmetric about the mean.
4. Curve is continuous.
5. Total area under a normal distribution curve is equal to 1.00 or 100%.
Parameters needed: mean (mean) and standard deviation (sd).
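
Several of the characteristics listed above can be checked numerically. A minimal sketch, assuming example values of mean 15 and standard deviation 3 (the names mu, sigma, area, left and right are introduced only for this illustration):

```r
mu <- 15     # example mean (value chosen for illustration)
sigma <- 3   # example standard deviation

# Total area under the density curve is 1 (numerical integration)
area <- integrate(dnorm, lower = -Inf, upper = Inf, mean = mu, sd = sigma)
area$value

# Symmetry about the mean: equal density at equal distances on either side
left  <- dnorm(mu - 2, mean = mu, sd = sigma)
right <- dnorm(mu + 2, mean = mu, sd = sigma)
all.equal(left, right)

# The median (50% quantile) equals the mean
qnorm(0.5, mean = mu, sd = sigma)
```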

Example: A study reported that children between the ages of 2 and 5 watch an
average of 15 hours of YouTube per week. Assume the variable is normally distributed
and the standard deviation is 3 hours. Generate random weekly viewing hours for
20 randomly selected children between the ages of 2 and 5.
rnorm(20, mean = 15, sd = 3)
[1] 16.53296 15.62304 20.13076 19.96253 13.51690 14.74595 14.85434 18.67489
[9] 14.86843 17.00371 14.02095 11.66480 18.79348 19.27440 12.94321 11.15338
[17] 18.89279 21.03892 16.82730 14.85520
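
A short follow-up to the example above, as a sketch only: it stores the draws, summarises them, and compares them with a theoretical probability. The seed, the variable names and the 20-hour threshold are assumptions made for illustration.

```r
set.seed(551)  # arbitrary seed for repeatability

hours <- rnorm(20, mean = 15, sd = 3)
round(hours, 1)   # one decimal place is easier to read

mean(hours)       # near 15, but not exact with only 20 draws
sd(hours)         # near 3

# Theoretical share of children watching more than 20 hours per week
p_over_20 <- pnorm(20, mean = 15, sd = 3, lower.tail = FALSE)
p_over_20         # about 0.048
```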

Setting the Random Number Seed
1. It is essential to set the random number seed.
2. Setting the random number seed with set.seed() ensures reproducibility of the
sequence of random numbers.

set.seed(1)
rnorm(5)
[1] -0.6264538 0.1836433 -0.8356286 1.5952808 0.3295078
rnorm(5)
[1] -0.8204684 0.4874291 0.7383247 0.5757814 -0.3053884

Note

In general, you should always set the random number seed when conducting a
simulation so that you can reconstruct the exact numbers that were produced
in an analysis.
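
To see reproducibility directly, the seed can be reset before drawing again. A minimal sketch (the names first, second and third are used only for this illustration):

```r
set.seed(1)
first <- rnorm(5)

set.seed(1)                # reset to the same seed
second <- rnorm(5)

identical(first, second)   # TRUE: same seed, same sequence

third <- rnorm(5)          # no reset, so the sequence simply continues
identical(first, third)    # FALSE
```

Calling rnorm(5) a second time without resetting the seed, as in the slide example, continues the sequence and therefore returns different numbers.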

Summary
Drawing samples from specific probability distributions can be done with the “r”
functions (runif, rbinom, rpois, rnorm).
Standard distributions are built in: Uniform, Binomial, Poisson, Normal, etc.
Setting the random number generator seed via set.seed() is critical for
reproducibility.

End of Slides
