Tutorial 2 - Questions.


ICT583

7 Mar 2023

ICT583 Data Science Applications


Tutorial 2
Mathematical Preliminaries

Scenarios
Many real-life situations follow a binomial distribution.
- For example, if a new drug is introduced to cure a disease, it either cures the disease (it’s a success) or it doesn’t (it’s a failure).
- If you purchase a lottery ticket, you either win money or you don’t.
Any experiment with exactly two possible outcomes, success or failure, that is repeated a fixed number of times, independently and with the same probability of success each time, can be represented by a binomial distribution.
A binomial distribution gives the probability of each possible number of SUCCESS outcomes in an experiment or survey that is repeated multiple times.

Maths
X ~ Binomial(n, p)
- For a binomial distribution, n = number of trials, p = probability of success on each trial
Question: is it a discrete or a continuous probability distribution? Answer: discrete

Coding
In these exercises, you will practice using the rbinom() function to generate random “flips” that are either “heads” (1) or “tails” (0), to simulate random data.
https://stat.ethz.ch/R-manual/R-devel/library/stats/html/Binomial.html
rbinom(n, size, prob)
generates n random values, each one the number of successes in size trials with success probability prob.
n - number of observations
size - number of trials per observation
prob - probability of success on each trial
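
To make the roles of n and size concrete, the short sketch below draws two small samples and inspects their shape. The seed value and the variable names flips and counts are arbitrary choices for illustration, not part of the tutorial.

```r
set.seed(583)  # arbitrary seed, only so the run is repeatable

flips  <- rbinom(n = 5, size = 1,  prob = 0.3)  # 5 observations of a single coin
counts <- rbinom(n = 5, size = 10, prob = 0.3)  # 5 observations of 10 coins each

length(flips)   # one value per observation
flips           # each value is 0 (tails) or 1 (heads)
range(counts)   # each value is a head count between 0 and 10
```

The length of the result is always n, while size only controls how large each individual value can get.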


Part One: Simulating coin flips


1.1 Flipping a coin in R
# Simulate 10 coin flips, each with a 30% chance of coming up “heads”:
- 10 coin flips = 10 observations (n = 10)
- Flipping one coin per observation (size = 1)
- 30% probability of heads (prob = .3)
rbinom(10, 1, .3)

Flipping multiple coins in R


# Generate 100 occurrences of flipping 10 coins, each with a 30% probability of heads
- 100 observations
- Flipping 10 coins per observation
- 30% chance of getting heads
rbinom(100, 10, .3)

How about increasing the number of observations, as well as the number of trials in each observation?

library(magrittr)  # provides the %>% pipe used below

rbinom(100, 3, .5) %>% hist

rbinom(100, 6, .5) %>% hist

rbinom(1000, 6, .5) %>% hist

rbinom(1000, 9, .5) %>% hist

rbinom(10000, 9, .5) %>% hist

rbinom(10000, 12, .5) %>% hist

# f(event) = prob: with freq = F the histogram shows densities (bar areas sum to 1) instead of counts
rbinom(10000, 12, .5) %>% hist(freq = F)
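
One way to check what the density histogram is approximating is to tabulate the relative frequency of each head count and set it beside the exact probability from dbinom() (covered in section 1.2). This is a sketch only; the seed and the names simulated and exact are assumptions made for the example.

```r
set.seed(583)  # arbitrary seed for repeatability

r <- rbinom(10000, 12, .5)

# Relative frequency of each head count 0..12 in the simulation
# (tabulate() counts the values 1..13, so shift the head counts by one)
simulated <- tabulate(r + 1, nbins = 13) / length(r)

# Exact probability of each head count from the binomial PMF
exact <- dbinom(0:12, 12, .5)

round(cbind(heads = 0:12, simulated, exact), 4)
```

The two columns should be close but not identical, and the gap generally shrinks as the number of observations grows.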


1.2 Calculating density of a binomial


By hand
Calculate the probability that exactly 2 flips are heads, out of 10 trials, using the binomial probability mass function (binomial PMF) formula:
P(X = k) = n! / ((n - k)! k!) × p^k × (1 - p)^(n - k)
- n = number of trials = 10
- k = number of desired outcomes = 2
- p = probability = 0.3

First, compute the binomial coefficient

10! / (8! 2!)
= (10 × 9 × 8 × 7 × 6 × 5 × 4 × 3 × 2 × 1) / ((8 × 7 × 6 × 5 × 4 × 3 × 2 × 1) × (2 × 1))
= (10 × 9) / (2 × 1)
= 90 / 2
= 45
Then calculate the probability when k = 2
P(X = 2)
= 45 × (0.3 ^ 2) × (0.7 ^ 8) = 0.2334744
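
The hand calculation can be reproduced step by step in R. The sketch below uses the base functions factorial() and choose(); the variable names are illustrative only.

```r
n <- 10; k <- 2; p <- 0.3

# Binomial coefficient, first from the factorial definition, then with the built-in
coef_by_factorials <- factorial(n) / (factorial(n - k) * factorial(k))
coef_builtin       <- choose(n, k)

# Binomial PMF evaluated at k = 2
prob_by_hand <- coef_builtin * p^k * (1 - p)^(n - k)

c(coef_by_factorials, coef_builtin)
prob_by_hand
```

Both routes to the coefficient should agree, and prob_by_hand should match the value worked out above.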

Coding
dbinom(x, size, prob)
gives the probability mass at each point, i.e. the probability of exactly x successes.
x - vector of numbers (specify where you want to evaluate the binomial probability)
If you flip 10 coins, each with a 30% probability of coming up heads, what is the probability that exactly 2 of them are heads?
# Calculate the probability that 2 are heads using dbinom
dbinom(2, 10, .3)
# Plot the probability of every possible number of heads, 0 to 10
plot(0:10, dbinom(0:10, size=10, prob=.3), type='h')


# Confirm your answer with a simulation using rbinom

# For example, with 100 observations, the random deviates are
r = rbinom(100, 10, .3)

# Which of the results have exactly 2 heads?
R <- r == 2

# The mean of a logical vector is the proportion of TRUE values
mean(R)

# how about increasing the number of observations?

# what do you observe?

# For a fair coin the chance of heads is 0.5. What is the probability of getting 2 heads,
# given 10 trials per observation?
dbinom(2, 10, .5)

# how about 8 heads?


dbinom(8, 10, .5)


1.3 Calculating cumulative density of a binomial


pbinom(q, size, prob)
gives the cumulative probability P(X <= q), i.e. the probability of at most q successes.

Scenario
If you flip ten coins that each have a 30% probability of heads, what is the probability at least
five are heads?

# Calculate the probability that at least five coins are heads


# The cumulative probability of fewer than five heads (four or fewer) is
r = pbinom(4, 10, .3)

# Cumulative distribution curve (the first line needs library(magrittr) for %>%)
lapply(1:10, function(x) pbinom(x, 10, .3)) %>% unlist %>% plot
plot(1:10, pbinom(1:10, 10, .3), type="h")

# The probability of five heads or more is the complement
1-r
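
The complement is not the only route to P(X >= 5). As a sketch, the same quantity can be obtained with the lower.tail argument of pbinom() or by summing the PMF; the variable names here are illustrative.

```r
p_less_than_5 <- pbinom(4, 10, .3)

# Three routes to P(X >= 5) for X ~ Binomial(10, .3)
via_complement <- 1 - p_less_than_5
via_upper_tail <- pbinom(4, 10, .3, lower.tail = FALSE)  # P(X > 4)
via_pmf_sum    <- sum(dbinom(5:10, 10, .3))

c(via_complement, via_upper_tail, via_pmf_sum)
```

Note that the upper-tail call still uses 4, not 5, because lower.tail = FALSE returns P(X > q).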

# Confirm your answer with a simulation using rbinom, with 10000 observations
mean( rbinom(10000, 10, .3) >= 5 )

Try to simulate 100, 1000, 10000, 100000 observations.

Which is closest to the exact answer?
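
A possible way to run all four sample sizes in one go is sketched below. The seed and the names sizes, estimates and exact are assumptions made for the example; a different seed gives different estimates.

```r
set.seed(583)  # arbitrary seed for repeatability

exact <- 1 - pbinom(4, 10, .3)
sizes <- c(100, 1000, 10000, 100000)

# Simulated estimate of P(X >= 5) for each number of observations
estimates <- sapply(sizes, function(n) mean(rbinom(n, 10, .3) >= 5))

data.frame(observations = sizes,
           estimate     = estimates,
           abs_error    = abs(estimates - exact))
```

Larger samples tend to land closer to the exact answer, although in any single run a smaller sample can happen to be closer by chance.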


1.4 Expected values and variance for binomial distribution


Expected values
- e.g., over a very large number of flips of a fair coin, we expect the proportion of heads to approach 0.5
Calculate the expected value using the exact formula
Expected value = n * p
- n = number of trials
- p = probability of success
# What is the expected value of a binomial distribution where 1 coin is flipped, with a 50%
# chance of heads?
1 * 0.5
# What is the expected value of a binomial distribution where 25 coins are flipped, each with a
# 30% chance of heads?
25 * .3  # = 7.5
# Confirm with a simulation using rbinom, assuming 10000 observations
mean(rbinom(10000, 1, 0.5))
mean(rbinom(10000, 25, 0.3))

Variance
What is the variance of a binomial distribution where n coins are flipped, each having a probability p of heads?
Var = n * p * (1-p)
When n = 1, p = .5, var = 0.25, SD = sqrt(.25) = 0.5
When n = 25, p = .3, var = 5.25, SD = sqrt(5.25) = 2.291288
# Confirm with a simulation using rbinom
r = rbinom(10000, 25, 0.3)
var(r)
sd(r)
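
Besides simulation, the formulas n * p and n * p * (1-p) can be checked against the definitions of expected value and variance as weighted sums over the PMF. The sketch below does this for the 25-coin case; the variable names are illustrative.

```r
n <- 25; p <- 0.3
k   <- 0:n
pmf <- dbinom(k, n, p)

ev_from_pmf  <- sum(k * pmf)                    # E[X]   = sum of k * P(X = k)
var_from_pmf <- sum((k - ev_from_pmf)^2 * pmf)  # Var[X] = sum of (k - E[X])^2 * P(X = k)

c(ev_from_pmf, n * p)
c(var_from_pmf, n * p * (1 - p))
```

Unlike the simulated values, these sums agree with the formulas up to floating-point rounding.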


Part Two Probability of compound events


If events A and B are independent, and
- A has a 40% chance of happening, and
- event B has a 20% chance of happening,

By hand
What is the probability they will both happen?
Joint probability: P(A ⋂ B) = P(A) * P(B) # intersection, valid because A and B are independent
= .4 * .2 = .08
What is the probability that either A or B (or both) will happen?
Union probability: P(A ⋃ B) # union
= P(A) + P(B) - P(A ⋂ B)
= .4 + .2 - .4 * .2 = .52

Coding
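Assuming 100000 observations, one trial each,
A <- rbinom(100000, 1, .4)
B <- rbinom(100000, 1, .2)
a = mean(A)  # estimate of P(A)
b = mean(B)  # estimate of P(B)

j = a * b      # joint probability via the product rule
u = a + b - j  # union probability

# or estimate both directly from the simulated events
mean(A & B)
mean(A | B)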
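
The two coding approaches are related but not identical. As a sketch (with an arbitrary seed and illustrative names both and either), the union rule holds exactly within a simulated sample, while the product rule for independent events is only matched approximately.

```r
set.seed(583)  # arbitrary seed for repeatability

A <- rbinom(100000, 1, .4)
B <- rbinom(100000, 1, .2)

both   <- mean(A & B)
either <- mean(A | B)

# The union rule P(A or B) = P(A) + P(B) - P(A and B) holds exactly in the sample
all.equal(either, mean(A) + mean(B) - both)

# The product rule is only approximate in a finite sample
c(simulated = both, product = mean(A) * mean(B), exact = .4 * .2)
c(simulated = either, exact = .4 + .2 - .4 * .2)
```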


Part Three: Normal distribution


- The normal distribution is for continuous variables
Suppose you flipped 1000 coins, each with a 20% chance of being heads.
What would be the mean and variance of the binomial distribution?
# Mean
= n * p
= 1000 * 0.2 = 200
# Variance
= n * p * (1-p)
= 1000 * 0.2 * 0.8 = 160

3.1 Simulating from binomial and normal


rnorm(n, mean, sd)
n - number of observations.
mean - mean value of the distribution. Its default value is 0.
sd - standard deviation. Its default value is 1.

rnorm draws n values of a random variable X that is normally distributed with mean mu and standard deviation sigma.
# Draw a random sample of 100,000 from the Binomial(1000, .2) distribution
b <- rbinom(100000, 1000, .2)
plot(hist(b, breaks=30))
hist(b, breaks=50, main = "my binomial dist")

# Draw a random sample of 100,000 from the normal approximation


g <- rnorm(100000, 200, sqrt(160))
plot(hist(g, breaks=30))
hist(g, breaks=50, main = "Gaussian dist")
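
The two histograms can also be compared without sampling noise by putting the exact binomial PMF next to the normal density with the same mean and standard deviation. This is a sketch using dbinom() and dnorm(); the plotting range 150 to 250 and the variable names are arbitrary choices.

```r
k <- 0:1000
binom_pmf   <- dbinom(k, 1000, .2)
normal_dens <- dnorm(k, mean = 200, sd = sqrt(160))

# Largest gap between the two curves over all possible head counts
max(abs(binom_pmf - normal_dens))

# Draw the PMF as spikes near the mean and overlay the normal density
near <- k >= 150 & k <= 250
plot(k[near], binom_pmf[near], type = "h",
     xlab = "number of heads", ylab = "probability / density")
lines(k[near], normal_dens[near], col = "red")
```

A small maximum gap suggests that Normal(200, sqrt(160)) is a reasonable stand-in for Binomial(1000, .2) in this setting.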


# probability density of the standard normal (mean 0, sd 1)
g <- rnorm(100000, 0, sqrt(1))
plot(hist(g, breaks=30))
hist(g, breaks=50, main = "Gaussian dist", freq = F)


3.2 Comparing cumulative density of the binomial


pnorm(q, mean, sd)
gives the probability that a normally distributed random number is less than or equal to q.
This function is also called the "Cumulative Distribution Function".

# Simulations from the normal and binomial distributions


b <- rbinom(100000, 1000, .2)
g <- rnorm(100000, 200, sqrt(160))

# Use the binomial sample b to estimate the probability of <= 190 heads


mean(b <= 190)

# Use the normal sample g to estimate the probability of <= 190 heads


mean(g <= 190)

# Calculate the probability of <= 190 heads with pbinom


pbinom(190, 1000, .2)

# Calculate the probability of <= 190 heads with pnorm


pnorm(190, 200, sqrt(160))
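
To read the four answers side by side, they can be collected into one named vector. This is only a sketch: the seed and the name results are assumptions, and the simulated entries will change with the seed.

```r
set.seed(583)  # arbitrary seed for repeatability

b <- rbinom(100000, 1000, .2)
g <- rnorm(100000, 200, sqrt(160))

results <- c(
  binomial_simulated = mean(b <= 190),
  normal_simulated   = mean(g <= 190),
  binomial_exact     = pbinom(190, 1000, .2),
  normal_exact       = pnorm(190, 200, sqrt(160))
)
round(results, 4)
```

Each simulated value should sit near its exact counterpart. The binomial and normal answers need not coincide, since one distribution is discrete and the other is continuous.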


Expected value and variance for random variables

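# dice
library(purrr)     # map, map2, pmap
library(magrittr)  # %>%
dice = 1:6
p = 1/6            # each face is equally likely
EV = sum(dice)*p   # expected value = sum of x * P(x)
var = map(dice, function(x) p * ( x - EV)^2 ) %>% unlist %>% sum  # sum of P(x) * (x - EV)^2
sd = sqrt(var)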
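
The same dice calculation can be written with base R vector arithmetic, which avoids the extra packages. This is an alternative sketch, not the tutorial's method; the names ev_dice, var_dice and sd_dice are chosen so they do not mask the built-in var() and sd() functions.

```r
dice <- 1:6
p    <- rep(1/6, 6)  # probability of each face

ev_dice  <- sum(dice * p)                # expected value
var_dice <- sum(p * (dice - ev_dice)^2)  # variance
sd_dice  <- sqrt(var_dice)               # standard deviation

c(ev_dice, var_dice, sd_dice)
```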

# blood type
# A couple has a 25% (p) chance of having a child with type O
# blood. What is the chance that three (X) of their five (n) kids
# have type O blood?

dbinom(3, 5, .25)  # P(X = 3)
p = map(0:5, function(x) dbinom(x, 5, .25))  # PMF for 0 to 5 children, as a list
EV = map2(0:5, p, function(x, y) x*y ) %>% unlist %>% sum
var = pmap(
  list(
    as.list(0:5),
    EV,
    p
  ),
  \(x, y, z) z * (x - y)^2
) %>% unlist %>% sum

# Check against the closed-form formulas
n = 5
p = .25
# 1 - p = .75

5*.25      # expected value = n * p
5*.25*.75  # variance = n * p * (1 - p)
