
STAT 311: LECTURE 13

Heavily based on lecture notes from Martina Morris

Random Variables
Logistics

• Homework due today
• Midterms to be passed back Friday
• Shiqing will be giving lectures on Monday and Wednesday
• Sam's office hours will be on Friday from 1-3


Random Variables
From probabilities of specific events
To describing the full probability distribution



What is a random variable?

Formally:
• An event in a sample space
• That takes a value
  • Discrete or
  • Continuous
• With a certain probability

• Can be described by the PDF: P(X = k)
• Or by the CDF: P(X ≤ k), P(X > k), P(j < X < k)
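As an aside (not from the original slides), here is a minimal Python sketch of this idea: a discrete RV described by a made-up PMF, with CDF-style probabilities derived from it.

```python
# Minimal sketch: describe a discrete RV by its PMF, then derive
# CDF-style probabilities P(X <= k), P(X > k), P(j < X < k) from it.
# The PMF below is an arbitrary example, not from the lecture.
pmf = {0: 0.1, 1: 0.2, 2: 0.4, 3: 0.2, 4: 0.1}

def p_le(k):
    """P(X <= k): sum the PMF over all values up to k."""
    return sum(p for x, p in pmf.items() if x <= k)

def p_between(j, k):
    """P(j < X < k): sum the PMF over values strictly between j and k."""
    return sum(p for x, p in pmf.items() if j < x < k)

print(p_le(2))           # P(X <= 2) = 0.7
print(1 - p_le(2))       # P(X > 2)  = 0.3
print(p_between(0, 3))   # P(0 < X < 3) = 0.6
```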
Examples

• Flip a coin: heads or tails?
• How many cars will drive through an intersection in an hour?
• What is the height of a random individual?
• How much time until I get my next text message?


Random variable notation
• Random variables are denoted by capital letters
  • For example, X or Y
• The value the RV takes in a specific case is called a "realization" and is denoted by a lowercase letter
  • For example: x, y or k
• P(X = k)
  • "The probability that the random variable X takes the value k"
• The sample space (set of all possible outcomes) is denoted by Ω


Empirical vs Theoretical Distributions

• We have seen distributions of observed data. These are often called empirical distributions, because they are what we empirically observe
• Today we will begin to formally discuss theoretical distributions, which are mathematical constructs used to model real-world situations
• These distributions typically come in families, specified by a mathematical equation and governed by a set of parameters
Notation: statistics vs. parameters

Sample statistics:
• Mean: x̄ = (1/n) Σ xi, summing over i = 1, …, n
• Variance: s_X² = (1/(n − 1)) Σ (xi − x̄)²
• Each observation has equal weight (1/n)

Theoretical parameters:
• Expected value: μ_X = E(X) = Σ k·P(X = k), summing over all outcomes k
• Variance: σ_X² = Var(X) = Σ (k − μ_X)²·P(X = k)
• Each possible outcome in the sample space receives its own weight (pi)
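A minimal Python sketch of the contrast above, using made-up data and a fair-coin PMF (both illustrative assumptions, not from the lecture):

```python
# Sample statistics weight each observation by 1/n, while theoretical
# parameters weight each outcome k by its probability P(X = k).
data = [2, 4, 4, 4, 5, 5, 7, 9]
n = len(data)

xbar = sum(data) / n                                  # sample mean
s2 = sum((x - xbar) ** 2 for x in data) / (n - 1)     # sample variance

pmf = {0: 0.5, 1: 0.5}                                # a fair coin (0/1)
mu = sum(k * p for k, p in pmf.items())               # E(X) = sum k*P(X=k)
sigma2 = sum((k - mu) ** 2 * p for k, p in pmf.items())  # Var(X)

print(xbar, s2)       # 5.0 4.571...
print(mu, sigma2)     # 0.5 0.25
```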


Example

• If I roll a single fair die, X takes the values {1, 2, 3, 4, 5, 6}, each with probability 1/6:
  • E(X) = (1 + 2 + 3 + 4 + 5 + 6)/6 = 3.5
  • Var(X) = Σ (k − 3.5)²·(1/6) = 35/12 ≈ 2.92
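A hedged Python sketch of this example, checking the theoretical values by simulation:

```python
import random

# Theoretical E(X) and Var(X) for one fair die, checked by simulation
# (the simulated mean should land near 3.5).
faces = [1, 2, 3, 4, 5, 6]
mu = sum(k / 6 for k in faces)                  # E(X) = 3.5
var = sum((k - mu) ** 2 / 6 for k in faces)     # Var(X) = 35/12 ≈ 2.92

rolls = [random.choice(faces) for _ in range(100_000)]
print(mu, var, sum(rolls) / len(rolls))
```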


What comes next
• Deriving expectations and variances for different distributions
  • Discrete (Bernoulli, Binomial, Poisson)
  • Continuous (Normal, Uniform, Exponential)
• For example: with coin tosses
  • Each toss is a random variable with outcomes {0, 1}
  • The sum of these outcomes over n tosses is a linear combination of the random variables for each toss, with outcome space {0, 1, 2, …, n}
• We start with the rules of expectations and variances for linear transformations and combinations of RVs
Rules of expectations and variances
For linear transformations and combinations of random variables


Rules of expectations and variances

1. Variance-Mean relationship

Var(X) = E[(X − μ_X)²] = E(X²) − [E(X)]²

Note the theoretical or population variance formula here, not the sample variance.


Rules of expectations and variances
Transformations and combinations of RVs

• Examples:
  • Transformation: converting degrees F to degrees C
  • Combination: adding your midterm and final exam scores
• Linear transformations and combinations have simple expressions for their expected values and variances
  • Linear transformations: Y = a + X, Y = bX, Y = a + bX
  • Linear combinations: Z = X + Y, Z = aX + bY


Rules of expectations and variances

2. Linear Transformations of RVs

• If a and b are constants, and X is a random variable, then:

E(a) = a                     Var(a) = 0
E(bX) = b·E(X)               Var(bX) = b²·Var(X)
E(a + bX) = a + b·E(X)       Var(a + bX) = b²·Var(X)

• These are easily proven if you start with the definitions on slide 8 and work through the algebra (try it…).
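A small Python sketch of rule 2, using the F-to-C conversion from the earlier slide as the linear transformation (the simulated Fahrenheit temperatures are an illustrative assumption):

```python
import random

# Checking E(a + bX) = a + b*E(X) and Var(a + bX) = b^2 * Var(X),
# with C = (5/9)*(F - 32), i.e. a = -160/9 and b = 5/9.
a, b = -160 / 9, 5 / 9

temps_f = [random.gauss(68, 10) for _ in range(100_000)]
temps_c = [a + b * f for f in temps_f]

def mean(xs):
    return sum(xs) / len(xs)

def var(xs):
    m = mean(xs)
    return sum((x - m) ** 2 for x in xs) / len(xs)

print(mean(temps_c), a + b * mean(temps_f))   # equal up to floating error
print(var(temps_c), b ** 2 * var(temps_f))    # equal up to floating error
```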


Rules of expectations and variances
3. Linear combinations of independent RVs

UH covers this:
Let a, b and c be constants, and X and Y be independent random variables. Then:

E(X + Y) = E(X) + E(Y)
E(X − Y) = E(X) − E(Y)
E(a + bX + cY) = a + b·E(X) + c·E(Y)

Var(X + Y) = Var(X) + Var(Y)
Var(X − Y) = Var(X) + Var(Y)
Var(a + bX + cY) = b²·Var(X) + c²·Var(Y)

Again, these are easily proven.
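A quick Python check of rule 3 by simulation, with arbitrarily chosen independent normal RVs:

```python
import random

# For independent X and Y, both Var(X + Y) and Var(X - Y) should come
# out close to Var(X) + Var(Y).
X = [random.gauss(0, 2) for _ in range(100_000)]   # Var(X) ≈ 4
Y = [random.gauss(0, 3) for _ in range(100_000)]   # Var(Y) ≈ 9

def var(xs):
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / len(xs)

print(var([x + y for x, y in zip(X, Y)]))   # ≈ 13
print(var([x - y for x, y in zip(X, Y)]))   # ≈ 13
```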
Rules of expectations and variances
4. Linear combinations of dependent RVs

• Not covered in UH, but straightforward

Let a and b be constants, and X and Y be (possibly dependent) random variables. Then:

E(X + Y) = E(X) + E(Y)                                 No difference in the mean
Var(X + Y) = Var(X) + Var(Y) + 2·Cov(X, Y)             But the variance changes
Var(X − Y) = Var(X) + Var(Y) − 2·Cov(X, Y)
Var(aX + bY) = a²·Var(X) + b²·Var(Y) + 2ab·Cov(X, Y)
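A quick Python check of rule 4, where Y is constructed to depend on X (an illustrative construction, not from the slides):

```python
import random

# When X and Y are dependent, the covariance term matters. Here
# Y = X + noise, so Cov(X, Y) > 0 and Var(X + Y) exceeds Var(X) + Var(Y).
X = [random.gauss(0, 1) for _ in range(100_000)]
Y = [x + random.gauss(0, 1) for x in X]          # Y depends on X

def mean(xs):
    return sum(xs) / len(xs)

def var(xs):
    m = mean(xs)
    return sum((v - m) ** 2 for v in xs) / len(xs)

def cov(xs, ys):
    mx, my = mean(xs), mean(ys)
    return sum((a - mx) * (b - my) for a, b in zip(xs, ys)) / len(xs)

lhs = var([a + b for a, b in zip(X, Y)])
rhs = var(X) + var(Y) + 2 * cov(X, Y)
print(lhs, rhs)   # both ≈ 5: Var(X)=1, Var(Y)=2, Cov(X,Y)=1
```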


Rules of expectations and variances

Why the difference for correlated RVs?


• Suppose 10 individuals are deciding whether or not to show up to a party. Each has a 50/50 chance of going
• If the individuals all decide independently, we tend to end up with very few extreme events (i.e., 0 show up or all 10 show up) and usually will have around 5 individuals
• If the individuals all text each other and decide to either all show up or all not show up (attendance for each individual is now dependent), we will always have either 0 or 10 individuals. The average is still 5, but the outcomes are more extreme
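A Python sketch of this party example, comparing the two scenarios by simulation:

```python
import random

# 10 people, each with a 50/50 chance. Independent decisions vs.
# perfectly coordinated decisions have the same mean but very
# different spreads.
def independent_party():
    return sum(random.random() < 0.5 for _ in range(10))

def coordinated_party():
    return 10 if random.random() < 0.5 else 0    # all or nothing

def mean_var(draws):
    m = sum(draws) / len(draws)
    v = sum((d - m) ** 2 for d in draws) / len(draws)
    return m, v

ind = [independent_party() for _ in range(100_000)]
cor = [coordinated_party() for _ in range(100_000)]
print(mean_var(ind))   # mean ≈ 5, variance ≈ 2.5  (= 10 * 0.5 * 0.5)
print(mean_var(cor))   # mean ≈ 5, variance ≈ 25   (outcomes only 0 or 10)
```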
Summary
• There are simple rules for expectations and variances
  • When we transform or combine random variables
  • As long as the transformation/combination is linear
• And these are the foundation for what comes next
  • Deriving expected values and variances for some common distributions


Discrete random variables
Exploring the derivation of discrete probability distributions and their properties: Bernoulli, Binomial and Poisson


The goal
• To define a theoretical Probability Density Function for a random variable:

f(k; p) = P(X = k | p)    "Probability that the RV X = k, given p"

X ~ f(p)                  "The RV X is distributed as f with parameter p"

where p is one or more parameters that determine the distribution of the random variable.

• And to use the PDF to derive expected values, variances, and probabilities

With discrete distributions, f(x) is technically called a "probability mass function" or PMF, but PDF is also used, and we will use it here.
Example

• Coin tosses:
  • Each individual toss is an RV with 2 outcomes
  • Let X be the random variable for each toss: Ω = {H, T}
• Pass Stat 311:
  • Each individual student is an RV with 2 outcomes
  • Let X be the random variable for each student: Ω = {Pass, No Pass}


The Bernoulli distribution
• Let X be a random variable with two outcomes: Ω = {0, 1} (we have to decide what is 1 and what is 0)

X ~ Bernoulli(p)

P(X = 1) = p,  P(X = 0) = 1 − p


Derivation of E(X) and Var(X)

X ~ Bernoulli(p)

E(X) = Σ k·P(X = k) = 0·(1 − p) + 1·p = p
E(X²) = 0²·(1 − p) + 1²·p = p
Var(X) = E(X²) − [E(X)]² = p − p² = p(1 − p)
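A quick Python check of these results by simulation (p = 0.3 is an arbitrary choice):

```python
import random

# A Bernoulli(p) variable takes value 1 with probability p. Simulated
# mean and variance should be close to p and p(1 - p).
p = 0.3
draws = [1 if random.random() < p else 0 for _ in range(100_000)]

m = sum(draws) / len(draws)
v = sum((d - m) ** 2 for d in draws) / len(draws)
print(m, p)              # ≈ 0.3
print(v, p * (1 - p))    # ≈ 0.21
```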


Getting more complicated
• If there are 50 students in a class, how many total students will show up to class?
  • Assume each student shows up with probability 0.8
  • Assume the attendance of each student is independent of other students
• Each student's attendance is a Bernoulli trial
• We want to know the sum of the Bernoulli trials
• How can we do this? (A quick simulation sketch follows.)
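As referenced above, a quick simulation sketch in Python:

```python
import random

# Each of 50 students shows up independently with probability 0.8, so
# attendance is a sum of 50 Bernoulli(0.8) trials (a Binomial, as the
# next slides show).
def attendance(n=50, p=0.8):
    return sum(random.random() < p for _ in range(n))

draws = [attendance() for _ in range(100_000)]
print(sum(draws) / len(draws))   # ≈ 40 (= n * p)
```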


Repeated Bernoulli trials: the Binomial

• Define a general probability distribution for the sum of n independent Bernoulli trials
• Let X be the count of the number of 1's, with outcome space {0, 1, 2, …, n}

X ~ Binomial(n; p)

Example: 3 coin tosses with H = 1

Value of X:    0      1                2                3
Outcomes:      TTT    HTT, THT, TTH    HHT, HTH, THH    HHH
Probability:   1/8    3/8              3/8              1/8
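A Python sketch that reproduces this table by enumerating all 2³ equally likely sequences:

```python
from itertools import product

# Enumerate every toss sequence and tally how many heads each contains.
counts = {}
for seq in product("HT", repeat=3):
    k = seq.count("H")
    counts[k] = counts.get(k, 0) + 1

for k in sorted(counts):
    print(k, counts[k], counts[k] / 8)   # 0:1/8, 1:3/8, 2:3/8, 3:1/8
```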


The Binomial distribution


• The Binomial describes the probability distribution of counts of successful trials
• It is a linear combination (sum) of Bernoulli RVs
  • So both the number of trials, n, and the probability of success on each trial, p, influence the result
• And the outcome space is now {0, 1, 2, …, n}


Binomial probabilities

There are 3 elements to the calculation:

1. Define the probability of each individual outcome in the sample space.
2. Identify the number of outcomes in the set of interest (i.e., that satisfy the condition X = k).
3. Multiply the probability of the outcome by the number in the set.


What is the probability of each outcome?

• Start with a single trial (n = 1):
  • What is the probability of each outcome y?

    p^y (1 − p)^(1−y)        Our friend the Bernoulli

• What about n = 2?
  • What is the probability of each outcome with k successes?

    p^k (1 − p)^(2−k)

• And in general, for n trials:

    p^k (1 − p)^(n−k)
How many outcomes satisfy X = k?

• This is a counting problem
• How many ways are there to get k successes in n trials when order does not matter?
• Out of our n trials, we need to choose k trials to be the successes
• Use the combination rule:

    (n choose k) = n! / (k!(n − k)!)
The “binomial coefficient”

The number of outcomes that satisfy the condition, for the 3-toss example:

Value of X:    0      1                2                3
Outcomes:      TTT    HTT, THT, TTH    HHT, HTH, THH    HHH

(n choose k):  (3 choose 0) = 1,  (3 choose 1) = 3,  (3 choose 2) = 3,  (3 choose 3) = 1

• (n choose k) is referred to as the binomial coefficient in this context
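For reference, Python's math.comb computes the same coefficients:

```python
from math import comb

# math.comb(n, k) gives the binomial coefficient "n choose k".
print([comb(3, k) for k in range(4)])   # [1, 3, 3, 1]
```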


The Binomial distribution

Putting these all together:

Let Y be a random variable with two outcomes, {0, 1}.
Let X be the number of successes in n trials, and p be the probability of success on each trial. Then:

X ~ Bin(n; p)        P(X = k) = (n choose k) p^k (1 − p)^(n−k)

where (n choose k) is the number of outcome combinations that have k successes, and p^k (1 − p)^(n−k) is the probability of each n-trial outcome with k successes.
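A minimal Python sketch assembling the PMF from these two pieces and reproducing the 3-toss table:

```python
from math import comb

# The Binomial PMF: combinations count times per-outcome probability.
def binom_pmf(k, n, p):
    return comb(n, k) * p ** k * (1 - p) ** (n - k)

# 3 fair coin tosses reproduce the earlier table:
print([binom_pmf(k, 3, 0.5) for k in range(4)])   # [0.125, 0.375, 0.375, 0.125]
```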


Derivation of E(X) and Var(X)

X ~ Bin(n; p) is the sum of n independent Bernoulli trials: X = Σ Yi, summing over i = 1, …, n

E(X) = μ_X = E(Σ Yi) = Σ E(Yi)*  = n·μ_Y = np

Var(X) = σ_X² = Var(Σ Yi) = n·σ_Y²**  = np(1 − p)

* Using E(Z + W) = E(Z) + E(W)
** Using Var(Z + W) = Var(Z) + Var(W) for independent RVs Z and W
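A numeric check of this derivation in Python, computing E(X) and Var(X) directly from the PMF and comparing against np and np(1 − p), using the earlier 50-student example:

```python
from math import comb

# Mean and variance computed from the full Binomial PMF should match
# the closed forms np and np(1 - p).
n, p = 50, 0.8
pmf = [comb(n, k) * p ** k * (1 - p) ** (n - k) for k in range(n + 1)]

mu = sum(k * pk for k, pk in enumerate(pmf))
var = sum((k - mu) ** 2 * pk for k, pk in enumerate(pmf))
print(mu, n * p)              # 40.0
print(var, n * p * (1 - p))   # 8.0
```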
Binomial Summary
• For any repeated Bernoulli trial, the count of successes, X, has a Binomial distribution:

X ~ Bin(n; p)

f(x; n, p) = (n choose x) p^x (1 − p)^(n−x)
μ_X = np
σ_X² = np(1 − p)

We can calculate the mean, variance, and the probability of any value of X from just two values: n and p


Other models
• Suppose I know that on average 10 cars pass through a specific intersection near my house each hour
• The number of cars that pass through in a given hour is random
• It is discrete (we're only considering whole cars)
• There is no set number of trials, and no maximum value that this RV can take
• How might we describe this process?


Poisson Distribution
• Poisson: used to model counts of events that occur at some rate
  • n (the number of trials) is large and not fixed
  • λ approximates a rate of events (e.g., per time unit, or per capita)

f(x; λ) = P(X = x) = e^(−λ) λ^x / x!,   for x ≥ 0

μ_X = σ_X² = λ
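A minimal Python sketch of the Poisson PMF (λ = 10 is an illustrative choice):

```python
from math import exp, factorial

# The Poisson PMF: f(x; lam) = e^(-lam) * lam^x / x!.
def poisson_pmf(x, lam):
    return exp(-lam) * lam ** x / factorial(x)

# With lam = 10, the PMF peaks near 10 and sums to 1 over all x:
print(poisson_pmf(10, 10))                          # ≈ 0.125
print(sum(poisson_pmf(x, 10) for x in range(100)))  # ≈ 1.0
```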


Poisson Distribution
• The numerator is always positive, so there is a positive probability for all x ≥ 0, although it gets very small as x becomes large
• We assume that there is a constant rate
• We assume that all "arrivals" are independent of each other


Example
• Suppose I know that on average 10 cars pass through a specific intersection near my house each day
  • What is λ?
  • What is the probability that 8 cars pass through the intersection in an hour?
  • What is the probability that 15 cars pass through the intersection in an hour?
• Why might a Poisson assumption be wrong for this model?
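One hedged reading of this example in Python: if the 10 cars are per day, the hourly rate would be λ = 10/24 (this unit conversion is our assumption, not stated on the slide):

```python
from math import exp, factorial

# Assumption: lam = 10 cars/day divided by 24 hours ≈ 0.417 per hour.
lam = 10 / 24

def poisson_pmf(x, lam):
    return exp(-lam) * lam ** x / factorial(x)

print(poisson_pmf(8, lam))    # P(8 cars in an hour): tiny, ≈ 1.5e-8
print(poisson_pmf(15, lam))   # P(15 cars in an hour): smaller still
```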
Summary:
• A discrete random variable takes specific values
  • Typically integer counts
  • Each with a certain probability
• If you can represent the underlying stochastic process as a mathematical function
  • You can calculate almost anything you want for the RV
  • E(X), Var(X), P(X = k), P(X ≤ k) or P(i < X < j)
• The distribution is defined by the stochastic process
  • The details of the process matter, and are reflected in the formal definition of the distribution
