
B123 – Statistics

Lecture 4

Dr. Christian Dombrowski


Important Definitions
• Random variable (RV)
• Mapping function 𝑋: Ω → ℝ
(from the sample space to the real numbers; probabilities are then assigned to the values of 𝑋)
• Example (figure): coin throw with outcomes Head and Tail, each with probability ½
• Discrete RV
• Number of possible values:
• Finitely many
• Countably (infinitely) many
• Most definitions can be expressed with sums
• Continuous RV
• Number of possible values:
• Uncountably many
• Most definitions can be expressed with integrals

• Extra: Mixed RVs

GISMA Business School – Potsdam – B123 2


Important Definitions: Why RV?
1. No “strange” Ω, but ℝ-valued numbers and functions
2. Complete description via $f_X$ or $F_X$, instead of potentially infinitely many simple events and their probabilities


Exercise
• Which of the following experiments can be described by a discrete RV, and which by a continuous RV?
• Jeans sold per day
• Jeans sold per year
• Marathon time
• Interest rate
• Neutrinos passing through Earth
• Traffic accidents
• Football goals
• Student height
• Voltage at electrical outlet
• Stars per galaxy



Probability Density Function
• Probability Density Function (PDF) $f_X(x)$; for discrete RVs also written $\mathrm{P}(X = x)$
• “Likelihood” for any realization $x$ of RV $X$
• Function with $f_X(x) \ge 0$ and $\int_{-\infty}^{\infty} f_X(x)\,\mathrm{d}x = 1$
• Probability: $\mathrm{P}(a < X \le b) = \int_{a}^{b} f_X(x)\,\mathrm{d}x$
• Examples (figures): 6-sided die throw; waiting time in grocery store
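
The two PDF conditions above can be checked mechanically. The following Python sketch is only an illustration: the helper names `is_valid_pmf` and `integrate_midpoint`, the exponential shape and rate 0.5 of the waiting-time density, and the cut-off at 60 are assumptions, not part of the lecture.

```python
from fractions import Fraction
import math

def is_valid_pmf(pmf):
    """Check the discrete PDF conditions: all values >= 0 and total sum 1."""
    return all(p >= 0 for p in pmf.values()) and sum(pmf.values()) == 1

def integrate_midpoint(f, a, b, n=100_000):
    """Approximate the integral of f over [a, b] with the midpoint rule."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

# Fair 6-sided die: every face has probability 1/6.
die = {face: Fraction(1, 6) for face in range(1, 7)}

# Hypothetical waiting-time density (assumed: exponential with rate 0.5).
def waiting_time_pdf(x):
    return 0.5 * math.exp(-0.5 * x) if x >= 0 else 0.0

print(is_valid_pmf(die))
# The upper limit 60 stands in for infinity; the neglected tail is about e^-30.
print(integrate_midpoint(waiting_time_pdf, 0.0, 60.0))
# P(1 < X <= 3) as the integral of the PDF from 1 to 3.
print(integrate_midpoint(waiting_time_pdf, 1.0, 3.0))
```

For the discrete case exact fractions are used so that the sum can be compared to 1 without rounding issues; for the continuous case the integral is only approximated numerically.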



Cumulative Distribution Function
• Cumulative Distribution Function (CDF) $F_X$
• Cumulative “likelihood”, i.e. $F_X(x) = \mathrm{P}(X \le x)$
• Monotonically non-decreasing function, with $0 \le F_X(x) \le 1$
• Integral of PDF up to point $x$: $F_X(x) = \int_{-\infty}^{x} f_X(\phi)\,\mathrm{d}\phi$
• Probability: $\mathrm{P}(a < X \le b) = F_X(b) - F_X(a)$
• Examples (figures): 6-sided die throw; waiting time in grocery store
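
As a small illustration of the CDF properties, the sketch below builds the CDF of the fair 6-sided die from its PDF and evaluates an interval probability both ways. The function name `cdf` and the chosen interval 2 < X ≤ 5 are assumptions made for this example.

```python
from fractions import Fraction
from itertools import accumulate

faces = list(range(1, 7))
pmf = [Fraction(1, 6)] * 6

# Running sum of the PDF gives the CDF values at the faces 1..6.
cdf_values = dict(zip(faces, accumulate(pmf)))

def cdf(x):
    """F_X(x) = P(X <= x) for the fair die, defined for any real x."""
    return sum((p for face, p in zip(faces, pmf) if face <= x), Fraction(0))

# P(2 < X <= 5) via the CDF difference and via direct summation of the PDF.
via_cdf = cdf(5) - cdf(2)
direct = sum(p for face, p in zip(faces, pmf) if 2 < face <= 5)
print(cdf_values[6], via_cdf, direct)
```

Because the die is discrete, the CDF is a step function: `cdf(3.5)` returns the same value as `cdf(3)`.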



Exercise
• Are the following PDFs?
• Baby wakes per night:

Wakes per night | 0 | 1 | 2 | 3 | 4 | 5
Probability | 2/50 | 11/50 | 23/50 | -9/50 | 0 | 5/50

• Lucky Wheel (figure not reproduced)

• Car engine delay after red light:
$f(x) = \begin{cases} 0.15 \cdot e^{-0.15\,(x - 0.5)} & x \ge 0.5 \\ 0 & \text{otherwise} \end{cases}$


Exercise
• Calculate the CDF for the following PDF!



Exercise
• Calculate the probability that the sum of the top faces satisfies 7 ≤ sum ≤ 10 for a fair throw of two dice.


Transformation of RVs
• Transformation of one RV into another RV (see Nugget 1)
• Consists of piecewise continuous transformations
• Consists of arithmetic operations, e.g. $4x + 2$
• Uses function $g: \mathbb{R} \to \mathbb{R}$, which must be a 1:1 transformation (inverse function exists)
• Extra: sometimes also possible if not a 1:1 transformation

• New realization $y$ of RV $Y$ given by expression $Y = g(X)$

• Transformation may have implications on probabilities!
• Changes to $f_X$ and $F_X$
• May involve inverse functions, and for $f_X$ also derivatives
• Discrete RV: $f_Y(y) = f_X\left(g^{-1}(y)\right)$
• Continuous RV: $f_Y(y) = f_X\left(g^{-1}(y)\right) \cdot \left|\left(g^{-1}\right)'(y)\right|$
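
The two transformation formulas can be tried out on the slide's example operation $4x + 2$. The sketch below is illustrative only: the fair die, the uniform density on (0, 1), and the names `transform_pmf`, `g_inv`, and `G_INV_PRIME` are assumptions chosen for this example.

```python
from fractions import Fraction

def transform_pmf(pmf_x, g):
    """PDF of Y = g(X) for a discrete X; g must be 1:1 on the support of X."""
    pmf_y = {}
    for x, p in pmf_x.items():
        y = g(x)
        if y in pmf_y:
            raise ValueError("g is not 1:1 on the support of X")
        pmf_y[y] = p  # f_Y(y) = f_X(g^{-1}(y))
    return pmf_y

# Discrete case: fair die X, Y = 4X + 2.
die = {face: Fraction(1, 6) for face in range(1, 7)}
pmf_y = transform_pmf(die, lambda x: 4 * x + 2)

# Continuous case (assumed example): X uniform on (0, 1), Y = 4X + 2.
def f_x(x):
    return 1.0 if 0.0 < x < 1.0 else 0.0

def g_inv(y):
    return (y - 2.0) / 4.0

G_INV_PRIME = 0.25  # derivative of g_inv, constant because g is linear

def f_y(y):
    # f_Y(y) = f_X(g^{-1}(y)) * |(g^{-1})'(y)|
    return f_x(g_inv(y)) * abs(G_INV_PRIME)

print(sorted(pmf_y.items()))
print(f_y(4.0), f_y(7.0))
```

In the discrete case only the support moves and the probabilities are carried over unchanged; in the continuous case the density is additionally rescaled by the derivative factor, so the stretched interval (2, 6) gets density 1/4.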



Exercise
• You are given RV $X$ with $\Omega_X = \{1, 2, 3, 4, 5\}$. Transform $X$ into $Y$: $Y = 2X + 1$
• State $\Omega_Y$

• Transform $X$'s PDF: $f_X(x) = \begin{cases} x/15 & \text{for } x = 1, 2, 3, 4, 5 \\ 0 & \text{otherwise} \end{cases}$

• $g(x) =$

• $g^{-1}(y) =$

• $f_Y(y) =$

• Is $f_Y(y)$ a valid PDF?


Exercise
• You are given RV $W$ and its PDF $f_W(w) = \begin{cases} 1 & \text{for } 0 < w < 1 \\ 0 & \text{else} \end{cases}$ and consider the transformation $Z = -2 \cdot \ln W$
• State $\Omega_W$ and $\Omega_Z$

• Calculate $f_Z(z)$
• $u(w) =$

• $u^{-1}(z) =$

• $f_Z(z) =$

• Is $f_Z(z)$ a valid PDF?


Algebra of Random Variables
• Addition can be applied to RVs (see Nugget 1)
• Do not add the PDFs, but properly consider all events and their probabilities
• Properties of the RV may change

• Extra:
• New operation to calculate the PDF of summed-up (independent) RVs
• Convolution: $(f * g)(x) = \int_{-\infty}^{\infty} f(\tau) \cdot g(x - \tau)\,\mathrm{d}\tau$
• Other basic operations also possible
• Subtraction, multiplication, exponentiation

• Addition has connection to
• Normal Distribution
• Central Limit Theorem
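
For discrete RVs the convolution integral becomes a sum over all pairs of values. The following sketch shows this for two independent, hypothetical fair 4-sided dice; the function name `convolve_pmf` and the choice of 4-sided dice are assumptions for illustration.

```python
from fractions import Fraction
from collections import defaultdict

def convolve_pmf(pmf_a, pmf_b):
    """PDF of A + B for independent discrete RVs (discrete convolution)."""
    result = defaultdict(Fraction)
    for a, pa in pmf_a.items():
        for b, pb in pmf_b.items():
            # Every pair (a, b) contributes its probability to the sum a + b.
            result[a + b] += pa * pb
    return dict(result)

# Hypothetical fair 4-sided die with faces 1..4.
d4 = {face: Fraction(1, 4) for face in range(1, 5)}
pmf_sum = convolve_pmf(d4, d4)

for total in sorted(pmf_sum):
    print(total, pmf_sum[total])
```

The result is triangular rather than uniform, which illustrates the remark that the properties of the RV may change under addition.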



Exercise
• Derive the PDF of a fair two-dice throw using the PDF of a single die throw.
• For example, by using a decision tree


Expectation Value
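• Similar approach to arithmetic mean (descriptive statistics)
• Definition: $\mathrm{E}(X) = \mu_X = \int_{-\infty}^{\infty} x \cdot f_X(x)\,\mathrm{d}x$
• Discrete version: $\mathrm{E}(X) = \sum_{i=1}^{\infty} x_i \cdot \mathrm{P}(X = x_i)$
• May result in a value that is not an element of the sample space

• Important rules
• Linear operation: $\mathrm{E}(a + b \cdot X) = a + b \cdot \mathrm{E}(X)$
• Additivity: $\mathrm{E}(X + Y) = \mathrm{E}(X) + \mathrm{E}(Y)$
• Multiplicativity does not hold in general: $\mathrm{E}(X \cdot Y) \ne \mathrm{E}(X) \cdot \mathrm{E}(Y)$
• Exception: uncorrelated RVs

• Expectation of transformed RV: $\mathrm{E}(Y) = \mathrm{E}(g(X)) = \int_{-\infty}^{\infty} g(x) \cdot f_X(x)\,\mathrm{d}x$
• Instead of “use $g(x)$ and $f_X$”, you can also derive $f_Y$ directly
• Careful, in general: $\mathrm{E}(g(X)) \ne g(\mathrm{E}(X))$
• Exception: $g$ is a linear function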
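
The rules for transformed RVs can be checked numerically. The sketch below uses a hypothetical fair 4-sided die; the function name `expectation` and the example functions $1 + 2x$ and $x^2$ are assumptions for illustration.

```python
from fractions import Fraction

def expectation(pmf, g=lambda x: x):
    """E[g(X)] = sum of g(x) * P(X = x) over the support of X."""
    return sum(g(x) * p for x, p in pmf.items())

# Hypothetical fair 4-sided die with faces 1..4.
d4 = {face: Fraction(1, 4) for face in range(1, 5)}

mean = expectation(d4)                         # E[X]
linear = expectation(d4, lambda x: 1 + 2 * x)  # E[1 + 2X]
square = expectation(d4, lambda x: x ** 2)     # E[X^2]

# Linear g: E[g(X)] equals g(E[X]).
print(linear, 1 + 2 * mean)
# Non-linear g: E[g(X)] differs from g(E[X]).
print(square, mean ** 2)
```

Note that the mean 5/2 is not a face of the die, in line with the remark that the expectation value need not be an element of the sample space.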



Exercise
• Show that the Expectation Value is a linear operation.

• Calculate the expectation value of an unfair coin throw, in which $\mathrm{P}(H) = 0.8$

• Calculate the expectation value of a fair 6-sided die.


Exercise
• Given RV $X$ with $\Omega_X = \{1, 2, 3, 4, 5\}$ and PDF $f_X(x) = \begin{cases} x/15 & \text{for } x = 1, 2, 3, 4, 5 \\ 0 & \text{otherwise} \end{cases}$, and the transformation of $X$ into $Y$: $Y = 2X + 1$
• Calculate $\mathrm{E}(g(X))$ using the formula, and $\mathrm{E}(Y)$ using the transformed PDF.


Variance and Standard Deviation

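• Definition: $\mathrm{Var}(X) = \sigma_X^2 = \mathrm{E}\left((X - \mu_X)^2\right) = \int_{-\infty}^{\infty} (x - \mu_X)^2 \cdot f_X(x)\,\mathrm{d}x$
• Discrete version: $\mathrm{Var}(X) = \sum_{i=1}^{\infty} (x_i - \mu_X)^2 \cdot \mathrm{P}(X = x_i)$

• Important rules
• Variance of a constant is zero: $\mathrm{Var}(a) = 0$
• Variance is invariant to location, but scales: $\mathrm{Var}(a + b \cdot X) = b^2 \cdot \mathrm{Var}(X)$ (see Nugget 7)
• Alternative calculation: $\mathrm{Var}(X) = \mathrm{E}(X^2) - \mu_X^2$
• Sum rule (only for uncorrelated RVs): $\mathrm{Var}\left(\sum_{i=1}^{n} X_i\right) = \sum_{i=1}^{n} \mathrm{Var}(X_i)$
• Standard deviation: $\sigma_X = \sqrt{\mathrm{Var}(X)}$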
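
The definition, the alternative calculation, and the scaling rule can be compared on a small example. The sketch below again uses a hypothetical fair 4-sided die; the names `expectation` and `variance` and the transformation $3 + 2X$ are assumptions for illustration.

```python
from fractions import Fraction
import math

def expectation(pmf, g=lambda x: x):
    """E[g(X)] for a discrete RV given as {value: probability}."""
    return sum(g(x) * p for x, p in pmf.items())

def variance(pmf):
    """Var(X) = E[(X - mu)^2], computed directly from the definition."""
    mu = expectation(pmf)
    return expectation(pmf, lambda x: (x - mu) ** 2)

# Hypothetical fair 4-sided die with faces 1..4.
d4 = {face: Fraction(1, 4) for face in range(1, 5)}

var_def = variance(d4)
# Alternative calculation: E[X^2] - mu^2.
var_alt = expectation(d4, lambda x: x ** 2) - expectation(d4) ** 2

# Y = 3 + 2X: the shift by 3 has no effect, the factor 2 enters squared.
pmf_y = {3 + 2 * x: p for x, p in d4.items()}
var_y = variance(pmf_y)

std = math.sqrt(var_def)
print(var_def, var_alt, var_y, std)
```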



Exercise
• Let RV $V$ represent the number of heads obtained in a fair two-coin throw.
• Calculate $\mathrm{E}(V)$ and $\mathrm{Var}(V)$.

• Calculate $\mathrm{Var}(X)$ if RV $X$ represents the fair 6-sided die.


Multi-Dimensional Random Variables
• Two (or more) RVs that are defined on the same sample space Ω
• Generic case: RVs are from different distributions
• Special case: RVs are from the same distribution (see $S_n$ in Nugget 1)
• Joint probability distribution for two-dimensional RV ($X$ and $Y$)
• Connection between $X$ and $Y$ may be complicated!
• We consider only two-dimensional (bivariate) RVs!

      | $y_1$ | … | $y_j$ | … | $y_J$ | Σ
$x_1$ | $\mathrm{P}(x_1, y_1)$ | … | $\mathrm{P}(x_1, y_j)$ | … | $\mathrm{P}(x_1, y_J)$ | $\mathrm{P}(X = x_1)$
…     | … | ⋱ | … | ⋱ | … | …
$x_i$ | $\mathrm{P}(x_i, y_1)$ | … | $\mathrm{P}(x_i, y_j)$ | … | $\mathrm{P}(x_i, y_J)$ | $\mathrm{P}(X = x_i)$
…     | … | ⋱ | … | ⋱ | … | …
$x_I$ | $\mathrm{P}(x_I, y_1)$ | … | $\mathrm{P}(x_I, y_j)$ | … | $\mathrm{P}(x_I, y_J)$ | $\mathrm{P}(X = x_I)$
Σ     | $\mathrm{P}(Y = y_1)$ | … | $\mathrm{P}(Y = y_j)$ | … | $\mathrm{P}(Y = y_J)$ | 1


Two-Dimensional Random Variables
• Joint PDF
• Discrete RVs: $\mathrm{P}(X = x_i \cap Y = y_j) = \mathrm{P}(X = x_i, Y = y_j) = \mathrm{P}(x_i, y_j)$
• Cont. RVs: $f_{XY}(x, y)$
• Joint CDF
• Discrete RVs: $F_{XY}(x_i, y_j) = \mathrm{P}(X \le x_i, Y \le y_j) = \sum_{\alpha=1}^{i} \sum_{\beta=1}^{j} \mathrm{P}(x_\alpha, y_\beta)$
• Cont. RVs: $F_{XY}(x, y) = \int_{\alpha=-\infty}^{x} \int_{\beta=-\infty}^{y} f_{XY}(\alpha, \beta)\,\mathrm{d}\beta\,\mathrm{d}\alpha$

• Marginal PDF can be calculated by
• Discrete RVs: Sum “row” or “column”, e.g. $\mathrm{P}(Y = y_j) = \sum_{x_i} \mathrm{P}(x_i, y_j)$
• Cont. RVs: Integrate over the other RV, e.g. $f_Y(y) = \int_{\Omega_X} f_{XY}(x, y)\,\mathrm{d}x$
• Joint PDF contains more information than marginal PDFs
• Same information only for independent $X$ and $Y$
• Extra: Marginal CDF
Pictures: Kris Hauser (Robotic Systems)
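
The row and column sums and the joint CDF can be sketched for a small discrete example. The joint table below is hypothetical (it is not the exercise data), and the function names `marginals` and `joint_cdf` are assumptions for illustration.

```python
from fractions import Fraction
from collections import defaultdict

# Hypothetical joint PDF P(x, y) with X in {0, 1} and Y in {1, 2, 3}.
joint = {
    (0, 1): Fraction(1, 10), (0, 2): Fraction(2, 10), (0, 3): Fraction(1, 10),
    (1, 1): Fraction(2, 10), (1, 2): Fraction(1, 10), (1, 3): Fraction(3, 10),
}

def marginals(joint):
    """Marginal PDFs: sum the joint table over the other RV."""
    px, py = defaultdict(Fraction), defaultdict(Fraction)
    for (x, y), p in joint.items():
        px[x] += p  # row sum gives P(X = x)
        py[y] += p  # column sum gives P(Y = y)
    return dict(px), dict(py)

def joint_cdf(joint, x, y):
    """F_XY(x, y) = P(X <= x, Y <= y)."""
    return sum((p for (a, b), p in joint.items() if a <= x and b <= y), Fraction(0))

px, py = marginals(joint)
print(px, py)
print(joint_cdf(joint, 0, 2), joint_cdf(joint, 1, 3))
```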
Exercise
• You work for HR. The table shows the high staff turnover (fluctuation) in your company. Rows indicate after how many years employees leave your company, columns show how many years of prior work experience they had before joining.

Years of stay | Prior experience: 1 | 2 | 3
2 | 0.03 | 0.05 | 0.22
3 | 0.05 | 0.06 | 0.15
4 | 0.14 | 0.15 | 0.15

• Model the data with RVs! Which proportion of employees had one year of prior experience and stayed for two years?

• What proportion of employees stay four years?


Two-Dimensional Random Variables
• Expectation values
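• Expectation of the marginal distributions: $\mathrm{E}(X)$ and $\mathrm{E}(Y)$
• Use known formulas for univariate RVs (discrete or cont.) with the marginal PDFs $f_X(x)$ and $f_Y(y)$
• Yields a two-element vector if both are computed
• Joint expectation $\mathrm{E}(XY)$
• Discrete RVs: $\mathrm{E}(XY) = \sum_{x_i} \sum_{y_j} x_i \cdot y_j \cdot \mathrm{P}(x_i, y_j)$
• Cont. RVs: $\mathrm{E}(XY) = \int_{\Omega_Y} \int_{\Omega_X} x \cdot y \cdot f_{XY}(x, y)\,\mathrm{d}x\,\mathrm{d}y$
• Link to Covariance (see session “Regression and Correlation”)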

• Extra:
• The following can also be derived
• Variance of marginal PDFs
• Conditional probabilities of joint PDFs/CDFs
• …
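
The expectation formulas above can be sketched for a small discrete example. The joint table is hypothetical (it is not the exercise data), the function name `expectation` is an assumption, and the last line uses the standard identity Cov(X, Y) = E(XY) − E(X)·E(Y), which this lecture only refers to as a link.

```python
from fractions import Fraction

# Hypothetical joint PDF P(x, y) with X in {0, 1} and Y in {1, 2, 3}.
joint = {
    (0, 1): Fraction(1, 10), (0, 2): Fraction(2, 10), (0, 3): Fraction(1, 10),
    (1, 1): Fraction(2, 10), (1, 2): Fraction(1, 10), (1, 3): Fraction(3, 10),
}

def expectation(joint, g):
    """E[g(X, Y)] = sum of g(x, y) * P(x, y) over all table entries."""
    return sum(g(x, y) * p for (x, y), p in joint.items())

e_x = expectation(joint, lambda x, y: x)       # uses the marginal of X implicitly
e_y = expectation(joint, lambda x, y: y)       # uses the marginal of Y implicitly
e_xy = expectation(joint, lambda x, y: x * y)  # joint expectation E[XY]

# Standard identity (assumed here, covered in a later session).
cov = e_xy - e_x * e_y
print(e_x, e_y, e_xy, cov)
```

Summing `x * P(x, y)` over the whole table gives the same result as first computing the marginal PDF of X and then applying the univariate formula.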



Exercise – Cont’d
• What proportion has been hired with only one year of experience?

• What is the expectation value of employee stay if they have two years of experience?

• What is the expectation value of employee stay in general?

• What is the joint expectation $\mathrm{E}(XY)$?


Independence and Uncorrelated-ness of RVs
• Definition of independence
• RVs $X$ and $Y$ are independent iff the events $\{X \le x\}$ and $\{Y \le y\}$ are independent for all $x$ and $y$
• $\mathrm{P}(X \le x \mid Y \le y) = \mathrm{P}(X \le x)$ or equivalently $\mathrm{P}(X \le x, Y \le y) = \mathrm{P}(X \le x) \cdot \mathrm{P}(Y \le y)$
• Implications for PDFs
• Discrete RVs: $\mathrm{P}(X = x_i, Y = y_j) = \mathrm{P}(X = x_i) \cdot \mathrm{P}(Y = y_j)$
• Continuous RVs: $f_{XY}(x, y) = f_X(x) \cdot f_Y(y)$
• Definition of iid
• Independent and identically distributed: the RVs are mutually independent and all follow the same distribution
• Repetitions in experiments: “Same RV, executed in multiple independent runs”
• Definition of uncorrelated-ness
• The RVs $X$ and $Y$ are uncorrelated if: $\mathrm{E}(X \cdot Y) = \mathrm{E}(X) \cdot \mathrm{E}(Y)$
• Second, more precise definition given in Lecture “Regression and Correlation”
• Independence is stronger than (implies) uncorrelated-ness
• Correlation describes only a linear relationship between $X$ and $Y$
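
That independence is stronger than uncorrelated-ness can be seen on a small example. In the sketch below, $X$ is assumed uniform on {−1, 0, 1} and $Y = X^2$; this pair, the two fair coins, and the function names are assumptions for illustration.

```python
from fractions import Fraction
from collections import defaultdict

def marginals(joint):
    """Marginal PDFs of X and Y from a joint table {(x, y): probability}."""
    px, py = defaultdict(Fraction), defaultdict(Fraction)
    for (x, y), p in joint.items():
        px[x] += p
        py[y] += p
    return dict(px), dict(py)

def is_independent(joint):
    """Check P(x, y) = P(X = x) * P(Y = y) for every pair of values."""
    px, py = marginals(joint)
    return all(joint.get((x, y), Fraction(0)) == px[x] * py[y]
               for x in px for y in py)

def is_uncorrelated(joint):
    """Check E[XY] = E[X] * E[Y]."""
    e_x = sum(x * p for (x, y), p in joint.items())
    e_y = sum(y * p for (x, y), p in joint.items())
    e_xy = sum(x * y * p for (x, y), p in joint.items())
    return e_xy == e_x * e_y

# X uniform on {-1, 0, 1} and Y = X^2: Y is fully determined by X.
squared = {(-1, 1): Fraction(1, 3), (0, 0): Fraction(1, 3), (1, 1): Fraction(1, 3)}
# Two independent fair coins coded as 0/1.
coins = {(x, y): Fraction(1, 4) for x in (0, 1) for y in (0, 1)}

print(is_uncorrelated(squared), is_independent(squared))
print(is_uncorrelated(coins), is_independent(coins))
```

The first pair is uncorrelated although $Y$ depends completely on $X$: the dependence is quadratic, and correlation only captures linear relationships.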

Exercise – Cont’d
• Are the years of stay and years of previous experience independent?

• Are the years of stay and years of previous experience iid?
