Lecture 2: Introduction to Probability and Statistics
Email: [email protected]
URL: https://www.zabaras.com/
Empirical distribution
We call 𝑥 = 𝑋(𝜔), 𝜔 ∈ Ω, a realization of 𝑋.
Probability density
$$P_X(B) = \int_B p_X(x)\,dx$$

We often write:

$$p_X(x) \equiv p(x)$$
Cumulative Distribution Function
$$p(x) \ge 0, \qquad F(z) = \int_{-\infty}^{z} p(x)\,dx \quad \text{(cumulative distribution function)}$$

$$\int_{-\infty}^{\infty} p(x)\,dx = 1$$

$$P\big(x \in (a,b)\big) = \int_a^b p(x)\,dx = F(b) - F(a)$$
The CDF for a random variable 𝑋 is the function 𝐹(𝑥) that returns the
probability that 𝑋 is less than 𝑥.
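As a quick numerical check (a minimal Python sketch; the exponential density and the interval are illustrative choices, not from the slides), one can verify that 𝑃(𝑥 ∈ (𝑎, 𝑏)) = 𝐹(𝑏) − 𝐹(𝑎):

```python
import numpy as np

# Illustrative density on [0, inf): p(x) = exp(-x), with CDF F(z) = 1 - exp(-z).
p = lambda x: np.exp(-x)
F = lambda z: 1.0 - np.exp(-z)

a, b = 0.5, 2.0
x = np.linspace(a, b, 100_001)
px = p(x)
# Trapezoid rule for P(x in (a, b)) = integral of p(x) over (a, b)
prob = np.sum(0.5 * (px[1:] + px[:-1]) * np.diff(x))

print(prob)         # ~0.4712
print(F(b) - F(a))  # exp(-0.5) - exp(-2) ~ 0.4712
```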
Expectation

$$\mathbb{E}[X] = \int x\, p_X(x)\,dx, \qquad \text{std}[X] = \sqrt{\text{var}[X]}$$

$$\mathbb{E}[T(X)] = \int T(x)\, p_X(x)\,dx$$

For a discrete random variable:

$$\mathbb{E}[f] = \sum_x p(x)\, f(x)$$
Conditional expectation
$$\mathbb{E}_x[f \mid y] = \sum_x p(x \mid y)\, f(x) \qquad \text{(discrete)}$$

$$\mathbb{E}_x[f \mid y] = \int p(x \mid y)\, f(x)\,dx \qquad \text{(continuous)}$$
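A minimal NumPy sketch of these definitions (the pmf values below are illustrative, not from the lecture):

```python
import numpy as np

# Discrete example: values of X and their probabilities p(x)
x_vals = np.array([0, 1, 2])
p_x    = np.array([0.2, 0.5, 0.3])

mean_x = np.sum(x_vals * p_x)                  # E[X] = sum_x x p(x)
var_x  = np.sum((x_vals - mean_x)**2 * p_x)    # var[X]
std_x  = np.sqrt(var_x)                        # std[X] = sqrt(var[X])

# Conditional expectation E_x[f | y] = sum_x p(x|y) f(x) for some fixed y
p_x_given_y = np.array([0.1, 0.3, 0.6])        # an illustrative conditional pmf
f = lambda x: x**2
cond_mean_f = np.sum(p_x_given_y * f(x_vals))

print(mean_x, std_x, cond_mean_f)
```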
Uniform Random Variable
Consider the uniform random variable 𝒰(𝑥 | 𝑎, 𝑏):

$$\mathcal{U}(x \mid a,b) = \frac{1}{b-a}\,\mathbb{I}(a \le x \le b)$$

What is the CDF of a uniform random variable 𝒰(𝑥 | 0, 1)?

$$P_U(x) = \begin{cases} x, & \text{for } 0 \le x \le 1 \\ 1, & \text{for } x > 1 \\ 0, & \text{otherwise} \end{cases}$$
You can show that the mean, 2nd moment, and variance of 𝒰(𝑥 | 𝑎, 𝑏) are:

$$\mathbb{E}[x \mid a,b] = \frac{a+b}{2}, \qquad \mathbb{E}[x^2 \mid a,b] = \frac{a^2 + ab + b^2}{3}, \qquad \text{var}[x \mid a,b] = \frac{(b-a)^2}{12}$$
Note that it is possible for 𝑝(𝑥) > 1 but the density still needs to integrate to 1.
For example, note that

$$\mathcal{U}(x \mid 0, 1/2) = 2\,\mathbb{I}(0 \le x \le 1/2)$$
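A quick Monte Carlo check of the moment formulas above (a sketch; the endpoints 𝑎 = 2, 𝑏 = 5 and the sample size are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(0)
a, b = 2.0, 5.0
x = rng.uniform(a, b, size=1_000_000)

# Compare sample statistics against the closed-form moments of U(x | a, b).
print(x.mean(),      (a + b) / 2)                # mean: (a+b)/2
print((x**2).mean(), (a*a + a*b + b*b) / 3)      # 2nd moment: (a^2+ab+b^2)/3
print(x.var(),       (b - a)**2 / 12)            # variance: (b-a)^2/12
```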
The Gaussian Distribution
A random variable 𝑋 ∈ ℝ is Gaussian or normally distributed, 𝑋 ~ 𝒩(𝜇, 𝜎²), if:
$$P(X \le t) = \int_{-\infty}^{t} \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left(-\frac{1}{2\sigma^2}(x-\mu)^2\right) dx$$
To show that this density is normalized note the following trick:
Let $I = \int_{-\infty}^{\infty} \exp\left(-\frac{1}{2\sigma^2}(x-\mu)^2\right) dx$. Then:

$$I^2 = \int\!\!\int \exp\left(-\frac{1}{2\sigma^2}(x-\mu)^2\right) \exp\left(-\frac{1}{2\sigma^2}(y-\mu)^2\right) dx\,dy$$

Set $r^2 = (x-\mu)^2 + (y-\mu)^2$ and change to polar coordinates. Then:

$$I^2 = \int_0^{2\pi}\!\!\int_0^{\infty} \exp\left(-\frac{r^2}{2\sigma^2}\right) r\,dr\,d\theta = 2\pi \int_0^{\infty} \exp\left(-\frac{r^2}{2\sigma^2}\right) r\,dr$$

Thus, substituting $u = r^2$:

$$I^2 = \pi \int_0^{\infty} \exp\left(-\frac{u}{2\sigma^2}\right) du = 2\pi\sigma^2, \qquad I = \sqrt{2\pi\sigma^2}$$
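A numerical sanity check of this normalization (a sketch; 𝜇 = 1, 𝜎 = 2 are arbitrary values): the trapezoid-rule integral should approach √(2𝜋𝜎²):

```python
import numpy as np

mu, sigma = 1.0, 2.0
x = np.linspace(mu - 10*sigma, mu + 10*sigma, 200_001)
integrand = np.exp(-0.5 * ((x - mu) / sigma)**2)

# Trapezoid rule for I = integral of exp(-(x-mu)^2 / (2 sigma^2)) dx
I = np.sum(0.5 * (integrand[1:] + integrand[:-1]) * np.diff(x))

print(I)                                  # ~5.0133
print(np.sqrt(2 * np.pi * sigma**2))      # sqrt(2 pi sigma^2) ~ 5.0133
```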
The Gaussian Distribution
A random variable 𝑋 ∈ ℝ is Gaussian or normally distributed, 𝑋 ~ 𝒩(𝜇, 𝜎²), if:
$$\mathcal{N}(x \mid \mu, \sigma^2) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left(-\frac{1}{2\sigma^2}(x-\mu)^2\right)$$
We often work with the precision of a Gaussian, 𝜆 = 1/𝜎². The higher 𝜆 is, the narrower the distribution.
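A small sketch illustrating the precision parameterization: the peak height of 𝒩(𝑥 | 𝜇, 1/𝜆) is √(𝜆/2𝜋), so larger 𝜆 means a narrower, taller density (the 𝜆 values are illustrative):

```python
import numpy as np

def normal_pdf(x, mu, sigma2):
    """Gaussian density N(x | mu, sigma^2)."""
    return np.exp(-0.5 * (x - mu)**2 / sigma2) / np.sqrt(2 * np.pi * sigma2)

# Higher precision lambda = 1/sigma^2  ->  narrower, taller density
for lam in [0.25, 1.0, 4.0]:
    peak = normal_pdf(0.0, 0.0, 1.0 / lam)   # value at the mode x = mu
    print(lam, peak, np.sqrt(lam / (2 * np.pi)))   # last two columns agree
```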
CDF of a Gaussian
Plot of the Standard Normal 𝒩(0,1) PDF and its CDF:

[Figure: the PDF 𝒩(x; 0, 1) and the CDF Φ(x; 0, 1), plotted over x ∈ [−3, 3].]

$$F(x; \mu, \sigma^2) = \int_{-\infty}^{x} \mathcal{N}(z \mid \mu, \sigma^2)\,dz$$

$$F(x; \mu, \sigma^2) = \frac{1}{2}\left[1 + \mathrm{erf}\!\left(z/\sqrt{2}\right)\right], \quad z = (x-\mu)/\sigma, \qquad \mathrm{erf}(x) = \frac{2}{\sqrt{\pi}} \int_0^x e^{-t^2}\,dt$$

For the standard normal:

$$\Phi(x) = \int_{-\infty}^{x} p(z)\,dz = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{x} \exp\left(-\frac{z^2}{2}\right) dz$$

Assume 𝑋 ~ 𝒩(𝜇, 𝜎²). Then 𝐹(𝑥) = 𝑃(𝑋 < 𝑥 | 𝜇, 𝜎²) = Φ((𝑥 − 𝜇)/𝜎).
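A minimal implementation of this CDF using the error function from Python's standard library (a sketch, not the course's Matlab code):

```python
from math import erf, sqrt

def normal_cdf(x, mu=0.0, sigma=1.0):
    """F(x; mu, sigma^2) = 0.5 * (1 + erf(z / sqrt(2))), with z = (x - mu)/sigma."""
    z = (x - mu) / sigma
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

print(normal_cdf(0.0))    # 0.5 by symmetry
print(normal_cdf(1.96))   # ~0.975
```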
$$\mathbb{E}[X] = \mu, \qquad \text{var}[X] = \sigma^2$$

The Gaussian density

$$p(x) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left(-\frac{(x-\mu)^2}{2\sigma^2}\right)$$

is one of the most studied and most used distributions.
[Figure: Estimation of 𝜇, 𝜎 (Matlab implementation); data on [−0.5, 0.5].]
From Bayesian Core, J.M. Marin and C.P. Robert, Chapter 2 (available online)
Datasets: CMBData
CMBdata: Spectral representation of the cosmological microwave background (CMB), i.e. electromagnetic radiation from photons dating back to 300,000 years after the Big Bang, expressed as the difference in apparent temperature from the mean temperature.
[Figure: histogram of the CMBdata with a Gaussian overlaid (solid line); MLE-based estimates of 𝜇 and 𝜎² were used in constructing the Gaussian. Maximum Likelihood Estimators (MLE) will be discussed in a follow-up lecture.]
From Bayesian Core, J.M. Marin and C.P. Robert, Chapter 2 (available online)
Quantiles
Recall that the probability density function 𝑝𝑋(·) of a random variable 𝑋 is defined as the derivative of the cumulative distribution function, so that
$$F(x_0) = \int_{-\infty}^{x_0} p_X(x)\,dx$$
The value 𝑦(𝛼) such that 𝐹(𝑦(𝛼)) = 𝛼 is called the 𝛼-quantile of the distribution with CDF 𝐹. The median is of course 𝑦(0.5), since 𝐹(𝑦(0.5)) = 0.5.
Note that $P_X(\mathbb{R}) = \int_{-\infty}^{\infty} p_X(x)\,dx = 1$.

One can define tail area probabilities. [Figure: the shaded regions each contain 𝛼/2 of the probability mass.] For 𝒩(0, 1), the leftmost cutoff point is Φ⁻¹(𝛼/2), where Φ is the CDF of 𝒩(0, 1).
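A short sketch computing the tail cutoffs with SciPy's normal quantile function (𝛼 = 0.05 is an illustrative choice):

```python
from scipy.stats import norm

alpha = 0.05
# Tail cutoffs for N(0,1): each shaded tail holds alpha/2 of the mass.
left  = norm.ppf(alpha / 2)        # Phi^{-1}(alpha/2)      ~ -1.96
right = norm.ppf(1 - alpha / 2)    # Phi^{-1}(1 - alpha/2)  ~ +1.96
print(left, right)

# Check: the mass between the two cutoffs is 1 - alpha.
print(norm.cdf(right) - norm.cdf(left))   # ~0.95
```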
Bernoulli Distribution

$$\text{Bern}(x \mid \mu) = \mu^x (1-\mu)^{1-x}$$

Using the indicator function, we can also write this as:

$$\text{Bern}(x \mid \mu) = \mu^{\mathbb{I}(x=1)} (1-\mu)^{\mathbb{I}(x=0)}$$
Recall that $\text{var}[f] = \mathbb{E}[f(x)^2] - \mathbb{E}[f(x)]^2$.
For the Bernoulli distribution $\text{Bern}(x \mid \mu) = \mu^x (1-\mu)^{1-x}$, we can easily show from the definitions:

$$\mathbb{E}[x] = \mu, \qquad \text{var}[x] = \mu(1-\mu)$$

$$\mathbb{H}[x] = -\sum_{x \in \{0,1\}} p(x \mid \mu) \ln p(x \mid \mu) = -\mu \ln \mu - (1-\mu)\ln(1-\mu)$$
For a dataset 𝒟 = {𝑥₁, …, 𝑥_N} of independent draws, the likelihood is

$$p(\mathcal{D} \mid \mu) = \prod_{n=1}^{N} p(x_n \mid \mu) = \prod_{n=1}^{N} \mu^{x_n} (1-\mu)^{1-x_n} = \mu^m (1-\mu)^{N-m}, \qquad m = \sum_{n=1}^{N} x_n$$
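A minimal sketch on simulated Bernoulli data (the true 𝜇 = 0.3 and 𝑁 = 10,000 are illustrative): the estimate 𝑚/𝑁 maximizes the likelihood above (a standard result, stated here without derivation), and the entropy formula can be evaluated at the estimate:

```python
import numpy as np

rng = np.random.default_rng(1)
mu_true = 0.3
x = rng.binomial(1, mu_true, size=10_000)   # N Bernoulli draws

m = x.sum()                 # number of ones in the data
mu_hat = m / x.size         # maximizer of mu^m (1-mu)^(N-m)
print(mu_hat)               # ~0.3

# Entropy H[x] = -mu ln(mu) - (1-mu) ln(1-mu), evaluated at mu_hat
H = -(mu_hat * np.log(mu_hat) + (1 - mu_hat) * np.log(1 - mu_hat))
print(H)
```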
Binomial Distribution

$$\text{Bin}(X = m \mid N, \mu) = \binom{N}{m} \mu^m (1-\mu)^{N-m}$$
It can be shown (see S. Ross, Introduction to Probability Models) that the limit of the binomial distribution as 𝑁 → ∞ with 𝑁𝜇 → 𝜆 is the Poisson(𝜆) distribution.
[Figure: histograms of Bin(m | N = 10, 𝜇) for 𝜇 = 0.25 and 𝜇 = 0.9, m = 0, …, 10.]
For 𝑚 ~ Bin(𝑁, 𝜇): because 𝑚 = 𝑥₁ + … + 𝑥_N, and for each observation the mean and variance are known from the Bernoulli distribution:

$$\mathbb{E}[m] = \sum_{m=0}^{N} m\,\text{Bin}(m \mid N, \mu) = \mathbb{E}[x_1 + \ldots + x_N] = N\mu$$

$$\text{var}[m] = \sum_{m=0}^{N} \big(m - \mathbb{E}[m]\big)^2\,\text{Bin}(m \mid N, \mu) = \text{var}[x_1 + \ldots + x_N] = N\mu(1-\mu)$$
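These identities can be checked exactly by summing over the pmf (a sketch with illustrative values 𝑁 = 10, 𝜇 = 0.25):

```python
from math import comb

N, mu = 10, 0.25
# Binomial pmf: Bin(m | N, mu) = C(N, m) mu^m (1-mu)^(N-m)
pmf = [comb(N, m) * mu**m * (1 - mu)**(N - m) for m in range(N + 1)]

mean = sum(m * p for m, p in zip(range(N + 1), pmf))
var  = sum((m - mean)**2 * p for m, p in zip(range(N + 1), pmf))

print(mean, N * mu)               # both 2.5
print(var,  N * mu * (1 - mu))    # both 1.875
```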
For a discrete variable 𝒙 in the 1-of-𝐾 encoding (𝑥_k ∈ {0, 1}, ∑_k 𝑥_k = 1):

$$p(\boldsymbol{x} \mid \boldsymbol{\mu}) = \prod_{k=1}^{K} \mu_k^{x_k}$$

where 𝝁 = (𝜇₁, … , 𝜇_K)ᵀ and ∑_k 𝜇_k = 1.
Given 𝑁 independent observations, the likelihood is

$$p(\mathcal{D} \mid \boldsymbol{\mu}) = \prod_{n=1}^{N} \prod_{k=1}^{K} \mu_k^{x_{nk}} = \prod_{k=1}^{K} \mu_k^{m_k}, \qquad m_k = \sum_{n=1}^{N} x_{nk}$$

where 𝑚_k is the # of observations of 𝑥_k = 1. Maximizing $\sum_{k=1}^{K} m_k \ln \mu_k + \lambda\left(\sum_{k=1}^{K} \mu_k - 1\right)$ with a Lagrange multiplier 𝜆 gives 𝜇_k = −𝑚_k/𝜆; enforcing $\sum_k \mu_k = 1$ yields 𝜆 = −𝑁, so

$$\mu_k^{ML} = \frac{m_k}{N}$$
As expected, this is the fraction of the 𝑁 observations with 𝑥_k = 1.
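A minimal sketch of this ML estimate on simulated categorical data (𝐾 = 4 and the true 𝝁 are illustrative choices):

```python
import numpy as np

rng = np.random.default_rng(2)
K, N = 4, 5_000
mu_true = np.array([0.1, 0.2, 0.3, 0.4])

labels = rng.choice(K, size=N, p=mu_true)   # categorical draws, coded 0..K-1
m = np.bincount(labels, minlength=K)        # counts m_k
mu_ml = m / N                               # MLE: fraction of observations with x_k = 1
print(mu_ml)                                # ~mu_true
```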
Multinomial Distribution
We can also consider the joint distribution of 𝑚1, … , 𝑚𝐾 in 𝑁 observations
conditioned on the parameters 𝝁 = (𝜇1, … , 𝜇𝐾).
$$\text{Mult}(m_1, \ldots, m_K \mid \boldsymbol{\mu}, N) = \binom{N}{m_1\, m_2 \ldots m_K} \prod_{k=1}^{K} \mu_k^{m_k}, \qquad \binom{N}{m_1\, m_2 \ldots m_K} = \frac{N!}{m_1!\, m_2! \cdots m_K!}$$
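A small sketch evaluating this pmf directly from the formula (the counts and probabilities below are illustrative):

```python
from math import factorial, prod

def multinomial_pmf(m, mu):
    """Mult(m_1..m_K | mu, N) = N!/(m_1!...m_K!) * prod_k mu_k^{m_k}."""
    N = sum(m)
    coef = factorial(N)
    for mk in m:
        coef //= factorial(mk)   # each partial quotient stays an integer
    return coef * prod(muk**mk for muk, mk in zip(mu, m))

# N = 4 trials, K = 3 categories: coefficient 4!/(2!1!1!) = 12
print(multinomial_pmf([2, 1, 1], [0.5, 0.25, 0.25]))   # 12 * 0.015625 = 0.1875
```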
To visualize the data (sequence logo), we plot the letters 𝐴, 𝐶, 𝐺 and 𝑇 with a font size proportional to their empirical probability, and with the most probable letter on the top.
Example: Biosequence Analysis
The empirical probability distribution at location 𝑡 is obtained by normalizing the vector of counts (see the MLE estimate above):
$$\hat{\boldsymbol{\theta}}_t = \frac{1}{N}\left( \sum_{i=1}^{N} \mathbb{I}(X_{it} = 1),\; \sum_{i=1}^{N} \mathbb{I}(X_{it} = 2),\; \sum_{i=1}^{N} \mathbb{I}(X_{it} = 3),\; \sum_{i=1}^{N} \mathbb{I}(X_{it} = 4) \right)$$
[Figure: sequence logo over positions 1–30, vertical axis in bits (0–2), and the corresponding empirical probabilities at each location.]
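A minimal sketch of this normalized-count estimate on simulated sequence data (random letters, not the actual biosequence dataset):

```python
import numpy as np

rng = np.random.default_rng(3)
N, T = 100, 30
# N sequences of length T; letters A, C, G, T coded as 1..4
X = rng.integers(1, 5, size=(N, T))

t = 0
# theta_t: fraction of sequences showing each letter at location t
theta_t = np.array([(X[:, t] == k).sum() for k in range(1, 5)]) / N
print(theta_t, theta_t.sum())   # the four entries sum to 1
```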
Empirical Distribution

The empirical distribution of samples 𝑥₁, …, 𝑥_N is

$$p_{emp}(x) = \frac{1}{N} \sum_{i=1}^{N} \delta_{x_i}(x)$$

This corresponds to a histogram with spikes at each sample point with height equal to the corresponding weight (here 1/𝑁). This distribution assigns zero weight to any point not in the dataset.
Note that the “sample mean of 𝑓(𝑥)” is the expectation of 𝑓(𝑥) under the
empirical distribution:
$$\mathbb{E}_{p_{emp}(x)}[f(x)] = \int f(x)\, \frac{1}{N} \sum_{i=1}^{N} \delta_{x_i}(x)\, dx = \frac{1}{N} \sum_{i=1}^{N} f(x_i)$$
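A quick check of this identity (a sketch; 𝑓(𝑥) = 𝑥² and standard normal samples are illustrative choices):

```python
import numpy as np

rng = np.random.default_rng(4)
x = rng.normal(size=1_000)   # samples x_1, ..., x_N

f = lambda x: x**2
# Expectation of f under the empirical distribution = sample mean of f(x_i)
print(np.mean(f(x)))         # ~E[X^2] = 1 for N(0, 1)
```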