
2.6. Probability and Statistics

• Probability: reasoning under uncertainty (given a probabilistic model of a process, we can reason about the likelihood of different events).

• Statistics: the study of data: collecting, analyzing, interpreting, and drawing conclusions from datasets. It often involves inferring unknown properties of a population from a sample.
2.6.1. Example: Tossing coins
• Suppose the coin is fair (P(heads) = 0.5); we can simulate multiple draws with a multinomial sampling function (see the sketch below).
• Each time you run this sampling process, you get a different result.
• As the number of samples grows, the sample estimates converge to the true underlying probabilities (law of large numbers); the central limit theorem describes the rate of this convergence.
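A minimal sketch of this simulation, assuming NumPy's multinomial sampler (the slide does not name a specific library):

```python
import numpy as np

# Draw n fair coin tosses at once and compare the empirical frequencies
# with the true probabilities [0.5, 0.5].
fair_probs = [0.5, 0.5]                 # P(heads), P(tails)

for n in [10, 100, 10_000]:
    counts = np.random.multinomial(n, fair_probs)   # e.g. array([6, 4]) for n = 10
    print(n, counts / n)                # the estimates approach [0.5, 0.5] as n grows
```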
2.6.2. Formal notations
• Set of possible outcomes (sample space): S = {heads, tails} if the task is tossing a coin.
• If we're tossing 2 coins: S = {(heads, heads), (heads, tails), (tails, heads), (tails, tails)}
+ Example: rolling a die: S = {1, 2, 3, 4, 5, 6}
• Given a random variable X, P(X = v) denotes the probability of X taking value v.
• Similarly, P(1 ≤ X ≤ 3) denotes the probability of the event {1 ≤ X ≤ 3}.
• A probability function 𝑃 maps events onto real values:
𝑃: 𝐴 ⊆ 𝑆 → [0,1]
• The probability, denoted 𝑃(𝐴), of an event 𝐴 in sample space 𝑆 has the
following properties:
1. The probability of any event A is a real non-negative number: P(A) ≥ 0
2. The probability of the entire sample space is 1: P(S) = 1
3. For any sequence of mutually exclusive events A_1, A_2, … (A_i ∩ A_j = ∅ for all i ≠ j), the probability that any of them happens equals the sum of their individual probabilities:

P(∪_{i=1}^{∞} A_i) = ∑_{i=1}^{∞} P(A_i)
2.6.3. Random variables

• Two types: discrete and continuous


• Example:
+ X is the number rolled on a die (discrete)
+ Y is the height of a person sampled at random from a population (continuous)
Let X be the exact amount of rain tomorrow: P(X = 2) = ?
For a continuous variable, the probability of any single exact value is essentially zero, so we describe X by a probability density function p(x), with P(X ≤ a) = ∫_{−∞}^{a} p(x) dx.

Example: P(X ≤ 2) = ∫_{0}^{2} p(x) dx (the density is zero for negative rainfall)
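A minimal sketch of this computation, using an exponential density p(x) = e^(−x) for x ≥ 0 chosen only for illustration (the slide does not specify a density), and NumPy for the numerical integral:

```python
import numpy as np

# Hypothetical density p(x) = exp(-x) for x >= 0; compute P(X <= 2) as the integral of p over [0, 2].
x = np.linspace(0, 2, 10_001)
p = np.exp(-x)

prob = np.trapz(p, x)                   # numerical integral of p(x) from 0 to 2
print(prob, 1 - np.exp(-2))             # both ≈ 0.8647
```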
2.6.4. Multiple random variables
• Joint probability P(A = a, B = b) denotes the probability of the events A = a and B = b happening at the same time:
P(A = a, B = b) ≤ P(A = a)
P(A = a, B = b) ≤ P(B = b)
+ To get the marginal P(A = a), sum P(A = a, B = v) over all values v that the random variable B can take (see the sketch below):
P(A = a) = ∑_v P(A = a, B = v)
• Conditional probability P(A = a | B = b) denotes the probability of the event A = a given that B = b has occurred:
P(A = a | B = b) = P(A = a, B = b) / P(B = b)
+ For 2 disjoint events 𝐵 and 𝐵’ : 𝑃(𝐵 ∪ 𝐵′|𝐴 = 𝑎) = 𝑃(𝐵|𝐴 = 𝑎) + 𝑃(𝐵’|𝐴 = 𝑎)
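A minimal sketch of marginalization and conditioning on a small joint table (the joint values are made up for illustration), assuming NumPy:

```python
import numpy as np

# Made-up joint table P(A = a, B = b): rows index a, columns index b; entries sum to 1.
P_AB = np.array([[0.10, 0.30],
                 [0.20, 0.40]])

P_A = P_AB.sum(axis=1)                  # marginal: P(A = a) = sum_v P(A = a, B = v)
P_B = P_AB.sum(axis=0)                  # marginal over A

P_A_given_B0 = P_AB[:, 0] / P_B[0]      # conditional: P(A = a | B = 0)
print(P_A, P_B, P_A_given_B0)
```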
Bayes theorem
• With the conditional probability equation, we have:
P(A, B) = P(A | B) P(B) = P(B | A) P(A)

→ P(A | B) = P(B | A) P(A) / P(B)

P(A|B): posterior
P(B|A): likelihood
P(A): prior
P(B): evidence
• Example: if we know the prevalence of symptoms for a disease, we can determine how likely it is that someone has the disease based on their symptoms.
• In case we don’t have access to 𝑃(𝐵), a simpler version of Bayes
theorem can be used:
𝑃(𝐴|𝐵) ∝ 𝑃 𝐵 𝐴 𝑃(𝐴)
• Since P(A|B) must be normalized to 1, meaning ∑_a P(A = a | B) = 1, we also have:

P(A | B) = P(B | A) P(A) / ∑_a P(B | A = a) P(A = a)

where ∑_a P(B | A = a) P(A = a) = ∑_a P(B, A = a) = P(B)
Independence
• Random variables A and B are independent if changing the value of A does not change the probability distribution of B, and vice versa.
A, B are independent (A ⊥ B)
→ P(A | B) = P(A) → P(A, B) = P(A | B) P(B) = P(A) P(B)

• Conditional independence: random variables A and B are conditionally independent given a third variable C iff P(A, B | C) = P(A | C) P(B | C)
• Example: broken bones and cancer are independent if we consider the whole population.
However, if we condition on being in a hospital, broken bones are negatively correlated
with having cancer.
Example: a doctor administers an HIV test to a patient. D1 = 1 means the test is positive and D1 = 0 means it is negative; H is the HIV status of the patient. Assume P(H = 1) = 0.0015, and that the test has P(D1 = 1 | H = 1) = 1 and P(D1 = 1 | H = 0) = 0.01.

P(H = 1 | D1 = 1) = ?

P(D1 = 1) = P(D1 = 1, H = 0) + P(D1 = 1, H = 1)
= P(D1 = 1 | H = 0) P(H = 0) + P(D1 = 1 | H = 1) P(H = 1)
= 0.01 × 0.9985 + 1 × 0.0015
= 0.011485

Using Bayes' rule:

→ P(H = 1 | D1 = 1) = P(D1 = 1 | H = 1) P(H = 1) / P(D1 = 1) = 0.0015 / 0.011485 ≈ 0.1306

→ There is only about a 13% chance that the patient has HIV given a positive result, even though the test is very accurate.
This is counter-intuitive.
• A second test is not as accurate as the first one: P(D2 = 1 | H = 1) = 0.98 and P(D2 = 1 | H = 0) = 0.03.

P(D2 = 1) = 0.98 × 0.0015 + 0.03 × 0.9985 ≈ 0.0314

P(H = 1 | D2 = 1) = (0.98 × 0.0015) / 0.0314 ≈ 0.0468

The second test also came out positive, but on its own it gives only a 4.68% probability of having HIV.
Assuming conditional independence of test 1 and test 2 given H, we have:

P(D1 = 1, D2 = 1 | H = 0) = P(D1 = 1 | H = 0) P(D2 = 1 | H = 0) = 0.0003

P(D1 = 1, D2 = 1 | H = 1) = P(D1 = 1 | H = 1) P(D2 = 1 | H = 1) = 0.98

P(D1 = 1, D2 = 1)
= P(D1 = 1, D2 = 1, H = 0) + P(D1 = 1, D2 = 1, H = 1)
= P(D1 = 1, D2 = 1 | H = 0) P(H = 0) + P(D1 = 1, D2 = 1 | H = 1) P(H = 1)
≈ 0.00177

P(H = 1 | D1 = 1, D2 = 1) = P(D1 = 1, D2 = 1 | H = 1) P(H = 1) / P(D1 = 1, D2 = 1) ≈ 0.8307
The second test significantly improved the estimate when combined with the first one
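A minimal sketch in plain Python reproducing the numbers above from the stated prior and test characteristics:

```python
# Posterior probability of HIV after one and two positive tests (Bayes' rule).
p_h1 = 0.0015                           # prior P(H = 1)
p_h0 = 1 - p_h1                         # P(H = 0) = 0.9985

p_d1_h1, p_d1_h0 = 1.00, 0.01           # P(D1 = 1 | H = 1), P(D1 = 1 | H = 0)
p_d2_h1, p_d2_h0 = 0.98, 0.03           # P(D2 = 1 | H = 1), P(D2 = 1 | H = 0)

# One positive test
p_d1 = p_d1_h1 * p_h1 + p_d1_h0 * p_h0
print(p_d1_h1 * p_h1 / p_d1)            # ≈ 0.1306

# Two positive tests, assuming conditional independence given H
p_d12_h1 = p_d1_h1 * p_d2_h1            # = 0.98
p_d12_h0 = p_d1_h0 * p_d2_h0            # = 0.0003
p_d12 = p_d12_h1 * p_h1 + p_d12_h0 * p_h0
print(p_d12_h1 * p_h1 / p_d12)          # ≈ 0.8307
```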
2.6.6 Expectations
• Expectation of a random variable X is defined as:

E[X] = E_{x~P}[x] = ∑_x x P(X = x)

• For densities, we have E[X] = ∫ x p(x) dx
• Expected value of a function f(x):

E_{x~P}[f(x)] = ∑_x f(x) P(x) = ∫ f(x) p(x) dx
Variance

Var(X) = E[(X − E[X])²] = E[X²] − E[X]²

• The variance of a function of a random variable:

Var_{x~P}[f(x)] = E_{x~P}[f²(x)] − E_{x~P}[f(x)]²


• Standard deviation:
σ = √Var(X)
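A minimal sketch (assuming NumPy) that estimates E[X], Var(X) and σ for a fair die from samples; the exact values are E[X] = 3.5 and Var(X) = 35/12 ≈ 2.92:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.integers(1, 7, size=100_000)    # 100,000 rolls of a fair die

print(x.mean())                         # ≈ 3.5   (estimate of E[X])
print(x.var())                          # ≈ 2.92  (estimate of E[X^2] - E[X]^2)
print(x.std())                          # ≈ 1.71  (estimate of the standard deviation)
```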
Expectation and variance of a vector
Apply the formulas elementwise:

μ ≝ E_{x~P}[x]

μ has coordinates μ_i = E_{x~P}[x_i]

Covariance matrix:
Σ ≝ Cov_{x~P}[x] = E_{x~P}[(x − μ)(x − μ)ᵀ]

Let v be a vector of the same size as x. Then

vᵀ Σ v = E_{x~P}[vᵀ (x − μ)(x − μ)ᵀ v] = Var_{x~P}[vᵀ x]

Σ allows us to compute the variance of any linear function of x with a matrix multiplication. The off-diagonal entries are the covariances between coordinates: a value near 0 means the two coordinates are (close to) uncorrelated, while a large positive value means they are strongly positively correlated (see the sketch below).
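A minimal sketch (assuming NumPy, with a made-up 2-D Gaussian for illustration) checking that vᵀΣv matches Var(vᵀx) on samples:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.multivariate_normal(mean=[0.0, 0.0],
                            cov=[[2.0, 0.8], [0.8, 1.0]],
                            size=100_000)       # shape (n, 2)

Sigma = np.cov(x, rowvar=False)         # empirical covariance matrix (2 x 2)
v = np.array([1.0, -2.0])

print(v @ Sigma @ v)                    # v^T Σ v
print(np.var(x @ v, ddof=1))            # Var(v^T x); agrees up to sampling noise
```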
Maximum likelihood
• Suppose we have a model with parameters θ and data samples X; we want to find the most likely value of the parameters: argmax_θ P(θ | X)

Using Bayes' rule:

P(θ | X) = P(X | θ) P(θ) / P(X)

• P(X) does not depend on θ, and P(θ) is assumed constant (uninformative prior), so maximizing P(θ | X) amounts to maximizing P(X | θ).

• The probability of the data given the parameters, P(X | θ), is called the likelihood.
Numerical Optimization and Negative log-likelihood
• Instead of finding argmax_θ P(X | θ) we can find argmax_θ log P(X | θ), since log(x) is a monotonically increasing function:

argmax_θ log P(X | θ) = argmin_θ (− log P(X | θ))

• In information theory, entropy is the amount of randomness in a random variable.

• If we take the negative log-likelihood and divide by the number of samples n, we get the cross-entropy (a common way to measure classification performance).
• Due to independence assumptions, most probabilities we see in ML are products of individual probabilities:

P(X | θ) = ∏_{i=1}^{n} p(x_i | θ)
• Using the product rule to compute the derivative:

∂/∂θ ∏_{i=1}^{n} p(x_i | θ) = ∑_{i=1}^{n} ( ∂p(x_i | θ)/∂θ · ∏_{j≠i} p(x_j | θ) )

• This needs on the order of n(n − 1) multiplications, so it is roughly quadratic in the number of inputs (inefficient).
• Instead we can use the negative log-likelihood:

− log P(X | θ) = − ∑_{i=1}^{n} log p(x_i | θ)

• Computing its derivative:

∂/∂θ (− log P(X | θ)) = − ∑_{i=1}^{n} (1 / p(x_i | θ)) · ∂p(x_i | θ)/∂θ

• This needs n divisions and n sums → linear time (see the sketch below).
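A minimal sketch of this linear-time computation, using a Bernoulli (coin-flip) model with parameter θ chosen only for illustration; the data below are made up:

```python
import numpy as np

x = np.array([1, 0, 1, 1, 0, 1, 1, 1])  # hypothetical observed coin flips
theta = 0.6                              # candidate value of P(heads)

# The NLL and its derivative are each a single sum over the n samples.
nll = -np.sum(x * np.log(theta) + (1 - x) * np.log(1 - theta))
dnll_dtheta = -np.sum(x / theta - (1 - x) / (1 - theta))
print(nll, dnll_dtheta)
```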


Example
• Given X = {x_i}_{i=1}^{n}, a random sample from an exponential distribution with parameter λ > 0, which has the p.d.f.:

p(x) = λ e^{−λx}

The likelihood is:

L(X | λ) = ∏_{i=1}^{n} p(x_i | λ) = ∏_{i=1}^{n} λ e^{−λ x_i} = λ^n e^{−λ ∑_{i=1}^{n} x_i}
We want to find the maximum likelihood estimate. Take the log-likelihood, differentiate with respect to λ, and set the derivative to 0:

log L(X | λ) = n log λ − λ ∑_{i=1}^{n} x_i

d/dλ log L(X | λ) = n/λ − ∑_{i=1}^{n} x_i = 0

The maximum likelihood estimate is λ̂ = n / ∑_{i=1}^{n} x_i.
• Given X = [2.7, 4.9, 0.2, 4.9, 4.4, 18.7, 1.5, 0.9, 10.5, 1.3] following an exponential distribution (n = 10)

→ The MLE is λ̂ = n / ∑ x_i = 10 / 50.0 = 0.2
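A minimal sketch (assuming NumPy) computing this estimate on the data above:

```python
import numpy as np

# lambda_hat = n / sum(x_i) = 1 / mean(x_i)
X = np.array([2.7, 4.9, 0.2, 4.9, 4.4, 18.7, 1.5, 0.9, 10.5, 1.3])
lam_hat = len(X) / X.sum()
print(lam_hat)                          # 10 / 50.0 = 0.2
```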


Maximum likelihood for continuous variables
• For continuous variables, the probability of observing exactly the values x_i is zero, so we compute the probability of each sample falling within a small range ε:

P(X | θ) ≈ ∏_{i=1}^{n} p(x_i | θ) · ε

• Take the negative log of this:

− log P(X | θ) ≈ − ∑_{i=1}^{n} log p(x_i | θ) − n log(ε)

• Again, −n log(ε) does not depend on θ

• We only need to optimize − ∑_{i=1}^{n} log p(x_i | θ), the same objective as before.