
Introductory HIV/AIDS Data Analysis

Workshop
Pham Ngoc Thach University of Medicine

Rishi Chakraborty1,2
1Center for AIDS Research, Duke University, North Carolina, USA
2Department of Biostatistics and Bioinformatics, Duke University, North Carolina, USA

March 19-20, 2024

Biostatistics Review

MODELS

Relate a dependent variable (also called the outcome or response), Y, to some other variable(s).

These other variable(s) are called independent or regressor variables, X's.

LINEAR MODELS

Y is Normally distributed: Y ~ N( µ , σ² )
If all the independent variables are numeric, we have regression.

If all the independent variables are categorical, we have analysis of variance.

If we have both types of independent variables, we have analysis of covariance.
Exceptions

Correlation - not a model at all

Logistic regression - dichotomous outcome

Poisson regression - count outcome

The sciences do not try to explain, they hardly
even try to interpret, they mainly make
models. By a model is meant a mathematical
construct which, with the addition of certain
verbal interpretations, describes observed
phenomena. The justification of such a
mathematical construct is solely and precisely
that it is expected to work.

John von Neumann


Population (parameters, N obs.)

Probability reasons from the population down to the sample; inferential statistics reasons from the sample back up to the population.

Sample (statistics, n obs.)
Statistics

Sample mean: X̄ = ( Σ Xi ) / n   (sum over i = 1, …, n)

Sample median

Sample variance: s² = Σ (Xi − X̄)² / (n − 1)

Sample standard deviation: s = √s²
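As a quick sketch (the data values below are made up for illustration), these sample statistics can be computed directly, e.g. in Python:

```python
# Sketch: computing the sample statistics defined above.
import math
import statistics

data = [4.0, 7.0, 5.0, 9.0, 6.0]  # hypothetical sample
n = len(data)

mean = sum(data) / n                                     # X̄ = (Σ Xi) / n
median = statistics.median(data)                         # middle ordered value
variance = sum((x - mean) ** 2 for x in data) / (n - 1)  # s², divisor n − 1
sd = math.sqrt(variance)                                 # s = √s²

# Python's statistics.variance / statistics.stdev use the same n − 1 divisor:
assert abs(variance - statistics.variance(data)) < 1e-12
```

Note the n − 1 divisor in the sample variance; using n instead would give the (biased) population formula.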
Probability

0 ≤ p ≤ 1

Discrete Distributions

Binomial - number of successes in n trials

Poisson - number of events in an interval
Continuous Distributions

Normal: Y ~ N( µ , σ² )

[Figure: Normal density curve, with µ−σ, µ, and µ+σ marked on the horizontal axis]
A standard Normal distribution is one where µ = 0 and σ² = 1. This is denoted by Z.

Z ~ N(0 , 1)

[Figure: standard Normal density curve, horizontal axis from -3 to 3]
Table A of the statistical tables gives cumulative probabilities for a standard Normal distribution, e.g. P(Z < 1.27).

[Figure: standard Normal curve with the area below 1.27 shaded]
Table A (continued)

Cumulative Probabilities for the Standard Normal (Z) Distribution

Z .00 .01 .02 .03 .04 .05 .06 .07 .08 .09 Z

0.00 .5000 .5040 .5080 .5120 .5160 .5199 .5239 .5279 .5319 .5359 0.00
0.10 .5398 .5438 .5478 .5517 .5557 .5596 .5636 .5675 .5714 .5753 0.10
0.20 .5793 .5832 .5871 .5910 .5948 .5987 .6026 .6064 .6103 .6141 0.20
0.30 .6179 .6217 .6255 .6293 .6331 .6368 .6406 .6443 .6480 .6517 0.30
0.40 .6554 .6591 .6628 .6664 .6700 .6736 .6772 .6808 .6844 .6879 0.40

0.50 .6915 .6950 .6985 .7019 .7054 .7088 .7123 .7157 .7190 .7224 0.50
0.60 .7257 .7291 .7324 .7357 .7389 .7422 .7454 .7486 .7517 .7549 0.60
0.70 .7580 .7611 .7642 .7673 .7704 .7734 .7764 .7794 .7823 .7852 0.70
0.80 .7881 .7910 .7939 .7967 .7995 .8023 .8051 .8078 .8106 .8133 0.80
0.90 .8159 .8186 .8212 .8238 .8264 .8289 .8315 .8340 .8365 .8389 0.90
1.00 .8413 .8438 .8461 .8485 .8508 .8531 .8554 .8577 .8599 .8621 1.00

1.10 .8643 .8665 .8686 .8708 .8729 .8749 .8770 .8790 .8810 .8830 1.10
1.20 .8849 .8869 .8888 .8907 .8925 .8944 .8962 .8980 .8997 .9015 1.20
1.30 .9032 .9049 .9066 .9082 .9099 .9115 .9131 .9147 .9162 .9177 1.30
1.40 .9192 .9207 .9222 .9236 .9251 .9265 .9279 .9292 .9306 .9319 1.40
1.50 .9332 .9345 .9357 .9370 .9382 .9394 .9406 .9418 .9429 .9441 1.50
From Table A, P(Z < 1.27) = .8980
For other Normal distributions, we can convert to a standard Normal by standardizing:

Z = (Y − µ) / σ ~ N(0 , 1)

Example: Y = diastolic blood pressure, Y ~ N(77 , 11.6²)

P(Y < 60) = P( Z < (60 − 77) / 11.6 )
          = P(Z < -1.47) = .0708
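This calculation can also be sketched in code, using the identity Φ(z) = ½(1 + erf(z/√2)) for the standard Normal CDF (pure standard-library Python, no table lookup):

```python
# Sketch: P(Y < y) for a Normal Y via standardization and the error function.
import math

def norm_cdf(y, mu=0.0, sigma=1.0):
    """P(Y < y) for Y ~ N(mu, sigma^2)."""
    z = (y - mu) / sigma                              # standardize
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

# Diastolic blood pressure example: Y ~ N(77, 11.6^2)
p = norm_cdf(60, mu=77, sigma=11.6)
print(round(p, 4))
```

The exact answer (about .0714) differs slightly from the slide's .0708 because the slide rounds z to -1.47 before consulting Table A.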
Other Distributions

t
- one parameter, called the df
- similar to a Z, but with "fatter tails"
- specific percentiles are in Table B

Example: find t(12),.95
Table B
Percentiles of the t-Distribution

df t.60 t.70 t.80 t.90 t.95 t.975 t.99 t.995 t.9995

1 0.325 0.727 1.376 3.078 6.314 12.706 31.821 63.657 636.619


2 0.289 0.617 1.061 1.886 2.920 4.303 6.965 9.925 31.599
3 0.277 0.584 0.978 1.638 2.353 3.182 4.541 5.841 12.924
4 0.271 0.569 0.941 1.533 2.132 2.776 3.747 4.604 8.610
5 0.267 0.559 0.920 1.476 2.015 2.571 3.365 4.032 6.869
6 0.265 0.553 0.906 1.440 1.943 2.447 3.143 3.707 5.959
7 0.263 0.549 0.896 1.415 1.895 2.365 2.998 3.499 5.408
8 0.262 0.546 0.889 1.397 1.860 2.306 2.896 3.355 5.041
9 0.261 0.543 0.883 1.383 1.833 2.262 2.821 3.250 4.781
10 0.260 0.542 0.879 1.372 1.812 2.228 2.764 3.169 4.587
11 0.260 0.540 0.876 1.363 1.796 2.201 2.718 3.106 4.437
12 0.259 0.539 0.873 1.356 1.782 2.179 2.681 3.055 4.318
13 0.259 0.538 0.870 1.350 1.771 2.160 2.650 3.012 4.221
14 0.258 0.537 0.868 1.345 1.761 2.145 2.624 2.977 4.140
15 0.258 0.536 0.866 1.341 1.753 2.131 2.602 2.947 4.073

From Table B, t(12),.95 = 1.782

For "lower tail" values, t(df),α = -t(df),1-α
χ²
- one parameter, called the df
- specific percentiles are in Table C

F
- two parameters, called the numerator df and the denominator df
- specific percentiles are in Tables D1 – D3
Sampling Distributions

The mean of a sampling distribution is called the expected value of the statistic.

The standard deviation of a sampling distribution is called the standard error of the statistic.
Sampling Distribution of X̄

E(X̄) = µ

Var(X̄) = σ²/n  ⇒  s.e.(X̄) = σ/√n

If X ~ N(µ , σ²), then X̄ ~ N(µ , σ²/n)

⇒  (X̄ − µ) / (σ/√n) ~ N(0 , 1)
Central Limit Theorem

For n sufficiently large, the sampling distribution of X̄ is at least approximately Normal for any underlying distribution!

(X̄ − µ) / (σ/√n) ~ N(0 , 1)  (approximately)
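A small simulation sketch (my own illustration, not from the slides) makes the theorem concrete: even for a strongly skewed population such as Exponential(1), the sample means cluster around µ with spread close to σ/√n.

```python
# Sketch: simulate the sampling distribution of the mean for a
# non-Normal population (Exponential with rate 1, so µ = σ = 1).
import math
import random

random.seed(1)
n = 50            # sample size
reps = 20_000     # number of samples drawn

# Draw many samples and record each sample mean
means = [sum(random.expovariate(1.0) for _ in range(n)) / n
         for _ in range(reps)]

avg = sum(means) / reps                                   # should be near µ = 1
se = math.sqrt(sum((m - avg) ** 2 for m in means) / (reps - 1))

# Theory: E(X̄) = µ = 1 and s.e.(X̄) = σ/√n = 1/√50 ≈ 0.141
print(round(avg, 3), round(se, 3))
```

Increasing n makes the histogram of `means` look ever more Normal, even though the underlying exponential population is far from it.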
Statistical Inference
- Estimation
- Hypothesis Testing

A point estimate is a single statistic that is used to estimate a population parameter.

We can also estimate a parameter by a 100(1-α)% confidence interval. This has a probability of "capture" of (1-α).
[Figure: many confidence intervals from repeated samples, plotted against a vertical line at µ; 100(1-α)% of these intervals will capture the parameter (µ)]
Form of most confidence intervals:

point estimate ± (table value)(std. error)

A 100(1-α)% C. I. for µ is:

X̄ ± t(n−1),1−α/2 · s/√n
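As a sketch (the data are made up), a 95% interval for µ from a hypothetical sample of n = 13 uses the Table B critical value t(12),.975 = 2.179:

```python
# Sketch: 95% confidence interval for µ via  X̄ ± t(n-1),.975 · s/√n
import math
import statistics

data = [72, 81, 77, 69, 85, 74, 78, 80, 71, 76, 83, 75, 79]  # hypothetical
n = len(data)
xbar = statistics.mean(data)
s = statistics.stdev(data)    # sample sd, n − 1 divisor as defined earlier
t_crit = 2.179                # t(12),.975 from Table B (df = n − 1 = 12)

half_width = t_crit * s / math.sqrt(n)
ci = (xbar - half_width, xbar + half_width)
print(ci)
```

For a 90% interval one would instead use t(12),.95 = 1.782; the table value always cuts off α/2 in each tail.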
Hypothesis Testing

Test a null hypothesis, H0, against an alternative hypothesis, H1.

Two possible decisions:

- Reject H0 (in favor of H1)

- Fail to reject H0
                          TRUTH
DECISION             H0 true          H1 true

Reject H0            Type I error     correct

Fail to reject H0    correct          Type II error

α = P(Type I error) = P(Reject H0 | H0 true)

α is the significance level of the test

β = P(Type II error) = P(Fail to reject H0 | H1 true)

Power = P(Reject H0 | H1 true) = 1 - β
p-values

The probability of getting a test statistic at least as "extreme" (in the direction stated by H1) as the one observed.

Reject H0 if the p-value < α.
Hypothesis Testing Steps
1) Determine hypotheses
2) Decide on α ( .01 , .05 , .10 )
3 & 4) State rejection region, calculate test statistic
   (or)
   Calculate test statistic and p-value
5) Make decision (reject or not reject)
6) Write conclusions (interpret results), in the context of the problem
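The steps above can be sketched for a one-sample z-test with known σ (all numbers here are hypothetical, reusing the blood-pressure setting from the standardization example):

```python
# Sketch: the hypothesis-testing steps applied to a one-sample z-test.
import math

def norm_cdf(z):
    # Standard Normal CDF via the error function
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

# 1) Hypotheses: H0: µ = 77  vs  H1: µ ≠ 77 (two-sided)
mu0, sigma = 77.0, 11.6
# 2) Significance level
alpha = 0.05
# 3 & 4) Test statistic and p-value from a hypothetical sample
xbar, n = 72.5, 40
z = (xbar - mu0) / (sigma / math.sqrt(n))
p_value = 2 * (1 - norm_cdf(abs(z)))     # two-sided p-value
# 5) Decision
reject = p_value < alpha
print(round(z, 2), round(p_value, 4), reject)
# 6) Conclusion is then stated in context, e.g. whether mean diastolic
#    blood pressure differs from 77.
```

When σ is unknown, the same steps apply with s in place of σ and the t(n−1) distribution (Table B) in place of Z.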