Nominal Variables Tests and Outcome Measures - Lecture 4

The document discusses different types of variables that can be used in statistical tests and different methods for comparing categorical variables between groups, including the chi-square test. It also covers calculating odds ratios to quantify the strength of association between variables and factors like genotype and disease outcomes. Cumulative probabilities are introduced as a way to incorporate survival data over time into statistical analyses when observations may be incomplete.

Uploaded by

black.hadi194

Nominal variables tests and outcome measures

Department of Biostatistics and Translational Medicine

What are the principal types of variables?
• Continuous
– Everything that can be measured
• Ordinal
– Everything that can be ranked/ordered
• Categorical
– Everything that can be grouped
How can one compare categorical variables between groups?
• Do men get diabetes more often than women?
A basic test for proportions – the Chi² test
• The test is used to determine whether two variables are associated in such a way that certain combinations of values occur more often than expected by chance
Converting Chi-square values to p
How does Chi² work?
• It calculates the expected number of state/class combinations
• Afterwards it calculates the deviation of the observed values from the expected ones
• If the deviations are large, the test rejects the null hypothesis

H0 – the observed values do not deviate from expected ones

HA – the observed values deviate from the expected ones
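The steps above correspond to the usual Pearson statistic. As a sketch in standard notation (the symbols O, E, R, C and N are assumptions introduced here, not taken from the slides): O_ij is the observed count in row i and column j, R_i and C_j are the row and column totals, and N is the grand total.

```latex
E_{ij} = \frac{R_i \, C_j}{N}, \qquad
\chi^2 = \sum_{i}\sum_{j} \frac{(O_{ij} - E_{ij})^2}{E_{ij}}
```

For a table with r rows and c columns the statistic is compared against a Chi-square distribution with (r-1)(c-1) degrees of freedom, so a 2x2 table has 1 degree of freedom.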
How to compare categorical variables?
               Diabetes          Without diabetes   Total
FTO [CC]       1500 (15.00%)     8500 (85.00%)      10000
FTO [nonCC]    1300 (13.00%)     8700 (87.00%)      10000
Total          2800              17200              20000

How should this table look if there was no association between FTO genotype and diabetes?
Expected distribution
                          Diabetes   Without diabetes   Total
FTO [CC]      observed    1500       8500
              expected    1400       8600               10000
FTO [nonCC]   observed    1300       8700
              expected    1400       8600               10000
Total                     2800       17200              20000
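The expected counts and the resulting test can be reproduced with a few lines of code. The following is a minimal sketch using only the Python standard library; the function name `chi_square_2x2` is hypothetical, and the closed-form p-value is valid only for 1 degree of freedom (a 2x2 table).

```python
import math

def chi_square_2x2(table):
    """Pearson chi-square test for a 2x2 table given as [[a, b], [c, d]]."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    # Expected count under H0 (no association): row total * column total / N
    expected = [[r * c / n for c in col_totals] for r in row_totals]
    chi2 = sum(
        (table[i][j] - expected[i][j]) ** 2 / expected[i][j]
        for i in range(2)
        for j in range(2)
    )
    # For 1 degree of freedom the upper-tail p-value has a closed form
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return expected, chi2, p_value

# Rows: FTO [CC], FTO [nonCC]; columns: diabetes, without diabetes
observed = [[1500, 8500], [1300, 8700]]
expected, chi2, p_value = chi_square_2x2(observed)
print(expected)         # expected counts under H0
print(round(chi2, 2))   # chi-square statistic
print(p_value < 0.001)  # whether the result agrees with the slide's p<0.001
```

The expected counts match the 1400 and 8600 shown in the table above.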
Outcome
               Diabetes          Without diabetes   Total
FTO [CC]       1500 (15.00%)     8500 (85.00%)      10000
FTO [nonCC]    1300 (13.00%)     8700 (87.00%)      10000
Total          2800              17200              20000
p<0.001
The distribution of cell counts deviates significantly from the expected distribution. Considering that there were more patients with diabetes among CC homozygotes than among patients with other genotypes, we conclude that being homozygous predisposes to diabetes
When not to use the Chi² test?
• Very small groups
– Use a Fisher’s exact test instead
• Paired observations (the same individual evaluated twice – before and after an intervention)
Alternatives to the Chi-square test
Fisher’s exact test
A permutational test that enumerates all possible tables with the same marginal sums and checks whether the observed table falls among the most extreme 5% of them.
Typically used if the 2x2 table contains values <5

Yates’ corrected Chi-square test (continuity correction)
Used to prevent overestimation of statistical significance for small numbers of observations
Typically used if the 2x2 table contains values <15
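Both alternatives are short enough to write out. The sketch below uses only the Python standard library; the function names are hypothetical. It assumes the common two-sided definition of Fisher's p-value (the sum of the probabilities of all tables no more likely than the observed one) and the usual Yates formula for a 2x2 table. Neither convention is spelled out in the slides.

```python
import math

def fisher_exact_two_sided(table):
    """Two-sided Fisher's exact test for a 2x2 table [[a, b], [c, d]]."""
    (a, b), (c, d) = table
    row1, row2 = a + b, c + d
    col1 = a + c
    n = row1 + row2

    def prob(x):
        # Hypergeometric probability of x in the top-left cell, margins fixed
        return math.comb(row1, x) * math.comb(row2, col1 - x) / math.comb(n, col1)

    p_observed = prob(a)
    lo = max(0, col1 - row2)
    hi = min(row1, col1)
    probs = [prob(x) for x in range(lo, hi + 1)]
    # Sum the probabilities of all tables no more likely than the observed one
    p_value = sum(p for p in probs if p <= p_observed * (1 + 1e-9))
    return min(1.0, p_value)

def yates_chi_square(table):
    """Chi-square test with Yates' continuity correction for a 2x2 table."""
    (a, b), (c, d) = table
    n = a + b + c + d
    # The correction shrinks |ad - bc| by N/2 (never below zero)
    numerator = n * max(0.0, abs(a * d - b * c) - n / 2) ** 2
    chi2 = numerator / ((a + b) * (c + d) * (a + c) * (b + d))
    return chi2, math.erfc(math.sqrt(chi2 / 2))

# Illustrative counts: rows are two groups, columns are no event / event
print(fisher_exact_two_sided([[218, 16], [225, 2]]))
print(yates_chi_square([[218, 6], [225, 6]]))
```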
Which one?
             Hypoglycemia - 0   Hypoglycemia - 1   Row totals
MDI          218                6                  224
  Column %   49.21%             50.00%
  Row %      97.32%             2.68%
CSII         225                6                  231
  Column %   50.79%             50.00%
  Row %      97.40%             2.60%
Totals       443                12                 455
Which one?
             Hypoglycemia - 0   Hypoglycemia - 1   Row totals
MDI          218                16                 234
  Column %   49.21%             88.89%
  Row %      93.16%             6.84%
CSII         225                2                  227
  Column %   50.79%             11.11%
  Row %      99.12%             0.88%
Totals       443                18                 461
Which one?
             Hypoglycemia - 0   Hypoglycemia - 1   Row totals
MDI          218                16                 234
  Column %   49.21%             50.00%
  Row %      93.16%             6.84%
CSII         225                16                 241
  Column %   50.79%             50.00%
  Row %      93.36%             6.64%
Totals       443                32                 475
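The rule of thumb from the "Alternatives to the Chi-square test" slide can be written as a small helper. This is only a sketch: the function name `suggest_test` is hypothetical, and it applies the thresholds to the observed cell counts, as the slides phrase it. Many textbooks apply similar thresholds to the expected counts instead.

```python
def suggest_test(table):
    """Suggest a test for a 2x2 table using the lecture's rule of thumb."""
    smallest = min(min(row) for row in table)
    if smallest < 5:
        return "Fisher's exact test"
    if smallest < 15:
        return "Yates' corrected chi-square test"
    return "chi-square test"

# The three hypoglycemia tables above (rows: MDI, CSII)
for table in ([[218, 6], [225, 6]], [[218, 16], [225, 2]], [[218, 16], [225, 16]]):
    print(table, "->", suggest_test(table))
```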
Matched pairs test for nominal variables
McNemar’s Chi-square test
Works by comparing the numbers of divergent (discordant) pairs, b and c

                                  Depression according to DSM IV   Without depression in DSM IV
Depression according to ICD-10    100 (a)                          20 (b)
Without depression in ICD-10      10 (c)                           1500 (d)
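For the table above, only the discordant cells b and c enter the calculation. The sketch below assumes the standard McNemar statistic (b - c)² / (b + c) with 1 degree of freedom; the function name and the optional continuity correction are additions for illustration and are not taken from the slides.

```python
import math

def mcnemar(b, c, continuity_correction=False):
    """McNemar's chi-square test from the two discordant cell counts."""
    diff = abs(b - c)
    if continuity_correction:
        # Commonly used correction for small numbers of discordant pairs
        diff = max(0, diff - 1)
    chi2 = diff ** 2 / (b + c)
    # p-value for a chi-square statistic with 1 degree of freedom
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value

chi2, p_value = mcnemar(20, 10)  # b = 20, c = 10 from the DSM IV / ICD-10 table
print(round(chi2, 2), round(p_value, 3))
```

The concordant cells (a = 100 and d = 1500) do not affect the result, however large they are.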
But what about comparing the effect?
Are p values enough or can we do better?

                     Diabetes   Without diabetes   Total
FTO [CC]             1500       8500               10000
FTO [nonCC]          1300       8700               10000
Total                2800       17200              20000
p<0.0001

                     Diabetes   Without diabetes   Total
INS 5’VNTR [CC]      800        1200               2000
INS 5’VNTR [nonCC]   650        1350               2000
Total                1450       2550               4000
p<0.0001

Which of these variants exerts a stronger biological effect?

Odds ratio
• Odds – the probability of the event of interest divided by the probability of it not occurring: p/(1-p)
– An odds of 1 corresponds to an equal probability of survival and failure
• An odds ratio (p1/(1-p1))/(p2/(1-p2)) thus shows the relative odds of an event of interest occurring depending on the examined variable
– For example, having a risk allele may lead to an OR of 1.2 for getting diabetes, which means that the odds of becoming diabetic are 1.2 times higher in carriers than in non-carriers
Odds ratio calculations

               Diabetes   Without diabetes   Total
FTO [CC]       1500       8500               10000
FTO [nonCC]    1300       8700               10000
Total          2800       17200              20000

$$\mathrm{OR} = \frac{p_1/(1-p_1)}{p_2/(1-p_2)} = \frac{1500/8500}{1300/8700} = \frac{0.1765}{0.1494} = 1.18$$
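The same calculation can be applied to both genotype tables from this lecture. The sketch below also adds a 95% confidence interval using the log (Woolf) method. That method is an assumption here, because the slides do not say how their intervals are obtained, and the function name is hypothetical.

```python
import math

def odds_ratio(a, b, c, d, z=1.96):
    """Odds ratio with an approximate 95% CI.

    a, b -- events and non-events in the first group
    c, d -- events and non-events in the second group
    """
    estimate = (a / b) / (c / d)
    # Standard error of log(OR), Woolf's method (assumed, not from the slides)
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(estimate) - z * se)
    upper = math.exp(math.log(estimate) + z * se)
    return estimate, lower, upper

print(odds_ratio(1500, 8500, 1300, 8700))  # FTO [CC] vs [nonCC]
print(odds_ratio(800, 1200, 650, 1350))    # INS 5'VNTR [CC] vs [nonCC]
```

Comparing the two point estimates, rather than the two p-values, is one way to approach the question of which variant has the stronger effect.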
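Interpreting OR and RR
Scale: values between 0 and 1 indicate a protective effect, 1 indicates no effect, values above 1 indicate a detrimental effect
• OR=1.2 95%CI 0.7 – 1.7, p>0.05 (the interval includes 1)
• OR=0.6 95%CI 0.2 – 1.0, p=0.05 (the interval just reaches 1)
• OR=1.2 95%CI 1.1 – 1.3, p<0.05 (the interval excludes 1)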
Other tools used in expressing effects’ strength
• Relative risk – the ratio of the probabilities of an event occurring in the exposed and control groups. Typically used in RCTs, as it requires a good baseline probability estimate provided by a placebo-treated group

• Hazard ratio – the ratio of the event rates (hazards) in the exposed and control groups, taking into account the observation time

• Number needed to treat – the average number of patients who need to be treated to prevent one additional bad outcome

http://www.cebm.net/glossary/
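Relative risk and number needed to treat follow directly from the two group risks. The sketch below uses a hypothetical trial (80 events among 2000 treated patients and 100 among 2000 controls); the numbers and the function name are illustrative assumptions only. It takes the number needed to treat as the reciprocal of the absolute risk reduction.

```python
def risk_measures(events_treated, n_treated, events_control, n_control):
    """Relative risk, absolute risk reduction and number needed to treat."""
    risk_treated = events_treated / n_treated
    risk_control = events_control / n_control
    relative_risk = risk_treated / risk_control
    # Absolute risk reduction: how much the treatment lowers the event risk
    arr = risk_control - risk_treated
    # NNT is defined only when the treatment reduces the risk
    nnt = 1 / arr if arr > 0 else float("inf")
    return relative_risk, arr, nnt

rr, arr, nnt = risk_measures(80, 2000, 100, 2000)  # hypothetical trial counts
print(round(rr, 2), round(arr, 3), round(nnt))
```

In practice the number needed to treat is usually reported rounded up to a whole patient.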
What to do when talking about survival?
                     Dead due to myocardial infarction   Alive   Total
INS 5’VNTR [CC]      100                                 1900    2000
INS 5’VNTR [nonCC]   80                                  1920    2000
Total                180                                 3820    4000

What is missing? Humans don’t live forever
Cumulative probability – a way to incorporate incomplete observations into the analysis
• The product of the probabilities of remaining event-free at each successive timepoint in an observation lasting t epochs

• The cumulative probability reflects the probability of the event occurring while accounting for observations that drop out of the analysis for various reasons:
• Surviving past the end of observation

• Leaving the study for non-event reasons
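Written as a formula, the cumulative probability of remaining event-free is a product over epochs. The notation is an assumption introduced here for illustration: d_i is the number of events in epoch i and n_i is the number of subjects still under observation at the start of that epoch, so dropouts reduce n_i without counting as events.

```latex
S(t_k) = \prod_{i=1}^{k} \left( 1 - \frac{d_i}{n_i} \right)
```

For example, if 10% of the subjects at risk have the event in the first epoch and 20% of those still observed have it in the second, the cumulative probability of being event-free after two epochs is 0.9 × 0.8 = 0.72.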

A potential cancer trial – what is the probability of surviving 5 years since diagnosis?
[Figure: patient timelines; R – Relapse, D – Death]
Converting the probabilities into a database with individual starting points
[Figure: patient timelines; R – Relapse, D – Death]
Objectives of survival analysis
• To estimate time-to-event for a group of individuals, such as time until second heart attack (MI) for a group of MI patients.
• To compare time-to-event between two or more groups, such as treated vs. placebo MI patients in a randomized controlled trial.
• To assess the relationship of co-variables to time-to-event, such as: does weight, insulin resistance, or cholesterol influence survival time of MI patients?
Why not...?
1. Why not compare mean time-to-event between your groups using a t-test or linear regression?
→ ignores censoring

2. Why not compare proportion of events in your groups using risk/odds ratios or logistic regression?
→ ignores time
Terms used in survival analysis
Time-to-event:
The time from entry into a study until a subject has a particular outcome (t_i = time at last disease-free observation or time at event)

Censoring:
Subjects are said to be censored if they are lost to follow-up or drop out of the study, or if the study ends before they die or have an outcome of interest. They are counted as alive or disease-free for the time they were enrolled in the study. (c_i = 1 if the subject had the event; c_i = 0 if no event by time t_i)
Kaplan-Meier curves
• We take the time to event into account rather than just the event’s presence and group assignment
• The database needs three variables (time to event, event status and group assignment)
– Complete observations are ones in whom the event occurred
• They impact survival curves by reducing the estimated probability of survival

– Censored observations are ones that dropped out of the analysis without the event
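A Kaplan-Meier curve for one group can be computed from two of these variables, the follow-up time and the event status. The sketch below is a minimal standard-library version; the function name and the small data set are hypothetical, and subjects censored at the same time as an event are treated as still at risk at that time, which is the usual convention.

```python
def kaplan_meier(times, events):
    """Return [(time, survival probability)] at each time with an event.

    times  -- follow-up time of each subject
    events -- 1 if the event occurred at that time, 0 if censored
    """
    survival = 1.0
    curve = []
    event_times = sorted({t for t, e in zip(times, events) if e == 1})
    for t in event_times:
        # Subjects still under observation just before time t
        at_risk = sum(1 for x in times if x >= t)
        # Complete observations (events) exactly at time t
        occurred = sum(1 for x, e in zip(times, events) if x == t and e == 1)
        survival *= 1 - occurred / at_risk
        curve.append((t, survival))
    return curve

# Hypothetical follow-up times (e.g. months) and event indicators
times = [1, 2, 2, 3, 4]
events = [1, 1, 0, 1, 0]
for t, s in kaplan_meier(times, events):
    print(t, round(s, 2))
```

Censored subjects never lower the curve themselves; they only reduce the number at risk for later events.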

Interpreting Kaplan-Meier curves
[Figure: Kaplan-Meier curves annotated with the log-rank test result (P<0.05), the probability of the outcome, the median survival time, individual observation times and censored observations]

Furman R, et al. Idelalisib and Rituximab in Relapsed Chronic Lymphocytic Leukemia. N Engl J Med. Jan 2014
Log-rank test
H0 – no difference between the survival functions of the two groups

A log-rank test creates 2x2 tables at each event time and combines the results across the tables

It provides a χ² statistic with 1 degree of freedom (for a two-group comparison) and a p-value.

When the p-value is <0.05 we can conclude that there is a significant difference in survival time, e.g. in the treated group compared to the untreated one.
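The procedure described above can be sketched for two groups as follows. At each event time the code compares the observed number of events in the first group with the number expected if both groups shared one survival function, then combines the differences into a single statistic with 1 degree of freedom. The function name and the data are hypothetical, and only the standard library is used.

```python
import math

def log_rank_test(times_a, events_a, times_b, events_b):
    """Two-group log-rank test; returns (chi-square statistic, p-value)."""
    event_times = sorted(
        {t for t, e in zip(times_a, events_a) if e}
        | {t for t, e in zip(times_b, events_b) if e}
    )
    observed_a = expected_a = variance = 0.0
    for t in event_times:
        # Numbers at risk and numbers of events at time t in each group
        n_a = sum(1 for x in times_a if x >= t)
        n_b = sum(1 for x in times_b if x >= t)
        d_a = sum(1 for x, e in zip(times_a, events_a) if x == t and e)
        d_b = sum(1 for x, e in zip(times_b, events_b) if x == t and e)
        n, d = n_a + n_b, d_a + d_b
        observed_a += d_a
        expected_a += d * n_a / n
        if n > 1:
            # Hypergeometric variance of the 2x2 table at this event time
            variance += d * n_a * n_b * (n - d) / (n ** 2 * (n - 1))
    if variance == 0:
        return 0.0, 1.0
    chi2 = (observed_a - expected_a) ** 2 / variance
    return chi2, math.erfc(math.sqrt(chi2 / 2))

# Hypothetical data: every subject had the event, group A earlier than group B
chi2, p_value = log_rank_test([1, 2, 3, 4, 5], [1] * 5, [6, 7, 8, 9, 10], [1] * 5)
print(round(chi2, 2), p_value < 0.05)
```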
Examples of using K-M curves

Hunger SP, Mullighan CG. Acute Lymphoblastic Leukemia in Children. N Engl J Med. 2015 Oct 15;373(16):1541-52. doi: 10.1056/NEJMra1400972.
Limitations of Kaplan-Meier method
• Mainly descriptive

• Requires categorical predictors

• Survival estimates can be unreliable toward the end of a study when there are
small numbers of subjects at risk of having an event

• Doesn’t control for covariates

• Can’t accommodate time-dependent variables

Comparing the impact of variables on the probability of survival
• For univariate comparisons (single variable divides the whole group) we can use the log-rank test
– H0 – cumulative probabilities of survival are equal
– HA – cumulative probabilities of survival are not equal
• What if there are several overlapping variables?
Multivariate analyses
• Can be used for continuous, categorical and time-dependent variables using different methods
• Typically used when multiple variables coexist and overlap and one wants to extract the impact of a single variable free from the confounding effects of others
– Does smoking cause lung cancer or is male sex a more significant risk factor?
Multivariate analysis of survival probabilities
• Cox proportional hazards regression model
• Uses a regression equation (a weighted sum of the variables) to express the relative impact of variables on the probability of survival
• Results are expressed as Hazard Ratios (interpreted similarly to ORs)
• Adjusted HRs represent the impact of a single variable after "cleaning" it of the effects of the other variables in the model
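In its usual textbook form, the Cox model multiplies an unspecified baseline hazard by an exponential function of the covariates. The symbols below are standard notation and are not taken from the slides.

```latex
h(t \mid x_1, \dots, x_p) = h_0(t)\,\exp(\beta_1 x_1 + \dots + \beta_p x_p),
\qquad \mathrm{HR}_j = e^{\beta_j}
```

Here h_0(t) is the baseline hazard and HR_j is the hazard ratio for a one-unit increase in x_j with the other covariates held fixed, which is what "adjusted" means. A coefficient of 0 gives HR = 1 (no effect), a negative coefficient gives HR < 1 and a positive one gives HR > 1.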
Typical results of Cox’ regression
Interpreting HR
Scale: values between 0 and 1 indicate a protective effect, values above 1 indicate a detrimental effect
Thank you for your attention
