STATA Command Summary

This document summarizes common STATA commands used in clinical statistics. It provides commands for data management and exploration, descriptive statistics, hypothesis testing, correlations, regression analysis, and cohort studies. Key tips are included such as checking for normality before parametric tests and using non-parametric alternatives when appropriate. Common plots like histograms, scatter plots, and regression lines are demonstrated. Parameter meanings and assumptions are explained for various tests.

Uploaded by

Silp Satjawattanavimol


### STATA command Summary ### (CMU Basic Clinical Statistic Course 2020)

>> DAY1 <<

Command

help {command} - Help on STATA command

describe +/-{var} - Show details of the data (variable names, storage types, labels)
summarize, sum +/-{var} - For numerical data, describe the characteristics of the data (mean, SD, range). Results for categorical data have no meaning.
sum +/-{var}, detail - Describe more characteristics of the data, including percentiles, median (= 50th percentile), SD, and variance

**TIPS Before analysis

1. Check the range (min, max) to make sure that the data were collected appropriately (e.g. for gender coded 0/1, min must be 0 and max must be 1)
2. Use describe and sum to check that the data are valid before performing any analysis.
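
The checks above can be sketched as follows. This is a minimal illustration that assumes Stata's bundled auto dataset (loaded with sysuse auto); its variables price, mpg and foreign stand in for the course data, which is not shown in this summary.

```stata
* Load an example dataset shipped with Stata (assumption: not the course data)
sysuse auto, clear
* Variable names, storage types and labels
describe
* Mean, SD, min and max: check that the ranges are plausible
summarize price mpg
* Coded categorical variable: min and max should match the coding (0/1 here)
summarize foreign
* Percentiles, median (50th percentile) and variance
summarize price, detail
```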

disp {formula} - Calculator function; eg. disp (1+1)/(2*6), disp sqrt(16), disp ln(10) ; log = ln = natural
logarithm
tab {var} - Create table with frequency, percentage, and cumulative percentage -> Categorical data
tab {var1} {var2} - Create 2x2 table (row = var1, column = var2) (Cross-tabulation)
tab {var1} {var2}, col - Create 2x2 table (row = var1, column = var2) with column percentage **Column percentage is preferred.
tab {var1} {var2}, row - Create 2x2 table (row = var1, column = var2) with row percentage
histogram {var} - Create histogram -> Can be used to evaluate whether the data are normally distributed
histogram {var}, by({var2}) - Create one histogram of var for each group of var2
swilk {var} - Shapiro-Wilk W test for normal data (if not significant -> normal)

**Tests of normality (Kolmogorov-Smirnov, Shapiro-Wilk) - should not be used if n > 40 (they tend to be significant even for trivial departures from a normal distribution)
**Easy ways to confirm normality
1. Eyeball test (Histogram plot)
2. Size of S.D. (< Mean/2 ?)
3. Mean = Median = Mode ?
**Clinical count data: 1. Mostly non-normally distributed, 2. Mostly right-skewed
gen {var1}={var2} - Create new variable var1 with value of var2

recode {var} ({min1}/{max1}={cat1}) … ({minn}/{maxn}={catn})
- Stratify continuous data into categorical data with n strata (Note: each rule is separated by a space). Without gen(), the original variable is overwritten.
recode {var} ({min1}/{max1}={cat1}) … ({minn}/{maxn}={catn}), gen({newgroupname})
- Combine gen and recode commands in 1 line (the original variable is kept)
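
As an illustration, a minimal sketch that assumes the bundled auto dataset; the cut-points and the new variable names mpg_copy and mpg_group are hypothetical choices made only for this example:

```stata
* Example dataset shipped with Stata (assumption: not the course data)
sysuse auto, clear
* Copy an existing variable into a new one
gen mpg_copy = mpg
* Stratify mpg into 3 groups and store the result in a new variable
recode mpg (min/19=1) (20/29=2) (30/max=3), gen(mpg_group)
* Check the result: frequency table of the new groups
tab mpg_group
* Check the cut-points: range of mpg within each group
tabstat mpg, by(mpg_group) statistics(min max n)
```

Using gen() keeps the original mpg intact, which makes it easy to check the strata against the raw values.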
cii means {n} {mean} {SD} - Calculate 95% confidence interval of a mean from the sample size, mean and standard deviation
cii proportions {n} {proportion} - Calculate 95% confidence interval of a proportion

**TIPS Normally do not present the standard error of the mean in a manuscript; use the CI instead.
**Proportion -> STATA will use the binomial (Bernoulli) distribution for a proportion variable -> shows the 95% CI as ‘Binomial Exact’
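
A small worked example of the immediate commands, using hypothetical numbers and assuming a Stata version in which cii takes the means and proportions subcommands, as in this summary:

```stata
* 95% CI of a mean: n = 100, mean = 50, SD = 10 (hypothetical numbers)
cii means 100 50 10
* The standard error used for the interval is SD/sqrt(n) = 10/sqrt(100) = 1
display r(se)
* 95% CI of a proportion: 30 events in 100 subjects (binomial exact)
cii proportions 100 30
```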

tab {var1} {var2}, col chi2 - Create 2x2 table (row = var1, column = var2) with column percentage, analyse with
Chi-square
tab {var1} {var2}, col exact - Create 2x2 table (row = var1, column = var2) with column percentage, analyse with
Fisher’s Exact Probability test
sum {var1} if {var2 + comparator + value}
- Summarize variable with if clause eg. sum a if b==1
ttest {var1}, by({var2}) - Compare the mean of var1 between the 2 groups of var2 using the 2-sample t-test with equal variances
sdtest {var1}, by({var2}) - Variance-ratio test between the 2 groups (checks the equal-variance assumption)
ranksum {var1}, by({var2}) - Compare var1 between the 2 groups of var2 using the Wilcoxon rank-sum test

**T-test can only be used if the data in both groups are normally distributed -> Use histogram to evaluate first! (T-test is a ‘Parametric test’ -> uses parameters, e.g. mean, SD)
**If not normally distributed -> Use a non-parametric test instead: Wilcoxon rank-sum (= Mann-Whitney U test)
**Non-parametric statistics - Do not depend on the mean or SD of the data (but have lower power than parametric tests)
**Using a non-parametric test on normally distributed data is OK, but not preferred because of lower power
**Rank-sum test is a test of rank sums, not a test of medians!
**Conservatively: always use the 2-sided p-value initially (Pr(|T|>|t|)), because we don’t actually know the direction of the difference
**But if we know that the intervention can only produce a difference in 1 direction -> we can use the 1-sided p-value for the expected direction (BUT not recommended)
**H0 = Null hypothesis, Ha = Alternative hypothesis
**Chi-square: use for ‘LARGE’ samples (how large is LARGE is not clearly defined). If the sample size is small or you don’t want any assumption -> use Fisher’s Exact test instead
**Fisher’s Exact test uses a very complex calculation -> very slow with very large samples -> using Chi-square is accepted (results can be assumed equal)
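
The two-group workflow described in these notes can be sketched as follows, assuming the bundled auto dataset (mpg as the continuous variable, foreign as the 2-level grouping variable, rep78 as a categorical variable); none of these come from the course data.

```stata
* Example dataset shipped with Stata (assumption: not the course data)
sysuse auto, clear
* Categorical vs categorical: column percentages, Chi-square and Fisher's exact
tab rep78 foreign, col chi2 exact
* Continuous vs 2 groups: look at the distributions first
histogram mpg, by(foreign)
* Check the equal-variance assumption
sdtest mpg, by(foreign)
* Parametric test (equal variances assumed)
ttest mpg, by(foreign)
* Non-parametric alternative: Wilcoxon rank-sum (Mann-Whitney U)
ranksum mpg, by(foreign)
```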

pwcorr {var1} {var2} - Pearson’s pairwise correlation: - is negative correlation, + is positive, 0 means no linear correlation; the range can only be between -1 and +1. Greater absolute value = greater strength of linear correlation!
pwcorr {var1} {var2}, sig - Pearson’s pairwise correlation with p-value
spearman {var1} {var2} - Spearman’s rho correlation

**Use Pearson’s pairwise correlation in conjunction with scatter plot

**Use correlation analysis for ‘Hypothesis generating purposes’
**Correlation analysis -> cannot be used to determine effect size/slope; it tells only 2 things: 1. Direction, 2. How linear the relationship is
**Use Pearson’s if the data are normally distributed
**Use Spearman’s rho if the data are not normally distributed
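
A minimal sketch of a correlation analysis paired with a scatter plot, assuming the bundled auto dataset and its variables mpg and weight as illustrative stand-ins:

```stata
* Example dataset shipped with Stata (assumption: not the course data)
sysuse auto, clear
* Always look at the scatter plot alongside the coefficient
scatter mpg weight
* Pearson's correlation with p-value
pwcorr mpg weight, sig
* Spearman's rho for data that are not normally distributed
spearman mpg weight
```

In this dataset heavier cars have lower mileage, so both coefficients come out negative.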

oneway {var1} {var2} - Analysis of variance (ANOVA) of var1 across the multiple groups of var2
oneway {var1} {var2}, tab - Create a summary table (mean, SD, frequency) of var1 by var2, and do the ANOVA of var1 across the groups of var2
oneway {var1} {var2}, tab bon - As above, with Bonferroni correction (multiple pairwise comparisons with p-value adjustment)
kwallis {var1}, by({var2}) - Kruskal-Wallis rank test for comparing multiple groups

**ANOVA is like the T-test, with the same assumptions (normal distribution, equal variance -> this command reports Bartlett’s test for equal variances; if significant -> variances are not equal between groups)
**ANOVA is a parametric test
**We don’t do the T-test 3 times instead of a multiple-group analysis -> multiplicity. Some may use Bonferroni p-value correction (but not recommended)
**Multivariable analysis has to include variables that are not statistically significant but differ between groups
**Regression analysis is better than ANOVA, and thus preferred
**Kruskal-Wallis rank test is the non-parametric test for multiple groups. It does not depend on the mean or SD of the data
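
For comparison across more than two groups, a minimal sketch assuming the bundled auto dataset, with rep78 (5 levels) as a hypothetical grouping variable:

```stata
* Example dataset shipped with Stata (assumption: not the course data)
sysuse auto, clear
* ANOVA with summary table and Bonferroni-adjusted pairwise comparisons;
* the output also reports Bartlett's test for equal variances
oneway mpg rep78, tab bon
* Non-parametric alternative: Kruskal-Wallis rank test
kwallis mpg, by(rep78)
```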

Regression plot (Linear plot) : Menu Graphics -> Two-way graph -> Create -> select ‘Fit plot’ -> Linear prediction ->
input X and Y variable -> Submit
Scatter plot : Menu Graphics -> Two-way graph -> Create -> select ‘Basic plot’ -> Scatter plot -> input
X and Y variable -> Submit
regress {var1} {var2} - Do the linear regression of var1 (Y) on var2 (X) and display the constant and coefficient that form the linear formula (Y = a + b(X), a = constant, b = coefficient)
regress {var1} i.{var2} **if var2 is ‘strata’ (group1, group2,…)
- Do the linear regression analysis (Y = base + 0(group0) + Coef1(group1) + Coef2(group2) +…)
regress {var1} i.{var2}, base **if var2 is ‘strata’ (group1, group2,…)
- Do the linear regression analysis (Y = base + 0(group0) + Coef1(group1) + Coef2(group2) +…), and show the base group (group 0)
regress {var1} i.{var2} {var3} …{varn}
- Do the linear regression analysis, adjusted for var3 to varn (var3 to varn have to be linearly associated with var1**)
**Linear regression plot - Draws the line with the lowest sum of squared vertical distances between the line and each data point in the scatter plot (least error)
**Regression analysis = regress to the mean/best line
**regress command = Gaussian regression (Y data has a normal distribution). Non-Gaussian regressions also exist
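
The regression commands can be sketched on the bundled auto dataset; mpg, weight and rep78 are illustrative variables, and the plot is drawn with twoway commands rather than through the menus. The option baselevels is written out in full here; base, as used in this summary, is its abbreviation.

```stata
* Example dataset shipped with Stata (assumption: not the course data)
sysuse auto, clear
* Scatter plot with the fitted regression line overlaid
twoway (scatter mpg weight) (lfit mpg weight)
* Simple linear regression: mpg = a + b*weight
regress mpg weight
* Categorical predictor, showing the base (reference) group
regress mpg i.rep78, baselevels
* Same model adjusted for a continuous covariate
regress mpg i.rep78 weight
```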

>> DAY2 <<

Command

(Cohort study)

drop if {condition} - Drop the observations (rows) that satisfy the if clause

cs {var y} {var x} - Cohort study -> Create 2x2 table, calculate risk ratio, risk difference with 95% CI and
Chi-square test result
cs {var y} {var x}, exact - Cohort study -> Create 2x2 table, calculate risk ratio, risk difference with 95% CI and
Fisher’s exact test result (Use 2-sided exact test result)
csi {value1} {value2} {value3} {value4}
- Cohort study immediate command -> Create table using values 1-4 (exposed cases, unexposed cases, exposed non-cases, unexposed non-cases)
binreg {var1} {var2} {var3} … {var n}, rr
- Do the multivariable regression analysis between var1 and var2 using binary regression, **adjusted for var3 to var n to correct for confounding factors, then calculate RR
**Reporting only the univariable risk ratio from the CS command’s 2x2 epitable is no longer acceptable in a cohort study (we have to use multivariable binary regression to adjust for other confounding factors)
Except for RCT (OK due to low confounding factors)

(Case-control study)

cc {var y} {var x} - Case-control study -> Create 2x2 table, calculate odds ratio, 95% CI and Chi-square
test result
cci {value1} {value2} {value3} {value4}
- Case-control study immediate command -> Create table using values 1-4 (exposed cases, unexposed cases, exposed controls, unexposed controls)
logistic {var1} {var2} {var3} … {var n} (e.g. logistic lbw smoke)
- Do the multivariable regression analysis between var1 and var2 using logistic regression, **adjusted for var3 to var n to correct for confounding factors, then calculate OR

**Risk factor research (Cohort study): OR can be used in a cohort study, but it may overestimate the risk ratio (but it looks dramatic, and is frequently used in risk factor research)
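
The overestimation can be seen with the immediate commands on one hypothetical 2x2 table (30 of 100 exposed and 10 of 100 unexposed subjects develop the outcome; the counts are invented for illustration):

```stata
* Counts in order: exposed cases, unexposed cases,
* exposed non-cases, unexposed non-cases (hypothetical numbers)
csi 30 10 70 90
* Risk ratio = (30/100) / (10/100) = 3
display r(rr)
* Same table analysed as odds
cci 30 10 70 90
* Odds ratio = (30*90) / (10*70), about 3.86, larger than the risk ratio
display r(or)
```

The gap between the two measures grows as the outcome becomes more common.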

ir {var y} {var x} {follow-up time} - Create 2x2 table, calculate Incidence rate, Incidence rate ratio,
Incidence rate difference and Fisher’s exact test result
poisson {var1} {var2}, exp(day) irr - Poisson regression analysis for rate (Univariable)
poisson {var1} {var2} {var3} … {var n}, exp(day) irr
- Poisson regression analysis for rate (Multivariable adjust using Var3 to Var n) -> for
Average rate (Incidence rate must be constant at all point of time)
stset {day}, failure({var y}) - Prepare data for survival analysis (stset = survival time set)
sts graph, hazard - Show shape of smoothed hazard estimate -> shape of the rate at all time points
sts graph, cumhaz - Show shape of cumulative hazard curve
sts graph, (+/-surv) - Show Kaplan-Meier survival probability curve -> Showing overall survival **Default of sts graph command is the KM curve, no need to use ‘surv’
sts graph, surv by({var}) - Show Kaplan-Meier survival probability curve stratify by var
sts graph, failure - Show Failure curve (Inversion of KM curve) -> Showing complication, not death
outcome
stsum - Show Time at risk, Incidence rate, and Survival time percentile
sts graph if {condition} - Show the curve for observations that satisfy the specified condition
sts test {var} - Log-rank test for survival functions across the groups of var (Non-parametric) -> Only tells whether the survival curves differ or not
sts list, at({time1},{time2},{time3},…) surv
- Show survival list at time1, time2, time3,…
stcox {var x} - Cox-regression analysis -> show Hazard ratio compare by Var x
stcox i.{var x}, base - Cox-regression analysis if Var X is ‘strata’ (not value) and show base

**If the incidence rate is not constant -> Instantaneous rate: cannot use Poisson regression, use Cox regression analysis instead
**sts commands: Only use after the stset command
**Median survival time = Time point at which only 50% of the study population survive
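
A minimal sketch of the rate and survival workflow, assuming the bundled cancer dataset (sysuse cancer), whose variables studytime, died and drug are illustrative and not part of the course data:

```stata
* Example dataset shipped with Stata (assumption: not the course data)
sysuse cancer, clear
* Average-rate model: deaths per unit of follow-up time, by drug group
poisson died i.drug, exposure(studytime) irr
* Declare survival-time data: time variable, then the failure indicator
stset studytime, failure(died)
* Time at risk, incidence rate and survival-time percentiles
stsum
* Kaplan-Meier curves by group, then the log-rank test
sts graph, by(drug)
sts test drug
* Survival probabilities at selected time points
sts list, at(10 20 30)
* Cox regression with the base group shown
stcox i.drug, baselevels
```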

diagt {reference} {test} - Calculate all characteristics of a diagnostic test (sensitivity, specificity, PPV, NPV), with 95% CI included (user-written command; must be installed first)
roctab {reference} {test} - Create ROC table -> Helps display accuracy for a non-binary index test
roctab {reference} {test}, graph - Create ROC curve -> Helps display accuracy for a non-binary index test

**Accuracy = (True pos + True neg)/(All sample population)

**When the index test is not binary -> Cannot directly evaluate sensitivity/specificity -> We have to convert it into binary, e.g. restratify groups 1,2,3,4,5 into (1,2) and (3,4,5)
**LR+ = Odds of disease given the test result / Odds of disease in all patients -> Use when the index test is not binary instead of trying to create a sensitivity/specificity table (too crude!)
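
The definitions above can be worked through with the calculator function on a hypothetical 2x2 diagnostic table (100 diseased and 100 non-diseased subjects; the counts and scalar names are invented for illustration). For a binary test, LR+ is computed here as sensitivity / (1 - specificity).

```stata
* Hypothetical counts: true/false positives and negatives
scalar tp = 80
scalar fn = 20
scalar fp = 10
scalar tn = 90
* Sensitivity = 80/100, specificity = 90/100
display "Sensitivity = " tp/(tp + fn)
display "Specificity = " tn/(tn + fp)
* Predictive values
display "PPV = " tp/(tp + fp)
display "NPV = " tn/(tn + fn)
* Accuracy = (80 + 90)/200 = 0.85
display "Accuracy = " (tp + tn)/(tp + fp + fn + tn)
* LR+ = 0.8/0.1, i.e. about 8
display "LR+ = " (tp/(tp + fn)) / (1 - tn/(tn + fp))
```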

sampsi {proportion1} {proportion2}, power({value}) alpha({value}) ratio({value})

- Object-based sample size estimation
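
A minimal sketch of a two-proportion sample size calculation, assuming the command meant here is Stata's sampsi and using hypothetical inputs (30% vs 10% event proportions, 80% power, two-sided alpha of 0.05, equal group sizes):

```stata
* Sample size for comparing two proportions (hypothetical inputs)
sampsi 0.3 0.1, power(0.8) alpha(0.05) ratio(1)
* Required size of each group
display "n1 = " r(N_1) "  n2 = " r(N_2)
```

Newer Stata versions also provide power twoproportions for the same purpose.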
