
Lecture 4

The document outlines a lecture on using STATA for statistical analysis, focusing on commands for initial statistics and T-tests. It includes specific tasks such as loading a dataset, performing hypothesis testing, and calculating confidence intervals for serum creatinine levels and temperature changes among patients. The results indicate that certain hypotheses were not statistically significant, while others showed significant changes in temperature.

Uploaded by

Sara Noor
Copyright
© © All Rights Reserved

Lecture # 4

Understanding the Stata commands for summary statistics and t tests

Stata is a statistical software package used for data analysis, data management, and visualization.
It provides a wide range of tools for statistical analysis, including regression analysis, time-series
analysis, panel data analysis, and survival analysis, among others. Stata is popular in academia,
business, and government for its robust features, user-friendly interface, and powerful scripting
capabilities. It is commonly used by researchers, economists, social scientists, and healthcare
professionals to analyze and interpret data in various fields.

For today's session, our objective is to perform the following tasks:

Task 1: Find the initial statistics

Task 2: Perform t tests

Task 1

1) Load the data set named “sepsis.dta”.

2) Use the following commands, as we did in class:
a) Use describe (this describes the data set).
b) Use summarize (or summ) to find summary statistics (mean, standard deviation) of the
quantitative variables.
c) Use tab var1 to make a one-way table, or tab var1 var2 to make a two-way table, of the
qualitative variables.
d) Use tab var1 var2, summarize(var3) to make a two-way table of the qualitative variables var1
and var2 that shows summary statistics of the quantitative variable var3.
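
As a minimal sketch of the Task 1 commands, the lines below use variable names that appear later in this handout (race, treat, apache, temp0, temp1). It assumes that sepsis.dta sits in the current working directory and that race and treat are the qualitative variables; adjust the names to match the output of describe.

```stata
* Load the data set (assumes sepsis.dta is in the current working directory)
use sepsis.dta, clear

* a) Describe the data set: variable names, storage types, labels
describe

* b) Summary statistics for quantitative variables
summarize temp0 temp1 apache

* c) One-way and two-way tables of qualitative variables
tabulate race
tabulate treat race

* d) Two-way table of treat by race; each cell shows the mean,
*    standard deviation, and frequency of apache
tabulate treat race, summarize(apache)
```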
Task 2

For Question 1-2 Use the following information

In the general population the population mean (μ) and standard deviation (σ) of serum creatinine
are μ = 1.0 mg/dL and σ = 0.4 mg/dL, respectively

Question 1

A sample of n=12 patients was administered a new antibiotic. One day later, their serum creatinine
levels were measured. The mean of this sample was 1.2 mg/dL. Using an appropriate immediate
command, test whether the mean serum creatinine among patients given the new antibiotic is
different from the mean in the general population. Do NOT assume that the population variance is
known. Instead, suppose you are given the sample standard deviation (s) and it has value
s = 0.6 mg/dL. Provide a 1-sentence interpretation of your output.

Statistics -> Summaries, tables, and tests -> Classical tests of hypotheses -> t test calculator
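
The t test calculator corresponds to the immediate command ttesti, which takes the sample size, sample mean, sample standard deviation, and hypothesized mean, in that order. A sketch using the numbers given in the question:

```stata
* One-sample t test from summary statistics:
* ttesti #obs #mean #sd #hypothesized_mean
ttesti 12 1.2 0.6 1.0
```

The two-sided p-value is the one printed under the alternative Ha: mean != 1.0.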


Answer

In this sample of n=12, the observed mean was 1.2 mg/dL. A two-sided t-test of the null hypothesis
that μ = 1.0, using sample standard deviation s = 0.6, yielded a t-statistic value of 1.15 and an
associated two-sided p-value = 0.27. This is not statistically significant. The null hypothesis is not
rejected. Conclude that these data do not provide statistically significant evidence that the mean
serum creatinine among patients taking the new antibiotic differs from 1.0 mg/dL (μ ≠ 1.0).
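
The t statistic and p-value reported above can be checked by hand with display, using t = (sample mean − μ) / (s / √n) on n − 1 = 11 degrees of freedom. This is only an arithmetic sketch; the scalar names se and tstat are chosen here for illustration.

```stata
* Standard error of the mean: s / sqrt(n)
scalar se = 0.6 / sqrt(12)

* t statistic for H0: mu = 1.0 (about 1.15)
scalar tstat = (1.2 - 1.0) / se
display "t = " tstat

* Two-sided p-value on 11 degrees of freedom (about 0.27);
* ttail(df, t) returns the upper-tail probability
display "two-sided p = " 2 * ttail(11, abs(tstat))
```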

Note that the question asks whether the mean differs from the population mean, which is the
alternative hypothesis. In Stata, the null hypothesis H0 is always the equality (here μ = 1.0).

Question # 2

Using an appropriate immediate command, obtain a 95% confidence interval estimate of the true
mean serum creatinine among patients who have received the new antibiotic. Again, do NOT
assume that the population variance is known. Again, use the sample standard deviation s = 0.6
mg/dL. Provide a 1-sentence interpretation of your output

Answer

Based on this sample of n=12, with 95% confidence, the unknown mean serum creatinine among
patients taking the new antibiotic is estimated to be between 0.82 and 1.58 mg/dL.
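
One way to obtain this interval is the immediate command cii, sketched below. The cii means form is the syntax in recent versions of Stata; older versions used cii 12 1.2 0.6 instead, so treat the exact form as an assumption about your installation. The two display lines repeat the calculation by hand as mean ± t × s / √n.

```stata
* 95% confidence interval for a mean from summary statistics:
* cii means #obs #mean #sd
cii means 12 1.2 0.6, level(95)

* Hand check: invttail(11, 0.025) is the upper 2.5% point of t with 11 df
display 1.2 - invttail(11, 0.025) * 0.6 / sqrt(12)   // lower limit, about 0.82
display 1.2 + invttail(11, 0.025) * 0.6 / sqrt(12)   // upper limit, about 1.58
```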
Question # 3

Download Sepsis.dta from the course website.

Consider treated patients whose race is recorded as “other”. For this subset of the data, test
whether the baseline temperature (temp0) is significantly different from their temperature
after 2 hours (temp1). Provide a 1 sentence interpretation of your output.

Here the question's hypothesis (the alternative) is temp0 != temp1. The null hypothesis is that the
mean of temp0 equals the mean of temp1.
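
A sketch of the paired t-test on this subset follows. The codes used in the if condition are assumptions, not taken from the data set: treat == 1 is assumed to mark treated patients and race == 2 is a hypothetical code for "other". Check the actual coding with the two tabulate commands first and replace the values accordingly.

```stata
use sepsis.dta, clear

* Inspect how race and treat are coded (labels, then underlying values)
tabulate race
tabulate race, nolabel
tabulate treat, nolabel

* Paired t test of temp0 versus temp1 in the subset.
* ASSUMPTION: treat == 1 means treated and race == 2 means "other";
* both codes are hypothetical and must be verified against the output above.
ttest temp0 == temp1 if treat == 1 & race == 2
```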

INTERPRETATION: In the sub-group of 32 who are classified as race = “other”, data on temp0 and
temp1 were complete for n=28, or 87.5%. Among these n=28, the mean temperatures at baseline
and two hours were 100.6 and 99.9, respectively. The mean and standard error of the baseline
to two-hour change in temperature were 0.67 and 0.22, respectively. A two-sided paired t-test of the
null hypothesis that the mean change μd = 0 yielded a paired t-statistic value of 2.96 and an associated
two-sided p-value = 0.006. This is statistically significant. The null hypothesis is rejected. Conclude
that these data provide statistically significant evidence that the mean change in temperature between
the baseline and 2-hour occasions of measurement is not zero (μd ≠ 0).

Question # 4

Consider, still, ONLY the treated patients whose race is recorded as “other”. For this subset of the
data, obtain a 90% confidence interval for the true change in temperature between baseline and 2
hours. Provide a 1 sentence statement and interpretation of your confidence interval.

Repeat the paired t-test from Question 3 with the confidence level set to 90%, and read the
confidence interval for the mean change from the output. Do it yourself.
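
As a starting point, the level() option of ttest changes the confidence level of the reported interval. In this sketch the subset codes are assumptions: treat == 1 for treated patients and race == 2 as a hypothetical code for "other"; verify both with tabulate race, nolabel and tabulate treat, nolabel.

```stata
* Paired t test with a 90% confidence interval for the mean difference.
* ASSUMPTION: treat == 1 means treated and race == 2 means "other" (hypothetical codes).
ttest temp0 == temp1 if treat == 1 & race == 2, level(90)
```

The interval to report is the one on the diff row of the output.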


Question # 5

Test whether the baseline APACHE score (apache) is different between treated and untreated
patients. The treatment variable is treat. Provide a 1-sentence interpretation of your output.

The hypothesis is that the APACHE score (apache) is different between treated and untreated patients.

Answer

Full data set. Two independent samples, continuous outcome: we will apply a two-sample t test.
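
A minimal sketch of the two-sample test, assuming sepsis.dta is in the current working directory and that treat takes exactly two values:

```stata
use sepsis.dta, clear

* Two-sample t test of apache between the two groups defined by treat
* (assumes equal variances in the two groups by default)
ttest apache, by(treat)

* Variant that does not assume equal variances
ttest apache, by(treat) unequal
```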

INTERPRETATION: In this cohort of n0 = 230 treated with placebo and n1 = 224 treated with
ibuprofen, the mean APACHE scores were 15.19 (s0 = 6.9) and 15.48 (s1 = 7.3), respectively. A
two-sample t-test of the null hypothesis of equality of means yielded a t-statistic value of -0.44 and
an associated two-sided p-value = 0.66. This is not statistically significant. The null hypothesis is not
rejected. Conclude that these data do not provide statistically significant evidence of a difference
in baseline APACHE score between the treated and untreated patients.
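
The summary figures in the interpretation can also be passed to the two-sample form of the immediate command ttesti as a rough check. Because the inputs are rounded, the t statistic should agree with the reported value only approximately.

```stata
* Two-sample t test from summary statistics:
* ttesti #obs1 #mean1 #sd1 #obs2 #mean2 #sd2
ttesti 230 15.19 6.9 224 15.48 7.3
```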
