
Practical Manual

The BSU 5335 Health Statistics Practical Manual focuses on using Minitab for statistical data analysis, covering topics such as the user interface, data types, and various statistical methods including descriptive analysis, hypothesis testing, and linear regression. It provides step-by-step instructions for managing projects and worksheets, as well as conducting analyses on qualitative and quantitative variables. The manual is designed to facilitate understanding and application of statistical techniques in health sciences.

BSU 5335 – Health Statistics

Practical Manual
Minitab

Department of Basic Sciences


Faculty of Health Sciences
The Open University of Sri Lanka
Tel: 011 - 2881355
Contents

1. Introduction to Statistical Software
2. The Minitab User Interface
3. Projects and Worksheets
   3.1 How to save a Project
   3.2 How to save the Worksheet
4. Data Types
5. How to Open a Worksheet
6. Changing the Data
7. Changing the Data Type
8. Coding the Data
9. Extract a Portion of Data
10. Descriptive Analysis
   10.1 Analysis of Qualitative Variables
   10.2 Analysis of Quantitative Variables
   10.3 Relationship between two Quantitative Variables
11. Linear Regression
12. Hypothesis testing for a Single mean (Z test)
13. Hypothesis testing for a Single mean (t test)
14. Paired t-test
15. Significance test for Comparison of two means
16. Chi-square test
17. ANOVA

1. Introduction to Statistical Software

Measurements are of little use until they are 'analyzed'. Data analysis includes
organizing measurements into a meaningful order or into groups, reducing the data into
manageable quantities, forming succinct descriptions of the main features of the data,
and elucidating any anomalies for subsequent examination. Analysis is the step between
obtaining data and applying it to solve practical problems.

The advent of spreadsheets and statistical/graphical software packages has
transformed elementary statistical analysis from a rather mathematical subject, backed
up by dense statistical tables, into a readily accessible technique. Spreadsheets such as
EXCEL contain several statistical functions, and these can be very helpful. However,
statistical packages are better suited to the task, because they have been specially
written to be 'user friendly' and to include a comprehensive suite of up-to-date
statistical methods. Many statistical packages are available to make you more
comfortable with undertaking statistical analyses, such as MINITAB, SPSS, SAS, Stata,
S-Plus and R.

SPSS stands for Statistical Package for the Social Sciences. It was one of the earliest
statistical packages with Version 1 being released in 1968, well before the advent of
desktop computers. It is now on Version 23. SAS stands for Statistical Analysis
System. It was developed at the North Carolina State University in 1966, so is
contemporary with SPSS. Stata is a more recent statistical package with Version 1
being released in 1985. Since then, it has become increasingly popular in the areas of
epidemiology and economics. S-plus is a statistical programming language developed
in Seattle in 1988. R is a free version of S-plus developed in 1996. MINITAB is a
particularly easy package to learn and to use; it has excellent self-help facilities, has
been well tested, includes modern statistical methods and is widely used both inside and
outside the University. MINITAB is an ideal package for learning statistics.

2. The Minitab User Interface

Minitab is a software package for statistical data analysis. Many versions of Minitab
are available; these practical sessions use Minitab 16.0.

To start Minitab, go to the Windows taskbar and select

Start → All Programs → Minitab 16

To exit Minitab, select

File → Exit

There are three main windows in Minitab. By default, Minitab opens with two windows
visible and one window minimized.

Session window
The Session window displays the results of your analyses in text format. Also, in
this window, you can enter session commands instead of using Minitab’s menus.
(Ctrl+M)

Worksheet
The worksheet, which is similar to a spreadsheet, is where you enter and arrange
your data. You can open multiple worksheets. (Ctrl+D)

Project Manager
The third window, the Project Manager, is minimized below the worksheet. (Ctrl+I)

The Project Manager contains several further icons: the History window records all the
commands you have used, the Graph window displays the graphs that you have drawn,
and the Worksheet window shows information about the active worksheets.

3. Projects and Worksheets


A project is the component used to manipulate data, perform analyses, and generate
graphs. A project contains one or more worksheets. You can save a Minitab project as
an (.MPJ) file, which stores all worksheets, graphs, Session window output and session
command history. You can also save a worksheet on its own as an (.MTW) file, which
stores only the data in that worksheet.

Save your work as a project file to keep all your data, graphs, dialog box settings, and
options together. If you need only to save data, save your work as a worksheet file. A
worksheet file can be used in multiple projects. Worksheets can have up to 4,000
columns. The number of worksheets that a project can have is limited only by your
computer's memory.

3.1 How to save a Project

1. Choose File → Save Project As.
2. Browse to the folder that you want to save your files in.
3. Enter a name for the project file.
4. Select the Minitab project (*.MPJ) as the save type.
5. Click Save.

3.2 How to save the Worksheet

1. Click in the worksheet, then choose File → Save Current Worksheet As.
2. Browse to the folder that you want to save your files in.
3. Enter a name for the worksheet.
4. Select the relevant file type as the save type.
5. Click Save.

4. Data Types
A worksheet can contain the following types of data.

• Numeric data
Numbers, such as 264 or 5.28125.

• Text data
Letters, numbers, spaces, and special characters, such as Test #4 or North America.

• Date/time data
Dates, such as Mar-17-2013, 17-Mar-2013, 3/17/13, or 17/03/13.
Times, such as 08:25:22 AM.
Date/time, such as 3/17/13 08:25:22 AM or 17/03/13 08:25:22.

5. How to Open a Worksheet

You can open a new, empty worksheet at any time. You can also open one or more files
that contain data, such as a Microsoft Excel file. When you open a file, Minitab copies
the contents of the file into the current Minitab project. Any changes that you make to
the worksheet while you are working in the Minitab project do not affect the original file.

• To open a new, empty worksheet

Go to File → New → Minitab Worksheet

• To open an existing worksheet

Go to File → Open Worksheet and browse to the file that you want to open

In a worksheet, data are arranged in columns, which are also called variables. The column
number and name are indicated at the top of each column.

6. Changing the Data

Data in the data window can be corrected by simply clicking on a cell, typing the correct
entry, and pressing Enter. For more extensive changes, use the Editor menu, which lets
you insert cells, rows, or columns in the data set.
• To insert one or more empty cells above the active cell of the data window, select
Editor → Insert Cells from the menu.
The remaining cells in the column then move down. The number of cells inserted is
equal to the number of cells selected before you choose the command.
• To insert one or more empty columns to the left of the active column, select
Editor → Insert Columns from the menu.
• Similarly, to insert one or more empty rows above the active row, select
Editor → Insert Rows from the menu.

Example 01

Enter the following data set into a Minitab worksheet. It consists of the variables
Index Number, Mathematics, Statistics and Gender (male-1, female-2).

Index No     Mathematics   Statistics   Gender
AS2014101    70            60           1
AS2014102    82            86           2
AS2014103    56            55           1
AS2014104    65            65           1
AS2014105    76            84           1
AS2014106    55            45           2
AS2014107    60            57           1
AS2014108    75            70           2
AS2014109    51            50           2
AS2014110    82            93           1

7. Changing the Data Type

 To change data type, go to:

Data → Change Data Type → Numeric to Text / Text to Numeric

Example 02:

Change the data type of “Gender” from Numeric to Text and store it in column C6.
(Type C6 in “store text column in”)

8. Coding the Data

Coding lets us label data values, which is especially useful for categorical data.

• To code data, go to:

Data → Code → Numeric to Numeric / Numeric to Text / Text to Numeric / Text to Text

Example 03:

Code “Gender” as follows and store the result in the same column.
1 - male 2 - female
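
As a rough illustration of what Examples 02 and 03 do to the Gender column, the same two operations can be sketched in plain Python. This is not Minitab code; the names `gender_codes`, `gender_text`, `GENDER_LABELS` and `gender_labels` are hypothetical and chosen only for this sketch.

```python
# Gender column from Example 01 (1 = male, 2 = female), stored as numbers.
gender_codes = [1, 2, 1, 1, 1, 2, 1, 2, 2, 1]

# Example 02: change the data type from numeric to text.
gender_text = [str(code) for code in gender_codes]

# Example 03: code each numeric value as a category label.
GENDER_LABELS = {1: "male", 2: "female"}
gender_labels = [GENDER_LABELS[code] for code in gender_codes]

print(gender_text)
print(gender_labels)
```

Changing the type keeps the values "1" and "2" but treats them as text, whereas coding replaces each value with a label.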

9. Extract a Portion of Data

There are several ways to extract a portion of data from the original worksheet. Two of
them are described below.

• Split (Data → Split Worksheet)

This option separates the data in a worksheet according to the values of a variable
(usually a qualitative variable) and produces a new worksheet for each value.

Example 04:

Split the worksheet by “Gender” and get separate worksheets for each gender.

• Subset (Data → Subset Worksheet)

This option copies specified rows from the active worksheet to a new sub
worksheet. You can specify the subset by row numbers or by a condition.

Example 05:

Obtain a sub worksheet for Gender = “male”
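
The difference between Split and Subset can be sketched in plain Python using the Example 01 data after Gender has been coded as text. This is only an illustration of the idea, not of how Minitab implements it; `rows`, `split` and `males` are hypothetical names.

```python
from collections import defaultdict

# Rows of the Example 01 worksheet: (index number, mathematics, statistics, gender).
rows = [
    ("AS2014101", 70, 60, "male"),
    ("AS2014102", 82, 86, "female"),
    ("AS2014103", 56, 55, "male"),
    ("AS2014104", 65, 65, "male"),
    ("AS2014105", 76, 84, "male"),
    ("AS2014106", 55, 45, "female"),
    ("AS2014107", 60, 57, "male"),
    ("AS2014108", 75, 70, "female"),
    ("AS2014109", 51, 50, "female"),
    ("AS2014110", 82, 93, "male"),
]

# Split (Example 04): one group of rows per distinct value of the Gender column.
split = defaultdict(list)
for row in rows:
    split[row[3]].append(row)

# Subset (Example 05): keep only the rows that satisfy a condition.
males = [row for row in rows if row[3] == "male"]

print({gender: len(group) for gender, group in split.items()})
print(len(males))
```

Split always produces one group per category, while Subset produces a single group defined by whatever condition you state.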

10. Descriptive Analysis

A descriptive statistic is a summary statistic that quantitatively describes or summarizes
features of a collection of information. Descriptive statistics provide simple summaries
of the sample and of the observations that have been made. Such summaries may
be either quantitative (mean, median, variance, etc.) or visual (graphs).

Example: Sales Data Set


A research company has collected the following information from 250 sales representatives
of a certain company to identify the factors that could affect their monthly sales income.
The data are stored in a Minitab worksheet named “SALESREP.MTW”. The columns are
described below. We can analyze these data descriptively as follows.

Column   Name           Description
C1       Age            Age of a sales representative in years
C2       Gender         Gender of a sales representative
C3       Experience     Sales representative's experience in years
C4       Edu Quali      Highest education qualification
                        (1 – G.C.E.(O/L), 2 – G.C.E.(A/L))
C5       Coverage       Sales coverage of a representative in km²
C6       Site           Sales site of a representative
                        (1 – Urban Area, 2 – Rural Area)
C7       Sales Income   Monthly sales income of a sales representative in Rupees

10.1 Analysis of Qualitative Variables

a. One-way frequency table (Stat → Tables → Tally Individual Variables)
b. Cross tabulation table (Stat → Tables → Cross Tabulation and Chi-Square)

Table 01: Example of Frequency table and Cross Tabulation table

Figure 01: Pie Chart (Graph → Pie Chart) and Simple Bar Chart (Graph → Bar Chart)

[Pie Chart of Gender: Female 32.4%, Male 67.6%. Chart of Gender: Percent by Gender,
Female 32.4, Male 67.6; percent within all data.]

Interpretation: According to Table 01 and Figure 01, the proportion of males in the
sample is about twice the proportion of females. Males make up about 68% of the
entire sample.

10.2 Analysis of Quantitative Variables

Figure 02: Dot plot (Graph → Dotplot)

[Dotplot of Monthly Sales Income; horizontal axis: Monthly Sales Income, 8400 to 12600.]

Table 02: Descriptive Statistics (Stat → Basic Statistics → Display Descriptive Statistics)

Interpretation: According to the Figure 02 and table 02, monthly sales income of the
sales representatives is approximately symmetrically distributed with mean Rs. 10251.

Figure 03: Histogram (Graph → Histogram)

[Histogram of Monthly Sales Income; horizontal axis: Monthly Sales Income, 8400 to 12000;
vertical axis: Frequency, 0 to 25.]

Figure 04: Box Plot (Graph → Boxplot)

[Boxplot of Monthly Sales Income; vertical axis: Monthly Sales Income, 8000 to 13000.]


10.3 Relationship between two Quantitative Variables


Figure 05: Scatter Plot (Graph → Scatterplot)

[Scatterplot of Experience vs Age; horizontal axis: Age, 15 to 35; vertical axis:
Experience, 0 to 16.]


• Correlation (Stat → Basic Statistics → Correlation)

Interpretation: There is a moderately strong positive linear relationship between
experience and age. The correlation coefficient is 0.731.
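
The correlation coefficient reported by Minitab is the Pearson coefficient. Because the SALESREP.MTW data are not reproduced in this manual, the sketch below applies the standard formula to the Mathematics and Statistics marks from Example 01 instead; the function name `pearson_r` and the variable names are hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of cross products and squares of deviations from the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Marks from Example 01.
mathematics = [70, 82, 56, 65, 76, 55, 60, 75, 51, 82]
statistics_marks = [60, 86, 55, 65, 84, 45, 57, 70, 50, 93]

r = pearson_r(mathematics, statistics_marks)
print(f"r = {r:.3f}")
```

A value of r close to +1 or -1 indicates a strong linear relationship, and a value close to 0 indicates a weak one.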

Exercise 01

1. Code the Site variable as follows and store the result in C9: 1 = Urban Area,
2 = Rural Area.
2. Split the worksheet by gender.
3. Find the mean, median and standard deviation of monthly sales income for each
gender separately.
4. Obtain frequency tables for education qualification and site. Interpret the outputs.
5. Obtain pie charts for education qualification and site.
6. Obtain dot plots for the experience and age variables.
7. Obtain a histogram for the coverage variable.
8. Obtain a scatter plot and the correlation coefficient to examine the relationship
between monthly sales income and experience.
9. Interpret the results that you obtained in part 8.

11. Linear Regression

1. Consider a clinical trial involving patients presenting with hyperlipoproteinemia.
Baseline values of the patients' age (years) and total serum cholesterol level (mg/ml)
were recorded.
Patient Age Cholesterol
1 46 3.5
2 20 1.9
3 52 4.0
4 30 2.6
5 57 4.5
6 25 3.0
7 28 2.9
8 36 3.8
9 22 2.1
10 43 3.8
11 57 4.1
12 33 3.0
13 22 2.5
14 63 4.6
15 40 3.2
16 48 4.2
17 28 2.3
18 49 4.0

Fit a regression line to examine the effect of age on the cholesterol level of a person,
using the above data.

Stat → Regression → Regression

Select Cholesterol level as the response variable and Age as the predictor variable.
The fitted regression equation is:

Cholesterol = 1.089 + 0.057 (Age)

For any individual, this equation gives the predicted cholesterol level; it does not
determine the actual value exactly.

Interpretation: If the age of the patient increases by 1 year, the predicted cholesterol
level increases by approximately 0.057 mg/ml. The intercept gives a predicted cholesterol
level of approximately 1.089 mg/ml for a newborn baby (age = 0), but this lies outside the
observed age range (20 to 63 years) and should be treated with caution.
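
To see where the two coefficients come from, the least-squares formulas (slope = Sxy / Sxx, intercept = mean of y minus slope times mean of x) can be applied directly to the table above. The following is a minimal Python sketch; `fit_line` is a hypothetical helper name, not a Minitab command.

```python
# Age (years) and cholesterol values for the 18 patients in the table above.
age = [46, 20, 52, 30, 57, 25, 28, 36, 22, 43, 57, 33, 22, 63, 40, 48, 28, 49]
cholesterol = [3.5, 1.9, 4.0, 2.6, 4.5, 3.0, 2.9, 3.8, 2.1,
               3.8, 4.1, 3.0, 2.5, 4.6, 3.2, 4.2, 2.3, 4.0]


def fit_line(x, y):
    """Return (intercept, slope) of the least-squares line y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


intercept, slope = fit_line(age, cholesterol)
print(f"Cholesterol = {intercept:.3f} + {slope:.4f} (Age)")

# Predicted cholesterol for a hypothetical 40-year-old patient.
print(round(intercept + slope * 40, 2))
```

The intercept agrees with the 1.089 quoted above, and the slope is about 0.0578, which appears as 0.057 in the equation above.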

12. Hypothesis testing for a Single mean (Z test)

(Refer: Page No. 20 in Block II)

2. Suppose the literature reports that two decades ago the mean weight of people in
Galle was 50 kg. A researcher wants to see whether the mean weight of people in Galle
has changed. The mean weight of a sample of 40 people investigated recently is 51 kg
with an SD of 2 kg.

Null hypothesis H0: µ = 50
Alternative hypothesis H1: µ ≠ 50

• Stat → Basic Statistics → 1-Sample Z → Summarized data

Interpretation: Since the p value is less than 0.05, reject H0. (The z table value is
1.96, and the test statistic value (3.16) is greater than 1.96.) At the 5% significance
level we can conclude that the mean weight of the people in Galle has changed.
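
The test statistic of 3.16 can be checked by hand from the summarized data, since z = (sample mean − hypothesized mean) / (SD / √n). The sketch below does this in Python and also computes the two-sided p value from the standard normal distribution; `one_sample_z` is a hypothetical helper name.

```python
import math


def one_sample_z(sample_mean, hypothesized_mean, sd, n):
    """Return the z statistic and the two-sided p value for a one-sample Z test."""
    z = (sample_mean - hypothesized_mean) / (sd / math.sqrt(n))
    # For a standard normal Z, P(|Z| > |z|) equals erfc(|z| / sqrt(2)).
    p_two_sided = math.erfc(abs(z) / math.sqrt(2))
    return z, p_two_sided


# Summarized data from the Galle example: mean 51, SD 2, n = 40, H0: mu = 50.
z, p = one_sample_z(51, 50, 2, 40)
print(f"z = {z:.2f}, two-sided p = {p:.4f}")
```

For the one-sided alternative "greater than", the p value is half of the two-sided value when z is positive.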

3. The researcher is interested in testing whether the mean weight of people in Galle today
is greater than the value observed in the past (50 kg).

Null hypothesis H0: µ ≤ 50
Alternative hypothesis H1: µ > 50

• Stat → Basic Statistics → 1-Sample Z → Summarized data → Options
Alternative: greater than

13. Hypothesis testing for a Single mean (t test)

(Refer: Page No.23 in Block II)

4. Suppose you are conducting an experiment to see whether a given therapy reduces
test anxiety in a sample of nursing students. A standard measure of test anxiety is
known to have a mean of µ = 20 in this nursing population. In a sample of 20 nursing
students (n = 20) who had undergone the therapy, the mean test anxiety score was 18
with an SD of 9.
H0: The average test anxiety of nursing students who use the therapy is not different
from 20. (i.e. H0: μ = 20)

H1: The average test anxiety of nursing students who use the therapy is lower than 20.
(i.e. H1: μ < 20)

• Stat → Basic Statistics → 1-Sample t → Summarized data

Interpretation: Since the p value is greater than 0.05, do not reject H0. (The t table
value with 19 df is -1.72, and the test statistic (-0.99) is in the acceptance region.) At the
5% significance level there is not enough evidence to conclude that the therapy has
reduced the average anxiety of nursing students.
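
The test statistic of -0.99 follows from t = (sample mean − hypothesized mean) / (SD / √n). A minimal Python sketch of this calculation and of the comparison with the table value quoted above is given below; `one_sample_t` and `T_CRITICAL_LOWER` are hypothetical names.

```python
import math


def one_sample_t(sample_mean, hypothesized_mean, sd, n):
    """Return the t statistic for a one-sample t test from summarized data."""
    return (sample_mean - hypothesized_mean) / (sd / math.sqrt(n))


# Summarized data from the test anxiety example: mean 18, SD 9, n = 20, H0 mean 20.
t_stat = one_sample_t(18, 20, 9, 20)

# Lower-tail table value with 19 df at the 5% level, as quoted in this manual.
T_CRITICAL_LOWER = -1.72

# H1 is "less than", so H0 is rejected only if t falls below the critical value.
reject_h0 = t_stat < T_CRITICAL_LOWER
print(f"t = {t_stat:.2f}, reject H0: {reject_h0}")
```

Here the statistic does not fall below the critical value, which matches the decision not to reject H0.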

14. Paired t-test

(Refer: Page No. 24 in Block II)

5. Suppose you wish to test the effect of Prozac on the well-being of depressed
individuals, using a standardized "well-being scale" that could range from 0 to 20.
Higher scores indicate greater well-being. Before and after taking Prozac, scores
obtained for the measure of well-being on 9 subjects are given below.

Subject   Well-being score (pre)   Well-being score (post)
1         3                        5
2         0                        1
3         6                        5
4         7                        7
5         4                        10
6         3                        9
7         2                        7
8         1                        11
9         4                        8

Is there any effect of Prozac on the well-being of depressed individuals?

H0: There is no effect (µd = 0) vs. H1: There is an effect, where µd is the mean of the
differences (post − pre).

Stat → Basic Statistics → Paired t → Samples in columns → First sample: Post, Second
sample: Pre → Options → Alternative: greater than

Interpretation: Since the p value is less than 0.05, reject H0. (The test statistic is greater
than the critical value. For the one-sided alternative selected above, the t table value with
8 df at the 5% level is 1.86; the value 2.31 applies to a two-sided test.) At the 5%
significance level we can conclude that taking Prozac produces a positive change in the
well-being of depressed individuals.
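
A paired t-test is a one-sample t test on the differences between the paired scores. The sketch below computes the differences (post − pre) for the nine subjects and the resulting t statistic with 8 df; the variable names are hypothetical.

```python
import math
import statistics

# Well-being scores for the 9 subjects, before and after taking Prozac.
pre = [3, 0, 6, 7, 4, 3, 2, 1, 4]
post = [5, 1, 5, 7, 10, 9, 7, 11, 8]

# Work with the differences, taken as post minus pre to match "First sample: Post".
differences = [after - before for before, after in zip(pre, post)]

n = len(differences)
mean_diff = statistics.mean(differences)
sd_diff = statistics.stdev(differences)  # sample SD, divisor n - 1

# t statistic for H0: mean difference = 0, with n - 1 degrees of freedom.
t_stat = mean_diff / (sd_diff / math.sqrt(n))
print(f"mean difference = {mean_diff:.2f}, SD = {sd_diff:.2f}, t = {t_stat:.2f}")
```

The statistic comes out at about 3.14, which exceeds both the one-sided and the two-sided 5% table values with 8 df.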

15. Significance test for Comparison of two means


(Refer: Page No.28 in Block II)

6. Suppose a medical researcher is investigating the effectiveness of two painkiller drugs
(Drug A and Drug B). Drug A was given to 15 patients and Drug B was given to 12
patients. The data are given below. Which drug is more effective in reducing pain? (t test)

H0: There is no difference in the mean time taken to alleviate pain between the two
drugs (H0: µA = µB)

H1: There is a difference in the mean time taken to alleviate pain between the two
drugs (H1: µA ≠ µB)

Time (in minutes) taken to alleviate pain

Case   Drug A   Drug B
1      44       52
2      51       64
3      52       68
4      55       74
5      60       79
6      62       83
7      66       84
8      68       88
9      69       95
10     71       97
11     71       101
12     76       116
13     82
14     91
15     108

Stat → Basic Statistics → 2-Sample t → Samples in different columns

Interpretation: Since the p value is less than 0.05, reject H0. (The table value with 25 df at
the 5% level is ±2.06, and the test statistic is in the rejection region. The 25 df, that is
15 + 12 − 2, correspond to the pooled-variance form of the test.) At the 5% significance
level we can conclude that there is a difference in the time taken to alleviate pain
between the two drugs.

Note: Looking at the sample means, the time taken to alleviate pain with Drug B is
higher than with Drug A. So Drug A is the better painkiller.
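
The 25 df quoted above imply the pooled-variance (equal variances) form of the two-sample t test. Under that assumption, the statistic can be reproduced in Python as sketched below; `pooled_t` is a hypothetical helper name.

```python
import math
import statistics

# Time (in minutes) taken to alleviate pain.
drug_a = [44, 51, 52, 55, 60, 62, 66, 68, 69, 71, 71, 76, 82, 91, 108]
drug_b = [52, 64, 68, 74, 79, 83, 84, 88, 95, 97, 101, 116]


def pooled_t(x, y):
    """Return (t statistic, degrees of freedom) for the pooled two-sample t test."""
    n1, n2 = len(x), len(y)
    var1, var2 = statistics.variance(x), statistics.variance(y)
    df = n1 + n2 - 2
    # Pooled variance: the two sample variances weighted by their degrees of freedom.
    pooled_var = ((n1 - 1) * var1 + (n2 - 1) * var2) / df
    standard_error = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    t = (statistics.mean(x) - statistics.mean(y)) / standard_error
    return t, df


t_stat, df = pooled_t(drug_a, drug_b)
print(f"mean A = {statistics.mean(drug_a):.1f}, mean B = {statistics.mean(drug_b):.1f}")
print(f"t = {t_stat:.2f} with {df} df")
```

The statistic is negative because the Drug A mean is subtracted first and is the smaller of the two; its absolute value exceeds 2.06.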

16. Chi-square test

(Refer: Page No. 40 in Block II)

7. Suppose a public health nurse had investigated 60 men and 40 women in her area
and found that 50 men and 25 women were physically less active. The data (called
observed data) can be presented in a contingency table as follows. Perform the chi-
square test to see whether there is any association between gender and physical
activity.

Physical activity behavior

Gender   Physically active   Physically less active
Men      10                  50
Women    15                  25

H0: There is no association between gender and physical activity
H1: There is an association between gender and physical activity

Stat → Tables → Chi-Square Test (Two-Way Table in Worksheet)

Then select the two variables and click OK.

Interpretation: Since the p value is less than 0.05, reject H0.

Test statistic value (5.556) > Critical value (3.84)

(The table value with 1 df at the 5% level is 3.84, so the test statistic is in the rejection
region.) At the 5% significance level we can conclude that there is an association between
gender and physical activity.
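
The value 5.556 is the Pearson chi-square statistic, the sum of (observed − expected)² / expected over all cells, where each expected count is (row total × column total) / grand total. A minimal Python sketch for the table above follows; `chi_square_statistic` is a hypothetical helper name.

```python
# Observed counts: rows are Men and Women, columns are active and less active.
observed = [
    [10, 50],
    [15, 25],
]


def chi_square_statistic(table):
    """Return the Pearson chi-square statistic for a two-way table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, count in enumerate(row):
            # Expected count under H0 (no association between rows and columns).
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (count - expected) ** 2 / expected
    return statistic


chi2 = chi_square_statistic(observed)
df = (len(observed) - 1) * (len(observed[0]) - 1)
print(f"chi-square = {chi2:.3f} with {df} df")
```

The expected counts here are 15 and 45 for men and 10 and 30 for women.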

17. ANOVA

(Refer: Page No.54 in Block II)

8. Suppose a nurse wishes to know whether the blood glucose levels of patients who have
undergone four (4) treatments are the same. The blood glucose levels of a sample of
patients who have undergone the four different treatments are given below. Find the best
treatment. (One-way ANOVA)

H0: µ1 = µ2 = µ3 = µ4

H1: At least one of the population means is different from the others

Treatment 1 Treatment 2 Treatment 3 Treatment 4


288.1 229.1 177.4 299.7
296.8 240.7 202.2 258.3
267.8 239.4 163.1 286.8
256.7 207.7 184.7 244.0
292.1 225.7 197.9 267.1
282.9 230.8 164.6 297.1
260.3 206.6 193.9 249.9
283.8 213.3 158.1 265.1

Stat → ANOVA → One-Way (Unstacked)

Interpretation: Since the p value is less than 0.05, reject H0. At the 5% significance level
we can conclude that at least one of the treatment means is different from the others.

Note: We can determine which population means differ from the others by running Tukey's
comparison test. (Comparisons → Tukey)

In these results, the grouping table shows that group A contains Treatments 1 and 4,
group B contains only Treatment 2, and group C contains only Treatment 3. Differences
between means that share a letter are not statistically significant. Treatments 2 and 3 do
not share a letter, which indicates that Treatment 3 has a significantly lower mean
than Treatment 2. Treatment 3 therefore has the lowest mean blood glucose level and,
taking a lower blood glucose level as the better outcome, it is the best of the four
treatments.
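
The one-way ANOVA F statistic compares the variation between the treatment means with the variation within the treatments. The sketch below computes the four treatment means and the F statistic for the data above in plain Python; it does not perform Tukey's comparisons, and `one_way_anova_f` is a hypothetical helper name.

```python
import statistics

# Blood glucose levels for the four treatments (8 patients each).
treatments = {
    "Treatment 1": [288.1, 296.8, 267.8, 256.7, 292.1, 282.9, 260.3, 283.8],
    "Treatment 2": [229.1, 240.7, 239.4, 207.7, 225.7, 230.8, 206.6, 213.3],
    "Treatment 3": [177.4, 202.2, 163.1, 184.7, 197.9, 164.6, 193.9, 158.1],
    "Treatment 4": [299.7, 258.3, 286.8, 244.0, 267.1, 297.1, 249.9, 265.1],
}


def one_way_anova_f(groups):
    """Return the F statistic for a one-way ANOVA on a list of groups."""
    all_values = [value for group in groups for value in group]
    grand_mean = statistics.mean(all_values)
    # Between-groups sum of squares: group means around the grand mean.
    ss_between = sum(
        len(group) * (statistics.mean(group) - grand_mean) ** 2 for group in groups
    )
    # Within-groups sum of squares: values around their own group mean.
    ss_within = 0.0
    for group in groups:
        group_mean = statistics.mean(group)
        ss_within += sum((value - group_mean) ** 2 for value in group)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)


means = {name: statistics.mean(values) for name, values in treatments.items()}
f_stat = one_way_anova_f(list(treatments.values()))

for name, mean in means.items():
    print(f"{name}: mean = {mean:.1f}")
print(f"F = {f_stat:.2f} with 3 and 28 df")
```

The treatment means make the ordering visible: Treatment 3 has the lowest mean, and Treatments 1 and 4 have the two highest.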
