Statistical Data Analysis - 1 - Step by Step Guide
Step by Step Guide to SPSS & MINITAB
Second Edition
Copyright © (2020) Lakmini U. Mallawarachchi
ISBN: 979-8653934339
Lakmini U. Mallawarachchi
Preface
Statistical Data Analysis - 1: Step by Step Guide to SPSS & MINITAB takes a
straightforward, step-by-step approach that familiarizes the reader with
the SPSS and MINITAB software packages.
This book covers the basics of descriptive statistical analysis and data
presentation techniques using SPSS and MINITAB. It is written in simple
language, with several examples, so that a beginner can follow it with
little effort. Most importantly, this book is ideal for undergraduates who
need to complete the data analysis for their research studies using SPSS
or MINITAB.
I hope that this book will be very useful to students, instructors and
researchers in the applied and social sciences. It can also be used as
self-study material and as a textbook.
Lakmini U. Mallawarachchi
June 2020
Table of Contents
REFERENCES
Examples
CHAPTER ONE: INTRODUCTION
The SPSS window has three main components: the menu bar, the tool bar
and the output viewer window.
The menu bar contains menus with all the features required to manage
data and other files. The tool bar gives quick access to many of these
features. SPSS sends its output directly to the output viewer window.
After opening SPSS, you can see the following at the bottom left of the
SPSS sheet.
Name: The name of the variable. It must start with a letter, must not
contain spaces, special characters or symbols, and must be unique
within the data set.
Type: This depends on the kind of data recorded for the variable. The
default type is Numeric with two decimal places. Various types are
available, as shown below.
Width: Indicates the maximum number of characters allowed for each
response.
Decimals: The number of decimal places displayed, which applies mainly
to numerical variables.
Label: Indicates the full name of the variable. Unlike the variable
name, the label may include spaces or special characters.
Values: Lists the valid options available for the variable. Example:
for the Gender variable, the values are 1 - Male and 2 - Female.
Missing: Shows the values used when a response is not applicable or
was not answered.
Columns: The width of the data column for this variable, i.e. the
number of characters displayed.
Align: Whether the data are aligned to the left, right or center.
Measure: The level of measurement of the variable. There are three
levels of measurement: nominal, ordinal and scale.
- A nominal variable is usually a qualitative variable whose categories
do not represent any ranking, such as sex, race or district.
- A scale variable measures a quantity, such as number of people, age,
distance, weight or amount of money, i.e. it is used for quantitative
variables.
- An ordinal variable is a qualitative variable whose categories can be
ranked, such as level of education or level of satisfaction.
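The naming rules listed under 'Name' can be checked mechanically before typing the variables into the 'variable view'. The following is a minimal Python sketch under stated assumptions: it applies only the simplified rules described above (SPSS itself may accept a few more characters), it treats names that differ only in letter case as duplicates, and the function name and the candidate names are hypothetical.

```python
import re

# Simplified rule taken from the 'Name' description above: start with a
# letter, then letters or digits only (no spaces, no symbols).
# Assumption: the real SPSS naming rules are somewhat more permissive.
NAME_PATTERN = re.compile(r"[A-Za-z][A-Za-z0-9]*")


def check_variable_names(names):
    """Return a dict mapping each problematic name to the reason it fails."""
    problems = {}
    seen = set()
    for name in names:
        key = name.lower()  # assumption: uniqueness ignores letter case
        if not NAME_PATTERN.fullmatch(name):
            problems[name] = "must start with a letter; letters and digits only"
        elif key in seen:
            problems[name] = "duplicate name in the data set"
        seen.add(key)
    return problems


# Hypothetical variable names, for illustration only.
candidates = ["Gender", "Age", "2ndVisit", "Blood Group", "age"]
for name, reason in check_variable_names(candidates).items():
    print(f"{name!r}: {reason}")
```

Names that pass the check (here 'Gender' and 'Age') can be entered as they are; a longer description with spaces belongs in the 'Label' column instead.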
Example 1.1: Enter the following data set into the SPSS worksheet.
Step 1: Click the ‘variable view’ tab. Type the names of the variables
under the ‘Name’ column.
Step 2: Click ‘data view’ tab. Each variable name that is entered under
the ‘variable view’ will now be included in columns as shown below.
Step 4: Repeat these steps to enter all the data included in the table
given in Example 1.1.
1.1.3 Saving the data in SPSS
Step 2: In the ‘Save Data As’ dialogue box, type the name under which you
want to save the data in the white space next to ‘File name’.
SPSS data files are usually saved with the .sav file type.
1.2 Introduction to MINITAB
[Figure: MINITAB interface showing the Session Window and the Data Window]
1.2.1 Opening MINITAB software
Generally, data are entered into MINITAB in the same way as in Excel:
type a value, then use the arrow keys to move on to the next cell.
2018 December 380 A
2019 January 388 B
2019 February 358 B
2019 March 320 B
2019 April 335 B
2019 June 335 B
Step 1: Type the data from the table into the ‘Data window’. The output
is displayed in the ‘Session window’.
Step 2: In the ‘Save Project As’ dialogue box, type the name under which
you want to save the data in the white space next to ‘File name’.
MINITAB project files are usually saved with the .MPJ file type.
CHAPTER TWO: SUMMARY STATISTICS
Example 2.1: Obtain the descriptive statistics (mean, mode, median,
range, variance and standard deviation) for the data set below using
SPSS.
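As a rough cross-check outside SPSS and MINITAB, the same six measures can be computed with Python's standard `statistics` module. This is only a sketch: the sample values below are hypothetical and are not the data set of Example 2.1, and the variance and standard deviation use the sample (n - 1) formulas.

```python
import statistics

# Hypothetical sample, for illustration only (not the data of Example 2.1).
sample = [2, 4, 4, 5, 7, 8]

summary = {
    "mean": statistics.mean(sample),
    "median": statistics.median(sample),
    "mode": statistics.mode(sample),
    "range": max(sample) - min(sample),
    "variance": statistics.variance(sample),  # sample variance (n - 1)
    "std_dev": statistics.stdev(sample),      # sample standard deviation
}

for name, value in summary.items():
    print(f"{name:>8}: {value:.3f}")
```

Comparing such hand-checked values with the software output is a simple way to confirm that the data were entered correctly.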
Method 01- Using the Descriptives command
Step 2: In the ‘Descriptives’ dialog box, use the arrow to add the
variables you want to analyze to the list of variables.
Step 3: Click the options button to select the descriptive measures and
press the continue button.
Descriptive Statistics
Method 02 – Using the Frequencies command
Step 2: In the ‘Frequencies’ dialog box, use the arrow to add the
variables you want to analyze to the list of variables. Select the
statistics option to proceed.
Step 3: Select the descriptive measures and press the continue button.
Method 03 – Using the Explore command
Step 2: In the ‘Explore’ dialog box, use the arrow to add the variables
you want to analyze to the dependent list.
Step 3: Under ‘Display’, select the ‘statistics’ option if you want only
the summarized data. Otherwise, select the ‘plots’ option to specify the
type of plot you want to include in the output.
Step 4: Press the ‘statistics’ button to select the descriptive measures
for the output and press continue.
Step 5: Press the ok button in the ‘Explore’ dialog box to generate the
output given below.
*Note: In SPSS, there are three methods that can be used to obtain
the descriptive statistical measures for a particular data set. When
choosing the SPSS command, it is important to select the one that
provides the information you need in the format you prefer.
In MINITAB,
Method 01
Step 2: In the ‘Display Descriptive Statistics’ dialog box, add the
variables you want to analyze to the variables list.
Step 3: Press the ‘statistics’ button to select the measures for the output.
After selecting the required statistical measures, click the ok button.
Method 02
Step 2: In the ‘Graphical Summary’ dialog box, add the variables you
want to analyze to the variables list.
Step 3: Click the ok button in the ‘Graphical Summary’ dialog box to
generate the output as presented below.
CHAPTER THREE: GRAPHICAL PRESENTATION
SPSS and MINITAB generally work with raw data. This means that each
unit of the sample is entered as a separate row. For example, if the
sample size is 100, there will be 100 rows of data in the data file.
However, both packages have options to analyze these data and present
the results in the form of a frequency table.
Example 3.1: Obtain the frequency distribution for the data set.
2019 February 358 B
2019 March 320 B
2019 April 335 B
2019 June 335 B
Step 3: In order to obtain the frequency distribution in the output, make
sure that the box labeled as display frequency tables is checked.
34, 60, 40, 72, 37, 33, 42, 62, 49, 32, 52, 40, 31, 19, 68, 55, 58, 54, 37,
32, 54, 38, 20, 50, 56, 48, 35, 52, 29, 56, 68, 65, 45, 42, 54, 39, 29, 56
a table of classes and their corresponding frequencies can be
developed and it is called a grouped frequency distribution.
In SPSS,
Step 3: In the ‘Recode into Different Variables’ dialogue box, click the
‘old and new values’ button.
Step 5: Recode the numbers into new values as shown in the following
boxes and press continue.
Step 6: Change the value labels as indicated below.
Step 8: Generated SPSS output is indicated as follows.
Age in Categories
                  Frequency   Percent   Valid Percent   Cumulative Percent
Valid   0<19           1         2.6         2.6                2.6
        20<29          3         7.9         7.9               10.5
        30<39         10        26.3        26.3               36.8
        40<49          7        18.4        18.4               55.3
        50<59         11        28.9        28.9               84.2
        60<69          5        13.2        13.2               97.4
        70<79          1         2.6         2.6              100.0
        Total         38       100.0       100.0
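The grouped frequency distribution in this output can be reproduced from the 38 ages as a cross-check. The sketch below is an illustration only, not part of SPSS: it assumes ten-year classes, so the single age of 19 falls in a class printed as 10-19 rather than under the 0<19 label used in the table, and the variable names are hypothetical.

```python
from collections import Counter

ages = [34, 60, 40, 72, 37, 33, 42, 62, 49, 32, 52, 40, 31, 19, 68, 55, 58,
        54, 37, 32, 54, 38, 20, 50, 56, 48, 35, 52, 29, 56, 68, 65, 45, 42,
        54, 39, 29, 56]

WIDTH = 10  # class width, matching the ten-year classes in the table

# Map each age to the lower limit of its class (for example 34 -> 30).
class_counts = Counter((age // WIDTH) * WIDTH for age in ages)

total = len(ages)
cumulative = 0.0
rows = []
for lower in sorted(class_counts):
    freq = class_counts[lower]
    percent = 100 * freq / total
    cumulative += percent
    rows.append((f"{lower}-{lower + WIDTH - 1}", freq,
                 round(percent, 1), round(cumulative, 1)))

print(f"{'Class':>6} {'Freq':>4} {'Pct':>6} {'Cum':>6}")
for label, freq, percent, cum in rows:
    print(f"{label:>6} {freq:>4} {percent:>6.1f} {cum:>6.1f}")
```

The frequencies and percentages printed by this sketch agree with the Frequency, Percent and Cumulative Percent columns of the SPSS table.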
Interpretation
3.2 Graphical presentation of data in SPSS & MINITAB
Example 3.3: Draw a bar graph using the data given below.
Step 2: In the ‘Chart Builder’ dialog box, press ok to proceed.
Step 3: Under Gallery, there are several options to select from.
Depending on your requirement, you can draw a bar graph, line graph,
pie chart, histogram or scatter plot.
Step 5: Drag and drop the bar graph into the white space provided above.
(Refer to the graph below.)
Step 6: In order to define the x axis and y axis, select the appropriate
variables from the variables column.
Step 7: Then drag and drop each variable, one at a time, into the space
given for the x axis or the y axis as shown below.
Step 8: Press ok button to get the output as follows.
OAOABAOOAAOBOBOAOOAAAABBAABAAO
In SPSS,
Step 2: In the ‘Bar Charts’ dialogue box, select simple if you need to
draw a graph for a single variable. Then select the ‘Summaries for
groups of cases’ option and press define to proceed.
Step 4: Click the ok button in the ‘Define Simple Bar: Summaries for
Groups of Cases’ dialogue box to generate the output as presented
below.
Step 5: Double click on the bar graph, to make changes in the graph.
Example 3.5: Draw a bar graph for the following frequency table for
ages.
[Table: Age in Categories, with columns Frequency, Percent, Valid Percent
and Cumulative Percent]
Enter the data into SPSS by allocating a code to each of the age
categories, and carry out the previously discussed steps to draw the bar
graph.
Interpretation
In MINITAB,
Step 2: In the ‘Bar Chart’ dialogue box, under Bars represent, select
‘Counts of unique values’ option and simple graph. Then press ok.
Step 3: In the ‘Bar Chart’ dialogue box, select ‘blood group’ as the
categorical variable and press ok button.
[Figure: MINITAB bar chart of Blood Group. x-axis: Blood Group (A, B, O);
y-axis: Count]
According to the above bar graph, there are 14 people in blood group A,
6 people in blood group B and 10 people in blood group O.
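The heights of the bars can be verified by counting the blood groups directly. The following minimal Python sketch does this for the 30 recorded blood groups; the variable names are illustrative.

```python
from collections import Counter

# Blood groups of the 30 people, as listed in the example data.
blood_groups = "OAOABAOOAAOBOBOAOOAAAABBAABAAO"

# Count how many times each blood group occurs.
counts = Counter(blood_groups)

for group in sorted(counts):
    print(group, counts[group])
```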
Step 2: Place the variables in the rows rather than in the columns, and
request the row percentages. In addition, for the Category Position,
choose the option "Row Labels in Columns".
                                      Strongly                              Strongly
Statement                             Disagree  Disagree  Neutral  Agree    Agree
Q01. I'm satisfied with my basic
     salary                             15%       35%       30%     10%      10%
Q02. I'm satisfied with the overtime
     payment scheme of the company.     10%       45%       20%     20%       5%
Q03. I'm satisfied with the bonus
     scheme of this company.            10%       40%       30%     10%      10%
Q04. I'm satisfied with the
     transparency of the incentive
     scheme of this company.            10%       40%       25%     15%      10%
Q05. I'm satisfied with the welfare
     activities organized by the
     company.                            0%       30%       30%     30%      10%
Step 4: After obtaining the SPSS output, the bar graph can be drawn in a
similar way to that explained in Exercise 3.4.
3.2.2 Histogram
In SPSS,
Step 2: In the ‘Histogram’ dialogue box, under variable, include ‘age in
categories’ and press ok.
Example 3.8: Ages of 38 people are given below.
34, 60, 40, 72, 37, 33, 42, 62, 49, 32, 52, 40, 31, 19, 68, 55, 58, 54, 37,
32, 54, 38, 20, 50, 56, 48, 35, 52, 29, 56, 68, 65, 45, 42, 54, 39, 29, 56
Step 2: In the ‘Histogram’ dialogue box, select ‘simple’ and press ok to
proceed.
[Figure: MINITAB histogram of Age. x-axis: Age; y-axis: Frequency]
3.2.3 Pie Chart
OAOABAOOAAOBOBOAOOAAAABBAABAAO
Draw a pie chart for the Blood groups using SPSS and MINITAB.
In SPSS,
Step 2: In the ‘Pie Charts’ dialogue box, select ‘Summaries for groups of
cases’ and click the define button.
Step 3: In the ‘Pie Charts’ dialogue box, insert ‘Blood Group’ into the
space named ‘Define Slices by’ and press ok to proceed.
In MINITAB,
Step 2: In the ‘pie chart’ dialogue box, select ‘Blood Group’ as the
categorical variable and press ok.
[Figure: MINITAB pie chart of Blood Group. Categories: A (46.7%), B (20.0%),
O (33.3%)]
Interpretation
According to the above pie chart, 46.7% of the persons have blood group A
and 33.3% of the persons have blood group O. Only 20% of the individuals
have blood group B.
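The slice percentages can be cross-checked against the raw blood group data. The short Python sketch below converts the counts into percentages of the 30 people; it is an illustration only, and the variable names are hypothetical.

```python
from collections import Counter

# Blood groups of the 30 people, as listed in the example data.
blood_groups = "OAOABAOOAAOBOBOAOOAAAABBAABAAO"

counts = Counter(blood_groups)
total = sum(counts.values())

# Share of each blood group as a percentage, rounded to one decimal place.
shares = {group: round(100 * n / total, 1)
          for group, n in sorted(counts.items())}

for group, share in shares.items():
    print(f"{group}: {share}%")
```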
Exercise 3.10: Draw a scatter plot using the data given below.
X Y
1 5
2 8
3 12
4 10
5 14
6 9
7 15
8 14
9 17
10 21
In SPSS,
Step 3: Select ‘Y’ variable as the Y axis and ‘X’ variable as the X axis and
press ok.
Step 4: Generated SPSS output is given below.
Interpretation
According to the above scatter plot, there’s a positive relationship
between X and Y.
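The positive relationship read from the scatter plot can be quantified with the Pearson correlation coefficient. Neither the coefficient nor the helper function `pearson_r` appears in the steps above; this is a minimal sketch, written with the Python standard library only, of one way to put a number on the relationship.

```python
import math

x = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
y = [5, 8, 12, 10, 14, 9, 15, 14, 17, 21]


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equally long lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of cross products and of squared deviations from the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(xs, ys))
    sxx = sum((a - mean_x) ** 2 for a in xs)
    syy = sum((b - mean_y) ** 2 for b in ys)
    return sxy / math.sqrt(sxx * syy)


r = pearson_r(x, y)
print(f"r = {r:.3f}")
```

A value of r close to +1 indicates a strong positive linear relationship, which is consistent with the upward pattern of the points.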
In MINITAB,
Step 2: In the ‘Scatter plots’ dialogue box, select ‘simple’ and press ok.
Step 3: Select ‘Y’ variable as the Y axis and ‘X’ variable as the X axis and
press ok.
[Figure: MINITAB scatterplot of Y vs X. x-axis: X; y-axis: Y]
3.2.5 Line graph
Exercise 3.11: Draw a line graph for the data given in the table.
In SPSS,
Step 2: Drag and drop the Line graph in the white space provided above.
(Refer below graph)
*Note: In MINITAB, a line graph can be drawn using the scatter plot.
Step 2: In the ‘scatter plot’ dialogue box, select ‘with connect line’ and
press ok button.
Step 3: In the ‘scatter plot’ dialogue box enter ‘sales’ as the Y variable
and ‘months’ as the x variable and press ok button.
Step 4: Generated MINITAB output is given below.
[Figure: MINITAB line graph of sales by month. y-axis: Sales]
Interpretation
According to the above line graph, sales fluctuate over the period from
January 2018 to June 2019.
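The fluctuations can also be seen numerically by computing the change in sales from one period to the next. The sketch below is an illustration only: it uses just those rows of the sales table that are listed in this guide (December 2018 to June 2019), and the variable names are hypothetical.

```python
# Rows of the sales table that are listed in this guide: (period, sales).
sales = [
    ("2018 December", 380),
    ("2019 January", 388),
    ("2019 February", 358),
    ("2019 March", 320),
    ("2019 April", 335),
    ("2019 June", 335),
]

# Change in sales from each listed period to the next listed period.
changes = [
    (later[0], later[1] - earlier[1])
    for earlier, later in zip(sales, sales[1:])
]

for period, change in changes:
    print(f"{period}: {change:+d}")
```

A mix of positive and negative changes corresponds to the rises and falls of the line in the graph.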
A boxplot consists of a box and two tails. The horizontal line inside the
box shows the position of the median, and the upper and lower edges of
the box are the upper and lower quartiles. The tails run to the most
extreme values. In sum, a boxplot shows the structure of the data,
including its skewness and spread.
Example 3.12: The GPAs of students at the university are recorded as follows.
3.75, 3.98, 3.92, 3.32, 3.44, 3.10, 2.54, 2.43, 2.80, 2.41
In SPSS,
Step 2: In the ‘Box plot’ dialogue box, select the simple option and, under
Data in chart, select ‘summaries of separate variables’. Then press define
to proceed.
Step 3: In the ‘Box plot’ dialogue box, insert GPA to the space labeled as
‘Boxes represent’ and press ok to proceed.
Step 4: Generated SPSS output is given below.
[Figure: SPSS boxplot of GPA, annotated from top to bottom with Max, Q3,
Q2 (Median), Q1 and Min]
A box plot is used to display the distribution of data using the
five-number summary, i.e. minimum, Q1 (first quartile), Q2 (second
quartile, the median), Q3 (third quartile) and maximum.
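The five-number summary behind the box plot can be computed directly from the ten GPA values. This is a minimal sketch using Python's standard `statistics` module. Several conventions exist for computing quartiles, so the Q1 and Q3 values from this sketch (which uses the module's default "exclusive" method) are an assumption and may differ slightly from those reported by other software.

```python
import statistics

gpa = [3.75, 3.98, 3.92, 3.32, 3.44, 3.10, 2.54, 2.43, 2.80, 2.41]

# quantiles() with n=4 returns the three quartile cut points Q1, Q2, Q3.
# The default "exclusive" method is one of several quartile conventions.
q1, q2, q3 = statistics.quantiles(gpa, n=4)

five_number = {
    "min": min(gpa),
    "Q1": q1,
    "median": q2,
    "Q3": q3,
    "max": max(gpa),
}

for name, value in five_number.items():
    print(f"{name:>6}: {value:.4f}")
```

The box in the plot spans Q1 to Q3, the line inside it marks the median, and the tails extend towards the minimum and maximum.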
In MINITAB,
Step 2: In the ‘Box plot’ dialogue box, select simple option and press ok
button to proceed.
Step 3: In the ‘Box plot’ dialogue box, insert GPA to the space labeled as
‘Boxes represent’ and press ok to proceed.
Step 4: Generated MINITAB output is given below.
[Figure: MINITAB boxplot of GPA. y-axis: GPA]
Interpretation
REFERENCES