MBT1033 Lecture Note

Transportation Quantitative Techniques

INTRODUCTION TO SPSS AND MS EXCEL
Dr. Muhammad Zaly Shah B. Muhammad Hussein [1]
Department of Urban and Regional Planning
Faculty of Built Environment

2.1 Introduction

Data can be easily analysed manually if the amount is manageable. However,
if the amount of data that we need to process exceeds a certain limit, we will
find that even the simplest data analysis becomes troublesome. Nowadays,
this problem is easily solved using the many software applications that are
available in the market. These applications range from very general
spreadsheet applications like MS Excel and Lotus 1-2-3 to more advanced
statistical applications like SPSS, Minitab, S-Plus and SAS. These
applications are not created equal; they all have their strengths and
weaknesses. Spreadsheet applications are easier to learn but lack advanced
statistical functions. Statistical applications like SPSS and SAS, on the
other hand, have more data analysis capabilities but require advanced
mathematical knowledge to capitalise fully on their power. For our purpose, we
will learn both MS Excel and SPSS to see the differences between a
spreadsheet application and a statistical software package.

At the end of this lecture, students should be able to:

• Create, edit and save data in SPSS and MS Excel
• Perform simple data manipulation in SPSS and MS Excel
• Carry out basic data management in SPSS and MS Excel
• Carry out simple data analysis using SPSS and MS Excel

[1] Email: [email protected], Office: B11-307, Ph: (07) 5537348, HP: (013) 7426251
© 2003-2007 Muhammad Zaly Shah
MBT1033 Introduction to SPSS and MS Excel

2.2 Statistical Analysis with SPSS

When we have a lot of data and some statistical analysis to perform, we would
use statistical software like SPSS. Even though the actual implementation
differs from one statistical package to another, they share some common steps
in performing statistical analysis, which are:

Step 1: Perform Data Entry

Step 2: Perform the Operation

Step 3: Analyse the Results

Let’s look at how we perform each step in SPSS. But first we need to start
SPSS. Once you have started SPSS, you will be greeted by the following
screen (Figure 2-1):

Figure 2-1: Options for starting SPSS


STEP 1: Data Entry

In Figure 2-1, you are presented with several choices on how to start
your SPSS session. The most common actions are:

1. To create a new data file with fresh data,
2. To open an existing SPSS data file, and
3. To open another type of data file, e.g. an MS Excel file, a text file, etc.

Creating a new, blank data editor

Click on the appropriate radio button. The default selection is to open an
existing data source, i.e. an SPSS file into which you have already entered
data. In this tutorial, we will start with a blank data sheet to enter a fresh
set of data. So, click CANCEL.

Tips To enter a fresh new set of data, you can select the
‘Type in data’ radio button and click OK. An easier
approach, however, is just to click CANCEL.

Figure 2-2: A blank SPSS data editor

In Figure 2-2, you will notice several important components of SPSS that you
must know in order to move around and operate SPSS effectively. These
components are:


1. The title bar – showing the name of the current active file. In this
instance we have not named our data file, so SPSS simply denotes
our current file as ‘Untitled’. The title bar also shows that we are
currently in the ‘SPSS Data Editor’ and not, for example, in
‘Viewer’.

2. The menu bar – provides the various processes that can be performed
in SPSS. Examples of these processes are data processing (‘Data’),
analysing data (‘Analyze’), charting data (‘Graphs’), etc.

3. The tool bar – provides easy access to some common actions in
SPSS. For example, to open a data file, you can either select the
menu File > Open > Data …, or simply click the corresponding button on
the tool bar.

Once you are in the data editor, you can enter your data. The data in SPSS are
arranged in row-column format. The row represents the observation while the
column represents the variable.

Figure 2-3: Sample data in Data View

In Figure 2-3, we can see that the cursor is at row 6 and column 2, with the
cell having the value of 2.45.
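The row-column arrangement can also be pictured outside SPSS. The following Python sketch is illustrative only: the helper functions are hypothetical, and every value is made up for the example except the 2.45 in row 6, column 2, which follows the description of Figure 2-3.

```python
# Each inner list is one row (an observation); each position is a column (a variable).
# All values are made up for illustration, except 2.45 in row 6, column 2.
columns = ["var00001", "var00002"]
rows = [
    [55.0, 3.10],
    [58.5, 2.80],
    [61.0, 3.45],
    [64.0, 3.00],
    [70.0, 3.60],
    [65.0, 2.45],
]


def cell(row_number, column_number):
    """Return the value at a 1-based row and column, as counted in the Data Editor."""
    return rows[row_number - 1][column_number - 1]


def column_values(name):
    """Return every observation of one variable."""
    index = columns.index(name)
    return [row[index] for row in rows]


print(cell(6, 2))
print(column_values("var00001"))
```

Reading down a column gives all the observations of one variable, while reading across a row gives all the variables of one observation.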

Currently, the description of our data is not intuitive. In other words, we do
not know what var00001 and var00002 represent. To make it clearer, we can
change each column heading so that it shows what the data in that column
represent.


Setting variable’s name

To name a column, just double-click the appropriate column heading. For
example, double-clicking the heading of column 1 titled ‘var00001’ will bring
us to the ‘Variable View’ (see Figure 2-4). Once in the ‘Variable View’, we can
do many things to our variables. These actions include:

1. Naming a variable – to make our variable more intuitive rather than
just referring to ‘var00001’. However, the name of our variable is
limited to just 8 characters.
2. Specifying the variable type – a variable can take many forms:
numeric, string, date, etc.
3. Specifying the variable length and the number of decimal places it
has.
4. Adding labels – descriptive text for describing and labelling data
values. These descriptive labels are used in statistical reports and
charts. For example, you could assign the labels 'Male' and 'Female'
to the numeric values 1 and 2.
5. Handling missing values – telling SPSS what to do when it encounters
missing values in our data file.
6. Data presentation – whether we want our data centred, right-aligned
or left-aligned.
7. Data measurement – whether our data are scale, ordinal or nominal.
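The idea behind value labels and missing values in the list above can be sketched in a few lines of Python. This is only an illustration, not how SPSS stores labels internally: the variable gender_codes, the helper label_for and the text ‘(missing)’ are all hypothetical, and only the mapping of 1 to ‘Male’ and 2 to ‘Female’ comes from the example above.

```python
# Value labels: numeric codes are stored in the data, descriptive text is shown in reports.
value_labels = {1: "Male", 2: "Female"}

# Hypothetical coded observations; None stands for a missing value.
gender_codes = [1, 2, 2, None, 1]


def label_for(code, labels, missing_text="(missing)"):
    """Translate a stored code into its label, flagging missing or unlabelled codes."""
    if code is None:
        return missing_text
    return labels.get(code, str(code))


labelled = [label_for(code, value_labels) for code in gender_codes]
print(labelled)
```

Keeping the codes numeric and attaching the labels separately means the data stay compact while reports and charts remain readable.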

Figure 2-4: Naming a variable and setting its specifications


For our data in Figure 2-3, let’s name our var00001 data as weight and
var00002 as cpa which stands for ‘cumulative grade point average’. The variable
weight, then, is numeric with a width of 4 with 1 decimal place. The variable
cpa is also numeric but with a width of 3 with 2 decimal places. The results
after all the changes are shown in Figure 2-5 and Figure 2-6.

Figure 2-5: Changing variable description and specification

Figure 2-6: Data Editor with variables renamed


Saving a file

Now, once we are done entering and specifying our data, let’s save our data
so that we do not lose it in the event of some unfortunate accident such as
a power failure.

Tips It is always ‘good’ computing practice to save your data
frequently. Not just that, it is also advisable that
you make back-up copies of your important data.
Additionally, do not store your back-up copies
together with your original data. For example, if
your original data is on your hard disk then your
back-up copy should not also be on your hard disk.
Rather, have it saved on a floppy disk or a CD.

To save our data, click the menu File > Save (Figure 2-7) or click the
Save button on the tool bar.

Figure 2-7: Saving a file from the menu

Once you have done the above, the ‘Save Data As’ dialog box will appear
prompting you to specify the location, the name and type of the file you want
to save (see Figure 2-8).


Figure 2-8: The Save Data As dialog box

Let’s name our file ‘Data1.sav’ (note that ‘.sav’ is the native file
extension for SPSS data files).

Once we have successfully saved our file, you will notice that the ‘Untitled’
file name given previously by SPSS has changed to ‘Data1’ in the title bar.
This is shown in Figure 2-9.

Figure 2-9: The file name as it appears on the title bar.

STEP 2: Perform the Operation

Once we have successfully created (and saved) our data, we are ready for
the next process, which is telling SPSS what we want to do with the data. To
most statisticians, this is the most interesting phase as they are finally
able to see, test and analyse their data. To them, this is ‘play’ time.

In this tutorial, it is not possible to cover every single statistical operation that
SPSS can perform. Rather, we will just select one simple activity that will
roughly show how easy it is to conduct statistical analysis using SPSS.

Performing statistical analysis

Using our ‘Data1’ data, we will conduct a descriptive statistical analysis. In
descriptive analysis, we try to describe the characteristics of our data.
These include the minimum, maximum, mean and standard deviation.

In SPSS, almost all the statistical analyses that you may want to perform are
located in the Analyze menu. The same goes for the descriptive statistical
analysis that we want to conduct.

To proceed with our descriptive statistical analysis, select the menu Analyze >
Descriptive Statistics > Descriptives … as shown in Figure 2-10.

Figure 2-10: Performing a statistical analysis in SPSS

Once you have done this, the following dialog box, as in Figure 2-11, will
appear as SPSS requires additional information before it can process your
request.


Figure 2-11: Dialog box for performing a descriptive statistical analysis

First, once the dialog box in Figure 2-11 appears, select the variable that
you want SPSS to conduct the descriptive analysis on. The ‘Variable(s)’ list
can hold more than one variable, but in this example we will analyse just one
of our two variables. Select the variable weight, which is shown as Step 1 in
Figure 2-11.

Next, bring the selected variable, i.e. weight, to the area on the right under the
title ‘Variable(s)’. This is done simply by clicking the ‘arrow box’ shown in
Step 2. The result is shown in Figure 2-12 and clicking the OK button will
automatically make SPSS perform the descriptive statistical analysis.

Figure 2-12: The dialog box once selection has been made

Once SPSS has successfully completed the descriptive statistical analysis
that we requested, it will display the results in a separate window, the
SPSS Viewer window. This is shown in Figure 2-13.


Saving an output file

Note that SPSS will automatically name the result Output1 for the first
statistical output that it produces in the current working session. Of course,
you can give your output any name you like when you save it. To save the
output, select the menu File > Save or, alternatively, click the Save button
on the tool bar.

Figure 2-13: The output viewer in SPSS

STEP 3: Analysing and Interpreting the Result

The ‘play’ time is over once you have your result. From here on, it is serious
business as this is arguably the most important step in any statistical analysis.

Similarly, once we have successfully performed the descriptive statistical
analysis in Step 2 and obtained the result, we are left with analysing and
interpreting it.

However, for this example, there is not much interpretation to be done as it is
only a descriptive analysis. Nonetheless, we got some information out of our
exercise. For example, we know that we have n = 5 observations, and that the
minimum value is 55.0 while the maximum is 70.0. Also, the data have a mean of
62.217 and a standard deviation of 5.575, which tells us how widely the values
are dispersed around the mean.
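The same descriptive measures can be reproduced outside SPSS. The sketch below is illustrative only and uses Python's standard statistics module. The five weights are hypothetical values chosen to match the reported n, minimum and maximum. They are not the contents of ‘Data1.sav’, so the mean and standard deviation it prints will differ from those quoted above.

```python
import statistics

# Hypothetical weights chosen only to match n = 5, minimum 55.0 and maximum 70.0.
# They are NOT the values stored in Data1.sav.
weights = [55.0, 58.5, 61.0, 64.0, 70.0]

summary = {
    "n": len(weights),
    "minimum": min(weights),
    "maximum": max(weights),
    "mean": statistics.mean(weights),
    # statistics.stdev is the sample standard deviation (it divides by n - 1),
    # the same denominator as the variance formula in the exercise of this note.
    "std_dev": statistics.stdev(weights),
}

for name, value in summary.items():
    print(f"{name}: {value}")
```

A standard deviation that is small relative to the mean indicates that the observations sit close to the mean, while a large one indicates that they are widely spread.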


2.3 Data Manipulation with MS Excel

First and foremost, MS Excel, unlike SPSS, is not dedicated statistical
software, although it has some limited capability to perform statistical
analysis. MS Excel is, in fact, a spreadsheet application and it is most
useful when you have a lot of data to manipulate.

Depending on the version that you use and the amount of customisation that has
been made to your copy of MS Excel, your MS Excel screen might be slightly
different. A screenshot of an MS Excel workbook is shown in Figure 2-14.
Nonetheless, you will notice that MS Excel, more commonly known simply as
Excel, shares many common features with SPSS.

Figure 2-14: Blank workbook in MS Excel

One of the most striking similarities between Excel and SPSS is that both
operate using the cell paradigm, apart from the fact that both also have a
title bar, menu bar and tool bar.

Columns and rows in Excel

In Excel, the columns are represented by letters (A, B, C, …) while the rows
are denoted with Arabic numbers (1, 2, 3, …). Therefore, you have, for
example, cell A1, which refers to the cell in column A, row 1.
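As an illustration of this addressing scheme, the Python sketch below builds a cell address from a column number and a row number. The function names are hypothetical. The continuation of column letters after Z (AA, AB, …) is standard Excel behaviour but goes slightly beyond what this note covers.

```python
def column_letters(column_number):
    """Convert a 1-based column number to Excel-style letters (1 -> A, 27 -> AA)."""
    if column_number < 1:
        raise ValueError("column_number must be 1 or greater")
    letters = ""
    while column_number > 0:
        # Subtract 1 because the letters A..Z have no symbol for zero.
        column_number, remainder = divmod(column_number - 1, 26)
        letters = chr(ord("A") + remainder) + letters
    return letters


def cell_address(column_number, row_number):
    """Join the column letters and the row number, e.g. (1, 1) -> 'A1'."""
    return f"{column_letters(column_number)}{row_number}"


print(cell_address(1, 1))
print(cell_address(2, 13))
```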

An Excel feature that is not available in SPSS is that Excel uses the workbook
paradigm where a single workbook can contain many data sheets. Figure 2-15
shows an example of a workbook which contains several data sheets, each
having its own set of data. Also, Figure 2-15 illustrates the many things that
you can do with Excel and you are not confined to performing just statistical
analysis as with SPSS.


Figure 2-15: An example Excel workbook with several data sheets

STEP 1: Data Entry

As with SPSS, the first step in our work is to enter the data into Excel. To
illustrate, we are going to use some actual data. The data that we are going to
create is an excerpt from a study by the Economic Planning Unit, Prime
Minister’s Department on the Malaysian Quality of Life (EPU, 1999, p. 112) [2].
The data shows a comparison between several countries in the world
on the number of cellular phone users per 1000 population and the number of
telephone lines per 1000 population in the year 1997. The Excel sheets that
contain these statistics are shown in Figure 2-16 below:

Figure 2-16: Data entry in Excel

[2] EPU (1999). Kualiti Hidup Malaysia 1999. Kuala Lumpur: Economic Planning Unit, Prime
Minister’s Department, Malaysia.


Notice from Figure 2-16 that, except for column A, it is impossible to tell
which of the remaining two columns refers to the statistic on cellular phone
users. To avoid this confusion, we need to name our columns. In SPSS,
double-clicking the column heading brings up a view where we can name our
variables. In Excel, we must do this differently, by having cells that act as
column headers. Unfortunately, the data sheet in Figure 2-16 has no space left
at the top of the sheet for our column headers. We can solve this by inserting
a new row at the top.

Figure 2-17: Inserting a row

Inserting a row in Excel

To insert a new row in Excel, first right-click on the number of the row above
which you want to insert the new row. Therefore, if you want to insert a new
row on top of row 1, right-click on the number 1 that signifies row no. 1.

Upon right-clicking row 1, the entire row will be highlighted and a pop-up
menu will be displayed (Figure 2-17). Among the options available in this
pop-up menu is one called ‘Insert’. Click Insert and a new row is
automatically inserted on top of row 1. Our previous row 1 has now been
renumbered to become row 2, and the other rows are renumbered likewise.

Go ahead and enter the following info, shown in Figure 2-18, so that the
blank new row (i.e. row 1) will now be our column header.
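The effect of inserting a row and then typing a header into it can be sketched with a Python list of rows. This is an illustration only: the country names, figures and header texts below are hypothetical placeholders, not the values shown in Figures 2-16 and 2-18.

```python
# Hypothetical sheet contents; the actual figures in Figure 2-16 are not reproduced here.
sheet = [
    ["Country X", 10, 100],  # row 1 before the insert
    ["Country Y", 20, 200],  # row 2 before the insert
]

# 'Insert' on row 1 adds a blank row at the top and pushes every other row down by one.
sheet.insert(0, [None, None, None])

# Typing the header text into the new, blank row 1 (header names are assumptions).
sheet[0] = ["Country", "Cellular phone users", "Telephone lines"]

for row_number, row in enumerate(sheet, start=1):
    print(row_number, row)
```

After the insert, the former row 1 is found at row 2, which mirrors the renumbering that Excel performs.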


Figure 2-18: Data sheet with column header

Adjusting column width

You will notice that some of the header text is hidden, as in the header of
column B, which is partially covered by the header of column C. This happens
because the header text of column B is wider than column B itself. We can
solve this problem by double-clicking on the column border, and we can do this
for all our columns. Double-clicking the column border increases the width of
the column to fit the widest text in that column. If we do this, we will have
the result shown in Figure 2-19.

Figure 2-19: Improved appearance of data sheet


In fact, in Figure 2-19, we have also made other cosmetic changes to the data
sheet, e.g. setting a bold typeface for the header and centring the values in
columns B and C.

Saving Excel workbook

Once we have done this, let’s save our data. You can do this by selecting the
menu File > Save or by clicking the Save button. Immediately, the ‘Save As’
dialog box, as in Figure 2-20, will appear. Let’s name our workbook
‘Quality1997.xls’ (note that Excel’s native file extension is ‘.xls’).

Figure 2-20: Save As dialog box in Excel

If you have successfully saved your Excel workbook, your filename will
appear on the title bar (see Figure 2-21).

Figure 2-21: Title bar showing current file


Renaming data sheet

To rename your data sheet, you only need to double-click the sheet tab as
shown in Figure 2-21. Then, go ahead and rename your data sheet as
‘Phones’, and press Enter to apply the change. The result should be similar
to Figure 2-22.

Figure 2-22: Naming the current data sheet

STEP 2: Perform the Operation

There are countless things that you can do with Excel and, as with SPSS, it
is impossible to cover every single one of them. In this lecture note, we
will cover some operations that are pertinent to our goal of performing
quantitative analysis or data manipulation.

A. Arithmetic Operations

Entering formulas

In Excel, data manipulation is achieved by entering formulas. Some formulas
have been predefined by Excel, but most of the time we will use formulas that
we define ourselves. To enter a formula in a cell, you must start by typing
the equal (‘=’) sign. If the equal sign is not entered, Excel will treat the
formula as text, which means that no computation will be performed.

Using the addition operation as an example, we show how Excel performs
computations between (1) constants, (2) a cell value and a constant, and
(3) cell values:


i. Addition between constants

Figure 2-23 shows the formula for the addition between constants in the
formula box while the result is shown in cell E2.

Figure 2-23: Formula box and addition between constants

Tips To perform computation in Excel, first select the
cell where you want the result to appear. Then,
write the formula by first typing the equal (‘=’) sign
in the cell (or, in the formula box). Press Enter and
the result of the computation will be shown in the
cell that you have selected earlier.

ii. Addition between a cell value and a constant

Addition between a cell value and a constant can be achieved by referring to
the cell number. For example, if we want to add 5 to the value located in cell
B2, we would write ‘=B2 + 5’ in the formula box. This is shown in Figure
2-24.

Using this reference format, i.e. referring to the cell rather than its value,
makes the computation dynamic: whenever the value in cell B2 changes, the
result in cell F2 changes correspondingly.
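The difference between a fixed result and a dynamic one can be sketched in Python. This is only an analogy under stated assumptions: the dictionary cells stands in for the worksheet, the function f2 stands in for the formula ‘=B2 + 5’, and the values 10 and 20 are hypothetical.

```python
# The worksheet is modelled as a dictionary from cell number to value (hypothetical value).
cells = {"B2": 10}

# A constant-only computation is fixed once it is evaluated.
static_result = 10 + 5


def f2():
    """Stand-in for the formula =B2 + 5: it looks up B2 every time it is evaluated."""
    return cells["B2"] + 5


first = f2()

# Changing B2 changes what the formula returns, because it refers to the cell, not to 10.
cells["B2"] = 20
second = f2()

print(static_result, first, second)
```

Excel performs this re-evaluation automatically whenever a referenced cell changes, whereas the sketch has to call f2 again.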


Tips To refer to a cell, you need not type the cell
number. Instead, you can click the cell which you
want to use and the cell number will be inserted in
the formula. This way, you can avoid referring to
the wrong cell.

Figure 2-24: Addition between cell value and a constant

iii. Addition between cell values

Finally, you can also add values in two or more different cells. Figure 2-25
shows an example where values in two cells are added together. The result of
the addition operation is shown in cell G2.

Figure 2-25: Addition between cell values


The three modes of operation, i.e. between constants, between a cell value and
a constant, and between cell values, also apply to the other mathematical
operations, e.g. subtraction, multiplication, etc.

In Excel, different symbols are used to represent different arithmetic
operations (see Table 2-1):

Table 2-1: Arithmetic operators in Excel

Operation        Symbol               Examples
Addition         + (plus sign)        =B2 + 4    =B2 + B3
Subtraction      - (minus sign)       =B2 - 4    =B2 - B3
Multiplication   * (asterisk)         =B2 * 4    =B2 * B3
Division         / (forward slash)    =B2/4      =B2/B3
Exponentiation   ^ (caret)            =B2^2
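For readers who also use a programming language, the operators in Table 2-1 have close counterparts in Python, with one notable difference in exponentiation. The sketch below is illustrative only, and the values assigned to b2 and b3 are hypothetical stand-ins for cells B2 and B3.

```python
b2 = 6  # hypothetical value in cell B2
b3 = 3  # hypothetical value in cell B3

results = {
    "=B2 + B3": b2 + b3,
    "=B2 - B3": b2 - b3,
    "=B2 * B3": b2 * b3,
    "=B2/B3": b2 / b3,
    "=B2^2": b2 ** 2,  # Excel's caret corresponds to ** in Python
}

# In Python the caret is the bitwise XOR operator, not exponentiation.
xor_not_power = b2 ^ 2

for formula, value in results.items():
    print(formula, "->", value)
print("6 ^ 2 in Python ->", xor_not_power)
```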

B. Summation

As has been mentioned in Lecture Note 1, one of the most frequently
performed mathematical operations in quantitative analysis is the summation,
denoted by the ∑ symbol (i.e. the Greek capital letter Sigma). Therefore, our
coverage of Excel would be less than complete if we left out summation.

Here is how you perform summation in Excel:

1. First, click on the cell where you want your summation result to
appear. For this example, we will select cell B13 to display our
summation result. Why? Because we want to sum all the values in
column B.

Next, in cell B13, or in the formula box, type the following
formula, and press Enter:

=sum(B2:B12)

The above formula sums all the values located in cells B2 to B12.
The result is shown in Figure 2-26.


Figure 2-26: Summation by typing formula

2. An easier alternative to typing the formula is to click the AutoSum (∑)
icon on the tool bar. Let’s use this method for our data in column C.

Figure 2-27: Summation using the icon

You will see (refer to Figure 2-27) that, when you use the icon, Excel
automatically predicts that you want to sum all the values above the selected
cell. However, before you proceed with the summation, check the range of
cells that Excel will use. If this is correct, press Enter, and the result is
displayed in cell C13 (see Figure 2-28).

Figure 2-28: Summation result
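To make the meaning of a range such as B2:B12 concrete, the following Python sketch expands a range into its individual cells and adds them up. It is a simplified illustration, not Excel's implementation: the helper names are hypothetical, it handles only ranges within a single column, and the cell values are made up.

```python
import re


def expand_range(cell_range):
    """Expand a single-column range such as 'B2:B12' into its cell numbers."""
    match = re.fullmatch(r"([A-Z]+)(\d+):([A-Z]+)(\d+)", cell_range.upper())
    if match is None:
        raise ValueError(f"not a cell range: {cell_range!r}")
    start_col, start_row, end_col, end_row = match.groups()
    if start_col != end_col:
        raise ValueError("this sketch only handles ranges within one column")
    return [f"{start_col}{row}" for row in range(int(start_row), int(end_row) + 1)]


def sum_range(cells, cell_range):
    """Add the values of every cell in the range; empty cells count as 0."""
    return sum(cells.get(address, 0) for address in expand_range(cell_range))


# Hypothetical values for cells B2 to B12 (10 times the row number).
cells = {f"B{row}": row * 10 for row in range(2, 13)}

total = sum_range(cells, "B2:B12")
print(total)
```

The range B2:B12 covers eleven cells, which is why it is worth checking the range that Excel proposes before pressing Enter.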

C. Naming a Cell

If you use a value located in a particular cell a lot, it is a good idea to
give that cell a name. In this way, you avoid retyping the value
repeatedly.

Figure 2-29: Cell B13 before given a name

Let’s say that we want to name cell B13 ‘TotalCellPhone’. We accomplish this
by selecting cell B13 and then typing the name ‘TotalCellPhone’ in the name
box (see Figure 2-29). The process and its result are shown in Figure 2-30.

Figure 2-30: Defining a cell name
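A cell name is simply a second way of reaching the same cell. The Python sketch below illustrates the idea with two dictionaries. The helper value_of and the total of 770 are hypothetical, and only the name ‘TotalCellPhone’ and the cell B13 come from the example above.

```python
# Worksheet values keyed by cell number (the total is a hypothetical figure).
cells = {"B13": 770}

# Defined names map a descriptive name onto a cell number.
names = {"TotalCellPhone": "B13"}


def value_of(reference):
    """Look up a value by defined name or, failing that, by cell number."""
    address = names.get(reference, reference)
    return cells[address]


print(value_of("TotalCellPhone"))
print(value_of("B13"))
```

Both lookups return the same value, so a formula can use whichever reference is easier to read.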

Key Terms

Column Menu item Column header

Row Excel workbook Cell

Dialog box Data entry

Tool bar Right-click


Exercise
You are given the following data:

Country        People per Doctor    People per Hospital Beds
China          1,063                612
Germany        367                  118
Indonesia      7,028                1,423
Japan          608                  64
Korea          951                  300
Philippines    8,273                780
Singapore      714                  275
Thailand       4,416                765
USA            421                  194

Source: EPU (1999). Kualiti Hidup Malaysia. Kuala Lumpur: Economic Planning Unit, Prime
Minister’s Department, Malaysia. p. 113.

The above data compare selected countries in the world on two important
statistics on the quality of health facilities:

1) The number of people per physician (i.e. doctor)
2) The number of people per hospital bed

Do the following:

1. Enter the data in the ‘Quality1997.xls’ file created earlier. Use a new
data sheet and name it ‘Health’.
2. Give appropriate names to the columns.
3. Calculate the total of ‘People per Doctor’ and the total of ‘People per
Hospital Beds’.
4. Name the cell holding the sum of ‘People per Doctor’ as ‘SumDr’
and the cell holding the sum of ‘People per Hospital Beds’ as ‘SumBed’.
5. Insert the following row between Korea and Philippines:

Country        People per Doctor    People per Hospital Beds
Malaysia       2,363                501.9

6. Calculate the average of ‘People per Doctor’ and the average of
‘People per Hospital Beds’. Put these values below the cells ‘SumDr’
and ‘SumBed’, respectively. Name these two averages ‘AveDr’
and ‘AveBed’.


7. Now, calculate the variance of the variable ‘People per Doctor’,
given by the following formula:

$$\text{Variance} = \frac{\sum (x - \bar{x})^2}{n - 1}$$

where $x$ = People per Doctor
$\bar{x}$ = Average People per Doctor
$n$ = Number of observations
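The variance formula can be followed step by step in a short Python sketch. This is an illustrative cross-check, not the Excel solution expected by the exercise. It assumes the ten ‘People per Doctor’ values from the table above, including the Malaysia row added in step 5, and the variable names are hypothetical.

```python
import statistics

# 'People per Doctor' values from the exercise table, with Malaysia inserted (step 5).
people_per_doctor = {
    "China": 1063,
    "Germany": 367,
    "Indonesia": 7028,
    "Japan": 608,
    "Korea": 951,
    "Malaysia": 2363,
    "Philippines": 8273,
    "Singapore": 714,
    "Thailand": 4416,
    "USA": 421,
}

values = list(people_per_doctor.values())
n = len(values)                       # number of observations
total = sum(values)                   # corresponds to 'SumDr'
mean = total / n                      # corresponds to 'AveDr'

# Numerator of the formula: the sum of squared deviations from the mean.
squared_deviations = [(x - mean) ** 2 for x in values]
variance = sum(squared_deviations) / (n - 1)

print("n =", n)
print("sum =", total)
print("mean =", mean)
print("variance =", variance)

# Cross-check against the standard library's sample variance (also divides by n - 1).
print("statistics.variance =", statistics.variance(values))
```

In Excel the same steps would be laid out in columns: one column for the deviations, one for their squares, then a sum divided by n − 1.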


Answer
If you have done the exercise correctly, you should obtain results similar to
Figure 2-31:

Figure 2-31: Calculation of Variance
