UNIT 6: Practical Applications of Descriptive Analytics
This unit introduces you to descriptive analytics and to
developing insights from data using SPSS (Statistical
Package for the Social Sciences). Numerous researchers
use this software for handling, analyzing, and
interpreting data. It also provides tools for managing
data, which allow researchers to select cases, create
derived data, and restructure files. In its main view,
SPSS resembles spreadsheet programs such as Microsoft
Excel and Google Sheets, and even MS Access.
LESSON 1:
THE SPSS ENVIRONMENT
OBJECTIVES:
After completing the lesson, you should be able to:
Familiarize yourself with the SPSS environment.
Duration: 5 Hours
ELECTIVE 1 | FUNDAMENTALS OF BUSINESS ANALYTICS 151
UNIT 6: PRACTICAL APPLICATIONS OF DESCRIPTIVE ANALYTICS
When SPSS starts, a dialog box may appear on your screen. If it does, click OK,
which will present a blank data window.
If the dialog box does not appear, the software should open automatically with a
blank data window. The data and output windows form the basic environment for
SPSS. A blank data window is shown below.
You will see a taskbar at the bottom of the screen. It displays an IBM SPSS icon.
Any program you open appears on the taskbar, and the one you are currently
using is highlighted.
Next, locate the three small squares in the top right-hand corner of the main IBM
SPSS screen. The one farthest to the right, with an X, closes the program you are
using. The middle square lets you enlarge the window to fill the whole screen or
shrink it to a smaller size. If the middle square shows two overlapping rectangles,
the window is already as big as it can get, and clicking this square will reduce the
window in size. The last square, to the left of the other two, has what looks like a
minus sign on it. It minimizes the window.
Variable names must begin with a letter rather than a digit. Hence, the variable
name “B9” is acceptable, while the variable name “8M” is not. To define a
variable, click the Variable View tab at the bottom of the main screen. To return
to the Data View window, click the Data View tab.
Variable View allows you to create and edit all of the variables in your data file.
Every column represents some property of a variable, and every row represents a
variable. Every variable must be given a name. To do that, click the first
blank cell in the Name column and type a valid variable name.
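The naming rule illustrated by “B9” and “8M” can be sketched in Python as follows.
This is a simplified check that only tests whether a name starts with a letter. The
function name is hypothetical, and SPSS applies additional naming rules that this
sketch does not cover.

```python
def starts_with_letter(name: str) -> bool:
    """Return True if the variable name begins with a letter.

    Simplified illustration only: real SPSS naming rules include
    further restrictions that are not checked here.
    """
    return bool(name) and name[0].isalpha()


for candidate in ("B9", "8M"):
    verdict = "acceptable" if starts_with_letter(candidate) else "not acceptable"
    print(f"{candidate}: {verdict}")
```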
Furthermore, you have the option of specifying the variable type. To do that, simply
click the Type, Width, or Decimals cell in the Variable View window.
The default type is numeric, 8 digits wide, with 2 decimal places displayed. If
values have more than 8 digits to the left of the decimal point, they will be shown in
scientific notation (e.g., the number 2,000,000,000 will be displayed as 2.00E+09).
SPSS preserves precision beyond 2 decimal places, but all output will be rounded
to 2 decimal places unless otherwise specified in the Decimals column.
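The difference between a stored value and its displayed form can be illustrated
outside SPSS. The short Python sketch below is only an analogy, not SPSS itself, and
the variable names are made up for illustration: formatting a number to 2 decimal
places changes how it is shown, not what is stored.

```python
stored_value = 3.14159  # full precision is kept in memory

# Displaying with 2 decimal places does not change the stored value.
displayed = f"{stored_value:.2f}"
print(displayed)      # 3.14
print(stored_value)   # 3.14159

# A large number shown in scientific notation with 2 decimal places.
large_value = 2_000_000_000
scientific = f"{large_value:.2E}"
print(scientific)     # 2.00E+09
```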
Example:
Create a data file with 8 variables for five sample students. Name your variables
ID, DEPARTMENT, YEAR, SECTION, DAY, TIME, WORK, and GRADE. Code the
variables as follows:
DEPARTMENT: 1 = CICT, 2 = CBA, 3 = CIT, 4 = CON, 5 = COE, 6 = COS
YEAR: 1 = First Year, 2 = Second Year, 3 = Third Year, 4 = Fourth Year
SECTION: 1 = A, 2 = B, 3 = C, 4 = D, 5 = E
DAY: 1 = Mon/Wed/Fri, 2 = Tues/Thurs
TIME: 1 = morning, 2 = afternoon
WORK: 0 = No, 1 = Part-time, 2 = Full-time
Be sure to enter value labels for the coded variables. When complete, your Variable
View window should look like the screenshot below.
Click the Data View tab to start entering data. Enter data horizontally (one row
per student), starting with the first student’s ID number. Use the given code for
each variable in the proper column.
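The idea of value labels can be sketched in Python as a lookup from numeric codes to
labels. The codebook below copies the codes given in the example. The function name
and the sample student record are hypothetical and serve only as illustration.

```python
# Codebook from the example: variable name -> {numeric code: value label}.
CODEBOOK = {
    "DEPARTMENT": {1: "CICT", 2: "CBA", 3: "CIT", 4: "CON", 5: "COE", 6: "COS"},
    "YEAR": {1: "First Year", 2: "Second Year", 3: "Third Year", 4: "Fourth Year"},
    "SECTION": {1: "A", 2: "B", 3: "C", 4: "D", 5: "E"},
    "DAY": {1: "Mon/Wed/Fri", 2: "Tues/Thurs"},
    "TIME": {1: "morning", 2: "afternoon"},
    "WORK": {0: "No", 1: "Part-time", 2: "Full-time"},
}


def label_record(record):
    """Replace numeric codes with value labels where a label is defined."""
    return {
        name: CODEBOOK.get(name, {}).get(value, value)
        for name, value in record.items()
    }


# Hypothetical student record (not taken from the source data).
student = {"ID": 1, "DEPARTMENT": 2, "YEAR": 1, "SECTION": 3,
           "DAY": 1, "TIME": 2, "WORK": 1, "GRADE": 85}
labeled = label_record(student)
print(labeled)
```

Variables without value labels (ID and GRADE here) pass through unchanged.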
1. Using the student survey questionnaire and the codebook below, enter the codes
in Data View. Verify your data entry by scanning down each column, then check
every value carefully.
Sample Survey Questionnaire
Source: Nelson, E. (n.d.). IBM SPSS Statistics for Windows, version 23: A basic tutorial.
Social Science Research and Instructional Center. Retrieved from http://ssric.org/node/582
___ Protestant ___ Catholic ___ Jewish ___ Some other religion ___ No religion
(5) What kind of marriage do you think is the more satisfying way of life?
___ One where the husband provides for the family and the wife takes care of
the house and children
___ One where both the husband and wife have jobs and both take care of the
house and children
(6) Do you think it should be possible for a pregnant woman to obtain a legal
abortion?
(8) If she is married and does not want any more children?
___ Yes ___ No ___ Don’t Know
(10) If the family has a very low income and cannot afford any more children?
id age sex rel conlib mg abd abn abh abp abr abs aba
01 20 1 4 2 2 2 2 1 3 1 2 2
02 24 2 5 2 2 1 1 1 1 1 1 9
03 21 2 2 9 2 2 2 2 2 2 2 2
04 24 2 5 3 2 1 1 1 1 1 1 1
05 26 2 4 2 2 1 1 1 1 1 1 1
06 28 2 2 2 2 2 2 1 2 1 2 2
07 23 1 1 2 2 1 2 1 1 1 2 2
08 22 2 4 3 1 1 1 1 1 1 1 1
09 22 1 5 2 2 1 1 1 1 1 1 1
10 22 2 4 4 2 1 1 1 1 1 1 1
11 23 1 2 2 1 2 2 1 2 1 2 3
12 24 2 2 3 2 1 1 1 1 1 1 2
13 51 2 1 2 9 1 1 1 1 1 1 1
14 22 2 2 3 2 1 1 1 1 1 1 1
15 21 2 4 3 2 1 1 1 1 1 1 1
16 37 1 1 3 2 1 2 1 2 1 2 2
17 22 2 4 2 2 1 1 1 1 1 2 2
18 22 2 3 3 2 1 2 1 2 1 2 2
19 22 2 4 3 2 3 2 1 2 1 1 1
20 30 2 5 2 2 1 1 1 1 1 1 1
21 25 2 5 2 2 1 1 1 1 1 1 1
22 23 1 2 2 2 1 1 1 1 1 1 1
23 21 1 1 2 1 1 1 2 1 2 1 1
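Scanning down each column to verify data entry can also be sketched in code. The
Python example below is a minimal sketch that checks that every row has one value per
column and then tallies the distinct values in each column so unexpected codes stand
out. It uses only the first three rows of the table. The column name "id" for the
first column is an assumption, and the function names are hypothetical.

```python
from collections import Counter

# Column names; "id" for the first column is an assumption.
COLUMNS = "id age sex rel conlib mg abd abn abh abp abr abs aba".split()

# First three data rows copied from the table above.
RAW_ROWS = [
    "01 20 1 4 2 2 2 2 1 3 1 2 2",
    "02 24 2 5 2 2 1 1 1 1 1 1 9",
    "03 21 2 2 9 2 2 2 2 2 2 2 2",
]


def parse_rows(raw_rows):
    """Split each row into fields and fail loudly on a wrong field count."""
    records = []
    for line_number, line in enumerate(raw_rows, start=1):
        fields = line.split()
        if len(fields) != len(COLUMNS):
            raise ValueError(
                f"row {line_number}: expected {len(COLUMNS)} values, "
                f"got {len(fields)}"
            )
        records.append(dict(zip(COLUMNS, fields)))
    return records


def column_counts(records, column):
    """Count how often each value occurs in one column."""
    return Counter(record[column] for record in records)


records = parse_rows(RAW_ROWS)
for column in COLUMNS[2:]:
    print(column, dict(column_counts(records, column)))
```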
LESSON 2:
DESCRIPTIVE STATISTICS
OBJECTIVES:
Duration: 5 Hours
To calculate a mean (average), we have SPSS summarize the dataset. Run the
command by clicking the Analyze menu, selecting Descriptive Statistics, and then
choosing Descriptives.
The Descriptives dialog box will appear on your screen. The left side of the box
holds a list of all the variables in the data file, while the right side is an area
labeled Variable(s), where we identify the variables we would like to use in
this particular analysis.
Let us compute the mean of the variable called GRADE. First, select the variable
name in the left window. To move it to the right window, click the arrow between
the two windows. The arrow always points to the window opposite the selected
item and can be used to transfer selected variables in either direction.
When you click the OK button, the analysis will be conducted, and you will be
ready to examine the output.
The output window is split into two sections. The left section is an outline of the
output, while the right section is the output itself.
The Frequencies command produces frequency distributions for the specified
variables. The output includes frequencies, percentages, valid percentages, and
cumulative percentages. Valid percentages and cumulative percentages include
only the data that are not designated as missing. Frequencies are useful for
describing samples where the mean is not useful. They are also useful for getting a
sense of the composition of your data, since they offer more information than a
mean and standard deviation alone. Furthermore, they can be useful in detecting
skew and identifying outliers.
Cumulative percentages and percentiles are valid only for data that are measured on
at least an ordinal scale. Since the output holds one line for every value of a variable,
this command works best on variables with a comparatively small number of
values.
The Frequencies command produces output that shows both the number of cases in
the sample with a particular value and the percentage of cases with that value.
Therefore, conclusions drawn should relate only to describing the numbers or
percentages of cases in the sample. If the data are at least ordinal in nature,
conclusions regarding the cumulative percentages and/or percentiles can be drawn.
Finally, obtaining frequency distributions requires only one variable in the SPSS
data file, and that variable can be of any type.
The output consists of two sections. The first section indicates the number of records
with valid data for each variable selected. Records with a blank score are listed as
missing. In this example, the data file contained 10 records.
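The relationship between percentages, valid percentages, and cumulative percentages
can be sketched in Python. This is an illustrative calculation under stated
assumptions, not SPSS output: the function name is hypothetical, the WORK codes
below are made-up data, and a blank score is represented as None.

```python
from collections import Counter


def frequency_table(values, missing=(None,)):
    """Build rows of (value, frequency, percent, valid %, cumulative %).

    Percent uses all records as the base. Valid and cumulative
    percentages use only the non-missing records.
    """
    valid = [v for v in values if v not in missing]
    counts = Counter(valid)
    n_total, n_valid = len(values), len(valid)
    rows, cumulative = [], 0.0
    for value in sorted(counts):
        freq = counts[value]
        percent = 100 * freq / n_total
        valid_percent = 100 * freq / n_valid
        cumulative += valid_percent
        rows.append((value, freq, round(percent, 1),
                     round(valid_percent, 1), round(cumulative, 1)))
    return rows, n_valid, n_total - n_valid


# Hypothetical WORK codes for 10 records, one of them blank (None).
work = [0, 1, 2, 2, 1, None, 2, 0, 2, 1]
rows, n_valid, n_missing = frequency_table(work)
print(f"Valid: {n_valid}  Missing: {n_missing}")
for row in rows:
    print(row)
```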
Because the output contains a column or a row for every value of a variable, the
Crosstabs command works best on variables with a relatively small number of
values. The SPSS data file for the Crosstabs command requires two or more
variables. Those variables can be of any type.
The example below uses Practice Exercise 1. To run the procedure, click the
Analyze menu, choose Descriptive Statistics, and then click Crosstabs. This
will open the main Crosstabs dialog box, which is shown below.
The Crosstabs dialog box initially lists all variables on the left and contains two blank
boxes labeled Row(s) and Column(s). Enter one variable (STATUS) in the Row(s)
box. Enter the second (WORK) in the Column(s) box.
The Cells button lets you specify percentages and other information to be produced
for each combination of values. Click Cells, and you will get the box shown
above, with Observed and Round cell counts selected by default. In this example,
check Row, Column, and Total under Percentages. Then click Continue. This will
return you to the Crosstabs dialog box. Then click OK to run the analysis.
Each cell contains the number of participants. For example, one part-time
participant and four full-time participants received a passing grade, while three
part-time participants and two full-time participants failed. The percentages for
each cell are also shown. Row percentages add up to 100% horizontally. Column
percentages add up to 100% vertically. The output consists of a contingency table.
Each level of WORK is given a column. Each level of STATUS is given a row. In
addition, a row is added for totals, and a column is added for totals.
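The row, column, and total percentages can be worked through in a short Python
sketch. The counts follow the cell counts described above (one part-time and four
full-time participants passed; three part-time and two full-time participants
failed). The function name and the "Pass"/"Fail" labels for STATUS are assumptions
made for illustration.

```python
from collections import Counter

# (STATUS, WORK) pairs rebuilt from the cell counts described in the text.
pairs = (
    [("Pass", "Part-time")] * 1
    + [("Pass", "Full-time")] * 4
    + [("Fail", "Part-time")] * 3
    + [("Fail", "Full-time")] * 2
)


def crosstab(pairs):
    """Return per-cell counts with row, column, and total percentages."""
    cells = Counter(pairs)
    row_totals = Counter(row for row, _ in pairs)
    col_totals = Counter(col for _, col in pairs)
    n = len(pairs)
    table = {}
    for (row, col), count in cells.items():
        table[(row, col)] = {
            "count": count,
            "row_pct": 100 * count / row_totals[row],   # sums to 100 across a row
            "col_pct": 100 * count / col_totals[col],   # sums to 100 down a column
            "total_pct": 100 * count / n,               # share of all cases
        }
    return table, row_totals, col_totals


table, row_totals, col_totals = crosstab(pairs)
for cell, stats in table.items():
    print(cell, stats)
```

For instance, the four full-time participants who passed make up 80% of the Pass row
and 40% of all ten cases.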
For example, click the Analyze menu, then Descriptive Statistics, then choose
Descriptives. This will bring up the main dialog box for the Descriptives
command. Any variables you would like information about can be placed in the box
on the right by double-clicking them or by selecting them and then clicking the
arrow. Move the variable GRADE to the right for this example.
In this example, you will receive the N (number of cases/participants), the minimum
and the maximum value, the mean, and the standard deviation.
If you would like to change the default statistics that are given, click Options in the
main dialog box. You will be given the Options dialog box shown below. Click
Continue or Cancel to close the Options box. Then click OK.
The output for the Descriptives command is shown below. Each type of output is
presented in a column, and each variable is given in a row. The example below shows
that we have one variable (GRADE) and that we obtained the N, minimum, maximum,
mean, and standard deviation for this variable.
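The five default statistics can be reproduced by hand as a check on your
understanding. The Python sketch below uses made-up GRADE values, since the actual
data are not reproduced here, and a hypothetical function name. It computes the
sample standard deviation (n - 1 in the denominator), which is assumed here to be
the form SPSS reports.

```python
import statistics


def describe(values):
    """Return N, minimum, maximum, mean, and sample standard deviation."""
    return {
        "n": len(values),
        "minimum": min(values),
        "maximum": max(values),
        "mean": statistics.mean(values),
        "std_dev": statistics.stdev(values),  # sample SD, divides by n - 1
    }


# Hypothetical GRADE values for five students.
grades = [85, 90, 78, 92, 88]
summary = describe(grades)
for name, value in summary.items():
    print(f"{name}: {value}")
```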
LESSON 3:
GRAPHING DATA
OBJECTIVES:
Duration: 5 Hours
Graphing Data
In addition to frequency distributions and measures of central tendency, graphing
is a useful way to summarize, organize, and reduce your data. With SPSS, it is
possible to make publication-quality graphs. One important advantage of using SPSS
to create your graphs instead of using other software (e.g., Excel) is that the data have
already been entered. Thus, duplication is eliminated, and the chance of making a
transcription error is reduced (Cronk, 2018).
In this graphing example, use a new set of data. Enter the data that follow by first
defining the three subject variables in the Variable View window: HEIGHT (in inches),
WEIGHT (in pounds), and SEX (1 = male, 2 = female). When you define the variables,
designate HEIGHT and WEIGHT as Scale measures and SEX as a Nominal measure.
Switch to the Data View to enter the data values for the 16 participants. Now use the
Save As command to save the file, naming it Height.sav.
Verify that you have entered the data correctly by calculating a mean for each
of the three variables (click the Analyze menu, then Descriptive Statistics, then
Descriptives). Compare your results with those in the table below.
Click the Charts button on the right side of the Frequencies dialog box to bring up
the Charts dialog box.
There are three (3) types of charts available with this command: Bar charts, Pie
charts, and Histograms. For each type, the Y-axis can be either a frequency count or
a percentage (selected with the Chart Values option).
Select Bar charts and click Continue. Next, click OK. SPSS will display the charts
for any variables selected in the main Frequencies dialog box.
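The choice between a frequency count and a percentage on the Y-axis can be
illustrated with a small text-based bar chart in Python. This is only a sketch of the
underlying arithmetic, not a reproduction of SPSS charts. The function name and the
SEX codes below are hypothetical.

```python
from collections import Counter


def bar_chart_rows(values, as_percent=False):
    """Return (category, bar height) pairs for a bar chart.

    The bar height is a frequency count, or a percentage of all
    cases when as_percent is True.
    """
    counts = Counter(values)
    total = len(values)
    rows = []
    for category in sorted(counts):
        count = counts[category]
        height = 100 * count / total if as_percent else count
        rows.append((category, height))
    return rows


# Hypothetical SEX codes (1 = male, 2 = female) for eight participants.
sex = [1, 2, 2, 1, 2, 2, 1, 2]

# Frequency counts on the Y-axis: one '#' per case.
for category, height in bar_chart_rows(sex):
    print(f"{category} | {'#' * height} {height}")

# Percentages on the Y-axis instead.
for category, height in bar_chart_rows(sex, as_percent=True):
    print(f"{category} | {height:.1f}%")
```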
Sample Output
In this sample, the Chart Builder command is accessed by clicking the Graphs menu
and then choosing Chart Builder in the submenu. This is a very flexible command that
can create a variety of graphs of excellent quality. When you first run the Chart
Builder command, you will probably be presented with the following dialog box:
This dialog box is asking you to ensure that your variables are properly defined.
Scatterplots (also called scattergrams or scatter diagrams) show two values for each
case as a point on the graph. The X-axis represents the value of one variable. The
Y-axis represents the value of the second variable.
Under Choose from in the Gallery, select Scatter/Dot. Then drag the Simple Scatter
icon (top-left) up to the main chart area as shown in the screenshot below. Dismiss
the Element Properties window that pops up by choosing Close at the bottom of that
window.
Next, drag the HEIGHT variable to the X-Axis area, and drag the WEIGHT variable to
the Y-Axis area. (Remember that standard graphing conventions indicate that
dependent variables should be Y and independent variables should be X. This would
mean that we are trying to predict weights from heights.)
Output
INTERVIEW
LESSON 4: