7 - Exploring Data - 2

This document provides a summary of statistical computing concepts and SPSS commands used to explore and analyze data. It demonstrates how to obtain descriptive statistics, organize output by categorical variables, create filters for case selection, compare group means, and produce graphs including stem-and-leaf plots, boxplots, and histograms. The document walks through examples of exploring a dataset using these techniques to gain familiarity with the data.


Statistical Computing I

Dr. Md Jamal Uddin
Professor
Statistics, SUST, Sylhet

(Do not worry about the amount of information here. You do not need all of it for the examination. If you
go through this file once or twice, you will understand and remember all of the code.)

Exploring Data

First, try to solve the following exercises by clicking through the
menus in the SPSS windows, and then try to do the same work using
syntax.

1.0 SPSS commands used in this unit

descriptives   procedure for obtaining means, standard deviations, etc.
compute        creates new numeric variables
split file     organizes output by a categorical variable
filter         excludes certain cases from the analysis
use all        uses all cases in the data set
means          calculates means for different groups
examine        procedure for obtaining descriptive statistics
graph          general procedure for creating graphs
frequencies    calculates frequencies
crosstabs      calculates crosstabulations
correlations   calculates correlations

2.0 Demonstration and explanation

In this unit we will explore our data set. By "explore", we mean computing descriptive statistics for the variables that will be
important to the analysis that we plan to run. This exploration is very important because it lets us become familiar with our
data. Also, if there are any problems with the data, such as out-of-range values, we can discover them.

Let's begin by opening the data file.

Menus:
 File
 Open
 select the C: drive, the SPSS folder, and hs0.sav

Syntax:
* open the data file.
get file "c:\spss_data\hs0.sav".

We will begin by getting the descriptive statistics for some of the variables.

Menus:
 Analyze
 Descriptive Statistics
 Descriptives...
 select gender read write math science

Syntax:
* descriptives for some of the variables.
descriptives
variables=gender read write math science.
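
Since one goal of exploration is to spot out-of-range values, the following is a minimal sketch of a range check. It assumes that gender, ses and prgtype are the categorical variables in hs0.sav and that the four scores are numeric; the valid ranges themselves are not given in this handout, so you have to judge the printed minimum and maximum against what you know about the tests.

```spss
* smallest and largest value of each score.
descriptives variables = read write math science
/statistics = min max.

* every distinct code of the categorical variables.
frequencies variables = gender ses prgtype.
```

An unexpected code in the frequency tables, or a minimum or maximum outside the possible score range, points to a data-entry problem worth checking before any further analysis.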

We can organize the output by the levels of a categorical variable by sorting on that variable and then splitting the file.

Menus:
 Data
 Sort Cases
 select ses
 Data
 Split file
 Click on Organize output by groups
 select ses
 Analyze
 Descriptive Statistics
 Descriptives...
 select gender read write math science
 Data
 Split file
 Analyze all cases, do not create groups

Syntax:
* organize the output by a categorical variable.
sort cases by ses.
split file by ses.
descriptives variables = gender read write math science.
split file off.
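
The split file command also accepts a keyword that controls how the groups are laid out. The sketch below shows both layouts on two of the variables. To my understanding, the menu choice "Organize output by groups" corresponds to the separate keyword and "Compare groups" to the layered keyword, and split file without a keyword behaves like layered; treat that mapping as an assumption and compare the output with what the menus give you.

```spss
sort cases by ses.

* one separate set of tables for each level of ses.
split file separate by ses.
descriptives variables = read write.

* all levels of ses stacked inside a single table.
split file layered by ses.
descriptives variables = read write.

split file off.
```

In both cases the file must be sorted by ses first, and split file off returns to analyzing all cases as one group.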

Now we will do the same thing, but we will look only at the records for students who earned reading scores of 60 or above.

Menus:
 Data
 Select Cases...
 select "if condition is satisfied"
 if read >= 60
 Analyze
 Descriptive Statistics
 Descriptives...
 select gender read write math science

Syntax:
* create a filter for reading scores 60 and above,
* and recalculate the descriptive statistics.
compute f_read60=(read >= 60).
filter by f_read60.
execute.
descriptives
variables=gender read write math science.
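
If the restriction is needed for one procedure only, a filter variable is not the only option. The following is a minimal sketch of an alternative that uses temporary together with select if; it is not part of the original handout. The condition then applies to the next procedure only, and no new variable is created.

```spss
* restrict the next procedure to reading scores of 60 and above.
temporary.
select if (read >= 60).
descriptives variables = gender read write math science.
```

Be careful to keep the temporary line: select if on its own permanently drops the non-selected cases from the active data set, whereas a filter only hides them.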

For the next example, we will select a different set of cases to be analyzed.  We will begin by using all of the cases and then provide
the selection criteria. 

Menus:
 Data
 Select Cases...
 select "all cases"
 Data
 Select Cases...
 select "if condition is satisfied"
 if prgtype = "academic"
 Analyze
 Descriptive Statistics
 Descriptives...
 select gender read write math science

Syntax:
* after removing the previous filter (with the "use all" command), create
* a new filter and recompute the descriptive statistics.
use all.
compute f_acad=(prgtype="academic").
filter by f_acad.
execute.
descriptives
variables=gender read write math science.

Instead of selecting cases based on the value of a variable, we will now look at cases that fall into a range.  As before, we will start by
resetting the selection criteria to include all cases.  Next, we will specify the range of cases that we want included in the analysis.

Menus:
 Data
 Select Cases...
 select "all cases"
 Data
 Select Cases...
 select "based on time or case range"
 range 1 to 40
 Analyze
 Descriptive Statistics
 Descriptives...
 select gender read write math science

Syntax:
* after removing the previous filter, select the first 40 cases.
filter off.
use 1 thru 40.
execute.
descriptives
variables=gender read write math science.

Now we are going to move on to some different types of analyses.  We will begin by using all of the cases in the data set.  Then we
will compare the means of the variables read, write, math and science broken down by prgtype. 

Menus:
 Data
 Select Cases...
 select "all cases"
 Analyze
 Compare Means
 Means...
 select read write math science as the dependent variables
 select prgtype as the independent variable

Syntax:
* compare means using all cases.
use all.
means tables = read write math science
by prgtype.
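
The means procedure can report more than the group means. As a small sketch, the cells subcommand below also requests the group size, standard deviation, minimum and maximum for each program type; the choice of statistics here is only an illustration.

```spss
* group means plus count, spread and range for each program type.
means tables = read write math science by prgtype
/cells = mean count stddev min max.
```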

We can do some basic graphics, such as stem and leaf plots, boxplots and histograms.

Menus (stem and leaf plot):
 Analyze
 Descriptive Statistics
 Explore...
 select write as the dependent variable
 click "plots..." button
 select "stem and leaf"

Menus (boxplot):
 Graphs
 Legacy Dialogs
 Boxplot...
 select "simple" and "summaries for groups of cases"
 click on "define"
 select write as the variable and gender as the category axis

Menus (histogram with normal curve):
 Graphs
 Legacy Dialogs
 Histogram...
 select write and check "Display normal curve" box

Menus (histograms from Frequencies):
 Analyze
 Descriptive Statistics
 Frequencies...
 select ses
 click on "Charts"
 select "histograms"
 Analyze
 Descriptive Statistics
 Frequencies...
 select write
 click on "Charts"
 select "histograms"

Syntax:
* stem and leaf plot.
examine variables = write
/plot stemleaf.

* boxplot.
examine variables = write by gender
/plot = boxplot
/statistics = none.

* histogram.
graph
/histogram(normal) = write.

* histogram.
frequencies variables = ses
/histogram.

frequencies variables = write
/histogram.
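
The separate plots above can also be requested in a single examine command. The following is a sketch, not part of the original handout, that asks for the boxplot, the stem and leaf plot and a histogram of write for each gender in one pass.

```spss
* boxplot, stem and leaf plot and histogram of write, by gender.
examine variables = write by gender
/plot = boxplot stemleaf histogram
/statistics = descriptives.
```

Replacing /statistics = descriptives with /statistics = none keeps only the plots, as in the boxplot example.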

Now we will look at some crosstabulations and correlations.

Menus (crosstabs):
 Analyze
 Descriptive Statistics
 Crosstabs...
 select prgtype for the rows and ses for the columns
 OK

Menus (correlations):
 Analyze
 Correlate
 Bivariate...
 select read write math science

Menus (correlations with listwise deletion):
 Analyze
 Correlate
 Bivariate...
 select read write math science
 click on "Options..."
 click to "Exclude cases listwise"

Syntax:
* crosstabs.
crosstabs
/tables = prgtype by ses.

* correlations.
correlations
/variables=read write math science.

* changing from pairwise (the default) to listwise
* deletion of missing data.
correlations
/variables=read write math science
/missing=listwise.
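
A table of raw counts is often easier to read with percentages added. The sketch below is an illustration, not part of the original handout; it adds row and column percentages to the same crosstabulation through the cells subcommand.

```spss
* counts with row and column percentages.
crosstabs
/tables = prgtype by ses
/cells = count row column.
```

Row percentages show how each program type is distributed over the ses levels, and column percentages show how each ses level is distributed over the program types.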

Let's do some more graphics.  The graphical representation of a correlation is a scatterplot, so let's try a couple of those.

Menus (scatterplot):
 Graphs
 Legacy Dialogs
 Scatter/Dot...
 Simple Scatter
 click on "Define"
 select write for the y-axis and read for the x-axis

Menus (scatterplot matrix):
 Graphs
 Legacy Dialogs
 Scatter/Dot...
 Matrix Scatter
 Define
 select read math science write as matrix variables

Syntax:
* scatterplot.
graph
/scatterplot = read with write.

* scatterplot matrix.
graph
/scatterplot(matrix) = read write
math science.

3.0 Syntax version

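* opening the data file.
get file "c:\spss_data\hs0.sav".

* descriptives for some of the variables.
descriptives
variables=gender read write math science.

* organize the output by a categorical variable.
sort cases by ses.
split file by ses.
descriptives variables = gender read write math science.
split file off.
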
* create a filter for reading scores 60 and above,
* and recompute the descriptive statistics.
compute f_read60=(read >= 60).
filter by f_read60.
execute.
descriptives
variables=gender read write math science.

* after removing the previous filter (with the "use all" command), create
* a new filter and recompute the descriptive statistics.
use all.

compute f_acad=(prgtype="academic").
filter by f_acad.
execute.

descriptives
variables=gender read write math science.

* after removing the previous filter, select the first 40 cases.
filter off.
use 1 thru 40.
execute.

descriptives
variables=gender read write math science.

* compare means using all cases.
use all.
means tables = read write math science by prgtype.

* stem and leaf plot.
examine variables = write
/plot stemleaf.

* boxplot.
examine variables = write by gender
/plot = boxplot
/statistics = none.

* histogram.
graph
/histogram(normal) = write.

* histogram.
frequencies variables = ses
/histogram.

frequencies variables = write
/histogram.

* crosstabs.
crosstabs
/tables = prgtype by ses.

* correlations.
correlations
/variables=read write math science.

* changing from pairwise (the default) to listwise deletion of missing data.
correlations
/variables=read write math science
/missing=listwise.

* scatterplot.
graph
/scatterplot = read with write.

* SPSS does not provide code for including sunflowers on the graph.

* scatterplot matrix.
graph
/scatterplot(matrix) = read write math science.

4.0 For more information

SPSS Frequently Asked Questions
 How can I do a scatterplot with regression line in SPSS?
 How can I graph two (or more) groups using different symbols?
 How can I display overlapping data points on a scatterplot?
 How do I interpret the results from crosstabs?
