SPSS Introduction
2011.4.6
Preface
Statistical Software
• A program that takes numbers as input and creates tables (and figures)?
• A (collection of) program(s) for exploration, inference and modeling?
• A tool for administration, manipulation and analysis of data?
• A medium of communication with the CPU (graphics card, printer, ...) of our computer
Additional Definition
SPSS
1. Introduction
SPSS, which once stood for "Statistical Package for the Social Sciences",
was created in 1968 by Norman H. Nie, C. Hadlai (Tex) Hull and Dale H. Bent.
These three young men developed a software system based on the idea of
using statistics to turn raw data into information essential to
decision-making. The initial work on SPSS was done at Stanford University.
SPSS Inc. is a leading worldwide provider of predictive
analytics software and solutions. Today SPSS has more than
250,000 customers worldwide, served by more than 1,200
employees in 60 countries.
PASW (formerly SPSS)
SPSS is a computer program used for statistical analysis. In 2009 it was
re-branded as PASW (Predictive Analytics SoftWare).
The company announced on July 28, 2009 that it was being
acquired by IBM for US$1.2 billion.
2. How to use it
SPSS performs most tasks through its menus, but tasks can also be run from a program.
Before you analyze the data, you must set up the SPSS dataset in the
Data Editor, like this:
Once you have set up the SPSS dataset, you can choose a statistical
method from the Analyze menu to get the results.
Example: linear regression analysis using SPSS
First, enter the raw data in the Data Editor and save it as an SPSS
dataset (filename: exp13_1.sav). The Data View window is shown below.
From the Variable View window, you can see the information for all
variables in the SPSS dataset. Here it shows two variables (x: age, y: blood pressure).
Next, select the appropriate command in the Graphs menu to draw a
scatter plot, like this:
The Scatterplot dialog boxes are shown below.
The resulting scatter plot is shown below.
Next, select the Linear Regression command in the Analyze menu to run a
linear regression analysis.
The Linear Regression dialog box is shown below.
The first part of the linear regression output is shown below.
The last part of the linear regression output is shown below.
• For more information, see reference books about SPSS or www.spss.com.
Part I Set up data file
1. Set up variables
In the Variable View window, you set up the properties of every
variable in SPSS. These include Name, Type, Width, Decimals, Label,
Values, Missing, Columns, Align and Measure.
See the sample data file in Variable View.
Part II Data analysis
I. Descriptive Statistics
1. Measurement Data
SPSS commands and options:
Analyze→Descriptive Statistics→Descriptives…
Options…:
Mean (arithmetic mean)
Std. Deviation (standard deviation)
S.E. mean (standard error of the mean)
Range
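As a rough cross-check of what these four options report, the same quantities can be computed by hand. The sketch below uses Python's standard library and the ten Height values from the sample data table in this deck; it is only an illustration, not SPSS itself.

```python
import math
import statistics

# Heights (cm) of the ten cases in the sample data table of this deck.
heights = [175, 168, 162, 182, 173, 176, 165, 191, 174, 167]

n = len(heights)
mean = statistics.mean(heights)            # arithmetic mean
sd = statistics.stdev(heights)             # sample standard deviation (n - 1 denominator)
se_mean = sd / math.sqrt(n)                # standard error of the mean
value_range = max(heights) - min(heights)  # range = maximum - minimum

print(f"N={n} mean={mean:.2f} sd={sd:.2f} se={se_mean:.2f} range={value_range}")
```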
2. Measurement & Enumeration Data
SPSS commands and options:
Analyze→Descriptive Statistics→Frequencies…
Display frequency tables
Statistics…:
Median
Percentile
Quartiles
You can use “Quartiles” to get the interquartile range.
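The quantities on this slide can be sketched in Python as follows. The data are the age and Blood Type columns of the sample data table in this deck. Quartiles depend on the interpolation rule, so the values printed here may differ slightly from those of another program.

```python
import statistics
from collections import Counter

# Age and blood type of the ten cases in the sample data table of this deck.
ages = [23, 18, 25, 31, 15, 28, 67, 21, 12, 75]
blood_types = ["A", "AB", "B", "B", "O", "A", "B", "AB", "O", "B"]

median = statistics.median(ages)
# statistics.quantiles with n=4 returns the three quartile cut points
# (default method "exclusive"; other methods give slightly different values).
q1, q2, q3 = statistics.quantiles(ages, n=4)
iqr = q3 - q1  # interquartile range

print(f"median={median} Q1={q1} Q3={q3} IQR={iqr}")

# Frequency table for enumeration (categorical) data.
counts = Counter(blood_types)
for value, freq in counts.most_common():
    print(f"{value}: {freq} ({100 * freq / len(blood_types):.1f}%)")
```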
Charts…:
Histograms
II. Inferential Statistics
1. Normality Test
Analyze→Nonparametric Tests→1-Sample K-S…
Test Distribution: Normal
For output:
Asymp. Sig. (2-tailed) = P-value
P > 0.05 means normality is not rejected, so the data can be treated
as approximately normally distributed.
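To show what lies behind the K-S output, here is a minimal sketch of a one-sample Kolmogorov-Smirnov test against a normal distribution. It assumes the mean and standard deviation are estimated from the sample and uses Kolmogorov's asymptotic series for the P-value. The function name ks_normal is made up for this example, and the result is not guaranteed to match SPSS digit for digit.

```python
import math
import statistics

# Heights (cm) of the ten cases in the sample data table of this deck.
heights = [175, 168, 162, 182, 173, 176, 165, 191, 174, 167]

def ks_normal(sample):
    """Return (D, asymptotic two-sided P) for a K-S test against a normal
    distribution with mean and SD estimated from the sample."""
    data = sorted(sample)
    n = len(data)
    dist = statistics.NormalDist(statistics.mean(data), statistics.stdev(data))
    d = 0.0
    for i, x in enumerate(data):
        cdf = dist.cdf(x)
        # Largest gap between the normal CDF and the empirical CDF,
        # checked just before and just after each observation.
        d = max(d, cdf - i / n, (i + 1) / n - cdf)
    lam = math.sqrt(n) * d
    # Kolmogorov's asymptotic series, truncated after 100 terms.
    p = 2 * sum((-1) ** (k - 1) * math.exp(-2 * k * k * lam * lam)
                for k in range(1, 101))
    return d, min(max(p, 0.0), 1.0)

d_stat, p_value = ks_normal(heights)
print(f"D = {d_stat:.3f}, asymptotic P = {p_value:.3f}")
```

With only ten cases the asymptotic P-value is a rough guide at best.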
2. One-Sample T Test:
Analyze→Compare Means→One-Sample T Test…
Test Variable: the one-sample measurement data
Test Value: the known parameter (μ0)
For output:
t = t-value, Sig. (2-tailed) = P-value
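The t-value in this output is the distance between the sample mean and μ0, measured in standard errors. A small sketch, using the Height column of the sample data table in this deck and a hypothetical μ0 of 170 cm chosen only for illustration:

```python
import math
import statistics

# Heights (cm) of the ten cases in the sample data table of this deck.
heights = [175, 168, 162, 182, 173, 176, 165, 191, 174, 167]
mu0 = 170.0  # hypothetical Test Value, not taken from the slides

n = len(heights)
se_mean = statistics.stdev(heights) / math.sqrt(n)
t = (statistics.mean(heights) - mu0) / se_mean
df = n - 1  # degrees of freedom

print(f"t = {t:.3f}, df = {df}")
```

The P-value comes from the t distribution with df degrees of freedom, which the Python standard library does not provide, so it is left out here.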
3. Paired-Samples T Test:
Analyze→Compare Means→Paired-Samples T Test…
Paired Variables: the paired data
Variable1: before-treatment data
Variable2: after-treatment data
For output:
t = t-value, Sig. (2-tailed) = P-value
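A paired t test is a one-sample t test on the differences between the two variables. The sketch below uses made-up before/after values, since the slides give no paired data.

```python
import math
import statistics

# Hypothetical before/after measurements for six subjects (illustration only).
before = [120, 132, 128, 141, 135, 126]
after = [115, 130, 121, 138, 129, 127]

diffs = [b - a for b, a in zip(before, after)]
n = len(diffs)
se_diff = statistics.stdev(diffs) / math.sqrt(n)
t = statistics.mean(diffs) / se_diff  # tests H0: mean difference = 0
df = n - 1

print(f"mean difference = {statistics.mean(diffs):.3f}, t = {t:.3f}, df = {df}")
```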
4. Two-Samples T Test:
Analyze→Compare Means→Independent-Samples T Test…
Test Variable: the measurement data
Grouping Variable: marks the data into two groups
For output:
t = t-value, Sig. (2-tailed) = P-value
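As an illustration of the computation, here is a sketch of the pooled-variance (equal variances assumed) two-sample t statistic. It uses the Height values of the sample data table in this deck, grouped by sex. The unequal-variance version uses a different standard error and is not shown.

```python
import math
import statistics

# Heights (cm) from the sample data table of this deck, grouped by sex.
male = [175, 182, 176, 165, 191, 174]
female = [168, 162, 173, 167]

n1, n2 = len(male), len(female)
df = n1 + n2 - 2
# Pooled variance: weighted average of the two sample variances.
pooled_var = ((n1 - 1) * statistics.variance(male)
              + (n2 - 1) * statistics.variance(female)) / df
se_diff = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
t = (statistics.mean(male) - statistics.mean(female)) / se_diff

print(f"t = {t:.3f}, df = {df}")
```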
5. Chi-square Test:
Analyze→Descriptive Statistics→Crosstabs…
Row(s): the grouping data
Column(s): the categorical test data
Statistics…:
Chi-square (chi-square test, χ²)
For output:
Value = χ² value, Asymp. Sig. (2-tailed) = P-value
P < 0.05 means reject H0: π1 = π2 (see fig. 2).
6. Linear Regression:
Analyze→Regression→Linear…
Dependent: (y)
Independent(s): (x)
7. Linear Correlation:
Analyze→Correlate→Bivariate…
Variables: paired variables
Correlation Coefficients: Pearson
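Both procedures rest on the same sums of squares and cross-products. The sketch below computes the Pearson r and the least-squares line by hand for Height (x) and Weight (y) of the ten cases in the sample data table of this deck. Treating Weight as the dependent variable is an assumption made for this example.

```python
import math

# Height (cm) and weight (kg) of the ten cases in the sample data table.
heights = [175, 168, 162, 182, 173, 176, 165, 191, 174, 167]
weights = [71.5, 58.3, 61.4, 74.5, 56.7, 66.9, 75.6, 68.2, 73.9, 56.1]

n = len(heights)
mean_x = sum(heights) / n
mean_y = sum(weights) / n
sxx = sum((x - mean_x) ** 2 for x in heights)
syy = sum((y - mean_y) ** 2 for y in weights)
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(heights, weights))

r = sxy / math.sqrt(sxx * syy)       # Pearson correlation coefficient
slope = sxy / sxx                    # regression coefficient b
intercept = mean_y - slope * mean_x  # constant a

print(f"r = {r:.3f}, weight = {intercept:.2f} + {slope:.3f} * height")
```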
• The structure of the database (each column is a variable, each row is a case)
no  name   sex     age  Height (cm)  Weight (kg)  Blood Type
 1  Tom    male     23  175          71.5         A
 2  Jerry  female   18  168          58.3         AB
 3  Jean   female   25  162          61.4         B
 4  John   male     31  182          74.5         B
 5  Wendy  female   15  173          56.7         O
 6  Mark   male     28  176          66.9         A
 7  Bob    male     67  165          75.6         B
 8  Bush   male     21  191          68.2         AB
 9  Ben    male     12  174          73.9         O
10  Lily   female   75  167          56.1         B
Input code numbers:
sex: 1-male, 2-female
Blood Type: 1-A, 2-B, 3-AB, 4-O
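The coding scheme amounts to a lookup table from category labels to code numbers. A minimal sketch, with the codes taken from the slides and the field and function names invented for this example:

```python
# Code numbers as given on the slides.
SEX_CODES = {"male": 1, "female": 2}
BLOOD_TYPE_CODES = {"A": 1, "B": 2, "AB": 3, "O": 4}

def encode_case(case):
    """Return a copy of one case with sex and blood type replaced by codes."""
    coded = dict(case)
    coded["sex"] = SEX_CODES[case["sex"]]
    coded["blood_type"] = BLOOD_TYPE_CODES[case["blood_type"]]
    return coded

# Case no. 1 of the sample data table (field names are hypothetical).
tom = {"no": 1, "name": "Tom", "sex": "male", "age": 23,
       "height_cm": 175, "weight_kg": 71.5, "blood_type": "A"}

coded_tom = encode_case(tom)
print(coded_tom)
```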
How to eat carrot for best nutrition
                                  Frequency  Percent  Valid Percent  Cumulative Percent
Valid    I don't know                    24      2.6            2.6                 2.6
         eat directly                   477     51.2           51.5                54.1
         eat after water-boiled         185     19.8           20.0                74.1
         eat after oil-fried            240     25.8           25.9               100.0
         Total                          926     99.4          100.0
Missing  9                                6       .6
Total                                   932    100.0

Table 3-1. Frequency table for the question “How to eat carrot for best nutrition”
Item                      Frequency  Percent  Valid Percent  Cumulative Percent
eat directly                    477     51.2           51.5                51.5
eat after water-boiled          185     19.8           20.0                71.5
eat after oil-fried             240     25.8           25.9                97.4
I don't know                     24      2.6            2.6               100.0
Missing                           6      0.6              —                   —
Total                           932    100.0          100.0                   —
sex * status of study Crosstabulation
Count
                  status of study
                  bad    good   Total
sex     male      192    299    491
        female    137    280    417
Total             329    579    908
Chi-Square Tests
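For reference, the Pearson chi-square statistic for the counts in this crosstabulation can be computed by hand. The sketch below applies no continuity correction and relies on the fact that, for a 2 × 2 table (df = 1), the upper-tail probability equals erfc(sqrt(χ²/2)).

```python
import math

# Observed counts from the crosstabulation: rows = sex, columns = status of study.
observed = [[192, 299],   # male:   bad, good
            [137, 280]]   # female: bad, good

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

chi2 = 0.0
for i, row in enumerate(observed):
    for j, obs in enumerate(row):
        expected = row_totals[i] * col_totals[j] / n  # expected count under H0
        chi2 += (obs - expected) ** 2 / expected

# With 1 degree of freedom, P(chi-square > x) = erfc(sqrt(x / 2)).
p_value = math.erfc(math.sqrt(chi2 / 2))

print(f"chi-square = {chi2:.3f}, df = 1, P = {p_value:.3f}")
```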
Coefficients(a)
                 Unstandardized Coefficients   Standardized Coefficients
Model            B          Std. Error         Beta                        t        Sig.
1  (Constant)    142.386    5.252                                          27.109   .000
   age           1.243      .258               .157                        4.822    .000
a. Dependent Variable: Height
For test:
t = t-value, Sig. = P-value
P < 0.05 means reject H0: β = 0.
For the regression equation:
Ŷ = 142.386 + 1.243 × age
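The Coefficients table can be read as a small prediction function. The sketch below plugs in the B values from the table and recomputes the t-value for age as B divided by its standard error. The function name and the example age are chosen only for illustration.

```python
# Values copied from the Coefficients table.
INTERCEPT = 142.386   # B for (Constant)
SLOPE_AGE = 1.243     # B for age
SE_AGE = 0.258        # Std. Error for age

def predicted_height(age):
    """Predicted height from the fitted regression equation."""
    return INTERCEPT + SLOPE_AGE * age

# t = B / Std. Error; differs slightly from the table because of rounding.
t_age = SLOPE_AGE / SE_AGE

print(f"predicted height at age 20: {predicted_height(20):.3f}")
print(f"t for age: {t_age:.3f}")
```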
Correlations
                                Height    Weight
Height  Pearson Correlation     1         .721**
        Sig. (2-tailed)                   .000
        N                       930       930
Weight  Pearson Correlation     .721**    1
        Sig. (2-tailed)         .000
        N                       930       931
**. Correlation is significant at the 0.01 level (2-tailed).
For test:
Sig. (2-tailed) = P-value
P < 0.05 means reject H0: ρ = 0.
For the Pearson correlation coefficient:
r = 0.721
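The significance test for a correlation coefficient is commonly based on the statistic t = r·sqrt((n − 2)/(1 − r²)) with n − 2 degrees of freedom. As a sketch, using r and N from the table above:

```python
import math

r = 0.721   # Pearson correlation from the table
n = 930     # number of cases with both Height and Weight

df = n - 2
t = r * math.sqrt(df / (1 - r ** 2))  # test statistic for H0: rho = 0

print(f"t = {t:.2f}, df = {df}")
```

A t-value this large with 928 degrees of freedom is consistent with the reported Sig. of .000.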
Thank you!