Lec 7 8
C- management of outliers
D- Assignments
2-Boxplot
3-Test of normality
2-Analyze Method:
1-Select “Descriptive Statistics”, then select “Frequencies”.
2-Move the measured variable to the box named “Variable(s)”.
3-Select mean, standard deviation, skewness, kurtosis, standard error of skewness, and standard error of kurtosis, then click Continue.
4-Select Charts, then Histograms, click “Show normal curve on histogram”, click Continue, then OK.
Example 2: Using SPSS, find the frequency distribution curve for the Maths
exam marks of 20 students, then assess the normality of the data, noting that
the total score in the Maths exam is 100:
2,2,2,55,55,59,60,61,61,66,71,72,72,74,93,93,97,99,100,100
Class Frequency
0 -20 3
20 -40 0
40 -60 3
60 -80 8
80 -100 4
100 -120 2
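The class counts above can be reproduced outside SPSS. The following is a minimal Python sketch, not part of the SPSS procedure. The function name `frequency_table` is hypothetical, and it assumes each class includes its lower limit and excludes its upper limit, which is what the counts in the table imply (for example, 60 falls in 60-80 and 100 falls in 100-120).

```python
# Marks of the 20 students from the example above (total score 100).
marks = [2, 2, 2, 55, 55, 59, 60, 61, 61, 66,
         71, 72, 72, 74, 93, 93, 97, 99, 100, 100]

CLASS_WIDTH = 20  # class width used in the table above


def frequency_table(values, width, upper):
    """Count values per class [lower, lower + width).

    Assumption: a class includes its lower limit and excludes its
    upper limit, as implied by the counts in the lecture table.
    """
    counts = {}
    for lower in range(0, upper, width):
        counts[(lower, lower + width)] = 0
    for v in values:
        lower = (v // width) * width  # lower limit of the class holding v
        counts[(lower, lower + width)] += 1
    return counts


table = frequency_table(marks, CLASS_WIDTH, upper=120)
for (low, high), freq in table.items():
    print(f"{low:>3} - {high:<3} {freq}")
```

The three marks of 2 form an isolated class far to the left of the rest, which is the long left tail behind the negative skewness discussed in the interpretation.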
Interpretation of results
-The frequency distribution curve shows that the left and right tails are not
symmetrical, denoting negative skewness.
-The curve shape and its statistics values show signs of a non-normal data
distribution.
Negative kurtosis:
-indicated by a flat distribution.
-data are less clustered about the center of the distribution, with lighter (thinner) tails
in the frequency distribution curve and short whiskers in the boxplot.
Example 2: Using the following data set, draw a boxplot representing its
distribution:
13,15,20,18,17,26,19,12,24,28,16,11,25,22,14
Data ranking:
11,12,13,14,15,16,17,18,19,20,22,24,25,26,28
Q1 = the value at rank (n+1)/4 = the 4th value = 14
Q2 = the value at rank (n+1)/2 = the 8th value = 18
Q3 = the value at rank 3(n+1)/4 = the 12th value = 24
Minimum=11
Maximum=28
Lower fence= Q1-1.5(Q3-Q1)=14-1.5(24-14)=-1
Upper fence= Q3+1.5(Q3-Q1)=24+1.5(24-14)=39
If the minimum and maximum values lie between the lower and upper fences,
there are no outliers.
In this example, the minimum and maximum are 11 and 28, which lie between
-1 and 39, so no outliers are detected. This is confirmed by the SPSS boxplot
(no outliers) and by non-significant normality tests.
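The quartile and fence arithmetic of this example can be checked with a short Python sketch. It is an illustration only: the helper names `value_at_position` and `boxplot_summary` are hypothetical, and the linear interpolation for fractional rank positions is an assumption that this example never exercises, since (n+1)/4, (n+1)/2 and 3(n+1)/4 are whole numbers for n = 15.

```python
def value_at_position(ranked, position):
    """Value at a 1-based rank position in an ascending list.

    Assumption: fractional positions are linearly interpolated between
    the two neighbouring ranked values.
    """
    lower_index = int(position)        # whole part of the rank position
    fraction = position - lower_index  # fractional part of the rank position
    lower_value = ranked[lower_index - 1]
    if fraction == 0:
        return lower_value
    upper_value = ranked[lower_index]
    return lower_value + fraction * (upper_value - lower_value)


def boxplot_summary(values):
    """Quartiles, 1.5 x IQR fences and outliers, as in the worked example."""
    ranked = sorted(values)
    n = len(ranked)
    q1 = value_at_position(ranked, (n + 1) / 4)
    q2 = value_at_position(ranked, (n + 1) / 2)
    q3 = value_at_position(ranked, 3 * (n + 1) / 4)
    iqr = q3 - q1
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr
    outliers = [v for v in ranked if v < lower_fence or v > upper_fence]
    return {"q1": q1, "q2": q2, "q3": q3,
            "minimum": ranked[0], "maximum": ranked[-1],
            "lower_fence": lower_fence, "upper_fence": upper_fence,
            "outliers": outliers}


data = [13, 15, 20, 18, 17, 26, 19, 12, 24, 28, 16, 11, 25, 22, 14]
summary = boxplot_summary(data)
for name, value in summary.items():
    print(f"{name}: {value}")
```

An empty `outliers` list corresponds to the conclusion above that the minimum and maximum both lie inside the fences.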
Steps of obtaining boxplot and test of normality in SPSS
1. Select Analyze --> Descriptive Statistics --> Explore.
2. Move all variables into the “Variable(s)” window.
3. Click “Statistics”, select “Outliers”, then click Continue.
4. Click “Plots”, select “Normality plots with tests”, then click Continue.
5. Click OK.
Example 3: Fifteen subjects suffering from knee osteoarthritis
volunteered to participate in this study. They were assessed for
isokinetic peak torque of the knee extensors before and 3 months
after an exercise program. Examine the normality of the following
collected data via SPSS:
Pre-training knee ext.    Post-training knee ext.
PT (N.m)                  PT (N.m)
37                        54
22                        43
45                        42
29                        35
28                        57
33                        31
35                        58
40                        49
39                        48
46                        56
43                        48
22                        57
20                        43
47                        56
42                        38
Output of SPSS
Statistics
                            pre training    post training
                            peak torque     peak torque
N     Valid                 15              15
      Missing               0               0
Mean                        35.2000         47.6667
Std. Deviation              9.15891         8.73962
Skewness                    -.422           -.470
Std. Error of Skewness      .580            .580
Kurtosis                    -1.160          -.917
Std. Error of Kurtosis      1.121           1.121
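The values in this Statistics output can be cross-checked by hand. The Python sketch below assumes the usual bias-adjusted sample formulas for skewness and kurtosis and their standard errors. The notes do not state which formulas SPSS applies, so that is an assumption, although these formulas do reproduce the mean, standard deviation, skewness and standard errors in the table. The function name `describe` is hypothetical.

```python
import math
import statistics

# Peak torque data (N.m) from Example 3.
pre = [37, 22, 45, 29, 28, 33, 35, 40, 39, 46, 43, 22, 20, 47, 42]
post = [54, 43, 42, 35, 57, 31, 58, 49, 48, 56, 48, 57, 43, 56, 38]


def describe(values):
    """Mean, SD, skewness, kurtosis and their standard errors.

    Assumption: bias-adjusted sample skewness and excess kurtosis.
    """
    n = len(values)
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # sample SD (n - 1 in the denominator)
    sum_cubed = sum((x - mean) ** 3 for x in values)
    sum_fourth = sum((x - mean) ** 4 for x in values)
    skewness = n * sum_cubed / ((n - 1) * (n - 2) * sd ** 3)
    kurtosis = (n * (n + 1) * sum_fourth
                / ((n - 1) * (n - 2) * (n - 3) * sd ** 4)
                - 3 * (n - 1) ** 2 / ((n - 2) * (n - 3)))
    # The standard errors depend only on the sample size n.
    se_skewness = math.sqrt(6 * n * (n - 1) / ((n - 2) * (n + 1) * (n + 3)))
    se_kurtosis = 2 * se_skewness * math.sqrt((n * n - 1) / ((n - 3) * (n + 5)))
    return {"mean": mean, "sd": sd,
            "skewness": skewness, "se_skewness": se_skewness,
            "kurtosis": kurtosis, "se_kurtosis": se_kurtosis}


pre_stats = describe(pre)
post_stats = describe(post)
for label, stats in (("pre", pre_stats), ("post", post_stats)):
    print(label, {name: round(value, 3) for name, value in stats.items()})
```

Because both variables have n = 15, the standard errors of skewness and kurtosis are identical for the pre-training and post-training columns, as in the SPSS table.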
Frequency distribution curve
Tests of Normality
Kolmogorov-Smirnov            Shapiro-Wilk
Statistic  df  Sig.           Statistic  df  Sig.
2-Winsorising method: replacing the outlier with:
A-the nearest highest or lowest value in the data, according to the outlier
type (uppermost or lowermost),
or
B-the mean value of the variable data.
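Both replacement options can be sketched in Python as follows. The data set is hypothetical: it is the boxplot example's data with one invented extreme value (60) appended purely for illustration. Outliers are identified with the 1.5 x IQR fences, fractional rank positions are linearly interpolated, and option B takes the mean of the non-outlying values. The notes do not say whether the outlier itself enters that mean, so these are assumptions, as are all the function names.

```python
import statistics


def fences(values):
    """Lower and upper 1.5 x IQR fences, quartiles at ranks (n+1)/4 and 3(n+1)/4."""
    ranked = sorted(values)
    n = len(ranked)

    def at(position):
        # 1-based rank position; linear interpolation is an assumption.
        index = int(position)
        fraction = position - index
        if fraction == 0:
            return ranked[index - 1]
        return ranked[index - 1] + fraction * (ranked[index] - ranked[index - 1])

    q1 = at((n + 1) / 4)
    q3 = at(3 * (n + 1) / 4)
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr


def winsorise_nearest(values):
    """Option A: replace each outlier with the nearest non-outlying value."""
    low, high = fences(values)
    inside = [v for v in values if low <= v <= high]
    nearest_low, nearest_high = min(inside), max(inside)
    return [nearest_low if v < low else nearest_high if v > high else v
            for v in values]


def replace_with_mean(values):
    """Option B: replace each outlier with the mean of the non-outlying values."""
    low, high = fences(values)
    inside = [v for v in values if low <= v <= high]
    mean_inside = statistics.mean(inside)
    return [v if low <= v <= high else mean_inside for v in values]


# Hypothetical data: the boxplot example values plus an invented outlier (60).
scores = [13, 15, 20, 18, 17, 26, 19, 12, 24, 28, 16, 11, 25, 22, 14, 60]
option_a = winsorise_nearest(scores)
option_b = replace_with_mean(scores)
print("fences:", fences(scores))
print("option A:", option_a)
print("option B:", option_b)
```

With these numbers the upper fence is 40.5, so only the invented value 60 is replaced: by 28 under option A and by the mean of the remaining fifteen values under option B.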
Apply the statistical parametric test to the data with and without the outlier
(trimming method) and compare the results:
A-If the results are not the same, the outliers should be managed before the
statistical analysis test.
B-If the results are the same, no correction of the data is done (use the original data).
Contradicting results of test of normality and boxplot
The test of normality is significant
but
no outliers are detected in the boxplot.
Solution
-In this case, the most extreme highest or lowest data value must be managed so
that the test of normality becomes non-significant.
Converting an SPSS file into a Word or PDF file
-Click File
-Select Export
-Select Word or PDF document
-Select Browse
-Select a location for saving
-Click Save, then OK
Assignment 1
Exercise 1: Given the following output of data exploration using SPSS,
discuss these results and make a conclusion.
Tests of Normality
Kolmogorov-Smirnov            Shapiro-Wilk
Statistic  df  Sig.           Statistic  df  Sig.
.236       10  .120           .871       10
Statistics
VAR00001
N     Valid                 10
      Missing               0
Skewness                    .494
Std. Error of Skewness      .687
Kurtosis                    -.255
Std. Error of Kurtosis      1.334
Percentiles   25            15.2500
              50            17.0000
              75            23.7500
Exercise 2: The following is a data set of the test marks of
fifteen students:
(19,22,30,43,29,18,11,17,24,16,100,45,57,44,38)
1-Explore these data using all available tests of SPSS.
2-Write a report about the results.