Spss Problem Solve

This document provides statistical analysis of serum cholesterol levels and diastolic blood pressure from a sample of 240 individuals. It reports the mean, median, mode and standard deviation for both raw variables. It then calculates standardized z-scores and provides summary statistics for the standardized scores, including the minimum and maximum values, means, standard deviations, and quartiles. The analysis allows comparison of central tendencies and variability for both the raw data and standardized scores.

Uploaded by

Palash

Answer to the Question No. 1
Section (i)

(a)

Answer: Mean, median and mode of the variables serum cholesterol 58 (chol58) and
diastolic blood pressure 58 (dbp58)

Statistics
                    Serum Cholesterol 58     Average Diast Blood
                    -- Mg per DL             Pressure 58
N Valid             240                      239
N Missing           0                        1
Mean                264.09                   88.79
Median              261.00                   87.00
Mode                235(a)                   80
Std. Deviation      52.594                   13.050
a. Multiple modes exist. The smallest value is shown.

Interpretation: For both variables, Serum Cholesterol 58 (chol58) and Average
Diastolic Blood Pressure 58 (dbp58), the mean is greater than the median and the median is
greater than the mode, which points to a mild positive skew. The mean is taken here as the
measure of central tendency because it uses every value in the data set.

(b)

Answer: Percentage of non-smokers, and mean and median number of cigarettes smoked by smokers


Frequency distribution for the percentage of non-smokers (cgt58)

No of Cigarettes per Day in 1958


Frequency Percent Valid Percent Cumulative Percent
Valid 0 102 42.5 42.7 42.7
1 1 .4 .4 43.1
2 1 .4 .4 43.5
3 1 .4 .4 43.9
4 2 .8 .8 44.8
5 1 .4 .4 45.2
6 2 .8 .8 46.0
9 1 .4 .4 46.4
10 14 5.8 5.9 52.3
11 1 .4 .4 52.7
12 2 .8 .8 53.6
15 13 5.4 5.4 59.0
16 1 .4 .4 59.4
18 2 .8 .8 60.3
20 64 26.7 26.8 87.0
22 1 .4 .4 87.4
24 1 .4 .4 87.9
25 4 1.7 1.7 89.5
30 16 6.7 6.7 96.2
40 7 2.9 2.9 99.2
60 2 .8 .8 100.0
Total 239 99.6 100.0
Missing System 1 .4
Total 240 100.0

Interpretation: Here, 42.7 percent of the men are non-smokers (valid percent, because 1 value
is missing) and the remaining 100 - 42.7 = 57.3 percent are smokers.

Mean and median number of cigarettes smoked per day by smokers

Statistics
No of Cigarettes per Day in 1958
N Valid 137
Missing 0
Mean 20.20
Median 20.00
Std. Deviation 9.353

Interpretation: The mean and median number of cigarettes smoked per day by smokers are 20.20
and 20.00, respectively (non-smokers and the missing case were excluded through Select Cases).
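
The figures above can be cross-checked directly from the frequency table. The following is a minimal sketch, not the SPSS procedure itself: the dictionary is copied from the table, and the variable names are only illustrative.

```python
from statistics import mean, median

# Frequency table from the SPSS output: cigarettes per day -> number of men
freq = {0: 102, 1: 1, 2: 1, 3: 1, 4: 2, 5: 1, 6: 2, 9: 1, 10: 14, 11: 1,
        12: 2, 15: 13, 16: 1, 18: 2, 20: 64, 22: 1, 24: 1, 25: 4, 30: 16,
        40: 7, 60: 2}

# Valid percent uses only the non-missing cases as the denominator
n_valid = sum(freq.values())
valid_pct_nonsmokers = 100 * freq[0] / n_valid

# Equivalent of Select Cases: keep only men who smoke at least 1 cigarette
smokers = [value for value, count in freq.items() if value > 0
           for _ in range(count)]

print(n_valid, round(valid_pct_nonsmokers, 1))
print(len(smokers), round(mean(smokers), 2), median(smokers))
```

Expanding the table into one entry per smoker keeps the sketch short; for a much larger table a weighted mean would avoid building the list.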
(c)
Answer: Standardized scores for cholesterol values and diastolic blood pressure

Case Summaries(a)
Case   Zscore: Serum Cholesterol 58 -- Mg per DL   Zscore: Average Diast Blood Pressure 58
1 1.08211 -1.43992
2 -.34391 -.13723
3 -.03969 .01603
4 .20749 1.24209
5 .70184 1.62524
6 -.05870 -.06060
7 -.99037 -1.43992
8 2.05180 -.75026
9 -.91431 1.01221
10 -.24884 .78232
11 .01735 .32255
12 -2.03611 -1.21003
13 .09340 -.44374
14 1.61449 2.23827
15 -.68615 .24592
16 .66381 .24592
17 .53072 .16929
18 -.07772 -.06060
19 1.65251 1.39535
20 -.51503 1.24209
21 .53072 -.44374
22 -.15377 -.29048
23 .91099 -.67363
24 .51170 .55243
25 -.55306 .09266
26 .24551 2.46815
27 .16946 .78232
28 -.93333 .70569
29 -.55306 -.21386
30 .35959 -.13723
31 -1.08544 1.62524
32 1.19619 1.08883
33 .11242 -1.43992
34 .34058 -1.59317
35 .56874 1.08883
36 -1.44669 .09266
37 -.87629 .70569
38 .94901 -.82689
39 -1.33261 -1.28666
40 2.10884 -.67363
41 1.51942 -.13723
42 1.40534 2.77467
43 .58776 -.67363
44 -.11574 -.21386
45 -.34391 .62906
46 -.32489 -.67363
47 1.97574 2.00838
48 .39762 -.44374
49 1.61449 .24592
50 -.68615 .62906
51 -3.00580 -.36711
52 3.05952 -.44374
53 1.67153 -1.05677
54 .34058 1.24209
55 -.89530 -1.21003
56 -1.10445 3.38770
57 .54973 .01603
58 -1.67486 -.67363
59 -.41996 .09266
60 .26453 .70569
61 .20749 -.29048
62 .96803 .55243
63 -1.12346 -.13723
64 -.21081 -.59700
65 -.07772 -.82689
66 -1.10445 -.67363
67 4.77074 1.08883
68 -.41996 -.67363
69 .49269 -.67363
70 -.68615 .09266
71 .34058 .85895
72 -.62911 .01603
73 -.55306 -1.82306
74 -.55306 .62906
75 1.10112 .24592
76 -1.52275 -.90351
77 -.00166 -.90351
78 1.21520 -1.28666
79 .83493 -.67363
80 -.00166 -.21386
81 .13143 -.90351
82 -.41996 .01603
83 -.00166 .70569
84 .26453 -.52037
85 -.83826 -.75026
86 -.15377 -.44374
87 -1.67486 .01603
88 .53072 -1.05677
89 -1.12346 .16929
90 -.83826 .70569
91 .68282 -.67363
92 2.35602 -.13723
93 1.93772 -.98014
94 -2.09315 -.29048
95 -.28687 -.67363
96 1.10112 -.82689
97 -.55306 -.13723
98 1.25323 -.36711
99 .96803 1.85512
100 -.41996 -1.36329
Total N 100 100
a. Limited to first 100 cases.
(d)
Answer: The smallest and the largest standardized score for Diastolic Blood Pressure

Descriptive Statistics
                                  N     Minimum   Maximum   Mean     Std. Deviation
Average Diast Blood Pressure 58   239   65        160       88.79    13.050
Valid N (listwise)                239

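Interpretation: The table shows the raw minimum and maximum of Diastolic Blood Pressure, 65
and 160. Standardizing with z = (x - mean) / SD gives a smallest standardized score of
(65 - 88.79) / 13.050 ≈ -1.82 and a largest standardized score of (160 - 88.79) / 13.050 ≈ 5.46.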

(e)

Answer: The smallest and the largest standardized score for cholesterol

Descriptive Statistics
                                       N     Minimum   Maximum   Mean     Std. Deviation
Serum Cholesterol 58 -- Mg per DL      240   106       515       264.09   52.594
Valid N (listwise)                     240

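Interpretation: The table shows the raw minimum and maximum of cholesterol, 106 and 515.
Standardizing with z = (x - mean) / SD gives a smallest standardized score of
(106 - 264.09) / 52.594 ≈ -3.01 and a largest standardized score of (515 - 264.09) / 52.594 ≈ 4.77.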
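
The conversion from a raw minimum or maximum to a standardized score can be sketched as follows. This is an illustration using the means and standard deviations reported in the tables above; the function name `z_score` is an assumption, not something produced by SPSS.

```python
def z_score(x, mean, sd):
    """Standardize a raw value: how many SDs it lies from the mean."""
    return (x - mean) / sd

# Diastolic blood pressure: mean 88.79, SD 13.050, raw range 65 to 160
dbp_min_z = z_score(65, 88.79, 13.050)
dbp_max_z = z_score(160, 88.79, 13.050)

# Serum cholesterol: mean 264.09, SD 52.594, raw range 106 to 515
chol_min_z = z_score(106, 264.09, 52.594)
chol_max_z = z_score(515, 264.09, 52.594)

print(round(dbp_min_z, 2), round(dbp_max_z, 2))
print(round(chol_min_z, 2), round(chol_max_z, 2))
```

Small differences in the last decimals compared with the Case Summaries table are expected, because the means and standard deviations used here are already rounded.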
(f)

Answer: The means, standard deviations and quartiles of the standardized scores for cholesterol
values, diastolic blood pressure

Statistics
                    Zscore: Serum Cholesterol 58    Zscore: Average Diast Blood
                    -- Mg per DL                    Pressure 58
N Valid             240                             239
N Missing           0                               1
Mean                .0000000                        .0000000
Std. Deviation      1.00000000                      1.00000000
Percentile 25       -.6813978                       -.6736279
Percentile 50       -.0587043                       -.1372264
Percentile 75       .5639892                        .6290614

Interpretation: The table shows that the means and standard deviations of the standardized
scores for cholesterol and diastolic blood pressure are 0 and 1, respectively. For Serum
Cholesterol, Q1, Q2 and Q3 are -.6813978, -.0587043 and .5639892. For Average Diastolic Blood
Pressure, Q1, Q2 and Q3 are -.6736279, -.1372264 and .6290614.
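
That standardized scores always have mean 0 and standard deviation 1 can be checked on any data set. The sketch below uses six values taken from the salary table in Section (ii) purely as sample input; the function name `standardize` is illustrative.

```python
from statistics import mean, stdev

def standardize(values):
    """Return z-scores using the sample mean and sample SD (n - 1)."""
    m = mean(values)
    s = stdev(values)
    return [(v - m) / s for v in values]

# Sample input only: six starting salaries from Section (ii)
sample = [16.6, 22.0, 24.4, 28.9, 34.2, 37.3]
z = standardize(sample)

# Mean is 0 and SD is 1 up to floating-point error
print(round(mean(z), 10), round(stdev(z), 10))
```

The ordering of the cases is unchanged by standardization, so quartiles of the z-scores are simply the standardized quartiles of the raw data.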
Section (ii)

(a)

Answer: Descriptive statistics for 40 data of fictional salary

Statistics
Starting salary
N Valid 40
Missing 0
Mean 27.657
Std. Error of Mean .8236
Median 27.550
Mode 22.6
Std. Deviation 5.2088
Variance 27.132
Skewness -.069
Std. Error of Skewness .374
Kurtosis -.622
Std. Error of Kurtosis .733
Range 20.7
Minimum 16.6
Maximum 37.3
Sum 1106.3

(b)
Answer: Histogram of these data with a normal curve
Interpretation: The above histogram shows fictional starting salaries for 40 teachers. Salary values
are on the horizontal axis, and frequencies are on the vertical axis. Most of the salaries lie
around 25.0. From the histogram it is clear that the fictional starting salaries (in thousands of
dollars) are approximately normally distributed. There are no outliers in this distribution, which
is also supported by the following box plot.

Comment: There is no outlier in this distribution.


(c)

Answer: The mean and median starting salaries


Statistics
Starting salary
N Valid 40
Missing 0
Mean 27.657
Median 27.550
The mean of the starting salaries = 27.66

The median of the starting salaries = 27.55

(d)
Answer: Standard deviation for these fictional salaries

Statistics
Starting salary
N Valid 40
Missing 0
Std. Deviation 5.2088

Interpretation: The standard deviation is 5.2088. A small standard deviation means that the values
in a data set are, on average, close to the mean, and a large standard deviation means that they
are, on average, farther from the mean. Here the standard deviation (5.21) is small relative to the
mean (27.66), so the fictional starting salaries lie fairly close to the mean. This is consistent
with the histogram (approximately normal) and the box plot (no outliers).

(e)

Answer: Salary with increment

Case Summariesa
Starting salary N_salary final_sal
1 34.2 1.09 35.29
2 28.9 .92 29.82
3 37.3 1.19 38.49
4 16.6 .53 17.13
5 17.4 .56 17.96
6 31.0 .99 31.99
7 22.0 .70 22.70
8 24.4 .78 25.18
9 29.9 .96 30.86
10 25.8 .83 26.63
11 34.3 1.10 35.40
12 27.1 .87 27.97
13 25.5 .82 26.32
14 25.3 .81 26.11
15 22.6 .72 23.32
16 29.7 .95 30.65
17 26.3 .84 27.14
18 23.8 .76 24.56
19 35.2 1.13 36.33
20 28.4 .91 29.31
21 34.8 1.11 35.91
22 21.7 .69 22.39
23 26.5 .85 27.35
24 21.5 .69 22.19
25 26.1 .84 26.94
26 33.9 1.08 34.98
27 30.5 .98 31.48
28 29.0 .93 29.93
29 32.1 1.03 33.13
30 36.8 1.18 37.98
31 32.8 1.05 33.85
32 23.0 .74 23.74
33 19.9 .64 20.54
34 30.3 .97 31.27
35 24.7 .79 25.49
36 32.6 1.04 33.64
37 22.6 .72 23.32
38 28.0 .90 28.90
39 24.0 .77 24.77
40 29.8 .95 30.75
Total N 40 40 40
a. Limited to first 100 cases.
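
The increment column can be reproduced with a short calculation. The raise rate of 3.2% is an assumption inferred from the table (for example 34.2 × 0.032 ≈ 1.09); the answer sheet does not state it, and the names below are illustrative.

```python
from statistics import stdev

# Assumption: raise rate inferred from N_salary / Starting salary in the table
RAISE_RATE = 0.032

def apply_raise(salary, rate=RAISE_RATE):
    """Return the salary after a proportional increment."""
    return salary * (1 + rate)

# First five starting salaries from the Case Summaries table
start = [34.2, 28.9, 37.3, 16.6, 17.4]
final_exact = [apply_raise(s) for s in start]
final = [round(value, 2) for value in final_exact]

# A proportional raise rescales the spread by the same factor
sd_ratio = stdev(final_exact) / stdev(start)

print(final)
print(round(sd_ratio, 3))
```

Because every salary is multiplied by the same constant, the mean, median and standard deviation are all multiplied by 1.032, while skewness and kurtosis stay the same, which matches the statistics reported in (f).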
(f)
Answer: Steps (a) through (d) repeated for the new salaries

Statistics
final_sal
N Valid 40
Missing 0
Mean 28.5425
Std. Error of Mean .84994
Median 28.4316
Mode 23.32
Std. Deviation 5.37550
Variance 28.896
Skewness -.069
Std. Error of Skewness .374
Kurtosis -.622
Std. Error of Kurtosis .733
Range 21.36
Minimum 17.13
Maximum 38.49
Sum 1141.70

• Histogram of new salary with a normal curve


Interpretation: The above histogram shows the fictional salaries after the increment for the 40
teachers. Salary values are on the horizontal axis, and frequencies are on the vertical axis. Most
of the salaries lie around 27.0. From the histogram it is clear that the new salaries (in thousands
of dollars) are approximately normally distributed. There are no outliers in this distribution,
which is also supported by the following box plot.

Comment: There is no outlier in this distribution.


• The mean and median final salaries
Statistics
final_sal
N Valid 40
Missing 0
Mean 28.5425
Median 28.4316
The mean of the incremented salaries = 28.54

The median of the incremented salaries = 28.43

• Standard deviation for the new salaries

Statistics
final_sal
N Valid 40
Missing 0
Std. Deviation 5.37550

Interpretation: The standard deviation is 5.3755. A small standard deviation means that the values
in a data set are, on average, close to the mean, and a large standard deviation means that they
are, on average, farther from the mean. Here the standard deviation (5.38) is small relative to the
mean (28.54), so the new salaries lie fairly close to the mean. This is consistent with the
histogram (approximately normal) and the box plot (no outliers).
Question-2
Section-i
(a)

Final grade is not normally distributed.

The homework problems variable is normally distributed.


(b)

Descriptive Statistics
N Mean Std. Deviation
final_grade 16 75.12 16.577
hw_prob 16 79.00 19.187
Valid N (listwise) 16

The mean of final grade is 75.12 and the standard deviation is 16.577.


The mean of homework problems is 79.00 and the standard deviation is 19.187.

(c)
(d)

(e)

Correlations
                                   final_grade   hw_prob
final_grade  Pearson Correlation   1             .672**
             Sig. (2-tailed)                     .004
             N                     16            16
hw_prob      Pearson Correlation   .672**        1
             Sig. (2-tailed)       .004
             N                     16            16
**. Correlation is significant at the 0.01 level (2-tailed).

In the graph, we can see that there is a positive correlation, which means that the final course
grade tends to go up as the number of homework problems goes up. The points are moderately
scattered around the trend line: the coefficient of determination, or "goodness of fit," is
r² = 0.672² = 0.451584.
The correlation coefficient describes the strength of the relationship between the
variables. Here, r = 0.672, which implies a moderate positive correlation between
the variables.
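
The link between r and the coefficient of determination can be sketched in a few lines. The raw scores are not listed in this answer, so the `hw` and `grade` lists below are hypothetical illustration data; only the value r = 0.672 comes from the SPSS table.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation computed from deviations about the means."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data, for illustration only
hw = [10, 20, 30, 40, 50]
grade = [35, 50, 45, 70, 65]
r_example = pearson_r(hw, grade)

# Reported value from the Correlations table
r_reported = 0.672
r_squared_reported = r_reported ** 2

print(round(r_example, 3), round(r_example ** 2, 3))
print(round(r_squared_reported, 6))
```

Squaring r is valid as a measure of explained variance only for simple linear regression with one predictor, which is the case here.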
(f)

The plot of regression standardized residuals indicates that there is a positive relationship
between homework problem residuals and final course grade.

(g)
Final grade has no outliers. On the other hand, homework problems has outliers.

(h)

Correlations
final_grade hw_prob
final_grade Pearson Correlation 1 .672**
Sig. (2-tailed) .004
N 16 16
hw_prob Pearson Correlation .672** 1
Sig. (2-tailed) .004
N 16 16
**. Correlation is significant at the 0.01 level (2-tailed).

The correlation coefficient between final grade and homework problems is r = 0.672, which
indicates a moderate positive correlation between the variables.

(i)

Model Summary(b)
Model   R         R Square   Adjusted R Square   Std. Error of the Estimate
1       .672(a)   .451       .412                12.710
a. Predictors: (Constant), hw_prob
b. Dependent Variable: final_grade

The r² value of 0.451 indicates that homework problems explain approximately 45.1% of the
variance in the final course grade.

(j)
Final Course Grade, y = a + bx
= 29.273 + 0.580(0)
= 29.273
This means that the predicted final grade for a student who did not complete any homework
problems (x = 0) is 29.273.
(k)
Final Course Grade, y = a + bx
= 29.273 + 0.580(55)
= 61.173
This means that the predicted final grade for a student who completed 55 homework problems is
61.173.
The score has been estimated both by eye from the plot and using SPSS.
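
The two predictions in (j) and (k) use the same fitted line, which can be written as a small function. The intercept and slope are the values used in the calculations above; the function name is illustrative.

```python
# Coefficients taken from the calculations in (j) and (k)
INTERCEPT = 29.273  # a: predicted grade with 0 homework problems
SLOPE = 0.580       # b: predicted change in grade per homework problem

def predict_final_grade(hw_problems):
    """Predicted final course grade from y = a + b * x."""
    return INTERCEPT + SLOPE * hw_problems

print(round(predict_final_grade(0), 3))
print(round(predict_final_grade(55), 3))
```

Predictions for homework counts far outside the observed range, including x = 0, are extrapolations and should be read with care.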

(l)

The mean for the final grade and the mean for homework problems lie at point 5.
Section-ii
(a)

Descriptives
Statistic Std. Error
Expenditure Mean 3.100 .7316
95% Confidence Interval for Mean, Lower Bound 1.310
95% Confidence Interval for Mean, Upper Bound 4.890
5% Trimmed Mean 3.022
Median 2.700
Variance 3.747
Std. Deviation 1.9356
Minimum .8
Maximum 6.8
Range 6.0
Interquartile Range 2.3
Skewness 1.181 .794
Kurtosis 1.903 1.587
Sales Mean 20.100 1.5826
95% Confidence Interval for Mean, Lower Bound 16.227
95% Confidence Interval for Mean, Upper Bound 23.973
5% Trimmed Mean 19.933
Median 18.900
Variance 17.533
Std. Deviation 4.1873
Minimum 15.9
Maximum 27.3
Range 11.4
Interquartile Range 7.6
Skewness .934 .794
Kurtosis -.167 1.587

Descriptive statistics for both the variables.


Histogram for expenditure. Here the mean is 3.100 and the standard deviation is 1.936. Expenditure
is on the horizontal axis and frequency on the vertical axis. The majority of the expenditures
are in the range (2.0-3.0). The histogram has a longer tail on the right-hand side, resulting in a
positively skewed curve (skewness = 1.181). It is not normally distributed.

Histogram for sales. Here the mean is 20.1 and the standard deviation is 4.187. Sales is on the
horizontal axis and frequency on the vertical axis. The majority of the sales are in the range
(16-20). The histogram has a somewhat longer tail on the right-hand side (skewness = .934), but
the departure is mild and sales is treated as approximately normally distributed.
(b)

The upward trend line indicates that expenditure and sales have a strong, nearly perfect positive
relationship. The goodness of fit (r²) confirms its strength: r² = 0.931 means that 93.1% of the
variance in sales can be explained by expenditure.

(c)

Here the regression equation is y = a + bx = 13.63 + 2.09x


(d)

The upward trend of residuals indicates that there is a strong positive relationship between
them.
From the boxplot, we can see that expenditure has outliers. On the other hand, sales has
no outliers.

(e)

Coefficients(a)
Model             Unstandardized B   Std. Error   Standardized Beta   t        Sig.
1 (Constant)      13.630             .911                             14.968   .000
  Expenditure     2.087              .254         .965                8.207    .000
a. Dependent Variable: Sales

The estimated regression equation of Sales (y) on expenditure (x) is y = 13.630 + 2.087x


[y = a + bx]

When the expenditure is assumed to be 8 million dollars, the predicted sales will be

y = 13.630 + 2.087(8) = 30.326

(f)
When the expenditure is assumed to be 5 million dollars, then the predicted sales will be:

y = 13.630+2.087(5) = 24.065 [y=a+bx]


(g)
It is clear from the calculations in sections 2 (ii) (e) and (f) that higher expenditure is
associated with higher predicted sales. When the expenditure is increased from $5 million to
$8 million, the predicted sales increase from 24.065 to 30.326.
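
The size of that increase follows directly from the slope: each additional unit of expenditure adds 2.087 to predicted sales. A minimal sketch using the coefficients from the Coefficients table (the names are illustrative):

```python
# Coefficients from the SPSS Coefficients table (dependent variable: Sales)
INTERCEPT = 13.630
SLOPE = 2.087

def predict_sales(expenditure):
    """Predicted sales from y = a + b * x."""
    return INTERCEPT + SLOPE * expenditure

low = predict_sales(5)
high = predict_sales(8)

# The change depends only on the slope and the change in expenditure
change = high - low

print(round(low, 3), round(high, 3), round(change, 3))
```

The intercept cancels out of the difference, so the increase of 6.261 is simply 3 × 2.087.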

Section- iii (a)


Descriptive data analysis for each of the variables:

Descriptives
Statistic Std. Error
height Mean 69.13 .833
95% Confidence Interval for Mean, Lower Bound 67.15
95% Confidence Interval for Mean, Upper Bound 71.10
5% Trimmed Mean 69.14
Median 69.50
Variance 5.554
Std. Deviation 2.357
Minimum 66
Maximum 72
Range 6
Interquartile Range 5
Skewness -.198 .752
Kurtosis -1.230 1.481
weight Mean 164.38 7.346
95% Confidence Interval for Mean, Lower Bound 147.00
95% Confidence Interval for Mean, Upper Bound 181.75
5% Trimmed Mean 164.03
Median 162.50
Variance 431.696
Std. Deviation 20.777
Minimum 135
Maximum 200
Range 65
Interquartile Range 31
Skewness .484 .752
Kurtosis -.074 1.481
shoe_size Mean 9.625 .5728
95% Confidence Interval for Mean, Lower Bound 8.270
95% Confidence Interval for Mean, Upper Bound 10.980
5% Trimmed Mean 9.611
Median 9.500
Variance 2.625
Std. Deviation 1.6202
Minimum 7.5
Maximum 12.0
Range 4.5
Interquartile Range 2.9
Skewness .139 .752
Kurtosis -1.511 1.481
ring_size Mean 6.9063 .30596
95% Confidence Interval for Mean, Lower Bound 6.1828
95% Confidence Interval for Mean, Upper Bound 7.6297
5% Trimmed Mean 6.8958
Median 7.0000
Variance .749
Std. Deviation .86538
Minimum 6.00
Maximum 8.00
Range 2.00
Interquartile Range 1.69
Skewness .008 .752
Kurtosis -2.310 1.481
iq Mean 120.63 3.391
95% Confidence Interval for Mean, Lower Bound 112.61
95% Confidence Interval for Mean, Upper Bound 128.64
5% Trimmed Mean 119.97
Median 117.00
Variance 91.982
Std. Deviation 9.591
Minimum 113
Maximum 140
Range 27
Interquartile Range 14
Skewness 1.488 .752
Kurtosis 1.457 1.481
Figure: Histogram for height

Figure: Boxplot for height


Figure: Histogram for weight

Figure: Boxplot for weight


Figure: Histogram shoe size

Figure: Boxplot for shoe size


Figure: Histogram for ring size

Figure: Boxplot for ring size


Figure: Histogram for IQ

Figure: Boxplot for IQ


We can see from the boxplots that none of the variables has outliers.
iii (b)

Model Summary
Model   R         R Square   Adjusted R Square   Std. Error of the Estimate
1       .807(a)   .651       .185                8.658
a. Predictors: (Constant), ring_size, height, shoe_size, weight

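We can see from the table that the value of R-square is 0.651, i.e., 65.1% of the variance in iq
can be explained by the four predictors. The adjusted R-square is much lower (0.185), which
reflects the small sample (N = 8) relative to the number of predictors.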
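
The gap between R-square and adjusted R-square can be reproduced with the usual adjustment formula, 1 - (1 - R²)(n - 1)/(n - k - 1). The sketch below is an illustration using the reported R-square values, n from the correlation tables, and k as the number of predictors; the function name is an assumption.

```python
def adjusted_r_squared(r_squared, n, k):
    """Adjust R-square for sample size n and number of predictors k."""
    return 1 - (1 - r_squared) * (n - 1) / (n - k - 1)

# IQ model: R-square .651, 8 cases, 4 predictors
iq_model = adjusted_r_squared(0.651, n=8, k=4)

# Final-grade model from Question 2: R-square .451, 16 cases, 1 predictor
grade_model = adjusted_r_squared(0.451, n=16, k=1)

print(round(iq_model, 3), round(grade_model, 3))
```

With only 3 residual degrees of freedom in the IQ model, most of the apparent fit is absorbed by the adjustment, whereas the one-predictor grade model loses very little.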

iii (c)

Figure: Scatter plot for height and iq


Figure: Scatter plot for weight and iq

Figure: Scatter plot for shoe size and iq


Figure: Scatter plot for ring size and iq

All the scatterplots shown here indicate a positive relationship between IQ and each of height,
weight, shoe size and ring size.

iii (d)
iii (e)

Correlations
                                 height   weight   shoe_size   ring_size   iq
height     Pearson Correlation   1        .804*    .762*       .655        .527
           Sig. (2-tailed)                .016     .028        .078        .180
           N                     8        8        8           8           8
weight     Pearson Correlation   .804*    1        .936**      .870**      .576
           Sig. (2-tailed)       .016              .001        .005        .135
           N                     8        8        8           8           8
shoe_size  Pearson Correlation   .762*    .936**   1           .786*       .638
           Sig. (2-tailed)       .028     .001                 .021        .089
           N                     8        8        8           8           8
ring_size  Pearson Correlation   .655     .870**   .786*       1           .240
           Sig. (2-tailed)       .078     .005     .021                    .566
           N                     8        8        8           8           8
iq         Pearson Correlation   .527     .576     .638        .240        1
           Sig. (2-tailed)       .180     .135     .089        .566
           N                     8        8        8           8           8
*. Correlation is significant at the 0.05 level (2-tailed).
**. Correlation is significant at the 0.01 level (2-tailed).

The Pearson correlation shows that iq has no significant relationship with height, weight,
shoe size, ring size.

iii (f)

Coefficients(a)
Model          Unstandardized B   Std. Error   Standardized Beta   t        Sig.
1 (Constant)   87.141             132.824                          .656     .559
  height       .153               2.365        .038                .065     .952
  weight       .409               .622         .885                .657     .558
  shoe_size    3.360              5.830        .568                .576     .605
  ring_size    -11.090            7.870        -1.001              -1.409   .254
a. Dependent Variable: iq

The value of the intercept is 87.141 and the slopes for height, weight, shoe size and ring size
are 0.153, 0.409, 3.360 and -11.090, respectively.
iii (g)
The regression equation for the variables is as follows:
Y = a + b1*height + b2*weight + b3*shoe size + b4*ring size
= 87.141 + 0.153 * height + 0.409 * weight + 3.360 * shoe size - 11.090 * ring size

iii (h)

WILL BE SOLVED LATER

Question- 3
Section- i (a)

Ordinal number   Year of origin   Cylinder   Garage   Color   Price (unit)
1 1993 75 0 1 50000
2 1993 125 0 1 70000
3 1993 75 0 1 60000
4 1994 250 1 1 80000
5 1994 75 0 1 70000
6 1994 125 1 1 80000
7 1995 75 0 1 60000
8 1995 125 0 1 80000
9 1995 250 0 2 100000
10 1996 250 1 3 170000
11 1996 250 1 3 168000
12 1997 75 1 2 100000
13 1997 125 1 2 120000
14 1998 250 0 3 156000
15 2004 250 1 5 560000
16 1999 500 1 5 380000
17 2000 500 1 5 425000
18 2001 250 0 4 320000
19 2002 125 1 4 300000
20 2003 75 1 4 220000
Table: SPSS dataset from the given data

Question- 3 Section- i (b)

Figure: Bar chart for car colors

Question- 3 Section- i (c)

Figure: 3D pie chart for car colors


Question- 3 Section- i (d)

Statistics
price_unit
N Valid 3
Missing 0
Mean 164666.67

From the table, we can see that average price for yellow cars is 164666.67

Question- 3 Section- i (e)

Statistics
price_unit
N Valid 20
Missing 0
Mean 178450.00
Median 110000.00
Mode 80000
Skewness 1.382
Std. Error of Skewness .512
Kurtosis 1.170
Std. Error of Kurtosis .992
Percentiles 25 72500.00
50 110000.00
75 280000.00

Because the mean, median and modal prices are 178450, 110000 and 80000, respectively
(mean > median > mode), the prices are not normally distributed and the distribution is
asymmetric. Skewness = 1.382 indicates that the distribution is positively skewed. Kurtosis =
1.170 is positive, so the distribution is leptokurtic. The first quartile (Q1) is P25 = 72500,
the second quartile (Q2) is P50 = 110000, and the third quartile (Q3) is P75 = 280000.
Question- 3 Section- i (f)

Figure: Boxplot for the car prices

From the boxplot, we can see that there is an outlier.

Question- 3 Section- i (g)

Coefficients(a)
Model              Unstandardized B   Std. Error     Standardized Beta   t        Sig.
1 (Constant)       -71007431.535      10135574.748                       -7.006   .000
  year_of_origin   35647.303          5075.520       .856                7.023    .000
a. Dependent Variable: price_unit

From the table we can see that price depends on model year (year of origin): the slope is
positive (B = 35647.303) and significant (Sig. = .000).
Question- 3 Section- i (h)

Coefficients(a)
Model          Unstandardized B   Std. Error   Standardized Beta   t        Sig.
1 (Constant)   -40133.333         23603.443                        -1.700   .106
  Colour       87433.333          8095.914     .931                10.800   .000
a. Dependent Variable: price_unit

From the table we can see that price depends on color: the slope is positive (B = 87433.333) and significant (Sig. = .000).

Question- 3 Section- i (i)

Colour * Garage Crosstabulation


Garage
The car was not kept in a garage   The car was kept in a garage   Total
Colour red Count 6 2 8
% within Colour 75.0% 25.0% 100.0%
green Count 1 2 3
% within Colour 33.3% 66.7% 100.0%
yellow Count 1 2 3
% within Colour 33.3% 66.7% 100.0%
blue Count 1 2 3
% within Colour 33.3% 66.7% 100.0%
black Count 0 3 3
% within Colour 0.0% 100.0% 100.0%
Total Count 9 11 20
% within Colour 45.0% 55.0% 100.0%

Table: Crosstabs about the color and the storage (garage) of cars.

Question- 3 Section- i (j)

Tests of Normality
             Kolmogorov-Smirnov(a)          Shapiro-Wilk
             Statistic   df   Sig.          Statistic   df   Sig.
price_unit   .223        20   .010          .813        20   .001
a. Lilliefors Significance Correction

From the table we can see that both tests are significant (Sig. = .010 and .001, both below 0.05), so the price does not follow a normal distribution.

Question- 3 Section- i (k)

Statistics
price_unit
N Valid 20
Missing 0
Mean 178450.00
Std. Error of Mean 32326.336
Std. Deviation 144567.768

Table: Average price of cars

One-Sample Test (Test Value = 0)
             t       df   Sig. (2-tailed)   Mean Difference   90% CI of the Difference: Lower   Upper
price_unit   5.520   19   .000              178450.000        122553.47                         234346.53

Here, the 90% confidence interval for the mean price has lower bound 122553.47 and upper bound 234346.53.
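
The interval can be reproduced from the summary statistics as mean ± t × standard error. In the sketch below the mean, standard deviation and N come from the tables above; the critical value 1.729133 (t with 19 degrees of freedom, two-sided 90%) is taken from a standard t table and is hard-coded as an assumption rather than computed.

```python
import math

# Summary statistics from the SPSS output
mean_price = 178450.00
std_dev = 144567.768
n = 20

# Assumption: two-sided 90% critical value of t with n - 1 = 19 df
T_CRIT_90_DF19 = 1.729133

std_error = std_dev / math.sqrt(n)
margin = T_CRIT_90_DF19 * std_error
lower = mean_price - margin
upper = mean_price + margin

print(round(std_error, 3))
print(round(lower, 2), round(upper, 2))
```

A different confidence level only changes the critical value; the standard error stays the same.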


Question No. 4
Section (i) – a

Correlations
                                                Internet usage category   Respondent's sex
Internet usage category   Pearson Correlation   1                         -.067
                          Sig. (2-tailed)                                 .090
                          N                     649                       649
Respondent's sex          Pearson Correlation   -.067                     1
                          Sig. (2-tailed)       .090
                          N                     649                       1419
Table 1: Relationship between time on the Internet (netcat) and gender (sex)
Interpretation: We can see from the table that there is a weak negative relationship between the
variables netcat and sex (r = -.067), which is not statistically significant at the 0.05 level
(Sig. = .090).

Correlations (Control Variable: Respondent's highest degree)
                                                      Internet usage category   Respondent's sex
Internet usage category   Correlation                 1.000                     -.060
                          Significance (2-tailed)     .                         .125
                          df                          0                         642
Respondent's sex          Correlation                 -.060                     1.000
                          Significance (2-tailed)     .125                      .
                          df                          642                       0
Table 2: Relationship between time on the Internet (netcat) and gender (sex) when control for
education
Interpretation: When we control for education, the magnitude of the relationship between
netcat and sex changes slightly, but its direction does not. Without controlling for education,
r = -0.067, a weak negative correlation between Internet use (variable netcat) and gender
(variable sex). After controlling for education, r = -0.060, which is still a weak negative
correlation between gender and Internet use.
Section (i)– b

Age category * Respondent's sex * Use Internet? Crosstabulation


Respondent's sex
Use Internet? Male Female Total
No Age category 18-29 Count 42 58 100
% within Age category 42.0% 58.0% 100.0%
30-39 Count 60 61 121
% within Age category 49.6% 50.4% 100.0%
40-49 Count 65 68 133
% within Age category 48.9% 51.1% 100.0%
50-59 Count 43 62 105
% within Age category 41.0% 59.0% 100.0%
60-89 Count 93 182 275
% within Age category 33.8% 66.2% 100.0%
Total Count 303 431 734
% within Age category 41.3% 58.7% 100.0%
Yes Age category 18-29 Count 79 65 144
% within Age category 54.9% 45.1% 100.0%
30-39 Count 88 98 186
% within Age category 47.3% 52.7% 100.0%
40-49 Count 66 96 162
% within Age category 40.7% 59.3% 100.0%
50-59 Count 47 57 104
% within Age category 45.2% 54.8% 100.0%
60-89 Count 26 32 58
% within Age category 44.8% 55.2% 100.0%
Total Count 306 348 654
% within Age category 46.8% 53.2% 100.0%
Total Age category 18-29 Count 121 123 244
% within Age category 49.6% 50.4% 100.0%
30-39 Count 148 159 307
% within Age category 48.2% 51.8% 100.0%
40-49 Count 131 164 295
% within Age category 44.4% 55.6% 100.0%
50-59 Count 90 119 209
% within Age category 43.1% 56.9% 100.0%
60-89 Count 119 214 333
% within Age category 35.7% 64.3% 100.0%
Total Count 609 779 1388
% within Age category 43.9% 56.1% 100.0%

Interpretation:
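(i) Among respondents under 30 years of age who use the Internet, 54.9 percent are male and
45.1 percent are female. Looking at each sex separately, 79 of the 121 males under 30
(65.3 percent) and 65 of the 123 females under 30 (52.8 percent) use the Internet.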
(ii)

Figure-1: Stacked bar chart showing the distribution of time on the Internet (variable
netcat) for men and women
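
The percentages in (i) depend on which total is used as the denominator. The sketch below recomputes both readings from the 18-29 counts in the crosstabulation; the function names are illustrative.

```python
# Counts for the 18-29 age category, from the crosstabulation
counts = {
    ("Male", "No"): 42, ("Male", "Yes"): 79,
    ("Female", "No"): 58, ("Female", "Yes"): 65,
}

def pct_using_internet(sex):
    """Percent of this sex (aged 18-29) who use the Internet."""
    yes = counts[(sex, "Yes")]
    total = yes + counts[(sex, "No")]
    return 100 * yes / total

def pct_of_internet_users(sex):
    """Percent of 18-29 Internet users who are of this sex."""
    users = counts[("Male", "Yes")] + counts[("Female", "Yes")]
    return 100 * counts[(sex, "Yes")] / users

for sex in ("Male", "Female"):
    print(sex, round(pct_using_internet(sex), 1),
          round(pct_of_internet_users(sex), 1))
```

The "% within Age category" figures in the SPSS table correspond to the second function, while the question of how many young men or women use the Internet corresponds to the first.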

Section-ii (a)

Figure-2: Histogram for ADDSC

Interpretation: The histogram for the ADDSC variable shows the values on the horizontal
axis and the frequencies on the vertical axis. From the histogram, it is clear that ADDSC is
approximately normally distributed.
Section-ii (b)

Figure-3: Boxplot for ADDSC

Figure-4: Box plot for ADDSC with and without social problems

Interpretation: According to the above Box plot, the median value of addsc for those who
have no social problems (nearly 45.0) is lower than for those who have social problems
(above 60.0). This plot also shows that ADDSC is positively skewed with social problems
and negatively skewed with social problems. In both distributions, there is no outlier.

Section-ii (c)
Figure–5: Scatter plot for the relationship between ADD symptoms and GPA
Interpretation: The graph clearly shows a negative linear relationship between ADD
symptoms and GPA.

Section-ii (d)

Figure–6: Pie chart for the percent of types of English classes

Section-ii (e)
Figure–7: Bar chart for the mean GPA of students in the 3 types of English classes

Interpretation: From the above figure, it is clear that the three English classes do not appear
to be similar in mean GPA. Class 1 has the highest mean GPA (3.09), followed by class 2 and
class 3.

Section-ii (f)

Figure–8: Bar chart and a line graph for the mean difference in GPA based on both gender
and level of English class
Interpretation: Comparing the distributions, females have a higher mean GPA than males, and the
first category of English class has the highest mean GPA among the 3 classes. So, both gender and
English class appear to be related to GPA. In this case, I prefer the line graph because it shows
both the better performers and the direction of the trend. In addition, a line graph is most
useful for showing trends and for identifying whether two variables are related to one another.

Now, to see if there is an interaction effect, we need to run the univariate General Linear
Model, which is given below:

Tests of Between-Subjects Effects
Dependent Variable: gpa
Source            Type III Sum of Squares   df   Mean Square   F         Sig.
Corrected Model   .015(a)                   1    .015          .020      .888
Intercept         80.140                    1    80.140        106.779   .000
sex * engl        .015                      1    .015          .020      .888
Error             64.544                    86   .751
Total             595.478                   88
Corrected Total   64.559                    87
a. R Squared = .000 (Adjusted R Squared = -.011)

There is no significant interaction effect of gender and type of class on mean GPA (F = .020, Sig. = .888).

Section-iii (a)
Figure– 9: Pie Chart for the percentage of employees within each category.

Section-iii (b)

Figure– 10: Bar chart for distribution of employees


Section-iii (c)
Figure– 11: Bar chart for showing the mean beginning salary of males and females
Here, the mean beginning salary of males is 8121 and the mean beginning salary of females is
5237.

Section-iii (d)

Figure– 12: Histogram for the beginning salary of all employees.

Here we can see the beginning salary is not normally distributed.


Section-iii (e)

BEGINNING SALARY Stem-and-Leaf Plot

Frequency Stem & Leaf

11.00 3 . 66999
46.00 4 . 0000000000022333333344
62.00 4 . 555555555555566668888888888999
56.00 5 . 11111111222444444444444444&
37.00 5 . 55566777777777778
106.00 6 . 0000000000000000000000000013333333333333333333333334&
38.00 6 . 666666666666699999&
14.00 7 . 2222222
20.00 7 . 5558888889
11.00 8 . 1444&
6.00 8 . 77&
5.00 9 . 34&
1.00 9 . &
1.00 10 . &
60.00 Extremes (>=10200)

Stem width: 1000


Each leaf: 2 case(s)

& denotes fractional leaves.

Stem and leaf for beginning salary of all employees.

Question- 5 Section- i
5-i (a)
As both variables (degree, life) are categorical (nominal or ordinal), the chi-square test is
appropriate for testing the hypothesis. We can formulate the following hypotheses:
H0: There is no relationship between degree and life.
Ha: There is a relationship between degree and life.

Chi-Square Tests
                               Value       df   Asymptotic Significance (2-sided)
Pearson Chi-Square             39.428(a)   8    .000
Likelihood Ratio               39.216      8    .000
Linear-by-Linear Association   29.571      1    .000
N of Valid Cases               927
a. 2 cells (13.3%) have expected count less than 5. The minimum expected count is 2.86.

Since the p value is reported as .000 (p < 0.001, less than 0.05), the null hypothesis is
rejected. So there is a relationship between degree and life.

5-i (b)
As both variables (netcat, life) are categorical (nominal or ordinal), the chi-square test is
appropriate for testing the hypothesis. We can formulate the following hypotheses:
H0: There is no relationship between netcat and life.
Ha: There is a relationship between netcat and life.

Chi-Square Tests
                               Value      df   Asymptotic Significance (2-sided)
Pearson Chi-Square             8.951(a)   6    .176
Likelihood Ratio               9.019      6    .172
Linear-by-Linear Association   .370       1    .543
N of Valid Cases               426
a. 4 cells (33.3%) have expected count less than 5. The minimum expected count is 1.58.

Since the p value is 0.176 (higher than 0.05), we fail to reject the null hypothesis. So there
is no evidence of a relationship between netcat and life. Because 33.3% of the cells have an
expected count below 5, this result should be read with caution.
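
The significance values in these tables are upper-tail probabilities of the chi-square distribution. For an even number of degrees of freedom that probability has a simple closed form, which the sketch below uses so that no statistics library is needed. The function name is an assumption, and the shortcut does not apply to odd degrees of freedom.

```python
import math

def chi2_sf_even_df(x, df):
    """Upper-tail probability P(X > x) for chi-square with EVEN df.

    For df = 2k: exp(-x/2) * sum_{i=0}^{k-1} (x/2)^i / i!
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form only applies to even, positive df")
    half = x / 2
    term = 1.0   # (x/2)^0 / 0!
    total = 1.0
    for i in range(1, df // 2):
        term *= half / i
        total += term
    return math.exp(-half) * total

# Pearson chi-square values from parts (a) and (b)
p_degree_life = chi2_sf_even_df(39.428, 8)
p_netcat_life = chi2_sf_even_df(8.951, 6)

print(p_degree_life < 0.001, round(p_netcat_life, 3))
```

The first value is far below 0.001, which is why SPSS prints it as .000; the second reproduces the reported .176.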
5-i (c)
As both variables (postlife, degree) are categorical (nominal or ordinal), the chi-square test is
appropriate for testing the hypothesis. We can formulate the following hypotheses:
Ho: There is no relationship between postlife and degree.
Ha: There is a relationship between postlife and degree.

Chi-Square Tests
Asymptotic
Significance (2-
Value df sided)
Pearson Chi-Square 13.037a 4 .011
Likelihood Ratio 12.163 4 .016
Linear-by-Linear Association 3.240 1 .072
N of Valid Cases 829
a. 0 cells (0.0%) have expected count less than 5. The minimum
expected count is 9.31.

Since the p value is 0.011 (less than 0.05), the null hypothesis is rejected. So
there is a relationship between postlife and degree.

5-i (d)

As both variables (tvhours, sex) are treated as categorical (nominal or ordinal), the chi-square test is
appropriate for testing the hypothesis. We can formulate the following hypotheses:
Ho: There is no relationship between tvhours and sex.
Ha: There is a relationship between tvhours and sex.

Chi-Square Tests
Asymptotic
Significance (2-
Value df sided)
Pearson Chi-Square 18.549a 15 .235
Likelihood Ratio 20.588 15 .151
Linear-by-Linear Association .005 1 .946
N of Valid Cases 906
a. 11 cells (34.4%) have expected count less than 5. The minimum
expected count is .42.

Since the p value is 0.235 (higher than 0.05), we fail to reject the null hypothesis.
So there is no evidence of a relationship between tvhours and sex.

5-i (e)
As respondent's income is a scale-level variable and two groups are being compared, an
independent-samples t test is appropriate for testing the hypothesis. We can formulate the following hypotheses:
Ho: The average income is the same for the two groups (those who use and those who do not use the internet)
Ha: The average income is not the same for the two groups (those who use and those who do not use the internet)

Report
Respondent's income; ranges recoded to midpoints
Use Internet? Mean N Std. Deviation
No 23644.67 375 19160.294
Yes 36518.94 528 25903.569
Total 31172.48 903 24177.356

Here we can see from the table that the average income of those who use the internet is 36518.94
and that of those who do not is 23644.67.

One-Sample Test
Test Value = 0
95% Confidence Interval of the
Difference
t df Sig. (2-tailed) Mean Difference Lower Upper
Use Internet? 35.154 1387 .000 .471 .44 .50
Respondent's income; ranges recoded to midpoints 39.279 920 .000 31106.678 29552.45 32660.91

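The one-sample test above compares each mean with the test value 0, so its p value of .000 (p < 0.001)
shows only that mean income differs from zero. The Report table shows that the group who uses the
internet has the higher mean income (36518.94 versus 23644.67); an independent-samples t test on the
two groups is needed to confirm that this difference is statistically significant.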
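
The comparison of the two income groups can be sketched directly from the summary statistics in the Report table. This is only an illustration under stated assumptions: it uses Welch's unequal-variance t statistic, which may not be the variant SPSS would report, and welch_t is a hypothetical helper name.

```python
import math

def welch_t(mean1, sd1, n1, mean2, sd2, n2):
    """Welch t statistic and approximate df from group summary statistics."""
    v1, v2 = sd1 ** 2 / n1, sd2 ** 2 / n2
    t = (mean1 - mean2) / math.sqrt(v1 + v2)
    # Welch-Satterthwaite approximation for the degrees of freedom
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df

# Means, standard deviations and N for "Yes" and "No" taken from the Report table
t_stat, df = welch_t(36518.94, 25903.569, 528, 23644.67, 19160.294, 375)
print(round(t_stat, 2), round(df, 1))
```

With several hundred degrees of freedom, a t value of this size lies far beyond the two-tailed 5% critical value of about 1.96, which is consistent with internet users having the higher mean income.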
5-ii (a)

One-Sample Test
Test Value = 20
95% Confidence Interval of the
Difference
t df Sig. (2-tailed) Mean Difference Lower Upper
SCORE 20.613 27 .000 26.214 23.60 28.82

From the table, we can see that the observed significance level is reported as .000 (p < 0.001), which is less than
0.05. So the null hypothesis that the mean score equals 20 is rejected; the students' scores are not consistent
with the score of 20 expected from guessing. Interpretation: The probability of observing a sample t value greater than +20.613
or less than -20.613 is given by the entry labeled significance (2-tailed).

5-ii (b)

Paired Samples Test

Paired Differences
95% Confidence Interval of
the Difference
Std. Std. Error Sig. (2-
Mean Deviation Mean Lower Upper t df tailed)
Pair 1 ELEVATED - LEVEL .01900 .13715 .04337 -.07911 .11711 .438 9 .672

From the table, the observed significance level is high (0.672, greater than 0.05). So we fail to reject
the null hypothesis that there is no difference in the moon illusion between the eyes-elevated and the
eyes-level conditions.
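
The t value in the Paired Samples Test table follows directly from the mean and standard deviation of the paired differences. A minimal sketch, assuming n = 10 pairs (df + 1) and using a hypothetical helper name paired_t:

```python
import math

def paired_t(mean_diff, sd_diff, n):
    """Paired-samples t statistic and standard error from difference summaries."""
    se = sd_diff / math.sqrt(n)   # standard error of the mean difference
    return mean_diff / se, se

# Mean and Std. Deviation of ELEVATED - LEVEL from the table; n = df + 1 = 10
t_moon, se_moon = paired_t(0.01900, 0.13715, 10)
print(round(t_moon, 3), round(se_moon, 5))
```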

5-ii (c)
The histogram shows that the moon illusion in the eyes-elevated condition is slightly larger than in the
eyes-level condition.
5-ii (d)

From the boxplot, we can see that there are outliers.

5-ii (e)

Independent Samples Test


Levene's Test for
Equality of Variances t-test for Equality of Means
95% Confidence Interval
Sig. (2- Mean Std. Error of the Difference
F Sig. t df tailed) Difference Difference Lower Upper
Gain Equal variances assumed .305 .584 -3.223 41 .002 -7.7147 2.3939 -12.5492 -2.8802
Equal variances not assumed -3.299 36.979 .002 -7.7147 2.3384 -12.4528 -2.9766

Here, the p value is 0.002 (less than 0.05), so the null hypothesis is rejected. We can say that
mean weight gain differs between the control group and the family therapy group.
5-ii (f)

Independent Samples Test


Levene's Test for
Equality of Variances t-test for Equality of Means
95% Confidence
Interval of the
Sig. (2- Mean Std. Error Difference
F Sig. t df tailed) Difference Difference Lower Upper
Gain Equal variances assumed .557 .459 -1.676 53 .100 -3.4569 2.0626 -7.5939 .6801
Equal variances not assumed -1.668 50.971 .101 -3.4569 2.0728 -7.6183 .7045

Here, the p value is 0.100 (higher than 0.05), so we fail to reject the null hypothesis. We can
conclude that there is no significant difference in weight gain between the cognitive behavioral
therapy group and the control group.
Comparing the two results, the difference from the control group is significant for family therapy but
not for cognitive behavioral therapy. At the 5% significance level we can therefore conclude that family
therapy is the more successful treatment. However, we cannot draw a conclusion about cognitive
behavioral therapy.

5-ii (g)

Figure: Box plot for weight gain for the control group, family therapy and cognitive
behavior therapy
5-ii (h)

Chi-Square Tests
Asymptotic
Significance (2- Exact Sig. (2- Exact Sig. (1-
Value df sided) sided) sided)
Pearson Chi-Square 2.000a 1 .157
Continuity Correctionb .000 1 1.000
Likelihood Ratio 2.773 1 .096
Fisher's Exact Test 1.000 .500
Linear-by-Linear Association 1.000 1 .317
N of Valid Cases 2
a. 4 cells (100.0%) have expected count less than 5. The minimum expected count is .50.
b. Computed only for a 2x2 table

Based on the observed significance level for the chi-square statistic, we cannot reject the null
hypothesis that rats are no more likely than chance to choose Alley D. The observed
significance level for the Pearson chi-square is 0.157, which is above the standard 0.05.

Question-5 Section-iii

5-iii (a)

The hypotheses are as follows:

Ho: The mean of the 120 test scores does not differ from the population mean of 22.

Ha: The mean of the 120 test scores differs from the population mean of 22.

5-iii (b)

One-Sample Test
Test Value = 22
95% Confidence Interval of the
Difference
t df Sig. (2-tailed) Mean Difference Lower Upper
Test_score 4.331 119 .000 1.98192 1.0758 2.8881
Here we can see that the p value is reported as .000 (p < 0.001, less than 0.05). So the null hypothesis is rejected.
Therefore we can say that the mean of the 120 test scores differs from the population mean of
22.

5-iii (c)

Descriptives
Statistic Std. Error
Test_score Mean 23.9819 .45763
95% Confidence Interval for Lower Bound 23.0758
Mean Upper Bound 24.8881
5% Trimmed Mean 23.9136
Median 24.1650
Variance 25.131
Std. Deviation 5.01305
Minimum 10.05
Maximum 37.72
Range 27.67
Interquartile Range 6.42
Skewness .154 .221
Kurtosis .205 .438

Table: Explorative data analysis
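
The Descriptives table also allows the one-sample t value from 5-iii (b) to be reproduced by hand. A minimal sketch, using only the mean, standard deviation and N = 120 reported in this section (the variable names are illustrative):

```python
import math

n = 120                 # number of test scores (df + 1 from the one-sample test)
sample_mean = 23.9819   # Mean from the Descriptives table
sample_sd = 5.01305     # Std. Deviation from the Descriptives table
test_value = 22         # hypothesised population mean

se_mean = sample_sd / math.sqrt(n)              # standard error of the mean
t_score = (sample_mean - test_value) / se_mean  # one-sample t statistic
print(round(se_mean, 5), round(t_score, 3))
```

The standard error matches the Std. Error column of the Descriptives table, and the t value matches the One-Sample Test output up to rounding.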

Test_score Stem-and-Leaf Plot

Frequency Stem & Leaf

1.00 Extremes (=<10)


1.00 1 . 2
2.00 1 . 55
10.00 1 . 6666777777
12.00 1 . 888888899999
14.00 2 . 00000001111111
19.00 2 . 2222222223333333333
23.00 2 . 44444455555555555555555
20.00 2 . 66666666667777777777
5.00 2 . 88999
3.00 3 . 011
7.00 3 . 2222233
1.00 3 . 5
1.00 3 . 6
1.00 Extremes (>=38)

Stem width: 10.00


Each leaf: 1 case(s)
Figure: Boxplot of test scores

We can see that there are outliers in this dataset of test scores.

Question-6 Section-i

6-i (a)

Descriptive Statistics
N Mean Std. Deviation
birthweight 18 2879.1667 789.19589
trimester 18 2.1111 1.02262
Valid N (listwise) 18

Table for Mann-Whitney test for the variables of babies born to mothers for different trimester

Test Statisticsa
birthweight
Mann-Whitney U 16.000
Wilcoxon W 71.000
Z -2.132
Asymp. Sig. (2-tailed) .033
Exact Sig. [2*(1-tailed Sig.)] .034b
a. Grouping Variable: trimester
b. Not corrected for ties.

From the table, we can see that the p value is 0.033 (less than 0.05), so the null hypothesis is
rejected. We can say that there is a statistically significant difference between the birth weights of
babies born to mothers who began prenatal care in the third trimester and those who began prenatal
care in the first trimester.
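
The Z value in the Test Statistics table can be approximated from U with the usual normal approximation. The sketch below makes two labeled assumptions: the group sizes 10 and 8 are inferred from Wilcoxon W = 71, U = 16 and N = 18 (they are not printed in the output), and no tie correction is applied. mann_whitney_z is a hypothetical helper name.

```python
import math

def mann_whitney_z(u, n1, n2):
    """Normal approximation to the Mann-Whitney U statistic (no tie correction)."""
    mean_u = n1 * n2 / 2.0
    sd_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12.0)
    return (u - mean_u) / sd_u

# Assumed group sizes: W - U = n1 * (n1 + 1) / 2 = 55 gives n1 = 10, so n2 = 18 - 10 = 8
z_birth = mann_whitney_z(16.0, 10, 8)
print(round(z_birth, 3))
```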

6-i (b)

Ranks
N Mean Rank Sum of Ranks
AFTER - BEFORE Negative Ranks 4a 2.75 11.00
Positive Ranks 13b 10.92 142.00
Ties 0c
Total 17
a. AFTER < BEFORE
b. AFTER > BEFORE
c. AFTER = BEFORE

Test Statisticsa
AFTER -
BEFORE
Z -3.101b
Asymp. Sig. (2-tailed) .002
a. Wilcoxon Signed Ranks Test
b. Based on negative ranks.

Since the p value is 0.002 (less than 0.05), we can say that there is a significant difference in weight
between the pre and post measurements.
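
The Z value of the Wilcoxon signed-ranks test can be recovered from the Ranks table. A minimal sketch, assuming the standard normal approximation without a tie correction; wilcoxon_z is a hypothetical helper name.

```python
import math

def wilcoxon_z(rank_sum, n):
    """Normal approximation for a signed-rank sum over n non-zero differences."""
    mean_w = n * (n + 1) / 4.0
    sd_w = math.sqrt(n * (n + 1) * (2 * n + 1) / 24.0)
    return (rank_sum - mean_w) / sd_w

# Sum of negative ranks = 11.00 and N = 17 from the Ranks table
z_weight = wilcoxon_z(11.0, 17)
print(round(z_weight, 3))
```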

6-i (c)

Ranks
Group N Mean Rank
Adapt LBW-Exp 29 40.17
LBW-Control 27 60.83
Full-Term 37 42.26
Total 93

Test Statisticsa,b
Adapt
Kruskal-Wallis H 10.189
df 2
Asymp. Sig. .006
a. Kruskal Wallis Test
b. Grouping Variable: Group

Since the p value is 0.006 (less than 0.05), the result supports the conclusion that there is a significant
difference in maternal role adaptation among the 3 groups of mothers.
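
The Kruskal-Wallis statistic can be approximated from the group sizes and mean ranks in the Ranks table. This sketch computes H without a tie correction, so it comes out slightly below the 10.189 reported by SPSS (presumably because SPSS corrects for ties); kruskal_wallis_h is a hypothetical helper name. For df = 2 the chi-square upper tail is simply exp(-H/2).

```python
import math

def kruskal_wallis_h(groups):
    """Uncorrected Kruskal-Wallis H from a list of (n, mean_rank) pairs."""
    total_n = sum(n for n, _ in groups)
    grand_mean_rank = (total_n + 1) / 2.0
    ss = sum(n * (mean_rank - grand_mean_rank) ** 2 for n, mean_rank in groups)
    return 12.0 / (total_n * (total_n + 1)) * ss

# (N, Mean Rank) for LBW-Exp, LBW-Control and Full-Term from the Ranks table
h_uncorrected = kruskal_wallis_h([(29, 40.17), (27, 60.83), (37, 42.26)])
p_reported = math.exp(-10.189 / 2.0)   # upper tail for the reported H with df = 2
print(round(h_uncorrected, 2), round(p_reported, 3))
```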

6-i (d)

Friedman Test
Ranks
Mean Rank
Group 1.01
Recall 1.99

Test Statisticsa
N 50
Chi-Square 49.000
df 1
Asymp. Sig. .000
a. Friedman Test

On the basis of the p value (p < 0.001), we can conclude that there is a significant difference in the
effect of processing condition on recall.

6 Section-ii

i(a)

Descriptives
Statistic Std. Error
male Mean 135.67 17.452
95% Confidence Interval for Lower Bound 90.80
Mean Upper Bound 180.53
5% Trimmed Mean 136.13
Median 138.00
Variance 1827.467
Std. Deviation 42.749
Minimum 83
Maximum 180
Range 97
Interquartile Range 81
Skewness -.125 .845
Kurtosis -2.718 1.741
Female Mean 45.00 2.989
95% Confidence Interval for Lower Bound 37.32
Mean Upper Bound 52.68
5% Trimmed Mean 45.00
Median 45.00
Variance 53.600
Std. Deviation 7.321
Minimum 35
Maximum 55
Range 20
Interquartile Range 13
Skewness .000 .845
Kurtosis -1.033 1.741

Figure: Normal Q-Q plot for male


Figure: Detrended Q-Q plot for male

Figure: Normal Q-Q plot for female


Figure: Detrended Q-Q plot for female

6 Section -ii

i(b)

In this scenario, nonparametric statistics or distribution free tests are ones that don't rely on precise
parameter estimates or distributional assumptions.

6 Section -ii

i(c)

Because the Kruskal-Wallis test handles more than two independent samples and tests whether they are
drawn from the same population, it can be used to compare three groups. On the other hand, when there
are more than two repeated assessments, Friedman's test provides a suitable measure for a
comparison of three groups.
Levene's test can also be considered when comparing the three groups, since it examines the
homogeneity (equality) of variance across more than two samples.
6 Section -ii

i(d)

Paired Samples Test


Paired Differences
95% Confidence
Interval of the
Difference
Std. Std. Error Sig. (2-
Mean Deviation Mean Lower Upper t df tailed)
Pair 1 male - Female 90.667 39.611 16.171 49.097 132.236 5.607 5 .002

Here we can see the p value is 0.002, which is less than 0.05. So, we can reject the null hypothesis. At
the α = 0.05 level of significance, there is enough evidence to conclude that there is a difference
between the male meerkats and the female meerkats.

6 Section -ii

ii (a)

Descriptives
Statistic Std. Error
younger_child Mean 1.80 .512
95% Confidence Interval for Lower Bound .64
Mean Upper Bound 2.96
5% Trimmed Mean 1.72
Median 2.00
Variance 2.622
Std. Deviation 1.619
Minimum 0
Maximum 5
Range 5
Interquartile Range 3
Skewness .597 .687
Kurtosis .133 1.334
older_child Mean 5.80 .533
95% Confidence Interval for Lower Bound 4.59
Mean Upper Bound 7.01
5% Trimmed Mean 5.78
Median 6.00
Variance 2.844
Std. Deviation 1.687
Minimum 4
Maximum 8
Range 4
Interquartile Range 3
Skewness .042 .687
Kurtosis -1.831 1.334

Figure: Normal Q-Q plot for younger children

Figure: Detrended Q-Q plot for younger children


Figure: Normal Q-Q plot for older children

Figure: Detrended Q-Q plot for older children


ii (b)
Under this condition, I choose nonparametric (distribution-free) tests, which do
not rely on parameter estimates or precise assumptions about the distribution of variables.

ii (c)
The Kruskal-Wallis test is an appropriate measure for a comparison of three groups because it handles
more than two independent samples and tests whether they are taken from the same population. On
the other hand, Friedman's test is an appropriate measure for a comparison of three groups when
there are more than two repeated measures.
In addition, we can also consider Levene's test when comparing the three groups because it
tests the equality (homogeneity) of variance of more than two samples.
ii (d)

Paired Samples Test

Paired Differences
95% Confidence
Interval of the
Difference
Std. Std. Error Sig. (2-
Mean Deviation Mean Lower Upper t df tailed)
Pair 1 younger_child - older_child -4.000 1.414 .447 -5.012 -2.988 -8.944 9 .000

We reject the null hypothesis because the p value is reported as .000 (p < 0.001). There is sufficient
evidence to conclude that there is a difference between younger children and older children at the α = 0.01 level of
significance.

iii (a)

Descriptives
Statistic Std. Error
Lesions Mean 16.67 2.877
95% Confidence Interval for Lower Bound 10.03
Mean Upper Bound 23.30
5% Trimmed Mean 16.13
Median 15.00
Variance 74.500
Std. Deviation 8.631
Minimum 7
Maximum 36
Range 29
Interquartile Range 10
Skewness 1.447 .717
Kurtosis 2.911 1.400
control Mean 6.89 .790
95% Confidence Interval for Lower Bound 5.07
Mean Upper Bound 8.71
5% Trimmed Mean 6.88
Median 6.00
Variance 5.611
Std. Deviation 2.369
Minimum 4
Maximum 10
Range 6
Interquartile Range 5
Skewness .024 .717
Kurtosis -1.963 1.400

iii (b)

Under this condition, I choose nonparametric (distribution-free) tests, which do
not rely on parameter estimates or precise assumptions about the distribution of variables.

iii (c)

The Kruskal-Wallis test is an appropriate measure for a comparison of the three groups because it
handles more than two independent samples and tests whether they are taken from the same
population. On the other hand, Friedman's test is an appropriate measure for a comparison of three
groups when there are more than two repeated measures. In addition, we can also consider Levene's
test when comparing the three groups because it tests the equality (homogeneity) of variance of
more than two samples.

iii (d)

Paired Samples Test

Paired Differences

95% Confidence
Interval of the
Difference
Std. Std. Error Sig. (2-
Mean Deviation Mean Lower Upper t df tailed)
Pair 1 Lesions - control 9.778 10.109 3.370 2.007 17.548 2.902 8 .020
Since the p value is 0.020, which is less than 0.05, we reject the null hypothesis. At the α = 0.05
level of significance, there is enough evidence to conclude that there is a difference between the
lesioned group of rats and the control group of rats.
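
The 95% confidence interval in the Paired Samples Test table can be checked from the mean difference and its standard error. A minimal sketch, assuming the two-tailed 5% critical value of t for df = 8 is taken from a t table (2.306) rather than computed:

```python
T_CRIT_DF8 = 2.306   # two-tailed 5% critical value of t for df = 8, from a t table

mean_diff = 9.778    # Mean of Lesions - control from the table
se_diff = 3.370      # Std. Error Mean from the table

lower = mean_diff - T_CRIT_DF8 * se_diff
upper = mean_diff + T_CRIT_DF8 * se_diff
print(round(lower, 3), round(upper, 3))
```

The interval excludes zero, which agrees with the Sig. (2-tailed) value of .020 being below 0.05; small differences in the last digit come from rounding in the table.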

iv (a)

Descriptives
Statistic Std. Error
Before Mean 4.40 .909
95% Confidence Interval for Lower Bound 2.34
Mean Upper Bound 6.46
5% Trimmed Mean 4.33
Median 3.50
Variance 8.267
Std. Deviation 2.875
Minimum 1
Maximum 9
Range 8
Interquartile Range 6
Skewness .690 .687
Kurtosis -1.173 1.334
After Mean 6.60 .718
95% Confidence Interval for Lower Bound 4.98
Mean Upper Bound 8.22
5% Trimmed Mean 6.61
Median 7.00
Variance 5.156
Std. Deviation 2.271
Minimum 3
Maximum 10
Range 7
Interquartile Range 3
Skewness -.447 .687
Kurtosis -.177 1.334

iv (b)

Under this condition, I choose nonparametric (distribution-free) tests, which do
not rely on parameter estimates or precise assumptions about the distribution of variables.

iv (c)
The Kruskal-Wallis test is an appropriate measure for a comparison of the three groups because it
handles more than two independent samples and tests whether they are taken from the same
population. On the other hand, Friedman's test is an appropriate measure for a comparison of three
groups when there are more than two repeated measures. In addition, we can also consider Levene's
test when comparing the three groups because it tests the equality (homogeneity) of variance of
more than two samples.

Question-7 Section- i

7 i-(a)

MINORITY CLASSIFICATION
Cumulative
Frequency Percent Valid Percent Percent
Valid WHITE 370 78.1 78.1 78.1
NONWHITE 104 21.9 21.9 100.0
Total 474 100.0 100.0

From the table we can see that whites are the majority, making up 78.1% of the sample.

Statistics
MINORITY CLASSIFICATION
N Valid 474
Missing 0
Mean .22
Median .00
Mode 0
Std. Deviation .414
Skewness 1.360
Std. Error of Skewness .112
Kurtosis -.150
Std. Error of Kurtosis .224
Minimum 0
Maximum 1

Based on the values of the mean, median, mode, skewness and kurtosis, we can say that minority
classification is not normally distributed. Its distribution is positively skewed and slightly platykurtic.
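
Because minority classification takes only two values, its mean and standard deviation follow directly from the frequency table. The sketch below assumes the coding WHITE = 0 and NONWHITE = 1, which is inferred from the mode of 0 and the mean of .22 and is not stated in the output.

```python
import math

n_white, n_nonwhite = 370, 104   # frequencies; assumed codes 0 and 1 respectively
n = n_white + n_nonwhite

mean = n_nonwhite / n            # mean of a 0/1 variable is the proportion of 1s
# Sample variance with the n - 1 denominator, as in the Statistics table
variance = (n_white * (0 - mean) ** 2 + n_nonwhite * (1 - mean) ** 2) / (n - 1)
sd = math.sqrt(variance)
print(round(mean, 2), round(sd, 3))
```

The mean is simply the proportion of nonwhite employees, which is why the summary statistics of a dichotomous variable say little about normality.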
7 i-(b)

BEGINNING SALARY
Cumulative
Frequency Percent Valid Percent Percent
Valid 3600 4 .8 .8 .8
3900 7 1.5 1.5 2.3
4020 1 .2 .2 2.5
4080 21 4.4 4.4 7.0
4200 4 .8 .8 7.8
4260 1 .2 .2 8.0
4380 15 3.2 3.2 11.2
4440 3 .6 .6 11.8
4490 1 .2 .2 12.0
4500 25 5.3 5.3 17.3
4560 2 .4 .4 17.7
4620 9 1.9 1.9 19.6
4800 19 4.0 4.0 23.6
4860 1 .2 .2 23.8
4920 1 .2 .2 24.1
4980 5 1.1 1.1 25.1
5040 1 .2 .2 25.3
5100 15 3.2 3.2 28.5
5160 1 .2 .2 28.7
5220 4 .8 .8 29.5
5280 3 .6 .6 30.2
5340 1 .2 .2 30.4
5400 31 6.5 6.5 36.9
5520 2 .4 .4 37.3
5580 5 1.1 1.1 38.4
5640 4 .8 .8 39.2
5700 21 4.4 4.4 43.7
5760 2 .4 .4 44.1
5820 2 .4 .4 44.5
5880 1 .2 .2 44.7
6000 52 11.0 11.0 55.7
6120 2 .4 .4 56.1
6240 1 .2 .2 56.3
6300 49 10.3 10.3 66.7
6420 2 .4 .4 67.1
6600 26 5.5 5.5 72.6
6720 1 .2 .2 72.8
6840 1 .2 .2 73.0
6900 8 1.7 1.7 74.7
6996 2 .4 .4 75.1
7200 14 3.0 3.0 78.1
7500 6 1.3 1.3 79.3
7800 12 2.5 2.5 81.9
7992 2 .4 .4 82.3
8100 2 .4 .4 82.7
8160 1 .2 .2 82.9
8220 1 .2 .2 83.1
8400 4 .8 .8 84.0
8496 3 .6 .6 84.6
8592 1 .2 .2 84.8
8700 4 .8 .8 85.7
8796 1 .2 .2 85.9
9000 1 .2 .2 86.1
9300 2 .4 .4 86.5
9492 2 .4 .4 86.9
9996 1 .2 .2 87.1
10000 1 .2 .2 87.3
10200 2 .4 .4 87.8
10500 3 .6 .6 88.4
10800 1 .2 .2 88.6
10992 6 1.3 1.3 89.9
11004 2 .4 .4 90.3
11100 1 .2 .2 90.5
11496 3 .6 .6 91.1
11796 1 .2 .2 91.4
12000 4 .8 .8 92.2
12300 1 .2 .2 92.4
12500 1 .2 .2 92.6
12600 1 .2 .2 92.8
12792 2 .4 .4 93.2
12804 1 .2 .2 93.5
12996 5 1.1 1.1 94.5
13200 3 .6 .6 95.1
13500 2 .4 .4 95.6
13992 5 1.1 1.1 96.6
14004 1 .2 .2 96.8
14016 1 .2 .2 97.0
14496 1 .2 .2 97.3
14700 1 .2 .2 97.5
15000 1 .2 .2 97.7
15996 1 .2 .2 97.9
16992 1 .2 .2 98.1
17004 1 .2 .2 98.3
17400 1 .2 .2 98.5
17640 1 .2 .2 98.7
18000 2 .4 .4 99.2
18996 1 .2 .2 99.4
21000 1 .2 .2 99.6
24000 1 .2 .2 99.8
31992 1 .2 .2 100.0
Total 474 100.0 100.0

The table shows that the most common beginning salary is $6000, received by the highest percentage
of persons (11.0%).

SEX OF EMPLOYEE
Cumulative
Frequency Percent Valid Percent Percent
Valid MALES 258 54.4 54.4 54.4
FEMALES 216 45.6 45.6 100.0
Total 474 100.0 100.0

We can see that there are more male employees (54.4%) than female employees (45.6%).

AGE OF EMPLOYEE
Cumulative
Frequency Percent Valid Percent Percent
Valid 23.00 1 .2 .2 .2
23.25 2 .4 .4 .6
23.33 1 .2 .2 .8
23.42 3 .6 .6 1.5
23.58 1 .2 .2 1.7
23.67 3 .6 .6 2.3
23.75 1 .2 .2 2.5
24.00 2 .4 .4 3.0
24.08 2 .4 .4 3.4
24.17 2 .4 .4 3.8
24.33 5 1.1 1.1 4.9
24.42 2 .4 .4 5.3
24.50 2 .4 .4 5.7
24.58 2 .4 .4 6.1
24.67 2 .4 .4 6.5
24.75 3 .6 .6 7.2
24.83 3 .6 .6 7.8
24.92 3 .6 .6 8.4
25.00 3 .6 .6 9.1
25.08 4 .8 .8 9.9
25.17 1 .2 .2 10.1
25.25 3 .6 .6 10.8
25.42 3 .6 .6 11.4
25.50 3 .6 .6 12.0
25.58 4 .8 .8 12.9
25.75 2 .4 .4 13.3
25.83 3 .6 .6 13.9
25.92 1 .2 .2 14.1
26.08 1 .2 .2 14.3
26.25 3 .6 .6 15.0
26.33 1 .2 .2 15.2
26.58 1 .2 .2 15.4
26.67 1 .2 .2 15.6
26.83 4 .8 .8 16.5
26.92 1 .2 .2 16.7
27.00 1 .2 .2 16.9
27.08 3 .6 .6 17.5
27.17 2 .4 .4 17.9
27.25 3 .6 .6 18.6
27.33 3 .6 .6 19.2
27.42 3 .6 .6 19.8
27.50 2 .4 .4 20.3
27.58 4 .8 .8 21.1
27.67 2 .4 .4 21.5
27.83 2 .4 .4 21.9
28.00 2 .4 .4 22.4
28.08 1 .2 .2 22.6
28.17 3 .6 .6 23.2
28.33 4 .8 .8 24.1
28.42 4 .8 .8 24.9
28.50 3 .6 .6 25.5
28.67 5 1.1 1.1 26.6
28.75 4 .8 .8 27.4
28.83 3 .6 .6 28.1
29.00 2 .4 .4 28.5
29.08 4 .8 .8 29.3
29.17 4 .8 .8 30.2
29.25 3 .6 .6 30.8
29.33 3 .6 .6 31.4
29.42 1 .2 .2 31.6
29.50 6 1.3 1.3 32.9
29.58 4 .8 .8 33.8
29.67 4 .8 .8 34.6
29.75 4 .8 .8 35.4
29.92 4 .8 .8 36.3
30.00 1 .2 .2 36.5
30.08 3 .6 .6 37.1
30.17 5 1.1 1.1 38.2
30.25 4 .8 .8 39.0
30.33 6 1.3 1.3 40.3
30.42 4 .8 .8 41.1
30.50 2 .4 .4 41.6
30.58 1 .2 .2 41.8
30.67 4 .8 .8 42.6
30.75 5 1.1 1.1 43.7
30.83 1 .2 .2 43.9
30.92 2 .4 .4 44.3
31.00 2 .4 .4 44.7
31.08 1 .2 .2 44.9
31.17 3 .6 .6 45.6
31.25 2 .4 .4 46.0
31.33 1 .2 .2 46.2
31.42 1 .2 .2 46.4
31.50 3 .6 .6 47.0
31.67 3 .6 .6 47.7
31.75 4 .8 .8 48.5
31.92 5 1.1 1.1 49.6
32.00 3 .6 .6 50.2
32.08 5 1.1 1.1 51.3
32.17 1 .2 .2 51.5
32.25 3 .6 .6 52.1
32.33 2 .4 .4 52.5
32.50 2 .4 .4 53.0
32.67 4 .8 .8 53.8
32.83 2 .4 .4 54.2
32.92 3 .6 .6 54.9
33.08 1 .2 .2 55.1
33.33 1 .2 .2 55.3
33.42 2 .4 .4 55.7
33.50 4 .8 .8 56.5
33.67 1 .2 .2 56.8
33.75 2 .4 .4 57.2
33.83 2 .4 .4 57.6
34.00 1 .2 .2 57.8
34.17 3 .6 .6 58.4
34.25 2 .4 .4 58.9
34.33 2 .4 .4 59.3
34.50 1 .2 .2 59.5
34.58 2 .4 .4 59.9
34.67 1 .2 .2 60.1
34.75 1 .2 .2 60.3
34.83 1 .2 .2 60.5
34.92 1 .2 .2 60.8
35.17 2 .4 .4 61.2
35.25 1 .2 .2 61.4
35.33 1 .2 .2 61.6
35.42 2 .4 .4 62.0
35.58 1 .2 .2 62.2
35.67 1 .2 .2 62.4
36.00 1 .2 .2 62.7
36.92 1 .2 .2 62.9
37.08 1 .2 .2 63.1
37.17 1 .2 .2 63.3
37.50 1 .2 .2 63.5
37.83 1 .2 .2 63.7
38.00 1 .2 .2 63.9
38.17 1 .2 .2 64.1
38.42 1 .2 .2 64.3
38.50 1 .2 .2 64.6
38.67 1 .2 .2 64.8
38.92 1 .2 .2 65.0
39.00 1 .2 .2 65.2
39.33 2 .4 .4 65.6
39.42 1 .2 .2 65.8
39.50 1 .2 .2 66.0
39.67 3 .6 .6 66.7
39.75 1 .2 .2 66.9
39.83 1 .2 .2 67.1
40.08 1 .2 .2 67.3
40.17 1 .2 .2 67.5
40.33 1 .2 .2 67.7
40.50 1 .2 .2 67.9
40.58 1 .2 .2 68.1
40.67 1 .2 .2 68.4
41.00 1 .2 .2 68.6
41.17 2 .4 .4 69.0
41.67 1 .2 .2 69.2
41.92 2 .4 .4 69.6
42.08 1 .2 .2 69.8
42.17 1 .2 .2 70.0
42.33 1 .2 .2 70.3
42.42 1 .2 .2 70.5
42.58 2 .4 .4 70.9
43.25 1 .2 .2 71.1
43.33 1 .2 .2 71.3
43.42 1 .2 .2 71.5
43.67 1 .2 .2 71.7
43.92 1 .2 .2 71.9
44.00 1 .2 .2 72.2
44.42 1 .2 .2 72.4
44.50 3 .6 .6 73.0
44.58 1 .2 .2 73.2
44.67 1 .2 .2 73.4
44.83 1 .2 .2 73.6
44.92 1 .2 .2 73.8
45.17 1 .2 .2 74.1
45.50 2 .4 .4 74.5
45.67 1 .2 .2 74.7
45.92 1 .2 .2 74.9
46.00 1 .2 .2 75.1
46.17 1 .2 .2 75.3
46.25 2 .4 .4 75.7
46.42 1 .2 .2 75.9
46.50 2 .4 .4 76.4
46.58 2 .4 .4 76.8
47.25 1 .2 .2 77.0
47.33 2 .4 .4 77.4
47.58 2 .4 .4 77.8
47.92 1 .2 .2 78.1
48.00 1 .2 .2 78.3
48.25 1 .2 .2 78.5
48.33 1 .2 .2 78.7
48.50 1 .2 .2 78.9
48.67 1 .2 .2 79.1
48.83 1 .2 .2 79.3
49.08 1 .2 .2 79.5
49.17 1 .2 .2 79.7
49.58 1 .2 .2 80.0
49.92 1 .2 .2 80.2
50.00 1 .2 .2 80.4
50.17 1 .2 .2 80.6
50.25 2 .4 .4 81.0
50.33 1 .2 .2 81.2
51.00 1 .2 .2 81.4
51.17 1 .2 .2 81.6
51.42 2 .4 .4 82.1
51.50 3 .6 .6 82.7
51.58 2 .4 .4 83.1
51.92 1 .2 .2 83.3
52.00 2 .4 .4 83.8
52.17 1 .2 .2 84.0
52.33 1 .2 .2 84.2
52.50 1 .2 .2 84.4
52.92 1 .2 .2 84.6
53.08 1 .2 .2 84.8
53.33 1 .2 .2 85.0
53.50 1 .2 .2 85.2
53.92 3 .6 .6 85.9
54.08 1 .2 .2 86.1
54.17 2 .4 .4 86.5
54.33 1 .2 .2 86.7
54.42 1 .2 .2 86.9
54.92 1 .2 .2 87.1
55.08 1 .2 .2 87.3
55.17 1 .2 .2 87.6
55.25 2 .4 .4 88.0
55.33 1 .2 .2 88.2
55.50 1 .2 .2 88.4
55.58 3 .6 .6 89.0
55.92 1 .2 .2 89.2
56.00 1 .2 .2 89.5
56.67 2 .4 .4 89.9
56.92 1 .2 .2 90.1
57.17 1 .2 .2 90.3
57.42 1 .2 .2 90.5
57.50 1 .2 .2 90.7
57.83 2 .4 .4 91.1
58.00 1 .2 .2 91.4
58.08 1 .2 .2 91.6
58.50 1 .2 .2 91.8
58.75 1 .2 .2 92.0
59.08 2 .4 .4 92.4
59.42 1 .2 .2 92.6
59.50 1 .2 .2 92.8
59.75 1 .2 .2 93.0
59.83 3 .6 .6 93.7
60.00 1 .2 .2 93.9
60.50 3 .6 .6 94.5
60.67 3 .6 .6 95.1
60.75 1 .2 .2 95.4
61.33 1 .2 .2 95.6
61.50 1 .2 .2 95.8
61.67 2 .4 .4 96.2
61.75 1 .2 .2 96.4
62.00 1 .2 .2 96.6
62.08 1 .2 .2 96.8
62.33 1 .2 .2 97.0
62.42 1 .2 .2 97.3
62.50 1 .2 .2 97.5
63.00 1 .2 .2 97.7
63.25 1 .2 .2 97.9
63.42 1 .2 .2 98.1
63.50 1 .2 .2 98.3
63.58 1 .2 .2 98.5
63.75 2 .4 .4 98.9
63.83 1 .2 .2 99.2
63.92 1 .2 .2 99.4
64.25 2 .4 .4 99.8
64.50 1 .2 .2 100.0
Total 474 100.0 100.0

The table shows that the most common ages are 29.50 and 30.33, with 6 employees each, which together
account for 2.6 (1.3+1.3) percent of the total sample.

CURRENT SALARY
Frequency Percent Valid Percent Cumulative
Percent
Valid 6300 1 .2 .2 .2
6360 1 .2 .2 .4
6480 3 .6 .6 1.1
6540 1 .2 .2 1.3
6600 1 .2 .2 1.5
6660 1 .2 .2 1.7
6720 1 .2 .2 1.9
6780 3 .6 .6 2.5
6840 2 .4 .4 3.0
6900 1 .2 .2 3.2
6960 2 .4 .4 3.6
7080 1 .2 .2 3.8
7260 2 .4 .4 4.2
7380 1 .2 .2 4.4
7500 1 .2 .2 4.6
7680 2 .4 .4 5.1
7860 6 1.3 1.3 6.3
7920 1 .2 .2 6.5
7980 2 .4 .4 7.0
8040 2 .4 .4 7.4
8160 3 .6 .6 8.0
8220 1 .2 .2 8.2
8280 2 .4 .4 8.6
8340 5 1.1 1.1 9.7
8400 2 .4 .4 10.1
8460 2 .4 .4 10.5
8520 4 .8 .8 11.4
8580 3 .6 .6 12.0
8640 3 .6 .6 12.7
8700 3 .6 .6 13.3
8760 5 1.1 1.1 14.3
8820 4 .8 .8 15.2
8880 3 .6 .6 15.8
8940 6 1.3 1.3 17.1
9000 7 1.5 1.5 18.6
9060 2 .4 .4 19.0
9120 2 .4 .4 19.4
9180 4 .8 .8 20.3
9240 4 .8 .8 21.1
9300 2 .4 .4 21.5
9360 4 .8 .8 22.4
9420 2 .4 .4 22.8
9480 2 .4 .4 23.2
9540 2 .4 .4 23.6
9600 8 1.7 1.7 25.3
9660 4 .8 .8 26.2
9720 4 .8 .8 27.0
9780 8 1.7 1.7 28.7
9840 2 .4 .4 29.1
9900 4 .8 .8 30.0
9960 1 .2 .2 30.2
10020 4 .8 .8 31.0
10080 5 1.1 1.1 32.1
10140 2 .4 .4 32.5
10200 4 .8 .8 33.3
10260 2 .4 .4 33.8
10320 2 .4 .4 34.2
10380 4 .8 .8 35.0
10440 2 .4 .4 35.4
10500 7 1.5 1.5 36.9
10560 4 .8 .8 37.8
10620 5 1.1 1.1 38.8
10680 7 1.5 1.5 40.3
10740 2 .4 .4 40.7
10800 3 .6 .6 41.4
10860 2 .4 .4 41.8
10920 5 1.1 1.1 42.8
10980 5 1.1 1.1 43.9
11040 2 .4 .4 44.3
11100 7 1.5 1.5 45.8
11160 4 .8 .8 46.6
11220 3 .6 .6 47.3
11280 1 .2 .2 47.5
11340 3 .6 .6 48.1
11400 6 1.3 1.3 49.4
11460 1 .2 .2 49.6
11520 2 .4 .4 50.0
11580 1 .2 .2 50.2
11640 5 1.1 1.1 51.3
11664 1 .2 .2 51.5
11700 2 .4 .4 51.9
11736 1 .2 .2 52.1
11760 5 1.1 1.1 53.2
11820 1 .2 .2 53.4
11880 1 .2 .2 53.6
11940 4 .8 .8 54.4
12000 5 1.1 1.1 55.5
12060 2 .4 .4 55.9
12108 1 .2 .2 56.1
12120 4 .8 .8 57.0
12180 1 .2 .2 57.2
12240 3 .6 .6 57.8
12300 13 2.7 2.7 60.5
12360 3 .6 .6 61.2
12420 1 .2 .2 61.4
12480 3 .6 .6 62.0
12540 3 .6 .6 62.7
12600 3 .6 .6 63.3
12660 4 .8 .8 64.1
12780 4 .8 .8 65.0
12840 1 .2 .2 65.2
12960 1 .2 .2 65.4
13020 3 .6 .6 66.0
13140 1 .2 .2 66.2
13200 1 .2 .2 66.5
13260 1 .2 .2 66.7
13320 3 .6 .6 67.3
13380 1 .2 .2 67.5
13416 1 .2 .2 67.7
13500 1 .2 .2 67.9
13560 6 1.3 1.3 69.2
13764 1 .2 .2 69.4
13800 5 1.1 1.1 70.5
13848 1 .2 .2 70.7
13920 2 .4 .4 71.1
13980 1 .2 .2 71.3
14040 2 .4 .4 71.7
14100 4 .8 .8 72.6
14220 2 .4 .4 73.0
14280 3 .6 .6 73.6
14400 4 .8 .8 74.5
14460 1 .2 .2 74.7
14640 1 .2 .2 74.9
14820 1 .2 .2 75.1
15000 1 .2 .2 75.3
15060 1 .2 .2 75.5
15120 3 .6 .6 76.2
15360 1 .2 .2 76.4
15420 1 .2 .2 76.6
15480 1 .2 .2 76.8
15540 2 .4 .4 77.2
15660 1 .2 .2 77.4
15720 1 .2 .2 77.6
15840 1 .2 .2 77.8
15960 1 .2 .2 78.1
16020 1 .2 .2 78.3
16080 4 .8 .8 79.1
16140 2 .4 .4 79.5
16320 2 .4 .4 80.0
16440 1 .2 .2 80.2
16620 1 .2 .2 80.4
16800 1 .2 .2 80.6
16920 2 .4 .4 81.0
17200 1 .2 .2 81.2
17364 1 .2 .2 81.4
17400 1 .2 .2 81.6
17460 1 .2 .2 81.9
17580 1 .2 .2 82.1
17950 1 .2 .2 82.3
18000 1 .2 .2 82.5
18060 1 .2 .2 82.7
18100 1 .2 .2 82.9
18250 1 .2 .2 83.1
18400 2 .4 .4 83.5
18750 1 .2 .2 83.8
18900 1 .2 .2 84.0
19020 1 .2 .2 84.2
19200 1 .2 .2 84.4
19500 1 .2 .2 84.6
19600 1 .2 .2 84.8
20000 1 .2 .2 85.0
20220 1 .2 .2 85.2
20400 1 .2 .2 85.4
20500 1 .2 .2 85.7
20580 1 .2 .2 85.9
20850 1 .2 .2 86.1
21060 1 .2 .2 86.3
21250 1 .2 .2 86.5
21600 1 .2 .2 86.7
21750 1 .2 .2 86.9
21950 1 .2 .2 87.1
21960 1 .2 .2 87.3
22000 3 .6 .6 88.0
22200 1 .2 .2 88.2
22300 1 .2 .2 88.4
22600 1 .2 .2 88.6
22620 1 .2 .2 88.8
22700 1 .2 .2 89.0
22800 1 .2 .2 89.2
23250 1 .2 .2 89.5
23500 1 .2 .2 89.7
23750 1 .2 .2 89.9
23760 1 .2 .2 90.1
24000 2 .4 .4 90.5
24150 1 .2 .2 90.7
24250 1 .2 .2 90.9
24500 1 .2 .2 91.1
24750 2 .4 .4 91.6
25000 1 .2 .2 91.8
26000 3 .6 .6 92.4
26400 1 .2 .2 92.6
26500 1 .2 .2 92.8
26700 1 .2 .2 93.0
26750 2 .4 .4 93.5
27000 1 .2 .2 93.7
27250 2 .4 .4 94.1
27500 2 .4 .4 94.5
27700 1 .2 .2 94.7
28000 2 .4 .4 95.1
28350 1 .2 .2 95.4
29000 1 .2 .2 95.6
29400 1 .2 .2 95.8
29500 1 .2 .2 96.0
30000 2 .4 .4 96.4
31250 1 .2 .2 96.6
31300 1 .2 .2 96.8
31400 1 .2 .2 97.0
32000 1 .2 .2 97.3
32500 1 .2 .2 97.5
33000 1 .2 .2 97.7
33500 1 .2 .2 97.9
34500 1 .2 .2 98.1
36250 1 .2 .2 98.3
36500 1 .2 .2 98.5
36800 1 .2 .2 98.7
38800 1 .2 .2 98.9
40000 1 .2 .2 99.2
41400 1 .2 .2 99.4
41500 1 .2 .2 99.6
44250 1 .2 .2 99.8
54000 1 .2 .2 100.0
Total 474 100.0 100.0

The table shows that the most common current salary is $12300, held by 13 persons, which accounts for
2.7 percent of the total sample.
EDUCATIONAL LEVEL
Cumulative
Frequency Percent Valid Percent Percent
Valid 8 53 11.2 11.2 11.2
12 190 40.1 40.1 51.3
14 6 1.3 1.3 52.5
15 116 24.5 24.5 77.0
16 59 12.4 12.4 89.5
17 11 2.3 2.3 91.8
18 9 1.9 1.9 93.7
19 27 5.7 5.7 99.4
20 2 .4 .4 99.8
21 1 .2 .2 100.0
Total 474 100.0 100.0

The table shows that the highest percentage of persons belong to educational level 12, which
accounts for 40.1 percent, followed by level 15 with 24.5 percent of the total sample.

7 i-(b)

There are positive correlations between current salary and beginning salary (r = 0.880) and between
current salary and educational level (r = 0.661). But current salary and age of employee are negatively
correlated (r = -0.281).
7-i (c)

The figure shows that the scatter plot of beginning salary and current salary is well described by a
straight line. Thus, we can conclude that there is a strong positive correlation between the variables
current and beginning salary.
A multicollinearity problem may arise if we use highly correlated predictor variables in a linear
regression model. Multicollinearity, which means a high degree of correlation between two or
more predictor variables in a multiple regression model, reduces the precision of the estimated
coefficients and can affect the accuracy of prediction.
Since some observations in the scatter plots of current salary against age of employee and of current
salary against educational level fall far from the cloud of points, we can say that there are potential
outliers.

7-ii (a)

GENDER OF RESPONDENT
Cumulative
Frequency Percent Valid Percent Percent
Valid 0 1532 54.5 54.5 54.5
1 1280 45.5 45.5 100.0
Total 2812 100.0 100.0

Gender has been recoded into the same variable, with male coded as '0' and female coded as '1'.

7-iii (a)

Correlations
iq addsc engg gpa
iq Pearson Correlation 1 -.632** .370** .497**
Sig. (2-tailed) .000 .000 .000
N 88 88 88 88
addsc Pearson Correlation -.632** 1 -.478** -.615**
Sig. (2-tailed) .000 .000 .000
N 88 88 88 88
engg Pearson Correlation .370** -.478** 1 .839**
Sig. (2-tailed) .000 .000 .000
N 88 88 88 88
gpa Pearson Correlation .497** -.615** .839** 1
Sig. (2-tailed) .000 .000 .000
N 88 88 88 88
**. Correlation is significant at the 0.01 level (2-tailed).

Table: Correlation of the variables (2-tailed)

Correlations
iq addsc engg gpa
iq Pearson Correlation 1 -.632** .370** .497**
Sig. (1-tailed) .000 .000 .000
N 88 88 88 88
addsc Pearson Correlation -.632** 1 -.478** -.615**
Sig. (1-tailed) .000 .000 .000
N 88 88 88 88
engg Pearson Correlation .370** -.478** 1 .839**
Sig. (1-tailed) .000 .000 .000
N 88 88 88 88
gpa Pearson Correlation .497** -.615** .839** 1
Sig. (1-tailed) .000 .000 .000
N 88 88 88 88
**. Correlation is significant at the 0.01 level (1-tailed).

Table: Correlation of the variables (1-tailed)

There is no difference between the one-tailed and two-tailed results: the correlations between ADD
symptoms and GPA, ADD symptoms and English grade, ADD symptoms and IQ, GPA and English grade, GPA and
IQ, and English grade and IQ are the same and are all significant at the 0.01 level under both tests.

7-iii (b)

Correlations
dropout = 1
iq addsc engg gpa (FILTER)
iq Pearson Correlation 1 -.137 -.156 .020 .a
Sig. (2-tailed) .706 .667 .955 .
N 10 10 10 10 10
addsc Pearson Correlation -.137 1 .036 -.216 .a
Sig. (2-tailed) .706 .921 .548 .
N 10 10 10 10 10
engg Pearson Correlation -.156 .036 1 .825** .a
Sig. (2-tailed) .667 .921 .003 .
N 10 10 10 10 10
gpa Pearson Correlation .020 -.216 .825** 1 .a
Sig. (2-tailed) .955 .548 .003 .
N 10 10 10 10 10
dropout = 1 (FILTER) Pearson Correlation .a .a .a .a .a
Sig. (2-tailed) . . . .
N 10 10 10 10 10
**. Correlation is significant at the 0.01 level (2-tailed).
a. Cannot be computed because at least one of the variables is constant.

Table for correlations (dropout=1)


Correlations
                                            iq        addsc     engg      gpa       dropout = 0 (FILTER)
iq                    Pearson Correlation   1         -.614**   .365**    .491**    .b
                      Sig. (2-tailed)                 .000      .001      .000      .
                      N                     78        78        78        78        78
addsc                 Pearson Correlation   -.614**   1         -.493**   -.625**   .b
                      Sig. (2-tailed)       .000                .000      .000      .
                      N                     78        78        78        78        78
engg                  Pearson Correlation   .365**    -.493**   1         .836**    .b
                      Sig. (2-tailed)       .001      .000                .000      .
                      N                     78        78        78        78        78
gpa                   Pearson Correlation   .491**    -.625**   .836**    1         .b
                      Sig. (2-tailed)       .000      .000      .000                .
                      N                     78        78        78        78        78
dropout = 0 (FILTER)  Pearson Correlation   .b        .b        .b        .b        .b
                      Sig. (2-tailed)       .         .         .         .
                      N                     78        78        78        78        78
**. Correlation is significant at the 0.01 level (2-tailed).
b. Cannot be computed because at least one of the variables is constant.

Table for correlations (dropout=0)


The correlations differ between those who dropped out and those who did not (two-tailed tests). For
those who did not drop out (N = 78), all pairwise correlations are significant at the 0.01 level,
including those of ADD symptoms with GPA (-.625), IQ (-.614) and English grade (-.493). For those who
did drop out (N = 10), the only significant correlation is between GPA and English grade (r = .825,
Sig. = .003).

7-iii (c)
The scatter plot shows that, for those who did not drop out, there is a moderate positive correlation
between GPA and IQ (r = .491). For those who did drop out, there is essentially no correlation between
the two (r = .020).

Answer to the Question No. 8(i)


(a)

Coefficientsa
Standardized
Unstandardized Coefficients Coefficients
Model B Std. Error Beta t Sig.
1 (Constant) 771.282 355.472 2.170 .031
BEGINNING SALARY 1.909 .047 .880 40.276 .000
a. Dependent Variable: CURRENT SALARY

Interpretation: The slope is b = 1.909. This means that each $1 increase in beginning salary
is associated with an average increase of $1.909 in current salary. If the beginning salary
is 0, the predicted current salary equals the constant, 771.282.

(b)

Model Summary
Adjusted R Std. Error of the
Model R R Square Square Estimate
1 .880a .775 .774 3246.142
a. Predictors: (Constant), BEGINNING SALARY
Interpretation: With a single predictor, R-square is the square of the correlation R between
beginning salary and current salary. Here the R-square value is 0.775, which implies that 77.5%
of the variation in current salary can be explained by this regression model.
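
The Adjusted R Square in the Model Summary can be checked from R Square with the usual formula 1 - (1 - R^2)(n - 1)/(n - k - 1). This is a rough sketch under two assumptions: n = 474 is taken from the N reported in the salary correlation table, and the inputs are the rounded values printed by SPSS, so the result only agrees with the reported .774 to within rounding. The function name is illustrative.

```python
def adjusted_r_squared(r_squared: float, n: int, k: int) -> float:
    """Adjusted R-square for n cases and k predictors."""
    return 1 - (1 - r_squared) * (n - 1) / (n - k - 1)

# R Square = .775 from the Model Summary; one predictor (beginning salary).
# n = 474 is assumed from the N shown in the salary correlation table.
adj = adjusted_r_squared(0.775, n=474, k=1)

print(f"adjusted R-square from rounded inputs: {adj:.4f}")
```

With one predictor and 474 cases the adjustment is very small, which is why R Square and Adjusted R Square are nearly equal in the table.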

(c)

Coefficients
Standardized
Unstandardized Coefficients Coefficients
Model B Std. Error Beta t Sig.
1 (Constant) 771.282 355.472 2.170 .031
BEGINNING SALARY 1.909 .047 .880 40.276 .000
a. Dependent Variable: CURRENT SALARY

Interpretation: The constant is a = 771.282. It is the predicted current salary when the
beginning salary is 0, that is, the point where the regression line crosses the vertical
axis.
(d)
The predicted current salary for a person who had a beginning salary of $6400 is obtained
from the regression equation:
Current salary Y = a + bx
= 771.282 + 1.909 * 6400
= 771.282 + 12217.6
= 12988.882
So, a person with a beginning salary of $6400 is predicted to have a current salary of
$12988.882.
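
The calculation can also be written as a short function. This is only an illustrative sketch; the constant and slope are copied from the Coefficients table, and the function name is an assumption.

```python
# Coefficients from the Coefficients table (dependent variable: current salary).
CONSTANT = 771.282   # a
SLOPE = 1.909        # b, for beginning salary

def predict_current_salary(beginning_salary: float) -> float:
    """Predicted current salary from the fitted line Y = a + b * x."""
    return CONSTANT + SLOPE * beginning_salary

predicted = predict_current_salary(6400)
print(f"{predicted:.3f}")  # 12988.882
```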

(e)

Correlations
                                         BEGINNING   CURRENT   EDUCATIONAL   AGE OF
                                         SALARY      SALARY    LEVEL         EMPLOYEE
BEGINNING SALARY    Pearson Correlation  1           .880**    .633**        -.011
                    Sig. (2-tailed)                  .000      .000          .811
                    N                    474         474       474           474
CURRENT SALARY      Pearson Correlation  .880**      1         .661**        -.146**
                    Sig. (2-tailed)      .000                  .000          .001
                    N                    474         474       474           474
EDUCATIONAL LEVEL   Pearson Correlation  .633**      .661**    1             -.281**
                    Sig. (2-tailed)      .000        .000                    .000
                    N                    474         474       474           474
AGE OF EMPLOYEE     Pearson Correlation  -.011       -.146**   -.281**       1
                    Sig. (2-tailed)      .811        .001      .000
                    N                    474         474       474           474
**. Correlation is significant at the 0.01 level (2-tailed).

Interpretation: We can see from this table that the correlation with current salary is
highest for beginning salary (0.880). The second highest correlation is between current
salary and educational level (0.661). So, based on the strength of the correlation, we can
run a regression of current salary on educational level.

Coefficientsa
Standardized
Unstandardized Coefficients Coefficients
Model B Std. Error Beta t Sig.
1 (Constant) -7332.471 1128.765 -6.496 .000
EDUCATIONAL LEVEL 1563.963 81.819 .661 19.115 .000
a. Dependent Variable: CURRENT SALARY

This gives the following regression model:

Current salary (Y) = a + bx
= -7332.471 + 1563.963 * educational level

Answer to the Question No. 8(ii)


(a)

Coefficientsa
                                Unstandardized Coefficients   Standardized Coefficients
Model                           B           Std. Error        Beta                        t        Sig.
1  (Constant)                   1316.422    1075.121                                      1.224    .221
   BEGINNING SALARY             1.357       .081              .626                        16.667   .000
   AGE OF EMPLOYEE              -47.794     12.395            -.082                       -3.856   .000
   EDUCATIONAL LEVEL            293.700     64.854            .124                        4.529    .000
   EMPLOYMENT CATEGORY          910.645     152.899           .187                        5.956    .000
   MINORITY CLASSIFICATION      -11.344     346.598           -.001                       -.033    .974
   SEX & RACE CLASSIFICATION    -396.361    155.475           -.061                       -2.549   .011
a. Dependent Variable: CURRENT SALARY

Casewise Diagnosticsa
CURRENT
Case Number Std. Residual SALARY Predicted Value Residual
1 -1.098 16080 19300.79 -3220.790
2 .198 41400 40819.46 580.544
3 -.095 21960 22237.51 -277.514
4 -.130 19200 19580.36 -380.361
5 -1.471 28350 32666.58 -4316.577
6 .399 27250 26078.41 1171.588
7 .619 16080 14263.15 1816.854
8 .651 14100 12191.39 1908.605
9 .223 12420 11766.34 653.657
10 -.306 12300 13197.95 -897.952
Figure: Scatter plot of residuals between the variables current salary and age of employee
From the analysis given above, we can point out the following:
Current salary is approximately normally distributed, as we can see from the histogram.
The scatter plot of residuals supports approximate normality and linearity. In the casewise
diagnostics (shown for the first 10 cases), each residual is the observed current salary
minus the predicted value. Over all cases, the residuals of a least-squares model with a
constant sum to zero (0); this need not hold for the first 10 cases alone.
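
The Residual column of the casewise diagnostics can be reproduced as observed current salary minus predicted value. The sketch below is illustrative only; the ten rows are copied from the table above. Because the predicted values are printed to two decimals, the recomputed residuals match the reported ones only to within rounding, and the sum over these ten cases is not zero, since the zero-sum property applies to the full sample.

```python
# (observed current salary, predicted value, residual reported by SPSS)
cases = [
    (16080, 19300.79, -3220.790),
    (41400, 40819.46, 580.544),
    (21960, 22237.51, -277.514),
    (19200, 19580.36, -380.361),
    (28350, 32666.58, -4316.577),
    (27250, 26078.41, 1171.588),
    (16080, 14263.15, 1816.854),
    (14100, 12191.39, 1908.605),
    (12420, 11766.34, 653.657),
    (12300, 13197.95, -897.952),
]

# Residual = observed - predicted for each case.
computed = [observed - predicted for observed, predicted, _ in cases]

# Largest disagreement with the reported residuals (rounding only).
max_gap = max(abs(c - reported) for c, (_, _, reported) in zip(computed, cases))

# Sum of the reported residuals for these ten cases.
sum_first_ten = sum(reported for _, _, reported in cases)

print(f"largest gap vs. reported residual: {max_gap:.3f}")
print(f"sum of the first 10 residuals: {sum_first_ten:.3f}")
```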
The answers required by the question:
Beginning salary is the strongest significant predictor of current salary, because it has
the highest beta (0.626). Among the significant predictors, sex & race classification is the
weakest, with a beta of -0.061. Among all the predictors, minority classification is the
weakest, with a beta of -0.001, and it is not significant (Sig. = .974).
The R-square value of 0.818 indicates that 81.8% of the variation in current salary can be
explained by the predictors.
The prediction equation is as follows:
Y = a + b1x1 + b2x2 + b3x3 + . . . + bkxk + ei
Current salary (Y) = 1316.422 + 1.357 * beginning salary - 47.794 * age of employee +
293.700 * educational level + 910.645 * employment category - 11.344 * minority
classification - 396.361 * sex & race classification
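
As an illustration of how the prediction equation is applied, the sketch below evaluates it for one employee. The coefficients are copied from the Coefficients table; the employee profile and the numeric codes used for the categorical variables are hypothetical, since the coding scheme is not shown in the output.

```python
# Unstandardized coefficients (B) from the Coefficients table.
COEFFICIENTS = {
    "constant": 1316.422,
    "beginning_salary": 1.357,
    "age": -47.794,
    "education_level": 293.700,
    "employment_category": 910.645,
    "minority": -11.344,
    "sex_race": -396.361,
}

def predict_current_salary(employee: dict) -> float:
    """Evaluate Y = a + b1*x1 + ... + bk*xk for one employee."""
    total = COEFFICIENTS["constant"]
    for name, value in employee.items():
        total += COEFFICIENTS[name] * value
    return total

# Hypothetical employee: every value and category code here is an assumption.
example = {
    "beginning_salary": 6400,
    "age": 30,
    "education_level": 12,
    "employment_category": 1,
    "minority": 0,
    "sex_race": 1,
}

predicted = predict_current_salary(example)
print(f"{predicted:.3f}")
```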

(b)
To find out whether there are any problems with the assumptions for linear regression, we
conduct a test of homogeneity of variances:

Test of Homogeneity of Variances

                                                                    Levene Statistic   df1   df2       Sig.
BEGINNING SALARY            Based on Mean                           4.063              102   253       .000
                            Based on Median                         2.042              102   253       .000
                            Based on Median and with adjusted df    2.042              102   104.714   .000
                            Based on trimmed mean                   3.828              102   253       .000
SEX OF EMPLOYEE             Based on Mean                           6.679              102   253       .000
                            Based on Median                         .839               102   253       .845
                            Based on Median and with adjusted df    .839               102   133.310   .823
                            Based on trimmed mean                   5.294              102   253       .000
AGE OF EMPLOYEE             Based on Mean                           2.953              102   253       .000
                            Based on Median                         1.171              102   253       .162
                            Based on Median and with adjusted df    1.171              102   110.115   .208
                            Based on trimmed mean                   2.686              102   253       .000
EDUCATIONAL LEVEL           Based on Mean                           2.607              102   253       .000
                            Based on Median                         .884               102   253       .761
                            Based on Median and with adjusted df    .884               102   118.631   .738
                            Based on trimmed mean                   2.511              102   253       .000
EMPLOYMENT CATEGORY         Based on Mean                           5.686              102   253       .000
                            Based on Median                         1.648              102   253       .001
                            Based on Median and with adjusted df    1.648              102   85.768    .009
                            Based on trimmed mean                   4.896              102   253       .000
MINORITY CLASSIFICATION     Based on Mean                           6.848              102   253       .000
                            Based on Median                         .809               102   253       .891
                            Based on Median and with adjusted df    .809               102   137.319   .871
                            Based on trimmed mean                   5.431              102   253       .000
SEX & RACE CLASSIFICATION   Based on Mean                           3.353              102   253       .000
                            Based on Median                         1.089              102   253       .294
                            Based on Median and with adjusted df    1.089              102   136.378   .319
                            Based on trimmed mean                   3.104              102   253       .000

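Interpretation: Based on the mean, the Levene statistic is significant (Sig. = .000) for
every variable, so equal variances cannot be assumed on that basis. Based on the median, the
test is not significant for sex of employee, age of employee, educational level, minority
classification and sex & race classification, but it remains significant for beginning
salary and employment category. Homogeneity of variance is therefore only partly supported,
and this assumption of the linear regression model should be treated with caution.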

(c)
From the analyses in (a) and (b), it is important for a company to set salary increases with
regard to employee age, educational level, beginning salary, sex and job type. These
predictors significantly affect current salary, and beginning salary predicts current salary
the most.
The negative coefficient for sex & race classification (-396.361) indicates that pay differs
across these groups. One possible explanation, which the regression output cannot confirm,
is that female and non-white employees find it harder to adapt to the working environment,
so their pay may be lower.

(d)
If we re-run the linear regression with only the significant variables, we find the
following:

Coefficientsa
Standardized
Unstandardized Coefficients Coefficients
Model B Std. Error Beta t Sig.
1 (Constant) 754.862 1002.548 .753 .452
BEGINNING SALARY 1.365 .081 .629 16.817 .000
SEX OF EMPLOYEE -728.029 306.448 -.053 -2.376 .018
AGE OF EMPLOYEE -49.273 12.340 -.085 -3.993 .000
EDUCATIONAL LEVEL 295.335 64.872 .125 4.553 .000
EMPLOYMENT CATEGORY 922.115 152.678 .190 6.040 .000
a. Dependent Variable: CURRENT SALARY

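The following table lists the differences between the coefficients estimated in this model
and those estimated in (a):

Variables              Difference in coefficient

Beginning Salary       0.008
Sex of Employee        -740.424
Age of Employee        -1.479
Education Level        1.635
Employment Category    11.47

Interpretation: For beginning salary, age of employee, educational level and employment
category, the coefficients change only slightly between the two models; for example, the
coefficient of beginning salary moves from 1.357 to 1.365. The only large difference is for
sex of employee (-740.424).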


Question-10-i

10-i (a)

Descriptives
group Statistic Std. Error
yields 1 Mean 13.4150 .72612
95% Confidence Interval for Lower Bound 11.8952
Mean Upper Bound 14.9348
5% Trimmed Mean 13.3356
Median 13.4450
Variance 10.545
Std. Deviation 3.24732
Minimum 8.62
Maximum 19.64
Range 11.02
Interquartile Range 4.82
Skewness .305 .512
Kurtosis -.524 .992
2 Mean 9.6140 .73951
95% Confidence Interval for Lower Bound 8.0662
Mean Upper Bound 11.1618
5% Trimmed Mean 9.5750
Median 8.8750
Variance 10.938
Std. Deviation 3.30719
Minimum 4.79
Maximum 15.14
Range 10.35
Interquartile Range 6.50
Skewness .363 .512
Kurtosis -1.059 .992
3 Mean 12.8780 .39070
95% Confidence Interval for Lower Bound 12.0603
Mean Upper Bound 13.6957
5% Trimmed Mean 12.9161
Median 13.0900
Variance 3.053
Std. Deviation 1.74725
Minimum 9.61
Maximum 15.46
Range 5.85
Interquartile Range 2.80
Skewness -.374 .512
Kurtosis -.826 .992
4 Mean 14.2900 1.25232
95% Confidence Interval for Lower Bound 11.6689
Mean Upper Bound 16.9111
5% Trimmed Mean 14.1022
Median 14.0500
Variance 31.366
Std. Deviation 5.60056
Minimum 5.37
Maximum 26.59
Range 21.22
Interquartile Range 8.83
Skewness .294 .512
Kurtosis -.313 .992

10-i (b)

ANOVA
yields
Sum of Squares df Mean Square F Sig.
Between Groups 250.070 3 83.357 5.965 .001
Within Groups 1062.134 76 13.975
Total 1312.204 79

The result shows that there is a significant difference between the groups (F = 5.965, Sig. = .001).
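
The F value in the ANOVA table can be reproduced from the sums of squares and degrees of freedom. The following is a small sketch using the figures printed above; the function name is illustrative and the results agree with the table only to within rounding.

```python
def one_way_f(ss_between: float, df_between: int,
              ss_within: float, df_within: int) -> tuple:
    """Return (mean square between, mean square within, F ratio)."""
    ms_between = ss_between / df_between
    ms_within = ss_within / df_within
    return ms_between, ms_within, ms_between / ms_within

# Figures from the ANOVA table for yields.
ms_b, ms_w, f_ratio = one_way_f(250.070, 3, 1062.134, 76)

# Share of the total variation that lies between the groups.
share_between = 250.070 / 1312.204

print(f"MS between = {ms_b:.3f}, MS within = {ms_w:.3f}, F = {f_ratio:.3f}")
print(f"between-groups share of total SS = {share_between:.3f}")
```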

10-i (c)

Tests of Between-Subjects Effects


Dependent Variable: yields
Type III Sum of
Source Squares df Mean Square F Sig.
Corrected Model 250.070a 3 83.357 5.965 .001
Intercept 12598.694 1 12598.694 901.488 .000
Group 250.070 3 83.357 5.965 .001
Error 1062.134 76 13.975
Total 13910.898 80
Corrected Total 1312.204 79
a. R Squared = .191 (Adjusted R Squared = .159)

Multiple Comparisons
Dependent Variable: yields
LSD
                        Mean Difference                        95% Confidence Interval
(I) Group   (J) Group   (I-J)             Std. Error   Sig.    Lower Bound   Upper Bound
1           2           3.8010*           1.18218      .002    1.4465        6.1555
            3           .5370             1.18218      .651    -1.8175       2.8915
            4           -.8750            1.18218      .461    -3.2295       1.4795
2           1           -3.8010*          1.18218      .002    -6.1555       -1.4465
            3           -3.2640*          1.18218      .007    -5.6185       -.9095
            4           -4.6760*          1.18218      .000    -7.0305       -2.3215
3           1           -.5370            1.18218      .651    -2.8915       1.8175
            2           3.2640*           1.18218      .007    .9095         5.6185
            4           -1.4120           1.18218      .236    -3.7665       .9425
4           1           .8750             1.18218      .461    -1.4795       3.2295
            2           4.6760*           1.18218      .000    2.3215        7.0305
            3           1.4120            1.18218      .236    -.9425        3.7665
Based on observed means.
The error term is Mean Square(Error) = 13.975.
*. The mean difference is significant at the 0.05 level.

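From the table we can see that group 2 differs significantly from group 1 (Sig. = .002), group 3
(Sig. = .007) and group 4 (Sig. = .000). The differences among groups 1, 3 and 4 are not significant.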
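
The standard error shared by all the LSD comparisons follows from the error mean square: SE = sqrt(MSE * (1/n_i + 1/n_j)). The sketch below is illustrative and assumes 20 observations in each group, which is inferred from the 80 cases across four groups rather than stated directly in the output.

```python
import math

def lsd_standard_error(mse: float, n_i: int, n_j: int) -> float:
    """Standard error of the difference between two group means."""
    return math.sqrt(mse * (1 / n_i + 1 / n_j))

# Error mean square from the ANOVA table (within-groups SS / df).
mse = 1062.134 / 76

# Assumed group size: 80 cases split evenly over 4 groups.
se = lsd_standard_error(mse, 20, 20)

# t statistic for the group 1 vs group 2 comparison (mean difference 3.8010).
t_1_vs_2 = 3.8010 / se

print(f"standard error = {se:.5f}")
print(f"t for group 1 vs group 2 = {t_1_vs_2:.3f}")
```

Because every group has the same size, the standard error is the same for every pair, which is why the Std. Error column shows a single value.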

10-i (d)
The results show that there are significant differences among the groups. This was found using
one-way ANOVA and then confirmed with the general linear model, which gives the same F (5.965) and
significance (.001).

10-ii (a)

Since the dependent variable (production) is a ratio-level variable and there are more than two types
of machine (A, B, C & D), one-way ANOVA is the appropriate way of testing whether there is a
significant difference in the performance of the four machines. The test will be conducted on the
following hypotheses:

Ho: Mean effect of all types of machine on production is equal

Ha: Mean effect of at least two types of machine on production is unequal

Table for ANOVA Summary


Production of Cotton Fabrics
Sum of Squares df Mean Square F Sig.
Between Groups 540.688 3 180.229 25.222 .000
Within Groups 85.750 12 7.146
Total 626.438 15
Since the p-value (0.000) is less than 0.05, Ho is rejected. That is, the mean effect of at least two
types of machine on production is unequal, so there is a significant difference in the performance of
the four machines.

10-ii (b)

The LSD post hoc method is used to carry out a multiple comparisons analysis and determine
which machine(s) differ from the other machines, as follows:

Table for Multiple Comparisons


Dependent Variable: Production of Cotton Fabrics
LSD
(I) Types of   (J) Types of   Mean Difference                        95% Confidence Interval
Machine        Machine        (I-J)             Std. Error   Sig.    Lower Bound   Upper Bound
A              B              3.000             1.890        .138    -1.12         7.12
               C              -3.250            1.890        .111    -7.37         .87
               D              -12.500*          1.890        .000    -16.62        -8.38
B              A              -3.000            1.890        .138    -7.12         1.12
               C              -6.250*           1.890        .006    -10.37        -2.13
               D              -15.500*          1.890        .000    -19.62        -11.38
C              A              3.250             1.890        .111    -.87          7.37
               B              6.250*            1.890        .006    2.13          10.37
               D              -9.250*           1.890        .000    -13.37        -5.13
D              A              12.500*           1.890        .000    8.38          16.62
               B              15.500*           1.890        .000    11.38         19.62
               C              9.250*            1.890        .000    5.13          13.37
*. The mean difference is significant at the 0.05 level.
Since the p-value of the comparison is less than 0.05 in each case, machines A&D, machines B&C,
machines B&D and machines C&D are significantly different.
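
The decision rule applied to the table can be sketched as a simple filter on the Sig. column. The p-values below are copied from the LSD table (one entry per pair, since each pair appears twice with opposite signs); the variable names are illustrative.

```python
ALPHA = 0.05

# Sig. values from the LSD table, one entry per unordered pair of machines.
lsd_sig = {
    ("A", "B"): 0.138,
    ("A", "C"): 0.111,
    ("A", "D"): 0.000,
    ("B", "C"): 0.006,
    ("B", "D"): 0.000,
    ("C", "D"): 0.000,
}

# A pair differs significantly when its p-value is below the 0.05 level.
significant_pairs = [pair for pair, p in lsd_sig.items() if p < ALPHA]

for first, second in significant_pairs:
    print(f"machines {first}&{second} are significantly different")
```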

10-ii (c)

Based on the solution (a) & (b), we have observed that the means of the performance of four
machines on production were found to be significantly different. More specifically, machines A&D,
machines B&C, machines B&D and machines C&D are significantly different.

Question 11-i

Since the dependent variable (production) is a ratio-level variable and there are more than two
types of machine (A, B, C & D) as well as workers (1, 2, 3, 4 & 5), two-way ANOVA is the
appropriate way of testing whether there is a significant difference in the performance of the
four machines as well as the five workers. The test will be conducted on the following hypotheses:
Ho: Mean effect of all types of machines & workers on production is equal
Ha: Mean effect of at least two types of machines & workers on production is unequal
Table for ANOVA Summary
Tests of Between-Subjects Effects
Dependent Variable: production
Source Type III Sum of Squares df Mean Square F Sig.
Corrected Model 408.950a 7 58.421 5.654 .005
Intercept 34362.050 1 34362.050 3325.360 .000
worker 161.200 4 40.300 3.900 .030
machine 247.750 3 82.583 7.992 .003
Error 124.000 12 10.333
Total 34895.000 20
Corrected Total 532.950 19
a. R Squared = .767 (Adjusted R Squared = .632)
Since the p-values of F for both worker (0.030) and machine (0.003) are less than 0.05, Ho is rejected.
That is, the mean effect of at least two types of machines and of at least two workers on production
is unequal, so there is a significant difference in the performance of the four machines and of the
five workers.
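
The entries of the Tests of Between-Subjects Effects table are linked by simple arithmetic: each F is the factor's mean square divided by the error mean square, and the worker, machine and error sums of squares add up to the corrected total. The sketch below checks this with the printed figures; the variable names are illustrative and the results match the table only to within rounding.

```python
# Sums of squares and degrees of freedom from the table above.
ss = {"worker": 161.200, "machine": 247.750, "error": 124.000}
df = {"worker": 4, "machine": 3, "error": 12}

# Error mean square, the denominator of both F ratios.
ms_error = ss["error"] / df["error"]

# F ratio for each factor: (SS / df) / MS error.
f_ratios = {name: (ss[name] / df[name]) / ms_error for name in ("worker", "machine")}

# The three components add up to the corrected total sum of squares.
ss_total = sum(ss.values())

# R squared: share of the corrected total explained by worker and machine.
r_squared = 1 - ss["error"] / ss_total

print(f"F worker = {f_ratios['worker']:.3f}, F machine = {f_ratios['machine']:.3f}")
print(f"corrected total = {ss_total:.3f}, R squared = {r_squared:.3f}")
```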

Multiple Comparisons: The LSD post hoc method is used to carry out a multiple comparisons
analysis and determine which machine(s) and workers differ from the other machines and workers, as
follows:

Table for Multiple Comparisons


Dependent Variable: production
LSD
                          Mean Difference                        95% Confidence Interval
(I) worker   (J) worker   (I-J)             Std. Error   Sig.    Lower Bound   Upper Bound
worker 1     worker 2     -4.00             2.273        .104    -8.95         .95
             worker 3     3.75              2.273        .125    -1.20         8.70
             worker 4     2.50              2.273        .293    -2.45         7.45
             worker 5     -2.00             2.273        .396    -6.95         2.95
worker 2     worker 1     4.00              2.273        .104    -.95          8.95
             worker 3     7.75*             2.273        .005    2.80          12.70
             worker 4     6.50*             2.273        .014    1.55          11.45
             worker 5     2.00              2.273        .396    -2.95         6.95
worker 3     worker 1     -3.75             2.273        .125    -8.70         1.20
             worker 2     -7.75*            2.273        .005    -12.70        -2.80
             worker 4     -1.25             2.273        .592    -6.20         3.70
             worker 5     -5.75*            2.273        .026    -10.70        -.80
worker 4     worker 1     -2.50             2.273        .293    -7.45         2.45
             worker 2     -6.50*            2.273        .014    -11.45        -1.55
             worker 3     1.25              2.273        .592    -3.70         6.20
             worker 5     -4.50             2.273        .071    -9.45         .45
worker 5     worker 1     2.00              2.273        .396    -2.95         6.95
             worker 2     -2.00             2.273        .396    -6.95         2.95
             worker 3     5.75*             2.273        .026    .80           10.70
             worker 4     4.50              2.273        .071    -.45          9.45
Based on observed means.
The error term is Mean Square(Error) = 10.333.
*. The mean difference is significant at the 0.05 level.
Since the p-values of these comparisons are less than 0.05, workers 2&3, workers 2&4 and workers 3&5
are significantly different.

Table for Multiple Comparisons


Dependent Variable: production
LSD
                            Mean Difference                        95% Confidence Interval
(I) machine   (J) machine   (I-J)             Std. Error   Sig.    Lower Bound   Upper Bound
machine A     machine B     4.40              2.033        .051    -.03          8.83
              machine C     -3.80             2.033        .086    -8.23         .63
              machine D     4.80*             2.033        .036    .37           9.23
machine B     machine A     -4.40             2.033        .051    -8.83         .03
              machine C     -8.20*            2.033        .002    -12.63        -3.77
              machine D     .40               2.033        .847    -4.03         4.83
machine C     machine A     3.80              2.033        .086    -.63          8.23
              machine B     8.20*             2.033        .002    3.77          12.63
              machine D     8.60*             2.033        .001    4.17          13.03
machine D     machine A     -4.80*            2.033        .036    -9.23         -.37
              machine B     -.40              2.033        .847    -4.83         4.03
              machine C     -8.60*            2.033        .001    -13.03        -4.17
Based on observed means.
The error term is Mean Square(Error) = 10.333.
*. The mean difference is significant at the 0.05 level.
Since the p-values of these comparisons are less than 0.05, machines A&D, machines B&C and
machines C&D are significantly different. The difference between machines A&B (Sig. = .051)
falls just short of significance at the 0.05 level.

Summary Findings: Based on the ANOVA summary table and the multiple comparisons analysis, we
have observed that the mean effect of at least two types of machines and of at least two workers
on production is unequal. That is, there is a significant difference in the performance of the
four machines and of the five workers. More specifically, workers 2&3, workers 2&4 and workers
3&5 are significantly different. In terms of machines, A&D, B&C and C&D are significantly
different.

Question-11 (ii)

Since the dependent variable (time in minutes) is a ratio-level variable and there are more than two
days (Monday, Tuesday, Wednesday, Thursday & Friday) as well as routes (1, 2, 3 & 4), two-way ANOVA
is the appropriate way of testing whether the differences among the means obtained for the different
routes (treatments) and days (blocks) are significant. The test will be conducted on the
following hypotheses:

Ho: Mean effect of all days & routes on time (in minutes) taken to drive to work is equal
Ha: Mean effect of at least two days & routes on time (in minutes) taken to drive to work is
unequal

Tests of Between-Subjects Effects


Dependent Variable: Time (in minutes)
Source Type III Sum of Squares df Mean Square F Sig.
Corrected Model 126.000a 7 18.000 7.941 .001
Intercept 15456.800 1 15456.800 6819.176 .000
Treatment 52.800 3 17.600 7.765 .004
Block 73.200 4 18.300 8.074 .002
Error 27.200 12 2.267
Total 15610.000 20
Corrected Total 153.200 19
a. R Squared = .822 (Adjusted R Squared = .719)
Since the p-values of F for both Treatment (0.004) and Block (0.002) are less than 0.05, Ho is
rejected. That is, the mean effect of at least two routes and of at least two days on the time (in
minutes) taken to drive to work is unequal. In other words, the differences among the means obtained
for the different routes (treatments) and days (blocks) are significant.
