SUSS BSBA: BUS105 Jan 2021 TOA Answers


SUSS

BUS 105
January 2021 TOA

Question 1
(a)
2019 July
Mean                  54.48888889
Standard Error        3.149734993
Median                50
Mode                  80
Standard Deviation    21.12906467
Sample Variance       446.4373737
Kurtosis              -0.985252425
Skewness              -0.035217992
Range                 75
Minimum               15
Maximum               90
Sum                   2452
Count                 45

A measure of location is a single value that summarises the central tendency of a
collection of data. The common measures are the mean, median and mode (Zheng, F.,
Soh, I., & Tan, C., 2021).
A measure of dispersion shows how spread out a set of data is. A small value for
dispersion suggests that the data are closely clustered around the mean, whereas a large
value suggests that the data are widely spread, so the mean represents them less
reliably (Zheng, F., Soh, I., & Tan, C., 2021). The three measures of dispersion used
here are the range, variance, and standard deviation.
The descriptive statistics for 2019 July semester and 2020 July semester are as
follows:

2019 July
The mean, median and mode are 54.489, 50 and 80 respectively.
The range, variance and standard deviation are 75, 446.437 and 21.129 respectively.

2020 July
The mean, median and mode are 54.96, 58 and 65 respectively.
The range, variance and standard deviation are 78, 400.61 and 20.015 respectively.
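
The measures of location and dispersion listed above can also be reproduced outside Excel. The following is a minimal sketch using Python's standard `statistics` module. The `describe` helper and the five sample scores are hypothetical, since the raw semester scores are not reproduced in this answer.

```python
import math
import statistics

def describe(scores):
    """Return the location and dispersion measures used in part (a)."""
    n = len(scores)
    stdev = statistics.stdev(scores)  # sample standard deviation (divides by n - 1)
    return {
        "mean": statistics.mean(scores),
        "median": statistics.median(scores),
        "mode": statistics.mode(scores),
        "range": max(scores) - min(scores),
        "variance": statistics.variance(scores),  # sample variance (divides by n - 1)
        "stdev": stdev,
        "standard_error": stdev / math.sqrt(n),
    }

# Hypothetical scores for illustration only (not the actual 2019 or 2020 data).
summary = describe([15, 50, 80, 80, 90])
print(summary)
```

The same relationships can be checked against the Excel output: the sample variance is the square of the standard deviation, and the standard error is the standard deviation divided by the square root of the count.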

(b)
i. The samples are highly likely to be biased, because the 45 overall statistics scores
from July 2019 are drawn from students who attended face-to-face lessons, whereas
the 50 overall statistics scores from July 2020 are drawn from students who
attended online lessons. Students who attended the face-to-face lessons may have
understood the module better than those who attended the online lessons, or vice
versa. Additionally, sample sizes of 45 and 50 may be too small to interpret the
data reliably.
ii. There is also a concern about the randomness of the samples. This can be
addressed by drawing the sample with a probability sampling approach such as
simple random sampling or systematic random sampling.

iii. Lastly, the July 2019 and July 2020 samples are independent samples, so
differences in factors such as the mode of assessment (e.g. online or written
paper assessment) and student demographics can contribute to differences in the
overall scores.

If given the opportunity to conduct the study, I would ensure that a large sample of
respondents is drawn at the outset. I would also ensure that the samples are collected
from students who undergo the same mode of lessons and final assessment (either
online or face-to-face, not a mixture of both).

Question 2
(a)
(i)
P(no problem in the next quarter) = 0.8*0.5 + 0.7*0.3 + 0.6*0.2 = 0.73

(ii)
P(engaged C | No problem) = (0.6*0.2) / 0.73 = 0.12 / 0.73 = 0.1644

(iii)
P(A will be chosen for the next two quarters) = 0.73*0.8 + (1-0.73)*0.5 = 0.719
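
The total-probability and conditional-probability calculations in (i) and (ii) can be sketched as follows. This is an illustration under assumptions: the question text is not reproduced here, so 0.5, 0.3 and 0.2 (which sum to 1) are read as the probabilities of engaging each of three options, 0.8, 0.7 and 0.6 as the corresponding probabilities of no problem, and the label "B" is a hypothetical name for the middle option.

```python
# Assumed reading of the figures in (i): P(no problem | option) and P(option engaged).
p_no_problem_given = {"A": 0.8, "B": 0.7, "C": 0.6}
p_engaged = {"A": 0.5, "B": 0.3, "C": 0.2}

def total_probability(likelihood, prior):
    """Law of total probability: sum of P(event | option) * P(option)."""
    return sum(likelihood[k] * prior[k] for k in prior)

def posterior(option, likelihood, prior):
    """Bayes' theorem: P(option | event)."""
    return likelihood[option] * prior[option] / total_probability(likelihood, prior)

p_no_problem = total_probability(p_no_problem_given, p_engaged)
p_c_given_no_problem = posterior("C", p_no_problem_given, p_engaged)
print(round(p_no_problem, 4), round(p_c_given_no_problem, 4))
```

The posterior divides the joint probability for C (likelihood times prior) by the total probability of no problem, so the posteriors over all three options sum to 1.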

(b)
μ = 266
σ = 16
(i)
Let X be the length of pregnancy.
Answer from Excel using NORM.DIST with the cumulative argument set to TRUE.
P(260<X<270) = P(X<270) – P(X<260)
= 0.5987 – 0.3538
= 0.2449
= 24.49%
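
An interval probability for a normal variable is the difference of two cumulative probabilities. A minimal sketch with Python's `statistics.NormalDist`, using μ = 266 and σ = 16 from above (the variable names are illustrative):

```python
from statistics import NormalDist

pregnancy = NormalDist(mu=266, sigma=16)

# P(260 < X < 270) = P(X < 270) - P(X < 260), using cumulative probabilities.
p_below_270 = pregnancy.cdf(270)
p_below_260 = pregnancy.cdf(260)
p_between = p_below_270 - p_below_260

print(round(p_below_270, 4), round(p_below_260, 4), round(p_between, 4))
```

In Excel the equivalent cumulative value comes from NORM.DIST(x, 266, 16, TRUE). With the last argument set to FALSE the function returns the density at x, which is not a probability.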

(ii)
n = 20
standard error = 16/ √20 = 3.578
Answer from Excel using NORM.DIST with the cumulative argument set to TRUE.
P(X̄>265) = 1 – P(X̄<265) = 1 – 0.3899 = 0.6101 = 61.01%
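
For the sample mean, the same normal calculation applies with the standard error in place of σ. A short sketch under the figures stated above (μ = 266, σ = 16, n = 20); the variable names are illustrative:

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 266, 16, 20

# The sample mean of n observations has standard deviation sigma / sqrt(n).
standard_error = sigma / sqrt(n)
sample_mean = NormalDist(mu=mu, sigma=standard_error)

# P(sample mean > 265) = 1 - P(sample mean < 265)
p_above_265 = 1 - sample_mean.cdf(265)

print(round(standard_error, 3), round(p_above_265, 4))
```

Because 265 lies below the mean of 266, the probability of exceeding it has to be above 0.5, which is a quick sanity check on the result.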

Question 3
(a)
(i)
Male

Mean                        69.9
Standard Error              0.24140394
Median                      70
Mode                        70
Standard Deviation          1.322223832
Sample Variance             1.748275862
Kurtosis                    -0.005748394
Skewness                    -0.475645301
Range                       5
Minimum                     67
Maximum                     72
Sum                         2097
Count                       30
Confidence Level (90.0%)    0.410175958

Male
Lower Limit of Confidence Interval = 69.9 – 0.41 = 69.49 (2 d.p)
Upper Limit of Confidence Interval = 69.9 + 0.41 = 70.31 (2 d.p)

The 90% confidence interval for the mean stress level of male students is between 69.49
and 70.31 stress scores. This confidence interval is an interval estimate of the
population mean stress score for male students. The 90% confidence level means that
if we repeated the sampling and the calculation many times, about 90% of the intervals
built this way would contain the population mean, although any single interval may or
may not contain it.

Female
Lower Limit of Confidence Interval = 72.93 – 1.46 = 71.47 (2 d.p)
Upper Limit of Confidence Interval = 72.93 + 1.46 = 74.39 (2 d.p)

The 90% confidence interval for the mean stress level of female students is between
71.47 and 74.39 stress scores. This confidence interval is an interval estimate of the
population mean stress score for female students. The 90% confidence level means that
if we repeated the sampling and the calculation many times, about 90% of the intervals
built this way would contain the population mean, although any single interval may or
may not contain it.
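
The interval limits above are the sample mean plus or minus a margin of error, where the margin is the t critical value times the standard error. A minimal sketch, assuming the two-tail t critical value of 1.699127027 (df = 29, α = 0.10) and the sample figures reported in the Excel outputs of this question; the function name is hypothetical:

```python
import math

def confidence_interval(mean, std_error, t_critical):
    """Return (lower, upper) limits: mean -/+ t_critical * std_error."""
    margin = t_critical * std_error
    return mean - margin, mean + margin

T_CRITICAL_90 = 1.699127027  # two-tail critical value for df = 29, alpha = 0.10

# Male: standard error taken directly from the descriptive statistics.
male = confidence_interval(69.9, 0.24140394, T_CRITICAL_90)

# Female: standard error rebuilt from the sample variance and n = 30.
female_se = math.sqrt(22.13333333 / 30)
female = confidence_interval(72.93333333, female_se, T_CRITICAL_90)

print([round(x, 2) for x in male], [round(x, 2) for x in female])
```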

(ii)
t-Test: Paired Two Sample for Means

                                Male            Female
Mean                            69.9            72.93333333
Variance                        1.748275862     22.13333333
Observations                    30              30
Pearson Correlation             -0.167409675
Hypothesized Mean Difference    0
df                              29
t Stat                          -3.260557689
P(T<=t) one-tail                0.001420712
t Critical one-tail             1.311433647
P(T<=t) two-tail                0.002841425
t Critical two-tail             1.699127027

Step 1:
H0: μd = 0
H1: μd ≠ 0

where μd is the mean difference in stress scores between male and female students

Step 2: Select the level of significance.

The level of significance decided is α = 0.10.

Step 3: Decide on a test statistic.

We will use the paired t-test, since the population standard deviation is unknown and
the samples are not independent (the observations are treated as paired).

Step 4: Develop a decision rule.

If p-value < 0.10, we reject H0 and accept H1.

Step 5: Compute the value of the test statistic, make a decision regarding the null
hypothesis, and interpret the results.
From the Excel output table, this is a two-tailed test.
Since the p-value of 0.002841 is < 0.10, we reject H0.
Therefore, we can conclude that the mean stress scores of male and female students are
not the same.
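
The paired t statistic can be rebuilt from the summary figures in the Excel output. The sketch below is illustrative only: the function name is hypothetical, and it assumes that the Pearson correlation is negative (-0.167409675), reading the detached minus sign in the output as belonging to that value.

```python
import math

def paired_t_from_summary(mean1, mean2, var1, var2, corr, n):
    """t statistic for a paired test, rebuilt from summary statistics."""
    # Variance of the paired differences: Var(x1) + Var(x2) - 2 * r * s1 * s2
    var_diff = var1 + var2 - 2 * corr * math.sqrt(var1 * var2)
    std_error = math.sqrt(var_diff / n)
    return (mean1 - mean2) / std_error

t_stat = paired_t_from_summary(
    mean1=69.9, mean2=72.93333333,
    var1=1.748275862, var2=22.13333333,
    corr=-0.167409675, n=30,
)

# Two-tailed decision at alpha = 0.10 using the critical value from the output.
t_critical_two_tail = 1.699127027
reject_h0 = abs(t_stat) > t_critical_two_tail
print(round(t_stat, 4), reject_h0)
```

Comparing the absolute t statistic with the two-tail critical value leads to the same decision as comparing the two-tail p-value with α.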
(b)

                            SUSS            NUS             NTU

Mean                        67.8            73.2            72
Standard Error              2.727636339     3.839270764     3.987480407
Median                      66              76              71
Mode                        #N/A            #N/A            #N/A
Standard Deviation          6.099180273     8.584870413     8.91627725
Sample Variance             37.2            73.7            79.5
Kurtosis                    2.86463753      -1.078909142    1.155096713
Skewness                    1.53951589      -0.696691247    1.061587537
Range                       16              21              23
Minimum                     62              61              63
Maximum                     78              82              86
Sum                         339             366             360
Count                       5               5               5
Confidence Level (95.0%)    7.573132563     10.65952452     11.07102046

Step 1:
H0: μi = 0 for every i
H1: at least one μi ≠ 0

where μi is the mean stress score for SUSS, SIM, NUS and NTU
Step 2: Select the level of significance.
The level of significance decided is α = 0.05.

Step 3: Decide on a test statistic.

We will use the paired t-test, since the population standard deviation is unknown and
the samples are not independent (the observations are treated as paired).

Step 4: Develop a decision rule.

If p-value < 0.05, we reject H0 and accept H1.

Step 5: Compute the value of the test statistic, make a decision regarding the null
hypothesis, and interpret the results.
From the Excel output table, this is a two-tailed test.
Since the p-value of 0.002841 is < 0.05, we reject H0.
Therefore, we can conclude that at least one of the mean stress scores for SUSS, SIM,
NUS and NTU is not equal to zero.

Question 4

(a)
SUMMARY OUTPUT

Regression Statistics
Multiple R           0.862776744
R Square             0.744383711
Adjusted R Square    0.676219367
Standard Error       10.30414179
Observations         20

ANOVA
              df    SS         MS          F             Significance F
Regression    4     4637.92    1159.48     10.9204266    0.000237317
Residual      15    1592.63    106.1753
Total         19    6230.55

              Coefficients     Standard Error    t Stat      P-value       Lower 95%        Upper 95%
Intercept     129.3986876      65.11146          1.987341    0.06546009    -9.383114154     268.180
Experience    10.82184778      3.599151          3.006778    0.00884973    3.150439754      18.4932
Age           -3.268778436     1.8103            -1.80566    0.09107787    -7.127341439     0.58978
Shift         -17.56720195     4.746842          -3.70082    0.0021355     -27.68485585     -7.44954
Gender        6.195458179      7.99367           0.775045    0.45035887    -10.84264676     23.2335

The linear equation is:


ŷ = 129.398 + 10.821X1 – 3.268X2 – 17.567X3 + 6.195X4

where X1 is Experience, X2 is the Age, X3 is the Shift and X4 is the Gender.


Holding the other variables unchanged:

For every additional unit of Experience, productivity increases by 10.821.
For every additional unit of Age, productivity decreases by 3.268.
For every one-unit increase in Shift, productivity decreases by 17.567.
For every one-unit increase in Gender (that is, moving from the category coded 0 to
the category coded 1), productivity increases by 6.195.

(b)
ŷ = 129.398 + 10.821*(6) – 3.268*(30) – 17.567*(1) + 6.195*(0)

= 78.717
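
The substitution in part (b) can be written as a small function. This is a sketch using the coefficients exactly as stated in the linear equation of part (a); the function and argument names are illustrative.

```python
# Coefficients as written in the fitted equation of part (a).
COEFFICIENTS = {
    "intercept": 129.398,
    "experience": 10.821,
    "age": -3.268,
    "shift": -17.567,
    "gender": 6.195,
}

def predict(experience, age, shift, gender):
    """Evaluate the fitted regression equation for one set of inputs."""
    c = COEFFICIENTS
    return (c["intercept"]
            + c["experience"] * experience
            + c["age"] * age
            + c["shift"] * shift
            + c["gender"] * gender)

y_hat = predict(experience=6, age=30, shift=1, gender=0)
print(round(y_hat, 3))
```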

(c)

Relevant Excel output table is shown above in part (a).


The coefficient of multiple determination is 0.7443, and the adjusted coefficient of
multiple determination is 0.6762. This means that 74.43% (or 67.62% after adjusting
for the number of independent variables) of the variation in productivity can be
explained by the variation in the independent variables, which are Experience, Age,
Shift, and Gender.

The adjusted coefficient of multiple determination is the more appropriate measure in
this scenario, since it accounts for both model simplicity and predictive ability. The
unadjusted coefficient accounts only for predictive ability, so it never decreases when
additional independent variables are added to the model, even if they contribute
little.

There are already four independent variables in this example. Using the adjusted
coefficient of multiple determination therefore guards against overloading the model
with independent variables: if a variable added to the model does not improve the fit
enough to justify the extra complexity, the adjusted coefficient of multiple
determination declines.
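
The link between the two coefficients can be checked from the regression output in part (a). The sketch below uses the standard adjustment formula, 1 - (1 - R²)(n - 1)/(n - k - 1), with n = 20 observations and k = 4 independent variables; the function name is illustrative.

```python
def adjusted_r_squared(r_squared, n_obs, n_predictors):
    """Adjusted R^2 = 1 - (1 - R^2) * (n - 1) / (n - k - 1)."""
    return 1 - (1 - r_squared) * (n_obs - 1) / (n_obs - n_predictors - 1)

# R^2 from the ANOVA table: regression sum of squares / total sum of squares.
r_squared = 4637.92 / 6230.55
adj_r_squared = adjusted_r_squared(r_squared, n_obs=20, n_predictors=4)

print(round(r_squared, 4), round(adj_r_squared, 4))
```

The penalty factor (n - 1)/(n - k - 1) grows with each added variable, which is why the adjusted value falls when a new variable adds little explanatory power.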

(d)

Step 1: State the null and alternate hypotheses.


H0: β1 = 0
H1: β1 ≠ 0
where β1 is the coefficient of Experience (X1)
H0: β2 = 0
H1: β2 ≠ 0
where β2 is the coefficient of Age (X2)
H0: β3 = 0
H1: β3 ≠ 0
where β3 is the coefficient of Shift (X3)
H0: β4 = 0
H1: β4 ≠ 0
where β4 is the coefficient of Gender (X4)

Step 2: Select the level of significance.


The level of significance is α = 0.05

Step 3: Decide on a test statistic.


We will perform the individual t-tests.

Step 4: Develop a decision rule.


Reject H0 if p-value < 0.05.

Step 5: Compute the value of the test statistic, make a decision regarding the null
hypothesis, and interpret the results.

P-value for X1 (Experience) = 0.0088
Since p-value < 0.05, reject H0.

P-value for X2 (Age) = 0.0911
Since p-value > 0.05, do not reject H0.

P-value for X3 (Shift) = 0.0021
Since p-value < 0.05, reject H0.

P-value for X4 (Gender) = 0.4504
Since p-value > 0.05, do not reject H0.

Therefore, two independent variables (X1, Experience, and X3, Shift) are significant,
while the other two (X2, Age, and X4, Gender) are insignificant at the 0.05 level and
could therefore be dropped from the model.
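
The decision rule in Step 4 can be applied to all four coefficients at once. A minimal sketch, taking the p-values from the regression output in part (a); the dictionary and variable names are illustrative.

```python
ALPHA = 0.05

# P-values from the coefficient table of the regression output in part (a).
p_values = {
    "Experience": 0.00884973,
    "Age": 0.09107787,
    "Shift": 0.0021355,
    "Gender": 0.45035887,
}

# Reject H0 (coefficient = 0) when the p-value is below the significance level.
decisions = {
    name: "reject H0" if p < ALPHA else "do not reject H0"
    for name, p in p_values.items()
}
significant = [name for name, d in decisions.items() if d == "reject H0"]

for name, decision in decisions.items():
    print(f"{name}: p = {p_values[name]:.4f} -> {decision}")
```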

Reference

Zheng, F., Soh, I., & Tan, C. (2021). BUS105 Statistics (study guide). Singapore:
Singapore University of Social Sciences.
