Data Analysis - Answer

1. The human resource manager wanted to know if the median salary of fresh graduates with IT backgrounds was more than RM3500. 2. A hypothesis test was conducted using a t-test and found that there was not enough evidence to conclude the median salary was more than RM3500. 3. A fast food manager claimed the average variance in customers between 8-11am was 2. A friend counted customers in 15 other outlets and the data did not support a variance of 2. 4. A chi-square test was conducted and found that the data did not provide enough evidence to reject the claim that the variance was 2.

Uploaded by

zulhusni
Copyright
© © All Rights Reserved

NAME : AHMAD ZULHUSNI BIN OMAR

IC NO : 930314-05-5057
STUDENT ID : 930314055057001

QUESTION 1
A Human Resource manager in an IT company would like to know whether the median
salary of fresh graduate employees with IT background is more than RM3500 per month.
Data of salary for 10 fresh graduates is shown in Table 1.

Table 1
Employee Salary (RM)
1 2300
2 1800
3 3600
4 3700
5 4000
6 5500
7 4400
8 3500
9 2900
10 3700

a) Explain which statistic distribution should be used in order to test the manager’s claim.
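The t distribution (Student's t) with n − 1 = 9 degrees of freedom, because the sample is
small (n = 10) and the population standard deviation is unknown. The t-test is a test on the
mean, so it speaks to the median only under the assumption that salaries are approximately
normally distributed, in which case the mean and the median coincide.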

b) Construct a complete hypothesis test to test whether the manager’s claim is correct or
not at α = 0.05.
Hypothesis testing
$H_0: \mu \le 3500$

$H_1: \mu > 3500$

Critical value
The significance level is α =0.05
The critical value for a right-tailed test with n − 1 = 9 degrees of freedom is t = 1.833,
based on the t distribution table.
Test Statistic
$t = \dfrac{\bar{x} - \mu}{s / \sqrt{n}}$

Mean: $\bar{x} = \dfrac{\sum x}{n}$
$\bar{x} = \dfrac{2300 + 1800 + 3600 + 3700 + 4000 + 5500 + 4400 + 3500 + 2900 + 3700}{10}$
$\bar{x} = \dfrac{35400}{10}$
$\bar{x} = 3540$

Sample standard deviation: $s = \sqrt{\dfrac{\sum x^2 - \frac{(\sum x)^2}{n}}{n - 1}}$

Employee   Salary (RM) (x)   x²
1          2300              5290000
2          1800              3240000
3          3600              12960000
4          3700              13690000
5          4000              16000000
6          5500              30250000
7          4400              19360000
8          3500              12250000
9          2900              8410000
10         3700              13690000
Total      35400             135140000


$s = \sqrt{\dfrac{135140000 - \frac{(35400)^2}{10}}{10 - 1}}$
$s = \sqrt{\dfrac{135140000 - 125316000}{9}}$
$s = \sqrt{\dfrac{9824000}{9}}$
$s = \sqrt{1091555.5556}$
$s = 1044.7754$
So;
Sample mean ($\bar{x}$) = 3540
Sample standard deviation (s) = 1044.7754
Sample size (n) = 10
Significance level (α) = 0.05

$t = \dfrac{3540 - 3500}{1044.7754 / \sqrt{10}}$
$t = 0.121$

Decision about the null hypothesis


Since it is observed that $t = 0.121 \le t_c = 1.833$, it is then concluded that the null
hypothesis is not rejected.

Conclusion
It is concluded that the null hypothesis H0 is not rejected. Therefore, there is not enough
evidence to claim that the population mean μ is greater than 3500, at the α =0.05 significance
level.
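
The arithmetic in (b) can be cross-checked with a short script. This is a minimal sketch using only the Python standard library; the variable names are illustrative, and the critical value 1.833 is copied from the t table quoted above rather than computed.

```python
import math
import statistics

# Salaries (RM) of the 10 fresh graduates from Table 1
salaries = [2300, 1800, 3600, 3700, 4000, 5500, 4400, 3500, 2900, 3700]

mu0 = 3500       # hypothesised value under H0
t_crit = 1.833   # right-tailed critical value, df = 9, alpha = 0.05 (from the t table)

n = len(salaries)
mean = statistics.mean(salaries)   # sample mean
s = statistics.stdev(salaries)     # sample standard deviation (divides by n - 1)

# One-sample t statistic: (x_bar - mu0) / (s / sqrt(n))
t_stat = (mean - mu0) / (s / math.sqrt(n))

# Right-tailed test: reject H0 only if t exceeds the critical value
reject_h0 = t_stat > t_crit

print(f"mean = {mean}, s = {s:.4f}, t = {t_stat:.3f}, reject H0: {reject_h0}")
```

The script reproduces the mean of 3540, the standard deviation of about 1044.7754 and the statistic t ≈ 0.121, which lies below 1.833, so H0 is not rejected.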

c) Based on your answer in (b) state a brief summary of the median salary of fresh
graduate employees with IT background.
Based on the answer in (b), at the 0.05 significance level, there is not enough evidence to
conclude that the median salary of fresh graduate employees with IT background is more than
RM3500 per month.
QUESTION 2
According to a fast-food manager, the average number of customers between 8am to 11am
is 10, with a variance of two. His friend from another outlet does not believe that the
variance is two. He counts the number of customers in 15 other outlets and obtained the
data as in Table 2.

Table 2

Outlet Number of Customers


1 11
2 10
3 9
4 10
5 10
6 11
7 11
8 10
9 12
10 9
11 7
12 9
13 11
14 10
15 11

a) Explain which statistic distribution should be used in order to test whether the variance
is different from two.
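The chi-square distribution with n − 1 = 14 degrees of freedom, which is the distribution
used to test a claim about a single population variance.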

b) Construct a complete hypothesis test to test the claim in (a) at α = 0.05.

Hypothesis Testing
$H_0: \sigma^2 = 2$
$H_1: \sigma^2 \ne 2$
Critical Value
The significance level is α =0.05
The rejection region for this two-tailed test, with critical values obtained from Excel, is;
$R = \{\chi^2 : \chi^2 < 5.629 \ \text{or} \ \chi^2 > 26.119\}$

Test Statistics

$\chi^2 = \dfrac{(n - 1)s^2}{\sigma^2}$

$s^2 = \dfrac{1}{n - 1}\left(\sum_{i=1}^{n} X_i^2 - \dfrac{1}{n}\left(\sum_{i=1}^{n} X_i\right)^2\right)$

Outlet   Number of Customers (X)   X²
1        11                        121
2        10                        100
3        9                         81
4        10                        100
5        10                        100
6        11                        121
7        11                        121
8        10                        100
9        12                        144
10       9                         81
11       7                         49
12       9                         81
13       11                        121
14       10                        100
15       11                        121
TOTAL    151                       1541

$s^2 = \dfrac{1}{15 - 1}\left(1541 - \dfrac{151^2}{15}\right)$
$s^2 = 1.4952$
So;
Sample variance ($s^2$) = 1.4952
Population variance ($\sigma^2$) = 2
Sample size (n) = 15
Significance level (α) = 0.05

$\chi^2 = \dfrac{(15 - 1) \cdot 1.4952}{2}$
$\chi^2 = 10.466$

Decision

Since it is observed that $\chi^2_L = 5.629 \le \chi^2 = 10.466 \le \chi^2_U = 26.119$, we can
conclude that the null hypothesis is not rejected.
Conclusion
It is concluded that the null hypothesis $H_0$ is not rejected. Therefore, there is not enough
evidence to claim that the population variance $\sigma^2$ is different from 2 at the 0.05
significance level.
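
The chi-square test in (b) can be checked the same way. This is a minimal sketch using only the Python standard library; the variable names are illustrative, and the critical values 5.629 and 26.119 are copied from the rejection region above rather than computed.

```python
import statistics

# Number of customers at the 15 outlets from Table 2
customers = [11, 10, 9, 10, 10, 11, 11, 10, 12, 9, 7, 9, 11, 10, 11]

sigma2_0 = 2                            # hypothesised population variance under H0
chi2_lower, chi2_upper = 5.629, 26.119  # two-tailed critical values, df = 14, alpha = 0.05

n = len(customers)
s2 = statistics.variance(customers)     # sample variance (divides by n - 1)

# Chi-square statistic for a single variance: (n - 1) * s^2 / sigma0^2
chi2_stat = (n - 1) * s2 / sigma2_0

# Two-tailed test: reject H0 if the statistic falls outside [lower, upper]
reject_h0 = chi2_stat < chi2_lower or chi2_stat > chi2_upper

print(f"s^2 = {s2:.4f}, chi2 = {chi2_stat:.3f}, reject H0: {reject_h0}")
```

Without intermediate rounding the statistic is about 10.467; the 10.466 in the working comes from rounding $s^2$ to 1.4952 first. Either value lies between the two critical values, so H0 is not rejected.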

c) Based on your answer in (b) state a brief summary of the variance number of
customers.
Based on the answer in (b), at the 0.05 significance level there is not enough evidence to
conclude that the variance of the number of customers is different from 2.
QUESTION 3
A homeowner is interested in the effect that using the air conditioner and washing machine
had on the electric bill. He recorded the number of hours the air conditioner was used and
the number of times the washing machine was used for 14 days. He also monitored the electric
meter for these 14 days, and computed the amount of electricity used each day in
kilowatt-hours. Data is shown in Table 3.

Table 3

Air Conditioner (hours) Washing Machine (times) Electricity Used (kwh)


1.5 1 35
4.5 2 63
5.0 2 66
2.0 0 17
8.5 3 94
6.0 3 79
13.5 1 93
8.0 1 66
12.5 1 94
7.5 2 82
8.0 3 78
7.5 1 65
12.0 1 77
6.0 0 75

a) Use Microsoft Excel to produce Analysis of Variance table for the multiple linear
regression and state the model.

Regression Statistics
Multiple R 0.904805
R Square 0.818672
Adjusted R Square 0.785703
Standard Error 10.09556
Observations 14
ANOVA
             df    SS        MS        F        Significance F
Regression   2     5061.73   2530.87   24.832   0.000
Residual     11    1121.12   101.92
Total        13    6182.86

                          Coefficients   Standard Error   t Stat   P-value   Lower 95%   Upper 95%   Lower 95.0%   Upper 95.0%
Intercept                 22.4323        7.3603           3.0477   0.0111    6.2323      38.6322     6.2323        38.6322
Air Conditioner (hours)   4.7778         0.7795           6.1294   0.0001    3.0621      6.4934      3.0621        6.4934
Washing Machine (times)   8.5823         2.7522           3.1183   0.0098    2.5247      14.6400     2.5247        14.6400

Multiple Regression Model

$y = b_0 + b_1 x_1 + b_2 x_2$

Where;
$y$ = Electricity Used (kWh)
$x_1$ = Air Conditioner (hours)
$x_2$ = Washing Machine (times)

Electricity Used (kWh) = 22.4323 + 4.7778 × Air Conditioner (hours) + 8.5823 × Washing Machine (times)
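
The fitted coefficients can be reproduced without Excel. The sketch below is not the Excel procedure itself; it solves the least-squares normal equations for two predictors in closed form using centred sums of squares and cross-products. The function names `centered_cross_sum` and `fit_two_predictor_ols` are hypothetical helpers introduced only for this illustration.

```python
# Data from Table 3
ac_hours = [1.5, 4.5, 5.0, 2.0, 8.5, 6.0, 13.5, 8.0, 12.5, 7.5, 8.0, 7.5, 12.0, 6.0]
washes = [1, 2, 2, 0, 3, 3, 1, 1, 1, 2, 3, 1, 1, 0]
kwh = [35, 63, 66, 17, 94, 79, 93, 66, 94, 82, 78, 65, 77, 75]


def centered_cross_sum(u, v):
    """Sum of (u_i - mean(u)) * (v_i - mean(v)) over all observations."""
    mean_u = sum(u) / len(u)
    mean_v = sum(v) / len(v)
    return sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v))


def fit_two_predictor_ols(x1, x2, y):
    """Least-squares fit of y = b0 + b1*x1 + b2*x2 via the 2x2 normal equations."""
    s11 = centered_cross_sum(x1, x1)
    s22 = centered_cross_sum(x2, x2)
    s12 = centered_cross_sum(x1, x2)
    s1y = centered_cross_sum(x1, y)
    s2y = centered_cross_sum(x2, y)

    det = s11 * s22 - s12 ** 2          # determinant of the centred 2x2 system
    b1 = (s22 * s1y - s12 * s2y) / det  # slope for x1 (Cramer's rule)
    b2 = (s11 * s2y - s12 * s1y) / det  # slope for x2 (Cramer's rule)

    n = len(y)
    b0 = sum(y) / n - b1 * sum(x1) / n - b2 * sum(x2) / n  # intercept
    return b0, b1, b2


b0, b1, b2 = fit_two_predictor_ols(ac_hours, washes, kwh)
print(f"b0 = {b0:.4f}, b1 = {b1:.4f}, b2 = {b2:.4f}")
```

The values returned agree with the Excel coefficients of 22.4323, 4.7778 and 8.5823 to the precision shown in the table.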

b) Construct a complete hypothesis test to test whether the model in (a) is fit at α = 0.05.
$H_0: \beta_1 = \beta_2 = 0$

$H_1:$ at least one of $\beta_1, \beta_2$ is not equal to 0

F statistic = 24.832

P-value = 0.000 (Significance F, to three decimal places)
The p-value is less than α, so we can reject the null hypothesis ($H_0$).
In conclusion, at α = 0.05, the model is significant.
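
The F statistic follows directly from the sums of squares in the ANOVA table. This is a minimal sketch with illustrative variable names; the critical value 3.98 for F(2, 11) at α = 0.05 is an assumption taken from a standard F table and does not appear in the Excel output.

```python
# Sums of squares and degrees of freedom from the ANOVA table in (a)
ss_regression = 5061.73
ss_residual = 1121.12
df_regression = 2
df_residual = 11

f_crit = 3.98  # assumption: F(2, 11) critical value at alpha = 0.05 from a standard F table

# Mean squares are sums of squares divided by their degrees of freedom
ms_regression = ss_regression / df_regression
ms_residual = ss_residual / df_residual

# Overall F statistic and coefficient of determination
f_stat = ms_regression / ms_residual
r_squared = ss_regression / (ss_regression + ss_residual)

reject_h0 = f_stat > f_crit

print(f"F = {f_stat:.3f}, R^2 = {r_squared:.4f}, reject H0: {reject_h0}")
```

The critical-value comparison leads to the same decision as the p-value approach used above: F is far larger than the critical value, so H0 is rejected.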
c) Construct a complete hypothesis test to test whether the regression coefficients, β1 and
β2, are significantly different from zero at α = 0.05.
β1

$H_0: \beta_1 = 0$

$H_1: \beta_1 \ne 0$

t = 6.1294
P-value = 0.0001
Since the p-value is less than 0.05, we can reject the null hypothesis ($H_0$).

We can conclude that the coefficient $\beta_1$ is significant at α = 0.05.

β2

$H_0: \beta_2 = 0$

$H_1: \beta_2 \ne 0$

t = 3.1183
P-value = 0.0098
Since the p-value is less than 0.05, we can reject the null hypothesis ($H_0$).

We can conclude that the coefficient $\beta_2$ is significant at α = 0.05.
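
Each t statistic is the coefficient divided by its standard error, both taken from the Excel coefficient table in (a). The sketch below recomputes them; the dictionary layout is illustrative, and the two-tailed critical value 2.201 for 11 degrees of freedom is an assumption taken from a standard t table rather than from the Excel output.

```python
# (coefficient, standard error) pairs from the Excel coefficient table in (a)
estimates = {
    "Air Conditioner (hours)": (4.7778, 0.7795),
    "Washing Machine (times)": (8.5823, 2.7522),
}

t_crit = 2.201  # assumption: two-tailed t critical value, df = 11, alpha = 0.05

# t statistic for H0: beta = 0 is coefficient / standard error
t_stats = {name: coef / se for name, (coef, se) in estimates.items()}

# Two-tailed test: significant when |t| exceeds the critical value
significant = {name: abs(t) > t_crit for name, t in t_stats.items()}

for name, t in t_stats.items():
    print(f"{name}: t = {t:.4f}, significant: {significant[name]}")
```

Both statistics exceed the critical value, matching the p-value decisions: about 6.129 for the air conditioner and about 3.118 for the washing machine.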

d) Based on your answer in (b) and (c) state a brief summary on the linear relationship
between electricity used and the two factors.
Based on the answers in (b) and (c), there is a significant positive linear relationship
between the dependent variable (electricity used) and the independent variables (air
conditioner and washing machine usage). Holding washing machine usage fixed, each additional
hour of air conditioner use is associated with about 4.78 kWh more electricity used.
Similarly, holding air conditioner hours fixed, each additional use of the washing machine is
associated with about 8.58 kWh more electricity used.
