CGRP MSC Stat 3 PDF

The document outlines the M.Sc. course RPE 804 - Applied Statistics for Engineers, focusing on statistical tests for significance, including hypothesis testing, t-tests, z-tests, F-tests, chi-square tests, and ANOVA. It explains the concepts of null and alternate hypotheses, significance levels, and provides examples for each statistical test. Additionally, it includes assignments for practical application of the statistical methods discussed.

Uploaded by

Udop Charles
Copyright
© © All Rights Reserved

Centre for Gas, Refining and Petrochemical

Engineering.
University of Port Harcourt.

M. Sc. Programme

RPE 804 - Applied Statistics for Engineers

Course Lecturer – Engr. Dr. (Mrs.) Ibifuro Altraide

Date: 22nd January, 2025.


Part 3

Statistical test for significance


Basic concepts in significance testing
1. Hypothesis
A hypothesis is an assumption made on the basis of some
evidence: a predictive statement, testable by a statistical
method, that relates an independent variable to one or more
dependent variables.

2. Null hypothesis, Ho
It is the statement that is tested, and it is contrary to
what the researcher expects to find. It is a negative
statement: there is no difference between groups and no
relationship between the independent and dependent
variables.
3. Alternate hypothesis, H1
This hypothesis should state what you expect the data to
show, based on your research on the topic.
Example:
Ho: Conc. of reactants does not have an effect on reaction rate
H1: Conc. of reactants does have an effect on reaction rate.

4. Significance level
This is the risk, expressed as a probability, of making a wrong
decision by rejecting a null hypothesis that is actually true. A
significance level of 0.05 is often used in hypothesis testing;
the usual choices are 1%, 5% and 10%.
If the results of the study indicate a probability (p-value)
lower than the significance level, the researcher can reject the
null hypothesis; otherwise the researcher fails to reject it.
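
The decision rule above can be sketched in a few lines of Python. The function name `decide` and the p-value of 0.03 are hypothetical, chosen only to show how the same result leads to different decisions at the 1%, 5% and 10% levels.

```python
def decide(p_value: float, alpha: float = 0.05) -> str:
    """Return the test decision for a p-value at significance level alpha."""
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie strictly between 0 and 1")
    # Reject only when the probability is lower than the significance level.
    return "reject Ho" if p_value < alpha else "fail to reject Ho"


# Hypothetical p-value, checked at the three commonly used levels.
for alpha in (0.01, 0.05, 0.10):
    print(f"alpha = {alpha:.2f}: {decide(0.03, alpha)}")
```

A stricter level (1%) demands stronger evidence, so the same data may fail to reject Ho there while rejecting it at 5% or 10%.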
Parametric and non-parametric tests
t - test
The t - test is a statistical test used to determine if there is a
significant difference between the means of two groups or
samples. It is commonly employed when dealing with small
sample sizes (<30) and unknown population standard
deviations. By calculating the t-statistic and comparing it to
the critical value, the t-test helps assess the probability that
the observed difference between the means is due to chance
or a genuine effect.

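One sample t-test is

$$t = \frac{\bar{x} - \mu}{S / \sqrt{n}}$$

where
$\bar{x}$ = sample mean
$\mu$ = hypothesised population mean
n = sample size
S = sample standard deviation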
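
As an illustration, the one-sample t statistic can be computed with the Python standard library. This is a minimal sketch: the function name `one_sample_t`, the five readings and the hypothesised mean of 10.0 are all hypothetical and do not come from the course notes.

```python
import math
import statistics


def one_sample_t(sample, mu0):
    """t = (sample mean - hypothesised mean) / (S / sqrt(n))."""
    n = len(sample)
    mean = statistics.mean(sample)
    s = statistics.stdev(sample)  # sample standard deviation (n - 1 divisor)
    return (mean - mu0) / (s / math.sqrt(n))


# Hypothetical readings tested against a hypothesised mean of 10.0.
readings = [10.2, 9.8, 10.5, 10.1, 9.9]
t_one = one_sample_t(readings, 10.0)
print(f"t = {t_one:.3f} with {len(readings) - 1} degrees of freedom")
```

The resulting value is then compared with the critical t-value for n - 1 degrees of freedom.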


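Two sample t-test (pooled-variance form, with df = n1 + n2 - 2)

$$t = \frac{\bar{x}_1 - \bar{x}_2}{s_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}}, \qquad s_p^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}$$

Criteria
|tstatistic| ≥ tcritical - reject null hypothesis
|tstatistic| < tcritical - fail to reject null hypothesis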

Example
Is there a significant difference in test scores between 25
students who received in-person instruction and 25 students
who received online instruction? The mean test score for the
in-person group is 80 (SD = 5) and for the online group is 75
(SD = 7).
Solution
This is a two-sample t-test problem, as the two groups being
compared are independent of each other.
Ho: there is no significant difference in test scores between
in-person students and online students
H1: there is a significant difference in test scores between
in-person students and online students

To perform the t-test, we first calculate the tstat value using
the two-sample t-test formula:

$$t_{stat} = \frac{\bar{x}_1 - \bar{x}_2}{s_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}}, \qquad s_p^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}$$

Where $\bar{x}_1$ is the mean test score for the in-person group,
$\bar{x}_2$ is the mean test score for the online group, $s_1$ and
$s_2$ are the two standard deviations, and $n_1$ and $n_2$ are the
two sample sizes.

Substituting the numbers, we get:

$$s_p^2 = \frac{24(5^2) + 24(7^2)}{48} = 37, \qquad t_{stat} = \frac{80 - 75}{\sqrt{37}\,\sqrt{2/25}} = \frac{5}{1.72} \approx 2.91$$

Next, a t-table is used to find the critical t-value for the
desired level of significance and degrees of freedom
(df = n1 + n2 - 2).

Let us assume a significance level of 0.05 and df = 48. The
critical t-value is 2.01.

Since the tstat of 2.91 > tcritical of 2.01, we reject the null
hypothesis: there is a significant difference between the test
scores of students who receive in-person instruction and those
who receive online instruction.
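
The calculation in this example can be checked with a short Python sketch. It assumes the pooled-variance form of the two-sample t-test, which is the form that goes with df = n1 + n2 - 2; the function name `pooled_t` is hypothetical, and the critical value 2.01 is the table value quoted in the example.

```python
import math


def pooled_t(mean1, sd1, n1, mean2, sd2, n2):
    """Two-sample t statistic from summary statistics (pooled variance)."""
    df = n1 + n2 - 2
    # Pooled variance: the two sample variances weighted by their df.
    sp2 = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    standard_error = math.sqrt(sp2 * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / standard_error, df


# In-person group: mean 80, SD 5, n 25. Online group: mean 75, SD 7, n 25.
t_stat, df = pooled_t(80, 5, 25, 75, 7, 25)
T_CRITICAL = 2.01  # t-table value for alpha = 0.05 (two-tailed), df = 48

decision = "reject Ho" if abs(t_stat) >= T_CRITICAL else "fail to reject Ho"
print(f"t = {t_stat:.2f}, df = {df}: {decision}")
```

Because the two groups have the same size, the unpooled (Welch) form would give the same t statistic here.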
Z - test
A z test is conducted on a population that follows a normal
distribution with independent data points and has a sample
size that is greater than or equal to 30. It is used to determine
if the means of two populations are equal to each other when
the population variance is known.
Types of Z test
1. One tailed z test, which is either
   a. Left tailed z test, or
   b. Right tailed z test
2. Two tailed z test
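
The difference between the tails shows up in the critical value and the rejection region. The sketch below uses `statistics.NormalDist` from the Python standard library; the function names `z_critical` and `reject_null` and the z value of 2.1 are hypothetical.

```python
from statistics import NormalDist


def z_critical(alpha: float, tail: str) -> float:
    """Critical z value for a left-, right- or two-tailed test."""
    standard_normal = NormalDist()  # mean 0, standard deviation 1
    if tail == "left":
        return standard_normal.inv_cdf(alpha)
    if tail == "right":
        return standard_normal.inv_cdf(1 - alpha)
    if tail == "two":
        # Alpha is split equally between the two tails.
        return standard_normal.inv_cdf(1 - alpha / 2)
    raise ValueError("tail must be 'left', 'right' or 'two'")


def reject_null(z_stat: float, alpha: float, tail: str) -> bool:
    """True when z_stat falls in the rejection region."""
    critical = z_critical(alpha, tail)
    if tail == "left":
        return z_stat <= critical
    if tail == "right":
        return z_stat >= critical
    return abs(z_stat) >= critical


for tail in ("left", "right", "two"):
    print(tail, round(z_critical(0.05, tail), 3), reject_null(2.1, 0.05, tail))
```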
F - test
A statistical F-test uses an F statistic to compare
two variances, $s_1^2$ and $s_2^2$, by dividing one by the other.
The result is always a positive number (because variances are
always positive).
The equation for comparing two variances with the F-test is:

$$F = \frac{s_1^2}{s_2^2}$$
F = Larger sample variance / Smaller sample variance
Note
• The larger variance should always go in the numerator
• For two-tailed tests, divide alpha by 2 before finding the
right critical value.
• If you are given standard deviations they must be
squared to get the variances.
Example
Use an appropriate statistical test to determine the variability
of salt content in two different populations in the given data
below:
Salt content (mg/125 ml)
Brand A   Brand B
860       540
850       640
750       600
870       640
940       300
410       610
410       430
820       280
890       300
890       610

Confidence level = 95% (level of significance = 0.05)


Solution
Step 1: Choose the test: F-test
$$F = \frac{s_1^2}{s_2^2}$$
Step 2: State the hypothesis
H0: There is no significant difference between the variances
H1: There is a significant difference between the variances
Step 3: Calculate the F-value using the formula
Mean A = 769
Mean B = 495
Variance A = 38254.44
Variance B = 23116.67
Fstat = 38254.44 / 23116.67 = 1.65
Step 4: Obtain Fc from F tables. The table gives values for a one-tailed
test; since this is a two-tailed test, divide the significance level by 2
(0.05 / 2 = 0.025). Each sample has n - 1 = 9 degrees of freedom.

FC = 4.03. Since FC > Fstat, we fail to reject the null hypothesis: there
is no significant difference between the variances of the two brands.
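
The steps of this example can be reproduced with the Python standard library. This is a sketch only: the variable names are hypothetical, and the critical value 4.03 is the table value quoted in Step 4 rather than something the code computes.

```python
import statistics

# Salt content (mg/125 ml) from the example table.
brand_a = [860, 850, 750, 870, 940, 410, 410, 820, 890, 890]
brand_b = [540, 640, 600, 640, 300, 610, 430, 280, 300, 610]

# Sample variances (n - 1 divisor).
var_a = statistics.variance(brand_a)
var_b = statistics.variance(brand_b)

# The larger variance always goes in the numerator.
f_stat = max(var_a, var_b) / min(var_a, var_b)

F_CRITICAL = 4.03  # F-table value for alpha/2 = 0.025 with 9 and 9 df
decision = "reject H0" if f_stat >= F_CRITICAL else "fail to reject H0"

print(f"mean A = {statistics.mean(brand_a)}, mean B = {statistics.mean(brand_b)}")
print(f"variance A = {var_a:.2f}, variance B = {var_b:.2f}")
print(f"F = {f_stat:.2f}: {decision}")
```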


Chi square test
This is a statistical test that is used to determine the
relationship between two categorical variables (gender,
educational level, animals, countries etc). The statistical
procedure compares the observed results with the results
expected if the two variables were unrelated.

$$\chi^2 = \sum \frac{(O - E)^2}{E}$$

where
O is the observed results
E is the expected results
Steps
Step 1: Define the hypothesis

Step 2: Calculate the expected values

Step 3: Calculate (O - E)^2 / E for each cell in the table.

Step 4: Calculate the test statistic χ^2 by adding up the Step 3 values

Step 5: Compare Chi stat and Chi critical and conclude


Example
The table below shows the preferred professional courses and gender of
students in a university in 2015. Use a relevant statistical test to
determine if there is a relationship between the two variables and draw
a conclusion.

         Law   Engineering   AI   Total
Male     100   70            30   200
Female   140   60            20   220
Total    240   130           50   420

Solution
H0: There is no relationship between gender and preferred courses
H1: There is a relationship between gender and preferred courses.

2. Calculate the expected value for each cell:
E = (row total × column total) / grand total
For example, for male students preferring Law, E = (200 × 240) / 420 = 114.29.

Observed     Expected     O - E     (O - E)^2   (O - E)^2 / E
values (O)   values (E)
100          114.29       -14.29    204.20      1.79
70           61.90        8.10      65.61       1.06
30           23.81        6.19      38.32       1.61
140          125.71       14.29     204.20      1.62
60           68.10        -8.10     65.61       0.96
20           26.19        -6.19     38.32       1.46
                                    χ^2 =       8.50

DF = (rows - 1)(columns - 1) = (2 - 1)(3 - 1) = 2.

χ^2 stat = 8.50
χ^2 critical = 5.991
For an alpha level of 0.05 and 2 df, the critical statistic is 5.991, which is
less than our obtained statistic of 8.50. We therefore reject the null
hypothesis and conclude that there is a relationship between gender and
preferred courses.
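
The chi-square steps for this example can be sketched in plain Python. The variable names are hypothetical, and the critical value 5.991 is the table value quoted above; expected values are taken as (row total × column total) / grand total.

```python
# Observed counts: rows are Male and Female; columns are Law, Engineering, AI.
observed = [
    [100, 70, 30],
    [140, 60, 20],
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

chi2_stat = 0.0
for i, row in enumerate(observed):
    for j, o in enumerate(row):
        # Expected count if gender and course preference were unrelated.
        e = row_totals[i] * col_totals[j] / grand_total
        chi2_stat += (o - e) ** 2 / e

dof = (len(observed) - 1) * (len(observed[0]) - 1)
CHI2_CRITICAL = 5.991  # chi-square table value for alpha = 0.05, 2 df

decision = "reject H0" if chi2_stat >= CHI2_CRITICAL else "fail to reject H0"
print(f"chi-square = {chi2_stat:.2f}, df = {dof}: {decision}")
```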
Analysis of variance (ANOVA)
ANOVA is a statistical test used to analyze the difference between
the means of more than two groups. It is basically an extension of the
t-test.

Types
A one-way ANOVA uses one independent variable or factor. It compares
three or more levels of one factor.
Two-way ANOVA uses two independent variables or factors. It compares
the effect of multiple levels of two factors.
The test statistic is the F statistic (F-test).

Assumptions
• Population is assumed to be normally distributed
• Samples are selected randomly
• Data is independent
The null hypothesis (H0) of ANOVA is that there is no difference among
group means. The alternative hypothesis (H1) is that at least one group
differs significantly from the overall mean of the dependent variable.
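ANOVA: F = variance between groups / variance within groups = MSB / MSE
If:
F > Fcritical reject Ho

F ≤ Fcritical fail to reject Ho

An F value of 1 or less means the group means differ no more than
the scatter within the groups, so Ho is not rejected in that case.
Mathematically,
Variance between groups (sum of squares between):

$$SSB = \sum_{j=1}^{k} n_j (\bar{X}_j - \bar{X})^2$$

Variance within groups (sum of squares within, or error):

$$SSE = \sum_{j=1}^{k} \sum_{i=1}^{n_j} (X_{ij} - \bar{X}_j)^2$$

Total sum of variation:

$$SST = \sum_{j=1}^{k} \sum_{i=1}^{n_j} (X_{ij} - \bar{X})^2$$

where $X_{ij}$ is the ith observation in group j, $\bar{X}_j$ is the
mean of group j, $\bar{X}$ is the overall mean, $n_j$ is the size of
group j and k is the number of groups.

SST = SSB + SSE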


Example
Do the following students' scores show that different studying
methods lead to different mean exam scores?
S/No         Method A   Method B   Method C
1            10         8          9
2            9          9          8
3            8          10         7
4            7.5        8          10
5            8.5        8.5        9
6            9          7          8
7            10         9.5        7
8            8          9          10
9            8          7          9
10           9          10         8
Group mean   8.7        8.6        8.5
Solution
Ho: there is no difference between the means of studying method
H1: there is a difference between the means of studying method

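Overall mean = (8.7 + 8.6 + 8.5) / 3 = 8.6

Variance between groups, $SSB = \sum_j n_j (\bar{X}_j - \bar{X})^2$

= 10(8.7 – 8.6)^2 + 10(8.6 – 8.6)^2 + 10(8.5 - 8.6)^2 = 0.2

Variance within groups, $SSE = \sum_j \sum_i (X_{ij} - \bar{X}_j)^2$

Where
$X_{ij}$ is the ith observation in group j
$\bar{X}_j$ is the mean of group j
For method A
(10 – 8.7)^2 + (9 - 8.7)^2 + (8 – 8.7)^2 + (7.5 – 8.7)^2 + (8.5 –
8.7)^2 + (9 - 8.7)^2 + (10 – 8.7)^2 + (8 – 8.7)^2 + (8 – 8.7)^2 +
(9 - 8.7)^2 = 6.6

Method B = 10.9

Method C = 10.5

Within groups variation, SSE = 6.6 + 10.9 + 10.5 = 28

With k = 3 groups and N = 30 observations:
MSB = SSB / (k - 1) = 0.2 / 2 = 0.10
MSE = SSE / (N - k) = 28 / 27 = 1.04
F = MSB / MSE = 0.10 / 1.04 = 0.096

From F tables at a significance level of 0.05 with 2 and 27 degrees
of freedom, Fcritical = 3.35. Since F = 0.096 < 3.35, we fail to
reject Ho, which states that there is no difference in the means of
the studying methods.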
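Completing the ANOVA table

$$SSB = \sum n_j (\bar{X}_j - \bar{X})^2, \quad df = k - 1, \quad MSB = \frac{SSB}{k - 1}$$
$$SSE = \sum \sum (X - \bar{X}_j)^2, \quad df = N - k, \quad MSE = \frac{SSE}{N - k}$$
$$SST = \sum \sum (X - \bar{X})^2, \quad df = N - 1, \quad F = \frac{MSB}{MSE}$$

where
X = individual observation,
$\bar{X}_j$ = sample mean of the jth treatment (or group),
$\bar{X}$ = overall sample mean,
$n_j$ = number of observations in the jth treatment (or group),
k = the number of treatments or independent comparison
groups, and
N = total number of observations or total sample size.

Source of variation     Sum of squares, SS   Degree of freedom   Mean squares (MS)   F
B/w treatments, SSB     0.2                  2                   0.10                0.096
Error (within), SSE     28.0                 27                  1.04                -
Total, SST              28.2                 29                  -                   -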
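
The entries of the ANOVA table can be reproduced with a short Python sketch. The variable names and the helper `mean` are hypothetical. The critical value 3.35 is an assumption taken from a standard F-table for alpha = 0.05 with 2 and 27 degrees of freedom, and F is computed as MSB / MSE.

```python
# Exam scores for the three studying methods in the example.
groups = {
    "A": [10, 9, 8, 7.5, 8.5, 9, 10, 8, 8, 9],
    "B": [8, 9, 10, 8, 8.5, 7, 9.5, 9, 7, 10],
    "C": [9, 8, 7, 10, 9, 8, 7, 10, 9, 8],
}


def mean(values):
    return sum(values) / len(values)


k = len(groups)                              # number of groups
N = sum(len(g) for g in groups.values())     # total number of observations
grand_mean = sum(sum(g) for g in groups.values()) / N

# Sum of squares between groups and within groups (error).
ssb = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups.values())
sse = sum((x - mean(g)) ** 2 for g in groups.values() for x in g)
sst = ssb + sse

msb = ssb / (k - 1)
mse = sse / (N - k)
f_stat = msb / mse

F_CRITICAL = 3.35  # assumed F-table value: alpha = 0.05, df = (2, 27)
decision = "reject Ho" if f_stat > F_CRITICAL else "fail to reject Ho"

print(f"SSB = {ssb:.1f}, df = {k - 1}, MSB = {msb:.2f}, F = {f_stat:.3f}")
print(f"SSE = {sse:.1f}, df = {N - k}, MSE = {mse:.2f}")
print(f"SST = {sst:.1f}, df = {N - 1}: {decision}")
```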
Kruskal Wallis H test
The Kruskal Wallis H test is the non-parametric alternative to
the one-way ANOVA. The H test is used when the
assumptions for ANOVA aren’t met (like the assumption of
normality). The test determines whether the medians of two
or more groups are different. The test statistic used in this test
is called the H statistic.

H0: there is no difference in the population medians


H1: there is a difference in population medians

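The Kruskal Wallis test will tell you if there is a significant
difference between groups. However, it won’t tell
you which groups are different.

$$H = \frac{12}{N(N + 1)} \left[ \frac{R_1^2}{n_1} + \frac{R_2^2}{n_2} + \dots + \frac{R_k^2}{n_k} \right] - 3(N + 1)$$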

Where:
N = sum of sample sizes for all samples,
k = number of samples,
Rj = sum of ranks in the jth sample,
nj = size of the jth sample.
Example
The following data set is selected from a non normal distribution.
With the application of an appropriate statistical test determine if
the null hypothesis will be rejected or accepted at a significance level
of 0.05.

Sample 1   Sample 2   Sample 3
8.2        10.2       13.5
10.3       9.1        8.4
9.1        13.9       9.6
12.6       14.5       13.8
11.4       9.1        17.4
13.2       16.4       15.3


Step 1: define the hypothesis
Ho: there is no difference between the medians of the three groups
H1: there is a difference between the medians
Step 2: rank the observations and sum

Sample 1   R1   Sample 2   R2   Sample 3   R3
8.2        1    10.2       7    13.5       12
10.3       8    9.1        4    8.4        2
9.1        4    13.9       14   9.6        6
12.6       10   14.5       15   13.8       13
11.4       9    9.1        4    17.4       18
13.2       11   16.4       17   15.3       16
∑          43              61              67

Write out all the observations and rank them in ascending order;
8.2, being the lowest number, has rank 1. Repeated numbers share
the average of the ranks they occupy: the three 9.1 values take
up ranks 3, 4 and 5, so each is given rank 4. As a check, the
rank sums add up to N(N + 1)/2 = 171.
Step 3: substitute the values into the formula
N = 18 (total number of observations)
n = 6 (observations in each sample)

$$H_{stat} = \frac{12}{18(19)} \left[ \frac{43^2}{6} + \frac{61^2}{6} + \frac{67^2}{6} \right] - 3(19) = 58.82 - 57 = 1.82$$

Using degrees of freedom = 3 – 1 = 2, find the value of
Hcritical using the chi-square table
Chi-square table value = 5.991
Hstat < χ^2 critical
Since the Hstat value is less than the critical chi-square value,
we fail to reject the null hypothesis.
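
The ranking and the H statistic for this example can be checked with a plain-Python sketch. It assumes the common convention that tied values share the average of the ranks they occupy, and it applies no further tie correction to H. The function name `midranks` is hypothetical, and 5.991 is the chi-square table value quoted above.

```python
def midranks(values):
    """Map each distinct value to its rank; ties get the average rank."""
    ordered = sorted(values)
    rank_of = {}
    i = 0
    while i < len(ordered):
        j = i
        while j < len(ordered) and ordered[j] == ordered[i]:
            j += 1
        # Tied values occupy ranks i + 1 .. j; use their average.
        rank_of[ordered[i]] = (i + 1 + j) / 2
        i = j
    return rank_of


samples = [
    [8.2, 10.3, 9.1, 12.6, 11.4, 13.2],
    [10.2, 9.1, 13.9, 14.5, 9.1, 16.4],
    [13.5, 8.4, 9.6, 13.8, 17.4, 15.3],
]

pooled = [x for sample in samples for x in sample]
rank_of = midranks(pooled)
rank_sums = [sum(rank_of[x] for x in sample) for sample in samples]

N = len(pooled)
h_stat = (12 / (N * (N + 1))
          * sum(r ** 2 / len(s) for r, s in zip(rank_sums, samples))
          - 3 * (N + 1))

H_CRITICAL = 5.991  # chi-square table value for alpha = 0.05, 2 df
decision = "reject Ho" if h_stat >= H_CRITICAL else "fail to reject Ho"
print(f"rank sums = {rank_sums}, H = {h_stat:.2f}: {decision}")
```

A quick sanity check on any ranking is that the rank sums add up to N(N + 1)/2.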
Assignment 1
1. In the synthesis of a locally formulated retarder in the
Chemical Engineering lab of Uniport, the thickening times
from analysis are presented in the table below. Use a relevant
statistical test to determine if there is a relationship between
the two variables and draw a conclusion.
                      Conc 1   Conc 2   Conc 3   Total
Commercial retarder   180      120      80       380
Local retarder        130      70       45       245
Total                 310      190      125      625

2. Essential oil was extracted from waste orange peels in a
laboratory using three different extraction techniques
and the yield (ml) presented in the following table. Is
there any difference in the means of the different
techniques? Complete the ANOVA table with the
calculated values.
S/No   Method 1   Method 2   Method 3
1      15         12         10
2      13         13         14
3      18         16         17
4      14         12         15
5      17         18         16
3. With the application of an appropriate statistical test
determine if the null hypothesis of the means of the samples
will be rejected or accepted at a significance level of 0.05.

Sample 1   Sample 2   Sample 3
15.8       11.4       12.2
18.6       15.8       14.5
13.1       11.4       11.4
11.4       12.2       10.5
Thank you for listening
