Lecture 2
Histograms are a good way of visualizing data
▪ With enough data points, histograms may indicate the potential distribution, multimodality, and skewness.
▪ Number of bins is important (see the sketch below).
➢ Too many bins might be very noisy
➢ Too few bins can mask out important features
[Figure: example histograms of simulated data (Frequency vs. value), drawn with different numbers of bins]
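The effect of bin choice can be illustrated with a small Python sketch (not from the lecture): it draws one simulated bimodal sample with too few, a moderate number of, and too many bins. The sample and the bin counts are assumptions chosen only to make the point.

```python
# Illustrative only: same data, three different bin counts.
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)
# A made-up bimodal sample: two normal components, so two peaks should be visible
data = np.concatenate([rng.normal(0, 1, 5000), rng.normal(6, 1, 5000)])

fig, axes = plt.subplots(1, 3, figsize=(12, 3))
for ax, bins in zip(axes, [5, 40, 400]):
    ax.hist(data, bins=bins)   # too few bins hide the two modes; too many look noisy
    ax.set_title(f"{bins} bins")
    ax.set_xlabel("value")
    ax.set_ylabel("Frequency")
plt.tight_layout()
plt.show()
```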
Frequency Distribution
▪ Normal distribution
▪ Skewed distribution
▪ Modality
Universality of the normal distribution! Basis of most statistical testing methods.
Standard Deviation (σ)
[Figure: normal distribution curve with the central 95% and 99% regions marked]
Z-scores
▪ Used to convert any normal distribution, N(µ, σ), to the standard normal distribution, N(0, 1), where
➢ Mean (µ) = 0
➢ Standard deviation (σ) = 1
z = (X − µ) / σ
▪ Allows the use of probability tables (p-values)
▪ Important z-score: ±1.96 (cuts off the outer 2.5% in each tail, leaving the central 95%, i.e. a 95% CI)
▪ https://www.mathsisfun.com/data/standard-normal-distribution-table.html
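A minimal Python sketch of the conversion and the table lookup, assuming scipy is available; the values µ = 100, σ = 15, and X = 120 are made-up examples, not from the lecture.

```python
from scipy.stats import norm   # standard normal N(0, 1)

mu, sigma = 100, 15            # assumed population mean and standard deviation (illustrative)
x = 120                        # an observed value

z = (x - mu) / sigma           # convert a value from N(mu, sigma) to the N(0, 1) scale
print(round(z, 2))             # 1.33

# The CDF plays the role of the probability table
print(norm.cdf(z))                         # P(Z <= z)
print(norm.cdf(1.96) - norm.cdf(-1.96))    # ~0.95: +/-1.96 encloses the central 95%
```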
Z-scores
▪ Example:
Every year, 50,000 runners compete in the Victoria Park Fun Run. They run 10 kilometres. The average finishing
time is 55 minutes, with a standard deviation of 10 minutes. Fred and Wilma completed the race in 61 and 51
minutes, respectively. Barney and Betty had finishing times with z-scores of -0.3 and 0.7, respectively.
List the runners in order, starting with the fastest runner and ending with the slowest runner.
Z-scores
▪ Example:
This problem can be solved by converting Fred and Wilma's raw scores into z-scores. To do this, we use the z-
score equation:
z = (X − X̄) / s
where z is the z-score, X is the runner's raw score, X̄ is the mean finishing time, and s is the standard deviation of finishing times.
Fred's z-score = (61 − 55) / 10 = 0.60
Wilma's z-score = (51 − 55) / 10 = −0.40
Based on z-scores, we can order the runners from fastest to slowest as follows: Wilma (z = -0.4), Barney (z = -0.3),
Fred (z = 0.6), and Betty (z = 0.7).
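The ordering can be checked with a few lines of Python; this is only a sketch using the numbers given above.

```python
mean, s = 55, 10                 # average finishing time (minutes) and standard deviation

# Fred and Wilma: raw times -> z-scores
z_fred = (61 - mean) / s         # 0.6
z_wilma = (51 - mean) / s        # -0.4

# Barney and Betty: z-scores -> raw times, X = mean + z * s
t_barney = mean + (-0.3) * s     # 52 minutes
t_betty = mean + 0.7 * s         # 62 minutes

# A smaller z-score (or time) means a faster runner, so sort ascending
runners = {"Wilma": z_wilma, "Barney": -0.3, "Fred": z_fred, "Betty": 0.7}
print(sorted(runners, key=runners.get))   # ['Wilma', 'Barney', 'Fred', 'Betty']
```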
Hypothesis Testing
▪ Null hypothesis (H0)
▪ Experimental hypothesis or alternative hypothesis (H1)
▪ The null hypothesis is the opposite of the experimental hypothesis
▪ Collect data and seek evidence against H0 as a way of bolstering H1
(deduction)
P-values
▪ The probability of obtaining a test statistic equal to or more extreme than the observed result, when H0 is true
▪ The p-value is used in the context of null hypothesis testing in order to quantify the idea of
statistical significance of evidence
▪ P-value will answer the question: What is the probability of the observed test statistic when H0
is true?
▪ Smaller P-values provide stronger evidence against H0
P-values
▪ Example: In the 1970s, 20–29 year old men in the U.K. had a mean body weight of 78kg.
Standard deviation was 18 kg. Test whether mean body weight in the population now differs.
▪ Null hypothesis H0: μ = 78 (“no difference”)
▪ The alternative hypothesis can be either
➢ H1: μ > 78 (one-sided test)
➢ H1 : μ ≠ 78 (two-sided test)
P-values: One sided test
▪ The critical value is either positive or negative, but not both.
P-values: Two sided test
▪ The critical value is the number that separates the rejection region (the “blue zone” in the figure) from the middle (±1.96 in this example)
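Where these critical values come from can be sketched with the standard normal quantile function (assuming scipy); α = 0.05 is just the conventional choice.

```python
from scipy.stats import norm

alpha = 0.05
print(norm.ppf(1 - alpha))       # ~1.64: one-sided critical value (upper tail only)
print(norm.ppf(1 - alpha / 2))   # ~1.96: two-sided critical value (+/- 1.96)
```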
P-values
▪ Example: In the 1970s, 20–29 year old men in the U.K. had a mean body weight of 78kg.
Standard deviation for the population was 18 kg.
▪ A sample was taken from 64 people, finding a mean weight of 80kg
z-score = (80-78) / (18/√64) = 0.89
Probability tables can then be used to ascertain the P-value: 0.19
Is this strong evidence for or against the null hypothesis, H0?
P-values
▪ Example: In the 1970s, 20–29 year old men in the U.K. had a mean body weight of 78kg.
Standard deviation for the population was 18 kg.
▪ Another sample was taken from 64 people, finding a mean weight of 83kg
z-score = (83-78) / (18/√64) = 2.22
Probability tables can then be used to ascertain the P-value: 0.01
Is this strong evidence for or against the null hypothesis, H0?
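Both samples can be reproduced with a short one-sided z-test sketch in Python (assuming scipy; the numbers are the ones given above).

```python
from math import sqrt
from scipy.stats import norm

mu0, sigma, n = 78, 18, 64           # null mean, population SD, sample size

for xbar in (80, 83):                # the two sample means above
    z = (xbar - mu0) / (sigma / sqrt(n))
    p = norm.sf(z)                   # one-sided p-value, P(Z > z) under H0
    print(f"mean {xbar} kg: z = {z:.2f}, p = {p:.2f}")
# mean 80 kg: z = 0.89, p = 0.19  -> weak evidence against H0
# mean 83 kg: z = 2.22, p = 0.01  -> strong evidence against H0
```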
P-values
α-level:
▪ Set BEFORE we collect data and run statistics
▪ Defines how much of an error we are willing to make to say we made a difference
▪ If we’re wrong, it’s an α error or Type 1 error
P-value:
▪ Calculated AFTER we gather the data
▪ The calculated probability of a mistake by saying it works
▪ AKA: level of significance
▪ Describes the percent of the population/area under the curve (in the tail) that is beyond our statistic
P-value or α-level??
▪ Let α ≡ probability of erroneously rejecting H0
▪ Set α threshold (e.g., let α = .10, .05, or whatever)
▪ Reject H0 when P ≤ α
➢ Power: 1 − β = Pr(reject H0 | H0 false)
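A closing sketch of the decision rule and of power, assuming scipy and, purely for illustration, a true mean of 83 kg (this value is not from the slides).

```python
from math import sqrt
from scipy.stats import norm

alpha, mu0, sigma, n = 0.05, 78, 18, 64
se = sigma / sqrt(n)

# Decision rule: reject H0 when p <= alpha
p = norm.sf((80 - mu0) / se)              # first sample, mean 80 kg, one-sided test
print("reject H0" if p <= alpha else "fail to reject H0")   # fail to reject (p = 0.19)

# Power, 1 - beta: chance of rejecting H0 if the true mean really were 83 kg
z_crit = norm.ppf(1 - alpha)              # ~1.64
xbar_crit = mu0 + z_crit * se             # sample mean needed to reject H0
power = norm.sf((xbar_crit - 83) / se)    # ~0.72
print(round(power, 2))
```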