Hypothesis Testing

Here are the steps to perform a two-proportion z-test in Python: 1. Define the null and alternative hypotheses: H0: p1 - p2 = 0 Ha: p1 - p2 ≠ 0 2. Calculate the sample proportions: p1 = 36.8% = 0.368 p2 = 38.9% = 0.389 3. Calculate the standard error: se = sqrt(p1(1-p1)/247 + p2(1-p2)/308) 4. Calculate the z-statistic: z = (p1 - p2) / se 5. Calculate the p-value from

Uploaded by

Sparsh Vijayvargia

Hypothesis Testing

1
What is hypothesis testing?

Hypothesis testing is a statistical method used to make decisions from experimental data. A hypothesis is an assumption that we make about a population parameter.

Ex: The average height of citizens of a city is more than 5.8 ft.
Votes for the politician will be more than 60%.

Each of these assumptions needs a statistical procedure to check it. We need a mathematical conclusion about whether what we are assuming is supported by the data.

2
Example

✔ It is claimed that a process has been improved in yield by bringing a change in an important factor X. Yield data are collected from old and new processes.

✔ Random samples are drawn from yield data from old process A and improved process B.

Process A   Process B
89.7        84.7
81.4        86.1
84.5        83.2
84.8        91.9
87.3        86.3
79.7        79.3
85.1        82.6
81.7        89.1
83.7        83.7
84.5        88.5

“Is there a real difference between Process A and Process B?”

3
Example: Hypothesis Testing
⮚ Real Question:
Can we say that the yield of improved Process B is greater than old Process A?

⮚ Descriptive Statistics
Variable   Process   N    Mean    Std. Dev.
Yield      A         10   84.24   2.90
Yield      B         10   85.54   3.65

⮚ Statistical Question:
Is there a statistically significant difference between mean of Process B (85.54) and mean of Process A (84.24)? Or, is this difference in mean just due to chance?
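The descriptive statistics above can be reproduced from the yield table, and the statistical question can be put to a test. The sketch below is one possible way to do it, not necessarily the method used in the slides: it assumes an equal-variance two-sample t-test with the one-sided alternative "mean of B is greater than mean of A", and the `alternative` argument assumes SciPy 1.6 or later.

```python
import numpy as np
from scipy import stats

# Yield samples copied from the Process A / Process B table
process_a = np.array([89.7, 81.4, 84.5, 84.8, 87.3, 79.7, 85.1, 81.7, 83.7, 84.5])
process_b = np.array([84.7, 86.1, 83.2, 91.9, 86.3, 79.3, 82.6, 89.1, 83.7, 88.5])

# Sample means and sample standard deviations (ddof=1 divides by n - 1)
mean_a, mean_b = process_a.mean(), process_b.mean()
std_a, std_b = process_a.std(ddof=1), process_b.std(ddof=1)

# Assumption: equal-variance two-sample t-test with Ha: mean(B) > mean(A)
t_stat, p_value = stats.ttest_ind(process_b, process_a, alternative="greater")

print(f"A: mean={mean_a:.2f}, sd={std_a:.2f}")
print(f"B: mean={mean_b:.2f}, sd={std_b:.2f}")
print(f"t={t_stat:.3f}, one-sided p={p_value:.3f}")
```

With these ten observations per process, the one-sided p-value comes out above 0.05, so at that significance level the observed difference in means could plausibly be due to chance.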
4
Hypothesis Testing
Develop the hypothesis for population and make statistical decision by determining the acceptance of hypothesis using sample data.
• Null Hypothesis (H0): Argument made so far, or hypothesis saying that there is no change or difference
• Alternative Hypothesis (H1): New argument, that is a hypothesis that you want to prove with solid ground obtained from sample

Example: Medicine B for treating headache that is newly developed by a pharmaceutical company has 30 minutes longer effect than existing Medicine A.
• H0 : Medicine A and B have same effect
• H1 : Medicine B has 30 minutes longer effect than Medicine A

5
Procedure of Hypothesis Testing

The steps for hypothesis tests are as follows:

1. Define null and alternative hypotheses.
2. Identify the test statistic to be used for testing the validity of the null hypothesis, for example, Z-test or t-test.
3. Decide the significance value (alpha, α). A typical value used for α is 0.05.
4. Calculate the p-value (probability value), which is the conditional probability of observing a test statistic at least as extreme as the one computed from the sample when the null hypothesis is true. We will use the functions provided in the scipy.stats module for calculating the p-value.
5. Take the decision to reject or retain the null hypothesis based on the p-value and the significance value α: reject the null hypothesis when the p-value is less than α.
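The five steps can be sketched in a few lines. This is only an illustration under stated assumptions: the helper `decide` is a hypothetical name, the test is taken to be a two-sided Z-test, and the statistic `z = 2.1` is a made-up value, not a result from the slides.

```python
from scipy import stats

def decide(p_value, alpha=0.05):
    """Step 5: reject H0 when the p-value is below alpha, otherwise retain it."""
    return "reject H0" if p_value < alpha else "retain H0"

# Steps 1-2 (assumed for illustration): H0: mu = mu0 vs Ha: mu != mu0, Z-test
z = 2.1        # hypothetical test statistic, not taken from the slides
alpha = 0.05   # Step 3: significance value

# Step 4: two-sided p-value, the probability of |Z| >= |z| when H0 is true
p_value = 2 * stats.norm.sf(abs(z))

# Step 5: compare the p-value with alpha
decision = decide(p_value, alpha)
print(round(p_value, 4), decision)
```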
6
Procedure of Hypothesis Testing

7
1-tail, 2-Tail
1-Sample, 2-Sample

8
9
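from scipy import stats
stats.norm.cdf(z)        # P(Z <= z) for a z-statistic z under the standard normal
stats.t.cdf(t, df=10)    # P(T <= t) for a t-statistic t with 10 degrees of freedom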
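The cdf gives the area to the left of a statistic, so turning it into a p-value depends on whether the test is 1-tail or 2-tail. The sketch below shows one way to handle the three cases; `p_value_from_t` is a hypothetical helper, and the statistic 1.5 with 10 degrees of freedom is an illustrative value only.

```python
from scipy import stats

def p_value_from_t(t, df, tail="two-sided"):
    """Convert a t-statistic into a p-value for the chosen tail (hypothetical helper)."""
    if tail == "left":        # Ha: parameter is smaller than the hypothesized value
        return stats.t.cdf(t, df)
    if tail == "right":       # Ha: parameter is larger than the hypothesized value
        return stats.t.sf(t, df)   # sf(t) = 1 - cdf(t)
    if tail == "two-sided":   # Ha: parameter differs from the hypothesized value
        return 2 * stats.t.sf(abs(t), df)
    raise ValueError(f"unknown tail: {tail}")

t_value, df = 1.5, 10   # illustrative values, not from the slides
left = p_value_from_t(t_value, df, "left")
right = p_value_from_t(t_value, df, "right")
two_sided = p_value_from_t(t_value, df, "two-sided")
print(left, right, two_sided)
```

The same pattern applies to a z-statistic by swapping `stats.t` for `stats.norm` and dropping `df`.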
10
11
12
Normal Distribution codes

13
Student’s t-Distribution codes

14
Simple Exercise

Conducting a hypothesis test is a bit like putting an accused person on trial in front of a jury.
The jury assumes that the accused person is innocent unless there is strong evidence against him, but even after considering the evidence, it’s still possible for the jury to make wrong decisions.

15
Simple Exercise

An accused person is on trial for a crime, and you’re on the jury. The jury’s task is to assume the prisoner is innocent, but if there’s enough evidence against him, they need to convict him.

1. In the trial, what’s the null hypothesis?

2. What’s the alternate hypothesis?
3. In what ways can the jury make a verdict that’s correct?
4. In what ways can the jury make a verdict that’s incorrect?

16
Simple Exercise

In the trial, what’s the null hypothesis?

The null hypothesis is that the prisoner is innocent, as that is what we
have to assume until there’s proof otherwise.

What’s the alternate hypothesis?

The alternate hypothesis is that the prisoner is guilty. In other words, if
there’s sufficient proof that the prisoner is not innocent, then we’ll
accept that he’s guilty and convict him.

17
Simple Exercise

In what ways can the jury make a verdict that’s correct?

We can make a correct verdict if:
a) The prisoner is innocent, and we find him innocent.
b) The prisoner is guilty, and we find him guilty.

In what ways can the jury make a verdict that’s incorrect?

We can make an incorrect verdict if:
a) The prisoner is innocent, and we find him guilty.
b) The prisoner is guilty, and we find him innocent.

18
The errors we can make when conducting a hypothesis test are the same sort of errors we could make when putting a prisoner on trial.

Hypothesis tests are basically tests where you take a claim and put it on trial by assessing the evidence against it. If there’s sufficient evidence against it, you reject it, but if there’s insufficient evidence against it, you retain it (often loosely called accepting it).

You may correctly retain or reject the null hypothesis, but even after considering the evidence, it’s also possible to make an error. You may reject a valid null hypothesis, or you might retain it when it’s actually false.

19
Statisticians have special names for these types of errors.

A Type I error is:

when you wrongly reject a true null hypothesis (punishing an innocent person), and

A Type II error is:

when you wrongly accept a false null hypothesis (letting a guilty person go free).
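A small simulation can make the Type I error concrete. The sketch below is an illustration under assumptions not in the slides: samples are drawn from a normal population whose mean really is 0, so the null hypothesis is true in every trial, and any rejection is a Type I error. The seed, sample size, and number of trials are arbitrary choices.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)   # fixed seed so the run is repeatable
alpha = 0.05
n_trials, n_obs = 2000, 30       # arbitrary simulation sizes

# H0 is true by construction: every sample comes from a population with mean 0
samples = rng.normal(loc=0.0, scale=1.0, size=(n_trials, n_obs))

# One one-sample t-test per row, each testing H0: mean = 0
result = stats.ttest_1samp(samples, popmean=0.0, axis=1)

# Share of trials in which a true H0 was rejected (Type I errors)
type_i_rate = np.mean(result.pvalue < alpha)
print(f"Share of true nulls rejected: {type_i_rate:.3f}")
```

The share of rejections should land close to the significance value, which is why α is often described as the Type I error rate the tester is willing to tolerate.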

20
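ERRORS

                 Actual Situation
Decision         H0 is true          H0 is false
Retain H0        Correct decision    Type II error
Reject H0        Type I error        Correct decision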

21
22
23
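Let us solve the problem

Mean = 4.0
Standard deviation = 3
Sample size = 50
Sample mean = 4.6

import numpy as np
from scipy import stats

# t-statistic as written on the slide: (mean - sample mean) / (standard deviation / sqrt(n))
t_statistic = (4 - 4.6) / (3 / np.sqrt(50))   # about -1.41

# Two-sided p-value: twice the lower-tail area, valid here because t_statistic is negative
p_value = 2 * stats.t.cdf(t_statistic, df=49)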

24
25
26
Let us do it in Python

scipy.stats.ttest_1samp(array, mu)

27
One-sample and one tail t-tests

Ex. An outbreak of Salmonella-related illness was attributed to ice cream produced at a certain factory. Scientists measured the level of Salmonella in 9 randomly sampled batches of ice cream. The levels (in MPN/g) were

0.593, 0.142, 0.329, 0.691, 0.231, 0.793, 0.519, 0.392, 0.418

Is there evidence that the mean level of Salmonella in the ice cream is
greater than 0.3 MPN/g?

Let us try in Python

28
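Let μ be the mean level of Salmonella in all batches of ice cream. Here the hypothesis of interest can be expressed as:

H0: μ <= 0.3

Ha: μ > 0.3

By default ttest_1samp runs a two-sided test; passing alternative='greater' (available in SciPy 1.6 and later) gives the one-tailed test that matches Ha.

import pandas as pd
import scipy.stats

data = pd.Series([0.593, 0.142, 0.329, 0.691, 0.231, 0.793, 0.519, 0.392, 0.418])
scipy.stats.ttest_1samp(data, 0.3, alternative='greater')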
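To see what the library call computes, the one-sample t-statistic can be worked out by hand from the nine measurements and compared with SciPy's answer. This is a sketch under two assumptions: the test is one-tailed in the direction of Ha (mean greater than 0.3), and the `alternative` argument is available (SciPy 1.6 or later).

```python
import numpy as np
from scipy import stats

# Salmonella levels (MPN/g) from the nine sampled batches
data = np.array([0.593, 0.142, 0.329, 0.691, 0.231, 0.793, 0.519, 0.392, 0.418])
mu0 = 0.3   # value of the mean under H0

n = data.size
standard_error = data.std(ddof=1) / np.sqrt(n)    # sample sd divided by sqrt(n)
t_manual = (data.mean() - mu0) / standard_error   # one-sample t-statistic
p_manual = stats.t.sf(t_manual, df=n - 1)         # upper-tail area for Ha: mu > mu0

# Same test through the library; needs SciPy 1.6+ for the alternative argument
result = stats.ttest_1samp(data, mu0, alternative="greater")

print(f"by hand: t={t_manual:.4f}, p={p_manual:.4f}")
print(f"scipy:   t={result.statistic:.4f}, p={result.pvalue:.4f}")
```

The two lines of output should agree, which confirms that the library is applying the usual formula with n - 1 = 8 degrees of freedom.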

29
Two-sample t-tests
Ex. 6 subjects were given a drug (treatment group) and an additional 6 subjects a
placebo (control group). Their reaction time to a stimulus was measured (in ms).
We want to perform a two-sample t-test for comparing the means of the
treatment and control groups.

Control : 91, 87, 99, 77, 88, 91

Treat :101, 110, 103, 93, 99, 104

30
Let Mu1 be the mean of the population taking medicine and Mu2 the mean of the untreated population. Here the hypothesis of interest can be expressed as:

H0: Mu1-Mu2=0
Ha: Mu1-Mu2 !=0

import pandas as pd
from scipy import stats

Control = pd.Series([91, 87, 99, 77, 88, 91])
Treat = pd.Series([101, 110, 103, 93, 99, 104])
# Control is passed first, so the statistic is negative when the treated mean is larger
stats.ttest_ind(Control, Treat)

Ttest_indResult(statistic=-3.445612673536487, pvalue=0.006272124350809803)
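The reported statistic can be reproduced by hand with the pooled-variance formula, which is what `ttest_ind` uses by default. The sketch below also runs Welch's variant (`equal_var=False`) as a comparison; using Welch here is an assumption added for illustration, not something the slides do.

```python
import numpy as np
from scipy import stats

control = np.array([91, 87, 99, 77, 88, 91])
treat = np.array([101, 110, 103, 93, 99, 104])

n1, n2 = control.size, treat.size
df = n1 + n2 - 2

# Pooled variance: weighted average of the two sample variances
pooled_var = ((n1 - 1) * control.var(ddof=1) + (n2 - 1) * treat.var(ddof=1)) / df
standard_error = np.sqrt(pooled_var * (1 / n1 + 1 / n2))

# Control minus treatment, matching the argument order used in the slide
t_manual = (control.mean() - treat.mean()) / standard_error
p_manual = 2 * stats.t.sf(abs(t_manual), df=df)   # two-sided p-value

# Welch's t-test drops the equal-variance assumption (added for comparison)
welch = stats.ttest_ind(control, treat, equal_var=False)

print(f"pooled: t={t_manual:.4f}, p={p_manual:.4f}")
print(f"welch:  t={welch.statistic:.4f}, p={welch.pvalue:.4f}")
```

Because the p-value is below 0.05, the null hypothesis of equal mean reaction times is rejected at that significance level under either variant.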
31
2 Proportion t test
Use case: Is there a significant difference between the population proportions of students in state 1 and state 2 who report that they have been placed immediately after education?

Populations: All students who have completed graduation and post graduation in both states
Parameter of Interest: p1 - p2, where p1 = state 1 and p2 = state 2

Data: 247 students from state 1. 36.8% of students report that they have got a job.
308 students from state 2. 38.9% of students report that they have got a job.

Hypothesis Definition:
Null Hypothesis: p1 - p2 = 0
Alternative Hypothesis: p1 - p2 ≠ 0
Here the difference in population proportions is tested with a t-test. Each student's response is a yes/no outcome, so the data follow a binomial distribution. We can pass the two samples, described by the appropriate binomial distribution parameters, to the t-test function.

We can use the ttest_ind() function from Statsmodels.

The function returns three values: (a) test statistic, (b) p-value of the t-test, and (c) degrees of freedom used in the t-test.

Data Given:
n1 = 247
p1 = .37

n2 = 308
p2 = .39
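As a cross-check on the t-test approach, the same hypotheses can be tested with a two-proportion z-test computed directly from the sample proportions. Note that this swaps in a different technique from the Statsmodels `ttest_ind()` call described above: it is a sketch that assumes the unpooled standard error sqrt(p1(1-p1)/n1 + p2(1-p2)/n2) and uses the unrounded percentages 36.8% and 38.9%.

```python
import numpy as np
from scipy import stats

# Sample sizes and sample proportions from the data description
n1, p1 = 247, 0.368   # state 1
n2, p2 = 308, 0.389   # state 2

# Unpooled standard error of the difference p1 - p2 (an assumption of this sketch)
standard_error = np.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)

# z-statistic for H0: p1 - p2 = 0
z = (p1 - p2) / standard_error

# Two-sided p-value for Ha: p1 - p2 != 0
p_value = 2 * stats.norm.sf(abs(z))

print(f"z={z:.3f}, p-value={p_value:.3f}")
```

The p-value is well above 0.05, so at that significance level the data do not show a significant difference between the two states' placement proportions.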
Thank you
