Chapter 3

The document discusses performing a one-sample proportion test and a two-sample proportion test to analyze data from Stack Overflow users. It provides the formulas and steps to calculate the test statistics and p-values for each test. A chi-square test of independence is also demonstrated to test if two variables, such as hobbyist status and age category, are independent.

Uploaded by

Komi David ABOTSITSE


One-sample proportion tests
HYPOTHESIS TESTING IN PYTHON

James Chapman
Curriculum Manager, DataCamp
Chapter 1 recap
Is a claim about an unknown population proportion feasible?

1. Estimate the standard error of the sample statistic from the bootstrap distribution

2. Compute a standardized test statistic

3. Calculate a p-value

4. Decide which hypothesis made most sense

Now, calculate the test statistic without using the bootstrap distribution

Standardized test statistic for proportions
$p$: population proportion (unknown population parameter)

$\hat{p}$: sample proportion (sample statistic)

$p_0$: hypothesized population proportion

$$z = \frac{\hat{p} - \operatorname{mean}(\hat{p})}{\operatorname{SE}(\hat{p})} = \frac{\hat{p} - p}{\operatorname{SE}(\hat{p})}$$

Assuming $H_0$ is true, $p = p_0$, so

$$z = \frac{\hat{p} - p_0}{\operatorname{SE}(\hat{p})}$$

Simplifying the standard error calculations
$$\operatorname{SE}(\hat{p}) = \sqrt{\frac{p_0 (1 - p_0)}{n}}$$

→ Under $H_0$, $\operatorname{SE}(\hat{p})$ depends on the hypothesized $p_0$ and the sample size $n$

Assuming $H_0$ is true,

$$z = \frac{\hat{p} - p_0}{\sqrt{\dfrac{p_0 (1 - p_0)}{n}}}$$

Only uses sample information ($\hat{p}$ and $n$) and the hypothesized parameter ($p_0$)
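
The formula above can be wrapped in a small helper. This is a minimal sketch, not part of the original slides: the function name `one_sample_proportion_z` and the example inputs (60 successes in 100 trials) are hypothetical and chosen only for illustration.

```python
import math


def one_sample_proportion_z(p_hat: float, p_0: float, n: int) -> float:
    """z-statistic for a one-sample proportion test, with the SE computed under H0."""
    if not 0 < p_0 < 1:
        raise ValueError("p_0 must lie strictly between 0 and 1")
    if n <= 0:
        raise ValueError("n must be a positive sample size")
    # Under H0 the standard error depends only on p_0 and n
    standard_error = math.sqrt(p_0 * (1 - p_0) / n)
    return (p_hat - p_0) / standard_error


# Hypothetical illustration: sample proportion 0.6 from n = 100, H0: p = 0.5
z_example = one_sample_proportion_z(p_hat=0.6, p_0=0.5, n=100)
print(z_example)
```

Because the standard error uses $p_0$ rather than $\hat{p}$, no bootstrap distribution is needed.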

Why z instead of t?
$$t = \frac{\bar{x}_{\text{child}} - \bar{x}_{\text{adult}}}{\sqrt{\dfrac{s_{\text{child}}^2}{n_{\text{child}}} + \dfrac{s_{\text{adult}}^2}{n_{\text{adult}}}}}$$

$s$ is calculated from $\bar{x}$
$\bar{x}$ estimates the population mean
$s$ estimates the population standard deviation
Using the sample twice increases the uncertainty in our estimate of the parameter
t-distribution - fatter tails than a normal distribution

$\hat{p}$ only appears in the numerator, so z-scores are fine
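
To make "fatter tails" concrete, the sketch below compares upper-tail probabilities of the t-distribution and the standard normal. It is an illustration only: the cutoff of 2.0 and the degrees of freedom are arbitrary values picked for this example, not taken from the slides.

```python
from scipy.stats import norm, t

cutoff = 2.0  # hypothetical cutoff, chosen only for illustration

# Probability of exceeding the cutoff under a standard normal
normal_tail = norm.sf(cutoff)

# The same tail probability under t-distributions with increasing degrees of freedom
t_tails = {df: t.sf(cutoff, df=df) for df in (5, 30, 1000)}

print(f"normal: {normal_tail:.5f}")
for df, tail in t_tails.items():
    print(f"t (df={df}): {tail:.5f}")
```

The t tail areas are larger than the normal one and shrink toward it as the degrees of freedom grow.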


Stack Overflow age categories
H0 : Proportion of Stack Overflow users under thirty = 0.5

HA : Proportion of Stack Overflow users under thirty ≠ 0.5

alpha = 0.01

stack_overflow['age_cat'].value_counts(normalize=True)

Under 30 0.535604
At least 30 0.464396
Name: age_cat, dtype: float64


Variables for z
p_hat = (stack_overflow['age_cat'] == 'Under 30').mean()

0.5356037151702786

p_0 = 0.50

n = len(stack_overflow)

2261


Calculating the z-score
$$z = \frac{\hat{p} - p_0}{\sqrt{\dfrac{p_0 (1 - p_0)}{n}}}$$

import numpy as np
numerator = p_hat - p_0
denominator = np.sqrt(p_0 * (1 - p_0) / n)
z_score = numerator / denominator

3.385911440783663

Calculating the p-value
Left-tailed ("less than"):

from scipy.stats import norm
p_value = norm.cdf(z_score)

Right-tailed ("greater than"):

p_value = 1 - norm.cdf(z_score)

Two-tailed ("not equal"):

p_value = norm.cdf(-z_score) + 1 - norm.cdf(z_score)

By symmetry of the normal distribution this equals the following (both forms assume z_score is positive, as it is here):

p_value = 2 * (1 - norm.cdf(z_score))

0.0007094227368100725

p_value <= alpha

True
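
The three tail rules can be collected into one function. This is a sketch under assumptions: the name `z_test_p_value` and the labels "less" and "greater" are hypothetical, and it uses `norm.sf` (the survival function, equal to `1 - norm.cdf`) plus `abs()` so that the two-sided branch also works for negative z-scores.

```python
from scipy.stats import norm


def z_test_p_value(z_score: float, alternative: str = "two-sided") -> float:
    """p-value of a z-test for the given alternative hypothesis."""
    if alternative == "two-sided":
        # Area in both tails beyond |z|
        return 2 * norm.sf(abs(z_score))
    if alternative == "less":
        # Left tail
        return norm.cdf(z_score)
    if alternative == "greater":
        # Right tail
        return norm.sf(z_score)
    raise ValueError("alternative must be 'two-sided', 'less' or 'greater'")


# z-score reported in the slides for the Stack Overflow age example
z_score = 3.385911440783663
p_two_sided = z_test_p_value(z_score)
print(p_two_sided, p_two_sided <= 0.01)
```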

Two-sample proportion tests

Comparing two proportions
$H_0$: Proportion of hobbyist users is the same for those under thirty as for those at least thirty

$H_0: p_{\geq 30} - p_{<30} = 0$

$H_A$: Proportion of hobbyist users differs between those under thirty and those at least thirty

$H_A: p_{\geq 30} - p_{<30} \neq 0$

alpha = 0.05

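Calculating the z-score
z-score equation for a proportion test:

$$z = \frac{(\hat{p}_{\geq 30} - \hat{p}_{<30}) - 0}{\operatorname{SE}(\hat{p}_{\geq 30} - \hat{p}_{<30})}$$

Standard error equation:

$$\operatorname{SE}(\hat{p}_{\geq 30} - \hat{p}_{<30}) = \sqrt{\frac{\hat{p} \times (1 - \hat{p})}{n_{\geq 30}} + \frac{\hat{p} \times (1 - \hat{p})}{n_{<30}}}$$

$\hat{p}$ → weighted mean of $\hat{p}_{\geq 30}$ and $\hat{p}_{<30}$

$$\hat{p} = \frac{n_{\geq 30} \times \hat{p}_{\geq 30} + n_{<30} \times \hat{p}_{<30}}{n_{\geq 30} + n_{<30}}$$

Only require $\hat{p}_{\geq 30}$, $\hat{p}_{<30}$, $n_{\geq 30}$, $n_{<30}$ from the sample to calculate the z-score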


Getting the numbers for the z-score
p_hats = stack_overflow.groupby("age_cat")['hobbyist'].value_counts(normalize=True)

age_cat hobbyist
At least 30 Yes 0.773333
No 0.226667
Under 30 Yes 0.843105
No 0.156895
Name: hobbyist, dtype: float64

n = stack_overflow.groupby("age_cat")['hobbyist'].count()

age_cat
At least 30 1050
Under 30 1211
Name: hobbyist, dtype: int64

p_hat_at_least_30 = p_hats[("At least 30", "Yes")]

p_hat_under_30 = p_hats[("Under 30", "Yes")]
print(p_hat_at_least_30, p_hat_under_30)

0.773333 0.843105

n_at_least_30 = n["At least 30"]

n_under_30 = n["Under 30"]
print(n_at_least_30, n_under_30)

1050 1211

# Pooled estimate: weighted mean of the two sample proportions
p_hat = ((n_at_least_30 * p_hat_at_least_30 + n_under_30 * p_hat_under_30) /
         (n_at_least_30 + n_under_30))

# Standard error of the difference in proportions under H0
std_error = np.sqrt(p_hat * (1 - p_hat) / n_at_least_30 +
                    p_hat * (1 - p_hat) / n_under_30)

# z-score: observed difference divided by its standard error
z_score = (p_hat_at_least_30 - p_hat_under_30) / std_error

print(z_score)

-4.223718652693034
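
The same calculation can be written as a function of raw counts. This is a minimal sketch: the function name `two_sample_proportion_z` is hypothetical, and the inputs are the hobbyist counts the document reports for each age group (812 of 1050 for "At least 30", 1021 of 1211 for "Under 30").

```python
import math


def two_sample_proportion_z(successes_a: int, n_a: int,
                            successes_b: int, n_b: int) -> float:
    """z-score for H0: p_a - p_b = 0, using the pooled proportion in the SE."""
    p_hat_a = successes_a / n_a
    p_hat_b = successes_b / n_b
    # Pooled proportion: the weighted mean of the two sample proportions
    p_pooled = (successes_a + successes_b) / (n_a + n_b)
    std_error = math.sqrt(p_pooled * (1 - p_pooled) / n_a +
                          p_pooled * (1 - p_pooled) / n_b)
    return (p_hat_a - p_hat_b) / std_error


# "At least 30" group first, "Under 30" group second
z_from_counts = two_sample_proportion_z(812, 1050, 1021, 1211)
print(z_from_counts)
```

Working from counts avoids any rounding in the displayed proportions.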


Proportion tests using proportions_ztest()
stack_overflow.groupby("age_cat")['hobbyist'].value_counts()

age_cat hobbyist
At least 30 Yes 812
No 238
Under 30 Yes 1021
No 190
Name: hobbyist, dtype: int64

n_hobbyists = np.array([812, 1021])

n_rows = np.array([812 + 238, 1021 + 190])
from statsmodels.stats.proportion import proportions_ztest
z_score, p_value = proportions_ztest(count=n_hobbyists, nobs=n_rows,
alternative="two-sided")

(-4.223691463320559, 2.403330142685068e-05)

Chi-square test of independence

Revisiting the proportion test
age_by_hobbyist = stack_overflow.groupby("age_cat")['hobbyist'].value_counts()

age_cat hobbyist
At least 30 Yes 812
No 238
Under 30 Yes 1021
No 190
Name: hobbyist, dtype: int64

from statsmodels.stats.proportion import proportions_ztest

n_hobbyists = np.array([812, 1021])
n_rows = np.array([812 + 238, 1021 + 190])
stat, p_value = proportions_ztest(count=n_hobbyists, nobs=n_rows,
alternative="two-sided")

(-4.223691463320559, 2.403330142685068e-05)


Independence of variables
Previous hypothesis test result: evidence that hobbyist and age_cat are associated

Statistical independence - proportion of successes in the response variable is the same across all categories of the explanatory variable

Test for independence of variables
import pingouin
expected, observed, stats = pingouin.chi2_independence(data=stack_overflow, x='hobbyist',
y='age_cat', correction=False)
print(stats)

                 test     lambda       chi2  dof      pval    cramer     power
0             pearson   1.000000  17.839570  1.0  0.000024  0.088826  0.988205
1        cressie-read   0.666667  17.818114  1.0  0.000024  0.088773  0.988126
2      log-likelihood   0.000000  17.802653  1.0  0.000025  0.088734  0.988069
3       freeman-tukey  -0.500000  17.815060  1.0  0.000024  0.088765  0.988115
4  mod-log-likelihood  -1.000000  17.848099  1.0  0.000024  0.088848  0.988236
5              neyman  -2.000000  17.976656  1.0  0.000022  0.089167  0.988694

$$\chi^2 \text{ statistic} = 17.839570 = (-4.223691463320559)^2 = (z\text{-score})^2$$
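
The relationship between the chi-square statistic and the squared z-score can be checked numerically. This sketch swaps in `scipy.stats.chi2_contingency` instead of pingouin so that it needs only the counts shown in the slides; that substitution is this example's choice, not the slides' method.

```python
import numpy as np
from scipy.stats import chi2_contingency

# Rows: age_cat ("At least 30", "Under 30"); columns: hobbyist ("Yes", "No")
observed = np.array([[812, 238],
                     [1021, 190]])

# correction=False turns off Yates' continuity correction, matching the slides
result = chi2_contingency(observed, correction=False)
chi2_stat, p_value, dof = result[0], result[1], result[2]

# z-score reported by proportions_ztest in the slides
z_score = -4.223691463320559

print(chi2_stat, z_score ** 2, dof, p_value)
```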


Job satisfaction and age category
stack_overflow['age_cat'].value_counts()

Under 30       1211
At least 30    1050
Name: age_cat, dtype: int64

stack_overflow['job_sat'].value_counts()

Very satisfied           879
Slightly satisfied       680
Slightly dissatisfied    342
Neither                  201
Very dissatisfied        159
Name: job_sat, dtype: int64

Declaring the hypotheses
H0 : Age categories are independent of job satisfaction levels

HA : Age categories are not independent of job satisfaction levels

alpha = 0.1

Test statistic denoted $\chi^2$

Assuming independence, how far away are the observed results from the expected values?
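
As a sketch of what "expected values" means here: under independence, each expected cell count is (row total × column total) / grand total, and the Pearson statistic sums the squared differences scaled by the expected counts. The slides do not show the age-by-job-satisfaction cross-tabulation, so this example assumes the hobbyist-by-age counts from earlier in the document instead.

```python
import numpy as np

# Observed counts. Rows: age_cat ("At least 30", "Under 30");
# columns: hobbyist ("Yes", "No")
observed = np.array([[812, 238],
                     [1021, 190]], dtype=float)

row_totals = observed.sum(axis=1)
col_totals = observed.sum(axis=0)
n_total = observed.sum()

# Expected counts if the two variables were independent
expected = np.outer(row_totals, col_totals) / n_total

# Pearson chi-square statistic: distance of observed from expected
chi2_stat = ((observed - expected) ** 2 / expected).sum()

print(expected)
print(chi2_stat)
```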


Exploratory visualization: proportional stacked bar plot
props = stack_overflow.groupby('job_sat')['age_cat'].value_counts(normalize=True)
wide_props = props.unstack()
wide_props.plot(kind="bar", stacked=True)

[Figure: proportional stacked bar plot showing, for each job_sat level, the proportion of users in each age_cat]

Chi-square independence test
import pingouin
expected, observed, stats = pingouin.chi2_independence(data=stack_overflow, x="job_sat", y="age_cat")
print(stats)

                 test     lambda      chi2  dof      pval    cramer     power
0             pearson   1.000000  5.552373  4.0  0.235164  0.049555  0.437417
1        cressie-read   0.666667  5.554106  4.0  0.235014  0.049563  0.437545
2      log-likelihood   0.000000  5.558529  4.0  0.234632  0.049583  0.437871
3       freeman-tukey  -0.500000  5.562688  4.0  0.234274  0.049601  0.438178
4  mod-log-likelihood  -1.000000  5.567570  4.0  0.233854  0.049623  0.438538
5              neyman  -2.000000  5.579519  4.0  0.232828  0.049676  0.439419

Degrees of freedom:

(No. of response categories − 1) × (No. of explanatory categories − 1)

$$(2 - 1) \times (5 - 1) = 4$$
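
The degrees-of-freedom rule can be read straight off the shape of a contingency table. A small sketch follows; the helper name `independence_dof` is hypothetical, and the zero-filled 2 × 5 array is only a stand-in for the age-by-job-satisfaction table, whose cell counts the slides do not show.

```python
import numpy as np


def independence_dof(contingency_table) -> int:
    """Degrees of freedom for a chi-square test of independence."""
    n_rows, n_cols = np.asarray(contingency_table).shape
    return (n_rows - 1) * (n_cols - 1)


# Placeholder with the right shape: 2 age categories by 5 job satisfaction levels
age_by_job_sat_shape = np.zeros((2, 5))
dof_age_by_job_sat = independence_dof(age_by_job_sat_shape)

# Hobbyist-by-age counts from the slides: a 2 x 2 table
dof_age_by_hobbyist = independence_dof([[812, 238], [1021, 190]])

print(dof_age_by_job_sat, dof_age_by_hobbyist)
```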


Swapping the variables?
props = stack_overflow.groupby('age_cat')['job_sat'].value_counts(normalize=True)
wide_props = props.unstack()
wide_props.plot(kind="bar", stacked=True)

[Figure: proportional stacked bar plot showing, for each age_cat, the proportion of users at each job_sat level]

chi-square both ways
expected, observed, stats = pingouin.chi2_independence(data=stack_overflow, x="age_cat", y="job_sat")
print(stats[stats['test'] == 'pearson'])

      test  lambda      chi2  dof      pval    cramer     power
0  pearson     1.0  5.552373  4.0  0.235164  0.049555  0.437417

Ask: Are the variables X and Y independent?

Not: Is variable X independent from variable Y?
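
The symmetry can be demonstrated by transposing a contingency table, which is the same as swapping the roles of X and Y. This is an illustrative sketch: the 2 × 3 table of counts is invented for the example, and `scipy.stats.chi2_contingency` is used here in place of pingouin.

```python
import numpy as np
from scipy.stats import chi2_contingency

# Hypothetical counts, invented purely for illustration
table = np.array([[30, 45, 25],
                  [35, 40, 45]])

# X in rows, Y in columns
result_xy = chi2_contingency(table)
# Swapped: Y in rows, X in columns
result_yx = chi2_contingency(table.T)

stat_xy, p_xy = result_xy[0], result_xy[1]
stat_yx, p_yx = result_yx[0], result_yx[1]

print(stat_xy, stat_yx)
print(p_xy, p_yx)
```

Both orientations give the same statistic and p-value, which is why the question is phrased symmetrically.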


What about direction and tails?
Squared differences between observed and expected counts must be non-negative

Chi-square tests are almost always right-tailed [1]

[1] Left-tailed chi-square tests are used in statistical forensics to detect if a fit is suspiciously good because the data was fabricated. Chi-square tests of variance can be two-tailed. These are niche uses, though.
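
A right-tailed p-value is the area of the chi-square distribution above the observed statistic. The sketch below recomputes the Pearson p-value from the age-by-job-satisfaction test using the statistic and degrees of freedom reported in the slides; the variable names are this example's own.

```python
from scipy.stats import chi2

# Pearson statistic and degrees of freedom reported in the slides
chi2_stat = 5.552373
dof = 4

# Right tail: probability of a statistic at least this large under H0
p_right = chi2.sf(chi2_stat, df=dof)

# Left tail, shown only for contrast (the niche "suspiciously good fit" case)
p_left = chi2.cdf(chi2_stat, df=dof)

print(p_right, p_left)
```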

Chi-square goodness of fit tests

Purple links
How do you feel when you discover that you've already visited the top resource?

purple_link_counts = stack_overflow['purple_link'].value_counts()

purple_link_counts = purple_link_counts.rename_axis('purple_link')\
.reset_index(name='n')\
.sort_values('purple_link')

         purple_link     n
2             Amused   368
3            Annoyed   263
0  Hello, old friend  1225
1        Indifferent   405


Declaring the hypotheses
import pandas as pd

hypothesized = pd.DataFrame({
    'purple_link': ['Amused', 'Annoyed', 'Hello, old friend', 'Indifferent'],
    'prop': [1/6, 1/6, 1/2, 1/6]})

         purple_link      prop
0             Amused  0.166667
1            Annoyed  0.166667
2  Hello, old friend  0.500000
3        Indifferent  0.166667

$H_0$: The sample matches the hypothesized distribution

$H_A$: The sample does not match the hypothesized distribution

$\chi^2$ measures how far observed results are from expectations in each group

alpha = 0.01

Hypothesized counts by category
n_total = len(stack_overflow)
hypothesized["n"] = hypothesized["prop"] * n_total

         purple_link      prop            n
0             Amused  0.166667   376.833333
1            Annoyed  0.166667   376.833333
2  Hello, old friend  0.500000  1130.500000
3        Indifferent  0.166667   376.833333

Visualizing counts
import matplotlib.pyplot as plt

plt.bar(purple_link_counts['purple_link'], purple_link_counts['n'],
color='red', label='Observed')
plt.bar(hypothesized['purple_link'], hypothesized['n'], alpha=0.5,
color='blue', label='Hypothesized')

plt.legend()
plt.show()

[Figure: overlaid bar plot of observed (red) and hypothesized (blue) counts for each purple_link category]

chi-square goodness of fit test
print(hypothesized)

         purple_link      prop            n
0             Amused  0.166667   376.833333
1            Annoyed  0.166667   376.833333
2  Hello, old friend  0.500000  1130.500000
3        Indifferent  0.166667   376.833333

from scipy.stats import chisquare

chisquare(f_obs=purple_link_counts['n'], f_exp=hypothesized['n'])

Power_divergenceResult(statistic=44.59840778416629, pvalue=1.1261810719413759e-09)
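
For intuition, the statistic returned by `chisquare()` can be reproduced by hand from the observed counts and hypothesized proportions shown above. This is a sketch of the standard Pearson formula with k − 1 degrees of freedom; the variable names are this example's own.

```python
import numpy as np
from scipy.stats import chi2

# Observed counts, ordered: Amused, Annoyed, Hello old friend, Indifferent
observed = np.array([368, 263, 1225, 405])
# Hypothesized proportions in the same order
props = np.array([1 / 6, 1 / 6, 1 / 2, 1 / 6])

# Expected counts under H0
expected = props * observed.sum()

# Pearson statistic: sum of (observed - expected)^2 / expected
stat = ((observed - expected) ** 2 / expected).sum()

# Goodness of fit with k categories has k - 1 degrees of freedom
dof = len(observed) - 1
p_value = chi2.sf(stat, df=dof)

print(stat, p_value)
```

The p-value is far below alpha = 0.01, so the sample does not match the hypothesized distribution.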
