Common Statistics


•A one-sample z-test is used to compare a sample mean with a hypothesized population mean when the population standard deviation

is known

•A one-sample t-test is used to compare a sample mean with a hypothesized population mean when the population standard deviation is
unknown

•A two-sample independent z-test is used to compare the sample means from two independent populations when the population standard
deviations are known

•A two-sample independent t-test is used to compare the sample means from two independent populations when the population standard
deviations are unknown

•A paired t-test is used to compare the sample means from two related (dependent) populations

•An ANOVA test is used to compare the sample means from two or more independent populations

•A one-sample proportion z-test is used to compare a sample proportion with a population proportion

•A two-sample proportion z-test is used to compare the sample proportions from two independent populations

•A chi-square test for variance is used to compare a sample variance with a population variance

•A chi-square test of independence is used to check the dependence (relationship) between two categorical variables

•An F-test of equality of variances is used to compare the sample variances from two populations
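
For example, here is a minimal sketch (with made-up numbers, not taken from any dataset above) of how one of these tests, the two-sample independent t-test, is run in Python with scipy.stats:

import numpy as np
from scipy import stats

group_a = np.array([12.1, 11.8, 12.4, 12.0, 11.9, 12.3])  # hypothetical sample 1
group_b = np.array([11.5, 11.7, 11.4, 11.9, 11.6, 11.3])  # hypothetical sample 2

# Two-sample independent t-test of H0: the two population means are equal
t_stat, p_value = stats.ttest_ind(group_a, group_b, alternative='two-sided')
print(t_stat, p_value)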

CLT (Central Limit Theorem) – for the sampling distribution of the sample mean to be approximately normal, the sample size should generally be more than 30
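
A minimal simulation sketch (hypothetical, using numpy) illustrating the CLT: means of size-30 samples drawn from a clearly non-normal (exponential) population are approximately normally distributed around the population mean.

import numpy as np

rng = np.random.default_rng(1)
population = rng.exponential(scale=2.0, size=100000)   # a skewed, non-normal population
sample_means = [rng.choice(population, size=30).mean() for _ in range(5000)]
# The mean of the sample means is close to the population mean, and their spread
# is close to the population standard deviation divided by sqrt(30)
print(np.mean(sample_means), np.std(sample_means))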


Since the p-value is less than 0.05 (level of significance), there is enough statistical evidence to reject the null hypothesis. So, the
jeweler will reject the null hypothesis. 

As the p-value (0.00086) is less than the level of significance (0.05), there is enough statistical evidence to reject the null hypothesis.
Thus, there is enough statistical evidence to conclude that the mean post-weight is less than the mean pre-weight.

As the calculated p-value (0.079) is greater than the level of significance (0.05), there is not enough statistical evidence to reject the
null hypothesis. Hence, there is not enough statistical evidence to say that the proportion of orders mailed within 72 hours after
they are received is smaller than 90%.

As the p-value is greater than the level of significance (0.05), we do not have enough evidence to reject the null hypothesis. Hence,
we do not have enough statistical evidence to say that the energy expenditure of obese and lean subjects is different.

As the p-value is greater than the level of significance (0.05), we do not have enough statistical evidence to reject the null
hypothesis. Thus, there is not enough statistical evidence to say that the scores of the two groups of students are different.
The null hypothesis always contains some form of equality (=, ≤, or ≥) and the alternative hypothesis never contains
equality (≠, <, or >)

So, the valid null and alternative hypotheses are:

 If the p-value is less than the level of significance, we have enough statistical evidence to reject the null hypothesis.
 If the p-value is greater than the level of significance, we do not have enough statistical evidence to reject the null
hypothesis. Hence, we fail to reject the null hypothesis.
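
As a sketch, this decision rule can be written in code as follows (the p-value shown is hypothetical):

alpha = 0.05      # level of significance
p_value = 0.0086  # hypothetical p-value from some test
if p_value < alpha:
    print('Reject the null hypothesis')
else:
    print('Fail to reject the null hypothesis')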
scipy.stats.ttest_ind
scipy.stats.ttest_ind(a, b, axis=0, equal_var=True, nan_policy='propagate', permutations=None, random_state=None, alternative='two-sided', trim=0)
Calculate the T-test for the means of two independent samples of scores.

This is a test for the null hypothesis that 2 independent samples have identical average (expected) values. This test
assumes that the populations have identical variances by default.

Parameters
a, b : array_like
The arrays must have the same shape, except in the dimension corresponding to axis (the first, by default).
axis : int or None, optional
Axis along which to compute test. If None, compute over the whole arrays, a, and b.
equal_var : bool, optional
If True (default), perform a standard independent 2 sample test that assumes equal population variances [1]. If
False, perform Welch’s t-test, which does not assume equal population variance [2].
New in version 0.11.0.
nan_policy : {‘propagate’, ‘raise’, ‘omit’}, optional
Defines how to handle when input contains nan. The following options are available (default is ‘propagate’):
 ‘propagate’: returns nan
 ‘raise’: throws an error
 ‘omit’: performs the calculations ignoring nan values

The ‘omit’ option is not currently available for permutation tests or one-sided asymptotic tests.
permutations : non-negative int, np.inf, or None (default), optional
If 0 or None (default), use the t-distribution to calculate p-values. Otherwise, permutations is the number of
random permutations that will be used to estimate p-values using a permutation test. If permutations equals or
exceeds the number of distinct partitions of the pooled data, an exact test is performed instead (i.e. each distinct
partition is used exactly once). See Notes for details.
New in version 1.7.0.
random_state : {None, int, numpy.random.Generator, numpy.random.RandomState}, optional
If seed is None (or np.random), the numpy.random.RandomState singleton is used. If seed is an int, a
new RandomState instance is used, seeded with seed. If seed is already a Generator or RandomState instance
then that instance is used.
Pseudorandom number generator state used to generate permutations (used only when permutations is not
None).
New in version 1.7.0.
alternative : {‘two-sided’, ‘less’, ‘greater’}, optional
Defines the alternative hypothesis. The following options are available (default is ‘two-sided’):
 ‘two-sided’: the means of the distributions underlying the samples are unequal.
 ‘less’: the mean of the distribution underlying the first sample is less than the mean of the distribution
underlying the second sample.
 ‘greater’: the mean of the distribution underlying the first sample is greater than the mean of the
distribution underlying the second sample.
New in version 1.6.0.
trim : float, optional
If nonzero, performs a trimmed (Yuen’s) t-test. Defines the fraction of elements to be trimmed from each end of
the input samples. If 0 (default), no elements will be trimmed from either side. The number of trimmed elements
from each tail is the floor of the trim times the number of elements. Valid range is [0, .5).
New in version 1.7.
Returns
statistic : float or array
The calculated t-statistic.
pvalue : float or array
The p-value.
Notes

Suppose we observe two independent samples, e.g. flower petal lengths, and we are considering whether the two
samples were drawn from the same population (e.g. the same species of flower or two species with similar petal
characteristics) or two different populations.

The t-test quantifies the difference between the arithmetic means of the two samples. The p-value quantifies the
probability of observing as or more extreme values assuming the null hypothesis, that the samples are drawn from
populations with the same population means, is true. A p-value larger than a chosen threshold (e.g. 5% or 1%) indicates
that our observation is not so unlikely to have occurred by chance. Therefore, we do not reject the null hypothesis of
equal population means. If the p-value is smaller than our threshold, then we have evidence against the null hypothesis
of equal population means.

By default, the p-value is determined by comparing the t-statistic of the observed data against a theoretical t-distribution.
When 1 < permutations < binom(n, k), where

 k is the number of observations in a,


 n is the total number of observations in a and b, and
 binom(n, k) is the binomial coefficient (n choose k),
the data are pooled (concatenated), randomly assigned to either group a or b, and the t-statistic is calculated. This
process is performed repeatedly (permutations times), generating a distribution of the t-statistic under the null hypothesis,
and the t-statistic of the observed data is compared to this distribution to determine the p-value.
When permutations >= binom(n, k), an exact test is performed: the data are partitioned between the groups in
each distinct way exactly once.

The permutation test can be computationally expensive and not necessarily more accurate than the analytical test, but it
does not make strong assumptions about the shape of the underlying distribution.
Use of trimming is commonly referred to as the trimmed t-test. At times called Yuen’s t-test, this is an extension of
Welch’s t-test, with the difference being the use of winsorized means in calculation of the variance and the trimmed
sample size in calculation of the statistic. Trimming is recommended if the underlying distribution is long-tailed or
contaminated with outliers [4].
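
A minimal usage sketch for ttest_ind (with hypothetical petal-length data, assuming SciPy >= 1.6 for the alternative argument):

import numpy as np
from scipy import stats

sample_1 = np.array([1.4, 1.5, 1.3, 1.6, 1.4, 1.5])
sample_2 = np.array([1.7, 1.8, 1.6, 1.9, 1.7, 1.8])

# Welch's t-test (equal population variances not assumed)
result = stats.ttest_ind(sample_1, sample_2, equal_var=False, alternative='two-sided')
print(result.statistic, result.pvalue)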
scipy.stats.ttest_rel
scipy.stats.ttest_rel(a, b, axis=0, nan_policy='propagate', alternative='two-sided')
Calculate the t-test on TWO RELATED samples of scores, a and b.
This is a test for the null hypothesis that two related or repeated samples have identical average (expected)
values.

Parameters
a, b : array_like
The arrays must have the same shape.
axis : int or None, optional
Axis along which to compute test. If None, compute over the whole arrays, a, and b.
nan_policy : {‘propagate’, ‘raise’, ‘omit’}, optional
Defines how to handle when input contains nan. The following options are available (default is ‘propagate’):
 ‘propagate’: returns nan
 ‘raise’: throws an error
 ‘omit’: performs the calculations ignoring nan values

alternative : {‘two-sided’, ‘less’, ‘greater’}, optional


Defines the alternative hypothesis. The following options are available (default is ‘two-sided’):
 ‘two-sided’: the means of the distributions underlying the samples are unequal.
 ‘less’: the mean of the distribution underlying the first sample is less than the mean of the distribution
underlying the second sample.
 ‘greater’: the mean of the distribution underlying the first sample is greater than the mean of the
distribution underlying the second sample.
New in version 1.6.0.
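
A minimal usage sketch for ttest_rel (with hypothetical pre/post weights and a one-tailed alternative that the post mean is less than the pre mean):

import numpy as np
from scipy import stats

pre = np.array([85.2, 90.1, 78.5, 88.0, 92.3])
post = np.array([83.9, 88.7, 77.8, 86.5, 91.0])

# Paired t-test of H0: mean(post) >= mean(pre) against Ha: mean(post) < mean(pre)
t_stat, p_value = stats.ttest_rel(post, pre, alternative='less')
print(t_stat, p_value)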
Key Differences Between Null and Alternative Hypothesis

The important points of differences between null and alternative hypothesis are explained as under:

1. A null hypothesis is a statement in which there is no relationship between two variables. An alternative
hypothesis is a statement that is simply the inverse of the null hypothesis, i.e. there is some statistical
significance between two measured phenomena.
2. A null hypothesis is what the researcher tries to disprove, whereas an alternative hypothesis is what the
researcher wants to prove.
3. A null hypothesis represents no observed effect, whereas an alternative hypothesis reflects some observed
effect.
4. If the null hypothesis is accepted, no changes will be made in the opinions or actions. Conversely, if the
alternative hypothesis is accepted, it will result in changes in the opinions or actions.
5. As the null hypothesis refers to a population parameter, the testing is indirect and implicit. On the other hand, the
alternative hypothesis refers to a sample statistic, wherein the testing is direct and explicit.
6. A null hypothesis is labelled as H0 (H-zero) while an alternative hypothesis is represented by H1 (H-one).
7. The mathematical formulation of a null hypothesis contains an equality sign, whereas that of an alternative
hypothesis contains an inequality (not-equal-to) sign.
8. Under the null hypothesis, the observations are the outcome of chance, whereas under the alternative
hypothesis, the observations are the outcome of a real effect.
The one-sample proportions z-test is used to compare a sample proportion with a population proportion. The following
code has been provided:

from statsmodels.stats.proportion import proportions_ztest


proportions_ztest(count, nobs, value = 0.7, alternative='two-sided')

In the above line of code,

 proportions_ztest() is a function in statsmodels.stats.proportion that performs a one-sample proportions z-test.


 count is the number of successes out of the total number of observations in the sample
 nobs is the total number of observations in the sample
 value = 0.7 is a parameter that indicates that the hypothesized population proportion is 0.7
 alternative = 'two-sided' is an argument used to specify the tail of the test. This argument depends on the formulated
alternative hypothesis for the test.
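
A complete sketch with hypothetical numbers (150 successes out of 200 observations, tested against a hypothesized population proportion of 0.7):

from statsmodels.stats.proportion import proportions_ztest

count = 150   # number of successes in the sample
nobs = 200    # total number of observations in the sample
z_stat, p_value = proportions_ztest(count, nobs, value=0.7, alternative='two-sided')
print(z_stat, p_value)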
Python writes scientific notation using 'e'. For example, 7.84 x 10^-5 is displayed as 7.84e-05. So the value 7.84e-05, i.e. 7.84 x 10^-5, is
less than 0.05.
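
This can be checked directly in Python:

p_value = 7.84e-05        # 7.84 x 10^-5
print(p_value < 0.05)     # True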
Here is a list of the important Python functions for conducting different types of hypothesis tests from the
scipy.stats library and the statsmodels library (a brief usage sketch follows each list):
Tests from scipy.stats library (Alias: stats)

1. ttest_1samp(): Test to compare a sample mean with a population mean when the population standard deviation is
unknown.
2. ttest_ind(): Test to compare two sample means from two independent populations when the population standard
deviations are unknown.
3. ttest_rel(): Test to compare two sample means from two related (dependent) populations.
4. chi2_contingency(): Test to check the dependence (relationship) between two categorical variables.
5. shapiro(): Test to determine whether a sample has been drawn from a normal population.
6. levene(): Test to determine whether several samples have been drawn from populations with equal variances.
7. f_oneway(): Test to compare the sample means from several populations.
Note: In 'scipy.stats' test functions, the 'alternative' argument is used to define the alternative hypothesis. The following
options are available (the default is 'two-sided’):
 ‘two-sided’: to perform the test for a two-tailed alternative hypothesis (containing ≠ sign)
 ‘less’: to perform the test for a one-tailed alternative hypothesis (containing < sign)
 ‘greater’: to perform the test for a one-tailed alternative hypothesis (containing > sign)
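
A minimal sketch (with hypothetical group scores) of a typical workflow combining these functions: check normality with shapiro(), check equal variances with levene(), then compare the group means with f_oneway():

import numpy as np
from scipy import stats

g1 = np.array([23, 25, 21, 24, 26])   # hypothetical scores for group 1
g2 = np.array([28, 27, 29, 30, 26])   # hypothetical scores for group 2
g3 = np.array([22, 20, 24, 23, 21])   # hypothetical scores for group 3

print(stats.shapiro(g1).pvalue, stats.shapiro(g2).pvalue, stats.shapiro(g3).pvalue)  # H0: normality
print(stats.levene(g1, g2, g3).pvalue)     # H0: equal variances
print(stats.f_oneway(g1, g2, g3).pvalue)   # H0: equal means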

 
Tests from Statsmodels library 

1. statsmodels.stats.proportion.proportions_ztest(): Test to compare proportions based on normal (z) test.


2. statsmodels.stats.multicomp.pairwise_tukeyhsd(): Test to conduct pairwise comparisons for several sample means.
Note: In the test functions of the statsmodels library, the 'alternative' argument is used to define the alternative hypothesis.
The 'alternative' argument can take any of the possible values: ‘two-sided’, ‘smaller’, ‘larger’.
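
A minimal usage sketch for pairwise_tukeyhsd (with hypothetical scores and group labels):

import numpy as np
from statsmodels.stats.multicomp import pairwise_tukeyhsd

scores = np.array([23, 25, 21, 24, 28, 27, 29, 30, 22, 20, 24, 23])
groups = np.array(['A'] * 4 + ['B'] * 4 + ['C'] * 4)

# Pairwise comparisons of the group means at a 5% family-wise error rate
print(pairwise_tukeyhsd(endog=scores, groups=groups, alpha=0.05))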

Libraries Used in ANOVA Monograph


import pandas as pd
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt

import statsmodels.api as sm
from statsmodels.formula.api import ols
from statsmodels.graphics.gofplots import ProbPlot
from statsmodels.graphics.factorplots import interaction_plot
from statsmodels.stats.multicomp import (pairwise_tukeyhsd, MultiComparison)
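
A minimal sketch (with hypothetical data, and assuming the imports above have been run) of how these libraries are typically combined for a one-way ANOVA:

df = pd.DataFrame({'group': ['A'] * 4 + ['B'] * 4 + ['C'] * 4,
                   'score': [23, 25, 21, 24, 28, 27, 29, 30, 22, 20, 24, 23]})

model = ols('score ~ C(group)', data=df).fit()   # fit the one-way ANOVA model
print(sm.stats.anova_lm(model, typ=1))           # ANOVA table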
