HYPOTHESIS TESTING
+ Hypothesis testing is a statistical method used to evaluate whether an observed data sample is consistent
with a null hypothesis. The null hypothesis is typically a statement that assumes there is no
difference or relationship between two or more variables, while the alternative hypothesis is the opposite,
suggesting there is a difference or relationship.
+ To perform hypothesis testing, a statistical test is conducted using the data to calculate a test statistic, such
as a z-score or t-score, from which a p-value is derived; the p-value measures the probability of observing data
at least as extreme as the sample if the null hypothesis is true. If the p-value is less than a predetermined
significance level, typically 0.05, then the null hypothesis is rejected, and the alternative hypothesis is supported.
+ NOTE:
- the significance level (alpha) is usually 0.05 or 0.01
- if p-value > significance level, fail to reject the null hypothesis; otherwise reject it
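+ A minimal sketch of this decision rule (the p-value here is hypothetical):
In [ ]:
# comparing a p-value against the significance level
alpha = 0.05       # significance level
p_value = 0.03     # hypothetical p-value from some test
if p_value < alpha:
    print("Reject the null hypothesis")
else:
    print("Fail to reject the null hypothesis")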
In [3]:
import numpy as np
import scipy.stats
from scipy.stats import t
Confidence Interval
+ A confidence interval is a range of values that is likely to contain the true value of a population parameter with
a certain degree of confidence.
+ The formula for calculating a confidence interval for a population mean is:
  CI = X ± Z * (s / sqrt(n))
  where:
- CI is the confidence interval
- X is the sample mean
- Z is the z-score for the desired level of confidence (e.g., 1.96 for a 95% confidence interval)
- s is the sample standard deviation
- n is the sample size
In [ ]:
# taking an example array:
arr = np.array([78, 65, 20, 69, 36, 81, 85, 71, 44, 71, 47, 37, 89, 25, 73, 55, 80, 52, 46])  # trailing values may be missing from the original
In [ ]:
# using the formula:
X = np.mean(arr)
Z = 1.96
s = np.std(arr)
n = len(arr)
CI1 = X - Z * (s / np.sqrt(n))
CI2 = X + Z * (s / np.sqrt(n))
print(CI1, CI2)
51.24255838509275 68.85744161490724
In [ ]:
# using scipy library:
se = scipy.stats.sem(arr)   # calculating standard error
# getting 95% confidence interval
tval = t.interval(confidence=0.95, df=n-1, loc=X, scale=se)
tval
Out[10]:
(58.400470937660145, 69.69952906233985)
Shapiro-Wilk Test
+ To check whether a sample follows a normal distribution, we use the Shapiro-Wilk test: if the p-value > 0.05, the data can
be assumed to be normally distributed; otherwise not.
In [ ]:
# using built-in module:
from scipy.stats import shapiro
shapiro(arr)
Out[11]:
ShapiroResult(statistic=0.9377500414848328, pvalue=0.21734236180782318)
Here the p-value is more than 0.05, thus our array can be assumed to be normally distributed.
Z-test (when population standard deviation is available)
+ A z-test is a statistical test that compares a sample mean to a known population mean when the population
standard deviation is known. It is used to determine whether the difference between the sample mean and
the population mean is statistically significant.
+ The formula for calculating the z-score is:
  z = (X - mu) / (sigma / sqrt(n))
  where:
- X is the sample mean
- mu is the population mean
- sigma is the population standard deviation
- n is the sample size
- If the absolute value of the z-score is greater than the critical value for a given level of significance (e.g., 1.96
for a 95% confidence level), then we can reject the null hypothesis and conclude that the sample mean is
statistically significantly different from the population mean.
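+ The critical value itself comes from the standard normal distribution; a minimal sketch:
In [ ]:
# two-tailed critical value for a given significance level
from scipy.stats import norm
alpha = 0.05
z_crit = norm.ppf(1 - alpha / 2)   # ~1.96 for alpha = 0.05
print(z_crit)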
In [ ]:
#importing ztest:
from statsmodels.stats.weightstats import ztest
In [ ]:
# creating a population array
pop = np.random.uniform(0, 50, size=50)
pop
Out[24]:
array([32.74604246, 33.56518868, 43.50795015, 45.50614668, 38.8306301 ,
       22.20365074, 47.50829914, 23.27547057, 37.47812872, 41.60528632,
       44.85228347, 16.49288566, 40.3834803 , 30.52390039, 44.81210912,
       33.67518587, 12.79291627,  6.23261258, 49.39945951, 32.95685744,
        0.65689254, 49.10668004, 37.58585442,  7.73867263, 36.82612653,
       33.86925638,  9.45873698, 28.94881697, 15.21290017, 24.38346338,
       15.90480788, 44.48180982,  2.44119756, 26.67762388,  7.78469611,
       32.67353678, 26.50769285,  4.56499116,  3.38449021, 38.49501697,
       29.11635786, 37.25719946, 44.90976652, 32.15207807, 35.98638424,
       28.82627884,  7.77957377, 31.84726721, 14.87935178, 12.12519745])
In [ ]:
#randomly selecting sample from pop array:
Sample = np.random.choice(pop, size=30)
Sample
Out[25]:
array([43.50795015, 44.90976652, 35.98638424, 23.27547057,  7.77957377,
       24.38346338, 29.11635786, 22.20365074, 38.49501697, 31.84726721,
        7.77957377, 32.67353678,  6.23261258,  9.45873698,  7.77957377,
       30.52390039, 44.81210912, 49.10668004,  4.56499116, 44.48180982,
       29.11635786, 40.3834803 ,  7.73867263, 45.50614668, 15.90480788,
       28.94881697, 37.25719946, 43.50795015, 33.56518868,  6.23261258])
In [ ]:
# using the formula:
X = np.mean(Sample)
mu = np.mean(pop)
s = np.std(pop)   # population standard deviation
n = len(Sample)
z = (X - mu) / (s / np.sqrt(n))
z
Out[28]:
-0.16701890401142058
In [ ]:
# getting z-score and p-value using the statsmodels ztest
zsc, pval = ztest(Sample, value=mu)
print(zsc, pval)
-0.16015770562560863 0.872756845272247
1-sample t-test
+ A one-sample t-test is used to determine whether a sample mean is significantly different from a
hypothesized population mean when the population standard deviation is not known. It compares the mean
of a sample to a known value or hypothesized population mean.
+ Formula:
  t = (X - mu) / (s / sqrt(n))
  where:
- X is the sample mean
- mu is the hypothesized population mean
- s is the sample standard deviation
- n is the sample size
In [ ]:
# creating a population array
pop = np.random.uniform(0, 50, size=50)
pop
Out[31]:
array([ 5.18036289,  1.11277454,  0.96467437, ...,
       39.99725439, 19.82553468, 30.18148773])
In [ ]:
#randomly selecting sample from pop array:
Sample = np.random.choice(pop, size=30)
Sample
Out[32]:
array([19.82553468,  3.0918093 , 31.36255499, ...,
       44.82387921, 29.34713621, 49.82584427])
In [ ]:
x = np.mean(Sample)
mu = np.mean(pop)
s = np.std(Sample)   # note: np.std defaults to ddof=0, while scipy's t-test uses the sample std (ddof=1)
n = len(Sample)
In [ ]:
# using the formula:
t = (x - mu) / (s / np.sqrt(n))
t
Out[38]:
2.0796521700566624
In [ ]:
# importing ttest_1samp
from scipy.stats import ttest_1samp
In [ ]:
# getting t stats
ttest_1samp(Sample, mu)
Out[37]:
TtestResult(statistic=2.0446975432748005, pvalue=0.05005580140504867, df=29)
In [ ]:
# in case of a one-tailed test, the one-tailed p-value is the reported two-sided p-value divided by 2
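+ A minimal sketch of this, assuming scipy >= 1.6 where ttest_1samp accepts an alternative= argument:
In [ ]:
# one-tailed p-value: halve the reported two-sided p-value, or pass alternative=
res_two = ttest_1samp(Sample, mu)
res_one = ttest_1samp(Sample, mu, alternative='greater')
print(res_two.pvalue / 2, res_one.pvalue)   # equal when the statistic is positive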
Independent 2-sample t-test
+ An independent two-sample t-test is used to determine whether the means of two independent groups are
significantly different from each other. The two groups should be independent of each other, meaning that
the observations in one group are not related to the observations in the other group.
In [ ]:
# creating two arrays:
arr1 = np.array([5, 3, 3, 5, 5, 6, 6, 7, 3, 8])
arr2 = np.array([6, 7, 9, 3, 6, 5, 4, 2, 6, 8])
In [ ]:
# checking normality using shapiro
s1 = shapiro(arr1)
s2 = shapiro(arr2)
print(s1)
print(s2)
ShapiroResult(statistic=0.910980761051178, pvalue=0.28779593110084534)
ShapiroResult(statistic=..., pvalue=...)
In [ ]:
# checking if the variances are equal using the Levene test: if p-value > 0.05, the variances are equal
from scipy.stats import levene
lev = levene(arr1, arr2)
lev
Out[41]:
LeveneResult(statistic=0.2842105263157895, pvalue=0.680475751924876)
In [ ]:
# performing the t-test for 2 samples:
from scipy.stats import ttest_ind
ttest_ind(arr1, arr2, equal_var=True)
Out[42]:
Ttest_indResult(statistic=-0.5698028822981898, pvalue=...)
+ Since the p-value is greater than 0.05, we fail to reject the null hypothesis and conclude that there is
insufficient evidence to suggest that the means of the two samples are different at the 95% confidence
level.
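+ Had the Levene test above rejected equal variances, Welch's t-test (equal_var=False) would be the usual
choice; a minimal sketch:
In [ ]:
# Welch's t-test: does not assume equal variances
ttest_ind(arr1, arr2, equal_var=False)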
2-sample dependent t-test (Paired t-test)
+ A paired t-test is used to determine whether there is a significant difference between the means of two
related groups.
In [ ]:
# Importing Library
import scipy.stats as stats
# pre holds the mileage before
# applying the different engine oil
pre = [30, 31, 34, 40, 36, 35,
       34, 30, 28, 29]
# post holds the mileage after
# applying the different engine oil
post = [30, 31, 32, 38, 32, 31,
32, 29, 28, 30]
# Performing the paired sample t-test
stats.ttest_rel(pre, post)
Out[43]:
TtestResult(statistic=2.584921310565987, pvalue=0.029457853822895275, df=9)
+ Since the p-value is less than 0.05, we reject the null hypothesis and conclude that there is sufficient
evidence to suggest that the means of the two samples are different at the 95% confidence level.
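+ A paired t-test is equivalent to a one-sample t-test on the pairwise differences; a minimal sketch verifying
this:
In [ ]:
# the paired t-test equals a one-sample t-test of pre - post against 0
diff = np.array(pre) - np.array(post)
ttest_1samp(diff, 0)   # same statistic and p-value as ttest_rel(pre, post)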
F-test
+ An F-test is a statistical hypothesis test that is used to compare the variances of two samples or the ratio of
variances between two populations. The F-test is based on the F-distribution, which is a continuous
probability distribution that arises in the analysis of variance (ANOVA) and regression.
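+ The variance-ratio form of the F-test described above can be computed by hand (the next cell instead uses
f_oneway, which performs one-way ANOVA); a minimal sketch, reusing arr1 and arr2 from earlier:
In [ ]:
# variance-ratio F-test: F = s1^2 / s2^2 with (n1-1, n2-1) degrees of freedom
from scipy.stats import f
F = np.var(arr1, ddof=1) / np.var(arr2, ddof=1)
dfn, dfd = len(arr1) - 1, len(arr2) - 1
p = 2 * min(f.cdf(F, dfn, dfd), f.sf(F, dfn, dfd))   # two-sided p-value
print(F, p)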
In [ ]:
from scipy.stats import f_oneway
# create three samples
sample1 = np.array([1, 2, 3, 4, 5])
sample2 = np.array([2, 4, 6, 8, 10])
sample3 = np.array([3, 6, 9, 12, 15])
# perform F-test
f_statistic, p_value = f_oneway(sample1, sample2, sample3)
# print results
print("F-statistic:", f_statistic)
print("p-value:", p_value)
F-statistic: 3.857142857142857
p-value: 0.05086290933139865
+ If the p-value is smaller than the chosen significance level, it means that the observed variability between
groups is larger than would be expected by chance, and the null hypothesis of equal means across all groups
is rejected. (Here the p-value is just above 0.05, so we narrowly fail to reject at the 5% level.)
ANOVA test
+ ANOVA stands for Analysis of Variance. It is a statistical method used to compare means of two or more
groups to determine whether there is a significant difference between them.
+ The F-test is used in ANOVA to test the null hypothesis that the means of all groups are equal.
In [ ]:
import scipy.stats as stats
# Generate some sample data
group1 = [1, 2, 3, 4, 5]
group2 = [2, 4, 6, 8, 10]
group3 = [3, 6, 9, 12, 15]
# Perform one-way ANOVA
f_statistic, p_value = stats.f_oneway(group1, group2, group3)
# print the results
print("F-statistic:", f_statistic)
print("p-value:", p_value)
F-statistic: 3.857142857142857
p-value: 0.05086290933139865
+ If the p-value is smaller than the chosen significance level, it means that the observed variability between
groups is larger than would be expected by chance, and the null hypothesis of equal means across all groups
is rejected at the chosen level of significance.
Chi-Square test
+ The Chi-Square test is a statistical test used to determine if there is a significant association between two
categorical variables. It is a non-parametric test, meaning that it does not make any assumptions about the
distribution of the data.
+ The formula for the Chi-Square test statistic is:
  χ² = Σ [(Oi - Ei)² / Ei]
  where χ² is the test statistic, Oi is the observed frequency for category i, Ei is the expected frequency
  for category i (based on the assumption of independence), and the summation is taken over all
  categories.
In [11]:
import pandas as pd
# Creating a dataframe with 2 categorical features
data = pd.DataFrame({'Study': ['Yes', 'No', 'Yes', 'Yes', 'No', 'No', 'Yes', 'No', 'No'],
                     'Result': ['Good', 'Bad', 'Bad', 'Good', 'Good', 'Bad', 'Good', 'Bad', 'Good']})
data
Out[11]:
  Study Result
0   Yes   Good
1    No    Bad
2   Yes    Bad
3   Yes   Good
4    No   Good
5    No    Bad
6   Yes   Good
7    No    Bad
8    No   Good
In [13]:
# creating a crosstab table:
data_table = pd.crosstab(data['Study'], data['Result'])
data_table
Out[13]:
Result  Bad  Good
Study
No        3     2
Yes       1     3
In [14]:
import scipy.stats as stats
observed = data_table.values
# Perform Chi-Square test
chi2_statistic, p_value, degrees_of_freedom, expected = stats.chi2_contingency(observed)
# Print the results
print("Chi-Square statistic:
print("p-value:", p_value)
print("Degrees of freedom:", degrees_of_freedom)
print("Expected frequencies:", expected)
» chi2_statistic)
Chi-Square statistic: @.1496249999999999
p-value: @.7076604666545525,
Degrees of freedom: 1
Expected frequencies: [[2.22222222 2.77777778]
[1.77777778 2.22222222]]
+ If the p-value is smaller than the chosen significance level, it means that the observed data deviates more
from the expected data than would be expected by chance, and the null hypothesis of independence
between the two categorical variables is rejected at the chosen level of significance.
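+ The formula above can be checked by hand; note that chi2_contingency applies Yates' continuity correction
by default for 2x2 tables, so correction=False is needed for the raw formula to match. A minimal sketch:
In [ ]:
# recomputing the statistic from the raw formula (no continuity correction)
chi2_nc, p_nc, dof_nc, expected_nc = stats.chi2_contingency(observed, correction=False)
manual = ((observed - expected_nc) ** 2 / expected_nc).sum()
print(manual, chi2_nc)   # the two values should match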