
Python Statistics

This document discusses and provides examples of performing one-way ANOVA and t-tests in Python using libraries like scipy, statsmodels, and pingouin. It shows how to conduct one-way ANOVA on different groups of performance data and explore differences between groups. It also demonstrates three methods of performing two-sample t-tests to compare two groups of data and determine if their means are statistically different.

Uploaded by

Garuma Abdisa

One-way ANOVA:

# Importing library

from scipy.stats import f_oneway

# Performance measured when each engine oil is applied

performance1 = [89, 89, 88, 78, 79]

performance2 = [93, 92, 94, 89, 88]

performance3 = [89, 88, 89, 93, 90]

performance4 = [81, 78, 81, 92, 82]

# Conduct the one-way ANOVA

print(f_oneway(performance1, performance2, performance3, performance4))

Output: F_onewayResult(statistic=4.625000000000002, pvalue=0.016336459839780215)
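
As a cross-check on the result above, the F statistic can be recomputed by hand as the ratio of the between-group mean square to the within-group mean square. The sketch below assumes the same four performance lists; the variable names (ss_between, ms_within, and so on) are illustrative and not part of the original listing.

```python
import numpy as np
from scipy import stats

# Same four engine-oil samples as in the listing above
groups = [
    np.array([89, 89, 88, 78, 79], dtype=float),
    np.array([93, 92, 94, 89, 88], dtype=float),
    np.array([89, 88, 89, 93, 90], dtype=float),
    np.array([81, 78, 81, 92, 82], dtype=float),
]

k = len(groups)                           # number of groups
n_total = sum(len(g) for g in groups)     # total number of observations
grand_mean = np.concatenate(groups).mean()

# Sum of squares between groups and within groups
ss_between = sum(len(g) * (g.mean() - grand_mean) ** 2 for g in groups)
ss_within = sum(((g - g.mean()) ** 2).sum() for g in groups)

# Mean squares and the F ratio
ms_between = ss_between / (k - 1)
ms_within = ss_within / (n_total - k)
f_manual = ms_between / ms_within

# Upper-tail probability of the F distribution gives the p-value
p_manual = stats.f.sf(f_manual, k - 1, n_total - k)

# Compare with scipy's one-way ANOVA
f_scipy, p_scipy = stats.f_oneway(*groups)
print(f_manual, p_manual)
print(f_scipy, p_scipy)
```

Because the p-value is below the conventional 0.05 threshold, the null hypothesis of equal group means would be rejected at that level.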

###############################################

import pandas as pd

# load data file

df = pd.read_excel("C:/Users/user/Documents/sampanova.xlsx")

# reshape the DataFrame into long format, suitable for the statsmodels package

df_melt = pd.melt(df.reset_index(), id_vars=['index'], value_vars=['A', 'B', 'C', 'D'])

# replace column names

df_melt.columns = ['index', 'treatments', 'value']

# generate a boxplot to see the data distribution by treatment; a boxplot makes

# differences between treatments easy to spot

import matplotlib.pyplot as plt

import seaborn as sns


ax = sns.boxplot(x='treatments', y='value', data=df_melt, color='#99c2a2')

ax = sns.swarmplot(x="treatments", y="value", data=df_melt, color='#7d0013')

plt.show()

import scipy.stats as stats

# stats.f_oneway takes the groups as input and returns the ANOVA F statistic and p-value

fvalue, pvalue = stats.f_oneway(df['A'], df['B'], df['C'], df['D'])

print(fvalue, pvalue)

# 17.492810457516338 2.639241146210922e-05

# get ANOVA table as R like output

import statsmodels.api as sm

from statsmodels.formula.api import ols

# Ordinary Least Squares (OLS) model

model = ols('value ~ C(treatments)', data=df_melt).fit()

anova_table = sm.stats.anova_lm(model, typ=2)

print(anova_table)

#######################

# install (run these pip commands in a shell, not inside the Python interpreter)

pip install bioinfokit

# upgrade to latest version

pip install bioinfokit --upgrade

# uninstall

pip uninstall bioinfokit

################################
t-test

import scipy.stats as stats

import numpy as np

# Creating data groups

data_group1 = np.array([14, 15, 15, 16, 13, 8, 14,

17, 16, 14, 19, 20, 21, 15,

15, 16, 16, 13, 14, 12])

data_group2 = np.array([15, 17, 14, 17, 14, 8, 12,

19, 19, 14, 17, 22, 24, 16,

13, 16, 13, 18, 15, 13])

# Print the variance of both data groups (np.var defaults to the population variance, ddof=0)

print(np.var(data_group1), np.var(data_group2))

Output: 7.727500000000001 12.260000000000002
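
The variances are printed to judge whether the equal-variance t-test is reasonable. A minimal sketch of that check follows; the "ratio below 4" threshold is a common rule of thumb assumed here, not something stated in this document, and Levene's test is added as a more formal alternative.

```python
import numpy as np
from scipy import stats

data_group1 = np.array([14, 15, 15, 16, 13, 8, 14, 17, 16, 14,
                        19, 20, 21, 15, 15, 16, 16, 13, 14, 12])
data_group2 = np.array([15, 17, 14, 17, 14, 8, 12, 19, 19, 14,
                        17, 22, 24, 16, 13, 16, 13, 18, 15, 13])

# Sample variances (ddof=1); with equal group sizes the ratio
# is the same as for the population variances printed above
var1 = np.var(data_group1, ddof=1)
var2 = np.var(data_group2, ddof=1)
ratio = max(var1, var2) / min(var1, var2)
print("variance ratio:", ratio)

# Assumed rule of thumb: a ratio under 4 is often treated as "similar enough"
print("ratio below 4:", ratio < 4)

# Levene's test: the null hypothesis is that the variances are equal
levene_stat, levene_p = stats.levene(data_group1, data_group2)
print("Levene:", levene_stat, levene_p)
```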

1. Performing Two-Sample T-Test

Method 1

# Python program to demonstrate how to

# perform two sample T-test

# Import the library

import scipy.stats as stats

import numpy as np

# Creating data groups

data_group1 = np.array([14, 15, 15, 16, 13, 8, 14,

17, 16, 14, 19, 20, 21, 15,

15, 16, 16, 13, 14, 12])


data_group2 = np.array([15, 17, 14, 17, 14, 8, 12,

19, 19, 14, 17, 22, 24, 16,

13, 16, 13, 18, 15, 13])

# Perform the two sample t-test with equal variances

print(stats.ttest_ind(a=data_group1, b=data_group2, equal_var=True))

Output: Ttest_indResult(statistic=-0.6337397070250238, pvalue=0.5300471010405257)
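
To show what the equal-variance test computes, the statistic can be rebuilt from the pooled variance. This is a sketch under the same data; names such as pooled_var and t_manual are illustrative.

```python
import numpy as np
from scipy import stats

data_group1 = np.array([14, 15, 15, 16, 13, 8, 14, 17, 16, 14,
                        19, 20, 21, 15, 15, 16, 16, 13, 14, 12])
data_group2 = np.array([15, 17, 14, 17, 14, 8, 12, 19, 19, 14,
                        17, 22, 24, 16, 13, 16, 13, 18, 15, 13])

n1, n2 = len(data_group1), len(data_group2)
s1 = np.var(data_group1, ddof=1)   # sample variance of group 1
s2 = np.var(data_group2, ddof=1)   # sample variance of group 2

# Pooled variance weights each sample variance by its degrees of freedom
dof = n1 + n2 - 2
pooled_var = ((n1 - 1) * s1 + (n2 - 1) * s2) / dof

# Standard error of the difference in means, then the t statistic
se = np.sqrt(pooled_var * (1 / n1 + 1 / n2))
t_manual = (data_group1.mean() - data_group2.mean()) / se

# Two-sided p-value from the t distribution
p_manual = 2 * stats.t.sf(abs(t_manual), dof)

print(t_manual, p_manual)
print(stats.ttest_ind(a=data_group1, b=data_group2, equal_var=True))
```

With a p-value of about 0.53, the data give no evidence at the 0.05 level that the two group means differ.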

Method 2

# Python program to conduct two-sample

# T-test using pingouin library

# Importing libraries

import numpy as np

import pingouin as pg

# Creating data groups

data_group1 = np.array([160, 150, 160, 156.12, 163.24,

160.56, 168.56, 174.12,

167.123, 165.12])

data_group2 = np.array([157.97, 146, 140.2, 170.15,

167.34, 176.123, 162.35, 159.123,

169.43, 148.123])

# Conducting Welch's two-sample t-test (correction=True does not assume equal variances)

result = pg.ttest(data_group1,

data_group2,

correction=True)
# Print the result

print(result)

Output:

               T        dof alternative  ...   cohen-d   BF10     power

T-test  0.653148  14.389477   two-sided  ...  0.292097  0.462  0.094912
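
The fractional degrees of freedom in this output come from the Welch-Satterthwaite approximation. The sketch below reproduces the T and dof values with scipy and numpy alone, as a cross-check that does not need pingouin; the variable names are illustrative.

```python
import numpy as np
from scipy import stats

data_group1 = np.array([160, 150, 160, 156.12, 163.24,
                        160.56, 168.56, 174.12, 167.123, 165.12])
data_group2 = np.array([157.97, 146, 140.2, 170.15, 167.34,
                        176.123, 162.35, 159.123, 169.43, 148.123])

# Welch's t-test: equal_var=False drops the equal-variance assumption
res = stats.ttest_ind(data_group1, data_group2, equal_var=False)

# Welch-Satterthwaite degrees of freedom, computed by hand
n1, n2 = len(data_group1), len(data_group2)
v1 = np.var(data_group1, ddof=1) / n1   # squared standard error, group 1
v2 = np.var(data_group2, ddof=1) / n2   # squared standard error, group 2
welch_dof = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))

print(res.statistic, res.pvalue, welch_dof)
```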

Method 3

# Two-sample t-test using the statsmodels library

from statsmodels.stats.weightstats import ttest_ind

import numpy as np

# Creating data groups

data_group1 = np.array([160, 150, 160, 156.12, 163.24,

160.56, 168.56, 174.12,

167.123, 165.12])

data_group2 = np.array([157.97, 146, 140.2, 170.15,

167.34, 176.123, 162.35,

159.123, 169.43, 148.123])

# Conducting two-sample t-test (statsmodels pools the variances by default)

print(ttest_ind(data_group1, data_group2))

Output: (0.6531479162158739, 0.5219170107019715, 18.0)  # t-statistic, p-value, degrees of freedom
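
The 18 degrees of freedom here differ from the 14.39 reported in Method 2 because statsmodels pools the variances by default, while Method 2 used the Welch correction. A short sketch of switching between the two with the usevar argument of ttest_ind:

```python
import numpy as np
from statsmodels.stats.weightstats import ttest_ind

data_group1 = np.array([160, 150, 160, 156.12, 163.24,
                        160.56, 168.56, 174.12, 167.123, 165.12])
data_group2 = np.array([157.97, 146, 140.2, 170.15, 167.34,
                        176.123, 162.35, 159.123, 169.43, 148.123])

# Default behaviour: pooled variance, df = n1 + n2 - 2
t_pooled, p_pooled, df_pooled = ttest_ind(data_group1, data_group2,
                                          usevar="pooled")

# Welch variant: separate variances, Satterthwaite degrees of freedom
t_welch, p_welch, df_welch = ttest_ind(data_group1, data_group2,
                                       usevar="unequal")

print("pooled:", t_pooled, p_pooled, df_pooled)
print("welch: ", t_welch, p_welch, df_welch)
```

Because both groups have the same size, the t statistic is identical in the two variants; only the degrees of freedom, and therefore the p-value, change.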


linear regression

pip install sklearn-pandas==1.5.0
