
CODE2

The code performs normality tests on numeric columns from a CSV dataset containing NVDA stock data. It uses Shapiro-Wilk, Kolmogorov-Smirnov, D’Agostino, and Anderson-Darling tests to assess normality, and visualizes the results with histograms and Q-Q plots. The results are compiled into a DataFrame for display.

Uploaded by

suryanshu
Copyright
© All Rights Reserved

CODE

import pandas as pd
import scipy.stats as stats
import seaborn as sns
import matplotlib.pyplot as plt

# Load the dataset
df = pd.read_csv("NVDA.csv")

# Select numeric columns and drop rows with missing values
numeric_cols = df.select_dtypes(include=['number']).dropna()

# Create an empty list to store results
results = []

# Perform normality tests on each numeric column
for col in numeric_cols.columns:
    data = numeric_cols[col]

    # Shapiro-Wilk (SciPy notes the p-value may be inaccurate for N > 5000)
    shapiro_stat, shapiro_p = stats.shapiro(data)

    # Kolmogorov-Smirnov against a normal with the sample mean and std.
    # These parameters are estimated from the same data, so the p-value
    # is only approximate (it tends to be too large).
    ks_stat, ks_p = stats.kstest(data, 'norm', args=(data.mean(), data.std()))

    # D'Agostino-Pearson K² test (combines skewness and kurtosis)
    dagostino_stat, dagostino_p = stats.normaltest(data)

    # Anderson-Darling (the result object also carries critical values)
    anderson_result = stats.anderson(data, dist='norm')

    # Store results in a dictionary
    results.append({
        "Column": col,
        "Shapiro-Wilk Stat": round(shapiro_stat, 4),
        "Shapiro-Wilk p-value": round(shapiro_p, 4),
        "Kolmogorov-Smirnov Stat": round(ks_stat, 4),
        "Kolmogorov-Smirnov p-value": round(ks_p, 4),
        "D’Agostino K² Stat": round(dagostino_stat, 4),
        "D’Agostino K² p-value": round(dagostino_p, 4),
        "Anderson-Darling Stat": round(anderson_result.statistic, 4),
    })

    # Histogram & Q-Q Plot, side by side
    fig, ax = plt.subplots(1, 2, figsize=(12, 5))
    sns.histplot(data, kde=True, bins=20, ax=ax[0])
    ax[0].set_title(f"Histogram of {col}")
    stats.probplot(data, dist="norm", plot=ax[1])
    ax[1].set_title(f"Q-Q Plot of {col}")
    plt.show()

# Convert results into a DataFrame
results_df = pd.DataFrame(results)

# Display the table
print(results_df)
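When a frame is wider than the console, print(results_df) abbreviates the middle columns with "...", which hides most of the test statistics and p-values. A minimal sketch of one way to show every column is given here, assuming DataFrame.to_string is acceptable for the output; the helper name format_full and the demo_df stand-in (with placeholder values, not real results) are hypothetical and used only for illustration.

```python
import pandas as pd


def format_full(frame: pd.DataFrame) -> str:
    """Render every column of a frame as text, without '...' truncation."""
    # to_string ignores the console width and column limits used by print(df)
    return frame.to_string(index=False)


# Hypothetical stand-in for results_df: eight long-named columns with
# placeholder values, wide enough that plain print() would abbreviate it.
demo_df = pd.DataFrame(
    {f"Illustrative column name {i}": [0.1234, 0.5678] for i in range(8)}
)

full_text = format_full(demo_df)
print(full_text)
```

In the script above, the equivalent call would be print(results_df.to_string(index=False)).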

RESULT
      Column  Shapiro-Wilk Stat  ...  D’Agostino K² p-value  Anderson-Darling Stat
0  Adj Close             0.3999  ...                    0.0              1488.3597
1      Close             0.4002  ...                    0.0              1486.6029
2       High             0.4000  ...                    0.0              1487.3982
3        Low             0.4005  ...                    0.0              1485.8193
4       Open             0.4000  ...                    0.0              1487.1525
5     Volume             0.7517  ...                    0.0               293.6605

[6 rows x 8 columns]
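The Anderson-Darling column lists only the statistic, which has no p-value attached and is usually read against the critical values that scipy.stats.anderson returns alongside it. The sketch below shows one possible way to make that comparison at the 5% level. It assumes a SciPy version whose result object exposes critical_values and significance_level; the function name anderson_rejects_normality and the synthetic exponential sample are hypothetical and not part of the original script.

```python
import numpy as np
from scipy import stats


def anderson_rejects_normality(values, level=5.0):
    """Compare the Anderson-Darling statistic with the critical value
    for the given significance level (in percent).

    Returns (statistic, critical_value, rejects).
    """
    result = stats.anderson(np.asarray(values, dtype=float), dist='norm')
    # significance_level holds percentages; find the entry matching `level`
    matches = np.flatnonzero(np.isclose(result.significance_level, level))
    if matches.size == 0:
        raise ValueError(f"no critical value tabulated for level {level}%")
    critical = float(result.critical_values[matches[0]])
    statistic = float(result.statistic)
    # Normality is rejected when the statistic exceeds the critical value
    return statistic, critical, statistic > critical


# Synthetic, strongly skewed sample used only to exercise the helper
rng = np.random.default_rng(0)
skewed_sample = rng.exponential(scale=1.0, size=2000)
ad_stat, ad_crit, ad_rejects = anderson_rejects_normality(skewed_sample)
print(f"A² = {ad_stat:.4f}, 5% critical value = {ad_crit:.4f}, reject = {ad_rejects}")
```

Inside the loop of the original script, the same helper could be called as anderson_rejects_normality(numeric_cols[col]) and its outputs stored as extra columns.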
