Assignment 2

The document presents Python code for generating random datasets based on specific roll numbers, calculating descriptive statistics, and visualizing the data using histograms and boxplots. It utilizes libraries such as numpy, matplotlib, scipy.stats, and pandas for computations and visualizations. The output includes saved plots, printed statistics in table format, and insights for three datasets.

Uploaded by

me220003077

Data Visualization

Samruddhi R. Patil - 220003058
Shalini Bharti - 220003077

Python code for generating the results, and code summary

import numpy as np
import matplotlib.pyplot as plt
from scipy.stats import mode
import pandas as pd

roll_no_p1 = 3058
roll_no_p2 = 3077

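# Each dataset takes its mean from one roll number and its standard deviation from the other.
mean1, std1 = roll_no_p1 % 10000, roll_no_p2 % 100  # 3058, 77
mean2, std2 = roll_no_p2 % 10000, roll_no_p1 % 100  # 3077, 58

# Two normal samples of 100 points each; dataset3 pools them into 200 points.
dataset1 = np.random.normal(mean1, std1, 100)
dataset2 = np.random.normal(mean2, std2, 100)
dataset3 = np.concatenate([dataset1, dataset2])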

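def calculate_statistics(data):
    # np.std defaults to the population standard deviation (ddof=0).
    # For continuous samples every value is usually unique, so the
    # reported mode is typically just the smallest observation.
    return {
        "Mean": np.mean(data),
        "Std Dev": np.std(data),
        "Median": np.median(data),
        "Mode": mode(data, keepdims=True).mode[0]
    }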

stats1 = calculate_statistics(dataset1)
stats2 = calculate_statistics(dataset2)
stats3 = calculate_statistics(dataset3)

for i, (data, stats) in enumerate(zip([dataset1, dataset2, dataset3], [stats1, stats2, stats3]), start=1):
    plt.figure()
    plt.hist(data, bins=20, alpha=0.7, color='blue', edgecolor='black')
    plt.title(f"Dataset {i}: Mean={stats['Mean']:.2f}, Std Dev={stats['Std Dev']:.2f}")
    plt.xlabel("Value")
    plt.ylabel("Frequency")
    plt.savefig(f"histogram_dataset_{i}.png")
    plt.show()

plt.figure()
plt.boxplot([dataset1, dataset2, dataset3], labels=["Dataset 1", "Dataset 2", "Dataset 3"])
plt.title("Box-and-Whisker Plots for Datasets")
plt.ylabel("Values")
plt.savefig("boxplots.png")
plt.show()

table_data = pd.DataFrame([stats1, stats2, stats3], index=["Dataset 1", "Dataset 2", "Dataset 3"])
print(table_data)
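
The printed table can be sanity-checked against the values the roll numbers imply. The sketch below is not part of the submitted code: the helper name pooled_mean_std and the seed 0 are assumptions made for illustration, and the original listing is unseeded, so its numbers will differ from run to run. It works out the distribution parameters and the mean and standard deviation to expect for the pooled Dataset 3, then compares them with one seeded draw.

```python
import numpy as np

# Roll-number suffixes, as in the listing above.
roll_no_p1, roll_no_p2 = 3058, 3077

mean1, std1 = roll_no_p1 % 10000, roll_no_p2 % 100  # 3058, 77
mean2, std2 = roll_no_p2 % 10000, roll_no_p1 % 100  # 3077, 58


def pooled_mean_std(m1, s1, m2, s2):
    """Mean and population std of an equal-size pooling of two distributions.

    Hypothetical helper, not part of the original assignment code.
    """
    mean = (m1 + m2) / 2
    # Average within-group variance plus the variance between the group means.
    var = (s1 ** 2 + s2 ** 2) / 2 + ((m1 - m2) / 2) ** 2
    return mean, float(np.sqrt(var))


pooled_mean, pooled_std = pooled_mean_std(mean1, std1, mean2, std2)

# Seeded draw so the comparison is repeatable (the seed is an assumption).
rng = np.random.default_rng(0)
sample = np.concatenate([
    rng.normal(mean1, std1, 100),
    rng.normal(mean2, std2, 100),
])

print(f"theoretical: mean={pooled_mean:.2f}, std={pooled_std:.2f}")
print(f"sampled:     mean={sample.mean():.2f}, std={sample.std():.2f}")
```

The theoretical values come out to a mean of 3067.50 and a standard deviation of about 68.82. With only 200 points, the sampled values should land near these figures but will not match them exactly.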

● Libraries: The code uses numpy for numerical computation, matplotlib for plotting, scipy.stats for the statistical mode, and pandas for tabular output.
● Functionality: It generates normally distributed random datasets whose parameters are derived from the two roll numbers, computes descriptive statistics (mean, standard deviation, median, mode), and visualizes the data with histograms and box plots.
● Output: It saves the plots as PNG images, displays them, and prints the statistics for the three datasets as a table.
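
Two of the statistics summarized above deserve a closer look. For continuous samples, an exact-match mode is rarely informative, because the values almost never repeat. A box plot, in turn, is drawn from quartiles that the code never prints. The sketch below is one possible way to inspect both. The function names binned_mode and box_summary, the 20-bin choice, and the seed are assumptions for illustration, not part of the submitted code. The 1.5 × IQR fences follow Matplotlib's default whisker setting.

```python
import numpy as np


def binned_mode(data, bins=20):
    """Midpoint of the most populated histogram bin (a stand-in for the mode)."""
    counts, edges = np.histogram(data, bins=bins)
    k = int(np.argmax(counts))
    return float((edges[k] + edges[k + 1]) / 2)


def box_summary(data, whis=1.5):
    """Quartiles and the fences that bound the whiskers of a box plot."""
    data = np.asarray(data, dtype=float)
    q1, median, q3 = np.percentile(data, [25, 50, 75])
    iqr = q3 - q1
    lower, upper = q1 - whis * iqr, q3 + whis * iqr
    # Points beyond the fences are the ones a box plot draws as fliers.
    n_outliers = int(np.sum((data < lower) | (data > upper)))
    return {
        "Q1": float(q1),
        "Median": float(median),
        "Q3": float(q3),
        "IQR": float(iqr),
        "Lower fence": float(lower),
        "Upper fence": float(upper),
        "Outliers": n_outliers,
    }


# Seeded sample with Dataset 1's parameters (the seed is an assumption).
rng = np.random.default_rng(0)
data = rng.normal(3058, 77, 100)

print("binned mode:", round(binned_mode(data), 2))
print(box_summary(data))
```

The binned mode depends on the number of bins, so it is best read alongside the histogram that uses the same binning rather than as a single definitive value.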
