4 Compressed

The document outlines Experiment 2, focusing on data manipulation in Python using libraries like Pandas and sklearn. It includes tasks such as creating and loading datasets, computing statistical measures (mean, median, mode, variance, standard deviation), and demonstrating data pre-processing techniques. Specific programming tasks are provided for reshaping, filtering, merging data, handling missing values, and feature normalization.

Uploaded by

aliza.17259
Experiment 2.

(Week 2)

a) Creating and loading different types of datasets in Python using the required libraries:
i. Creation using pandas
ii. Loading CSV dataset files using pandas
iii. Loading datasets using sklearn

b) Write a Python program to compute the mean, median, mode, variance, and standard deviation of a dataset.

c) Demonstrate various data pre-processing techniques for a given dataset. Write a Python program to perform:
i. Reshaping the data
ii. Filtering the data
iii. Merging the data
iv. Handling missing values in datasets
v. Feature normalization: min-max normalization
1. Creating and loading different datasets in Python
Program 1:
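A minimal sketch of creating a dataset with pandas (task a.i). The column names and values below are made up for illustration and are not taken from the original handout.

```python
import pandas as pd

# Hypothetical sample data: names and marks are invented for this example.
data = {
    "name": ["Asha", "Ben", "Chen", "Dina"],
    "marks": [82, 75, 91, 68],
}

# Build a DataFrame from a dictionary of equal-length lists.
df = pd.DataFrame(data)

print(df)
print(df.shape)  # (number of rows, number of columns)
```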
Program 2:
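A minimal sketch of loading a CSV dataset with pandas (task a.ii). To keep the example self-contained, the CSV content is held in an in-memory string; with a real file, a path such as a hypothetical "students.csv" would be passed to pd.read_csv instead.

```python
import io

import pandas as pd

# Hypothetical CSV content; in practice this would live in a file on disk.
csv_text = """name,marks
Asha,82
Ben,75
Chen,91
"""

# read_csv accepts a file path or any file-like object.
df_csv = pd.read_csv(io.StringIO(csv_text))

print(df_csv.head())   # first rows of the dataset
print(df_csv.dtypes)   # column types inferred by pandas
```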
Program 3:
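A minimal sketch of loading a built-in dataset from sklearn (task a.iii). The Iris dataset is chosen here only as an assumption; the handout does not say which dataset was used.

```python
from sklearn.datasets import load_iris

# load_iris returns a Bunch object holding the data, targets and metadata.
iris = load_iris()
X, y = iris.data, iris.target

print(X.shape)              # feature matrix: samples x features
print(iris.feature_names)   # names of the feature columns
print(iris.target_names)    # names of the classes
```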

2. Write a Python program to compute the mean, median, mode, variance, and standard deviation of a dataset

● Python statistics library

The statistics module provides functions for calculating mathematical
statistics of numeric (real-valued) data, including the mean, median, mode,
standard deviation, and variance.
The four functions we'll use are common in statistics:
1. mean: the average value
2. median: the middle value
3. mode: the most frequent value
4. standard deviation: the spread of values
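The four functions listed above can be tried on a small list. This is a minimal sketch using Python's standard statistics module; the sample values are made up for illustration.

```python
import statistics

# Hypothetical sample values.
values = [2, 4, 4, 5, 7, 8]

mean_value = statistics.mean(values)      # sum of values / number of values
median_value = statistics.median(values)  # average of the two middle values here
mode_value = statistics.mode(values)      # the value that occurs most often
stdev_value = statistics.stdev(values)    # sample standard deviation

print("mean:", mean_value)
print("median:", median_value)
print("mode:", mode_value)
print("stdev:", stdev_value)
```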

● Averages and measures of central location

These functions calculate an average or typical value from a population or sample.

mean()              Arithmetic mean ("average") of data.
harmonic_mean()     Harmonic mean of data.
median()            Median (middle value) of data.
median_low()        Low median of data.
median_high()       High median of data.
median_grouped()    Median, or 50th percentile, of grouped data.
mode()              Mode (most common value) of discrete data.
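A short sketch comparing the median variants and the harmonic mean on an even-length list, where the low and high medians differ. The data values are made up for illustration.

```python
import statistics

# Hypothetical data with an even number of values.
data = [1, 2, 4, 4]

harmonic = statistics.harmonic_mean(data)    # n / sum of reciprocals
middle = statistics.median(data)             # mean of the two middle values
low = statistics.median_low(data)            # smaller of the two middle values
high = statistics.median_high(data)          # larger of the two middle values
grouped = statistics.median_grouped(data)    # interpolated median, class width 1

print("harmonic mean:", harmonic)
print("median:", middle)
print("median_low:", low)
print("median_high:", high)
print("median_grouped:", grouped)
```

Note that median_low() and median_high() always return a value that is actually present in the data, while median() may return a value that is not.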

● Measures of spread

These functions calculate a measure of how much the population or sample
tends to deviate from the typical or average values.

pstdev()        Population standard deviation of data.
pvariance()     Population variance of data.
stdev()         Sample standard deviation of data.
variance()      Sample variance of data.
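The difference between the population and sample versions is the divisor: the population functions divide the sum of squared deviations by n, the sample functions by n - 1. A minimal sketch, with made-up data:

```python
import statistics

# Hypothetical data: mean is 5, sum of squared deviations is 32.
data = [2, 4, 4, 4, 5, 5, 7, 9]

pop_var = statistics.pvariance(data)   # 32 / 8
pop_std = statistics.pstdev(data)      # square root of the population variance
samp_var = statistics.variance(data)   # 32 / 7
samp_std = statistics.stdev(data)      # square root of the sample variance

print("population variance:", pop_var)
print("population std dev:", pop_std)
print("sample variance:", samp_var)
print("sample std dev:", samp_std)
```

The sample values are always slightly larger than the population values for the same data, because the divisor is smaller.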

Program 1:

Program 2:

Program 3:
Program 4:

Program 5:
Program 2:

Program 3:
Program 1:

Program 2:
Program 3:

Program 4:
Program 5:

Program 6:
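For tasks c.i to c.iii (reshaping, filtering, and merging), a minimal pandas sketch might look like the following. The tables, column names, and the threshold of 80 are all hypothetical, and melt() is only one of several ways to reshape data in pandas.

```python
import pandas as pd

# Hypothetical marks table in "wide" form: one column per subject.
students = pd.DataFrame({
    "student_id": [1, 2, 3],
    "maths": [80, 65, 90],
    "physics": [70, 85, 60],
})

# Reshaping: wide -> long, one row per (student, subject) pair.
long_df = students.melt(
    id_vars="student_id", var_name="subject", value_name="marks"
)

# Filtering: keep only rows where marks are at least 80.
high_marks = long_df[long_df["marks"] >= 80]

# Merging: attach student names from a second table on the shared key.
names = pd.DataFrame({
    "student_id": [1, 2, 3],
    "name": ["Asha", "Ben", "Chen"],
})
merged = students.merge(names, on="student_id", how="inner")

print(long_df)
print(high_marks)
print(merged)
```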
Handling the missing values:
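Common ways to handle missing values in pandas are to count them, drop the affected rows, or fill them with a replacement. The sketch below uses a made-up table; filling with the column mean and with the placeholder "unknown" are illustrative choices, not necessarily those of the original programs.

```python
import numpy as np
import pandas as pd

# Hypothetical table with one missing value in each column.
df = pd.DataFrame({
    "age": [25, np.nan, 31, 40],
    "city": ["Pune", "Delhi", None, "Agra"],
})

# Count missing values per column.
missing_per_column = df.isna().sum()

# Option 1: drop every row that contains a missing value.
dropped = df.dropna()

# Option 2: fill missing values instead of dropping rows.
filled = df.copy()
filled["age"] = filled["age"].fillna(filled["age"].mean())  # numeric: column mean
filled["city"] = filled["city"].fillna("unknown")           # text: placeholder

print(missing_per_column)
print(dropped)
print(filled)
```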
Program 1:
Program 2:
Program 3:

Program 4:
Program 5:
Program 6:
Program 7:

Program 8:
Program 9:

Program 10:
Program 11:
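Min-max normalization (task c.v) rescales a feature to the range 0 to 1 using x' = (x - min) / (max - min). A minimal sketch with plain pandas arithmetic follows; the column name and values are hypothetical, and the code assumes the column's maximum and minimum differ.

```python
import pandas as pd

# Hypothetical feature column.
df = pd.DataFrame({"height_cm": [150.0, 160.0, 170.0, 180.0]})

col = df["height_cm"]
col_min, col_max = col.min(), col.max()

# Apply x' = (x - min) / (max - min); requires col_max != col_min.
df["height_scaled"] = (col - col_min) / (col_max - col_min)

print(df)
```

sklearn also provides MinMaxScaler in sklearn.preprocessing, which applies the same rescaling to every column of a feature matrix.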
