Data Preprocessing Assignments

The document outlines 5 assignments with Python programming exercises in data preprocessing, analysis, and visualization. Assignment 1 covers a pie chart and a line plot from CSV data, basic statistics, and handling missing values. Assignment 2 covers box plots, basic statistics, plots of a random integer array, and inspecting a data set's shape and types. Assignment 3 repeats tasks from Assignments 1 and 2. Assignment 4 covers one-hot encoding and label encoding of categorical data, plus standardization. Assignment 5 covers column-wise mean and median, Manhattan distances, cleaning a data frame, NumPy weighted averages, and bar and pie charts.

Uploaded by

Akash Bhosale
Copyright
© All Rights Reserved

Assignment -1 :-

1) Write a Python program to create a pie plot showing the frequency of the three species in the Iris data. (Use iris.csv)
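
One possible approach to this task is sketched below. It assumes the species labels sit in a column named `species`, which may differ in your copy of iris.csv, and it uses a small made-up frame in place of the file so that the snippet runs on its own.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; remove this line to get an interactive window
import matplotlib.pyplot as plt
import pandas as pd

# Stand-in for pd.read_csv("iris.csv"); the "species" column name is an assumption.
iris = pd.DataFrame(
    {"species": ["setosa"] * 3 + ["versicolor"] * 2 + ["virginica"] * 1}
)

# Count how many rows belong to each species.
species_counts = iris["species"].value_counts()

# Draw one wedge per species, labelled with its share of the rows.
fig, ax = plt.subplots()
ax.pie(
    species_counts.to_numpy(),
    labels=species_counts.index.tolist(),
    autopct="%1.1f%%",
)
ax.set_title("Frequency of Iris species")
```

With the real file, replace the stand-in frame with `pd.read_csv("iris.csv")` and finish with `plt.show()`.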

2) Write a Python program to view basic statistical details of the data. (Use winequality-red.csv)
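
A minimal sketch using pandas' `describe()`, which reports count, mean, standard deviation, minimum, quartiles, and maximum for each numeric column. The four rows below are made-up stand-ins for the file, not real measurements.

```python
import pandas as pd

# Stand-in rows; with the real file use pd.read_csv("winequality-red.csv").
# Some copies of this dataset are semicolon-separated and need sep=";".
wine = pd.DataFrame(
    {"alcohol": [9.4, 9.8, 10.0, 11.2], "quality": [5, 5, 6, 7]}
)

# One summary column per numeric input column.
summary = wine.describe()
print(summary)
```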

3) Write a Python program for handling missing values. Replace the missing values in the salary and age columns with the mean of that column. (Use Data.csv)
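
The mean replacement can be sketched with `fillna`. The column names `Age` and `Salary` and the rows are assumptions standing in for Data.csv.

```python
import numpy as np
import pandas as pd

# Stand-in for pd.read_csv("Data.csv"); column names and values are assumptions.
data = pd.DataFrame(
    {
        "Name": ["A", "B", "C", "D"],
        "Age": [30, np.nan, 40, 50],
        "Salary": [50000, 60000, np.nan, 70000],
    }
)

# mean() skips NaN by default, so each gap gets the mean of the observed values.
for col in ["Age", "Salary"]:
    data[col] = data[col].fillna(data[col].mean())

print(data)
```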

4) Write a Python program to generate a line plot of name vs. salary.
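
A short sketch of the line plot, assuming hypothetical `Name` and `Salary` columns. The task does not name a file, so the data here is made up.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; remove for interactive use
import matplotlib.pyplot as plt
import pandas as pd

# Made-up data; the column names are assumptions.
staff = pd.DataFrame(
    {"Name": ["Asha", "Ben", "Chen", "Dev"], "Salary": [50000, 42000, 61000, 48000]}
)

# Names go on the x-axis as categories, salaries on the y-axis.
fig, ax = plt.subplots()
(line,) = ax.plot(staff["Name"].tolist(), staff["Salary"].tolist(), marker="o")
ax.set_xlabel("Name")
ax.set_ylabel("Salary")
ax.set_title("Name vs. salary")
```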
Assignment -2 :-
1) Write a Python program to create box plots to see how each feature (Sepal Length, Sepal Width, Petal Length, and Petal Width) is distributed across the three species. (Use iris.csv dataset)
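
One way to lay this out is a 2x2 grid with one panel per feature and one box per species. The column names below are assumptions about the CSV header, and the nine rows are a small stand-in for the file.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; remove for interactive use
import matplotlib.pyplot as plt
import pandas as pd

# Small stand-in for pd.read_csv("iris.csv"); column names are assumptions.
iris = pd.DataFrame(
    {
        "sepal_length": [5.1, 4.9, 4.7, 7.0, 6.4, 6.9, 6.3, 5.8, 7.1],
        "sepal_width": [3.5, 3.0, 3.2, 3.2, 3.2, 3.1, 3.3, 2.7, 3.0],
        "petal_length": [1.4, 1.4, 1.3, 4.7, 4.5, 4.9, 6.0, 5.1, 5.9],
        "petal_width": [0.2, 0.2, 0.2, 1.4, 1.5, 1.5, 2.5, 1.9, 2.1],
        "species": ["setosa"] * 3 + ["versicolor"] * 3 + ["virginica"] * 3,
    }
)

features = ["sepal_length", "sepal_width", "petal_length", "petal_width"]
species_names = sorted(iris["species"].unique())

fig, axes = plt.subplots(2, 2, figsize=(10, 8))
for ax, feature in zip(axes.ravel(), features):
    # One array of values per species for the current feature.
    groups = [
        iris.loc[iris["species"] == name, feature].to_numpy()
        for name in species_names
    ]
    ax.boxplot(groups)
    # Boxes are drawn at positions 1..n; label them with the species names.
    ax.set_xticks(range(1, len(species_names) + 1))
    ax.set_xticklabels(species_names)
    ax.set_title(feature)
fig.tight_layout()
```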

2) Write a Python program to view basic statistical details of the data. (Use Heights and Weights Dataset)

3) Generate a random array of 50 integers and display them using a line chart, scatter plot, histogram, and box plot. Apply appropriate color, labels, and styling options.
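
A sketch of the four charts on one figure. The value range (0 to 99), the seed, and the colors are arbitrary choices, not part of the assignment.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; remove for interactive use
import matplotlib.pyplot as plt
import numpy as np

# 50 random integers in [0, 100); the seed only makes the run repeatable.
rng = np.random.default_rng(seed=0)
values = rng.integers(low=0, high=100, size=50)
index = np.arange(values.size)

fig, axes = plt.subplots(2, 2, figsize=(10, 8))

axes[0, 0].plot(index, values, color="tab:blue", linestyle="--", marker="o")
axes[0, 0].set_title("Line chart")

axes[0, 1].scatter(index, values, color="tab:red", marker="x")
axes[0, 1].set_title("Scatter plot")

axes[1, 0].hist(values, bins=10, color="tab:green", edgecolor="black")
axes[1, 0].set_title("Histogram")

# patch_artist=True makes the box a filled patch so it can be colored.
box = axes[1, 1].boxplot(values, patch_artist=True)
for patch in box["boxes"]:
    patch.set_facecolor("tab:orange")
axes[1, 1].set_title("Box plot")

# The first two panels share the same axis meaning.
for ax in (axes[0, 0], axes[0, 1]):
    ax.set_xlabel("Index")
    ax.set_ylabel("Value")
axes[1, 0].set_xlabel("Value")
axes[1, 0].set_ylabel("Count")
axes[1, 1].set_ylabel("Value")
fig.tight_layout()
```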

4) Write a Python program to print the shape, number of rows and columns, data types, feature names, and the description of the data. (Use User_Data.csv)
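
Each requested item maps to a pandas attribute or method, as the sketch below shows. The column names and rows are assumptions standing in for User_Data.csv.

```python
import pandas as pd

# Stand-in for pd.read_csv("User_Data.csv"); columns and values are assumptions.
users = pd.DataFrame(
    {
        "User ID": [1, 2, 3, 4],
        "Gender": ["Male", "Female", "Female", "Male"],
        "Age": [19, 35, 26, 27],
        "EstimatedSalary": [19000, 20000, 43000, 57000],
        "Purchased": [0, 0, 0, 1],
    }
)

n_rows, n_cols = users.shape
print("Shape:", users.shape)
print("Rows:", n_rows, "Columns:", n_cols)
print("Data types:")
print(users.dtypes)
print("Feature names:", list(users.columns))
print("Description:")
print(users.describe())
```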
Assignment -3 :-
1) Generate a random array of 50 integers and display them using a line chart, scatter plot, histogram, and box plot. Apply appropriate color, labels, and styling options.

2) Write a Python program to print the shape, number of rows and columns, data types, feature names, and the description of the data. (Use User_Data.csv)

3) Write a Python program for handling missing values. Replace the missing values in the salary and age columns with the mean of that column. (Use Data.csv)

4) Write a Python program to generate a line plot of name vs. salary.
Assignment -4 :-

1) Write a Python program to perform the following tasks:
a. Apply one-hot encoding on the Country column.
b. Apply label encoding on the Purchased column.
(Data.csv has two categorical columns: the Country column and the Purchased column.)
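
Both encodings can be sketched with pandas alone. This is a swapped-in approach: scikit-learn's `OneHotEncoder` and `LabelEncoder` are the other common route. The rows are made-up stand-ins for Data.csv.

```python
import pandas as pd

# Stand-in for pd.read_csv("Data.csv"); the values are made up.
data = pd.DataFrame(
    {
        "Country": ["France", "Spain", "Germany", "Spain"],
        "Purchased": ["No", "Yes", "No", "Yes"],
    }
)

# a. One-hot encoding: one 0/1 indicator column per distinct country.
encoded = pd.get_dummies(data, columns=["Country"], dtype=int)

# b. Label encoding: map each category to an integer code.
#    Categories are sorted, so "No" becomes 0 and "Yes" becomes 1.
encoded["Purchased"] = encoded["Purchased"].astype("category").cat.codes

print(encoded)
```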

3) Write a Python program to perform the following task: standardize the data, i.e. rescale each feature to a mean of 0 and a standard deviation of 1. (Use winequality-red.csv)
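
Standardization subtracts each column's mean and divides by its standard deviation. The sketch below does this in pandas with the population standard deviation (`ddof=0`), which is also what scikit-learn's `StandardScaler` uses. The rows are made-up stand-ins for the file.

```python
import pandas as pd

# Stand-in rows; with the real file use pd.read_csv("winequality-red.csv"),
# adding sep=";" if your copy is semicolon-separated.
wine = pd.DataFrame(
    {"alcohol": [9.4, 9.8, 10.0, 11.2], "quality": [5, 5, 6, 7]}
)

# z = (x - mean) / std, computed column by column.
standardized = (wine - wine.mean()) / wine.std(ddof=0)

print(standardized)
print(standardized.mean())       # approximately 0 for every column
print(standardized.std(ddof=0))  # approximately 1 for every column
```

Note that this rescales the data but does not change the shape of its distribution.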
Assignment -5 :-
1) Write a Python program to display the column-wise mean and median for the SOCRHeightWeight dataset.
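
A minimal sketch with hypothetical `Height` and `Weight` columns. The real file's header and values will differ, and the numbers here are made up.

```python
import pandas as pd

# Made-up stand-in for the SOCRHeightWeight file; column names are assumptions.
hw = pd.DataFrame(
    {"Height": [60.0, 62.0, 64.0, 70.0], "Weight": [100.0, 120.0, 140.0, 180.0]}
)

# Both methods work column by column and return one value per column.
means = hw.mean()
medians = hw.median()

print("Column-wise mean:")
print(means)
print("Column-wise median:")
print(medians)
```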

2) Write a Python program to compute the sum of the Manhattan distances between all pairs of points.
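
The Manhattan distance between two points is the sum of the absolute differences of their coordinates. A straightforward sketch visits every unordered pair once; the function name `sum_manhattan` and the sample points are illustrative choices.

```python
from itertools import combinations


def sum_manhattan(points):
    """Return the sum of Manhattan distances over all unordered pairs of points."""
    total = 0
    for p, q in combinations(points, 2):
        # |x1 - x2| + |y1 - y2| (+ further coordinates, if any)
        total += sum(abs(a - b) for a, b in zip(p, q))
    return total


points = [(1, 2), (3, 4), (5, 6)]
print(sum_manhattan(points))  # 4 + 8 + 4 = 16
```

This loop takes time quadratic in the number of points, which is fine for small inputs.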

3) Write a Python program to create a data frame containing the columns name, salary, and department. Add 10 rows with some missing and duplicate values to the data frame. Also drop all null and empty values. Print the modified data frame.
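
In the sketch below, empty strings are first converted to NaN so that a single `dropna()` removes both nulls and empties. All values are made up. Duplicate rows are left in place because the task only asks to drop null and empty values.

```python
import numpy as np
import pandas as pd

# Ten made-up rows with missing values (None), empty strings, and one duplicate row.
df = pd.DataFrame(
    {
        "name": ["Asha", "Ben", "Chen", "Asha", "Dev", "", "Fay", "Gus", None, "Ivy"],
        "salary": [50000, 42000, None, 50000, 61000, 39000, None, 45000, 52000, 48000],
        "department": ["HR", "IT", "IT", "HR", "", "Sales", "HR", "IT", "Sales", None],
    }
)

# Treat empty strings as missing, then drop every row that has a missing value.
cleaned = df.replace("", np.nan).dropna()

print(cleaned)
```

Chaining `.drop_duplicates()` onto the result would also remove the repeated row, if that is wanted.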

4) Write a Python NumPy program to compute the weighted average along the specified axis of a given flattened array.
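
`np.average` accepts `weights` and `axis` arguments. The sketch shows it on a flattened array and, for comparison, row-wise on the original 2-D array. The arrays and weights are arbitrary examples.

```python
import numpy as np

matrix = np.arange(9).reshape(3, 3)  # [[0, 1, 2], [3, 4, 5], [6, 7, 8]]

# Flattened case: one weight per element, averaged along the only axis.
flat = matrix.flatten()
flat_weights = np.arange(1, 10)
flat_avg = np.average(flat, axis=0, weights=flat_weights)

# 2-D case: one weight per column, averaged along axis 1 (within each row).
row_avg = np.average(matrix, axis=1, weights=[0.25, 0.5, 0.25])

print(flat_avg)
print(row_avg)
```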

5) Write a Python program to create two lists, one representing subject names and the other representing marks obtained in those subjects. Display the data in a pie chart and a bar chart.