The document outlines five assignments of Python programming exercises in data analysis and visualization. Assignment 1 covers a pie chart of the Iris species, basic statistics, handling missing values, and a line plot. Assignment 2 covers box plots, basic statistics, plots of a random integer array, and inspecting a dataset's shape and data types. Assignment 3 repeats tasks from Assignments 1 and 2. Assignment 4 covers one-hot encoding and label encoding of categorical data, and standardization. Assignment 5 covers column-wise statistics, Manhattan distances, cleaning a data frame, a weighted average with NumPy, and bar and pie charts.
Data Preprocessing Assignments
Assignment -1 :-
1) Write a Python program to create a pie plot to get the frequency of the three species of the Iris data (use iris.csv).
2) Write a Python program to view basic statistical details of the data (use winequality-red.csv).
3) Write a Python program for handling missing values. Replace the missing values in the salary and age columns with the mean of the respective column (use the Data.csv file).
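For the missing-value task above, a minimal sketch with pandas might look like the following. Data.csv is not reproduced in this document, so the sketch builds a small stand-in frame; the column names `Age` and `Salary` and all values are assumptions for illustration only.

```python
import numpy as np
import pandas as pd

# Stand-in for pd.read_csv("Data.csv"); column names and values are made up.
df = pd.DataFrame({
    "Name": ["A", "B", "C", "D"],
    "Age": [30.0, np.nan, 40.0, 50.0],
    "Salary": [1000.0, 2000.0, np.nan, 3000.0],
})

cols = ["Age", "Salary"]
# mean() skips NaN by default, so each gap is filled with the mean
# of the values that are actually present in that column.
df[cols] = df[cols].fillna(df[cols].mean())
print(df)
```

With the real file, the stand-in frame would be replaced by a `pd.read_csv` call and `cols` adjusted to the actual column names.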
4) Write a Python program to generate a line plot of name vs. salary.
Assignment -2 :-
1) Write a Python program to create box plots to see how each feature, i.e. Sepal Length, Sepal Width, Petal Length, and Petal Width, is distributed across the three species (use the iris.csv dataset).
2) Write a Python program to view basic statistical details of the data (use the Heights and Weights dataset).
3) Generate a random array of 50 integers and display them using a line chart, scatter plot, histogram, and box plot. Apply appropriate colors, labels, and styling options.
4) Write a Python program to print the shape, number of rows and columns, data types, feature names, and description of the data (use User_Data.csv).
Assignment -3 :-
1) Generate a random array of 50 integers and display them using a line chart, scatter plot, histogram, and box plot. Apply appropriate colors, labels, and styling options.
2) Write a Python program to print the shape, number of rows and columns, data types, feature names, and description of the data (use User_Data.csv).
3) Write a Python program for handling missing values. Replace the missing values in the salary and age columns with the mean of the respective column (use the Data.csv file).
4) Write a Python program to generate a line plot of name vs. salary.
Assignment -4 :-
1) Write a Python program to perform the following tasks:
a. Apply one-hot encoding to the Country column.
b. Apply label encoding to the purchased column.
(Data.csv has two categorical columns: the Country column and the purchased column.)
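The two encodings in this task can be sketched with pandas alone. This is one possible approach under stated assumptions, not the required one: the frame below stands in for Data.csv, and the column names `Country` and `Purchased` as well as the values are assumed. scikit-learn's `OneHotEncoder` and `LabelEncoder` are a common alternative.

```python
import pandas as pd

# Stand-in for pd.read_csv("Data.csv"); column names and values are assumed.
df = pd.DataFrame({
    "Country": ["France", "Spain", "Germany", "Spain"],
    "Purchased": ["No", "Yes", "No", "Yes"],
})

# a. One-hot encoding: one 0/1 indicator column per distinct country.
encoded = pd.get_dummies(df, columns=["Country"], dtype=int)

# b. Label encoding: map each distinct label to an integer code.
# Categories are sorted, so "No" becomes 0 and "Yes" becomes 1.
encoded["Purchased"] = encoded["Purchased"].astype("category").cat.codes
print(encoded)
```

One-hot encoding suits a column like Country, whose values have no natural order, while integer labels are usually enough for a two-valued column such as the purchased flag.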
3) Write a Python program to perform the following task: standardize the data, i.e. rescale each feature to have a mean of 0 and a standard deviation of 1, as for a standard Gaussian distribution (use winequality-red.csv).
Assignment -5 :-
1) Write a Python program to display the column-wise mean and median for the SOCRHeightWeight dataset.
2) Write a Python program to compute the sum of the Manhattan distances between all pairs of points.
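As a minimal sketch of the Manhattan-distance task, the version below sums the distance over every unordered pair using only the standard library. The function name `manhattan_sum` and the sample points are made up for illustration.

```python
from itertools import combinations


def manhattan_sum(points):
    """Sum |x1 - x2| + |y1 - y2| + ... over every unordered pair of points."""
    return sum(
        sum(abs(a - b) for a, b in zip(p, q))
        for p, q in combinations(points, 2)
    )


points = [(1, 1), (2, 3), (4, 0)]  # made-up sample points
print(manhattan_sum(points))  # 3 + 4 + 5 = 12
```

This pairwise loop takes quadratic time in the number of points, which is fine for small inputs; for large inputs, sorting each coordinate separately and using prefix sums gives the same total faster.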
3) Write a Python program to create a data frame containing the columns name, salary, and department. Add 10 rows with some missing and duplicate values to the data frame. Also drop all null and empty values. Print the modified data frame.
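A possible sketch of the data-frame task with pandas follows. All ten rows are invented for illustration, and treating an empty string as an "empty value" is an assumption about what the task intends.

```python
import numpy as np
import pandas as pd

# Ten made-up rows with missing (None/NaN), empty (""), and duplicate entries.
df = pd.DataFrame({
    "name": ["Asha", "Ravi", "Ravi", "", "Meena", "John", None, "Sara", "Sara", "Tom"],
    "salary": [50000, 42000, 42000, 39000, np.nan, 61000, 45000, 52000, 52000, np.nan],
    "department": ["HR", "IT", "IT", "IT", "Sales", "", "HR", "Sales", "Sales", "IT"],
})

# Treat empty strings as missing, then drop every row that has a missing value.
cleaned = df.replace("", np.nan).dropna()
print(cleaned)
```

The task only asks for null and empty values to be dropped, so the duplicate rows are left in place; `cleaned.drop_duplicates()` would remove them as well if that is wanted.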
4) Write a Python NumPy program to compute the weighted average along the specified axis of a given flattened array.
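One way to read the weighted-average task is sketched below with `np.average`. The task does not give the array, its shape, or the weights, so the 3x3 reshape and the weight values are assumptions.

```python
import numpy as np

flat = np.arange(9)          # flattened input: 0, 1, ..., 8
grid = flat.reshape(3, 3)    # assumed 3x3 shape so that an axis can be chosen
weights = [0.25, 0.5, 0.25]  # made-up weights, one per column

# axis=1 averages across the columns of each row, using the weights.
row_avg = np.average(grid, axis=1, weights=weights)
print(row_avg)  # [1. 4. 7.]
```

`np.average` divides by the sum of the weights, so the weights do not have to add up to 1.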
5) Write a Python program to create two lists, one representing subject names and the other representing marks obtained in those subjects. Display the data in a pie chart and a bar chart.
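The chart task above can be sketched with Matplotlib as follows. The subject names, marks, colors, and the output file name `marks.png` are all made up for illustration.

```python
import matplotlib.pyplot as plt

subjects = ["Maths", "Physics", "Chemistry", "English"]  # made-up data
marks = [88, 76, 91, 67]

fig, (ax_pie, ax_bar) = plt.subplots(1, 2, figsize=(10, 4))

# Pie chart: each wedge is a subject's share of the total marks.
ax_pie.pie(marks, labels=subjects, autopct="%1.1f%%")
ax_pie.set_title("Share of total marks")

# Bar chart: one bar per subject.
ax_bar.bar(subjects, marks, color="steelblue")
ax_bar.set_xlabel("Subject")
ax_bar.set_ylabel("Marks")
ax_bar.set_title("Marks by subject")

fig.tight_layout()
fig.savefig("marks.png")  # or plt.show() in an interactive session
```

Placing both charts in one figure makes the comparison easy: the pie chart shows proportions, while the bar chart shows the absolute marks.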