
Pandas

The document provides a list of 24 data analysis and wrangling tasks to perform on a practice.csv dataset loaded as a pandas DataFrame. The tasks include displaying and renaming columns, finding distinct values and counts, calculating summary statistics, creating new columns, sorting, grouping, filtering, and selecting subsets of data.

Uploaded by

Sisya MB

Load the practice.csv file as a DataFrame and perform the following operations on it (try to
use pandas methods):
1. Display all columns
2. Create lists of the numerical and categorical columns
3. Display the size of the DataFrame
4. Rename columns MSSubClass -> SubClass, MSZoning -> Zones
5. Display distinct values for Zoning, LotShape, LotConfig
6. Display the count of distinct values for Zoning, LotShape, LotConfig
7. Find the max and min of column YearBuilt
8. Create a new column “year_diff” that holds the difference between the current year and
YearBuilt
9. Display distinct MSZoning values for each OverallQual
10. What is the maximum LotArea where BsmtExposure = Mn?
11. Sort the DataFrame on the following columns and orders: MSSubClass, ascending;
YearBuilt, descending
12. What is the average OverallQual?
13. Group by YearBuilt and find the maximum OverallQual
14. Load the data_1.csv again with MSSubClass as the new index
15. Convert LotArea to a NumPy array
16. In column MasVnrArea, replace 0 with -1
17. Check whether there are any null values (NaN) in the given data
18. Display the percentage of missing values in each column, if any
19. Select records where LotConfig is Inside
20. Make a new DataFrame with only the numeric columns
21. Make a new DataFrame with only the categorical/string columns
22. Drop column ExterQual
23. Group data on LotShape and find average LotArea
24.
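
The tasks above can be tried out along the following lines. This is only a minimal sketch, not a reference solution: practice.csv is not reproduced here, so the four rows below are invented stand-in data that merely reuse the column names from the task list, and every variable name is a placeholder chosen for illustration. With the real file, the DataFrame would instead come from pd.read_csv("practice.csv"), and task 14 would pass index_col="MSSubClass" to pd.read_csv.

```python
import datetime

import numpy as np
import pandas as pd

# Invented stand-in rows (assumption); the real data lives in practice.csv.
df = pd.DataFrame(
    {
        "MSSubClass": [60, 20, 60, 70],
        "MSZoning": ["RL", "RL", "RM", "RL"],
        "LotArea": [8450, 9600, 11250, 9550],
        "LotShape": ["Reg", "Reg", "IR1", "IR1"],
        "LotConfig": ["Inside", "FR2", "Inside", "Corner"],
        "OverallQual": [7, 6, 7, 7],
        "YearBuilt": [2003, 1976, 2001, 1915],
        "MasVnrArea": [196.0, 0.0, 162.0, np.nan],
        "BsmtExposure": ["No", "Gd", "Mn", "Mn"],
        "ExterQual": ["Gd", "TA", "Gd", "TA"],
    }
)

# Tasks 1-3: column names, numeric vs. non-numeric column lists, size.
all_columns = df.columns.tolist()
numeric_cols = df.select_dtypes(include="number").columns.tolist()
categorical_cols = df.select_dtypes(exclude="number").columns.tolist()
n_rows, n_cols = df.shape

# Task 4: rename into a copy so later tasks can keep the original names.
renamed = df.rename(columns={"MSSubClass": "SubClass", "MSZoning": "Zones"})

# Tasks 5-6: distinct values and how many there are, per column.
distinct_values = {c: df[c].unique() for c in ["MSZoning", "LotShape", "LotConfig"]}
distinct_counts = df[["MSZoning", "LotShape", "LotConfig"]].nunique()

# Task 7: min and max of YearBuilt in one call.
year_range = df["YearBuilt"].agg(["min", "max"])

# Task 8: age of each house relative to the current year.
df["year_diff"] = datetime.date.today().year - df["YearBuilt"]

# Task 9: distinct zoning values within each quality level.
zoning_by_qual = df.groupby("OverallQual")["MSZoning"].unique()

# Task 10: boolean mask on rows, then max of one column.
max_lot_mn = df.loc[df["BsmtExposure"] == "Mn", "LotArea"].max()

# Task 11: one sort direction per sort key.
sorted_df = df.sort_values(["MSSubClass", "YearBuilt"], ascending=[True, False])

# Tasks 12-13: overall mean and per-group maximum.
mean_qual = df["OverallQual"].mean()
max_qual_by_year = df.groupby("YearBuilt")["OverallQual"].max()

# Tasks 15-16: NumPy conversion and value replacement (returns a new Series).
lot_area_array = df["LotArea"].to_numpy()
mas_vnr_replaced = df["MasVnrArea"].replace(0, -1)

# Tasks 17-18: any missing value at all, and percentage missing per column.
has_missing = bool(df.isna().any().any())
missing_pct = df.isna().mean() * 100

# Tasks 19-23: row filter, dtype subsets, column drop, grouped mean.
inside_rows = df[df["LotConfig"] == "Inside"]
numeric_df = df.select_dtypes(include="number")
string_df = df.select_dtypes(exclude="number")
without_exter = df.drop(columns="ExterQual")
avg_lot_by_shape = df.groupby("LotShape")["LotArea"].mean()
```

Most of these calls return new objects rather than modifying df in place, which is why each result is bound to its own name; only task 8 adds a column to df itself.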
