2-Introduction To Data Cleaning P02

The document provides an introduction to various DataFrame methods used for data cleaning in Python. It includes functions for summarizing data, counting non-null values, retrieving column names, and calculating cumulative sums, among others. Each method is briefly described, highlighting its purpose and functionality.

Uploaded by

mymopop

Introduction to Data Cleaning

1. df.info()
Displays a summary of the DataFrame, including the number of non-null
values per column, data types, and memory usage.

2. df.count()
Counts the number of non-null (non-missing) values in each column.
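
As a minimal sketch of the two methods above, the snippet below builds a small DataFrame and inspects it. The column names and values are hypothetical, chosen only to include a few missing entries.

```python
import numpy as np
import pandas as pd

# Hypothetical data for illustration only; None/NaN mark missing values.
df = pd.DataFrame({
    "name": ["Ana", "Ben", None, "Dee"],
    "age": [34, np.nan, 29, 41],
    "city": ["Lima", "Oslo", "Pune", None],
})

df.info()              # prints the summary to stdout and returns None
non_null = df.count()  # Series holding the non-null count of each column
print(non_null)
```

Each column here has exactly one missing entry, so count() reports 3 for every column even though the DataFrame has 4 rows.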

3. df.columns
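Returns an Index object holding the column names of the DataFrame
(use df.columns.tolist() to get a plain Python list).

4. df.cumsum()
Computes the cumulative (running) sum down each column and returns a
DataFrame of the same shape; intended for numerical columns.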
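
A short illustration of df.columns and df.cumsum(), assuming a hypothetical two-column table of numeric values:

```python
import pandas as pd

# Hypothetical numeric data for illustration only.
df = pd.DataFrame({"units": [3, 1, 4], "revenue": [30.0, 12.5, 40.0]})

col_names = df.columns.tolist()  # convert the column Index to a plain list
running = df.cumsum()            # running total down each column

print(col_names)
print(running)
```

The last row of the result equals the column totals, since each row accumulates everything above it.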

5. df.nsmallest(n, column_name)
Returns the n smallest rows based on the values in the specified column.

6. df.nlargest(n, column_name)
Returns the n largest rows based on the values in the specified column.
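
The following sketch shows nsmallest and nlargest side by side. The product and price columns are hypothetical names used only for this example.

```python
import pandas as pd

# Hypothetical data for illustration only.
df = pd.DataFrame({
    "product": ["A", "B", "C", "D"],
    "price": [9.5, 3.0, 7.25, 12.0],
})

cheapest = df.nsmallest(2, "price")  # two rows with the lowest price
priciest = df.nlargest(2, "price")   # two rows with the highest price

print(cheapest)
print(priciest)
```

Both calls return whole rows, ordered by the chosen column, so the other columns stay attached to each value.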

7. df.sum()
Computes the sum of the values in each column; pass numeric_only=True
to restrict the result to numerical columns.

8. df.idxmax()
Returns the index (row label) of the maximum value in each column.

9. df.idxmin()
Returns the index (row label) of the minimum value in each column.
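
To make the difference between a value and its row label concrete, here is a small sketch covering sum, idxmax, and idxmin. The scores and the string row labels are made up for illustration.

```python
import pandas as pd

# Hypothetical scores; the row labels are names rather than 0, 1, 2.
df = pd.DataFrame(
    {"math": [70, 85, 60], "art": [90, 65, 80]},
    index=["ana", "ben", "dee"],
)

totals = df.sum()      # column totals
best = df.idxmax()     # row label of the maximum in each column
worst = df.idxmin()    # row label of the minimum in each column

print(totals)
print(best)
print(worst)
```

Note that idxmax and idxmin return labels such as "ben", not the scores themselves; the labels can then be passed to df.loc to fetch the full rows.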

10. df.shape
Returns the number of rows and columns in the DataFrame as a tuple.

11. df.head(n=5)
Returns the first n rows of the DataFrame (default is 5).

12. df.tail(n=5)
Returns the last n rows of the DataFrame (default is 5).

13. df.sample(frac=0.5)
Returns a random sample containing the fraction frac of the rows (here,
0.5, i.e. 50%).
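
A quick sketch of shape, head, tail, and sample on a hypothetical ten-row table. The random_state argument is an addition not mentioned above; it is used here only so the sample is reproducible.

```python
import pandas as pd

# Hypothetical data for illustration only.
df = pd.DataFrame({
    "id": range(1, 11),
    "score": [55, 72, 64, 90, 81, 47, 68, 77, 93, 59],
})

n_rows, n_cols = df.shape                  # (rows, columns) tuple
first3 = df.head(3)                        # first three rows
last2 = df.tail(2)                         # last two rows
half = df.sample(frac=0.5, random_state=0) # random half of the rows

print(n_rows, n_cols)
print(first3)
print(last2)
print(half)
```

With ten rows and frac=0.5, the sample contains five rows; which five depends on the seed.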

14. df.index
Returns an Index object representing the row labels of the DataFrame.

15. df.value_counts()
Called on a whole DataFrame, counts the occurrences of each unique row.
To count the occurrences of each unique value in a single column, call
it on that column: df['column_name'].value_counts().

16. df.isnull().sum()
Counts the number of missing (NaN) values in each column.
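
The last three entries can be sketched together as follows. The color and size columns are hypothetical and contain a few deliberately missing values.

```python
import numpy as np
import pandas as pd

# Hypothetical data for illustration only; None/NaN mark missing values.
df = pd.DataFrame({
    "color": ["red", "blue", "red", None, "red"],
    "size": [1.0, np.nan, 3.0, np.nan, 5.0],
})

row_labels = list(df.index)           # default labels are 0..4
counts = df["color"].value_counts()   # missing values are excluded by default
missing = df.isnull().sum()           # missing-value count per column

print(row_labels)
print(counts)
print(missing)
```

Comparing missing with df.count() is a quick consistency check: for every column, the two numbers add up to the number of rows.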
