Pandas DataFrame DataFrame.dropna() Function

  1. Syntax of pandas.DataFrame.dropna()
  2. Example Codes: DataFrame.dropna() to Drop Row
  3. Example Codes: DataFrame.dropna() to Drop Column
  4. Example Codes: DataFrame.dropna() With how=all
  5. Example Codes: DataFrame.dropna() With a Specified Subset or Thresh
  6. Example Codes: DataFrame.dropna() With inplace=True

The pandas.DataFrame.dropna() function removes null (missing) values from a DataFrame by dropping the rows or columns that contain them.

None, NaN (Not a Number), and NaT (Not a Time) all represent null values. DataFrame.dropna() detects these values and filters the DataFrame accordingly.
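As a minimal sketch of how these markers are detected, the snippet below builds a small DataFrame that mixes NaN, None, and NaT. The name events and its columns are hypothetical and used only for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data mixing the three common missing-value markers.
events = pd.DataFrame(
    {
        "score": [1.5, np.nan, 3.0],
        "label": ["a", "b", None],
        "when": pd.to_datetime(["2024-01-01", None, "2024-01-03"]),
    }
)

# isna() flags NaN, None, and NaT alike.
print(events.isna())

# Only the first row has no missing value, so it is the only row kept.
complete_rows = events.dropna()
print(complete_rows)
```

Checking isna() first is a convenient way to see what dropna() will treat as missing before any data is removed.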

Syntax of pandas.DataFrame.dropna()

DataFrame.dropna(axis=0, how='any', thresh=None, subset=None, inplace=False)

Parameters

axis: Determines whether rows or columns are dropped.
If it is 0 or 'index', the function drops the rows that contain missing values.
If it is 1 or 'columns', the function drops the columns that contain missing values. The default is 0.
how: Determines when a row or column is dropped. It accepts only two strings, 'any' or 'all'. The default is 'any'.
'any' drops the row or column if it contains at least one null value.
'all' drops the row or column only if all of its values are missing.
thresh: An integer giving the minimum number of non-missing values a row or column must contain to be kept. Recent pandas versions do not allow thresh and how to be passed together.
subset: A label or list of labels along the other axis that limits where missing values are looked for. When dropping rows, these are column names.
inplace: A Boolean value. If True, the function modifies the caller DataFrame and returns None. The default is False.
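The two spellings of each axis value can be compared directly. The following is a small sketch using the sample data from this article; the variable names are illustrative only.

```python
import pandas as pd

dataframe = pd.DataFrame(
    {
        "Attendance": {0: 60, 1: None, 2: 80, 3: None, 4: 95},
        "Name": {0: "Olivia", 1: "John", 2: "Laura", 3: "Ben", 4: "Kevin"},
        "Obtained Marks": {0: None, 1: 75, 2: 82, 3: 64, 4: None},
    }
)

# axis=0 and axis="index" both drop rows.
rows_by_number = dataframe.dropna(axis=0, how="any")
rows_by_name = dataframe.dropna(axis="index", how="any")
print(rows_by_number.equals(rows_by_name))  # True

# axis=1 and axis="columns" both drop columns.
cols_by_number = dataframe.dropna(axis=1)
cols_by_name = dataframe.dropna(axis="columns")
print(cols_by_number.equals(cols_by_name))  # True
```

The string forms tend to read more clearly in longer scripts, since they name what is being dropped.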

Return

It returns a new, filtered DataFrame with rows or columns dropped according to the passed parameters. If inplace is True, it returns None instead.

Example Codes: DataFrame.dropna() to Drop Row

By default, axis is 0 (rows), so DataFrame.dropna() drops rows unless another axis is specified.

import pandas as pd

dataframe = pd.DataFrame(
    {
        "Attendance": {0: 60, 1: None, 2: 80, 3: None, 4: 95},
        "Name": {0: "Olivia", 1: "John", 2: "Laura", 3: "Ben", 4: "Kevin"},
        "Obtained Marks": {0: None, 1: 75, 2: 82, 3: 64, 4: None},
    }
)
print(dataframe)

The example DataFrame is as follows.

   Attendance    Name  Obtained Marks
0        60.0  Olivia             NaN
1         NaN    John            75.0
2        80.0   Laura            82.0
3         NaN     Ben            64.0
4        95.0   Kevin             NaN

All the parameters of this function are optional. If we pass no parameters, the function drops every row that contains at least one null value.

import pandas as pd

dataframe = pd.DataFrame(
    {
        "Attendance": {0: 60, 1: None, 2: 80, 3: None, 4: 95},
        "Name": {0: "Olivia", 1: "John", 2: "Laura", 3: "Ben", 4: "Kevin"},
        "Obtained Marks": {0: None, 1: 75, 2: 82, 3: 64, 4: None},
    }
)
dataframe1 = dataframe.dropna()
print(dataframe1)

Output:

   Attendance   Name  Obtained Marks
2        80.0  Laura            82.0

It has dropped every row that contained at least one missing value. Only row 2, which has no missing values, remains.
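To see which rows the default call removes, one option is to build the equivalent Boolean mask by hand. This is only a sketch of the same filtering logic, not a description of how pandas implements dropna() internally; has_missing and kept are illustrative names.

```python
import pandas as pd

dataframe = pd.DataFrame(
    {
        "Attendance": {0: 60, 1: None, 2: 80, 3: None, 4: 95},
        "Name": {0: "Olivia", 1: "John", 2: "Laura", 3: "Ben", 4: "Kevin"},
        "Obtained Marks": {0: None, 1: 75, 2: 82, 3: 64, 4: None},
    }
)

# True for every row that holds at least one missing value.
has_missing = dataframe.isna().any(axis=1)

# These are the rows that dropna() would remove.
print(dataframe[has_missing])

# Keeping the remaining rows gives the same result as dropna().
kept = dataframe[~has_missing]
print(kept.equals(dataframe.dropna()))
```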

Example Codes: DataFrame.dropna() to Drop Column

import pandas as pd

dataframe = pd.DataFrame(
    {
        "Attendance": {0: 60, 1: None, 2: 80, 3: None, 4: 95},
        "Name": {0: "Olivia", 1: "John", 2: "Laura", 3: "Ben", 4: "Kevin"},
        "Obtained Marks": {0: None, 1: 75, 2: 82, 3: 64, 4: None},
    }
)
dataframe1 = dataframe.dropna(axis=1)

print(dataframe1)

Output:

     Name
0  Olivia
1    John
2   Laura
3     Ben
4   Kevin

It has dropped every column that contained at least one missing value because we set axis=1 in the DataFrame.dropna() method. Only the Name column has no missing values, so it is the only column kept.
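Before dropping columns, it can help to count the missing values in each one. The sketch below uses the same sample data; missing_per_column and columns_with_gaps are illustrative names.

```python
import pandas as pd

dataframe = pd.DataFrame(
    {
        "Attendance": {0: 60, 1: None, 2: 80, 3: None, 4: 95},
        "Name": {0: "Olivia", 1: "John", 2: "Laura", 3: "Ben", 4: "Kevin"},
        "Obtained Marks": {0: None, 1: 75, 2: 82, 3: 64, 4: None},
    }
)

# Number of missing values in each column.
missing_per_column = dataframe.isna().sum()
print(missing_per_column)

# Columns with at least one missing value are the ones axis=1 removes.
columns_with_gaps = missing_per_column[missing_per_column > 0].index.tolist()
print(columns_with_gaps)

remaining = dataframe.dropna(axis=1)
print(list(remaining.columns))
```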

Example Codes: DataFrame.dropna() With how=all

import pandas as pd

dataframe = pd.DataFrame(
    {
        "Attendance": {0: 60, 1: None, 2: 80, 3: None, 4: 95},
        "Name": {0: "Olivia", 1: "John", 2: "Laura", 3: "Ben", 4: "Kevin"},
        "Obtained Marks": {0: None, 1: 75, 2: 82, 3: 64, 4: None},
    }
)

dataframe1 = dataframe.dropna(axis=1, how="all")
print(dataframe1)

Output:

   Attendance    Name  Obtained Marks
0        60.0  Olivia             NaN
1         NaN    John            75.0
2        80.0   Laura            82.0
3         NaN     Ben            64.0
4        95.0   Kevin             NaN

No column is dropped here. Because axis is 1 and how is set to all, a column is dropped only if every value in it is null, and each column in this DataFrame has at least one non-missing value.

If every value in a column is missing, DataFrame.dropna() with axis=1 and how="all" drops that column.

import pandas as pd

dataframe = pd.DataFrame(
    {
        "Attendance": {0: 60, 1: None, 2: 80, 3: None, 4: 95},
        "Name": {0: "Olivia", 1: "John", 2: "Laura", 3: "Ben", 4: "Kevin"},
        "Obtained Marks": {0: None, 1: None, 2: None, 3: None, 4: None},
    }
)

print(dataframe)
print("--------")
dataframe1 = dataframe.dropna(axis=1, how="all")
print(dataframe1)

Output:

   Attendance    Name Obtained Marks
0        60.0  Olivia           None
1         NaN    John           None
2        80.0   Laura           None
3         NaN     Ben           None
4        95.0   Kevin           None
--------
   Attendance    Name
0        60.0  Olivia
1         NaN    John
2        80.0   Laura
3         NaN     Ben
4        95.0   Kevin
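The same how="all" rule applies to rows when axis is left at its default. The sketch below uses a smaller, hypothetical DataFrame in which one row is entirely empty.

```python
import pandas as pd

# Hypothetical data: the row at index 1 has no values at all.
dataframe = pd.DataFrame(
    {
        "Attendance": [60, None, 80],
        "Name": ["Olivia", None, "Laura"],
        "Obtained Marks": [None, None, 82],
    }
)

# Row 0 is kept even though one value is missing;
# row 1 is dropped because every value in it is missing.
rows_kept = dataframe.dropna(how="all")
print(rows_kept)
```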

Example Codes: DataFrame.dropna() With a Specified Subset or Thresh

import pandas as pd

dataframe = pd.DataFrame(
    {
        "Attendance": {0: 60, 1: None, 2: 80, 3: None, 4: 95},
        "Name": {0: "Olivia", 1: "John", 2: "Laura", 3: "Ben", 4: "Kevin"},
        "Obtained Marks": {0: None, 1: 75, 2: 82, 3: 64, 4: None},
    }
)

dataframe1 = dataframe.dropna(thresh=3)
print(dataframe1)

Output:

   Attendance   Name  Obtained Marks
2        80.0  Laura            82.0

The value of thresh is 3, which means a row must contain at least 3 non-missing values to be kept. Only row 2 meets that requirement.
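One way to reason about thresh is to count the non-missing values per row or column first. The following sketch does that with the sample data; non_null_counts is an illustrative name.

```python
import pandas as pd

dataframe = pd.DataFrame(
    {
        "Attendance": {0: 60, 1: None, 2: 80, 3: None, 4: 95},
        "Name": {0: "Olivia", 1: "John", 2: "Laura", 3: "Ben", 4: "Kevin"},
        "Obtained Marks": {0: None, 1: 75, 2: 82, 3: 64, 4: None},
    }
)

# Non-missing values per row: only row 2 reaches 3.
non_null_counts = dataframe.notna().sum(axis=1)
print(non_null_counts)

# Lowering the threshold to 2 keeps every row in this sample.
print(dataframe.dropna(thresh=2))

# thresh also works on columns: only Name has 4 or more non-missing values.
print(dataframe.dropna(axis=1, thresh=4))
```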

We could also specify the subset.

import pandas as pd

dataframe = pd.DataFrame(
    {
        "Attendance": {0: 60, 1: None, 2: 80, 3: None, 4: 95},
        "Name": {0: "Olivia", 1: "John", 2: "Laura", 3: "Ben", 4: "Kevin"},
        "Obtained Marks": {0: None, 1: 75, 2: 82, 3: 64, 4: None},
    }
)

dataframe1 = dataframe.dropna(subset=["Attendance", "Name"])
print(dataframe1)

Output:

   Attendance    Name  Obtained Marks
0        60.0  Olivia             NaN
2        80.0   Laura            82.0
4        95.0   Kevin             NaN

It drops a row only if the value in the Attendance or Name column is missing. A row is kept if its only missing values are in other columns, here Obtained Marks.
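The subset parameter can also name a single column, and it can be combined with how. The sketch below shows both with the sample data; the result names are illustrative.

```python
import pandas as pd

dataframe = pd.DataFrame(
    {
        "Attendance": {0: 60, 1: None, 2: 80, 3: None, 4: 95},
        "Name": {0: "Olivia", 1: "John", 2: "Laura", 3: "Ben", 4: "Kevin"},
        "Obtained Marks": {0: None, 1: 75, 2: 82, 3: 64, 4: None},
    }
)

# Only Obtained Marks is checked, so rows 0 and 4 are dropped.
has_marks = dataframe.dropna(subset=["Obtained Marks"])
print(has_marks)

# With how="all", a row is dropped only if both listed columns are missing.
# No row in this sample lacks both, so all five rows are kept.
has_either = dataframe.dropna(subset=["Attendance", "Obtained Marks"], how="all")
print(has_either)
```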

Example Codes: DataFrame.dropna() With inplace=True

DataFrame.dropna() changes the caller DataFrame in-place if inplace is set to True.

import pandas as pd

dataframe = pd.DataFrame(
    {
        "Attendance": {0: 60, 1: None, 2: 80, 3: None, 4: 95},
        "Name": {0: "Olivia", 1: "John", 2: "Laura", 3: "Ben", 4: "Kevin"},
        "Obtained Marks": {0: None, 1: 75, 2: 82, 3: 64, 4: None},
    }
)
dataframe1 = dataframe.dropna(inplace=True)
print(dataframe1)

Output:

None

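Because inplace is True, the method modified the caller DataFrame in place and returned None, so dataframe1 holds None rather than a DataFrame. The filtered data is in dataframe itself.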
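A common way to use inplace=True is to call the method for its side effect and then inspect the original variable. The sketch below shows that, along with the alternative of rebinding the name to the returned DataFrame; the second DataFrame, other, is hypothetical.

```python
import pandas as pd

dataframe = pd.DataFrame(
    {
        "Attendance": {0: 60, 1: None, 2: 80, 3: None, 4: 95},
        "Name": {0: "Olivia", 1: "John", 2: "Laura", 3: "Ben", 4: "Kevin"},
        "Obtained Marks": {0: None, 1: 75, 2: 82, 3: 64, 4: None},
    }
)

# Call dropna(inplace=True) for its side effect; do not assign the result.
dataframe.dropna(inplace=True)
print(dataframe)  # the caller now holds only the complete row

# Similar effect without inplace: rebind the name to the returned DataFrame.
other = pd.DataFrame({"Attendance": [60, None], "Name": ["Olivia", "John"]})
other = other.dropna()
print(other)
```

Rebinding keeps the call chainable with other DataFrame methods, which inplace=True does not allow.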
