A DataFrame is a two-dimensional data structure in which data is stored in tabular form, as rows and columns.
It can be thought of as an SQL table or an Excel sheet. A column can be deleted from a dataframe in several ways.
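As a rough sketch of what "several ways" can mean, the snippet below shows two pandas methods other than ‘del’: ‘drop’, which returns a new dataframe, and ‘pop’, which removes a column in place and returns it. The dataframe and its values are hypothetical and are not part of the example that follows.

```python
import pandas as pd

# Hypothetical sample data, used only for illustration.
df = pd.DataFrame({"ab": [1, 8, 7], "cd": [1, 2, 0], "ef": [56, 78, 32]})

# drop() returns a new DataFrame and leaves the original untouched.
without_cd = df.drop(columns=["cd"])

# pop() removes the column in place and returns it as a Series.
popped = df.pop("ef")

print(list(without_cd.columns))  # ['ab', 'ef']
print(list(df.columns))          # ['ab', 'cd']
print(popped.tolist())           # [56, 78, 32]
```

‘drop’ is usually the better fit when the original dataframe must stay unchanged, while ‘pop’ is convenient when the removed column is still needed afterwards.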
This example uses the ‘del’ statement, which is followed by the dataframe indexed with the name of the column to delete:
Example
import pandas as pd

my_data = {
    'ab': pd.Series([1, 8, 7], index=['a', 'b', 'c']),
    'cd': pd.Series([1, 2, 0, 9], index=['a', 'b', 'c', 'd']),
    'ef': pd.Series([56, 78, 32], index=['a', 'b', 'c']),
    'gh': pd.Series([66, 77, 88, 99], index=['a', 'b', 'c', 'd'])
}
my_df = pd.DataFrame(my_data)
print("The dataframe is :")
print(my_df)
print("Deleting the column using the 'del' operator")
del my_df['cd']
print(my_df)
Output
The dataframe is :
    ab  cd    ef  gh
a  1.0   1  56.0  66
b  8.0   2  78.0  77
c  7.0   0  32.0  88
d  NaN   9   NaN  99
Deleting the column using the 'del' operator
    ab    ef  gh
a  1.0  56.0  66
b  8.0  78.0  77
c  7.0  32.0  88
d  NaN   NaN  99
Explanation
The ‘pandas’ library is imported under the alias ‘pd’ for ease of use.
A dictionary is created in which each key is a column name and each value is a Series.
This dictionary is passed to the ‘DataFrame’ constructor of the ‘pandas’ library, which aligns the Series on their index labels.
The ‘del’ statement is used to delete a specific column.
The column to delete is selected by name in square brackets, as in del my_df['cd'].
The column is removed in place, and the modified dataframe is printed on the console.
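The following is a minimal sketch of two consequences of how ‘del’ works: the deletion happens in place, so every name that refers to the same dataframe sees the change, and deleting a column that does not exist raises a KeyError. The names ‘same_object’ and ‘missing_raised’ and the column ‘zz’ are hypothetical and used only for illustration.

```python
import pandas as pd

# Hypothetical sample data, used only for illustration.
df = pd.DataFrame({"ab": [1, 8, 7], "cd": [1, 2, 0]})
same_object = df  # a second name for the same DataFrame, not a copy

del df["cd"]  # removes the column in place; nothing is returned

print(list(same_object.columns))  # ['ab']

# Deleting a column that does not exist raises KeyError.
try:
    del df["zz"]
    missing_raised = False
except KeyError:
    missing_raised = True

print(missing_raised)  # True
```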
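Note: ‘NaN’ stands for ‘Not a Number’ and marks a missing value. Here the ‘ab’ and ‘ef’ Series have no entry for the index label ‘d’, so those cells are filled with NaN, and the two columns are displayed as floating-point numbers.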
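To illustrate where the NaN values come from, here is a small sketch that builds a dataframe from two Series of different lengths and checks for missing entries with ‘isna’. The variable names ‘short’ and ‘full’ are assumptions made for this illustration; the values mirror the ‘ab’ and ‘cd’ columns above.

```python
import pandas as pd

# 'short' has no entry for index label 'd'; 'full' does.
short = pd.Series([1, 8, 7], index=["a", "b", "c"])
full = pd.Series([1, 2, 0, 9], index=["a", "b", "c", "d"])
df = pd.DataFrame({"ab": short, "cd": full})

# The missing cell in 'ab' is filled with NaN during index alignment.
print(df["ab"].isna().tolist())  # [False, False, False, True]

# NaN is a floating-point value, so the whole 'ab' column becomes float.
print(df["ab"].dtype)  # float64

# 'cd' has a value for every label, so it keeps an integer dtype.
print(pd.api.types.is_integer_dtype(df["cd"]))  # True
```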