Open In App

Convert column type from string to datetime format in Pandas dataframe

Last Updated : 27 Nov, 2024
Comments
Improve
Suggest changes
Like Article
Like
Report

To perform time-series operations, dates should be in the correct format. Let’s learn how to convert a Pandas DataFrame column of strings to datetime format.

Pandas Convert Column To DateTime using pd.to_datetime()

pd.to_datetime() function in Pandas is the most effective way to handle this conversion. This is used to handle different formats and to convert string columns to datetime objects.

Python
# Importing pandas
import pandas as pd

# Creating a DataFrame
df = pd.DataFrame({
    'Date': ['11/8/2011', '04/23/2008', '10/2/2019'],
    'Event': ['Music', 'Poetry', 'Theatre'],
    'Cost': [10000, 5000, 15000]
})

# Display initial data types
print("Before Conversion:")
print(df.info())

# Converting 'Date' column to datetime
df['Date'] = pd.to_datetime(df['Date'])

# Display data types after conversion
print("\nAfter Conversion:")
print(df.info())

Output:

Before Conversion:
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 3 entries, 0 to 2
Data columns (total 3 columns):
# Column Non-Null Count Dtype
--- ------ -------------- -----
0 Date 3 non-null object
1 Event 3 non-null object
2 Cost 3 non-null int64
dtypes: int64(1), object(2)
memory usage: 200.0+ bytes
None
After Conversion:
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 3 entries, 0 to 2
Data columns (total 3 columns):
# Column Non-Null Count Dtype
--- ------ -------------- -----
0 Date 3 non-null datetime64[ns]
1 Event 3 non-null object
2 Cost 3 non-null int64
dtypes: datetime64[ns](1), int64(1), object(1)
memory usage: 200.0+ bytes
None

Before conversion, the Date column has the data type object (string). After using pd.to_datetime(), it changes to datetime64[ns]. Now, let’ s explore how can we can convert columns from string-based formats like yymmdd to yyyymmdd using Pandas.

Converting from ‘yymmdd’ Format

If a column contains dates in the yymmdd format, you can use pd.to_datetime() with the appropriate format to convert it to datetime64[ns].

Python
# Importing pandas library
import pandas as pd

# Initializing the dataset
player_list = [
    ['200712', 50000],
    ['200714', 51000],
    ['200716', 51500],
    ['200719', 53000],
    ['200721', 54000],
    ['200724', 55000],
    ['200729', 57000]
]

# Creating a pandas DataFrame
df = pd.DataFrame(player_list, columns=['Dates', 'Patients'])

# Displaying the DataFrame and its data types before conversion
print("Before Conversion:")
print(df)
print(df.dtypes)

# Converting 'Dates' from 'yymmdd' to datetime format
df['Dates'] = pd.to_datetime(df['Dates'], format='%y%m%d')

# Displaying the DataFrame and its data types after conversion
print("\nAfter Conversion:")
print(df)
print(df.dtypes)

Output:

Before Conversion:
Dates Patients
0 200712 50000
1 200714 51000
2 200716 51500
3 200719 53000
4 200721 54000
5 200724 55000
6 200729 57000
Dates object
Patients int64
dtype: object

After Conversion:
Dates Patients
0 2020-07-12 50000
1 2020-07-14 51000
2 2020-07-16 51500
3 2020-07-19 53000
4 2020-07-21 54000
5 2020-07-24 55000
6 2020-07-29 57000
Dates datetime64[ns]
Patients int64
dtype: object

Converting from ‘yyyymmdd’ Format

When the column is already in yyyymmdd format, use pd.to_datetime() with the %Y%m%d format to convert it directly to datetime64[ns].

Python
import pandas as pd 

# Initializing the dataset
player_list = [
    ['20200712', 50000, '20200812'],
    ['20200714', 51000, '20200814'],
    ['20200716', 51500, '20200816'],
    ['20200719', 53000, '20200819'],
    ['20200721', 54000, '20200821'],
    ['20200724', 55000, '20200824'],
    ['20200729', 57000, '20200824']
]

# Creating a pandas DataFrame
df = pd.DataFrame(player_list, columns=['Treatment_start', 'No_of_Patients', 'Treatment_end'])

# Displaying the DataFrame and its data types before conversion
print("Before Conversion:")
print(df)
print(df.dtypes)

# Converting 'Treatment_start' and 'Treatment_end' to datetime format
df['Treatment_start'] = pd.to_datetime(df['Treatment_start'], format='%Y%m%d')
df['Treatment_end'] = pd.to_datetime(df['Treatment_end'], format='%Y%m%d')

# Displaying the DataFrame and its data types after conversion
print("\nAfter Conversion:")
print(df)
print(df.dtypes)

Output:

converting-from-yyyymmdd-format


Convert Column To DateTime using DataFrame.astype() function

Here’s how you can create a DataFrame, inspect its data types, and convert the string-based date column to datetime using DataFrame.astype().

Python
# Importing pandas
import pandas as pd

# Creating the DataFrame
df = pd.DataFrame({
    'Date': ['11/8/2011', '04/23/2008', '10/2/2019'],
    'Event': ['Music', 'Poetry', 'Theatre'],
    'Cost': [10000, 5000, 15000]
})

# Checking the initial data type of the 'Date' column
print("Before Conversion:")
df.info()

Output: 

The Date column initially has the data type object, indicating it contains strings.

Python
# Converting the 'Date' column to datetime format using astype()
df['Date'] = df['Date'].astype('datetime64[ns]')

# Checking the data type after conversion
print("\nAfter Conversion:")
df.info()

Output:

Using astype(‘datetime64[ns]’)** successfully converts the Date column to the **datetime64[ns]` format.



Next Article

Similar Reads