How to extract Time data from an Excel file column using Pandas?
Last Updated: 02 Sep, 2020
Prerequisite: Regular Expressions in Python
In this article, we will discuss how to extract time data from an Excel file column using Pandas. Suppose our Excel file (time_sample_data.xlsx) has a Description column whose text contains a time. We have to extract the time from that column and store it in a new DataFrame column.
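The sample workbook is not reproduced here, so one way to follow along without it is to build a small stand-in DataFrame in memory. This is only a sketch: the rows are invented, the `Name` column is hypothetical, and only the `Description` column name is taken from the code in this article.

```python
import pandas as pd

# Hypothetical stand-in for time_sample_data.xlsx.
# The rows are invented; only the 'Description' column name
# comes from the code used later in this article.
data = pd.DataFrame({
    "Name": ["Task A", "Task B", "Task C"],
    "Description": [
        "Meeting started at 09:15:30 in room 2",
        "Backup finished at 23:59:59 without errors",
        "Ticket was closed at 14:05:07 by support",
    ],
})

print(data)
```

With this stand-in, the `pd.read_excel(...)` call in Step 1 can be skipped and the remaining steps applied to `data` directly.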
Approach:
- Import the required module.
- Import data from Excel file.
- Make an extra column to store the extracted time.
- Get the positions of the column to search and the column to fill.
- Define the pattern of the time format (HH:MM:SS).
- Search for the time and assign it to the respective column in the DataFrame.
Let's see the step-by-step implementation:
Step 1: Import the required module and read data from Excel file.
Python3
# importing required modules
import pandas as pd
import re
# Read the Excel file and store it in a DataFrame
data = pd.read_excel("time_sample_data.xlsx")
print("Original DataFrame")
data
Step 2: Make an extra column for storing Time data.
Python3
# Create column for Time
data['New time'] = None
data
Step 3: Get the positions of the column to search and the column to fill.
Python3
# get the integer positions of the source and target columns
index_set = data.columns.get_loc('Description')
index_time = data.columns.get_loc('New time')
print(index_set, index_time)
Output:
1 2
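To make the role of these two positions concrete, here is a small sketch on an invented two-row DataFrame (the data is hypothetical). `columns.get_loc` returns the integer position of a column label, and `iat` reads or writes a single cell by integer row and column position.

```python
import pandas as pd

# Invented data, for illustration only
df = pd.DataFrame({
    "Name": ["A", "B"],
    "Description": ["starts 10:00:00", "ends 18:30:45"],
})
df["New time"] = None

# integer positions of the column labels
col_desc = df.columns.get_loc("Description")
col_time = df.columns.get_loc("New time")

# iat takes (row position, column position)
first_description = df.iat[0, col_desc]
df.iat[0, col_time] = "10:00:00"

print(col_desc, col_time)
print(first_description)
```

Because `iat` works on positions rather than labels, the positions only need to be looked up once before the loop.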
Step 4: Defining the Regular expression (regex) for the time.
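Regex for time in HH:MM:SS format (24-hour):
(?:[01]\d|2[0-3]):[0-5]\d:[0-5]\d
Here [01]\d|2[0-3] matches the hours 00 to 23, and [0-5]\d matches the minutes and seconds 00 to 59.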
Python3
# define time pattern: hours 00-23, minutes and seconds 00-59
time_pattern = r'((?:[01]\d|2[0-3]):[0-5]\d:[0-5]\d)'
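Inside square brackets a regex matches one character from a set, not a numeric range, so `[0-24]` means the characters 0, 1, 2 and 4. The quick check below uses only the `re` module to compare a pattern written with such classes against one possible stricter pattern. The sample strings and the names `class_pattern` and `strict_pattern` are invented for this sketch.

```python
import re

# Character classes: [0-24] is the set {0, 1, 2, 4},
# and [0-60] is the set {0, 1, 2, 3, 4, 5, 6}
class_pattern = r'[0-24]{2}\:[0-60]{2}\:[0-60]{2}'

# One possible stricter pattern: hours 00-23, minutes and seconds 00-59
strict_pattern = r'(?:[01]\d|2[0-3]):[0-5]\d:[0-5]\d'

# Invented sample strings
samples = [
    "Logged in at 09:45:58",
    "Job ended at 12:30:15",
    "Code 44:66:66 is not a time",
]

for text in samples:
    loose = re.search(class_pattern, text)
    strict = re.search(strict_pattern, text)
    print(text, "|",
          loose.group() if loose else None, "|",
          strict.group() if strict else None)
```

The class-based pattern misses 09:45:58 (9 and 8 are not in its sets) and accepts 44:66:66, while the stricter pattern does the opposite in both cases.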
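Step 5: Search for the time and assign it to the respective column in the DataFrame.
To search for the time in a string using regex, we use the re.search() function of the re library. It returns None when nothing matches, so the result is checked before calling group().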
Python3
# search each Description value
# for the time pattern
for row in range(0, len(data)):
    match = re.search(time_pattern,
                      str(data.iat[row, index_set]))
    # leave the cell as None when no time is found
    if match:
        data.iat[row, index_time] = match.group()
print("Final DataFrame")
data
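The same result can usually be obtained without an explicit loop. As an alternative sketch (not the method used in the steps above), pandas' `Series.str.extract` applies a regex with one capture group to every row and gives a missing value where nothing matches. The DataFrame below is invented for illustration.

```python
import pandas as pd

# Invented rows standing in for the Excel data
data = pd.DataFrame({
    "Name": ["A", "B", "C"],
    "Description": [
        "Meeting started at 09:15:30 in room 2",
        "No time recorded for this entry",
        "Ticket was closed at 14:05:07 by support",
    ],
})

# One capture group, so expand=False returns a Series
time_pattern = r'((?:[01]\d|2[0-3]):[0-5]\d:[0-5]\d)'
data["New time"] = data["Description"].str.extract(time_pattern, expand=False)

print(data)
```

Rows without a time come back as missing values instead of raising an error, unlike calling `.group()` directly on a failed `re.search`, which raises an AttributeError.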
Complete Code:
Python3
# importing required modules
import pandas as pd
import re
# read the Excel file into a DataFrame
data = pd.read_excel("time_sample_data.xlsx")
print("Original DataFrame")
print(data)
# create a column for the extracted time
data['New time'] = None
print(data)
# get the integer positions of the source and target columns
index_set = data.columns.get_loc('Description')
index_time = data.columns.get_loc('New time')
print(index_set, index_time)
# define the time pattern in HH:MM:SS (24-hour)
time_pattern = r'((?:[01]\d|2[0-3]):[0-5]\d:[0-5]\d)'
# search each Description value for the time pattern
for row in range(0, len(data)):
    match = re.search(time_pattern, str(data.iat[row, index_set]))
    # leave the cell as None when no time is found
    if match:
        data.iat[row, index_time] = match.group()
print("\nFinal DataFrame")
print(data)
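Note: Before running this program, make sure an Excel reader engine is installed in your Python environment. Recent pandas versions read .xlsx files with the openpyxl library; xlrd 2.0 and later supports only the older .xls format.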