Python has many data science libraries, and pandas is one of the most popular. Just as NumPy provides the ndarray, pandas provides two core data structures: Series and DataFrame.
A Series represents a one-dimensional labeled array, and a DataFrame represents a two-dimensional table of rows and columns. An Excel sheet can be represented as a DataFrame, and the pandas library provides a built-in read_excel() method for this purpose.
In this short Python tutorial, you will learn how to import an Excel sheet in Python using pandas with the read_excel() method. Before you begin, make sure that pandas is installed on your system.
Python Libraries Required to Import an Excel File in Python
You need to install three libraries in your Python environment to import an Excel sheet using pandas:
pip install numpy
pip install pandas
pip install xlrd
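Make sure that you have installed these three libraries before importing an Excel sheet in Python with pandas. Otherwise, you will get an error such as: ImportError: Missing optional dependency 'xlrd'. Install xlrd >= 1.0.0 for Excel support Use pip or conda to install xlrd.
Note that xlrd 2.0 and later reads only the legacy .xls format. For .xlsx files, recent pandas versions use the openpyxl library instead, which you can install with pip install openpyxl.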
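Before going further, it can help to confirm which of these packages are importable in the current environment. The following is a small sketch that uses only the standard library. The package list includes openpyxl as an assumption, since recent pandas releases rely on it for .xlsx files.

```python
import importlib.util

# Packages discussed above, plus openpyxl (an assumption: recent pandas
# versions use it to read .xlsx files).
packages = ["numpy", "pandas", "xlrd", "openpyxl"]

# find_spec() returns None when a top-level package cannot be found.
status = {name: importlib.util.find_spec(name) is not None for name in packages}

for name, found in status.items():
    print(f"{name}: {'installed' if found else 'missing'}")
```

Any package reported as missing can then be installed with pip before calling read_excel().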
How to Import an Excel File into Python Using Pandas?
read_excel() is a pandas method that allows us to load an Excel sheet in Python. It can read a file from the local file system or from a URL, and it supports files with the extensions xls, xlsx, xlsm, xlsb, odf, ods, and odt.
The example in this tutorial uses a file named countries.xlsx.
Python Program to Import an Excel File Using pandas
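A minimal sketch of such a program might look like the following. Because the contents of countries.xlsx are not listed in this tutorial, the sketch first writes a small workbook with made-up rows to a temporary folder and then reads it back. The column names and values are purely illustrative, and writing the .xlsx file assumes that openpyxl is installed.

```python
import tempfile
from pathlib import Path

import pandas as pd

# Hypothetical sample rows; the real countries.xlsx may look different.
sample = pd.DataFrame(
    {
        "Country": ["India", "Japan", "Brazil"],
        "Capital": ["New Delhi", "Tokyo", "Brasilia"],
    }
)

with tempfile.TemporaryDirectory() as folder:
    excel_path = Path(folder) / "countries.xlsx"
    # Create the workbook so that the example is self-contained.
    sample.to_excel(excel_path, index=False)

    # Import the Excel file into a DataFrame.
    df = pd.read_excel(excel_path)

print(df)
```

With your own workbook saved next to the script, the import reduces to a single call: df = pd.read_excel('countries.xlsx').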
Behind the code:
From the above example, you can see that the read_excel() method imports the countries.xlsx file and converts it into a pandas DataFrame object. Moreover, it uses the first row of the Excel sheet as the column names.
In the above example, the Python script and the Excel file are in the same folder, which is why we can access the file using only its name. If the Excel file and the Python script are in different locations, you need to pass the full or relative path of the Excel file.
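When the workbook lives in another folder, one option is to build the path with pathlib and pass it to read_excel(). The sketch below assumes a hypothetical data subfolder and made-up rows, created in a temporary directory only so that the code runs on its own.

```python
import tempfile
from pathlib import Path

import pandas as pd

with tempfile.TemporaryDirectory() as project_dir:
    # Hypothetical layout: <project_dir>/data/countries.xlsx
    data_dir = Path(project_dir) / "data"
    data_dir.mkdir()
    excel_path = data_dir / "countries.xlsx"

    # Made-up rows so that there is a file to read (needs openpyxl).
    pd.DataFrame({"Country": ["India", "Japan"]}).to_excel(excel_path, index=False)

    # Pass the full path instead of a bare file name.
    df = pd.read_excel(excel_path)

print(df)
```

A plain string path works as well; pathlib simply avoids hand-written path separators that differ between operating systems.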
Pandas read_excel() Method Arguments
The read_excel() method accepts multiple arguments, and most of the arguments are optional, except the file name.
read_excel() important arguments
- io
- header
- names
- index_col
io is the only mandatory argument. It specifies the Excel file to read, typically as a string containing the file path:
df = pd.read_excel(io='countries.xlsx')
header specifies which row of the sheet supplies the column names of the DataFrame. Its default value is 0, which means that row 0 (the first row) becomes the header. If we set it to None, no row is treated as a header: the columns are labeled with integers (0, 1, 2, and so on), and the first row is read as ordinary data.
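The difference between the default header and header=None can be sketched as follows. The workbook and its rows are hypothetical and are written to a temporary folder (which assumes openpyxl is installed) so that the snippet is runnable as is.

```python
import tempfile
from pathlib import Path

import pandas as pd

# Made-up rows used only for illustration.
sample = pd.DataFrame({"Country": ["India", "Japan"], "Capital": ["New Delhi", "Tokyo"]})

with tempfile.TemporaryDirectory() as folder:
    excel_path = Path(folder) / "countries.xlsx"
    sample.to_excel(excel_path, index=False)

    # Default: header=0, so the first sheet row becomes the column names.
    with_header = pd.read_excel(excel_path)
    # header=None: integer column labels, and the first row is kept as data.
    no_header = pd.read_excel(excel_path, header=None)

print(list(with_header.columns), len(with_header))
print(list(no_header.columns), len(no_header))
```

The second DataFrame has one more row than the first, because the original header row is now part of the data.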
names is a list of column names to use for the DataFrame. It is typically used when the sheet has no header row and header is set to None.
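As a rough illustration of names together with header=None, the sketch below writes a hypothetical workbook without a header row and then supplies the column names while reading. The data and the chosen names are assumptions made for the example, and writing the file assumes openpyxl is installed.

```python
import tempfile
from pathlib import Path

import pandas as pd

# Made-up rows; the workbook is written WITHOUT a header row.
sample = pd.DataFrame([["India", "New Delhi"], ["Japan", "Tokyo"]])

with tempfile.TemporaryDirectory() as folder:
    excel_path = Path(folder) / "countries.xlsx"
    sample.to_excel(excel_path, index=False, header=False)

    # No header row in the sheet, so provide the column names explicitly.
    df = pd.read_excel(excel_path, header=None, names=["Country", "Capital"])

print(df)
```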
index_col specifies which column of the sheet supplies the row labels (the index) of the DataFrame. For example, index_col=0 uses the first column as the index.
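A short sketch of index_col in use is given here. As before, the workbook contents are hypothetical and are written to a temporary folder (assuming openpyxl is available) so that the code is self-contained.

```python
import tempfile
from pathlib import Path

import pandas as pd

# Made-up rows used only for illustration.
sample = pd.DataFrame({"Country": ["India", "Japan"], "Capital": ["New Delhi", "Tokyo"]})

with tempfile.TemporaryDirectory() as folder:
    excel_path = Path(folder) / "countries.xlsx"
    sample.to_excel(excel_path, index=False)

    # Use the first sheet column ("Country") as the row labels.
    df = pd.read_excel(excel_path, index_col=0)

print(df)
# Rows can now be looked up by label instead of by position.
print(df.loc["Japan", "Capital"])
```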
Conclusion
In this Python tutorial, you learned how to import an Excel file in Python using pandas with the read_excel() method. Pandas also provides other methods, such as read_table, read_csv, read_json, and read_html, to read and import tables, CSV, JSON, and HTML files, respectively.
Before you use the read_excel() method in Python, ensure that the other dependencies (NumPy and xlrd, or openpyxl for .xlsx files) are installed in your Python environment.