Convert HTML Table Into CSV File in Python

This document provides a Python code example to convert an HTML table to a CSV file. It uses the BeautifulSoup and Pandas modules to parse the HTML, extract the table data and header into lists, then converts it into a Pandas DataFrame and exports to a CSV file. The code gets the header row, loops through each table row extracting the text from each cell into a sublist, appends it to a master list of rows, then loads this into a DataFrame and saves as CSV.

Uploaded by

Jayadevan

Convert HTML table into CSV file in Python

geeksforgeeks.org/convert-html-table-into-csv-file-in-python/

April 2, 2020

Last Updated : 21 Apr, 2020

A CSV (comma-separated values) file stores tabular data as plain text, using a comma to
separate the values in each row. The format is widely used in machine learning, data
handling, and data visualization. In this article, we will discuss how to convert an HTML
table into a CSV file.
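To make the format concrete, here is a minimal sketch using Python's built-in csv module. The rows are made up for illustration and are not taken from the article's sample table.

```python
import csv
import io

# Hypothetical rows used only for illustration.
rows = [
    ["Name", "City"],
    ["Ada", "London, UK"],  # this value itself contains a comma
]

# Write the rows into an in-memory text buffer instead of a file.
buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerows(rows)

csv_text = buffer.getvalue()
print(csv_text)
```

The value that contains a comma is wrapped in double quotes in the output, so the embedded comma is not read as a separator.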

Converting HTML Table into CSV file in Python


Example: Suppose HTML file looks like,
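A stand-in for the sample file could be generated as sketched below. The table contents and the helper name write_sample are assumptions made for illustration; only the filename html.html comes from the code in this article.

```python
from pathlib import Path

# Hypothetical sample table; the names and values are made up.
SAMPLE_HTML = """<html>
<body>
<table>
  <tr><th>Name</th><th>Subject</th><th>Marks</th></tr>
  <tr><td>Asha</td><td>Maths</td><td>82</td></tr>
  <tr><td>Ravi</td><td>Physics</td><td>91</td></tr>
</table>
</body>
</html>
"""


def write_sample(path="html.html"):
    """Write the sample table to `path` and return it as a Path object."""
    target = Path(path)
    target.write_text(SAMPLE_HTML, encoding="utf-8")
    return target
```

Calling write_sample() with no arguments creates html.html in the current working directory, which is where the conversion code looks for it.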

An HTML table can be converted to a CSV file using the BeautifulSoup and Pandas modules of
Python. These modules do not come built in with Python. To install them, type the commands
below in the terminal.

pip install beautifulsoup4


pip install pandas
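Before running the conversion, it can help to confirm that both modules are importable. The sketch below uses only the standard library; the helper name missing_modules is an assumption for illustration. Note that BeautifulSoup is published on PyPI as beautifulsoup4 but is imported under the name bs4.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names from `names` that cannot be found."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# BeautifulSoup is installed as "beautifulsoup4" but imported as "bs4".
print(missing_modules(["bs4", "pandas"]))
```

An empty list means both modules are available; any name that is printed still needs to be installed.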

Python3 Code for converting the HTML table into CSV file

# Importing the required modules
import pandas as pd
from bs4 import BeautifulSoup

path = 'html.html'

# Parse the HTML file; the with-block closes the file afterwards
with open(path) as html_file:
    soup = BeautifulSoup(html_file, 'html.parser')

# Work with the first table in the document
table = soup.find_all("table")[0]

# for getting the header from the HTML file:
# the first row of the table holds the header cells.
# Selecting only th/td tags skips the whitespace between cells.
header = table.find("tr")
list_header = [items.get_text(strip=True)
               for items in header.find_all(["th", "td"])]

# for getting the data: every row after the header row
data = []
HTML_data = table.find_all("tr")[1:]
for element in HTML_data:
    sub_data = [sub_element.get_text(strip=True)
                for sub_element in element.find_all(["th", "td"])]
    data.append(sub_data)

# Storing the data into Pandas DataFrame
dataFrame = pd.DataFrame(data=data, columns=list_header)

# Converting Pandas DataFrame into CSV file
dataFrame.to_csv('Geeks.csv')
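The same steps can be wrapped in a reusable function that takes the HTML as a string and returns the CSV text. This is a sketch rather than part of the original article: the function name html_table_to_csv_text and the table_index parameter are assumptions, and it passes index=False to to_csv, which the script above does not.

```python
import pandas as pd
from bs4 import BeautifulSoup


def html_table_to_csv_text(html, table_index=0):
    """Convert one <table> in `html` to CSV text (hypothetical helper)."""
    soup = BeautifulSoup(html, "html.parser")
    table = soup.find_all("table")[table_index]
    rows = table.find_all("tr")
    if not rows:
        return ""

    # The first row is treated as the header, as in the article's script.
    header = [cell.get_text(strip=True) for cell in rows[0].find_all(["th", "td"])]

    # Each remaining row becomes one list of cell texts.
    data = [
        [cell.get_text(strip=True) for cell in row.find_all(["th", "td"])]
        for row in rows[1:]
    ]

    frame = pd.DataFrame(data=data, columns=header)
    # With no path argument, to_csv returns the CSV as a string.
    return frame.to_csv(index=False)
```

Working on a string instead of a fixed file path makes the logic easy to try out on small snippets before pointing it at a real file.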

Output:
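As a rough guide to what the resulting file contains: to_csv with default arguments writes the DataFrame's row index as an unnamed first column. The sketch below uses made-up rows, not the article's sample data, to show the difference between the default call and index=False.

```python
import pandas as pd

# Made-up rows for illustration only.
frame = pd.DataFrame(
    data=[["Asha", "82"], ["Ravi", "91"]],
    columns=["Name", "Marks"],
)

with_index = frame.to_csv()                # default: row index written first
without_index = frame.to_csv(index=False)  # index column left out

print(with_index)
print(without_index)
```

In the first result, the header line starts with an empty field and each data line starts with its row number. In the second, only the table's own columns appear.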

Article Contributed By :

SohelRaja