How to convert a tab-separated file into a dataframe using Python
Last Updated: 11 Jul, 2024
In this article, we will learn how to convert a TSV file into a data frame using Python and the Pandas library.
A TSV (Tab-Separated Values) file is a plain text file where data is organized in rows and columns, with each column separated by a tab character.
- It is a type of delimiter-separated file, similar to CSV (Comma-Separated Values).
- Tab-separated files are commonly used in data manipulation and analysis, and being able to convert them into a data frame can greatly enhance our ability to work with structured data efficiently.
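To make the format concrete, here is a minimal sketch that writes a few rows to a TSV file and prints the raw text so the tab characters are visible. The rows and the file name 'sample.tsv' are hypothetical, chosen only to resemble the integer columns used in this article.

```python
import csv
import os
import tempfile

# Hypothetical sample rows (illustrative values only).
rows = [
    [0, 50, 5, 881250949],
    [0, 172, 5, 881250949],
    [196, 242, 3, 881250949],
]

# Write the rows to a temporary file, with one tab between values
# and a plain newline at the end of each row.
tsv_path = os.path.join(tempfile.mkdtemp(), "sample.tsv")
with open(tsv_path, "w", newline="") as f:
    writer = csv.writer(f, delimiter="\t", lineterminator="\n")
    writer.writerows(rows)

# Read the raw text back; repr() shows each tab as \t.
with open(tsv_path, newline="") as f:
    raw_text = f.read()

print(repr(raw_text))
```

Each line of the file is one row, and the '\t' characters mark the column boundaries.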
Methods to Convert Tab-Separated Files into a Data Frame
Method 1: Using pandas 'read_csv()' with 'sep' parameter
In this method, we will use the Pandas library to read a tab-separated file into a data frame.
Look at the following code snippet.
- We import the pandas library and define the path of the tab-separated file.
- We then call 'pd.read_csv()' to read the contents of the file into a DataFrame, passing "sep='\t'" to specify that the values are tab-separated.
- 'read_csv()' assumes a comma delimiter by default and does not detect tabs on its own, so the 'sep' argument is required here.
- The sample file has no header row, so its first line of data becomes the column names in the output; passing 'header=None' keeps that line as data.
Python
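import pandas as pd
# Path of the tab-separated file
file_path = "file.tsv"
# sep='\t' tells read_csv() that the columns are separated by tabs
df = pd.read_csv(file_path, sep='\t')
df.head()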
Output:
0 50 5 881250949
0 0 172 5 881250949
1 0 133 1 881250949
2 196 242 3 881250949
3 186 302 3 891717742
4 22 377 1 878887116
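In the output above, the first line of the file was used as the column names. If the file has no header row, a sketch like the following keeps that line as data. It reads from an in-memory string instead of 'file.tsv' so it can run on its own, and the column names are illustrative assumptions, not taken from the original data.

```python
import io

import pandas as pd

# Hypothetical in-memory stand-in for a headerless 'file.tsv'.
tsv_text = (
    "0\t50\t5\t881250949\n"
    "0\t172\t5\t881250949\n"
    "0\t133\t1\t881250949\n"
)

# header=None keeps the first line as data;
# names= supplies the column labels (assumed names, for illustration).
df = pd.read_csv(
    io.StringIO(tsv_text),
    sep="\t",
    header=None,
    names=["user_id", "item_id", "rating", "timestamp"],
)

print(df.shape)
print(df.head())
```

With 'header=None' and no 'names' argument, pandas labels the columns 0, 1, 2, and so on.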
Method 2: Using pandas 'read_table()' function
In the following code snippet, we again use the pandas library to read the contents of a tab-separated file named 'file.tsv' into a DataFrame named 'df'. This time we call the pd.read_table() function, which uses a tab as its default separator, so no 'sep' argument is needed.
Python
import pandas as pd
df = pd.read_table('file.tsv')
df.head()
Output:
0 50 5 881250949
0 0 172 5 881250949
1 0 133 1 881250949
2 196 242 3 881250949
3 186 302 3 891717742
4 22 377 1 878887116
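As a quick check that the two functions agree on tab-separated input, the sketch below reads the same text with both and compares the results. The sample text, including its header row, is a hypothetical stand-in for 'file.tsv'.

```python
import io

import pandas as pd

# Hypothetical sample with a header row (illustrative names and values).
tsv_text = "user_id\titem_id\trating\n0\t50\t5\n0\t172\t5\n196\t242\t3\n"

# read_table() defaults to a tab separator; read_csv() needs sep='\t'.
df_table = pd.read_table(io.StringIO(tsv_text))
df_csv = pd.read_csv(io.StringIO(tsv_text), sep="\t")

# DataFrame.equals() compares both the values and the column dtypes.
same = df_table.equals(df_csv)
print(same)
```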
Method 3: Using csv module
The code example begins by importing the csv module, which provides functionality for reading and writing delimited text files, together with pandas.
- It uses the open() function to open the file specified by file_path in read-only mode ('r'), inside a with statement to ensure the file is closed after reading.
- It creates a reader object using csv.reader(file, delimiter='\t'), specifying that the values in the file are tab-separated.
- It builds the DataFrame while the file is still open, because the reader pulls rows from the file only as they are requested. Every value is read as a string.
Python
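import csv
import pandas as pd
file_path = "file.tsv"
# newline='' is the setting the csv module recommends for files it reads
with open(file_path, 'r', newline='') as file:
    reader = csv.reader(file, delimiter='\t')
    # Consume the reader inside the with block, while the file is still open
    df = pd.DataFrame(list(reader))
df.head()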
Output:
0 1 2 3
0 0 50 5 881250949
1 0 172 5 881250949
2 0 133 1 881250949
3 196 242 3 881250949
4 186 302 3 891717742
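The csv module returns every field as a string, so the DataFrame built this way holds text, not numbers. A minimal sketch of one way to convert the columns is shown below. It assumes every cell holds a valid integer, and it uses a hypothetical in-memory sample in place of 'file.tsv'.

```python
import csv
import io

import pandas as pd

# Hypothetical in-memory stand-in for 'file.tsv'.
tsv_text = "0\t50\t5\t881250949\n0\t172\t5\t881250949\n"

reader = csv.reader(io.StringIO(tsv_text), delimiter="\t")
df_raw = pd.DataFrame(list(reader))

# Each cell is still a Python str at this point.
print(type(df_raw.iloc[0, 1]))

# Convert every column to 64-bit integers (assumes all cells are valid integers).
df = df_raw.astype("int64")
print(df.dtypes)
```

If some columns contain text, a per-column conversion such as pd.to_numeric() on the numeric columns only may be a better fit.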
Method 4: Use 'numpy' to load the data and then convert to a DataFrame
This code uses NumPy's 'genfromtxt()' function to load tab-separated data from 'file.tsv' into a NumPy array, setting the delimiter to a tab and passing 'dtype=None' so that NumPy infers the data type of each column. It then converts the NumPy array into a pandas DataFrame for further analysis and manipulation.
Python
import numpy as np
import pandas as pd
data = np.genfromtxt('file.tsv', delimiter='\t', dtype=None, encoding=None)
df = pd.DataFrame(data)
df.head()
Output:
0 1 2 3
0 0 50 5 881250949
1 0 172 5 881250949
2 0 133 1 881250949
3 196 242 3 881250949
4 186 302 3 891717742
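When all columns are known to be integers, the type can be stated explicitly and column labels added when building the DataFrame. The following is a sketch under that assumption; the in-memory sample and the column names are hypothetical.

```python
import io

import numpy as np
import pandas as pd

# Hypothetical in-memory stand-in for 'file.tsv' (all-integer columns).
tsv_text = (
    "0\t50\t5\t881250949\n"
    "0\t172\t5\t881250949\n"
    "196\t242\t3\t881250949\n"
)

# An explicit integer dtype yields a plain 2-D array, one row per line.
data = np.genfromtxt(io.StringIO(tsv_text), delimiter="\t", dtype=np.int64)

# Column names are assumptions for illustration.
df = pd.DataFrame(data, columns=["user_id", "item_id", "rating", "timestamp"])
print(df.head())
```

Note that 'genfromtxt()' is aimed at numeric data; for files that mix text and numbers, 'read_csv()' is usually the simpler choice.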