
Python Pandas to_csv() Method
The to_csv() method in Python's Pandas library provides a convenient way to write data stored in a Pandas DataFrame or Series object to a CSV file. This is particularly useful for data storage, sharing, and further analysis in other applications. The method handles compression, supports various storage options, and allows customization through numerous parameters.
CSV (Comma-Separated Values) is a common plain-text format for storing tabular data, where each row is a line and the columns are separated by a delimiter, most commonly a comma (","). Such files usually have the .csv extension.
Syntax
Below is the syntax of the Python Pandas to_csv() method −
DataFrame.to_csv(path_or_buf=None, *, sep=',', na_rep='', float_format=None, columns=None, header=True, index=True, index_label=None, mode='w', encoding=None, compression='infer', quoting=None, quotechar='"', lineterminator=None, chunksize=None, date_format=None, doublequote=True, escapechar=None, decimal='.', errors='strict', storage_options=None)
When using the to_csv() method on a Series object, you should call it as Series.to_csv().
Parameters
The Python Pandas to_csv() method accepts the below parameters −
path_or_buf: This parameter accepts a string, path object, or file-like object, representing the location of the CSV file. If not specified, the result is returned as a string.
sep: Specifies the field delimiter for the output file. It must be a string of length 1 and defaults to a comma (,).
na_rep: String used to represent missing values. Defaults to an empty string.
float_format: Format string for floating-point numbers, for example '%.2f'.
columns: A list of column labels to include in the output. If not specified, all columns are written.
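header: Boolean or list of strings. If True (the default), the column labels are written as the first row; if False, no header row is written. A list of strings is treated as aliases for the column names.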
index: Whether to include the DataFrame's index in the output CSV file. By default, writes the DataFrame index.
index_label: Specifies a label for the index columns.
mode: Determines the file opening mode. 'w' truncates the file first, 'a' appends to the end of the file if it exists, and 'x' creates the file exclusively, failing if it already exists.
encoding: A string representing the encoding to use in the output file. Defaults to 'utf-8'.
compression: Specifies the compression method to use. If set to 'infer', the method will automatically detect the compression type based on the file extension (e.g., .gz, .bz2, .zip). You can also pass a dictionary to customize compression methods such as gzip, zip, bz2, zstd, etc. If set to None, no compression will be applied.
chunksize: Writes rows in chunks of this size.
storage_options: Additional options for connecting to certain storage back-ends (e.g., AWS S3, Google Cloud Storage).
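errors: Specifies how encoding and decoding errors are handled. Defaults to 'strict'.
Additional parameters: The remaining parameters shown in the syntax (quoting, quotechar, lineterminator, date_format, doublequote, escapechar, and decimal) control quoting, line endings, date formatting, and the decimal separator.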
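Several of the parameters above can be combined in one call. The following is a minimal sketch, assuming pandas 1.5 or later (for the lineterminator spelling); the sample data and column aliases are made up for illustration only.

```python
import pandas as pd

# Hypothetical sample data, used only for illustration
df = pd.DataFrame({
    "name": ["Ravi", "Priya"],
    "salary": [50000.5, None],
    "city": ["Pune", "Delhi"],
})

# With no path_or_buf, to_csv() returns the CSV text instead of writing a file
csv_text = df.to_csv(
    sep=";",                     # single-character delimiter
    columns=["name", "salary"],  # write only these columns
    header=["Name", "Salary"],   # aliases for the written columns
    index=False,                 # omit the row index
    na_rep="NA",                 # text written for missing values
    float_format="%.2f",         # two decimal places for floats
    lineterminator="\n",         # explicit newline, independent of the OS
)
print(csv_text)
```

Setting lineterminator explicitly keeps the output identical across operating systems, which is convenient when the text is compared or checked into version control.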
Return Value
The Pandas to_csv() method returns the CSV data as a string if path_or_buf is not specified. Otherwise, it writes the data to the specified location and returns None.
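The two return behaviours can be checked side by side. This is a small sketch with made-up data; an in-memory io.StringIO buffer stands in for a file on disk.

```python
import io
import pandas as pd

# Hypothetical sample data
df = pd.DataFrame({"a": [1, 2], "b": [3, 4]})

# No target given: the CSV text is returned as a string
text = df.to_csv(index=False, lineterminator="\n")

# Target given (an in-memory buffer here): the data is written, None is returned
buffer = io.StringIO()
result = df.to_csv(buffer, index=False, lineterminator="\n")

print(repr(text))
print(result)
print(buffer.getvalue() == text)
```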
Example: Creating CSV file from Pandas Series
Here is a basic example that creates a CSV file from a Pandas Series using the Series.to_csv() method.
    import pandas as pd

    # Create a Pandas Series
    s = pd.Series([1, 2, 3], index=["duck", "cat", "wolf"])

    # Save the Series in a CSV file
    s.to_csv('series_data.csv')

    print("Series is successfully saved in a CSV file..")
Following is an output of the above code −
Series is successfully saved in a CSV file..
If you visit the folder where the CSV files are saved, you can observe the generated CSV file.
Example: Exporting a DataFrame to a CSV File
This example exports the data in a Pandas DataFrame object to a CSV file using the to_csv() method.
    import pandas as pd

    # Create a sample DataFrame
    df = pd.DataFrame({'name': ['Ravi', 'Priya', 'Kiran', ''],
                       'salary': [50000, 45000, 65000, 55000]})

    # Export the DataFrame to a CSV file
    df.to_csv('DataFrame_data.csv')

    print("DataFrame has been saved to 'DataFrame_data.csv'.")
When we run the above program, it produces the following result −
DataFrame has been saved to 'DataFrame_data.csv'.
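The same kind of export can be compressed simply by choosing a file name with a recognised extension, since compression defaults to 'infer'. The sketch below is illustrative only: the file name and data are assumptions, and a temporary directory is used so nothing is left on disk.

```python
import os
import tempfile
import pandas as pd

# Hypothetical sample data
df = pd.DataFrame({"name": ["Ravi", "Priya", "Kiran"],
                   "salary": [50000, 45000, 65000]})

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "DataFrame_data.csv.gz")

    # compression='infer' (the default) selects gzip from the .gz extension
    df.to_csv(path, index=False)

    # read_csv() infers the compression from the extension as well
    restored = pd.read_csv(path)

    # A gzip file starts with the two magic bytes 0x1f 0x8b
    with open(path, "rb") as fh:
        magic = fh.read(2)

print(restored)
```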
Example: Saving DataFrame to a CSV file with index
This example saves a DataFrame to a CSV file and sets a custom label for the index column, using the index_label parameter of the DataFrame.to_csv() method.
    import pandas as pd

    # Create a sample DataFrame with car data
    data = {
        'Car': ['BMW', 'Lexus', 'Audi', 'Mercedes', 'Jaguar', 'Bentley'],
        'Date_of_purchase': ['2025-01-01', '2025-01-02', '2025-01-07', '2025-01-06', '2025-01-09', '2025-01-22']
    }

    # Create the DataFrame
    df = pd.DataFrame(data)

    # Save the DataFrame to a CSV file, including a custom index label
    df.to_csv('car_with_index.csv', index_label="ID")
    print("DataFrame has been saved with index as 'car_with_index.csv'.")

    # Read the saved CSV file back into a DataFrame
    result = pd.read_csv('car_with_index.csv')
    print("\nDataFrame loaded from CSV file:")
    print(result)
While executing the above code we obtain the following output −
DataFrame has been saved with index as 'car_with_index.csv'.
DataFrame loaded from CSV file:
| | ID | Car | Date_of_purchase |
|---|---|---|---|
| 0 | 0 | BMW | 2025-01-01 |
| 1 | 1 | Lexus | 2025-01-02 |
| 2 | 2 | Audi | 2025-01-07 |
| 3 | 3 | Mercedes | 2025-01-06 |
| 4 | 4 | Jaguar | 2025-01-09 |
| 5 | 5 | Bentley | 2025-01-22 |
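In the output above, the saved ID column comes back as an ordinary column next to a fresh default index. One way to restore it as the index is to pass index_col to read_csv(). The following is a minimal sketch with shortened, made-up data and an in-memory buffer in place of the file.

```python
import io
import pandas as pd

# Hypothetical, shortened version of the car data
df = pd.DataFrame({"Car": ["BMW", "Lexus", "Audi"]})

# Write the index under the label "ID"
csv_text = df.to_csv(index_label="ID", lineterminator="\n")

# Read it back, using the "ID" column as the index again
restored = pd.read_csv(io.StringIO(csv_text), index_col="ID")

print(csv_text)
print(restored)
```

If the index carries no useful information, writing with index=False avoids the extra column altogether.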
Example: Saving DataFrame to a CSV file with Missing Data
The following example uses the na_rep parameter of the to_csv() method to control how missing values are written. Here missing values (None) are written to the CSV file as the string "--". The empty string in the Car column is not a missing value, so it is written as an empty field, which read_csv() then loads as NaN.
    import pandas as pd

    # Create a sample DataFrame with car data
    data = {
        'Car': ['BMW', 'Lexus', 'Audi', 'Mercedes', '', 'Bentley'],
        'Color': ['red', 'yellow', None, 'black', 'white', 'blue'],
        'Date_of_purchase': [None, '2025-01-02', '2025-01-07', '2025-01-06', '2025-01-09', '2025-01-22']
    }

    # Create the DataFrame
    df = pd.DataFrame(data)

    # Save the DataFrame to a CSV file, writing missing values as "--"
    df.to_csv("missing_data.csv", index=False, na_rep="--")
    print("DataFrame with missing values saved as 'missing_data.csv'.")

    # Read the saved CSV file back into a DataFrame
    result = pd.read_csv('missing_data.csv')
    print("\nDataFrame loaded from CSV file:")
    print(result)
Following is an output of the above code −
DataFrame with missing values saved as 'missing_data.csv'.
DataFrame loaded from CSV file:
| | Car | Color | Date_of_purchase |
|---|---|---|---|
| 0 | BMW | red | -- |
| 1 | Lexus | yellow | 2025-01-02 |
| 2 | Audi | -- | 2025-01-07 |
| 3 | Mercedes | black | 2025-01-06 |
| 4 | NaN | white | 2025-01-09 |
| 5 | Bentley | blue | 2025-01-22 |
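As the output shows, read_csv() does not treat "--" as missing by default, so the placeholder comes back as ordinary text. A possible way to round-trip the values is to pass the same marker through the na_values parameter of read_csv(). The sketch below uses a shortened, made-up DataFrame and an in-memory buffer.

```python
import io
import pandas as pd

# Hypothetical, shortened version of the car data
df = pd.DataFrame({
    "Car": ["BMW", "Audi"],
    "Color": ["red", None],
})

# Missing values are written as "--"
csv_text = df.to_csv(index=False, na_rep="--", lineterminator="\n")

# Default read: "--" is loaded as a plain string
as_text = pd.read_csv(io.StringIO(csv_text))

# Telling read_csv() about the marker turns it back into a missing value
as_missing = pd.read_csv(io.StringIO(csv_text), na_values=["--"])

print(as_text["Color"].tolist())
print(as_missing["Color"].isna().tolist())
```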
Example: Appending Data to an Existing CSV File
This example demonstrates appending data to an existing CSV file using the mode='a' parameter of the DataFrame.to_csv() method.
    import pandas as pd

    # Create a sample DataFrame with car data
    data = {
        'Car': ['BMW', 'Lexus', 'Audi', 'Mercedes', '', 'Bentley'],
        'Color': ['red', 'yellow', None, 'black', 'white', 'blue'],
        'Date_of_purchase': [None, '2025-01-02', '2025-01-07', '2025-01-06', '2025-01-09', '2025-01-22']
    }

    # Create the DataFrame
    df = pd.DataFrame(data)

    # Save the DataFrame to a CSV file
    df.to_csv("missing_data.csv", index=False, na_rep="--")

    # Create a new DataFrame
    new_data = {
        "Car": ["Ferrari", "Honda"],
        "Color": ['red', 'black'],
        "Date_of_purchase": ['2024-10-23', "2025-01-01"]
    }
    df_new = pd.DataFrame(new_data)

    # Append data to an existing CSV file, without repeating the header row
    df_new.to_csv("missing_data.csv", mode="a", index=False, header=False)
    print("New data has been appended to 'missing_data.csv'.")

    # Read the saved CSV file back into a DataFrame
    result = pd.read_csv('missing_data.csv')
    print("\nDataFrame loaded from CSV file:")
    print(result)
While executing the above code we obtain the following output −
New data has been appended to 'missing_data.csv'.
DataFrame loaded from CSV file:
| | Car | Color | Date_of_purchase |
|---|---|---|---|
| 0 | BMW | red | -- |
| 1 | Lexus | yellow | 2025-01-02 |
| 2 | Audi | -- | 2025-01-07 |
| 3 | Mercedes | black | 2025-01-06 |
| 4 | NaN | white | 2025-01-09 |
| 5 | Bentley | blue | 2025-01-22 |
| 6 | Ferrari | red | 2024-10-23 |
| 7 | Honda | black | 2025-01-01 |
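When the same code may run both before and after the file exists, the header should be written only once. The sketch below shows one possible approach; append_to_csv is a hypothetical helper written for this illustration, not part of pandas, and the file name and data are assumptions.

```python
import os
import tempfile
import pandas as pd

def append_to_csv(df, path):
    """Hypothetical helper: append df to path, writing the header
    only when the file does not exist yet or is empty."""
    write_header = not os.path.exists(path) or os.path.getsize(path) == 0
    df.to_csv(path, mode="a", index=False, header=write_header)

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "cars.csv")

    # First call creates the file and writes the header row
    append_to_csv(pd.DataFrame({"Car": ["BMW"], "Color": ["red"]}), path)

    # Second call appends the rows only
    append_to_csv(pd.DataFrame({"Car": ["Ferrari"], "Color": ["red"]}), path)

    result = pd.read_csv(path)

print(result)
```

This approach assumes every appended DataFrame has the same columns in the same order, because to_csv() does not check the existing file's header in append mode.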