Python Importing Data Cheat Sheet

This cheat sheet provides essential methods for importing and analyzing data sets using Python's pandas library. It includes code examples for reading CSV files, printing entries, assigning headers, replacing values, retrieving data types, and saving data frames. The document also notes the usage of JupyterLite and local Python environments for file paths.


Data Analysis with Python


Cheat Sheet: Importing Data Sets

Each entry lists the Package/Method, its Description, and a Code Example. The examples assume pandas is imported as pd and NumPy as np.

Package/Method: Read CSV data set
Description: Read the CSV file containing a data set into a pandas data frame.
Code Example:
    df = pd.read_csv(<CSV_path>, header = None)  # load without header
    df = pd.read_csv(<CSV_path>, header = 0)     # load using first row as header
Note: The labs in this course run in the JupyterLite environment. In JupyterLite, you need to download the required file to the local environment and then use the local path to the file as the CSV_path. If you are using JupyterLab, or any other Python environment on your local machine, you can use the URL of the required file directly as the CSV_path.

Package/Method: Print first few entries
Description: Print the first few entries (default 5) of the pandas data frame.
Code Example:
    df.head(n)  # n = number of entries; default 5

Package/Method: Print last few entries
Description: Print the last few entries (default 5) of the pandas data frame.
Code Example:
    df.tail(n)  # n = number of entries; default 5

Package/Method: Assign header names
Description: Assign appropriate header names to the data frame.
Code Example:
    df.columns = headers

Package/Method: Replace "?" with NaN
Description: Replace "?" entries with the NaN value from the NumPy library.
Code Example:
    df = df.replace("?", np.nan)

Package/Method: Retrieve data types
Description: Retrieve the data types of the data frame columns.
Code Example:
    df.dtypes

Package/Method: Retrieve statistical description
Description: Retrieve the statistical description of the data set. By default, only numerical data types are summarized. Use include="all" to create a summary for all variables.
Code Example:
    df.describe()  # default use
    df.describe(include="all")

Package/Method: Retrieve data set summary
Description: Retrieve the summary of the data set being used, from the data frame.
Code Example:
    df.info()

Package/Method: Save data frame to CSV
Description: Save the processed data frame to a CSV file with a specified path.
Code Example:
    df.to_csv(<output CSV path>)
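
The methods above are often used together in one pass. The following is a minimal sketch of that workflow, not part of the original cheat sheet. It makes several assumptions: the sample rows, the column names in headers, and the output file name cleaned.csv are all hypothetical, and an in-memory string stands in for <CSV_path> so the example runs without any external file.

```python
import io
import os
import tempfile

import numpy as np
import pandas as pd

# Hypothetical sample data standing in for <CSV_path>; it has no header row.
raw_csv = "3,?,alfa-romero,13495\n1,164,audi,13950\n2,?,bmw,16430\n"

# Load without a header row; pandas assigns integer column labels.
df = pd.read_csv(io.StringIO(raw_csv), header=None)

# Assign header names (hypothetical names chosen for this example).
headers = ["symboling", "normalized-losses", "make", "price"]
df.columns = headers

# Replace "?" placeholders with NaN so pandas treats them as missing.
df = df.replace("?", np.nan)

# Inspect the data frame.
print(df.head(2))                  # first two entries
print(df.tail(2))                  # last two entries
print(df.dtypes)                   # data type of each column
print(df.describe())               # numerical columns only
print(df.describe(include="all"))  # all columns
df.info()                          # concise summary, printed directly

# Save the processed data frame to a temporary location.
# index=False is a choice made here to leave the row index out of the file.
out_path = os.path.join(tempfile.mkdtemp(), "cleaned.csv")
df.to_csv(out_path, index=False)
```

With a real data set, the io.StringIO(...) argument would be replaced by the local file path or URL, as described in the note on reading CSV files.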
