Cheat Sheet Pandas
Import the Pandas Module
import pandas as pd

Create a DataFrame
# Method 1: from a dictionary of columns
df1 = pd.DataFrame({
    'name': ['John Smith', 'Jane Doe'],
    'address': ['13 Main St.', '46 Maple Ave.'],
    'age': [34, 28]
})
# Method 2: from a list of rows plus column names
df2 = pd.DataFrame([
    ['John Smith', '123 Main St.', 34],
    ['Jane Doe', '456 Maple Ave.', 28],
    ['Joe Schmo', '9 Broadway', 51]
],
    columns=['name', 'address', 'age'])

Loading and Saving CSVs
# Load a CSV file into a DataFrame
df = pd.read_csv('my-csv-file.csv')

Loading and Saving CSVs (cont)
# Get the first DataFrame chunk: df_urb_pop
df_urb_pop = next(urb_pop_reader)

Inspect a DataFrame
df.head(5)    First 5 rows
df.info()     Summary of columns (row count, null values, datatype)

Reshape (for Scikit)
import numpy as np
nums = np.array(range(1, 11))
# -> [ 1  2  3  4  5  6  7  8  9 10]
nums = nums.reshape(-1, 1)
# -> [[1],
#     [2],
#     [3],
#     [4],
#     [5],
#     [6],
#     [7],
#     [8],
#     [9],
#     [10]]
You can think of reshape() as rotating this array. Rather than one big row of numbers, nums is now a big column of numbers: there's one number in each row.

Converting Datatypes
# Convert argument to numeric type
pd.to_numeric(arg, errors="raise")
errors:
"raise" -> raise an exception
"coerce" -> invalid parsing will be set as NaN

DataFrame for Select Columns / Rows
df = pd.DataFrame([
    ['January', 100, 100, 23, 100],
    ['February', 51, 45, 145, 45],
    ['March', 81, 96, 65, 96],
    ['April', 80, 80, 54, 180],
    ['May', 51, 54, 54, 154],
    ['June', 112, 109, 79, 129]],
    columns=['month', 'east', 'north', 'south', 'west']
)

Select Columns
# Select one column
clinic_north = df.north
# --> Reshape values for Scikit
clinic_north_south = df[['n‐
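The column selection and the Scikit reshape step can be combined. The following is only a minimal sketch: it uses a cut-down copy of the month table with just the north and south columns, and the names two_cols and north_2d, as well as the two-column selection itself, are illustrative assumptions rather than part of the original sheet.

```python
import pandas as pd

# Cut-down copy of the month table (three rows, two numeric columns).
df = pd.DataFrame(
    [['January', 100, 23], ['February', 45, 145], ['March', 96, 65]],
    columns=['month', 'north', 'south'],
)

# Attribute access (df.north) or bracket access (df['north'])
# returns a single column as a Series.
clinic_north = df.north

# A list of column names inside the brackets returns a DataFrame.
# (Illustrative selection; not taken from the original sheet.)
two_cols = df[['north', 'south']]

# scikit-learn estimators expect a 2-D feature array, so the single
# column is reshaped from shape (3,) to shape (3, 1).
north_2d = clinic_north.to_numpy().reshape(-1, 1)
print(north_2d.shape)  # (3, 1)
```

A multi-column selection such as two_cols is already two-dimensional, so it usually needs no reshape before being passed on.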
Calling .reset_index() on a DataFrame moves the old indices into a new column called index.
Use .reset_index(drop=True) if you don't need the index column.
Use .reset_index(inplace=True) to prevent a new DataFrame from being created.

    else row['Price'],
    axis=1
)
We apply a lambda to rows, as opposed to columns, when the operation needs to access more than one column at a time.

Performing Column Operation
df = pd.DataFrame([
    ['JOHN SMITH', 'john.smith@gmail.com'],
    ['Jane Doe', 'jdoe@yahoo.com'],
    ['joe schmo', 'joeschmo@hotmail.com']
],
    columns=['Name', 'Email'])
# Changing a column with an operation
df['Name'] = df.Name.apply(str.lower)
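The difference between a column-wise apply and a row-wise lambda can be sketched as follows. The table and its Item, Price and Taxed columns, the 1.5 multiplier, and the Final Price column are all hypothetical, chosen only to show a row-wise function that reads two columns at once.

```python
import pandas as pd

# Hypothetical data, made up for illustration.
df = pd.DataFrame(
    [['Apple', 1.00, False], ['Soap', 2.00, True]],
    columns=['Item', 'Price', 'Taxed'],
)

# Column-wise: apply() on a Series calls the function once per value.
df['Item'] = df['Item'].apply(str.lower)

# Row-wise: axis=1 passes each row to the lambda as a Series,
# so the function can read several columns of that row.
df['Final Price'] = df.apply(
    lambda row: row['Price'] * 1.5 if row['Taxed'] else row['Price'],
    axis=1,
)
print(df['Final Price'].tolist())  # [1.0, 3.0]
```

Without axis=1, DataFrame.apply would hand whole columns to the function instead of rows.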
bakery = pd.read_csv('bakery.csv')
ice_cream = pd.read_csv('ice_cream.csv')
menu = pd.concat([bakery, ice_cream])
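Concatenation keeps each input's original row labels, which is where resetting the index becomes useful. A minimal sketch, assuming small in-memory stand-ins for the two CSV files (the item and price columns and their values are hypothetical):

```python
import pandas as pd

# In-memory stand-ins for bakery.csv and ice_cream.csv (contents hypothetical).
bakery = pd.DataFrame({'item': ['bread', 'croissant'], 'price': [3.0, 2.5]})
ice_cream = pd.DataFrame({'item': ['vanilla', 'mint'], 'price': [4.0, 4.5]})

# Stack the rows of both tables; the original row labels are kept.
menu = pd.concat([bakery, ice_cream])
print(menu.index.tolist())  # [0, 1, 0, 1]

# Renumber the rows and discard the old labels.
menu = menu.reset_index(drop=True)
print(menu.index.tolist())  # [0, 1, 2, 3]
```

Passing ignore_index=True to pd.concat gives the same renumbered result in one step.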