Cleaning Data in Python: Putting It All Together
Putting it all together
Useful methods
In [1]: import pandas as pd
In [2]: df = pd.read_csv('my_data.csv')
In [3]: df.head()
In [4]: df.info()
In [5]: df.columns
In [6]: df.describe()
In [7]: df.column.value_counts()
In [8]: df.column.plot(kind='hist')
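The inspection methods above can be tried on a small in-memory table. This is only a minimal sketch: the CSV text and the column names (`name`, `continent`, `score`) are made up for illustration and stand in for 'my_data.csv'.

```python
import io

import pandas as pd

# Hypothetical data standing in for 'my_data.csv' (not from the slides)
csv_text = """name,continent,score
Ann,Europe,3.5
Bob,Asia,4.0
Cho,Asia,
Dee,Europe,2.5
"""
df = pd.read_csv(io.StringIO(csv_text))

print(df.head())      # first rows: a quick look at the values
df.info()             # column dtypes and non-null counts
print(df.columns)     # column labels
print(df.describe())  # summary statistics for numeric columns

# Frequency count of a categorical column
continent_counts = df['continent'].value_counts()
print(continent_counts)

# Number of missing values in a numeric column
missing_scores = df['score'].isna().sum()
print(missing_scores)
```

Bracket access (`df['continent']`) is used here instead of attribute access (`df.continent`) because it also works for column names that are not valid Python identifiers.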
Data quality
In [9]: def cleaning_function(row_data):
   ...:     # data cleaning steps
   ...:     return ...
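One possible way to fill in such a function is sketched below. The `sex` column, its values, and the `recode_sex` name are assumptions chosen for illustration. Applying the function row by row with `df.apply(..., axis=1)` is likewise an assumption, suggested by the `row_data` parameter name.

```python
import numpy as np
import pandas as pd

# Hypothetical messy column (not from the slides)
df = pd.DataFrame({'sex': ['Male', 'Female', 'male ', None]})


def recode_sex(row_data):
    """Return 1 for male, 0 for female, NaN for anything else."""
    value = row_data['sex']
    if pd.isna(value):
        return np.nan
    # Normalize whitespace and case before comparing
    value = value.strip().lower()
    if value == 'male':
        return 1
    if value == 'female':
        return 0
    return np.nan


# axis=1 passes each row to the function as a Series
df['sex_recode'] = df.apply(recode_sex, axis=1)
print(df)
```

Returning NaN for unexpected values keeps bad entries visible in later checks instead of silently assigning them to a category.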
Combining data
● pd.merge(df1, df2, …)
● pd.concat([df1, df2, df3, …])
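As a rough illustration of the two calls above, the sketch below stacks two tables with `pd.concat` and then joins a third with `pd.merge`. All table contents, the `scores` table, and the choice of `on='id'` with `how='left'` are assumptions made for the example.

```python
import pandas as pd

# Hypothetical tables (not from the slides)
df1 = pd.DataFrame({'id': [1, 2], 'name': ['Ann', 'Bob']})
df2 = pd.DataFrame({'id': [3, 4], 'name': ['Cho', 'Dee']})
scores = pd.DataFrame({'id': [1, 2, 3], 'score': [3.5, 4.0, 2.5]})

# Stack rows from tables that share the same columns;
# ignore_index=True rebuilds the index as 0..n-1
stacked = pd.concat([df1, df2], ignore_index=True)

# Join on a shared key; how='left' keeps every row of `stacked`,
# leaving NaN where `scores` has no matching id
merged = pd.merge(stacked, scores, on='id', how='left')
print(merged)
```

Here `id` 4 has no entry in `scores`, so its `score` is missing after the merge, which is worth checking for whenever tables are combined.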
Let’s practice!
Initial impressions of the data
In [6]: df.to_csv('my_data.csv')
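A short round trip can confirm that the saved file reads back as expected. This is a sketch under assumptions: the table contents, the temporary output path, and the `index=False` argument are illustrative choices, not part of the slides.

```python
import os
import tempfile

import pandas as pd

# Hypothetical cleaned table (not from the slides)
df = pd.DataFrame({'name': ['Ann', 'Bob'], 'score': [3.5, 4.0]})

# Write to a temporary directory so no existing file is overwritten
out_path = os.path.join(tempfile.mkdtemp(), 'my_clean_data.csv')

# index=False leaves the row index out of the file
df.to_csv(out_path, index=False)

# Read the file back to check what was written
reloaded = pd.read_csv(out_path)
print(reloaded)
```

Without `index=False`, the row index is written as an extra unnamed column, which then reappears when the file is read again.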
Let’s practice!
Final thoughts
Congratulations!