Eda 3
EXPLORATORY DATA ANALYSIS IN PYTHON
Izzy Weber
Curriculum Manager, DataCamp
Patterns over time
import pandas as pd
divorce = pd.read_csv("divorce.csv")
divorce.head()
  marriage_date  marriage_duration
0    2000-06-26                5.0
1    2000-02-02                2.0
2    1991-10-09               10.0
3    1993-01-02               10.0
4    1998-12-11                7.0
divorce.dtypes
marriage_date object
marriage_duration float64
dtype: object
marriage_date datetime64[ns]
marriage_duration float64
dtype: object
divorce["marriage_date"] = pd.to_datetime(divorce["marriage_date"])
divorce.dtypes
marriage_date datetime64[ns]
marriage_duration float64
dtype: object
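The conversion above can be tried without the original file. The following is a minimal sketch, assuming an in-memory stand-in for divorce.csv built from the five rows displayed earlier; the names `csv_text`, `parsed_on_read`, and `converted_later` are illustrative and not part of the course material. It shows two routes to a datetime column: parsing at import time with `parse_dates`, or converting afterwards with `pd.to_datetime`.

```python
import io

import pandas as pd

# Hypothetical stand-in for divorce.csv, using the five rows shown in the head() output
csv_text = """marriage_date,marriage_duration
2000-06-26,5.0
2000-02-02,2.0
1991-10-09,10.0
1993-01-02,10.0
1998-12-11,7.0
"""

# Route 1: ask read_csv to parse the column as dates while importing
parsed_on_read = pd.read_csv(io.StringIO(csv_text), parse_dates=["marriage_date"])

# Route 2: import the column as plain strings, then convert it explicitly
converted_later = pd.read_csv(io.StringIO(csv_text))
converted_later["marriage_date"] = pd.to_datetime(converted_later["marriage_date"])

print(parsed_on_read.dtypes)
print(converted_later.dtypes)
```

Either route should leave `marriage_date` with a datetime dtype, which is what makes the `.dt` accessor available.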
divorce["marriage_month"] = divorce["marriage_date"].dt.month
divorce.head()
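As a rough illustration of what the `.dt` accessor provides, the sketch below rebuilds the five rows shown earlier as a small DataFrame (the name `divorce_sample` is hypothetical) and extracts the month and the year. The groupby step is an assumed follow-up for looking at a pattern by month, not something taken from the slides.

```python
import pandas as pd

# Hypothetical sample built from the five marriage dates displayed earlier
divorce_sample = pd.DataFrame({
    "marriage_date": pd.to_datetime(
        ["2000-06-26", "2000-02-02", "1991-10-09", "1993-01-02", "1998-12-11"]
    ),
    "marriage_duration": [5.0, 2.0, 10.0, 10.0, 7.0],
})

# .dt exposes datetime components as integer columns
divorce_sample["marriage_month"] = divorce_sample["marriage_date"].dt.month
divorce_sample["marriage_year"] = divorce_sample["marriage_date"].dt.year

# Assumed follow-up: average marriage duration for each marriage month
monthly_mean = divorce_sample.groupby("marriage_month")["marriage_duration"].mean()

print(divorce_sample)
print(monthly_mean)
```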
Correlation
Describes the direction and strength of the relationship between two variables
Can help us use variables to predict future outcomes
divorce.corr(numeric_only=True)
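To make "direction and strength" concrete, here is a small sketch on made-up data (the `toy` DataFrame and its column names are purely illustrative). It assumes pandas 1.5 or later, where `DataFrame.corr` accepts `numeric_only`; passing `numeric_only=True` keeps non-numeric columns out of the calculation.

```python
import pandas as pd

# Made-up data: one column rises with x, one falls, one is non-numeric
toy = pd.DataFrame({
    "x": [1, 2, 3, 4, 5],
    "y_up": [2, 4, 6, 8, 10],            # increases in lockstep with x
    "y_down": [10, 8, 6, 4, 2],          # decreases in lockstep with x
    "label": ["a", "b", "c", "d", "e"],  # non-numeric, excluded below
})

# Pearson correlation (the default method) between every pair of numeric columns
corr_matrix = toy.corr(numeric_only=True)
print(corr_matrix)
```

A coefficient near 1 indicates a strong positive linear relationship, near -1 a strong negative one, and near 0 little linear relationship. The default method measures linear association only, so a low value does not rule out a non-linear pattern.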
divorce["divorce_date"].min()
Timestamp('2000-01-08 00:00:00')
divorce["divorce_date"].max()
Timestamp('2015-11-03 00:00:00')
Level of education: male partner
divorce["education_man"].value_counts()
Professional 1313
Preparatory 501
Secondary 288
Primary 100
None 4
Other 3
Name: education_man, dtype: int64
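For readers who want to experiment with `value_counts()` without the dataset, the sketch below uses a short made-up Series (the values and the names `education`, `counts`, and `shares` are illustrative only). It also shows `normalize=True`, which returns proportions instead of raw counts.

```python
import pandas as pd

# Made-up categorical column, reusing the name from the output above
education = pd.Series(
    ["Professional", "Secondary", "Professional", "Primary", "Professional", "Secondary"],
    name="education_man",
)

# Frequency of each category, sorted from most to least common
counts = education.value_counts()

# Same tally expressed as a share of all rows
shares = education.value_counts(normalize=True)

print(counts)
print(shares)
```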