P1) Code Uber
In [6]: df.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 200000 entries, 0 to 199999
Data columns (total 9 columns):
 #   Column             Non-Null Count   Dtype
---  ------             --------------   -----
 0   Unnamed: 0         200000 non-null  int64
 1   key                200000 non-null  object
 2   fare_amount        200000 non-null  float64
 3   pickup_datetime    200000 non-null  datetime64[ns, UTC]
 4   pickup_longitude   200000 non-null  float64
 5   pickup_latitude    200000 non-null  float64
 6   dropoff_longitude  199999 non-null  float64
 7   dropoff_latitude   199999 non-null  float64
 8   passenger_count    200000 non-null  int64
dtypes: datetime64[ns, UTC](1), float64(5), int64(2), object(1)
memory usage: 13.7+ MB
Out[7]:
Unnamed: 0           0
key                  0
fare_amount          0
pickup_datetime      0
pickup_longitude     0
pickup_latitude      0
dropoff_longitude    1
dropoff_latitude     1
passenger_count      0
dtype: int64
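A per-column count like the one above is what pandas prints for isnull().sum(). The cell that produced it is not shown, so the following is only a minimal sketch on a made-up three-row frame; the name toy and its values are hypothetical and are not taken from the Uber data.

```python
import numpy as np
import pandas as pd

# Hypothetical toy frame standing in for the Uber fares data.
toy = pd.DataFrame({
    "fare_amount": [7.5, 7.7, 12.9],
    "dropoff_longitude": [-73.99, np.nan, -73.96],
    "dropoff_latitude": [40.72, np.nan, 40.77],
})

# isnull() marks every missing cell as True; sum() then counts the
# True values column by column.
missing_per_column = toy.isnull().sum()
print(missing_per_column)
```

On the real data the same call would be df.isnull().sum(), assuming df is the frame inspected with df.info().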
In [22]: # Correlation
df.corr()
Out[22]: Unnamed: 0  fare_amount  pickup_longitude  pickup_latitude  dropoff_longitude  drop
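The frame contains a string column (key), and in recent pandas versions corr() can raise an error on non-numeric columns. One way around this, sketched here on made-up data, is to select the numeric columns first. The frame toy and its values are hypothetical.

```python
import pandas as pd

# Hypothetical toy frame with one string column and two numeric columns.
toy = pd.DataFrame({
    "key": ["a", "b", "c", "d"],
    "fare_amount": [5.0, 10.0, 15.0, 20.0],
    "passenger_count": [1, 2, 3, 4],
})

# Keep only the numeric columns, then compute pairwise Pearson correlations.
corr_matrix = toy.select_dtypes(include="number").corr()
print(corr_matrix)
```

Each entry of the result lies between -1 and 1; the diagonal is 1 because every column is perfectly correlated with itself.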
In [27]: plt.boxplot(df['fare_amount'])
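With its default settings, a matplotlib box plot draws points lying more than 1.5 times the interquartile range beyond the quartiles as individual outliers. The sketch below computes those fences by hand on a made-up series, which can help when deciding which fares to treat as outliers. The values and variable names are hypothetical.

```python
import pandas as pd

# Made-up fares; the last value is deliberately extreme.
fares = pd.Series([4.5, 6.0, 7.5, 8.0, 9.5, 11.0, 12.5, 250.0])

q1 = fares.quantile(0.25)   # first quartile (bottom of the box)
q3 = fares.quantile(0.75)   # third quartile (top of the box)
iqr = q3 - q1               # interquartile range (height of the box)

# Points outside these fences are the ones drawn as outliers.
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

outliers = fares[(fares < lower_fence) | (fares > upper_fence)]
print(outliers.tolist())
```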
Out[11]:
Unnamed: 0           0
key                  0
fare_amount          0
pickup_datetime      0
pickup_longitude     0
pickup_latitude      0
dropoff_longitude    0
dropoff_latitude     0
passenger_count      0
dtype: int64
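Every column now reports zero missing values, so the single row with missing drop-off coordinates was handled before this cell. The cell that did this is not shown. Dropping the row with dropna() is one common choice (filling the values is another), sketched here on a made-up frame with hypothetical names and values.

```python
import numpy as np
import pandas as pd

# Hypothetical toy frame with one row missing its drop-off coordinates.
toy = pd.DataFrame({
    "fare_amount": [7.5, 7.7, 12.9],
    "dropoff_longitude": [-73.99, np.nan, -73.96],
    "dropoff_latitude": [40.72, np.nan, 40.77],
})

# Drop every row that contains at least one missing value, then
# renumber the index so it is contiguous again.
cleaned = toy.dropna().reset_index(drop=True)

# Re-run the missing-value count to confirm nothing is left.
remaining_missing = cleaned.isnull().sum()
print(remaining_missing)
```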
Out[17]: LinearRegression()
In [18]: #Prediction
predict = lrmodel.predict(x_test)
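The cells that build x_test and fit lrmodel are not shown above. The following is a self-contained sketch of the usual split, fit, predict, and evaluate sequence on synthetic data. The feature distance_km, the synthetic fare formula, and the split settings are all assumptions made for illustration, not the notebook's actual setup.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import train_test_split

# Synthetic data: a hypothetical trip distance and a fare that depends on it linearly.
rng = np.random.default_rng(0)
distance_km = rng.uniform(1, 20, size=200).reshape(-1, 1)
fare = 2.5 + 1.8 * distance_km.ravel() + rng.normal(0, 0.5, size=200)

# Hold out 20% of the rows for testing.
x_train, x_test, y_train, y_test = train_test_split(
    distance_km, fare, test_size=0.2, random_state=42
)

# Fit on the training split only, then predict on the held-out split.
lrmodel = LinearRegression()
lrmodel.fit(x_train, y_train)
predict = lrmodel.predict(x_test)

# Compare predictions with the held-out targets.
rmse = np.sqrt(mean_squared_error(y_test, predict))
r2 = r2_score(y_test, predict)
print("RMSE:", rmse)
print("R^2:", r2)
```

RMSE is in the same units as the target (fare), which makes it easier to interpret than the raw mean squared error.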