
Petrol Assignment

Uploaded by

Ayush Agarwal
Use the petrol_consumption dataset. Your task is to predict the gas consumption (in millions of gallons) in 48 of the US states based on petrol tax (in cents), per capita income (in dollars), paved highways (in miles), and the proportion of the population with a driving license. Build the regression model using Random Forest Regressor. Analyze the prediction ability of your model.

Importing dependencies

In [1]:
    import numpy as np
    import pandas as pd

Data collection and analysis

Petrol consumption dataset

In [2]:
    # loading the petrol consumption dataset into a pandas DataFrame
    dataset = pd.read_csv(r"C:\Users\This PC\Desktop\ml practice\petrol_consumption.…

In [3]:
    # printing the first 5 rows of the dataset
    dataset.head()

Out[3]:
       Petrol_tax   Average_income  Paved_Highways  Population_Driver_licence(%)  Petrol_Consumption
    0  [illegible]  3671            1976            0.525                         [illegible]
    1  9.0          4092            1250            [illegible]                   524
    2  9.0          3865            1586            0.580                         561
    3  7.5          4870            2351            0.529                         [illegible]
    4  8.0          4399            [illegible]     [illegible]                   410

In [4]:
    # number of rows and columns in this dataset
    dataset.shape

Out[4]:
    (48, 5)

In [5]:
    # getting the statistical measures of the data
    dataset.describe()

Out[5]:
           Petrol_tax  Average_income  Paved_Highways  Population_Driver_licence(%)  Petrol_Consump…
    count  48.000000   48.000000       48.000000       48.000000                     48.00…
    mean   7.668333    4247.893333     18565.416867    0.570333                      576.77…
    std    0.950770    573.628768      3491.507168     0.085470                      111.88…
    min    5.000000    3063.000000     431.000000      0.481000                      348.00…
    25%    7.000000    3739.000000     3110.250000     0.529750                      509.50…
    50%    7.500000    428.0000        44735.500000    0.584500                      568.50…
    75%    8.125000    4578.750000     17486.000000    0.595250                      632.75…
    max    10.000000   5342.000000     17782.000000    0.724000                      968.00…

In [7]:
    # preparing the data: all columns except the last are features, the last is the target
    x = dataset.iloc[:, :-1].values
    y = dataset.iloc[:, -1].values

In [8]:
    # splitting the data into training and test sets
    from sklearn.model_selection import train_test_split
    x_train, x_test, y_train, y_test = train_test_split(x, y, test_siz…

In [9]:
    from sklearn.ensemble import RandomForestRegressor
    regr = RandomForestRegressor(max_depth=2, random_stat…
    regr.fit(x_train, y_train)

Out[9]:
    RandomForestRegressor(max_dept…

In [10]:
    # making predictions on the test set
    y_pred = regr.predict(x_test)

In [11]:
    df = pd.DataFrame({'Actual': y_test, 'Predicted': y_pred})
    df

Out[11]:
        Actual       Predicted
    0   628          571.496777
    1   547          538.999?80
    2   648          647.580572
    3   640          590.781977
    4   561          558.606228
    5   414          494.660716
    6   554          598.109298
    7   [illegible]  583.124799
    8   [illegible]  [illegible]
    9   631          576.802467
    10  574          547.557079
    11  524          564.262270
    12  540          497.195530
    13  487          551.243347
    14  640          705.237167

In [12]:
    # evaluating the algorithm
    from sklearn import metrics
    print('Mean Absolute Error:', metrics.mean_absolute_error(y_test, y_pred))
    print('Mean Squared Error:', metrics.mean_squared_error(y_test, y_pred))
    print('Root Mean Squared Error:', np.sqrt(metrics.mean_squared_error(y_test, y_pr…

    Mean Absolute Error: 46.24860698172747
    Mean Squared Error: 3472.023958285361
    Root Mean Squared Error: 58.92388274957244

In [16]:
    import seaborn as sns
    import matplotlib.pyplot as plt
    plt.figure(figsize=(7, 9))
    … ax=ax)
    plt.grid()
    plt.title('Actual vs Fitted Values for Price')
    plt.xlabel("Actual values for price")
    plt.ylabel("Fitted values for price")
    plt.legend()
    plt.show()

    C:\Anaconda\lib\site-packages\seaborn\distributions.py:2619: FutureWarning: `distplot` is a deprecated function and will be removed in a future version. Please adapt your code to use either `displot` (a figure-level function with similar flexibility) or `kdeplot` (an axes-level function for kernel density plots).
      warnings.warn(msg, FutureWarning)
    C:\Anaconda\lib\site-packages\seaborn\distributions.py:2619: FutureWarning: `distplot` is a deprecated function and will be removed in a future version. Please adapt your code to use either `displot` (a figure-level function with similar flexibility) or `kdeplot` (an axes-level function for kernel density plots).
      warnings.warn(msg, FutureWarning)

    [Figure: Actual vs Fitted Values for Price; x-axis: Actual values for price]

In [ ]:
    petrol_consumption_dataset['Petrol_tax'].value_counts()

In [ ]:
    petrol_consumption_dataset.groupby('Petrol_tax').mean()

On the basis of increasing petrol tax, we predict that petrol consumption is getti…
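The task asks for an analysis of the model's prediction ability, and the notebook reports only MAE (about 46.25) and RMSE (about 58.92). Against the mean consumption of roughly 576.77 shown by `describe()`, that RMSE is on the order of 10% of the mean. The sketch below shows one way such an analysis could be extended, adding R², a relative RMSE, and a comparison with a predict-the-mean baseline. It is only an illustration under stated assumptions: the data are synthetic stand-ins (the CSV is not available here), and `test_size=0.3`, `n_estimators=100`, and all helper variable names are hypothetical choices, not the notebook's settings.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score
from sklearn.model_selection import train_test_split

# ASSUMPTION: synthetic stand-in for the petrol data (48 rows, 4 features).
# The coefficients below are made up purely so the example runs end to end.
rng = np.random.default_rng(0)
X = rng.uniform(size=(48, 4))
y = 600 - 150 * X[:, 0] + 300 * X[:, 3] + rng.normal(scale=20, size=48)

# ASSUMPTION: a 70/30 split; the notebook's own test_size is cut off in the source.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0
)

# ASSUMPTION: default tree depth and 100 trees, not the notebook's settings.
regr = RandomForestRegressor(n_estimators=100, random_state=0)
regr.fit(X_train, y_train)
y_pred = regr.predict(X_test)

# Absolute error metrics, as in the notebook.
mae = mean_absolute_error(y_test, y_pred)
rmse = np.sqrt(mean_squared_error(y_test, y_pred))

# R^2: share of test-set variance explained (1.0 would be a perfect fit).
r2 = r2_score(y_test, y_pred)

# RMSE relative to the mean of the target, to judge the error's scale.
relative_rmse = rmse / y_test.mean()

# Baseline: always predict the training-set mean.
baseline_pred = np.full_like(y_test, y_train.mean())
baseline_rmse = np.sqrt(mean_squared_error(y_test, baseline_pred))

print(f"MAE:           {mae:.2f}")
print(f"RMSE:          {rmse:.2f}")
print(f"R^2:           {r2:.3f}")
print(f"Relative RMSE: {relative_rmse:.1%}")
print(f"Baseline RMSE: {baseline_rmse:.2f}")
```

If the model's RMSE is not clearly below the baseline RMSE, the forest is adding little over simply predicting the average, which is worth checking on a dataset as small as 48 rows.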
