Linear Regression
import warnings
warnings.filterwarnings('ignore')
import pandas as pd
import numpy as np
np.set_printoptions(precision=6, linewidth=100)
   S. No.  Percentage in Grade 10  Salary
0       1                   62.00  270000
1       2                   76.33  200000
2       3                   72.00  240000
3       4                   60.00  250000
4       5                   61.00  180000
5       6                   55.00  300000
6       7                   70.00  260000
7       8                   68.00  235000
8       9                   82.80  425000
9      10                   59.00  240000
In [3]: mba_salary_df.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 50 entries, 0 to 49
Data columns (total 3 columns):
# Column Non-Null Count Dtype
--- ------ -------------- -----
0 S. No. 50 non-null int64
1 Percentage in Grade 10 50 non-null float64
2 Salary 50 non-null int64
dtypes: float64(1), int64(2)
memory usage: 1.3 KB
   const  Percentage in Grade 10
0    1.0                   62.00
1    1.0                   76.33
2    1.0                   72.00
3    1.0                   60.00
4    1.0                   61.00
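The column of 1.0 values in the output above is the intercept term of the model matrix. As a minimal sketch, using the first five grade percentages from the data preview earlier in this notebook, the same layout can be built with plain NumPy. The names `grades` and `X_sketch` are illustrative and do not come from the notebook.

```python
import numpy as np

# First five 'Percentage in Grade 10' values from the data preview.
grades = np.array([62.00, 76.33, 72.00, 60.00, 61.00])

# Prepend a column of ones so that a fitted model can estimate an intercept.
X_sketch = np.column_stack([np.ones_like(grades), grades])

print(X_sketch)
```

statsmodels offers `add_constant` for the same purpose when working with DataFrames.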
In [5]: Y = mba_salary_df['Salary']
Splitting the dataset into training and validation sets
In [6]: from sklearn.model_selection import train_test_split
train_X, test_X, train_y, test_y = train_test_split( X , Y, train_size =
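A hold-out split of this kind can be pictured as shuffling the row indices and cutting them at a chosen fraction. The sketch below is not the scikit-learn implementation. The helper `holdout_split`, the 0.8 fraction and the seed are assumptions made for illustration, not the values used in this notebook.

```python
import random

def holdout_split(n_rows, train_fraction=0.8, seed=0):
    """Return (train_idx, test_idx) as two disjoint lists of row indices."""
    indices = list(range(n_rows))
    random.Random(seed).shuffle(indices)      # seeded, so the split is reproducible
    cut = int(round(n_rows * train_fraction)) # number of rows kept for training
    return indices[:cut], indices[cut:]

# 50 rows, matching the RangeIndex reported by info() above.
train_idx, test_idx = holdout_split(50)
print(len(train_idx), len(test_idx))
```

Fixing the seed matters: without it, every run fits the model on a different subset and the reported coefficients change.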
const 30587.285652
Percentage in Grade 10 3560.587383
dtype: float64
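The two fitted parameters define a straight line, so a prediction is one multiplication and one addition. As a small sketch using the values printed above (the function name `predict_salary` is illustrative):

```python
# Fitted parameters as printed above.
intercept = 30587.285652
slope = 3560.587383   # change in predicted salary per percentage point in grade 10

def predict_salary(percentage):
    """Evaluate the fitted line: salary = intercept + slope * percentage."""
    return intercept + slope * percentage

print(round(predict_salary(70.0), 2))
```

For a grade of 70.0 this gives roughly 279,828, which agrees with the corresponding row of the prediction table later in the notebook up to rounding of the printed parameters.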
Model Diagnostics
In [9]: mba_salary_lm.summary2()
Percentage in Grade 10  3560.5874  1116.9258  3.1878  0.0029  1299.4892  5821.6855
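The numbers in the coefficient row are linked by simple arithmetic. The sketch below assumes the usual statsmodels column order (coefficient, standard error, t statistic, p-value, lower and upper confidence bounds), since the header did not survive in this export.

```python
# Values copied from the coefficient row above.
coef = 3560.5874
std_err = 1116.9258
ci_low, ci_high = 1299.4892, 5821.6855

# t statistic: the estimate measured in units of its standard error.
t_stat = coef / std_err

# The interval is coef +/- t_crit * std_err, so its half-width reveals t_crit.
half_width = (ci_high - ci_low) / 2
implied_t_crit = half_width / std_err

print(round(t_stat, 4), round(implied_t_crit, 4))
```

Because the interval excludes zero and the p-value is small, the row suggests that the grade percentage carries real information about salary in this sample.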
73458.04348346895
In [14]: pred_y_df[0:10]
6 70.0 279828.402452
36 68.0 272707.227686
37 52.0 215737.829560
28 58.0 237101.353858
43 74.5 295851.045675
49 60.8 247070.998530
5 55.0 226419.591709
33 78.0 308313.101515
20 63.0 254904.290772
42 74.4 295494.986937
   Sl.NO.  PLAYER NAME   AGE  COUNTRY  TEAM  PLAYING ROLE  T-RUNS  T-WKTS  ODI-RUNS-S  ODI-SR-B  ...
0       1  Abdulla, YA     2  SA       KXIP  Allrounder         0       0           0      0.00  ...
1       2  Abdur Razzak    2  BAN      RCB   Bowler           214      18         657     71.41  ...
2       3  Agarkar, AB     2  IND      KKR   Bowler           571      58        1269     80.62  ...  12
4       5  Badrinath, S    2  IND      CSK   Batsman           63       0          79     45.93  ...  12
5 rows × 26 columns
In [16]: ipl_auction_df.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 130 entries, 0 to 129
Data columns (total 26 columns):
# Column Non-Null Count Dtype
--- ------ -------------- -----
0 Sl.NO. 130 non-null int64
1 PLAYER NAME 130 non-null object
2 AGE 130 non-null int64
3 COUNTRY 130 non-null object
4 TEAM 130 non-null object
5 PLAYING ROLE 130 non-null object
6 T-RUNS 130 non-null int64
7 T-WKTS 130 non-null int64
8 ODI-RUNS-S 130 non-null int64
9 ODI-SR-B 130 non-null float64
10 ODI-WKTS 130 non-null int64
11 ODI-SR-BL 130 non-null float64
12 CAPTAINCY EXP 130 non-null int64
13 RUNS-S 130 non-null int64
14 HS 130 non-null int64
15 AVE 130 non-null float64
16 SR-B 130 non-null float64
17 SIXERS 130 non-null int64
18 RUNS-C 130 non-null int64
19 WKTS 130 non-null int64
20 AVE-BL 130 non-null float64
21 ECON 130 non-null float64
22 SR-BL 130 non-null float64
23 AUCTION YEAR 130 non-null int64
24 BASE PRICE 130 non-null int64
25 SOLD PRICE 130 non-null int64
dtypes: float64(7), int64(15), object(4)
memory usage: 26.5+ KB
0 1 0 0 0
1 0 0 1 0
2 0 0 1 0
3 0 0 1 0
4 0 1 0 0
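The 0/1 rows above look like indicator (dummy) variables for a categorical column. As a minimal sketch, assuming they encode PLAYING ROLE with the categories in alphabetical order, the encoding can be reproduced in plain Python for the four players whose roles are visible in the preview. The names `roles`, `categories` and `one_hot` are illustrative.

```python
# Playing roles of the four players whose rows are visible in the preview.
roles = ["Allrounder", "Bowler", "Bowler", "Batsman"]

# Assumed category order (alphabetical); one indicator column per category.
categories = ["Allrounder", "Batsman", "Bowler", "W. Keeper"]

# Each row has a single 1 in the column that matches the player's role.
one_hot = [[int(role == cat) for cat in categories] for role in roles]

for role, row in zip(roles, one_hot):
    print(role, row)
```

pandas performs this step for DataFrame columns with `get_dummies`. Dropping one category per feature avoids a column set that is perfectly collinear with the intercept.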
In [23]: ipl_auction_encoded_df.columns
PLAYING ROLE_Batsman     75724.7643  150250.0240   0.5040  0.6158  -223793.1844  375242.7130
PLAYING ROLE_W. Keeper  -71358.6280  213585.7444  -0.3341  0.7393  -497134.0278  354416.7718
CAPTAINCY EXP_1         164113.3972  123430.6353   1.3296  0.1878   -81941.0772  410167.8716
Multi-Collinearity
VIF
0 T-RUNS 12.612694
1 T-WKTS 7.679284
2 ODI-RUNS-S 16.426209
3 ODI-SR-B 13.829376
4 ODI-WKTS 9.951800
5 ODI-SR-BL 4.426818
6 RUNS-S 16.135407
7 HS 22.781017
8 AVE 25.226566
9 SR-B 21.576204
10 SIXERS 9.547268
11 RUNS-C 38.229691
12 WKTS 33.366067
13 AVE-BL 100.198105
14 ECON 7.650140
15 SR-BL 103.723846
16 AGE_2 6.996226
17 AGE_3 3.855003
18 COUNTRY_BAN 1.469017
19 COUNTRY_ENG 1.391524
20 COUNTRY_IND 4.568898
21 COUNTRY_NZ 1.497856
22 COUNTRY_PAK 1.796355
23 COUNTRY_SA 1.886555
24 COUNTRY_SL 1.984902
25 COUNTRY_WI 1.531847
26 COUNTRY_ZIM 1.312168
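A variance inflation factor measures how well one predictor is explained by the others: VIF_j = 1 / (1 - R_j^2), where R_j^2 comes from regressing column j on the remaining columns. The sketch below implements that textbook definition with an intercept in each auxiliary regression. It is an assumption that this matches the convention behind the table above, since library routines can differ (for example in how the intercept is handled). The function `vif_sketch` and the toy data are made up for illustration.

```python
import numpy as np

def vif_sketch(X):
    """Return one VIF per column of X, using the 1 / (1 - R^2) definition."""
    X = np.asarray(X, dtype=float)
    n_rows, n_cols = X.shape
    vifs = []
    for j in range(n_cols):
        target = X[:, j]
        others = np.delete(X, j, axis=1)
        # Auxiliary regression of column j on the other columns plus intercept.
        design = np.column_stack([np.ones(n_rows), others])
        beta, *_ = np.linalg.lstsq(design, target, rcond=None)
        residuals = target - design @ beta
        ss_res = residuals @ residuals
        ss_tot = ((target - target.mean()) ** 2).sum()
        r_squared = 1.0 - ss_res / ss_tot
        vifs.append(1.0 / (1.0 - r_squared))
    return np.array(vifs)

# Synthetic toy data: x2 tracks x1 closely, x3 is unrelated to both.
x1 = [1, 2, 3, 4, 5, 6, 7, 8]
x2 = [2, 1, 4, 3, 6, 5, 8, 7]
x3 = [1, -1, -1, 1, 1, -1, -1, 1]
toy_vifs = vif_sketch(np.column_stack([x1, x2, x3]))
print(toy_vifs)
```

In the toy data the two correlated columns receive clearly inflated values while the unrelated column stays at 1, which is the pattern to look for when deciding which features to drop.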
0 SIXERS 2.397409
1 COUNTRY_BAN 1.094293
3 COUNTRY_NZ 1.173418
4 COUNTRY_ENG 1.131869
5 COUNTRY_SA 1.416657
6 COUNTRY_ZIM 1.205305
7 WKTS 2.883101
10 COUNTRY_WI 1.194093
11 COUNTRY_SL 1.519752
12 ODI-SR-BL 2.822148
14 AGE_3 1.779861
15 COUNTRY_PAK 1.334773
16 COUNTRY_IND 3.144668
17 ODI-WKTS 2.742889
CAPTAINCY EXP_1         208376.6957   98128.0284   2.1235  0.0366    13304.6315  403448.7600
PLAYING ROLE_W. Keeper  -55121.9240  169922.5271  -0.3244  0.7464  -392916.7280  282672.8801
PLAYING ROLE_Bowler     -18315.4968  106035.9664  -0.1727  0.8633  -229108.0215  192477.0279
PLAYING ROLE_Batsman    121382.0570  106685.0356   1.1378  0.2584   -90700.7746  333464.8886
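A common next step is to keep only the features whose coefficients are statistically significant. The sketch below applies that filter to the four coefficient rows above, assuming the fourth number in each row is the p-value and using a 0.05 cut-off. Both the cut-off and the names `p_values` and `significant` are assumptions for illustration.

```python
# p-values copied from the coefficient rows above.
p_values = {
    "CAPTAINCY EXP_1": 0.0366,
    "PLAYING ROLE_W. Keeper": 0.7464,
    "PLAYING ROLE_Bowler": 0.8633,
    "PLAYING ROLE_Batsman": 0.2584,
}

ALPHA = 0.05  # assumed significance level; not stated in the notebook

# Keep the features whose p-value falls below the chosen level.
significant = [name for name, p in p_values.items() if p < ALPHA]
print(significant)
```

Among these four rows only the captaincy indicator passes the filter, consistent with its confidence interval being the only one that excludes zero.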