Name: Deepak Yadav
Assignment 1: Linear and Logistic Regression

SET A

Create a 'sales' data set having 5 columns namely: ID, TV, Radio, Newspaper and Sales (500 random entries). Build a linear regression model by identifying the independent and target variables. Split the variables into training and testing sets in a 7:3 ratio, respectively, and print them. Build a simple linear regression model.

import pandas as pd
import numpy as np

d = pd.read_csv("Company_data.csv")
print(d)

        TV  Radio  Newspaper  Sales
0    230.1   37.8       69.2   22.1
1     44.5   39.3       45.1   10.4
2     17.2   45.9       69.3   12.6
3    151.5   41.3       58.5   16.5
4    180.8   10.8       58.4   17.9
..     ...    ...        ...    ...
198  283.6   42.6       66.2   25.5
199  232.1    8.6        8.7   18.4

[200 rows x 4 columns]

d.shape
d.info()
d.describe()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 200 entries, 0 to 199
Data columns (total 4 columns):
 #   Column     Non-Null Count  Dtype
 0   TV         200 non-null    float64
 1   Radio      200 non-null    float64
 2   Newspaper  200 non-null    float64
 3   Sales      200 non-null    float64
dtypes: float64(4)
memory usage: 6.4 KB

                TV       Radio   Newspaper       Sales
count   200.000000  200.000000  200.000000  200.000000
mean    147.042500   23.264000   30.554000   15.130500
std      85.854236   14.846809   21.778621    5.283892
min       0.700000    0.000000    0.300000    1.600000
25%      74.375000    9.975000   12.750000   11.000000
50%     149.750000   22.900000   25.750000   16.000000
75%     218.825000   36.525000   45.100000   19.050000
max     296.400000   49.600000  114.000000   27.000000

import matplotlib.pyplot as plt
import seaborn as sns

# Using pairplot we'll visualize the data for correlation
sns.pairplot(d, x_vars=['TV', 'Radio', 'Newspaper'], y_vars='Sales',
             height=4, aspect=1, kind='scatter')   # 'size' was renamed to 'height' in recent seaborn
plt.show()

[Pairplot: scatter of Sales against TV, Radio and Newspaper]

sns.heatmap(d.corr(), cmap="YlGnBu", annot=True)
plt.show()

[Annotated heatmap of pairwise correlations between TV, Radio, Newspaper and Sales]

Simple Linear Regression

# Creating X and y
X = d['TV']
y = d['Sales']

# Splitting the variables into training and testing sets
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, train_size=0.7, test_size=0.3)

# Take a look at the train data set
X_train
y_train

[140 training rows of TV and Sales shown]
Name: Sales, Length: 140, dtype: float64

# Importing the statsmodels.api library from the statsmodels package
import statsmodels.api as sm

# Adding a constant to get an intercept
X_train_sm = sm.add_constant(X_train)

# Fitting the regression line using OLS
lr = sm.OLS(y_train, X_train_sm).fit()

# Printing the parameters
lr.params

const    6.948683
TV       0.054546
dtype: float64

lr.summary()

OLS Regression Results
Dep. Variable:     Sales            R-squared:           0.816
Model:             OLS              Adj. R-squared:      0.814
Method:            Least Squares    F-statistic:         611.2
Date:              Sat, 07 May 2022 Prob (F-statistic):  1.52e-52
Time:              16:36:20         Log-Likelihood:      -321.12
No. Observations:  140              AIC:                 646.2
Df Residuals:      138              BIC:                 652.1
Df Model:          1
Covariance Type:   nonrobust

          coef   std err       t    P>|t|   [0.025   0.975]
const   6.9487    0.385   18.068   0.000    6.188    7.708
TV      0.0545    0.002   24.722   0.000    0.050    0.059

Omnibus:        0.027   Durbin-Watson:     2.196
Prob(Omnibus):  0.987   Jarque-Bera (JB):  0.150
Skew:          -0.006   Prob(JB):          0.928
Kurtosis:       2.840   Cond. No.          328

Notes:
[1] Standard Errors assume that the covariance matrix of the errors is correctly specified.

So the fitted line is Sales ≈ 6.9487 + 0.0545 · TV.
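The statement asks for a randomly generated 500-row 'sales' table, whereas the notebook loads Company_data.csv instead. A minimal sketch of generating such a set (the column names come from the statement; the file name, seed, value ranges and noise model are assumptions):

import numpy as np
import pandas as pd

rng = np.random.default_rng(42)
n = 500
sales = pd.DataFrame({
    "ID": np.arange(1, n + 1),
    "TV": rng.uniform(0, 300, n).round(1),        # assumed ad-spend ranges
    "Radio": rng.uniform(0, 50, n).round(1),
    "Newspaper": rng.uniform(0, 120, n).round(1),
})
# Assumed linear response to TV plus noise, so a linear fit is recoverable
sales["Sales"] = (7 + 0.05 * sales["TV"] + rng.normal(0, 2, n)).round(1)
sales.to_csv("sales.csv", index=False)

The notebook also splits off X_test and y_test but never scores the fitted line. One way to evaluate the statsmodels model above on the held-out set:

X_test_sm = sm.add_constant(X_test)
y_pred = lr.predict(X_test_sm)

from sklearn.metrics import mean_squared_error, r2_score
print("R^2 :", r2_score(y_test, y_pred))
print("RMSE:", mean_squared_error(y_test, y_pred) ** 0.5)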
Create a 'realestate' data set having 4 columns namely: ID, flat, houses and purchases (500 random entries). Build a linear regression model by identifying the independent and target variables. Split the variables into training and testing sets and print them. Build a simple linear regression model for predicting purchases.

import pandas as pd
import numpy as np
import seaborn as sns
from pylab import rcParams
import matplotlib.pyplot as plt
import matplotlib.animation as animation
from matplotlib import rc
import unittest
%matplotlib inline

sns.set(style='whitegrid', palette='muted', font_scale=1.5)
rcParams['figure.figsize'] = 14, 8

RANDOM_SEED = 42
np.random.seed(RANDOM_SEED)

def run_tests():
    unittest.main(argv=[''], verbosity=1, exit=False)

train = pd.read_csv('train.csv')
print(train)

      Id  MSSubClass MSZoning  LotFrontage  LotArea Street Alley LotShape  ...  SaleType  SaleCondition  SalePrice
0      1          60       RL         65.0     8450   Pave   NaN      Reg  ...        WD         Normal     208500
1      2          20       RL         80.0     9600   Pave   NaN      Reg  ...        WD         Normal     181500
...
1459  1460         20      RL         75.0     9937   Pave   NaN      Reg  ...        WD         Normal     147500

[1460 rows x 81 columns]

train['SalePrice'].describe()

count      1460.000000
mean     180921.195890
std       79442.502883
min       34900.000000
25%      129975.000000
50%      163000.000000
75%      214000.000000
max      755000.000000
Name: SalePrice, dtype: float64

sns.histplot(train['SalePrice'], kde=True)   # distplot is deprecated in recent seaborn

[Histogram with density curve of SalePrice]

var = 'GrLivArea'
data = pd.concat([train['SalePrice'], train[var]], axis=1)
data.plot.scatter(x=var, y='SalePrice', ylim=(0, 800000))

[Scatter plot: SalePrice vs GrLivArea]
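As in the first question, the statement asks for a generated 'realestate' table, while the solution instead explores the Kaggle House Prices train.csv. A sketch using the statement's column names; the seed, value ranges and the purchase rule are assumptions:

rng = np.random.default_rng(0)
n = 500
realestate = pd.DataFrame({
    "ID": np.arange(1, n + 1),
    "flat": rng.integers(1, 50, n),      # assumed counts of flats listed
    "houses": rng.integers(1, 50, n),    # assumed counts of houses listed
})
# Assume purchases grow roughly linearly with both counts, plus noise
realestate["purchases"] = (
    0.4 * realestate["flat"] + 0.6 * realestate["houses"] + rng.normal(0, 3, n)
).round().clip(lower=0)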
var = 'TotalBsmtSF'
data = pd.concat([train['SalePrice'], train[var]], axis=1)
data.plot.scatter(x=var, y='SalePrice', ylim=(0, 800000))

[Scatter plot: SalePrice vs TotalBsmtSF]

var = 'OverallQual'
data = pd.concat([train['SalePrice'], train[var]], axis=1)
f, ax = plt.subplots(figsize=(14, 8))
fig = sns.boxplot(x=var, y="SalePrice", data=data)
fig.axis(ymin=0, ymax=800000)

[Box plots of SalePrice by OverallQual, 1 to 10]

corrmat = train.corr()
f, ax = plt.subplots(figsize=(12, 9))
sns.heatmap(corrmat, vmax=.8, square=True)

[Heatmap of the full numeric correlation matrix, Id through SalePrice]

cols = ['SalePrice', 'OverallQual', 'GrLivArea', 'GarageCars']
sns.pairplot(train[cols], height=4)   # 'size' was renamed to 'height' in recent seaborn

[Pairplot of SalePrice, OverallQual, GrLivArea and GarageCars]

x = train['GrLivArea']
y = train['SalePrice']

x = (x - x.mean()) / x.std()
x = np.c_[np.ones(x.shape[0]), x]

x.shape

(1460, 2)

def loss(h, y):
    sq_error = (h - y) ** 2
    n = len(y)
    return 1.0 / (2 * n) * sq_error.sum()

class TestLoss(unittest.TestCase):

    def test_zero_h_zero_y(self):
        self.assertAlmostEqual(loss(h=np.array([0]), y=np.array([0])), 0)

    def test_one_h_zero_y(self):
        self.assertAlmostEqual(loss(h=np.array([1]), y=np.array([0])), 0.5)

    def test_two_h_zero_y(self):
        self.assertAlmostEqual(loss(h=np.array([2]), y=np.array([0])), 2)

    def test_zero_h_one_y(self):
        self.assertAlmostEqual(loss(h=np.array([0]), y=np.array([1])), 0.5)

    def test_zero_h_two_y(self):
        self.assertAlmostEqual(loss(h=np.array([0]), y=np.array([2])), 2)

run_tests()

Ran 5 tests in 0.008s
OK

class LinearRegression:

    def predict(self, X):
        return np.dot(X, self._W)

    def _gradient_descent_step(self, X, targets, lr):
        predictions = self.predict(X)
        error = predictions - targets
        gradient = np.dot(X.T, error) / len(X)
        self._W -= lr * gradient

    def fit(self, X, y, n_iter=100000, lr=0.01):
        self._W = np.zeros(X.shape[1])
        self._cost_history = []
        self._w_history = [self._W]
        for i in range(n_iter):
            prediction = self.predict(X)
            cost = loss(prediction, y)
            self._cost_history.append(cost)
            self._gradient_descent_step(X, y, lr)
            self._w_history.append(self._W.copy())
        return self

class TestLinearRegression(unittest.TestCase):

    def test_find_coefficients(self):
        clf = LinearRegression()
        clf.fit(x, y, n_iter=2000, lr=0.01)
        np.testing.assert_array_almost_equal(
            clf._W, np.array([180921.19555322, 56294.9199925]))

run_tests()

Ran 6 tests in 1.661s
OK

clf = LinearRegression()
clf.fit(x, y, n_iter=2000, lr=0.01)

<__main__.LinearRegression at 0x219d3211c0>

clf._W

array([180921.19555322, 56294.9199925])
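Each _gradient_descent_step applies the MSE gradient (1/n) · Xᵀ(XW − y) to the weights. As a cross-check (an addition, not part of the original notebook), the closed-form least-squares solution on the same design matrix should match what gradient descent converges to:

# Closed-form least squares on the same [1, standardised GrLivArea] design matrix
W_exact, *_ = np.linalg.lstsq(x, y, rcond=None)
print(W_exact)   # expected: approximately [180921.196, 56294.920]

Note that the first coefficient is, up to convergence error, the mean of SalePrice, as it must be when the single feature is standardised to zero mean.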
plt.title('Cost Function J')
plt.xlabel('No. of iterations')
plt.ylabel('Cost')
plt.plot(clf._cost_history)
plt.show()

[Line plot: cost J falling from about 2.0e10 towards its minimum over 2000 iterations]

clf._cost_history[-1]

1569921604.833264

fig = plt.figure()
ax = plt.axes()
plt.title('Sale Price vs Living Area')
plt.xlabel('Living Area in square feet (normalised)')
plt.ylabel('Sale Price ($)')
plt.scatter(x[:, 1], y)
line, = ax.plot([], [], lw=2, color='red')
annotation = ax.text(-1, 700000, '')
annotation.set_animated(True)
plt.close()

# Generate the animation data
def init():
    line.set_data([], [])
    annotation.set_text('')
    return line, annotation

# animation function, called sequentially
def animate(i):
    xs = np.linspace(-5, 20, 100)
    ys = clf._w_history[i][1] * xs + clf._w_history[i][0]
    line.set_data(xs, ys)
    annotation.set_text('Cost = %.2f e10' % (clf._cost_history[i] / 10000000000))
    return line, annotation

anim = animation.FuncAnimation(fig, animate, init_func=init,
                               frames=300, interval=10, blit=True)
rc('animation', html='jshtml')
anim

[Animated scatter of Sale Price vs living area with the regression line converging; matplotlib warns that the animation exceeds its 20 MB embed limit and drops the remaining frames]

Create a 'User' data set having 5 columns namely: User ID, Gender, Age, EstimatedSalary and Purchased. Build a logistic regression model that can predict, from the given parameters, whether a person will buy a car or not.

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
%matplotlib inline

df = pd.read_csv("suv_data.csv")
df.head()

    User ID  Gender  Age  EstimatedSalary  Purchased
0  15624510    Male   19            19000          0
1  15810944    Male   35            20000          0
2  15668575  Female   26            43000          0
3  15603246  Female   27            57000          0
4  15804002    Male   19            76000          0

df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 400 entries, 0 to 399
Data columns (total 5 columns):
 #   Column           Non-Null Count  Dtype
 0   User ID          400 non-null    int64
 1   Gender           400 non-null    object
 2   Age              400 non-null    int64
 3   EstimatedSalary  400 non-null    int64
 4   Purchased        400 non-null    int64
dtypes: int64(4), object(1)
memory usage: 15.8+ KB

df.isnull().sum()

User ID            0
Gender             0
Age                0
EstimatedSalary    0
Purchased          0
dtype: int64

sns.countplot(x='Gender', data=df)   # positional use of the column is deprecated in recent seaborn
plt.show()

[Count plot of Gender: roughly 200 observations per class]
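Once more the statement asks for a generated 'User' table while the notebook loads suv_data.csv. A sketch with the statement's columns; the seed, value ranges and the purchase-probability rule are all assumptions:

rng = np.random.default_rng(7)
n = 500
users = pd.DataFrame({
    "User ID": rng.integers(15_000_000, 16_000_000, n),
    "Gender": rng.choice(["Male", "Female"], n),
    "Age": rng.integers(18, 60, n),
    "EstimatedSalary": rng.integers(15_000, 150_000, n),
})
# Assume older, higher-earning users are more likely to buy
logit = 0.1 * (users["Age"] - 38) + 0.00003 * (users["EstimatedSalary"] - 70_000)
users["Purchased"] = (rng.random(n) < 1 / (1 + np.exp(-logit))).astype(int)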
x = df.iloc[:, [2, 3]].values   # Age, EstimatedSalary
y = df.iloc[:, 4].values        # Purchased

x

array([[    19,  19000],
       [    35,  20000],
       [    26,  43000],
       ...,
       [    49,  36000]], dtype=int64)

[400 x 2 array of (Age, EstimatedSalary) pairs]
from sklearn.model_selection import train_test_split
x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.25, random_state=0)

from sklearn.preprocessing import StandardScaler
sc = StandardScaler()

x_train

[300 x 2 array of raw (Age, EstimatedSalary) training rows, dtype=int64]
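The StandardScaler above is instantiated but, judging by the raw-valued x_train printout, never applied. Presumably the intended step was the usual fit/transform pair; without it, salaries in the tens of thousands dominate the two features and the model below ends up predicting the majority class everywhere:

x_train = sc.fit_transform(x_train)   # fit the scaler on training data only
x_test = sc.transform(x_test)         # reuse the training mean/std on the test set

With unscaled inputs the logistic regression fitted next predicts 0 for every test row, which is exactly what the 0.68 accuracy and the confusion matrix further down show.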
from sklearn.linear_model import LogisticRegression
classifier = LogisticRegression(random_state=0)
classifier.fit(x_train, y_train)

y_pred = classifier.predict(x_test)
y_pred

array([0, 0, 0, ..., 0], dtype=int64)

[all 100 test predictions are 0]

y_test

[100 true labels: 68 zeros and 32 ones]

from sklearn.metrics import accuracy_score
accuracy_score(y_test, y_pred)

0.68

accuracy_score(y_test, y_pred) * 100

68.0

from sklearn.metrics import confusion_matrix
cm = confusion_matrix(y_test, y_pred)
cm

array([[68,  0],
       [32,  0]], dtype=int64)

SET B

Build a simple linear regression model for Fish Species Weight Prediction.
https://www.kaggle.com/aungpyaeap/fish-market?select=Fish.csv

import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
from itertools import combinations
import numpy as np

data = pd.read_csv("Fish.csv")
data.head()

  Species  Weight  Length1  Length2  Length3   Height   Width
0   Bream   242.0     23.2     25.4     30.0  11.5200  4.0200
1   Bream   290.0     24.0     26.3     31.2  12.4800  4.3056
2   Bream   340.0     23.9     26.5     31.1  12.3778  4.6961
3   Bream   363.0     26.3     29.0     33.5  12.7300  4.4555
4   Bream   430.0     26.5     29.0     34.0  12.4440  5.1340

data.isna().sum()

Species    0
Weight     0
Length1    0
Length2    0
Length3    0
Height     0
Width      0
dtype: int64
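The write-up stops after loading and checking the Fish data. A minimal sketch of the requested simple (single-feature) regression, predicting Weight; the choice of Length3 as predictor and the split parameters are assumptions:

from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score

X = data[["Length3"]]   # assumed single predictor
y = data["Weight"]

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)

model = LinearRegression().fit(X_train, y_train)
print("intercept:", model.intercept_, "slope:", model.coef_[0])
print("test R^2 :", r2_score(y_test, model.predict(X_test)))

Any of the three length columns would serve; they are strongly collinear, which is also why the statement asks for a simple rather than a multiple regression.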
