Python Stuff

The document discusses using linear regression, logistic regression, Naive Bayes, LDA and QDA machine learning algorithms in Python. It includes code snippets for importing datasets, defining models, fitting models to training data, predicting classes and calculating accuracy scores.

Uploaded by

master guardian
Copyright
© All Rights Reserved

To import

from google.colab import drive
drive.mount('/content/drive')
import pandas
df = pandas.read_csv('/content/drive/My Drive/advertising.csv')
print(df)

from sklearn import linear_model 
import numpy as np 
import matplotlib.pyplot as plt
from sklearn.linear_model import LinearRegression

to plot graph

df.plot(kind="scatter",x='TV',y='Sales')

to define variable
x = df['TV'].values.reshape(-1,1)
y = df['Sales'].values.reshape(-1,1)
OR IF MANY VARIABLES
x = df[['Avg. Session Length','Time on App','Time on Website','Length of Membership']]
y = df['Yearly Amount Spent']

to define linear regression


reg = LinearRegression() 

to fit selected variable


reg.fit(x, y)

to predict with the model


y_pred = reg.predict(x)

to plot regression line


plt.scatter(x, y,  color='gray')

plt.plot(x, y_pred, color='red', linewidth=2)
plt.show()

to find coefficient of determination (r2)


r_sq = reg.score(x, y)
print('coefficient of determination:', r_sq)
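What reg.score returns can be checked by hand. The sketch below is only an illustration: it uses small made-up numbers instead of the TV and Sales columns (an assumption, since the advertising file is not shown here) and computes r2 as 1 minus residual sum of squares over total sum of squares.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Hypothetical toy data standing in for the TV and Sales columns.
x = np.array([10.0, 20.0, 30.0, 40.0, 50.0]).reshape(-1, 1)
y = np.array([3.1, 4.9, 7.2, 8.8, 11.1])

reg = LinearRegression().fit(x, y)
y_pred = reg.predict(x)

ss_res = np.sum((y - y_pred) ** 2)     # residual sum of squares
ss_tot = np.sum((y - y.mean()) ** 2)   # total sum of squares
r_sq_manual = 1 - ss_res / ss_tot

# Both numbers should agree.
print(r_sq_manual, reg.score(x, y))
```

The closer r2 is to 1, the more of the variation in y the fitted line accounts for.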

to find intercept
beta=float(reg.intercept_)

print(beta)

to find slope

alpha=float(reg.coef_)

print(alpha)
If float() raises an error (for example with many variables, where reg.coef_ holds more than one value), remove float()
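A small sketch of why float() can fail on the slope. The two-column data here is made up for illustration (not from the advertising or customer files): with more than one variable, reg.coef_ has one slope per column, so it cannot be turned into a single float.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Hypothetical two-column data, only for illustration.
x = np.array([[1.0, 2.0], [2.0, 1.0], [3.0, 4.0], [4.0, 3.0], [5.0, 5.0]])
y = np.array([5.0, 4.0, 11.0, 10.0, 15.0])

reg = LinearRegression().fit(x, y)
print(reg.coef_.shape)        # one slope per column

try:
    alpha = float(reg.coef_)  # fails: coef_ holds more than one value
    float_worked = True
except TypeError:
    alpha = reg.coef_         # keep the whole array instead
    float_worked = False

print(alpha)
```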

to define function for prediction

def pred(x_i: float) -> float:
    return alpha * x_i + beta

to find mean value

mTV=np.mean(x)

print(mTV)

to predict sales for twice mean value

pred(2*mTV)
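The hand-written prediction (slope times x plus intercept) should give the same number as reg.predict. A minimal check, assuming made-up data in place of the TV and Sales columns:

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Hypothetical toy data standing in for the TV and Sales columns.
x = np.array([10.0, 20.0, 30.0, 40.0, 50.0]).reshape(-1, 1)
y = np.array([3.1, 4.9, 7.2, 8.8, 11.1])

reg = LinearRegression().fit(x, y)

x_new = 2 * np.mean(x)                              # twice the mean value
by_formula = reg.coef_[0] * x_new + reg.intercept_  # slope * x + intercept
by_model = reg.predict(np.array([[x_new]]))[0]      # predict wants a 2D input

print(by_formula, by_model)
```

reg.predict is the safer choice once there is more than one variable, because it handles all the slopes at once.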
LOGISTIC REGRESSION
To import

from google.colab import drive
drive.mount('/content/drive')
import pandas
df = pandas.read_csv('/content/drive/My Drive/costumer_database.csv')
print(df)

import matplotlib.pyplot as plt
import numpy as np
from sklearn.linear_model import LogisticRegression

to put in variables

x = np.array([3000, 1000, 2700, 1500, 1200, 1400, 4200, 580, 7390, 245]).reshape(-1, 1)
y = np.array([1,  0,  1,  0,  0,  0,  1,  0,  1,  0])

xtest = np.array([1970, 2845, 4656, 5800, 900]).reshape(-1, 1)
ytest = np.array([0,  1,  1,  1,  1])

IF THE QUESTION USES 2 COLUMNS

x = np.array([[25000,3000],[35000,1000],[23000,2700], 
[28000,1500],[30000,1200],[26000,1400],[43000,4200],[34000,580],
[42000,7390], [26000,245]])
y = np.array([1,  0,  1,  0,  0,  0,  1,  0,  1,  0])
xtest = np.array([[23000,1970], [29000,2845],[31000,4656],
[42000,5800],[30000,900]])
ytest = np.array([0,  1,  1,  1,  1])

IF THERE IS TOO MUCH DATA (split into training and test sets)


x_for_train = x[:700]
y_for_train = y[:700]
x_for_test = x[700:1000]
y_for_test = y[700:1000]
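Another way to make the same 700/300 split is sklearn's train_test_split, which shuffles the rows first. This is a sketch under assumptions: the 1000-row data is randomly generated here, and test_size, random_state and stratify are my choices, not part of the original notes.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Hypothetical data: 1000 rows, one column, binary labels.
rng = np.random.default_rng(0)
x = rng.normal(size=(1000, 1))
y = (x[:, 0] > 0).astype(int)

# 30% of the rows go to the test set; stratify keeps the class mix similar.
x_for_train, x_for_test, y_for_train, y_for_test = train_test_split(
    x, y, test_size=0.3, random_state=0, stratify=y
)

print(x_for_train.shape, x_for_test.shape)
```

Shuffling matters when the file is sorted (for example by date or by class), since plain slicing would then give a test set that looks different from the training set.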

to define model to be used

model = LogisticRegression()

to fit the training data

model.fit(x, y)

to find intercept and coefficient

beta0=model.intercept_

beta1=model.coef_
print (beta0, beta1)
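The intercept and coefficient can be turned into a probability by hand with the logistic function p = 1 / (1 + exp(-(beta0 + beta1*x))). The sketch below reuses the one-column data from above and compares the result with predict_proba; max_iter=1000 is my own setting to give the solver room on unscaled data.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

x = np.array([3000, 1000, 2700, 1500, 1200, 1400, 4200, 580, 7390, 245]).reshape(-1, 1)
y = np.array([1, 0, 1, 0, 0, 0, 1, 0, 1, 0])
xtest = np.array([1970, 2845, 4656, 5800, 900]).reshape(-1, 1)

# max_iter=1000 is an assumption, not from the original notes.
model = LogisticRegression(max_iter=1000).fit(x, y)

beta0 = model.intercept_[0]   # intercept_ has shape (1,)
beta1 = model.coef_[0, 0]     # coef_ has shape (1, 1)

z = beta0 + beta1 * xtest[:, 0]
p_manual = 1.0 / (1.0 + np.exp(-z))          # logistic function by hand
p_model = model.predict_proba(xtest)[:, 1]   # probability of class 1

print(p_manual)
print(p_model)
```

A test point is predicted as class 1 when this probability is above 0.5.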

to find predicted class

ypred=model.predict(xtest)

print (ypred)
to find accuracy

model.score(xtest, ytest)
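For a classifier, model.score is the accuracy: the share of test points whose predicted class matches the true class. A minimal check with the same one-column data (max_iter=1000 is again my own setting):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score

x = np.array([3000, 1000, 2700, 1500, 1200, 1400, 4200, 580, 7390, 245]).reshape(-1, 1)
y = np.array([1, 0, 1, 0, 0, 0, 1, 0, 1, 0])
xtest = np.array([1970, 2845, 4656, 5800, 900]).reshape(-1, 1)
ytest = np.array([0, 1, 1, 1, 1])

# max_iter=1000 is an assumption, not from the original notes.
model = LogisticRegression(max_iter=1000).fit(x, y)
ypred = model.predict(xtest)

acc_manual = np.mean(ypred == ytest)       # share of correct predictions
acc_metric = accuracy_score(ytest, ypred)  # same value from sklearn.metrics
acc_score = model.score(xtest, ytest)      # same value from the model

print(acc_manual, acc_metric, acc_score)
```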
NAÏVE BAYES

To import

from google.colab import drive
drive.mount('/content/drive')
import pandas
df = pandas.read_csv('/content/drive/My Drive/costumer_database.csv')
print(df)

import matplotlib.pyplot as plt
import numpy as np
import sklearn
from sklearn.naive_bayes import GaussianNB

To input variables
x = np.array([3000, 1000, 2700, 1500, 1200, 1400, 4200, 580, 7390, 245]).reshape(-1, 1)

y = np.array([1,  0,  1,  0,  0,  0,  1,  0,  1,  0])

xtest = np.array([1970, 2845, 4656, 5800, 900]).reshape(-1, 1)
ytest = np.array([0,  1,  1,  1,  1])

To define model
model = GaussianNB()

To fit training data


model.fit(x, y)

To find predicted class of dataset


ypred=model.predict(xtest)

print (ypred)
To find accuracy
model.score(xtest, ytest)
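To see what GaussianNB learns from fit, the sketch below prints the class priors and the per-class means it stores (class_prior_ and theta_) and compares them with values computed directly from the training data. It reuses the same arrays as above; nothing else is assumed.

```python
import numpy as np
from sklearn.naive_bayes import GaussianNB

x = np.array([3000, 1000, 2700, 1500, 1200, 1400, 4200, 580, 7390, 245]).reshape(-1, 1)
y = np.array([1, 0, 1, 0, 0, 0, 1, 0, 1, 0])

model = GaussianNB().fit(x, y)

# Share of training points in each class (class 0 first, then class 1).
prior_manual = np.array([np.mean(y == 0), np.mean(y == 1)])

# Mean of x inside each class.
mean_manual = np.array([x[y == 0].mean(axis=0), x[y == 1].mean(axis=0)])

print(model.class_prior_, prior_manual)
print(model.theta_, mean_manual)
```

The model combines these priors with a normal distribution around each class mean to decide which class a new point most likely belongs to.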

FOR LDA: Just change the model


To import
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis as LDA

Define the model


modelLDA = LDA()

To fit the model


modelLDA.fit(x, y)

To find predicted class of dataset


ypredLDA=modelLDA.predict(xtest)

print (ypredLDA)

To find accuracy
modelLDA.score(xtest, ytest)

FOR QDA:
To import
from sklearn.discriminant_analysis import QuadraticDiscriminantAnalysis as QDA

To define model
modelQDA = QDA()
To fit the model
modelQDA.fit(x, y)

To find predicted class of dataset


ypredQDA=modelQDA.predict(xtest)

print (ypredQDA)

To find accuracy
modelQDA.score(xtest, ytest)
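Since only the model changes between the sections above, all four classifiers can be fitted and scored in one loop. This is a sketch using the same one-column data; the dictionary name models and max_iter=1000 are my own choices.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import GaussianNB
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis as LDA
from sklearn.discriminant_analysis import QuadraticDiscriminantAnalysis as QDA

x = np.array([3000, 1000, 2700, 1500, 1200, 1400, 4200, 580, 7390, 245]).reshape(-1, 1)
y = np.array([1, 0, 1, 0, 0, 0, 1, 0, 1, 0])
xtest = np.array([1970, 2845, 4656, 5800, 900]).reshape(-1, 1)
ytest = np.array([0, 1, 1, 1, 1])

# Hypothetical helper dictionary: label -> model object.
models = {
    "Logistic": LogisticRegression(max_iter=1000),
    "NaiveBayes": GaussianNB(),
    "LDA": LDA(),
    "QDA": QDA(),
}

scores = {}
for name, m in models.items():
    m.fit(x, y)                           # fit on the training data
    scores[name] = m.score(xtest, ytest)  # accuracy on the test data
    print(name, scores[name])
```

With only five test points each score moves in steps of 0.2, so small differences between models here say very little.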
