Scikit Learn Cheat Sheet

Uploaded by

Rania Dirar

Scikit-Learn(Cyber-security) Cheat Sheet

by satwik dondapati (sati) via cheatography.com/121228/cs/22124/

Definition

Scikit-learn is an open source Python library that implements a range of machine learning, preprocessing, cross-validation and visualization algorithms using a unified interface.

Pandas functions for importing Data

pd.read_csv(filename)                    From a CSV file
pd.read_excel(filename)                  From an Excel file
pd.read_sql(query, connection_object)    Read from a SQL table/database
pd.read_clipboard()                      Takes the contents of your clipboard and passes it to read_csv() (read_table() in older pandas versions)

Splitting Data

from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=7)

Handling Missing Data

import numpy as np
from sklearn.impute import SimpleImputer
# Replace NaN entries in columns 1 and 2 with the column mean
missingvalues = SimpleImputer(missing_values=np.nan, strategy='mean')
missingvalues = missingvalues.fit(X[:, 1:3])
X[:, 1:3] = missingvalues.transform(X[:, 1:3])

Visualization using Scikit-learn

from sklearn.metrics import RocCurveDisplay                       Importing "RocCurveDisplay" to plot
svc_disp = RocCurveDisplay.from_estimator(svc, X_test, y_test)    Plotting the Receiver Operating Characteristic (ROC) curve
ConfusionMatrixDisplay.from_estimator(svc, X_test, y_test)        Plotting the confusion matrix

In scikit-learn versions before 1.0 the equivalent helpers were plot_roc_curve(svc, X_test, y_test) and metrics.plot_confusion_matrix(svc, X_test, y_test); both were deprecated in 1.0 and removed in 1.2.
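The ROC plot from the visualization section is drawn from false-positive and true-positive rates, and those can be computed without any plotting backend. The sketch below is only an illustration under stated assumptions: the data is synthetic (make_classification), and the names svc, scores and roc_auc are chosen here rather than taken from the sheet.

```python
from sklearn.datasets import make_classification
from sklearn.metrics import auc, roc_curve
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Synthetic binary classification data (an assumption for illustration).
X, y = make_classification(n_samples=500, random_state=7)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=7)

svc = SVC(random_state=7)
svc.fit(X_train, y_train)

# decision_function returns one continuous score per sample;
# roc_curve sweeps a decision threshold over those scores.
scores = svc.decision_function(X_test)
fpr, tpr, thresholds = roc_curve(y_test, scores)

# Area under the ROC curve, computed from the same points a plot would show.
roc_auc = auc(fpr, tpr)
print(f"AUC = {roc_auc:.3f}")
```

The fpr and tpr arrays are the x and y coordinates of the curve, so they can be handed to any plotting library.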
Linear Regression

from sklearn.linear_model import LinearRegression
linear_reg = LinearRegression()
linear_reg.fit(X, y)

Decision Tree and Random Forest

from sklearn.tree import DecisionTreeRegressor
from sklearn.ensemble import RandomForestRegressor
regressor = DecisionTreeRegressor(random_state=0)
regressor.fit(X, y)
regressor2 = RandomForestRegressor(n_estimators=100, random_state=0)
regressor2.fit(X, y)

Clustering metrics

Adjusted Rand Index
>>> from sklearn.metrics import adjusted_rand_score
>>> adjusted_rand_score(y_true, y_pred)
Homogeneity
>>> from sklearn.metrics import homogeneity_score
>>> homogeneity_score(y_true, y_pred)
V-measure
>>> from sklearn.metrics import v_measure_score
>>> v_measure_score(y_true, y_pred)

Pandas Data Cleaning functions

pd.isnull(df)        Checks for null values, returns a Boolean array
pd.notnull(df)       Opposite of pd.isnull()
df.dropna()          Drop all rows that contain null values
df.dropna(axis=1)    Drop all columns that contain null values
df.fillna(x)         Replace all null values with x

Numpy Basic Functions

import numpy as np             Importing numpy
example = [0,1,2]
example = np.array(example)    array([0, 1, 2])
np.arange(1,4)                 array([1, 2, 3])
np.zeros((2,2))                array([[0., 0.], [0., 0.]])

Cross-Validation

from sklearn.datasets import make_regression
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_validate
X, y = make_regression(n_samples=1000, random_state=0)
lr = LinearRegression()
# Fits and scores the model on each fold (5 folds by default)
result = cross_validate(lr, X, y)
result['test_score']

Cross-validation estimates how effective a model is by re-sampling the data into training and test folds and fitting and scoring the model on each split in turn.
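To make the re-sampling idea in the cross-validation note concrete, the following sketch runs the folds by hand and compares the result with cross_validate. It assumes 5 unshuffled folds; the names kfold, manual_scores and library_scores are illustrative only.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import KFold, cross_validate

X, y = make_regression(n_samples=1000, random_state=0)

# Split the rows into 5 consecutive folds; each fold is the test set once.
kfold = KFold(n_splits=5)
manual_scores = []
for train_idx, test_idx in kfold.split(X):
    model = LinearRegression()
    model.fit(X[train_idx], y[train_idx])
    # score() returns R^2 on the held-out fold
    manual_scores.append(model.score(X[test_idx], y[test_idx]))
manual_scores = np.array(manual_scores)

# The library call performs the same loop internally.
library_scores = cross_validate(LinearRegression(), X, y, cv=5)['test_score']
print(manual_scores.mean(), library_scores.mean())
```

Averaging the per-fold scores gives a single estimate that depends less on one particular train/test split.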

By satwik dondapati (sati), cheatography.com/sati/
Published 22nd March, 2020. Last updated 22nd March, 2020.
Numpy Basic Functions (cont)

np.linspace(0,10,2)     array([ 0., 10.]), gives two evenly spaced values (both endpoints included)
np.eye(2)               array([[1., 0.], [0., 1.]]), 2*2 Identity Matrix
example.reshape(3,1)    array([[0], [1], [2]])

Loading Dataset from local Machine

import pandas as pd
data = pd.read_csv(pathname)

If the file is in the current working directory, the file name alone can be used as the path.

Feature Scaling

from sklearn.preprocessing import StandardScaler
sc_X = StandardScaler()
# Learn mean and standard deviation from the training set only
X_train = sc_X.fit_transform(X_train)
# Reuse the same statistics on the test set
X_test = sc_X.transform(X_test)

Features with larger numeric ranges dominate the Euclidean distance, so scaling is done to bring all values onto the same scale. Scikit-learn estimators do not scale features on their own: tree-based models are largely insensitive to feature scale, while distance-based and kernel models such as K-NN and SVM usually need scaled inputs.
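The remark in the Feature Scaling section that larger numbers dominate the Euclidean distance can be checked on a small example. This is a sketch on made-up data: the age and salary columns are hypothetical, and share_raw and share_scaled are names introduced here to show how much of the squared distance each feature contributes.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Hypothetical data: column 0 is age in years, column 1 is salary.
X = np.array([[25.0, 50000.0],
              [45.0, 52000.0],
              [26.0, 90000.0],
              [38.0, 61000.0]])

# Squared Euclidean distance between rows 0 and 1, split per feature.
diff_raw = X[0] - X[1]
share_raw = diff_raw ** 2 / np.sum(diff_raw ** 2)

# Standardize each column to zero mean and unit variance, then repeat.
X_scaled = StandardScaler().fit_transform(X)
diff_scaled = X_scaled[0] - X_scaled[1]
share_scaled = diff_scaled ** 2 / np.sum(diff_scaled ** 2)

print("share of squared distance (age, salary), raw:   ", share_raw)
print("share of squared distance (age, salary), scaled:", share_scaled)
```

On the raw data almost all of the distance comes from salary, even though a 20-year age gap is large relative to the spread of ages. After scaling, each feature contributes in proportion to how unusual the difference is for that feature.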
Loading Data from Standard datasets

from sklearn import datasets
iris = datasets.load_iris()
digits = datasets.load_digits()

SVR (Non-linear Regression model)

from sklearn.svm import SVR
regressor = SVR(kernel='rbf')
regressor.fit(X, y)
y_prediction = regressor.predict(values)
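The kernel argument of SVR controls the shape of the function the model can fit. A minimal sketch, assuming synthetic data drawn from y = sin(x) and default hyperparameters, compares the 'linear' and 'rbf' kernels. The scores dictionary is a name introduced here.

```python
import numpy as np
from sklearn.svm import SVR

# Synthetic, clearly non-linear data (an assumption for illustration).
X = np.linspace(0, 2 * np.pi, 200).reshape(-1, 1)
y = np.sin(X).ravel()

scores = {}
for kernel in ("linear", "rbf"):
    regressor = SVR(kernel=kernel)
    regressor.fit(X, y)
    # score() returns R^2, here measured on the training data
    scores[kernel] = regressor.score(X, y)

print(scores)
```

On this data the 'rbf' kernel should score clearly higher, because a straight line cannot follow a sine wave. These are training-set scores, so they show fit rather than generalization.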

The SVR kernel is selected based on the given problem. If the problem is linear, use kernel='linear'. If the problem is non-linear, choose either 'poly' or 'rbf' (Gaussian).

Encoding Categorical Variables

from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import LabelEncoder, OneHotEncoder
# Integer-encode the categories in column 0
labelencoder_X = LabelEncoder()
X[:, 0] = labelencoder_X.fit_transform(X[:, 0])
# One-hot encode column 0 and keep the other columns unchanged.
# OneHotEncoder(categorical_features=[0]) was removed in scikit-learn 0.22.
ct = ColumnTransformer([('onehot', OneHotEncoder(), [0])], remainder='passthrough', sparse_threshold=0)
X = ct.fit_transform(X)

Some Classification Models

Logistic Regression
K-NN (K-nearest neighbours)
Support Vector Machine (SVM)
Naive Bayes
Decision Tree Classification
Random Forest Classification

Polynomial Regression

from sklearn.preprocessing import PolynomialFeatures
poly_reg = PolynomialFeatures(degree=2)
X_poly = poly_reg.fit_transform(X)
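To see what PolynomialFeatures produces, the sketch below expands a single feature to degree 2 and fits a linear model on the expanded columns. The data is made up so that y = 2*x**2 + 1 exactly; the names lin_reg and pred are introduced here.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import PolynomialFeatures

# Made-up data following y = 2*x**2 + 1 exactly.
X = np.array([[1.0], [2.0], [3.0], [4.0]])
y = np.array([3.0, 9.0, 19.0, 33.0])

poly_reg = PolynomialFeatures(degree=2)
X_poly = poly_reg.fit_transform(X)  # columns: 1, x, x**2
print(X_poly)

# A plain linear model on the expanded columns fits the quadratic.
lin_reg = LinearRegression()
lin_reg.fit(X_poly, y)

# New inputs must go through the same transform before predicting.
pred = lin_reg.predict(poly_reg.transform([[5.0]]))
print(pred)
```

The model stays linear in its coefficients; only the inputs are expanded, which is why LinearRegression can still be used.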

Polynomial regression checks not only the relation between X (independent) and y (dependent) but also the relation between y and the powers X^2 ... X^n, where n is the degree we specify.

Some Clustering Models

K-Means Clustering
Hierarchical Clustering
DBSCAN

Knowing about Data information with Pandas

df.head(n)       First n rows of the DataFrame
df.tail(n)       Last n rows of the DataFrame
df.shape         Number of rows and columns
df.info()        Index, Datatype and Memory information
df.describe()    Summary statistics for numerical columns

Evaluation of Regression Model Performance

$R^2 = 1 - \frac{SS_{res}}{SS_{total}}$
$SS_{res} = \sum_i (y_i - \hat{y}_i)^2$
$SS_{total} = \sum_i (y_i - \bar{y})^2$

from sklearn.metrics import r2_score
r2_score(y_true, y_pred)

The greater the R^2 value, the better the model fits the data; the best possible value is 1.
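The R^2 formula from the regression evaluation section can be verified by computing both sums of squares by hand and comparing with r2_score. This is a sketch on small made-up arrays; ss_res, ss_total, r2_manual and r2_library are names introduced here.

```python
import numpy as np
from sklearn.metrics import r2_score

# Made-up true and predicted values.
y_true = np.array([3.0, -0.5, 2.0, 7.0])
y_pred = np.array([2.5, 0.0, 2.0, 8.0])

# Residual sum of squares: squared errors of the predictions.
ss_res = np.sum((y_true - y_pred) ** 2)
# Total sum of squares: squared deviations from the mean of y_true.
ss_total = np.sum((y_true - y_true.mean()) ** 2)

r2_manual = 1 - ss_res / ss_total
r2_library = r2_score(y_true, y_pred)
print(r2_manual, r2_library)
```

Both values agree, at about 0.949 for these arrays. A model that always predicts the mean of y_true would score 0, and worse models can score below 0.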
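Converting Dataframe to Matrix

import pandas as pd
data = pd.read_csv("data.csv")
# All columns except the last one
X = data.iloc[:, :-1].values
# Column index 3 (the last column when the file has four columns)
y = data.iloc[:, 3].values

X holds the independent variables and y is the dependent variable.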
