Scikit-Learn ML Cheat Sheet Guide

This document provides a summary of machine learning techniques using scikit-learn in Python. It loads iris data, divides it into training and test sets, trains a k-nearest neighbors model on the training set and predicts the test set labels. It then discusses various preprocessing techniques like standardization, binarization, normalization, handling categorical features, imputing missing values, and generating polynomial features to transform data for machine learning models.

Uploaded by

gepiv94928

Python Scikit-Learn Cheat Sheet for Machine Learning

Let’s create a basic example using the scikit-learn library that will:

⚫ Load the data
⚫ Divide the data into training and test sets
⚫ Train a model on the data using the KNN algorithm
⚫ Predict the result

from sklearn import neighbors, datasets, preprocessing

from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score
iris = datasets.load_iris()
X, y = iris.data[:, :2], iris.target
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=33)
scaler = preprocessing.StandardScaler().fit(X_train)
X_train = scaler.transform(X_train)
X_test = scaler.transform(X_test)
knn = neighbors.KNeighborsClassifier(n_neighbors=5)
knn.fit(X_train, y_train)
y_pred = knn.predict(X_test)
accuracy_score(y_test, y_pred)

Loading the data

Your data must be numeric and stored in NumPy arrays or SciPy sparse matrices.
Other types that can be converted to numeric arrays, such as a Pandas DataFrame, also work.

import numpy as np
X = np.random.random((10,5))
y = np.array(['M','M','F','F','M','F','M','M','F','F'])  # one label per row of X
X[X < 0.7] = 0
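
As a rough illustration of the point about input types, the sketch below feeds the same mostly-zero data to an estimator once as a NumPy array and once as a SciPy sparse matrix. The choice of KNeighborsClassifier, the seed, and the variable names are assumptions made for this example only.

```python
import numpy as np
from scipy import sparse
from sklearn.neighbors import KNeighborsClassifier

# Toy data in the same spirit as above: mostly zeros, so a sparse format fits.
rng = np.random.RandomState(0)
X_dense = rng.random_sample((10, 5))
X_dense[X_dense < 0.7] = 0
y = np.array(['M', 'M', 'F', 'F', 'M', 'F', 'M', 'M', 'F', 'F'])

# The same values stored as a SciPy CSR sparse matrix.
X_sparse = sparse.csr_matrix(X_dense)

# The estimator is fitted on either representation.
pred_dense = KNeighborsClassifier(n_neighbors=3).fit(X_dense, y).predict(X_dense)
pred_sparse = KNeighborsClassifier(n_neighbors=3).fit(X_sparse, y).predict(X_sparse)
print(pred_dense.shape, pred_sparse.shape)
```

Only the non-zero entries are stored in the sparse version, which can save memory when most values are zero.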

Train and Test

Once the data is loaded, your next task is to split your dataset into training data
and testing data.

from sklearn.model_selection import train_test_split

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
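
A small sketch of how the split sizes can be controlled. The toy arrays, the test_size value, and the use of stratify are assumptions chosen for illustration; by default train_test_split holds out 25% of the samples.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Toy data: 20 samples, 2 features, two balanced classes.
X = np.arange(40).reshape(20, 2)
y = np.array([0] * 10 + [1] * 10)

# Hold out 20% of the samples; stratify keeps class proportions similar
# in both parts, and random_state makes the shuffle reproducible.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)
print(X_train.shape, X_test.shape)
```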

Data Preprocessing

Standardization

Data standardization is a preprocessing step that rescales one or more attributes so
that each has a mean of 0 and a standard deviation of 1. It tends to work best when
the data is roughly Gaussian (bell curve) distributed, although this is not a strict
requirement. The scaler is fitted on the training data only and then applied to both
the training and test data.

from sklearn.preprocessing import StandardScaler

scaler = StandardScaler().fit(X_train)
standardized_X = scaler.transform(X_train)
standardized_X_test = scaler.transform(X_test)
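
To check what standardization does, here is a minimal sketch on a tiny hand-made array (the values and variable names are made up for this example). After the transform, each training column should have mean 0 and standard deviation 1, while the test data is scaled with the statistics learned from the training data.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

X_train = np.array([[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]])
X_test = np.array([[2.0, 40.0]])

# Learn per-column mean and standard deviation from the training data only.
scaler = StandardScaler().fit(X_train)
X_train_std = scaler.transform(X_train)
X_test_std = scaler.transform(X_test)

print(X_train_std.mean(axis=0))  # approximately 0 for each column
print(X_train_std.std(axis=0))   # approximately 1 for each column
```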

Binarization

Binarization is a common operation on text count data. With binarization, the analyst
can consider only the presence or absence of a feature rather than the number of
times it occurs.

from sklearn.preprocessing import Binarizer

binarizer = Binarizer(threshold=0.0).fit(X)
binary_X = binarizer.transform(X)
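
As a small illustration, the sketch below binarizes a made-up matrix of word counts. Values above the threshold become 1 and all other values become 0.

```python
import numpy as np
from sklearn.preprocessing import Binarizer

# Hypothetical word counts: rows are documents, columns are words.
counts = np.array([[0, 2, 1],
                   [3, 0, 0]])

# With threshold=0.0, any count greater than 0 is mapped to 1.
binary_counts = Binarizer(threshold=0.0).fit_transform(counts)
print(binary_counts)
```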

Normalization

Normalization is a technique commonly used to prepare data for machine learning.
Its goal is to bring numeric values onto a common scale without distorting the
differences in the ranges of values. Note that scikit-learn's Normalizer rescales
each sample (row) independently to unit norm, rather than rescaling each column.

from sklearn.preprocessing import Normalizer

scaler = Normalizer().fit(X_train)
normalized_X = scaler.transform(X_train)
normalized_X_test = scaler.transform(X_test)
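
A minimal sketch of what Normalizer does to individual rows, using made-up values. With the default L2 norm, each row is divided by its Euclidean length, so every non-zero row ends up with length 1.

```python
import numpy as np
from sklearn.preprocessing import Normalizer

X_small = np.array([[3.0, 4.0],
                    [1.0, 0.0]])

# Each row is rescaled independently; the default norm is 'l2'.
X_small_norm = Normalizer().fit_transform(X_small)
print(X_small_norm)                          # first row becomes [0.6, 0.8]
print(np.linalg.norm(X_small_norm, axis=1))  # each row now has length 1
```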

Encoding Categorical Features

LabelEncoder is another preprocessing class, used for encoding class labels.
It transforms non-numerical labels into numerical labels.

from sklearn.preprocessing import LabelEncoder

enc = LabelEncoder()
y = enc.fit_transform(y)
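
A short sketch of the mapping LabelEncoder learns, using a made-up label array. The classes are sorted, so 'F' is encoded as 0 and 'M' as 1, and inverse_transform recovers the original labels.

```python
import numpy as np
from sklearn.preprocessing import LabelEncoder

labels = np.array(['M', 'F', 'M', 'F', 'F'])

enc = LabelEncoder()
encoded = enc.fit_transform(labels)

print(enc.classes_)                    # the sorted unique labels
print(encoded)                         # integer codes, one per label
print(enc.inverse_transform(encoded))  # back to the original strings
```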

Imputing missing values

The SimpleImputer class in scikit-learn (which replaced the older Imputer class,
removed in version 0.22) provides basic strategies for imputing missing values. It
fills them in using the mean, the median, or the most frequent value of the column in
which the missing values are located. The missing_values parameter sets which
placeholder value is treated as missing.

from sklearn.impute import SimpleImputer

imp = SimpleImputer(missing_values=0, strategy='mean')
imp.fit_transform(X_train)
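
A minimal sketch of mean imputation, assuming a recent scikit-learn where the imputer is available as SimpleImputer in sklearn.impute, and assuming NaN marks the missing entries. The array is made up for illustration.

```python
import numpy as np
from sklearn.impute import SimpleImputer

X_missing = np.array([[1.0, np.nan],
                      [3.0, 4.0],
                      [np.nan, 6.0]])

# Replace each NaN with the mean of the observed values in its column.
imp = SimpleImputer(missing_values=np.nan, strategy='mean')
X_filled = imp.fit_transform(X_missing)

print(imp.statistics_)  # per-column means used as fill values
print(X_filled)
```

Here the column means are computed from the observed values only, so the first column is filled with 2 and the second with 5.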

Generating Polynomial Features

PolynomialFeatures generates a new feature matrix consisting of all polynomial
combinations of the features with degree less than or equal to the specified degree.
For example, if an input sample is two-dimensional and of the form [a, b], then the
degree-2 polynomial features are [1, a, b, a^2, ab, b^2].

from sklearn.preprocessing import PolynomialFeatures

poly = PolynomialFeatures(5)
poly.fit_transform(X)
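
The [a, b] example above can be checked with a single made-up sample, taking a = 2 and b = 3 and degree 2 (rather than the degree 5 used above, to keep the output short).

```python
import numpy as np
from sklearn.preprocessing import PolynomialFeatures

sample = np.array([[2.0, 3.0]])  # a = 2, b = 3

# Degree-2 expansion: [1, a, b, a^2, ab, b^2]
poly = PolynomialFeatures(degree=2)
expanded = poly.fit_transform(sample)
print(expanded)  # [[1. 2. 3. 4. 6. 9.]]
```

The number of generated features grows quickly with the degree and with the number of input columns, so high degrees are usually applied only to small feature sets.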

Full Article and Source

https://www.edureka.co/blog/cheatsheets/python-scikit-learn-cheat-sheet/
