Scikit-Learn (Sklearn) in Python

Scikit-learn (sklearn) is a prominent Python library for machine learning that provides tools for classification, regression, clustering, and dimensionality reduction. It includes a wide range of supervised and unsupervised learning algorithms, cross-validation methods, and several datasets for practice. While it is excellent for building machine learning models, it is not designed for data manipulation, which is better handled by libraries like NumPy and Pandas.

Uploaded by

gopalelectronics19

Scikit-learn (sklearn) in Python

What is scikit-learn or sklearn?

Scikit-learn is probably the most useful library for machine learning in Python. The sklearn library contains many efficient tools for machine learning and statistical modeling, including classification, regression, clustering, and dimensionality reduction.

Please note that sklearn is used to build machine learning models. It should not be used for reading, manipulating, and summarizing data; there are better libraries for that (e.g. NumPy and Pandas).
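To make this division of labour concrete, here is a minimal sketch in which NumPy reads and cleans the data and sklearn only does the modeling. The CSV content, the column names, and the choice of `LogisticRegression` are all assumptions made for illustration.

```python
import io

import numpy as np
from sklearn.linear_model import LogisticRegression

# Hypothetical CSV content; the second data row has a missing width.
csv_text = """length,width,label
5.1,3.5,0
4.9,,0
6.7,3.1,1
6.3,2.5,1
5.0,3.4,0
"""

# Reading and cleaning: NumPy's job. Empty fields are read as NaN.
raw = np.genfromtxt(io.StringIO(csv_text), delimiter=",", skip_header=1)
clean = raw[~np.isnan(raw).any(axis=1)]  # drop rows with missing values

# Modeling: scikit-learn's job.
X = clean[:, :2]
y = clean[:, 2].astype(int)
model = LogisticRegression().fit(X, y)
print(model.predict(X))
```

In a real project Pandas would usually replace the `np.genfromtxt` step, but the hand-off to sklearn looks the same: a numeric feature matrix and a label vector.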

Components of scikit-learn:

Scikit-learn comes loaded with a lot of features. Here are a few of them to help you understand the spread:

• Supervised learning algorithms: Think of any supervised machine learning algorithm you have heard of, and there is a very high chance that it is part of scikit-learn. Generalized linear models (e.g. linear regression), Support Vector Machines (SVM), decision trees, and Bayesian methods are all part of the scikit-learn toolbox. This breadth of algorithms is one of the big reasons for the high usage of scikit-learn. I started using scikit-learn to solve supervised learning problems and would recommend the same starting point to people new to scikit-learn or machine learning.

• Cross-validation: sklearn offers various methods to check the accuracy of supervised models on unseen data.

• Unsupervised learning algorithms: Again, there is a wide range of algorithms on offer, from clustering, factor analysis, and principal component analysis to unsupervised neural networks.

• Various toy datasets: These came in handy while I was learning scikit-learn. I had learned SAS using various academic datasets (e.g. the Iris dataset and the Boston house prices dataset), and having them at hand while learning a new library helped a lot.

• Feature extraction: Scikit-learn can extract features from images and text (e.g. bag of words).
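The components listed above can be touched in a few lines each. The sketch below is only an illustration: the specific estimators (`LogisticRegression`, `PCA`, `CountVectorizer`), the `max_iter=1000` setting, and the two example sentences are my own choices, not something the list prescribes.

```python
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

# Toy dataset: Iris ships with scikit-learn.
iris = load_iris()
X, y = iris.data, iris.target

# Supervised learning + cross-validation: one accuracy score per fold.
clf = LogisticRegression(max_iter=1000)  # max_iter is an assumed setting
scores = cross_val_score(clf, X, y, cv=5)
print("fold accuracies:", scores)

# Unsupervised learning: project the 4 features onto 2 principal components.
X_2d = PCA(n_components=2).fit_transform(X)
print(X_2d.shape)  # (150, 2)

# Feature extraction: bag of words on two made-up sentences.
docs = ["the cat sat", "the dog sat down"]
vectorizer = CountVectorizer()
counts = vectorizer.fit_transform(docs)
print(sorted(vectorizer.vocabulary_))  # ['cat', 'dog', 'down', 'sat', 'the']
print(counts.toarray())
```

Note that every estimator follows the same pattern of constructing an object and then calling `fit`, `fit_transform`, or `predict`, which is a large part of why the library feels simple to use.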

Community / Organizations using scikit-learn:

One of the main reasons for using open source tools is the huge community behind them, and the same is true for sklearn. There were about 35 contributors to scikit-learn at the time of writing, the most notable being Andreas Mueller (P.S. Andy’s machine learning cheat sheet is one of the best visualizations for understanding the spectrum of machine learning algorithms).

Various organizations, such as Evernote, Inria, and AWeber, are displayed on the scikit-learn home page as users, but I truly believe that the actual usage is far wider.

In addition to these communities, there are various meetups across the globe. There was also a Kaggle knowledge contest, which finished recently but might still be one of the best places to start playing around with the library.

[Figure: Machine learning cheat sheet; see the original image for better resolution]

Quick Example:

Now that you understand the ecosystem at a high level, let me illustrate the use of sklearn with an example. The idea is just to show how simple sklearn is to use. We will look at various algorithms and the best ways to use them in one of the articles that follow.

We will build a logistic regression model on the Iris dataset:

Step 1: Import the relevant libraries and read the dataset

[stextbox id = “grey”]

import numpy as np
import matplotlib.pyplot as plt  # the plotting interface lives in matplotlib.pyplot
from sklearn import datasets
from sklearn import metrics
from sklearn.linear_model import LogisticRegression

[/stextbox]

We have imported all the libraries. Next, we read the dataset:

[stextbox id = “grey”]

dataset = datasets.load_iris()

[/stextbox]
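As a quick check of what `load_iris()` returns, the following sketch prints the main attributes of the loaded object. It repeats the import and the load so that it runs on its own.

```python
from sklearn import datasets

dataset = datasets.load_iris()

print(dataset.data.shape)     # (150, 4): 150 flowers, 4 measurements each
print(dataset.target.shape)   # (150,): one class label per flower
print(dataset.feature_names)  # names of the 4 measurement columns
print(dataset.target_names)   # ['setosa' 'versicolor' 'virginica']
```

`dataset.data` and `dataset.target` are plain NumPy arrays, which is why they can be passed straight to `fit` and `predict` later on.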

Step 2: Understand the dataset by looking at distributions and plots

I am skipping this step for now. You can read this article if you want to learn exploratory analysis.
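For readers who want at least a quick look before modeling, here is a minimal sketch of the kind of summary this step refers to. It uses only NumPy and prints numbers rather than drawing plots; the particular statistics shown are my own choice.

```python
import numpy as np
from sklearn import datasets

dataset = datasets.load_iris()
X, y = dataset.data, dataset.target

# Class balance: how many samples per species.
print(np.bincount(y))  # [50 50 50]

# Overall mean and standard deviation of each feature.
for name, mean, std in zip(dataset.feature_names, X.mean(axis=0), X.std(axis=0)):
    print(f"{name}: mean={mean:.2f}, std={std:.2f}")

# Per-species feature means, a rough view of how separable the classes are.
for label, species in enumerate(dataset.target_names):
    print(species, X[y == label].mean(axis=0).round(2))
```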

Step 3: Build a logistic regression model on the dataset and make predictions

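[stextbox id = “grey”]

# Create the estimator with default settings (assumed; the class is imported in Step 1)
model = LogisticRegression()
model.fit(dataset.data, dataset.target)

# Predict on the same data the model was trained on
expected = dataset.target
predicted = model.predict(dataset.data)

[/stextbox]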
Step 4: Print the classification report and confusion matrix

[stextbox id = “grey”]

print(metrics.classification_report(expected, predicted))
print(metrics.confusion_matrix(expected, predicted))

[/stextbox]
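Steps 3 and 4 score the model on the same rows it was trained on, which tends to look better than performance on unseen data. As a hedged variation, not part of the walkthrough above, the sketch below holds out 30% of the data and shows how to read the confusion matrix. The split size, `random_state`, and `max_iter` values are assumptions.

```python
import numpy as np
from sklearn import datasets, metrics
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

dataset = datasets.load_iris()

# Hold out 30% of the rows, keeping the class proportions equal in both parts.
X_train, X_test, y_train, y_test = train_test_split(
    dataset.data, dataset.target,
    test_size=0.3, stratify=dataset.target, random_state=0,
)

model = LogisticRegression(max_iter=1000)  # max_iter is an assumed setting
model.fit(X_train, y_train)
predicted = model.predict(X_test)

# Rows are the true classes and columns the predicted classes,
# so the diagonal counts the correct predictions.
cm = metrics.confusion_matrix(y_test, predicted)
print(cm)

# Accuracy is the diagonal total divided by the number of test samples.
accuracy = np.trace(cm) / cm.sum()
print(accuracy, metrics.accuracy_score(y_test, predicted))
```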
