Regression with scikit-learn
▪ introduction
▪ installation/distribution
▪ essential/auxiliary libraries
▪ usage
scikit-learn
introduction ---
scikit-learn (also known as sklearn) is a free software machine learning library for the Python programming language.
▪ free
▪ open-source
▪ constantly being developed and improved
▪ an active user community
▪ state-of-the-art machine learning algorithms
▪ provides good documentation
▪ widely used in industry and academia
▪ a wealth of tutorials and code snippets are available online
▪ works well with many scientific Python tools
installation ---
1. Anaconda (recommended): free
2. Enthought Canopy: not free
▪ scikit-learn can also be installed independently
anaconda ---
▪ a Python distribution for large-scale data processing, predictive analytics, and scientific computing
⇛ comes with:
▪ NumPy
▪ SciPy
▪ matplotlib
▪ pandas
▪ IPython
▪ Jupyter Notebook
▪ scikit-learn
⇛ available on:
▪ Mac OS
▪ Windows
libraries ---
essentially required, or increase the effectiveness of scikit-learn:
⇛ NumPy
⇛ SciPy
⇛ Jupyter Notebook
⇛ matplotlib
⇛ pandas

Jupyter Notebook
• provides an interactive environment
• runs code in the browser
• a great tool for exploratory data analysis
• widely used by data scientists
• supports many programming languages

NumPy
• one of the fundamental packages for scientific computing
• provides functionality for:
  • multidimensional arrays
  • high-level mathematical functions, e.g.,
    • linear algebra operations
    • Fourier transforms
    • pseudorandom number generators
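Each of these NumPy capabilities fits in a couple of lines; a minimal sketch (the concrete numbers are made up for illustration):

```python
import numpy as np

# multidimensional array
a = np.array([[1.0, 2.0], [3.0, 4.0]])
print(a.shape)                    # (2, 2)

# linear algebra: solve the system a @ x = b
b = np.array([5.0, 11.0])
x = np.linalg.solve(a, b)
print(x)                          # [1. 2.]

# pseudorandom number generation
rng = np.random.default_rng(42)
print(rng.normal(size=3))         # three draws from a standard normal
```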
NumPy, SciPy :: strengths :: (figure)
SciPy
• a collection of functions for scientific computing
• provides, among other functionality:
  • mathematical function optimization
  • signal processing
  • special mathematical functions
  • statistical distributions
• scikit-learn draws from SciPy's collection of functions for implementing its algorithms
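Two of the listed areas, optimization and statistical distributions, in a minimal sketch (the quadratic objective is made up for illustration):

```python
from scipy import optimize, stats

# mathematical function optimization: minimize (x - 2)^2
res = optimize.minimize_scalar(lambda x: (x - 2.0) ** 2)
print(res.x)                      # close to 2.0

# statistical distributions: the standard normal CDF at 0
print(stats.norm.cdf(0.0))        # 0.5
```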
matplotlib
• the primary scientific plotting library in Python
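A minimal plotting sketch; the Agg backend and the output filename `sine.png` are choices made here so the example runs without a display:

```python
import matplotlib
matplotlib.use("Agg")             # non-interactive backend, runs headless
import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(0, 2 * np.pi, 100)
fig, ax = plt.subplots()
ax.plot(x, np.sin(x), label="sin(x)")
ax.set_xlabel("x")
ax.legend()
fig.savefig("sine.png")           # write the figure to a file
```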
pandas
• a Python library for data wrangling and analysis
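A small wrangling sketch; the column names and values below are hypothetical, loosely echoing the housing table used later:

```python
import pandas as pd

# a small table of house records (values made up for illustration)
df = pd.DataFrame({
    "rooms": [6.5, 6.4, 7.1],
    "age":   [65.2, 78.9, 61.1],
    "price": [24.0, 21.6, 34.7],
})
print(df.describe())              # summary statistics per column
print(df[df["price"] > 22.0])     # boolean filtering of rows
```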
Fitting the Linear Regression Model
▪ training data: 𝜏 = {(𝑥⁽ⁱ⁾, 𝑦⁽ⁱ⁾)}, 𝑖 = 1, …, 𝑚, with 𝑥⁽ⁱ⁾ ∈ ℝⁿ and 𝑦⁽ⁱ⁾ ∈ ℝ
▪ model: ŷ = w₀ + w₁x₁ + w₂x₂ + ⋯ + wₙxₙ
▪ model parameters: w₀, w₁, w₂, …, wₙ
▪ intercept: w₀
▪ coefficients: w₁, w₂, …, wₙ
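After fitting, scikit-learn's LinearRegression exposes w₀ as `intercept_` and w₁, …, wₙ as `coef_`; a minimal sketch on noiseless synthetic data (the true parameters 3.0, 2.0, -1.0 are made up for illustration):

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# synthetic data from y = 3 + 2*x1 - 1*x2 with no noise,
# so the fit should recover these parameters exactly
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 2))
y = 3.0 + 2.0 * X[:, 0] - 1.0 * X[:, 1]

model = LinearRegression().fit(X, y)
print(model.intercept_)           # w0, close to 3.0
print(model.coef_)                # [w1, w2], close to [2.0, -1.0]
```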
the Boston data
• The Boston house-price data of Harrison, D. and Rubinfeld, D. L., 'Hedonic prices and the demand for clean air', J. Environ. Economics & Management, vol. 5, 81-102, 1978.
▪ also used in 'Regression Diagnostics: Identifying Influential Data and Sources of Collinearity'; the question: what influences housing prices in Boston?
CRIM ZN INDUS CHAS NOX RM AGE DIS RAD TAX PTRATIO B LSTAT MEDV
0.00632 18 2.31 0 0.538 6.575 65.2 4.09 1 296 15.3 396.9 4.98 24
0.02731 0 7.07 0 0.469 6.421 78.9 4.9671 2 242 17.8 396.9 9.14 21.6
0.02729 0 7.07 0 0.469 7.185 61.1 4.9671 2 242 17.8 392.83 4.03 34.7
0.03237 0 2.18 0 0.458 6.998 45.8 6.0622 3 222 18.7 394.63 2.94 33.4
0.06905 0 2.18 0 0.458 7.147 54.2 6.0622 3 222 18.7 396.9 5.33 36.2
0.02985 0 2.18 0 0.458 6.43 58.7 6.0622 3 222 18.7 394.12 5.21 28.7
0.08829 12.5 7.87 0 0.524 6.012 66.6 5.5605 5 311 15.2 395.6 12.43 22.9
0.14455 12.5 7.87 0 0.524 6.172 96.1 5.9505 5 311 15.2 396.9 19.15 27.1
0.21124 12.5 7.87 0 0.524 5.631 100 6.0821 5 311 15.2 386.63 29.93 16.5
the Boston housing example
▪ in the table above, the columns CRIM … LSTAT are the feature values 𝑥₁⁽ⁱ⁾, 𝑥₂⁽ⁱ⁾, …, 𝑥₁₃⁽ⁱ⁾, and MEDV is the target 𝑦⁽ⁱ⁾
Steps ---
▪ import the dataset loader
▪ create the loader object
▪ explore/understand the data
  ▪ shape of the data (#rows = training examples, #columns = features)
  ▪ description (DESCR)
  ▪ feature names/values
  ▪ target names/values
  ▪ file path
  ▪ etc.
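The loading and exploring steps above can be sketched as follows. Note that `load_boston` was removed in scikit-learn 1.2, so `load_diabetes` (another bundled regression dataset) stands in here; the steps are identical:

```python
from sklearn.datasets import load_diabetes   # import the dataset loader

data = load_diabetes()                       # create the loader object

# explore/understand the data
print(data.data.shape)        # (#rows/#training examples, #columns/#features)
print(data.feature_names)     # names of the features
print(data.target[:5])        # first few target values
print(data.DESCR[:200])       # start of the description of the data
```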
training ---
▪ split the data into training (75%) and test (25%) sets
▪ import the model
▪ fit the model to the data
▪ test the model
▪ predict
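The training steps above, as a minimal sketch; again `load_diabetes` stands in for the Boston loader removed from modern scikit-learn, and the workflow is the same for any regression dataset:

```python
from sklearn.datasets import load_diabetes
from sklearn.linear_model import LinearRegression    # import the model
from sklearn.model_selection import train_test_split

X, y = load_diabetes(return_X_y=True)

# split into training (75%) and test (25%) sets (the default split)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

model = LinearRegression().fit(X_train, y_train)     # fit the model
print(model.score(X_test, y_test))                   # test: R^2 on held-out data
print(model.predict(X_test[:3]))                     # predict for new examples
```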
the iris data
▪ data about 150 iris flowers to be classified into 3 varieties
▪ features: sepal length, sepal width, petal length, petal width; target: species
▪ size: 150 × (4 + 1)
Steps ---
1. load the data
2. explore the data
3. split into training and validation subsets
4. import the optimizer
5. fit to the data (derive the model)
6. check the accuracy of the model on the data
7. predict with the model derived
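The seven iris steps can be sketched as follows; the slides do not name the estimator ("the optimizer"), so `LogisticRegression` is one reasonable choice used here:

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression  # 4. one choice of estimator
from sklearn.model_selection import train_test_split

iris = load_iris()                                   # 1. load the data
print(iris.data.shape, iris.target_names)            # 2. explore: (150, 4), 3 species

X_train, X_val, y_train, y_val = train_test_split(   # 3. split into training and
    iris.data, iris.target, random_state=0)          #    validation subsets

clf = LogisticRegression(max_iter=1000)
clf.fit(X_train, y_train)                            # 5. fit to the data
print(clf.score(X_val, y_val))                       # 6. accuracy on validation data
print(clf.predict(X_val[:3]))                        # 7. predict with the model
```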
end