
MODELS

Logistic Regression:
Despite its name, logistic regression is used for classification. This model calculates the probability, p, that an observation belongs to the positive class of a binary target.
If p is greater than or equal to 0.5, we label the observation as 1.
If p is less than 0.5, we label it as 0.

The default probability threshold for logistic regression in scikit-learn is 0.5. This threshold can also be applied to other models that output probabilities, such as KNN.
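A minimal sketch of this workflow, assuming scikit-learn and a small synthetic dataset:

from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Synthetic binary classification data (illustrative)
X, y = make_classification(n_samples=200, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

logreg = LogisticRegression()
logreg.fit(X_train, y_train)

# Probability p that each test observation belongs to class 1
y_pred_probs = logreg.predict_proba(X_test)[:, 1]

# Applying the 0.5 threshold by hand; predict() does the same internally
y_pred = (y_pred_probs >= 0.5).astype(int)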

Hyperparameter tuning
Parameters that we specify before fitting a model, like alpha and n_neighbors, are called
hyperparameters.
So, a fundamental step for building a successful model is choosing the correct hyperparameters.
We can try lots of different values, fit all of them separately, see how well they perform, and choose
the best values!

One approach to hyperparameter tuning is called grid search, where we choose a grid of possible hyperparameter values to try. For example, we can search across two hyperparameters for a KNN model: the type of distance metric and the number of neighbors.
We perform k-fold cross-validation for each combination of hyperparameters, compare the mean scores across combinations, and then choose the hyperparameters that performed best, as in the sketch below.
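A minimal sketch with scikit-learn's GridSearchCV on synthetic data (the grid values are illustrative):

from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV, KFold
from sklearn.neighbors import KNeighborsClassifier

X, y = make_classification(n_samples=200, random_state=42)

# Grid of hyperparameter values to try for KNN
param_grid = {
    "metric": ["euclidean", "manhattan"],
    "n_neighbors": [3, 5, 7, 9],
}

kf = KFold(n_splits=5, shuffle=True, random_state=42)
grid = GridSearchCV(KNeighborsClassifier(), param_grid, cv=kf)
grid.fit(X, y)

print(grid.best_params_, grid.best_score_)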

Similarly, alpha and solver are both hyperparameters for Ridge regression: the cross-validation score is calculated for each pair of alpha and solver values, and the best pair is chosen.

Grid search is great. However, the number of fits equals the number of hyperparameter combinations multiplied by the number of folds.
Therefore, it doesn't scale well! Performing 3-fold cross-validation for one hyperparameter with 10 values means 30 fits, while 10-fold cross-validation on 3 hyperparameters with 10 values each means 10 x 10 x 10 = 1,000 combinations, or 10,000 fits!

Hence we can choose random search, which picks random hyperparameter values rather than exhaustively searching through all options.
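A comparable sketch with RandomizedSearchCV, here tuning Ridge's alpha and solver on synthetic data; the parameter ranges and n_iter are illustrative:

import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import Ridge
from sklearn.model_selection import KFold, RandomizedSearchCV

X, y = make_regression(n_samples=200, n_features=10, noise=10, random_state=42)

param_distributions = {
    "alpha": np.linspace(0.0001, 1, 20),
    "solver": ["lsqr", "sag"],
}

kf = KFold(n_splits=5, shuffle=True, random_state=42)
# n_iter combinations are sampled at random instead of trying all of them
search = RandomizedSearchCV(Ridge(), param_distributions, n_iter=10, cv=kf, random_state=42)
search.fit(X, y)

print(search.best_params_, search.best_score_)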

DATA PREPROCESSING:


Dealing with categorical data
We need to convert categorical features into numeric features. We achieve this by splitting the feature into multiple
binary features called dummy variables, one for each category. Zero means the observation was not
that category, while one means it was.
We create binary features for each genre. As each song has one genre, each row will have a 1 in one of the ten genre columns and zeros in the rest. If a song is not any of the first nine genres, then implicitly it is a rock song. That means we only need nine features, so we can delete the Rock column.
To create dummy variables we can use scikit-learn's OneHotEncoder, or pandas' get_dummies.

If the DataFrame only has one categorical feature, we can pass the entire DataFrame, thus skipping the step of combining variables. If we don't specify a column, the new DataFrame's binary columns will have the original feature name as a prefix, so they will start with "genre_". Notice the original genre column is automatically dropped. Once we have dummy variables, we can fit models as before.
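A minimal sketch with pandas' get_dummies on a hypothetical music DataFrame (column names and values are made up for illustration):

import pandas as pd

# Hypothetical music data with a single categorical feature, "genre"
music_df = pd.DataFrame({
    "popularity": [52, 81, 34],
    "genre": ["Rock", "Jazz", "Blues"],
})

# Passing the whole DataFrame: only the categorical column is encoded,
# the new columns are prefixed with "genre_", and the original column is dropped
music_dummies = pd.get_dummies(music_df, drop_first=True)
print(music_dummies.columns.tolist())   # ['popularity', 'genre_Jazz', 'genre_Rock']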

Handling missing data:


We can first count the missing values in each feature.

A common approach is to remove observations with missing values when they account for less than 5% of all data.
The subset argument of dropna takes the columns in which null values should be checked; if one of those columns has a null value, the entire row is removed, as in the sketch below.
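A minimal sketch on a small made-up DataFrame:

import numpy as np
import pandas as pd

df = pd.DataFrame({
    "genre": ["Rock", np.nan, "Jazz", "Rock"],
    "loudness": [-5.0, -7.2, np.nan, -6.3],
    "popularity": [52, 81, 34, 40],
})

print(df.isna().sum())   # count of missing values in each feature

# Drop any row with a null value in one of the listed columns
df_clean = df.dropna(subset=["genre", "loudness"])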

Another option is to impute missing data. This means making an educated guess as to what
the missing values could be
We can impute the mean of all non-missing entries for a given feature. We can also use other
values like the median. For categorical values we commonly impute the most frequent value.
We must split our data before imputing to avoid leaking test set information to our model,
a concept known as data leakage.

We divide the data into two parts, categorical and numerical, before splitting into training and test sets; using the same random_state for both splits ensures the same rows end up in the training and test sets of each part.
Imputers are also called transformers because of their ability to transform the data.
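A minimal sketch with scikit-learn's SimpleImputer, splitting before imputing; the DataFrame and column names are illustrative:

import numpy as np
import pandas as pd
from sklearn.impute import SimpleImputer
from sklearn.model_selection import train_test_split

df = pd.DataFrame({
    "genre": ["Rock", np.nan, "Jazz", "Rock", "Blues", "Jazz"],
    "loudness": [-5.0, -7.2, np.nan, -6.3, -4.8, -5.5],
    "popularity": [52, 81, 34, 40, 67, 73],
})

# Same random_state so the same rows land in train/test for both parts
X_train_cat, X_test_cat, y_train, y_test = train_test_split(
    df[["genre"]], df["popularity"], test_size=0.3, random_state=12)
X_train_num, X_test_num, _, _ = train_test_split(
    df[["loudness"]], df["popularity"], test_size=0.3, random_state=12)

imp_cat = SimpleImputer(strategy="most_frequent")   # categorical: most frequent value
X_train_cat = imp_cat.fit_transform(X_train_cat)    # fit on training data only
X_test_cat = imp_cat.transform(X_test_cat)          # no test information leaks in

imp_num = SimpleImputer(strategy="mean")            # numeric: mean of non-missing entries
X_train_num = imp_num.fit_transform(X_train_num)
X_test_num = imp_num.transform(X_test_num)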

Centering and scaling the data:

We normally scale or centre the data to make sure all features are on a similar scale.

Scaling the data:

Different ways of scaling:


Standardization: subtract the mean and divide by the standard deviation, so the data is centered around 0 with variance 1.
Min-max scaling: subtract the minimum and divide by the range, so that the minimum is 0 and the maximum is 1.
We can also normalize the data so that it lies in the range -1 to 1.

Before scaling we split the data to avoid data leakage
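A minimal sketch of standardization with scikit-learn's StandardScaler on synthetic data, splitting before scaling:

from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

X, y = make_classification(n_samples=200, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)   # fit only on the training data
X_test_scaled = scaler.transform(X_test)         # reuse the training mean and std

print(X_train_scaled.mean(), X_train_scaled.std())   # roughly 0 and 1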

Pipelines can be used to chain two (or more) steps together, for example a transformer followed by a model. We initialize the steps in the Pipeline and then call pipeline.fit.
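A minimal sketch, reusing the train/test split from the scaling sketch above:

from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

steps = [("scaler", StandardScaler()),
         ("knn", KNeighborsClassifier(n_neighbors=6))]
pipeline = Pipeline(steps)

# Scaling and fitting happen together; X_train etc. come from the split above
pipeline.fit(X_train, y_train)
y_pred = pipeline.predict(X_test)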

Cross Validation in Scaling
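A minimal sketch of cross-validating the pipeline from above, so that scaling is refit inside each fold and no information leaks from the validation fold:

from sklearn.model_selection import KFold, cross_val_score

# The scaler is refit on each fold's training portion, so the validation
# fold never influences the scaling parameters
kf = KFold(n_splits=5, shuffle=True, random_state=42)
scores = cross_val_score(pipeline, X_train, y_train, cv=kf)
print(scores.mean())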


UNSUPERVISED LEARNING

Unsupervised learning is a class of machine learning techniques for discovering patterns in data.

Evaluating a cluster:

Cross-tabulations are used to see what the labelled data is indicating.

Take the example of the iris dataset: after clustering the data, we create a cross-table of the cluster labels against the known species.

Because the iris dataset has labels, we can evaluate the clustering using those labels.
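A minimal sketch with k-means on the iris dataset:

import pandas as pd
from sklearn.cluster import KMeans
from sklearn.datasets import load_iris

iris = load_iris()

model = KMeans(n_clusters=3, random_state=42)
labels = model.fit_predict(iris.data)

# Cross-tabulation of cluster labels against the known species
df = pd.DataFrame({"labels": labels, "species": iris.target_names[iris.target]})
print(pd.crosstab(df["labels"], df["species"]))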

For any other data that does not have labels, we evaluate the clustering using inertia.

Inertia measures how spread out the samples are within a cluster: it is the sum of squared distances of each sample from the centroid of its cluster.

The lower the inertia, the better the clustering.
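Continuing from the k-means fit above, a minimal sketch of reading and comparing inertia values:

from sklearn.cluster import KMeans

# Inertia of the fitted model: sum of squared distances of the samples
# to their closest cluster centre
print(model.inertia_)

# Comparing inertia for different numbers of clusters
for k in range(1, 6):
    print(k, KMeans(n_clusters=k, random_state=42).fit(iris.data).inertia_)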

FEATURE VARIANCE:
Variance measures the spread of the data values.
In k-means, a feature's variance determines its influence on the clustering, so features with larger variance dominate.
StandardScaler is used to modify feature variance: it transforms every feature to have mean 0 and variance 1.
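A minimal sketch that scales the iris features before k-means using a pipeline:

from sklearn.cluster import KMeans
from sklearn.datasets import load_iris
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X = load_iris().data

# Scale to equal variance first, so no single feature dominates the clustering
pipeline = make_pipeline(StandardScaler(), KMeans(n_clusters=3, random_state=42))
pipeline.fit(X)
labels = pipeline.predict(X)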

HIERARCHICAL CLUSTERING

Hierarchical clustering is very useful for cluster visualization.

Hierarchical clusterings are visualized with diagrams called dendrograms.

This is done by first forming a cluster for every row and then merging the two closest clusters, one pair at a time.

The linkage function performs the hierarchical clustering, and the dendrogram function is used to visualize it.

Hierarchical clustering is not only a visualization tool: we can also extract clusters from intermediate stages, which can be used in further computations. An intermediate stage in the hierarchical clustering is specified by choosing a height on the dendrogram.
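A minimal sketch with scipy on the iris measurements; the height passed to fcluster (t=4) is an arbitrary illustrative choice:

import matplotlib.pyplot as plt
from scipy.cluster.hierarchy import dendrogram, fcluster, linkage
from sklearn.datasets import load_iris

X = load_iris().data

# Perform agglomerative hierarchical clustering
mergings = linkage(X, method="complete")

# Visualize the merge order as a dendrogram
dendrogram(mergings)
plt.show()

# Extract flat cluster labels at a chosen height on the dendrogram
labels = fcluster(mergings, t=4, criterion="distance")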

T-SNE
T-SNE stands for T-Distributed Stochastic Neighbor Embedding
It has a complicated name, but it serves a very simple purpose. It maps samples from their high-
dimensional space into a 2- or 3-dimensional space so they can be visualized. While some distortion is
inevitable, t-SNE does a great job of approximately representing the distances between the samples.
For this reason, t-SNE is an invaluable visual aid for understanding a dataset.

t-SNE maps samples to 2D or 3D.

TSNE has only a fit_transform method instead of separate fit and transform methods.
We only give the X values to TSNE, not the y values.
A learning rate is also passed to TSNE; the best value differs between datasets, but the normal range is 50 to 200.
Even if the code is the same, rerunning it may produce a different plot; however, the relative positions of the data points do not change.
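A minimal sketch on the iris dataset (learning_rate=100 is just one value in the usual range):

import matplotlib.pyplot as plt
from sklearn.datasets import load_iris
from sklearn.manifold import TSNE

iris = load_iris()

# fit_transform only: there is no separate transform() for new samples
model = TSNE(learning_rate=100, random_state=42)
transformed = model.fit_transform(iris.data)   # only X is passed, never y

plt.scatter(transformed[:, 0], transformed[:, 1], c=iris.target)
plt.show()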

DIMENSIONALITY REDUCTION

It finds patterns in data and re-expresses the data in a compressed form.

PCA

It performs dimensionality reduction in 2 steps:

1. Decorrelation (doesn't change the dimension of the data)
2. Dimension reduction

DECORRELATION

In this step, the data is rotated so that it is aligned with the axes, and it is also shifted so that its mean is 0.

PCA has fit and transform methods in sklearn.

The fit() method learns how much to shift the data and how much to rotate it, but does not actually transform it.

The transform() method shifts and rotates the data based on what the fit() method learned, which means transform() can also be used on new, unseen data.

PCA also removes the correlation between features: if two features are linearly correlated at the beginning, they are not linearly correlated after the shifting and rotating.

PCA is called "Principal" Component Analysis because it identifies the principal components of the data and rotates and shifts the data based on those principal components.

They can be found using model.components_.
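A minimal sketch on the iris measurements:

from sklearn.datasets import load_iris
from sklearn.decomposition import PCA

X = load_iris().data

model = PCA()
model.fit(X)                      # learns the shift (mean) and the rotation
transformed = model.transform(X)  # applies them; also works on unseen data

print(model.components_)          # one row per principal component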


Intrinsic Dimension:

It is the number of PCA features with significant variance.
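Continuing from the PCA fit above, a minimal sketch that plots the variance of each PCA feature to judge the intrinsic dimension:

import matplotlib.pyplot as plt

# Variance of each PCA feature, for the PCA model fitted above
features = range(model.n_components_)
plt.bar(features, model.explained_variance_)
plt.xlabel("PCA feature")
plt.ylabel("variance")
plt.show()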

Non-negative matrix factorization:

It is a dimensionality reduction technique like PCA, but NMF models are interpretable, unlike PCA models.

However, all the features must be non-negative.

NMF expresses documents as combinations of topics, and images as combinations of patterns.
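A minimal sketch on a tiny made-up non-negative matrix (think of it as word counts per document):

import numpy as np
from sklearn.decomposition import NMF

# Tiny made-up non-negative matrix, e.g. word counts per document
samples = np.array([
    [1.0, 0.0, 2.0, 1.0],
    [0.0, 3.0, 0.0, 1.0],
    [2.0, 1.0, 1.0, 0.0],
])

model = NMF(n_components=2, random_state=42)
nmf_features = model.fit_transform(samples)   # each document as a combination of topics
print(model.components_)                      # each topic as a combination of words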

LINEAR CLASSIFIERS

LINEAR SVC (Support Vector Classifier)

The dot product is written x @ y.
lr.coef_ gives the coefficients of the predictor equation.
lr.intercept_ gives the intercept of the predictor equation.
The predicted class is determined by the sign of the raw model output, coefficients @ features + intercept.
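A minimal sketch with logistic regression on synthetic data, comparing the raw model output with the prediction:

from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=100, n_features=4, random_state=42)
lr = LogisticRegression().fit(X, y)

# Raw model output for the first example: coefficients @ features + intercept
raw = X[0] @ lr.coef_[0] + lr.intercept_[0]

# The sign of the raw output determines the predicted class
print(raw, lr.predict(X[:1]))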

LOSS FUNCTIONS

Scikit-learn's LinearRegression uses the squared error as its loss function.

The minimization is with respect to the parameters of the model.

Squared errors are not appropriate for classification models.

Hence we use the 0-1 loss: 0 for a correct prediction and 1 for an incorrect one, so that we know how many errors occurred (it is hard to minimize, so models won't use it directly during training).

A general-purpose optimizer such as scipy.optimize.minimize can be used to minimize any such loss function.
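A minimal sketch that recovers linear-regression coefficients by minimizing the squared error with scipy.optimize.minimize on synthetic data (no intercept, for simplicity):

import numpy as np
from scipy.optimize import minimize
from sklearn.datasets import make_regression
from sklearn.linear_model import LinearRegression

X, y = make_regression(n_samples=100, n_features=3, noise=5, random_state=42)

def squared_loss(w):
    # Sum of squared errors of a linear model with coefficients w
    residuals = y - X @ w
    return np.sum(residuals ** 2)

# Minimize the loss with respect to the model parameters
w_opt = minimize(squared_loss, x0=np.zeros(X.shape[1])).x

# Matches scikit-learn's LinearRegression, which minimizes the same loss
lr = LinearRegression(fit_intercept=False).fit(X, y)
print(w_opt, lr.coef_)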

REGULARIZATION IN LOGISTIC REGRESSION
