Understanding Machine Learning Algorithms - in Depth
Notes by RaviTeja G
Table of Contents
1. What is Machine Learning?
5. Reinforcement Learning
6. Semi-Supervised Learning
7. Steps in ML Project
- Data Preprocessing
- Feature Engineering
10. Exploring Step 3 - Train Model on Dataset
- Types of Learning
- Underfitting and Overfitting
- Regularization techniques
- Hyperparameter Tuning
11. Exploring Step 4 - Evaluation of a Model
- Evaluation Metrics
- Confusion Matrix
- Recall/Sensitivity
- Precision
- Specificity
- F1 Score
- AUC and ROC Curve
- Analysis of a Model
12. Supervised Learning
- Linear Regression
- Regularization Techniques
- Logistic Regression
- Decision Trees
- Ensemble Techniques
- Random Forests
- AdaBoost
- Gradient Boost
- XGBoost
- K-Nearest Neighbours
- Support Vector Machines
- Naive Bayes Classifiers
13. Unsupervised Learning
- Clustering Techniques
- K-Means Clustering
- Hierarchical Clustering
- DBSCAN Clustering
- Evaluation of Clustering Models
- Curse of Dimensionality
- Principal Component Analysis
14. Cheat Sheet of Supervised and Unsupervised Algorithms
Understanding Linear Regression
Table of Contents
1. What is Linear Regression
2. Understanding with an example
3. Evaluating the fitness of the model
4. Understanding Gradient descent
5. Understanding Loss Function
6. Measuring Model Strength
7. Another Approach for LR - OLS
Understanding Regularization Techniques
Table Of Contents
1. Understanding Multicollinearity
2. Variance Inflation Factor
3. Regularization
4. Lasso - L1 Form
5. Ridge - L2 Form
6. Elastic Net
7. Difference Between Ridge and Lasso
8. When to use Ridge/Lasso/Elastic Net
9. Polynomial Regression
Understanding Decision Trees
Table Of Contents
1. Why do we need Decision Trees
2. How it works
3. How do we select a root node
4. Understanding Entropy, Information Gain
5. Solving an Example on Entropy
6. Understanding Gini Impurity
7. Solving an Example on Gini Impurity
8. Decision tree for Regression
9. Why Decision Trees are a Greedy Approach
10. Understanding Pruning
Understanding Boosting
Table of Contents
1. Understanding Boosting
2. Understanding AdaBoost
3. Solving an Example on AdaBoost
4. Understanding Gradient Boosting
5. Solving an Example on Gradient Boosting
6. AdaBoost vs Gradient Boosting
Understanding K-Nearest Neighbours
Table Of Contents
1. How does K-Nearest Neighbours work
2. How is Distance Calculated
- Euclidean Distance
- Hamming Distance
- Manhattan Distance
3. Why is KNN a Lazy Learner
4. Effects of Choosing the value of K
5. Different ways to perform KNN
6. Understanding KD-Tree
7. Solving an Example of KD Tree
8. Understanding Ball Tree
Understanding Support Vector Machines
Table Of Contents
1. Understanding Concept of SVC
2. What are Support Vectors
3. What is Margin
4. Hard Margin and Soft Margin
5. Kernelized SVC
6. Types of Kernels
7. Understanding SVR
Understanding Naive Bayes Classifiers
Table Of Contents
1. Why do we need Naive Bayes
2. Concept of how it works
3. Mathematical Intuition of Naive Bayes
4. Solving an Example on Naive Bayes
5. Other Bayes Classifiers
- Gaussian Naive Bayes Classifier
- Multinomial Naive Bayes Classifier
- Bernoulli Naive Bayes Classifier
Understanding Clustering Techniques
Table of Contents
1. How clustering is different from classification
2. Applications of Clustering
3. What are density based methods
4. What are Hierarchical based methods
5. What are partitioning methods
6. What are Grid Based methods
7. Main Requirements for Clustering Algorithms
Understanding K-Means Clustering
Table Of Contents
1. Concept of K-Means Clustering
2. Math Intuition Behind K-Means
3. Cluster Building Process
4. Edge Case Scenarios of K-Means
5. Challenges and Improvements in K-Means
Understanding Principal Component Analysis
Table Of Contents
1. Idea Behind PCA
2. What are Principal Components
3. Eigen Decomposition Approach
4. Singular Value Decomposition Approach
5. Why do we maximize Variance
6. What is Explained Variance Ratio
7. How to select the optimal number of Principal Components
8. Understanding Scree plot
9. Issues with PCA
10. Understanding Kernel PCA
– Supervised Algorithms –
Regression Models
Linear Regression
Description & Application: Linear Regression models a linear relationship between input variables and a continuous numerical output variable. The default loss function is the mean squared error (MSE).
Advantages: 1. Fast training because there are few parameters. 2. Interpretable/explainable results through its output coefficients.
Disadvantages: 1. Assumes a linear relationship between input and output variables. 2. Sensitive to outliers. 3. Typically generalizes worse than ridge or lasso regression.
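A minimal sketch of fitting a linear regression, assuming scikit-learn is available (the toy data is made up for illustration):

```python
import numpy as np
from sklearn.linear_model import LinearRegression

X = np.array([[1.0], [2.0], [3.0], [4.0]])   # single input variable
y = np.array([2.1, 4.0, 6.2, 7.9])           # continuous numerical target

model = LinearRegression().fit(X, y)
print(model.coef_, model.intercept_)          # interpretable coefficients
print(model.predict([[5.0]]))                 # prediction for a new point
```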
Support Vector Regression
Description & Application: Support Vector Regression (SVR) uses the same principle as SVMs but optimizes the cost function to fit the best straight line (or plane) through the data points. With the kernel trick it can efficiently perform non-linear regression by implicitly mapping the inputs into high-dimensional feature spaces.
Advantages: 1. Robust against outliers. 2. Effective learning and strong generalization performance. 3. Different kernel functions can be specified for the decision function.
Disadvantages: 1. Does not perform well with large datasets. 2. Tends to underfit in cases where the number of variables is much smaller than the number of observations.
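A minimal sketch of kernelized SVR, assuming scikit-learn (the noisy sine data is made up for illustration):

```python
import numpy as np
from sklearn.svm import SVR

rng = np.random.default_rng(0)
X = np.sort(rng.uniform(0, 5, size=(40, 1)), axis=0)
y = np.sin(X).ravel() + rng.normal(scale=0.1, size=40)

# The RBF kernel maps inputs into a high-dimensional feature space implicitly.
model = SVR(kernel="rbf", C=1.0, epsilon=0.1).fit(X, y)
print(model.predict(X[:5]))
```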
Gaussian Process Regression
Description & Application: Gaussian Process Regression (GPR) uses a Bayesian approach that infers a probability distribution over the possible functions that fit the data. The Gaussian process is a prior that is specified as a multivariate Gaussian distribution.
Advantages: 1. Provides uncertainty measures on the predictions. 2. A flexible and usable non-linear model which fits many datasets well. 3. Performs well on small datasets, as the GP kernel allows specifying a prior on the function space.
Disadvantages: 1. A poor choice of kernel can make convergence slow. 2. Specifying specific kernels requires deep mathematical understanding.
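A minimal sketch of GPR with an RBF kernel prior, assuming scikit-learn (toy data made up for illustration):

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

X = np.array([[1.0], [3.0], [5.0], [6.0]])
y = np.sin(X).ravel()

gpr = GaussianProcessRegressor(kernel=RBF(length_scale=1.0), alpha=1e-3).fit(X, y)
mean, std = gpr.predict([[4.0]], return_std=True)   # prediction plus uncertainty estimate
print(mean, std)
```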
Classification Models
Logistic Regression (and its extensions)
Description & Application: Logistic regression models a linear relationship between the input variables and the response variable. It models the output as binary values (0 or 1) rather than numeric values.
Advantages: 1. Explainable & interpretable. 2. Less prone to overfitting when using regularization. 3. Applicable for multi-class predictions.
Disadvantages: 1. Makes a strong assumption about the relationship between input and response variables. 2. Multicollinearity can cause the model to easily overfit without regularization.
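A minimal sketch of regularized logistic regression on a made-up binary problem, assuming scikit-learn:

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=200, n_features=4, random_state=0)
clf = LogisticRegression(C=1.0).fit(X, y)   # C controls the strength of L2 regularization
print(clf.predict(X[:5]))                    # predicted 0/1 labels
print(clf.predict_proba(X[:5])[:, 1])        # predicted probabilities for class 1
```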
Linear Discriminant Analysis
Description & Application: The linear decision boundary maximizes the separability between the classes by finding a linear combination of features.
Advantages: 1. Explainable & interpretable. 2. Applicable for multi-class predictions.
Disadvantages: 1. Multicollinearity can cause the model to overfit. 2. Assumes that all classes share the same covariance matrix. 3. Sensitive to outliers. 4. Doesn't work well with small class sizes.
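A minimal sketch of LDA on the Iris dataset, assuming scikit-learn:

```python
from sklearn.datasets import load_iris
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

X, y = load_iris(return_X_y=True)
lda = LinearDiscriminantAnalysis().fit(X, y)   # finds linear combinations of features separating the classes
print(lda.score(X, y))                         # training accuracy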
Both Regression and Classification Models
Decision Trees
Description & Application: Decision Tree models learn on the data by making decision rules on the variables to separate the classes in a flowchart-like tree data structure. They can be used for both regression and classification.
Advantages: 1. Explainable and interpretable. 2. Can handle missing values.
Disadvantages: 1. Prone to overfitting. 2. Can be unstable with minor data drift. 3. Sensitive to outliers.
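A minimal sketch of a depth-limited decision tree classifier, assuming scikit-learn:

```python
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier, export_text

X, y = load_iris(return_X_y=True)
tree = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X, y)
print(export_text(tree))   # the learned decision rules printed as a flowchart-like text tree
```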
Random Forest
Description & Application: Random Forest models learn using an ensemble of decision trees. The output of the random forest is based on a majority vote of the different decision trees.
Advantages: 1. Effective learning and better generalization performance. 2. Can handle moderately large datasets. 3. Less prone to overfitting than decision trees.
Disadvantages: 1. A large number of trees can slow down performance. 2. Predictions are sensitive to outliers. 3. Hyperparameter tuning can be complex.
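A minimal sketch of a random forest classifier on made-up data, assuming scikit-learn:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

X, y = make_classification(n_samples=500, n_features=10, random_state=0)
forest = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)
print(forest.predict(X[:5]))   # each prediction is a majority vote across the individual trees
```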
Ridge Regression
Description & Application: Ridge Regression penalizes variables with low predictive power by shrinking their coefficients towards zero. It can be used for classification and regression.
Advantages: 1. Less prone to overfitting. 2. Best suited when data suffers from multicollinearity. 3. Explainable & interpretable.
Disadvantages: 1. All the predictors are kept in the final model. 2. Doesn't perform feature selection.
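A minimal sketch of ridge regression on made-up, strongly correlated predictors, assuming scikit-learn; alpha controls how strongly coefficients shrink towards zero:

```python
import numpy as np
from sklearn.linear_model import Ridge

X = np.array([[1.0, 2.0], [2.0, 4.1], [3.0, 6.0], [4.0, 8.2]])  # columns are highly correlated
y = np.array([3.0, 6.1, 9.0, 12.2])

ridge = Ridge(alpha=1.0).fit(X, y)
print(ridge.coef_)   # all predictors are kept, but with shrunken coefficients
```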
AdaBoost
Description & Application: Adaptive Boosting uses an ensemble of weak learners that is combined into a weighted sum representing the final output of the boosted classifier.
Advantages: 1. Explainable & interpretable. 2. Less need for tweaking parameters. 3. Less prone to overfitting, as the input variables are not jointly optimized. 4. Usually outperforms Random Forest.
Disadvantages: 1. Sensitive to noisy data and outliers.
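A minimal sketch of AdaBoost on made-up data, assuming scikit-learn; by default the weak learner is a depth-1 decision tree (a "stump"):

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier

X, y = make_classification(n_samples=300, random_state=0)
ada = AdaBoostClassifier(n_estimators=50, random_state=0).fit(X, y)  # weighted sum of 50 weak learners
print(ada.score(X, y))
```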
– Unsupervised Algorithms –
Clustering and Other Unsupervised Algorithms
K-Means
Description & Application: The most common clustering approach. It assumes that the closer data points are to each other, the more similar they are. It determines K clusters based on Euclidean distances.
Advantages: 1. Scales to large datasets. 2. Interpretable & explainable results. 3. Can generate tight clusters.
Disadvantages: 1. Requires defining the expected number of clusters in advance. 2. Not suitable for identifying clusters with non-convex shapes.
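A minimal sketch of K-Means on made-up blob data, assuming scikit-learn; note that K must be chosen in advance:

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

X, _ = make_blobs(n_samples=300, centers=3, random_state=0)
km = KMeans(n_clusters=3, n_init=10, random_state=0).fit(X)  # K = 3 set up front
print(km.cluster_centers_)   # the learned cluster centroids
print(km.labels_[:10])       # cluster assignment of the first few points
```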
t-SNE
Description & Application: t-distributed Stochastic Neighbor Embedding is a non-linear dimensionality reduction method that converts similarities between data points to joint probabilities using the Student t-distribution in the low-dimensional space.
Advantages: 1. Helps preserve the relationships seen in high dimensionality. 2. Easy to visualize the structure of high-dimensional data in 2 or 3 dimensions. 3. Very effective for visualizing clusters or groups of data points and their relative proximities.
Disadvantages: 1. The cost function is not convex: different initializations can give different results. 2. Computationally intensive for large datasets. 3. Default parameters do not always achieve the best results.
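A minimal sketch of embedding the 64-dimensional digits dataset into 2 dimensions with t-SNE, assuming scikit-learn:

```python
from sklearn.datasets import load_digits
from sklearn.manifold import TSNE

X, _ = load_digits(return_X_y=True)                    # 64-dimensional digit images
emb = TSNE(n_components=2, random_state=0).fit_transform(X)
print(emb.shape)                                        # (1797, 2) - ready for a scatter plot
```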
Apriori algorithm
Description & Application: The Apriori algorithm uses the join and prune steps iteratively to identify the most frequent itemsets in the given dataset. Prior knowledge (apriori) of frequent itemset properties is used in the process.
Advantages: 1. Explainable & interpretable results. 2. Exhaustive approach based on confidence and support.
Disadvantages: 1. Computationally expensive, since it repeatedly scans the dataset and can generate a very large number of candidate itemsets.
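A minimal sketch of mining frequent itemsets with Apriori, assuming the mlxtend library is available (the transactions are made up for illustration):

```python
import pandas as pd
from mlxtend.preprocessing import TransactionEncoder
from mlxtend.frequent_patterns import apriori

transactions = [["milk", "bread"], ["milk", "diapers"], ["milk", "bread", "diapers"], ["bread"]]
te = TransactionEncoder()
onehot = pd.DataFrame(te.fit(transactions).transform(transactions), columns=te.columns_)
print(apriori(onehot, min_support=0.5, use_colnames=True))   # itemsets meeting the support threshold
```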