Supervised Learning
Deep Learning
#3 Supervised Learning
Informatics Study Program
Universitas Atma Jaya Yogyakarta
Odd Semester 2023/2024
Outline
• Classification and Regression
• Classification Algorithms: KNN, Decision Tree, Random Forest,
Gradient Boosting, Logistic Regression, Support Vector Machines,
Neural Networks
• Regression: Linear Regression, Ridge, Lasso
Classification and Regression
Classification
• Predicting class labels: a selection from an existing list of possibilities
• Binary classification: distinguishing between two classes only. Ex: spam email or not
• Multiclass classification: classification among more than two classes. Ex: irises: setosa, versicolor, or virginica
Regression
• The goal is to predict continuous numbers, or floating-point numbers in programming terms (real numbers in mathematical terms)
• An easy way to differentiate between classification and regression tasks is to ask whether there is some kind of continuity in the output
• Example: predicting a person's annual income from their education, their age, and where they live
• The predicted value is a number, and can be any number within a specified range
K-Nearest Neighbors
• The KNN algorithm is "lazy" because it does not learn a function that forms a decision boundary from the data.
• KNN learns from the training set by memorizing it.
• The KNN algorithm works in the following steps (a minimal sketch follows below):
  • Select the number k and the distance measure
  • Find the k nearest neighbors of the new point in the training set
  • Predict the class of the new point by majority vote among those k neighbors
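To make the three steps above concrete, here is a minimal sketch of the KNN procedure in plain NumPy; the toy arrays, the value of k, and the Euclidean distance are illustrative choices, not taken from the slides.

import numpy as np

def knn_predict(X_train, y_train, x_new, k=3):
    # Step 1: k and the distance measure (here Euclidean) are chosen up front
    distances = np.sqrt(((X_train - x_new) ** 2).sum(axis=1))
    # Step 2: find the k nearest neighbors in the training set
    nearest = np.argsort(distances)[:k]
    # Step 3: predict by majority vote among the neighbors' labels
    labels, counts = np.unique(y_train[nearest], return_counts=True)
    return labels[np.argmax(counts)]

# Illustrative toy data: two points per class
X_train = np.array([[0.0, 0.0], [0.0, 1.0], [5.0, 5.0], [6.0, 5.0]])
y_train = np.array([0, 0, 1, 1])
print(knn_predict(X_train, y_train, np.array([4.5, 5.0]), k=3))  # -> 1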
KNN Illustration
• Based on the chosen distance measure, KNN finds the k points in the training set that are closest to (most similar to) the point we want to classify.
• The class label of the new data point is determined by the majority vote among those k nearest neighbors.
• Choosing the right value of k is very important to avoid overfitting and underfitting.
• We must make sure that the chosen distance measure is appropriate for the features of the dataset.
• Ex: using Euclidean distance requires data standardization so that each feature contributes equally to the distance measurement.
• The advantages of KNN are:
  • Easy to apply to simple cases
  • Quite reliable on unbalanced datasets
  • Simple model training process
• The disadvantages of KNN are:
  • If the training set is large, making predictions takes a long time (every new point is compared against all training data)
  • Very sensitive to outliers when calculating distances between data points
  • Cannot take the relevance or significance of features into account; if there are many unimportant features, they interfere with the learning of the KNN model
Curse of Dimensionality
• KNN is also susceptible to overfitting due to the curse of dimensionality.
• The curse of dimensionality is the phenomenon where, as the feature space grows in number of dimensions, the data becomes increasingly sparse.
• Points that appear close in low dimensions can be too far apart in higher dimensions to provide a good distance estimate.
• Models such as KNN and Decision Trees cannot apply regularization, so we need to use feature selection and feature reduction to help us avoid the curse of dimensionality (a sketch follows below).
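As a sketch of that idea, the pipeline below applies univariate feature selection (SelectKBest) before KNN; the breast cancer dataset, the f_classif score function, the choice of 10 selected features, and n_neighbors=5 are all illustrative.

from sklearn.datasets import load_breast_cancer
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Scale (so Euclidean distance treats features equally), keep only the
# 10 most informative features, then classify with KNN.
pipe = Pipeline(steps=[
    ('scale', StandardScaler()),
    ('feat_select', SelectKBest(score_func=f_classif, k=10)),
    ('clf', KNeighborsClassifier(n_neighbors=5)),
])
pipe.fit(X_train, y_train)
print(pipe.score(X_test, y_test))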
KNN Code Ex
from sklearn.neighbors import KNeighborsClassifier
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
KNN = KNeighborsClassifier()
pipe_KNN = Pipeline(steps=[('scale', …), ('feat_select', …), ('clf', KNN)])
# params_KNN is the hyperparameter grid (not shown on the slide), e.g. {'clf__n_neighbors': [3, 5, 7]}
GSCV_KNN = GridSearchCV(pipe_KNN, params_KNN, cv=5)
GSCV_KNN.fit(X_train, y_train)
GSCV_KNN.score(X_test, y_test)
Decision Trees
• A Decision Tree is a model that is widely used for classification and regression tasks
• The model learns a hierarchy of if/else questions that lead to a decision
• Example: a model for distinguishing between four classes of animals (eagles, penguins, dolphins, and bears) using the three characteristics "has feathers," "can fly," and "has fins."
• Building a Decision Tree:
  • Learning the sequence of if/else questions that gets us to the correct answer most quickly
  • In a machine learning setting, these questions are called tests (as opposed to test sets)
  • Usually data does not come in the form of binary "yes/no" features as in the animal example, but is instead represented as continuous features, as in 2D datasets
• Building a Decision Tree
  • The algorithm searches over all possible tests and finds the one that is most informative about the target variable
  • This recursive process produces a binary Decision Tree, with each node containing a test
• Building a Decision Tree
  • The recursive partitioning of the data is repeated until each region in the partition (each leaf in the Decision Tree) contains only one target value (one class or one regression value)
  • A leaf that contains data points that all share the same target value is called a pure leaf
• Building a Decision Tree
  • A prediction for a new data point is made by finding the region of the feature-space partition where the point falls, and then predicting the majority target (or the single target, in the case of a pure leaf) in that region
  • The region is found by traversing the tree from the root, going left or right depending on whether each test is satisfied (the learned tests can be inspected as shown below)
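The learned sequence of tests can be printed with scikit-learn's export_text, which makes this root-to-leaf traversal visible; the iris dataset and max_depth value below are illustrative choices.

from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier, export_text

iris = load_iris()
tree = DecisionTreeClassifier(max_depth=2, random_state=0)
tree.fit(iris.data, iris.target)

# Prints the if/else tests a new point follows from the root down to a leaf
print(export_text(tree, feature_names=list(iris.feature_names)))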
• Building a Decision Tree
  • If the partitioning continues until every leaf is pure, the resulting partition is too detailed: the tree overfits the training data
• Controlling the complexity of the Decision Tree
  • There are two general strategies to prevent overfitting: stopping the building of the tree early (pre-pruning), or building the full tree and then removing or collapsing nodes that contain little information (post-pruning, or simply pruning)
  • Possible criteria for pre-pruning include limiting the maximum depth of the tree or limiting the maximum number of leaves
  • Decision Trees in scikit-learn are implemented in the DecisionTreeRegressor and DecisionTreeClassifier classes; see the pre-pruning sketch below
  • Scikit-learn focuses on pre-pruning (recent versions also offer cost-complexity post-pruning via the ccp_alpha parameter)
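A minimal sketch of pre-pruning with max_depth: the unrestricted tree memorizes the training set, while the depth-limited tree is simpler and typically generalizes better. The breast cancer dataset and the depth of 4 are illustrative choices.

from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

full_tree = DecisionTreeClassifier(random_state=0).fit(X_train, y_train)
pruned_tree = DecisionTreeClassifier(max_depth=4, random_state=0).fit(X_train, y_train)

# Compare training and test accuracy with and without pre-pruning
print("no pruning  train/test:", full_tree.score(X_train, y_train), full_tree.score(X_test, y_test))
print("max_depth=4 train/test:", pruned_tree.score(X_train, y_train), pruned_tree.score(X_test, y_test))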
Feature importances
• Instead of looking at the entire tree, there are several useful attributes we can derive to summarize how the tree works.
• The most commonly used summary is feature importances, assessing how important each feature is to the decisions the tree makes.
• It is a number between 0 and 1 for each feature, where 0 means "not used at all" and 1 means "predicts the target perfectly."
• Feature importance in trees

import matplotlib.pyplot as plt
import numpy as np

# assumes `cancer` is the loaded breast cancer dataset and `tree` a fitted DecisionTreeClassifier
def plot_feature_importances_cancer(model):
    n_features = cancer.data.shape[1]
    plt.barh(range(n_features), model.feature_importances_, align='center')
    plt.yticks(np.arange(n_features), cancer.feature_names)
    plt.xlabel("Feature importance")
    plt.ylabel("Feature")

plot_feature_importances_cancer(tree)
Decision Tree Code Ex
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
DT = DecisionTreeClassifier(random_state=0)
pipe_DT = Pipeline(steps=[('scale', …), ('feat_select', …), ('clf', DT)])
# params_DT is the hyperparameter grid (not shown on the slide), e.g. {'clf__max_depth': [2, 3, 4, 5]}
GSCV_DT = GridSearchCV(pipe_DT, params_DT, cv=5)
GSCV_DT.fit(X_train, y_train)
GSCV_DT.score(X_test, y_test)
Ensembles of Decision Trees
• Ensembles: methods that combine multiple machine learning models to create a more powerful model.
• There are many models in the machine learning literature that fall into this category, but two ensemble models have proven effective on a wide variety of datasets for classification and regression, both of which use Decision Trees as their building blocks: Random Forests and Gradient Boosted Trees
• Random Forest: a collection of Decision Trees, where each tree is slightly different from the others
• The idea behind Random Forests is that each tree may do a relatively good job of predicting, but will likely overfit on part of the data; averaging many such trees reduces the overfitting
• There are two ways in which the trees in a Random Forest are randomized: by selecting the data points used to build each tree, or by selecting the features considered in each split test
• To build each tree, we first take what is called a bootstrap sample of our data
• From n_samples data points, we repeatedly sample points at random with replacement (meaning the same sample can be drawn multiple times), n_samples times
• To illustrate, say we want to create a bootstrap sample of the list ['a', 'b', 'c', 'd'] (see the sketch below)
• A possible bootstrap sample is ['b', 'd', 'd', 'c']. Another possible sample is ['d', 'a', 'd', 'a'].
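A tiny sketch of bootstrap sampling with NumPy, drawing n_samples items with replacement from the original list; the random seed is an illustrative choice, so the exact output will vary.

import numpy as np

rng = np.random.default_rng(0)
data = np.array(['a', 'b', 'c', 'd'])

# Draw len(data) items with replacement; repeated items are expected
bootstrap = rng.choice(data, size=len(data), replace=True)
print(bootstrap)  # e.g. something like ['d' 'c' 'a' 'a']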
• Next, a Decision Tree is built on this newly created dataset
• The algorithm randomly selects a subset of the features and searches for the best possible test involving one of those features
• This feature-subset selection is repeated separately at each node, so each node in the tree can make its decision using a different subset of the features
• To make a prediction with a Random Forest, the algorithm first makes a prediction with every tree in the "forest"
  • For regression, the per-tree results are averaged to get the final prediction
  • For classification, a "soft voting" strategy is used: each tree provides a probability for each class, the probabilities are averaged, and the class with the highest averaged probability is predicted (see the sketch below)
• Random Forests provide a much more intuitive decision boundary than any single tree. In any real application the model will use many more trees (often hundreds or thousands), leading to even finer boundaries
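A sketch of the soft-voting idea: average the class probabilities of the individual trees and pick the class with the highest averaged probability, which mirrors what RandomForestClassifier does internally. The dataset and the number of trees are illustrative.

import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

forest = RandomForestClassifier(n_estimators=100, random_state=0).fit(X_train, y_train)

# Average the per-tree class probabilities "by hand" ...
per_tree_proba = np.stack([t.predict_proba(X_test) for t in forest.estimators_])
manual_pred = per_tree_proba.mean(axis=0).argmax(axis=1)

# ... and check that it matches the forest's own predictions
print(np.array_equal(manual_pred, forest.predict(X_test)))  # expected: True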
• Random Forests provide feature importances, which are calculated by combining the feature importances of each tree in the Random Forest
• Advantages of Random Forests: reliable, often work well without heavy parameter tuning, and do not require data scaling
• Random Forests tend not to perform well on sparse, very high-dimensional data, such as text data
• Random Forests require more memory and are slower to train and predict than linear models
• Important parameters to adjust are n_estimators, max_features, and possibly pre-pruning options such as max_depth
• Gradient Boosted Trees work by building trees sequentially, where each tree tries to correct the errors of the previous ones
• Gradient Boosted Trees often use very shallow trees, with a depth of one to five, which makes the model smaller in terms of memory and makes predictions faster
• By default there is no randomization in Gradient Boosted Trees; instead, strong pre-pruning is used
• The main idea behind Gradient Boosted Trees is to combine many simple models (known in this context as weak learners), such as shallow trees
• Each tree can only provide good predictions on part of the data, and more trees are added to iteratively improve performance (see the sketch below)
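A sketch of this iterative improvement using staged_predict, which yields the ensemble's predictions after each additional tree, so test accuracy can be tracked as trees are added; the dataset and parameter values are illustrative.

from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

gbt = GradientBoostingClassifier(n_estimators=100, max_depth=1, learning_rate=0.1, random_state=0)
gbt.fit(X_train, y_train)

# Test accuracy after 1, 25, 50, 75, and 100 trees
staged = list(gbt.staged_predict(X_test))
for n in (1, 25, 50, 75, 100):
    print(n, "trees:", accuracy_score(y_test, staged[n - 1]))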
• The feature importances of a Gradient Boosted Tree model are somewhat similar to those of a Random Forest, although Gradient Boosting completely ignores some of the features
• Advantages: one of the most powerful and widely used models for supervised learning
• Disadvantages: requires careful parameter tuning and may take a long time to train
• The main parameters of the model are n_estimators and learning_rate, which controls how strongly each tree is allowed to correct the errors of the previous trees
• Another important parameter is max_depth (or alternatively max_leaf_nodes), which reduces the complexity of each tree
Random Forest Code Ex
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
RF = RandomForestClassifier(random_state=0)
pipe_RF = Pipeline(steps=[('scale', …), ('feat_select', …), ('clf', RF)])
params_RF = {'feat_select__k': …,
             'clf__n_estimators': [100, 150, 200],
             'clf__criterion': ['gini', 'entropy'],
             'clf__max_depth': [2, 3, 4, 5]}
GSCV_RF = GridSearchCV(pipe_RF, params_RF, cv=5)
GSCV_RF.fit(X_train, y_train)
GSCV_RF.score(X_test, y_test)
GradientBoostedTree Code Ex
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
GBT = GradientBoostingClassifier(random_state=0)
pipe_GBT = Pipeline(steps=[('scale', …), ('feat_select', …), ('clf', GBT)])
params_GBT = {'feat_select__k': …,
              'clf__n_estimators': [100, 150, 200],
              'clf__criterion': ['friedman_mse', 'squared_error'],
              'clf__max_depth': [2, 3, 4, 5],
              'clf__learning_rate': [0.1, 1, 10]}
GSCV_GBT = GridSearchCV(pipe_GBT, params_GBT, cv=5)
GSCV_GBT.fit(X_train, y_train)
GSCV_GBT.score(X_test, y_test)
Logistic Regression
• Logistic Regression is a basic but effective approach to linear, binary classification problems.
• Even though it has 'regression' in its name, Logistic Regression is a classification model, not a regression model.
• As a model for binary classification, Logistic Regression is fit by maximizing the 'likelihood' of the training labels.
Logistic Regression
• Logistic Regression applies a sigmoid function to the result of a linear equation of the input variables and produces a probabilistic output ranging between 0 and 1 (see the sketch below).
• A threshold value is applied to the output to classify new data into a certain class.
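A minimal sketch of this mechanism: sigmoid(w·x + b) gives a probability, and a threshold of 0.5 turns it into a class label. The weights, intercept, and input values below are illustrative numbers, not fitted coefficients.

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

w = np.array([1.5, -2.0])   # illustrative coefficients
b = 0.25                    # illustrative intercept
x_new = np.array([0.8, 0.3])

prob = sigmoid(w @ x_new + b)   # probability of the positive class (about 0.70 here)
label = int(prob >= 0.5)        # apply the threshold
print(prob, label)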
• Logistic Regression has a parameter called C that determines the strength of regularization.
• A higher value of C means less regularization: high C values try to fit the training set as well as possible (in other words, they "trust" the training set).
• A low value of C makes the model place more emphasis on finding a coefficient vector (w) that is close to zero, minimizing the influence of individual features (see the sketch below).
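A small sketch of the effect of C: with strong regularization (low C) the learned coefficients are pulled toward zero, and with weak regularization (high C) they grow larger. The dataset and the C values are illustrative.

import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
scaler = StandardScaler().fit(X_train)

for C in (0.01, 1, 100):
    logreg = LogisticRegression(C=C, max_iter=5000)
    logreg.fit(scaler.transform(X_train), y_train)
    print(f"C={C}: mean |w| = {np.abs(logreg.coef_).mean():.3f}, "
          f"test accuracy = {logreg.score(scaler.transform(X_test), y_test):.3f}")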
• The advantages of Logistic Regression are:
  • Simple and easy to implement
  • Reliable for datasets with few to moderately many features
  • The coefficients are easy to interpret
• The disadvantages of Logistic Regression are:
  • Sensitive to outliers and imbalanced datasets
  • Not reliable on complex datasets
  • Interpreting the coefficients becomes difficult when there are correlated features
LogisticRegression Code Ex
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
LogReg = LogisticRegression()
pipe_LogReg = Pipeline(steps=[('scale', …), ('feat_select', …), ('clf', LogReg)])
# note: the 'l1' penalty requires a compatible solver, e.g. LogisticRegression(solver='liblinear')
params_LogReg = {'feat_select__k': …,
                 'clf__C': [0.1, 1, 10],
                 'clf__penalty': ['l1', 'l2']}
GSCV_LogReg = GridSearchCV(pipe_LogReg, params_LogReg, cv=5)
GSCV_LogReg.fit(X_train, y_train)
GSCV_LogReg.score(X_test, y_test)
Kernelized Support Vector Machine
• Kernelized SVMs are extensions of linear SVMs that allow more complex models, which are not defined solely by hyperplanes in the input space
• One way to make a linear model more flexible is to add more features, for example by adding interactions or polynomials of the input features (see the sketch below)
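A sketch of the feature-expansion idea: a linear SVM on the original two features cannot separate the classes well, but after adding polynomial and interaction features it usually can. The make_moons dataset, the polynomial degree, and the other settings are illustrative choices, not the example used in the slides.

from sklearn.datasets import make_moons
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures, StandardScaler
from sklearn.svm import LinearSVC

X, y = make_moons(n_samples=200, noise=0.15, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Linear SVM on the original features vs. on polynomially expanded features
linear_only = make_pipeline(StandardScaler(), LinearSVC(max_iter=10000))
expanded = make_pipeline(StandardScaler(), PolynomialFeatures(degree=3), LinearSVC(max_iter=10000))

print("original features:", linear_only.fit(X_train, y_train).score(X_test, y_test))
print("expanded features:", expanded.fit(X_train, y_train).score(X_test, y_test))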
• Illustration: a two-class classification dataset in which the classes are not linearly separable
• Linear models and nonlinear features
• Illustration: the decision boundary created by a linear SVM on this dataset
Kernelized Support Vector Machines
• After adding the expanded features, the linear SVM model is no longer linear as a function of the original features: its decision boundary is not a line, but more like an ellipse
• The kernel trick is a mathematical trick that allows our classifier to learn in a higher-dimensional space without actually computing the new representation, which may be very large
• It works by directly calculating the distance (more precisely, the scalar product) between data points in the expanded feature representation, without ever actually computing the expansion
• Two ways to map your data to a higher-dimensional space with an SVM:
  • The polynomial kernel, which computes all possible polynomials of the original features up to a certain degree (such as feature1 ** 2 * feature2 ** 5)
  • The radial basis function (RBF) kernel, also known as the Gaussian kernel
• During training, the SVM learns how important each training data point is for representing the decision boundary between the two classes
• Usually only a subset of the training points is important for determining the decision boundary: those located on the borders between the classes (the support vectors)
• To make a prediction for a new point, the distance to each support vector is measured
• The classification decision is based on these distances and on the importance of each support vector learned during training (stored in the dual_coef_ attribute of the SVC); see the sketch below
• https://youtu.be/Q7vT0--5VII
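A small sketch of fitting an RBF-kernel SVC and inspecting the support vectors and their learned importances (dual_coef_) described above; the dataset and the C and gamma settings are illustrative.

from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

model = make_pipeline(StandardScaler(), SVC(kernel='rbf', C=1.0, gamma='scale'))
model.fit(X_train, y_train)

svc = model.named_steps['svc']
print("support vectors per class:", svc.n_support_)
print("dual_coef_ shape:", svc.dual_coef_.shape)  # learned importance of each support vector
print("test accuracy:", model.score(X_test, y_test))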
• The gamma parameter controls the width of the Gaussian kernel
  • It determines the scale of what it means for points to be close together
• The C parameter is a regularization parameter, similar to the one used in linear models
  • It limits the importance of each point (or more precisely, its dual_coef_)
• Illustration: decision boundaries of an RBF-kernel SVM for different values of gamma and C
• A small value of gamma means a large radius for the Gaussian kernel, so many points are considered close together
  • This is reflected in very smooth decision boundaries on the left of the figure, and boundaries that focus more on individual points further to the right
• A low gamma value means the decision boundary varies slowly, resulting in a model of low complexity, while a high gamma value results in a more complex model
• A small value of C means a very restricted model, in which each data point can have only very limited influence
  • At the top left of the figure the decision boundary looks almost linear, with misclassified points having almost no influence on the line
  • Increasing C, as shown at the bottom right, allows these points to have a stronger influence on the model and makes the decision boundary bend to classify them correctly
• Strengths, weaknesses, and parameters
  • SVMs are powerful models and perform well on a variety of datasets
  • SVMs allow complex decision boundaries, even if the data has only a few features. They work well on low-dimensional and high-dimensional data (i.e., few and many features), but do not scale very well with the number of samples
  • Running an SVM on data with 10,000 samples may work fine, but working with datasets of 100,000 samples or more can be challenging in terms of runtime and memory usage
  • Another disadvantage of SVMs is that they require careful data pre-processing and parameter tuning
• Additionally, SVM models are difficult to examine; it may be hard to understand why a certain prediction was made, and hard to explain the model to non-experts
• The important parameters of kernelized SVMs are the regularization parameter C, the choice of kernel, and the kernel-specific parameters (such as gamma for the RBF kernel)
SVC Code Ex
from sklearn.svm import SVC
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
SVMClf = SVC()
pipe_SVM = Pipeline(steps=[('scale', …), ('feat_select', …), ('clf', SVMClf)])
# params_SVM is the hyperparameter grid (not shown on the slide), e.g. {'clf__C': [0.1, 1, 10], 'clf__gamma': [0.01, 0.1, 1]}
GSCV_SVM = GridSearchCV(pipe_SVM, params_SVM, cv=5)
GSCV_SVM.fit(X_train, y_train)
GSCV_SVM.score(X_test, y_test)
Regression
• Regression in supervised learning is a method for modeling the relationship between the independent variables (X) and the dependent variable (y) by finding a mathematical equation.
• The goal of regression is to predict the value of the dependent variable from the values of the independent variables.
• An easy way to differentiate between classification and regression tasks: ask whether there is some kind of continuity in the output
• Example: predicting a person's annual income from their education, their age, and where they live
• The predicted value is a quantity, and can be any number within a specified range
Regression Illustration
Linear Regression
• Linear Regression is one of the simplest and most frequently used regression methods.
• It works by looking for the best straight line (linear relationship) that describes the relationship between the independent variables (X) and the dependent variable (y).
• Linear Regression calculates the parameters w and b that minimize the Mean Squared Error between the predictions and the actual regression targets (see the sketch below).
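A minimal sketch of fitting w and b by minimizing the Mean Squared Error with scikit-learn's LinearRegression; the synthetic data (true slope 0.5, intercept 1.0 plus noise) is illustrative.

import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error

rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(60, 1))
y = 0.5 * X[:, 0] + 1.0 + rng.normal(scale=0.3, size=60)

lr = LinearRegression().fit(X, y)
print("w:", lr.coef_, "b:", lr.intercept_)                   # learned slope and intercept
print("training MSE:", mean_squared_error(y, lr.predict(X)))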
• Advantages: simple and easy to understand, can give accurate results on linear datasets, and has a relatively fast computation time.
• Disadvantages: cannot handle non-linear relationships, and cannot handle multicollinearity (strong correlation between two or more independent variables in the regression model, which makes the estimated coefficients unstable and hard to interpret).
Lasso
• The Lasso and Ridge regression techniques were developed to overcome multicollinearity and overfitting problems in linear regression models.
• Both techniques work by imposing a penalty on the regression objective.
• Lasso uses the L1 penalty, creating a simpler model by focusing on the most important features and discarding minor ones: some coefficients become exactly zero.
• The Lasso penalty also pushes the remaining coefficients w toward zero (see the sketch below).
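A small sketch of the L1 penalty's effect: with Lasso, several coefficients become exactly zero, so only a subset of the features is used. The diabetes dataset and the alpha value are illustrative choices.

import numpy as np
from sklearn.datasets import load_diabetes
from sklearn.linear_model import Lasso, LinearRegression

X, y = load_diabetes(return_X_y=True)

ols = LinearRegression().fit(X, y)
lasso = Lasso(alpha=1.0).fit(X, y)

print("non-zero coefficients, plain linear regression:", np.sum(ols.coef_ != 0))
print("non-zero coefficients, Lasso:", np.sum(lasso.coef_ != 0))  # only a few features remain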
• Advantages: can produce simpler models, selects the most important features, and helps avoid overfitting.
• Disadvantages: may discard features that do influence the dependent variable (especially among correlated features) and may produce an unstable model.
Ridge
• Ridge uses the L2 penalty, shrinking the regression coefficients and addressing multicollinearity to produce a more stable model.
• The coefficient values are made as small as possible: all entries of w should be close to zero (see the sketch below).
• Each feature should have as little effect on the outcome as possible (which means a small slope when graphed), while still predicting well.
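A small sketch of the L2 penalty's effect: as Ridge's alpha (the penalty strength) grows, the coefficients shrink toward zero, but none of them become exactly zero. The dataset and alpha values are illustrative.

import numpy as np
from sklearn.datasets import load_diabetes
from sklearn.linear_model import Ridge

X, y = load_diabetes(return_X_y=True)

for alpha in (0.1, 1.0, 10.0):
    ridge = Ridge(alpha=alpha).fit(X, y)
    print(f"alpha={alpha}: mean |w| = {np.abs(ridge.coef_).mean():.2f}, "
          f"non-zero coefficients = {np.sum(ridge.coef_ != 0)}")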
• Advantages: Ridge can mitigate multicollinearity and create a more stable model.
• Disadvantages: cannot discard insignificant features (no coefficient becomes exactly zero) and therefore tends to build more complicated models.
Regression Code Ex
from sklearn.linear_model import Ridge
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
RR = Ridge()
pipe_RR = Pipeline(steps=[('scale', …), ('feat_select', …), ('reg', RR)])
# params_RR is the hyperparameter grid (not shown on the slide), e.g. {'reg__alpha': [0.1, 1, 10]}
GSCV_RR = GridSearchCV(pipe_RR, params_RR, cv=5, scoring='neg_mean_absolute_error')
GSCV_RR.fit(X_train, y_train)
GSCV_RR.score(X_test, y_test)