
UNIT II

Parametric Methods

Code: U18CST7002
Presented by: Nivetha Raju
Department: CSE
Machine Learning
•Machine learning can be described as learning a function (f)
that maps input variables (X) to output variables (Y).

•Y = f(X)

•Machine learning models can therefore be parameterized so
that their behavior can be tuned for a given problem.
•These models can have many parameters, and finding
the best combination of parameters can be treated as a
search problem.
Machine Learning

• An algorithm learns this target function
from the training data.
• In order to estimate the unknown function,
we need to fit a model over the data (the
training data to be more precise).
• The form of the function is unknown, so our
job as machine learning practitioners is to
evaluate different machine learning
algorithms and see which is better at
approximating the underlying function.
• In general, this process can be parametric
or non-parametric.
Machine Learning
•Parametric models assume a specific functional form for
the underlying distribution of the data and are
characterized by a fixed number of parameters. The
model structure is predefined, and the parameters are
estimated from the data.
•Example: Linear regression, logistic regression, and
Gaussian Naive Bayes.

•Non-parametric models do not assume a specific
functional form for the data. Instead, they are flexible and
can adapt to the shape of the data. The number of
parameters can grow with the amount of data.
Example: k-Nearest Neighbors (k-NN), Decision Trees,
and Kernel Density Estimation.

•Semi-parametric models combine aspects of both
parametric and non-parametric models. They include a
parametric component that captures the main structure
of the data and a non-parametric component that allows
for additional flexibility.
What is a parameter in a machine
learning model?
A model parameter is a configuration variable that is internal to
the model and whose value can be estimated from the given
data.

•They are required by the model when making predictions.
•Their values define the skill of the model on your problem.
•They are estimated or learned from historical training data.
•They are often not set manually by the practitioner.
•They are often saved as part of the learned model.
What is a parameter in a machine
learning model?
The examples of model parameters include:

• The weights in an artificial neural network.
• The support vectors in a support vector machine.
• The coefficients in linear regression or logistic regression.
Parametric model
A parametric model is a learning model that summarizes data with a set of
fixed-size parameters (independent of the number of training
instances). Parametric machine learning algorithms simplify
the mapping function to a known form.
Parametric model
• In a parametric model, you know in advance which model you
are going to fit to the data, for example, a linear
regression line.
• b0 + b1*x1 + b2*x2 = 0
where,
• b0, b1, b2 → the coefficients of the line that control the
intercept and slope
• x1, x2 → input variables
• The assumed functional form is often a linear
combination of the input variables, and as such, parametric
machine learning algorithms are also frequently referred
to as ‘linear machine learning algorithms.’
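As a minimal sketch of fitting such a fixed-size parameter vector (illustrative code, assuming NumPy and synthetic data; the names are not from the slides), ordinary least squares estimates b0, b1, b2 directly:

```python
import numpy as np

# Synthetic data: y depends linearly on two input variables plus noise.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 2))                    # columns x1, x2
y = 1.5 + 2.0 * X[:, 0] - 3.0 * X[:, 1] + rng.normal(scale=0.1, size=100)

# Add a column of ones so the intercept b0 is learned as a coefficient.
X_design = np.column_stack([np.ones(len(X)), X])

# Least-squares estimate of the fixed-size parameter vector (b0, b1, b2).
coef, *_ = np.linalg.lstsq(X_design, y, rcond=None)
print("b0, b1, b2 =", coef)
```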
Parametric model
Some more examples of parametric machine learning
algorithms include:

• Logistic Regression
• Linear Discriminant Analysis
• Perceptron
• Naive Bayes
• Simple Neural Networks

Parametric model
Assumptions can greatly simplify the learning
process but can also limit what can be learned.
Algorithms that simplify the function to a known
form are called parametric machine learning
algorithms.
We assume that the sample is drawn from some
distribution that obeys a known model, for
example, Gaussian.
The algorithms involve two steps:
1. Select a form for the function.
2. Learn the coefficients (parameters) for the function from
the training data.
The advantage of the parametric approach is that
the model is defined up to a small number of
parameters—for example, mean, variance—the
sufficient statistics of the distribution.
Maximum Likelihood
Maximum Likelihood Estimation (MLE) is a statistical method used to
estimate the parameters of a statistical model. The core idea is to find the
parameter values that make the observed data most probable under the
assumed model.
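As an illustration (my own sketch, not from the slides), for a Gaussian the ML estimates of the mean and variance are the sample mean and the variance normalized by N; the snippet below checks this numerically with NumPy:

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.normal(loc=5.0, scale=2.0, size=1000)   # sample drawn from N(5, 2^2)

# Closed-form ML estimates for a Gaussian: sample mean and
# sample variance normalized by N (not N - 1).
mu_hat = x.mean()
var_hat = ((x - mu_hat) ** 2).mean()
print("ML estimate of the mean:", mu_hat)
print("ML estimate of the variance:", var_hat)
```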

Maximum Likelihood - Gaussian Density
Parametric Classification
To predict the classes of new data, the trained classifier finds the class
with the smallest misclassification cost. So we use the discriminant
function

• Given the sample $\mathcal{X} = \{x^t, r^t\}_{t=1}^{N}$, where

  $r_i^t = \begin{cases} 1 & \text{if } x^t \in C_i \\ 0 & \text{if } x^t \in C_j,\ j \neq i \end{cases}$

• The ML estimates are

  $\hat{P}(C_i) = \dfrac{\sum_t r_i^t}{N}$,  $m_i = \dfrac{\sum_t x^t r_i^t}{\sum_t r_i^t}$,  $s_i^2 = \dfrac{\sum_t (x^t - m_i)^2\, r_i^t}{\sum_t r_i^t}$

• The discriminant becomes

  $g_i(x) = -\dfrac{1}{2}\log 2\pi - \log s_i - \dfrac{(x - m_i)^2}{2 s_i^2} + \log \hat{P}(C_i)$
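As a hedged, illustrative sketch of these formulas (my own code with synthetic 1-D data; not from the slides), the classifier below estimates the prior, mean, and variance of each class and picks the class with the largest discriminant:

```python
import numpy as np

def fit_gaussian_classifier(x, y):
    """Estimate prior P(C_i), mean m_i, and variance s_i^2 for each class."""
    params = {}
    for c in np.unique(y):
        xc = x[y == c]
        params[c] = {
            "prior": len(xc) / len(x),
            "mean": xc.mean(),
            "var": ((xc - xc.mean()) ** 2).mean(),   # ML (biased) variance
        }
    return params

def discriminant(x0, p):
    """g_i(x) = -0.5*log(2*pi) - log(s_i) - (x - m_i)^2 / (2*s_i^2) + log P(C_i)."""
    return (-0.5 * np.log(2 * np.pi) - 0.5 * np.log(p["var"])
            - (x0 - p["mean"]) ** 2 / (2 * p["var"]) + np.log(p["prior"]))

rng = np.random.default_rng(2)
x = np.concatenate([rng.normal(0, 1, 50), rng.normal(3, 1, 50)])
y = np.array([0] * 50 + [1] * 50)

params = fit_gaussian_classifier(x, y)
x0 = 1.4
pred = max(params, key=lambda c: discriminant(x0, params[c]))
print("predicted class for x =", x0, "is", pred)
```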
Parametric Classification

(Figure: equal variances; a single boundary at halfway between the means.)
Parametric Classification

(Figure: variances are different; two boundaries.)
Regression
Logistic regression is another supervised learning algorithm,
which is used to solve classification problems.
The logistic regression algorithm works with a categorical target
variable such as 0 or 1, Yes or No, True or False, Spam or Not
Spam, etc.

What is Logistic Regression?
Logistic regression is used for binary classification. It uses the
sigmoid function, which takes the independent variables as input
and produces a probability value between 0 and 1.

For example, suppose we have two classes, Class 0 and Class 1. If the
value of the logistic function for an input is greater than 0.5
(the threshold value), then the input belongs to Class 1; otherwise it
belongs to Class 0. It is referred to as regression because it
is an extension of linear regression, but it is mainly used for
classification problems.

Logistic Function – Sigmoid Function
• The sigmoid function is a mathematical function used to
map the predicted values to probabilities.
• It maps any real value to another value within the range
of 0 and 1. The output of logistic regression must be
between 0 and 1 and cannot go beyond this limit, so it
forms a curve shaped like the letter "S".
• The S-shaped curve is called the sigmoid function or the
logistic function.
• In logistic regression, we use the concept of a threshold
value, which defines the probability of either 0 or 1: values
above the threshold tend to 1, and values
below the threshold tend to 0.
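A minimal sketch of the sigmoid and the 0.5 threshold rule described above (assuming NumPy; the numbers are illustrative):

```python
import numpy as np

def sigmoid(z):
    """Map any real value into the range (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

z = np.array([-4.0, -1.0, 0.0, 1.0, 4.0])
probs = sigmoid(z)
labels = (probs > 0.5).astype(int)      # threshold at 0.5
print(probs)                            # approx [0.018 0.269 0.5 0.731 0.982]
print(labels)                           # [0 0 0 1 1]
```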
How does Logistic Regression work?
The logistic regression model transforms the continuous output of
the linear regression function into a categorical output
using the sigmoid function, which maps any real-valued
combination of the independent variables to a value between 0
and 1. This function is known as the logistic function.
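For instance (a sketch assuming scikit-learn is installed; the data is synthetic and the feature is a placeholder), fitting and thresholding look like this:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Synthetic binary-classification data with one informative feature.
rng = np.random.default_rng(3)
X = rng.normal(size=(200, 1))
y = (X[:, 0] + rng.normal(scale=0.5, size=200) > 0).astype(int)

clf = LogisticRegression().fit(X, y)               # learns intercept and coefficient
probs = clf.predict_proba([[0.8], [-0.8]])[:, 1]   # sigmoid of the linear score
print("P(class 1):", probs)
print("predicted classes:", (probs > 0.5).astype(int))
```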

Parametric Regression
In regression, we would like to write the numeric output, called the
dependent variable, as a function of the input, called the
independent variable:

$r = f(x) + \epsilon$, where the noise $\epsilon \sim N(0, \sigma^2)$

We approximate $f(x)$ with an estimator $g(x \mid \theta)$, so that

$p(r \mid x) \sim N(g(x \mid \theta), \sigma^2)$ and $p(x, r) = p(r \mid x)\, p(x)$

The log likelihood of the parameters given the sample $\mathcal{X}$ is

$\mathcal{L}(\theta \mid \mathcal{X}) = \log \prod_{t=1}^{N} p(x^t, r^t) = \sum_{t=1}^{N} \log p(r^t \mid x^t) + \sum_{t=1}^{N} \log p(x^t)$
Parametric Regression

(Figures: linear regression and polynomial regression fits were shown on these slides.)
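As a brief sketch of linear vs. polynomial regression (illustrative NumPy code on synthetic data, not taken from the slides):

```python
import numpy as np

# Noisy sample from an underlying nonlinear function.
rng = np.random.default_rng(4)
x = np.linspace(0, 5, 20)
r = 2.0 * np.sin(1.5 * x) + rng.normal(size=x.size)

# Fit polynomials of increasing order; order 1 is plain linear regression.
for order in (1, 3, 5):
    coeffs = np.polyfit(x, r, order)            # least-squares coefficient estimates
    residuals = r - np.polyval(coeffs, x)
    print(f"order {order}: training MSE = {np.mean(residuals ** 2):.3f}")
```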
Bias
• Bias is one type of error that occurs due to wrong
assumptions about the data, such as assuming the data is linear
when, in reality, it follows a complex function.
• On the other hand, variance is introduced by high
sensitivity to variations in the training data.
• This is also a type of error, since we want to make our
model robust against noise.
• There are two types of error in machine learning:
reducible error and irreducible error.
• Bias and variance come under reducible error.

What is Bias?
• Bias is the inability of the model to capture the true relationship
in the data, because of which there is some difference or error between
the model's predicted value and the actual value.
• These differences between the actual (or expected) values and
the predicted values are known as bias error or
error due to bias. Bias is a systematic error that occurs due
to wrong assumptions in the machine learning process.
• Let Y be the true value of a parameter, and let Ŷ be an
estimator of Y based on a sample of data. Then, the bias of
the estimator Ŷ is given by:
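The formula itself appeared as an image on the slide; the standard definition is

$\text{Bias}(\hat{Y}) = E[\hat{Y}] - Y$

where $E[\hat{Y}]$ denotes the expected value of the estimator over repeated samples.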

What is Bias?

• Low Bias: A low bias value means fewer assumptions are
made to build the target function. In this case, the model
will closely match the training dataset.
• High Bias: A high bias value means more assumptions are
made to build the target function. In this case, the model
will not match the training dataset closely.
A high-bias model will not be able to capture the dataset
trend. It is considered an underfitting model with a
high error rate. It is due to a very simplified algorithm.

Ways to reduce high bias in Machine
Learning
• Use a more complex model: One of the main reasons
for high bias is an overly simplified model, which will not be able
to capture the complexity of the data.
• Increase the number of features: Adding more
features to the training dataset will increase the complexity of
the model.
• Reduce regularization of the model: Regularization
techniques such as L1 or L2 regularization help to
prevent overfitting, but too much regularization can cause
underfitting, so reducing it lowers bias.
• Increase the size of the training data.

Variance
• Variance is the measure of spread in data from its mean
position.
• In machine learning, variance is the amount by which the
performance of a predictive model changes when it is
trained on different subsets of the training data.
• More specifically, variance is the variability of the model:
how sensitive it is to another subset of the
training dataset, i.e., how much it adjusts to a new
subset of the training dataset.

Ways to Reduce the Variance in Machine
Learning:
• Cross-validation: By splitting the data into training and testing
sets multiple times, cross-validation can help identify whether a model
is overfitting or underfitting and can be used to tune
hyperparameters to reduce variance.
• Feature selection: Choosing only the relevant features will
decrease the model's complexity and can reduce the variance
error.
• Regularization: We can use L1 or L2 regularization to reduce
variance in machine learning models.
• Ensemble methods: These combine multiple models to
improve generalization performance. Bagging, boosting, and
stacking are common ensemble methods that can help reduce variance.
Variance errors are either low-variance or
high-variance errors.
• Low variance: Low variance means that the model is less
sensitive to changes in the training data and can produce
consistent estimates of the target function with different
subsets of data from the same distribution. This is the case in
underfitting, when the model fails to generalize on both
training and test data.

• High variance: High variance means that the model is very
sensitive to changes in the training data and can result in
significant changes in the estimate of the target function when
trained on different subsets of data from the same distribution.
This is the case in overfitting, when the model performs well on
the training data but poorly on unseen test data.
Ways to Reduce the Variance in Machine
Learning:
• Simplifying the model: Reducing the complexity of the
model, such as decreasing the number of parameters or
layers in a neural network, can also help reduce variance
and improve generalization performance.
• Early stopping: Early stopping is a technique used to
prevent overfitting by stopping the training of the learning
model when the performance on the validation set stops
improving.

Different Combinations of Bias-Variance

• High Bias, Low Variance
• High Variance, Low Bias
• High Bias, High Variance
• Low Bias, Low Variance

Tuning Model Complexity: Bias/Variance
Dilemma
• Let us say that a sample X = {x^t, r^t} is drawn from
some unknown joint probability density p(x, r). Using this
sample, we construct our estimate g(·). The expected
squared error (over the joint density) at x can be written as:
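The equation itself appeared as an image on the slide; the standard decomposition is

$E\big[(r - g(x))^2 \mid x\big] = E\big[(r - E[r \mid x])^2 \mid x\big] + \big(E[r \mid x] - g(x)\big)^2$

The first term is the noise, which does not depend on g(·); the second is the squared error of the estimate. Taking the expectation of the second term over samples $\mathcal{X}$ splits it into bias squared and variance:

$E_{\mathcal{X}}\big[(E[r \mid x] - g(x))^2\big] = \big(E[r \mid x] - E_{\mathcal{X}}[g(x)]\big)^2 + E_{\mathcal{X}}\big[(g(x) - E_{\mathcal{X}}[g(x)])^2\big]$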

Tuning Model Complexity: Bias/Variance
Dilemma
(a) Function, f(x) = 2 sin(1.5x), and one noisy (N(0,1)) dataset sampled from
the function. Five samples are taken, each containing twenty instances.
(b), (c), (d) show the five polynomial fits, g_i(·), of order 1, 3, and 5, respectively.
In each case, the dotted line is the average of the five fits.

Tuning Model Complexity: Bias/Variance
Dilemma
In the same setting as that of the previous figure, but using one hundred models
instead of five: bias, variance, and error for polynomials of order 1 to 5.
Order 1 has the smallest variance. Order 5 has the smallest bias. As the
order is increased, bias decreases but variance increases. Order 3 has the
minimum error.
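A rough sketch reproducing this kind of experiment (my own illustrative code, assuming NumPy): fit polynomials of orders 1 to 5 to many noisy samples of f(x) = 2 sin(1.5x) and measure squared bias and variance against the true function.

```python
import numpy as np

rng = np.random.default_rng(5)
f = lambda x: 2.0 * np.sin(1.5 * x)
x = np.linspace(0, 5, 20)
n_models = 100

for order in range(1, 6):
    fits = []
    for _ in range(n_models):
        r = f(x) + rng.normal(size=x.size)        # a fresh noisy sample
        fits.append(np.polyval(np.polyfit(x, r, order), x))
    fits = np.array(fits)                          # shape (n_models, len(x))
    g_bar = fits.mean(axis=0)                      # average fit over the models
    bias2 = np.mean((g_bar - f(x)) ** 2)           # squared bias
    variance = np.mean(fits.var(axis=0))           # variance across the models
    print(f"order {order}: bias^2 = {bias2:.3f}, variance = {variance:.3f}")
```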

Tuning Model Complexity: Bias/Variance
Dilemma
As the order of the polynomial increases, small changes in the dataset
cause a greater change in the fitted polynomials; thus variance increases.
But a complex model on average allows a better fit to the underlying
function; thus bias decreases. This is called the bias/variance dilemma.
• If there is bias, this indicates that our model class does not contain
the solution; this is underfitting.
• If there is variance, the model class is too general and also learns the
noise; this is overfitting.

If the model class contains the true function, for example a polynomial of
the same order, we have an unbiased estimator, and the estimated bias
decreases as the number of models increases. This shows the error-reducing
effect of choosing the right model (which we called the inductive bias).
Tuning Model Complexity: Bias/Variance
Dilemma
• If the algorithm is too simple (a hypothesis with a linear
equation), then it may be in a high-bias and low-variance
condition and thus error-prone.
• If the algorithm is too complex (a hypothesis with a high-degree
equation), then it may be in a high-variance and low-bias condition.
• In the latter condition, the new entries will not perform
well. There is something between both of these
conditions, known as a trade-off, or the bias-variance trade-off.
• This trade-off in complexity is why there is a trade-off
between bias and variance.
• An algorithm can't be more complex and less complex at
the same time.
Bias Variance Tradeoff

Parameters and Hyperparameters

Parameters are the internal variables of a model that are
learned from the training data during the training process.
These values are optimized to minimize the error between
the predicted outputs and the actual outputs in the
training set.
•Examples:
•Weights in Neural Networks
•Coefficients in Linear Regression
•Support Vectors in SVM

Hyperparameters are external configurations or settings
used to control the learning process of a model. These are
set before the training process begins and are not learned
from the data.
Examples:
Learning Rate
Number of Trees in a Random Forest
Number of Hidden Layers in a Neural Network
Model Selection Procedures

There are six procedures that can be used to fine-tune
model complexity:
• Cross-validation
• Regularization
• Akaike's information criterion (AIC) and Bayesian
information criterion (BIC)
• Structural risk minimization (SRM)
• Minimum description length (MDL)
• Bayesian model selection

Cross-validation

Cross-validation is a technique for validating the model's
efficiency by training it on a subset of the input data and testing
it on a previously unseen subset of the input data.
The basic steps of cross-validation are listed below (a short usage
sketch follows the list):
• Reserve a subset of the dataset as a validation set.
• Train the model using the training dataset.
• Now, evaluate model performance using the validation set.
If the model performs well on the validation set, proceed to
the further steps; otherwise, check for issues.
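As a usage sketch (assuming scikit-learn; the data and model here are placeholders), k-fold cross-validation can be run as follows:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

# Placeholder synthetic data standing in for a real dataset.
rng = np.random.default_rng(6)
X = rng.normal(size=(150, 4))
y = (X[:, 0] - X[:, 1] > 0).astype(int)

# 5-fold cross-validation: train on 4 folds, validate on the held-out fold.
scores = cross_val_score(LogisticRegression(), X, y, cv=5)
print("fold accuracies:", scores)
print("mean accuracy:", scores.mean())
```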

Comparison of Cross-validation to train/test
split in Machine Learning
• Train/test split: The input data is divided into two parts, a
training set and a test set, in a ratio such as 70:30 or 80:20.
This provides a high-variance estimate, which is one of its biggest
disadvantages.

• Training data: The training data is used to train the model,
and the dependent variable is known.

• Test data: The test data is used to make predictions
with the model that has already been trained on the training data.
It has the same features as the training data but is not part
of it.

• Cross-validation dataset: Cross-validation is used to overcome the
disadvantage of the train/test split by splitting the dataset into
groups of train/test splits and averaging the result. It can
be used if we want to optimize a model that has been
trained on the training dataset for the best performance.
Cross Validation and Structural Risk
Minimization
• In the same setting as that of the earlier figure, training
and validation sets (each containing 50 instances) are
generated. (a) Training data and fitted polynomials of
order 1 to 8. (b) Training and validation errors as a
function of the polynomial order. The “elbow” is at 3.

Structural Risk Minimization

• Structural Risk Minimization (SRM) is a principle in
machine learning and statistical learning theory that
aims to find a balance between two competing factors:
model complexity and the ability to minimize
errors on both training data and unseen data
(generalization). It's a foundational concept behind
techniques like Support Vector Machines (SVMs).

• Model Selection: SRM involves evaluating a series of
models with increasing complexity. For each model, you
calculate the empirical risk and add a penalty based on
the model's complexity.

• Optimal Model: The model that achieves the best trade-off
between low empirical risk and low complexity
penalty is chosen. This model is expected to generalize
well to new data.
Regularization
Regularization is a technique used to reduce errors by fitting
the function appropriately on the given training set and
avoiding overfitting. The commonly used regularization
techniques are:

• Lasso Regularization – L1 Regularization
• Ridge Regularization – L2 Regularization
• Elastic Net Regularization – L1 and L2 Regularization

Regularization
Lasso Regression
• A regression model which uses the L1 Regularization
technique is called LASSO(Least Absolute Shrinkage and
Selection Operator) regression. Lasso Regression adds the
“absolute value of magnitude” of the coefficient as a
penalty term to the loss function (L).
• Lasso regression also helps us achieve feature selection by
penalizing the weights to approximately equal to zero if
that feature does not serve any purpose in the model.

Regularization
Ridge Regression
A regression model that uses the L2 regularization technique
is called Ridge regression. Ridge regression adds the “squared
magnitude” of the coefficient as a penalty term to the loss
function (L).

Regularization
Elastic Net Regression
This model is a combination of L1 and L2 regularization.
That implies that we add the absolute norm of the weights as
well as the squared measure of the weights, with the help of
an extra hyperparameter that controls the ratio of the L1 and
L2 regularization.
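A minimal sketch (assuming scikit-learn; the data is synthetic) comparing the three regularized regressions described above:

```python
import numpy as np
from sklearn.linear_model import ElasticNet, Lasso, Ridge

rng = np.random.default_rng(7)
X = rng.normal(size=(100, 5))
# Only the first two features matter; the rest are noise.
y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + rng.normal(scale=0.1, size=100)

for model in (Lasso(alpha=0.1),                        # L1: drives some weights to ~0
              Ridge(alpha=1.0),                        # L2: shrinks all weights
              ElasticNet(alpha=0.1, l1_ratio=0.5)):    # mix of L1 and L2
    model.fit(X, y)
    print(type(model).__name__, np.round(model.coef_, 2))
```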

Akaike’s Information Criterion (AIC) &
Bayesian Information Criterion (BIC)

Methods such as Akaike's information criterion (AIC) and the
Bayesian information criterion (BIC) work by estimating the
optimism of the training error (how much it underestimates the
test error) and adding it to the training error to estimate the
test error, without any need for validation.
Bayesian model selection is used when we have some
prior knowledge about the appropriate class of
approximating functions. This prior knowledge is defined
as a prior distribution over models, p(model).

AIC and BIC
• AIC and BIC: These methods estimate how much the
training error might be optimistic (underestimated) and
adjust it to predict the test error.
• No Validation Needed: They do this adjustment
without requiring a separate validation set.
• Influence of Input Features: The more input features
(or parameters) the model has, the greater the
adjustment for underestimation.
• Effect of Training Set Size: As the size of the training
set increases, the underestimation decreases.
• Impact of Noise: The adjustment also increases with
the amount of noise in the data, which can be estimated
from the error of a low-bias model.
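As an illustrative sketch (not from the slides), under a Gaussian-noise assumption AIC and BIC for a least-squares fit can be computed from the residual sum of squares, where k counts the parameters and n the training examples:

```python
import numpy as np

def aic_bic(rss, n, k):
    """AIC = 2k - 2 ln L and BIC = k ln n - 2 ln L; under Gaussian noise,
    -2 ln L = n * ln(rss / n) up to an additive constant."""
    neg2_loglik = n * np.log(rss / n)
    return neg2_loglik + 2 * k, neg2_loglik + k * np.log(n)

# Compare polynomial orders on the same noisy data.
rng = np.random.default_rng(8)
x = np.linspace(0, 5, 40)
r = 2.0 * np.sin(1.5 * x) + rng.normal(size=x.size)

for order in (1, 3, 5):
    coeffs = np.polyfit(x, r, order)
    rss = np.sum((r - np.polyval(coeffs, x)) ** 2)
    aic, bic = aic_bic(rss, len(x), order + 1)    # order + 1 coefficients
    print(f"order {order}: AIC = {aic:.1f}, BIC = {bic:.1f}")
```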
Minimum description length (MDL)

• If the data is simple, it has a short description; for
example, if it is a sequence of ‘0’s, we can just write ‘0’
and the length of the sequence. If the data is completely
random, then we cannot have any description of the data
shorter than the data itself.
• If a model is appropriate for the data, then it has a good
fit to the data, and instead of the data, we can send/store
the model description.
• Out of all the models that describe the data, we want to
have the simplest model, so that it lends itself to the
shortest description.
• Again, we have a trade-off between how simple the model is
and how well it explains the data.
Bayesian model selection

Bayesian model selection is used when we have some
prior knowledge about the appropriate class of
approximating functions. This prior knowledge is defined
as a prior distribution over models, p(model). Given the
data and assuming a model, we can calculate p(model | data)
using Bayes’ rule:
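The rule itself appeared as an image on the slide; in its standard form,

$p(\text{model} \mid \text{data}) = \dfrac{p(\text{data} \mid \text{model})\, p(\text{model})}{p(\text{data})}$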

