
5. Design and Analysis of Machine Learning Experiments
Presentation Material
Department of Computer Science & Engineering
Course Code: Semester: V
Course Title: AI & Machine Learning Year: 2024
Faculty Name:
Indu Joseph Thoppil



MODULE 5

TEXTBOOK REFERRED

Chapter 19: Ethem Alpaydin, Introduction to Machine Learning (Adaptive
Computation and Machine Learning series), The MIT Press, Third Edition.



Introduction
• In machine learning, there are several classification algorithms and,
given a certain problem, more than one may be applicable.
• There is a need to examine how we can assess how good a selected
algorithm is.
• Also, we need a method to compare the performance of two or
more different classification algorithms. These methods help us
choose the right algorithm in a practical situation.
Issues Related to Analysing ML Algorithms
• Having trained a classification algorithm on a dataset drawn from a
specific application, can we confidently predict its future performance in
real-life scenarios? While a well-trained model can achieve high accuracy
on the training data, it's essential to consider several factors that may
influence its performance when deployed in the real world.
• How can we determine which of two learning algorithms has a lower
error rate for a given application? The algorithms may belong to different
categories (e.g., parametric vs. nonparametric) or use varying
hyperparameter settings. For example, we might want to compare a multilayer
perceptron with four hidden units to one with eight hidden units, or find
the optimal value of k for a k-nearest neighbor classifier.
Issues Related to Analysing ML Algorithms
• When evaluating machine learning models, relying on training set errors
is insufficient because these errors are always smaller than those on
unseen test data. Training errors cannot be used to compare algorithms,
as complex models tend to fit training data better than simpler ones,
regardless of their true performance.
• To make fair comparisons, a separate validation set is needed. However,
even a single validation run may not be enough due to two main reasons:
1. Small Dataset Bias: Limited training and validation sets may include noise or
outliers, which can distort evaluation.
2. Random Factors: Factors like random initialization of weights in algorithms like
multilayer perceptrons can lead to different outcomes, even with identical setups.
How can randomness and variability in machine learning
experiments impact model evaluation?
Single Training Run Limitation:
• When you train a model just once, the resulting learner and its
validation error are influenced by random factors such as:
• The specific training data subset selected.
• Initial model parameters (e.g., weights in neural networks).
• Stochastic elements in the learning process (e.g., mini-batch selection during
training).
• This single validation error provides only a snapshot, which may not
reliably represent the algorithm's true performance due to these
sources of randomness.
Averaging Over Multiple Runs:
• To account for variability, multiple learners are trained using the
same algorithm under slightly different conditions, such as:
• Different training or validation data splits (e.g., in cross-validation).
• Different initializations of the model's parameters.
• Each learner is tested on a separate validation set to compute its
validation error. This results in a distribution of validation errors
rather than a single value.
• Multiple Runs to Reduce Randomness:
Random factors, such as initial weights or stochastic training,
introduce variability in outcomes. To account for this, multiple
learners are trained on different subsets or configurations, and their
validation errors are averaged.
This helps generate a distribution of errors, offering a robust
comparison of algorithms.
Evaluating the Algorithm: The distribution of validation errors is used
to assess the algorithm's performance:
• Expected Error: The mean of the distribution gives an estimate of the
expected error rate for the algorithm on that problem.
• Variability and Robustness: The spread (e.g., standard deviation) of the
distribution reveals how consistent the algorithm is across different
conditions.
This approach also facilitates direct comparison between algorithms.
By comparing their respective error distributions, it’s possible to
identify which algorithm is more reliable or effective for a given
problem.
Significance: This methodology ensures that the evaluation accounts
for all potential sources of randomness. Instead of relying on a single
outcome, it provides a more statistically sound basis for comparing
algorithms or assessing their real-world performance.
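A rough sketch of how such an error distribution can be produced, assuming scikit-learn is available; the iris dataset and the k-NN classifier are placeholders for whatever problem and algorithm are under study:

```python
# Sketch: estimate the distribution of validation errors over multiple runs.
# Assumes scikit-learn; the dataset and classifier are placeholders.
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)

errors = []
for seed in range(10):                          # 10 independent runs
    X_tr, X_val, y_tr, y_val = train_test_split(
        X, y, test_size=0.3, random_state=seed, stratify=y)
    clf = KNeighborsClassifier(n_neighbors=3).fit(X_tr, y_tr)
    errors.append(1.0 - clf.score(X_val, y_val))  # validation error of this run

errors = np.array(errors)
print(f"expected error ~ {errors.mean():.3f}, spread (std) ~ {errors.std():.3f}")
```

The mean of `errors` estimates the expected error, and its standard deviation indicates how consistent the algorithm is across runs.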
Following are the key principles in the design and analysis of machine
learning experiments, emphasizing best practices for training,
validation, and testing to ensure robust model evaluation and
meaningful insights.
• No Free Lunch Theorem (Wolpert 1995): The theorem states that no single
learning algorithm performs best across all datasets. The performance of an
algorithm depends on how well its inductive biases align with the properties
of the data at hand.
Example: A neural network may excel in tasks requiring non-linear
decision boundaries but may underperform in problems where
simpler models like k-NN suffice.
• Dataset Partitioning:
Proper data splitting into training, validation, and test sets is crucial:
• Training Set: Used to optimize the model parameters.
• Validation Set: Used to tune hyperparameters (e.g., number of layers in an
MLP, k in k-NN) or stop training.
• Test Set: Reserved for final performance evaluation, ensuring no prior
exposure during training or validation.
Example: In an experiment with an MLP, weights are trained on the
training set, hidden units and learning rate are fine-tuned on the
validation set, and final error is reported on the test set.
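A minimal sketch of this three-way protocol, assuming scikit-learn; the digits dataset and the candidate hidden-layer sizes are illustrative choices, not part of the original example:

```python
# Sketch: train/validation/test protocol for tuning an MLP's hidden-layer size.
# Assumes scikit-learn; the dataset and candidate sizes are illustrative only.
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

X, y = load_digits(return_X_y=True)
# 60% train, 20% validation, 20% test
X_tr, X_tmp, y_tr, y_tmp = train_test_split(X, y, test_size=0.4, random_state=0)
X_val, X_te, y_val, y_te = train_test_split(X_tmp, y_tmp, test_size=0.5, random_state=0)

best_size, best_val_acc = None, -1.0
for size in (4, 8, 16):                               # hyperparameter levels
    mlp = MLPClassifier(hidden_layer_sizes=(size,), max_iter=500,
                        random_state=0).fit(X_tr, y_tr)
    val_acc = mlp.score(X_val, y_val)                 # tune on validation set only
    if val_acc > best_val_acc:
        best_size, best_val_acc = size, val_acc

final = MLPClassifier(hidden_layer_sizes=(best_size,), max_iter=500,
                      random_state=0).fit(X_tr, y_tr)
print("test error:", 1.0 - final.score(X_te, y_te))   # reported once, at the end
```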
Criteria Beyond Accuracy:
Real-world decisions depend on multiple factors, not just error rates:
• Cost-sensitive Learning: Balancing false positives and false
negatives based on application-specific costs.
• Complexity: Training/testing time and space requirements.

• Interpretability: The ease of extracting insights from the model.

• Ease of Deployment: Simplicity in programming and integration.

Example: A support vector machine (SVM) might be accurate but less
interpretable compared to a decision tree.
Statistical Design of Experiments: Statistical principles should be
applied to experimental design and analysis to draw reliable
conclusions. This includes proper sampling, hypothesis testing, and
confidence intervals.
Applying statistical methodologies ensures meaningful conclusions
from experiments:
• Control for randomness by averaging multiple runs.
• Use proper statistical tests to validate significance between algorithms.
Some other criteria include:
• training time and space complexity,
• testing time and space complexity,
• interpretability, namely, whether the method allows knowledge extraction
which can be checked and validated by experts, and
• easy programmability.
The relative importances of these factors change depending on the application.
Principles of experimental design:
• There are three basic principles of design which were developed by
Sir Ronald A. Fisher:
(i) Randomization
(ii) Replication
(iii) Local control
(i) Randomization
• Randomization requires that the order in which the runs are carried out
should be randomly determined so that the results are independent.
• Randomization involves randomly assigning subjects, instances, or
conditions to experimental groups to eliminate bias and ensure each unit
has an equal chance of being in any group.
• Purpose: To avoid systematic errors and confounding effects that can
skew results. Randomization balances unknown factors across groups,
making results more generalizable.
• Example:
• In a clinical trial testing a new drug, patients are randomly assigned to either the
drug group or the placebo group. This ensures that any differences in outcomes are
more likely due to the drug itself, not pre-existing differences between groups.
• In machine learning, when splitting a dataset into training and testing sets, random
shuffling ensures that the split is representative of the overall dataset and reduces
bias.
Randomization
Randomization forms the basis of a valid experiment, but replication is
also needed for the experiment to be valid.
If the randomization process is such that every experimental unit has an
equal chance of receiving each treatment, it is called complete
randomization.
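A small NumPy sketch of complete randomization when splitting a dataset; the toy data and the 80/20 proportion are arbitrary choices:

```python
# Sketch: randomly shuffle instances before splitting so that every instance
# has an equal chance of landing in either set. Pure NumPy; toy data only.
import numpy as np

rng = np.random.default_rng(seed=42)         # seeded so the split is reproducible
X = rng.normal(size=(100, 5))                # toy data standing in for a real dataset
y = rng.integers(0, 2, size=100)

perm = rng.permutation(len(X))               # every instance equally likely in any position
cut = int(0.8 * len(X))                      # 80/20 split (arbitrary choice)
train_idx, test_idx = perm[:cut], perm[cut:]
X_train, y_train = X[train_idx], y[train_idx]
X_test,  y_test  = X[test_idx],  y[test_idx]
```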
Replication
• Replication implies that for the same configuration of (controllable)
factors, the experiment should be run a number of times to average
over the effect of uncontrollable factors.
• ie it refers to repeating the experiment multiple times to confirm
the consistency and reliability of results. It can involve duplicating
the entire experiment or testing across different datasets or
conditions.
• In machine learning, this is typically done by running the same
algorithm on a number of resampled versions of the same dataset
which is known as cross-validation.
Replication
Purpose: To ensure results are not due to random chance and to
identify variations in data.
Example: In agriculture, testing the yield of a new crop variety in
different regions (under varied soil and weather conditions) ensures
that the findings are consistent and robust across environments.
In machine learning, running a model training multiple times with
different random initializations (e.g., weights in neural networks) and
averaging the performance metrics can account for randomness in the
training process.
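One way such replication over random initializations might look in code, assuming scikit-learn; the synthetic dataset and network size are placeholders:

```python
# Sketch: replicate a neural-network training run with different random
# initializations and average the resulting test accuracies.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

X, y = make_classification(n_samples=500, n_features=20, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

accs = []
for seed in range(5):                                    # 5 replicated runs
    mlp = MLPClassifier(hidden_layer_sizes=(10,), max_iter=500,
                        random_state=seed)               # only the initialization changes
    mlp.fit(X_tr, y_tr)
    accs.append(mlp.score(X_te, y_te))

print(f"mean accuracy {np.mean(accs):.3f} +/- {np.std(accs):.3f} over {len(accs)} runs")
```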
Replication

The relationship between the variance of a sample mean and the
sample size can be observed in machine learning, particularly in
bagging (Bootstrap Aggregating), a technique used in ensemble
learning methods like Random Forests.
Scenario: In a Random Forest, multiple decision trees are trained on
bootstrap samples (random subsets) of the training data. Each tree
predicts an output for a given input, and the final prediction is
obtained by averaging (for regression) or voting (for classification)
across the predictions of all trees.
Connection to Variance: If the variance of predictions made by a single
decision tree is σ², the variance of the mean prediction made by n trees is
approximately:
Var(Mean Prediction) = σ²/n
As 𝑛 (the number of trees) increases, the variance of the ensemble's
prediction decreases, leading to a more stable and robust model.
Practical Observation: With fewer trees (𝑛 small), the model's
predictions might be more sensitive to the noise in the training data,
resulting in higher variance.
As more trees are added (𝑛 increases), the predictions converge to a
more reliable output, as the variance of the sample mean diminishes.
Conclusion: This principle is foundational to ensemble methods,
where aggregating predictions reduces variance and improves model
generalization. Thus, increasing the sample size (number of
observations or base models) enhances stability and reduces error
variance, analogous to the statistical property of the sample mean.
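A quick numerical check of the Var(Mean Prediction) = σ²/n relationship using simulated, independent predictions (real trees in a forest are correlated, so the actual reduction is somewhat smaller):

```python
# Sketch: empirical check that averaging n independent predictions with
# variance sigma^2 yields a mean whose variance is roughly sigma^2 / n.
import numpy as np

rng = np.random.default_rng(0)
sigma2 = 4.0                                    # variance of a single "tree's" prediction
for n in (1, 10, 100):
    # 10,000 simulated ensembles, each averaging n independent predictions
    preds = rng.normal(loc=0.0, scale=np.sqrt(sigma2), size=(10_000, n))
    ensemble_mean = preds.mean(axis=1)
    print(f"n={n:4d}  empirical Var(mean)={ensemble_mean.var():.3f}  "
          f"theory sigma^2/n={sigma2 / n:.3f}")
```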
Blocking (Local Control)
• Blocking is used to reduce or eliminate the variability due to
nuisance factors (confounding factors) that influence the response
but in which we are not interested.
• It is a technique used in experimental design to control for factors
that might influence the results but aren't of primary interest. It's
like grouping similar things together to isolate the effect of the
things you do care about.
Blocking in Machine Learning Experiments:
• When comparing machine learning algorithms:
• Objective: Ensure that the differences in performance are due to the
algorithms themselves and not due to variability introduced by different
subsets of data.
• Method: Use the same training and testing splits (or resampled subsets)
for all algorithms being compared.
Why Blocking is Important:
1. Without Blocking:
• If different algorithms use different training subsets, the observed differences in
accuracy might stem from the data split rather than the algorithms' performance.
• For example, one algorithm might perform better simply because it had a "lucky"
data split with easier examples.
2. With Blocking:
• By using identical subsets across replicated runs, the variability due to data splitting
is minimized, and the differences in performance reflect only the algorithms'
capabilities.
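A sketch of blocking on the data split, assuming scikit-learn: both classifiers are scored on exactly the same cross-validation folds, so their per-fold errors are directly comparable. The dataset and the two algorithms are placeholders:

```python
# Sketch: block on the data split by evaluating every algorithm on the
# identical cross-validation folds (same KFold object, same random_state).
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import KFold, cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)
cv = KFold(n_splits=10, shuffle=True, random_state=0)    # the shared "block"

for name, clf in [("k-NN", KNeighborsClassifier()),
                  ("tree", DecisionTreeClassifier(random_state=0))]:
    scores = cross_val_score(clf, X, y, cv=cv)            # same folds for both algorithms
    print(f"{name}: per-fold errors = {(1 - scores).round(3)}")
```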
Example from Machine Learning
• Pairing: If you're comparing two groups (like two different algorithms),
you might want to pair similar data points together. This helps you isolate
the effect of the algorithms and avoid confounding factors.

• Confounding Variables: A variable that is connected to both the
dependent and independent variables but is not a component of the
hypothesis being tested is referred to as a confounder. Confounding
factors have the potential to skew an experiment's findings and lead to
false conclusions. Confounding variables must be understood in
experimental design and taken into account when creating the
experiment.
Here's an example:
• Imagine you're testing two different fertilizers on plants. You want
to know which fertilizer helps plants grow taller. But you also know
that the amount of sunlight a plant gets can affect its growth. So,
you divide your plants into groups based on how much sunlight
they get (e.g., sunny, partly sunny, shady). This is blocking. By
grouping plants based on sunlight, you can isolate the effect of the
fertilizer and see if it really makes a difference.
• Here, sunlight is the confounding factor.


Guidelines for Machine Learning Experiments
Before we start experimentation, we need to have a good idea about
1. what it is we are studying,
2. how the data is to be collected,
3. how we are planning to analyze it.
A. Aim of the study
• We need to start by stating the problem clearly, defining what the objectives
are. In machine learning, there may be several possibilities.
There are various goals one might pursue:
1. Assessing Expected Error: Determine if the learning algorithm achieves an
acceptable error level on a specific problem.
2. Comparing Two Algorithms: Evaluate which of two algorithms has a lower
generalization error for a given dataset. This could involve comparing different
algorithms or an improved version of one (e.g., by using better features).
3. Ranking Multiple Algorithms: For a given dataset, compare the performance of
more than two algorithms and rank them based on error or other performance
measures.
4. Cross-Dataset Comparison: Evaluate and compare algorithms across multiple
datasets to understand their general performance and robustness.
B. Selection of the Response Variable
We need to decide on what we should use as the quality measure.
1. Misclassification Error: Used for classification tasks to measure the
rate of incorrect predictions.
2. Mean Square Error (MSE): Applied in regression problems to
quantify the average squared difference between predicted and
actual values.
3. Precision and Recall: Widely used in information retrieval tasks to
evaluate the relevance of retrieved items.
Choosing the right measure ensures that the evaluation aligns with
the objectives of the task and the specific application context.
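For reference, these response variables map onto standard library functions; a small sketch using scikit-learn's metrics module on made-up labels:

```python
# Sketch: computing the three response variables mentioned above on toy data.
from sklearn.metrics import accuracy_score, mean_squared_error, precision_score, recall_score

# Classification: misclassification error = 1 - accuracy
y_true = [1, 0, 1, 1, 0, 1]
y_pred = [1, 0, 0, 1, 0, 1]
print("misclassification error:", 1 - accuracy_score(y_true, y_pred))
print("precision:", precision_score(y_true, y_pred))
print("recall:   ", recall_score(y_true, y_pred))

# Regression: mean squared error
print("MSE:", mean_squared_error([2.5, 0.0, 2.1], [3.0, -0.1, 2.0]))
```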
C. Choice of Factors and Levels
The factors in a machine learning experiment depend on the study's
goal. Examples include:
1. Hyperparameters: If optimizing an algorithm, the hyperparameters
(e.g., k in k-nearest neighbors) are the factors.
2. Algorithms: If comparing different learning algorithms, they serve
as the factors.
3. Datasets: When analyzing performance across multiple datasets,
the datasets become factors.
C. Choice of Factors and Levels
• It is always good to try to normalize factor levels. For example, in
optimizing k of k-nearest neighbor, one can try values such as 1, 3,
5, and so on
• It is also important to investigate all factors and factor levels that
may be of importance and not be overly influenced by past
experience.
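A sketch of treating k as the experimental factor with levels 1, 3, 5, ... and measuring cross-validated error as the response at each level (scikit-learn assumed; the dataset is a placeholder):

```python
# Sketch: the hyperparameter k is the factor; its levels are 1, 3, 5, ...
# Cross-validated error is the response measured at each level.
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)
for k in (1, 3, 5, 7, 9):                            # factor levels
    scores = cross_val_score(KNeighborsClassifier(n_neighbors=k), X, y, cv=5)
    print(f"k={k}: mean CV error = {1 - scores.mean():.3f}")
```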
D. Choice of Experimental Design
1. Factorial Design: Prefer factorial design as factors often interact. Avoid assuming
independence unless proven otherwise.
2. Replication:
1. Small datasets require more replication to ensure statistical reliability and valid comparisons.
2. Larger datasets may need fewer replicates but should still provide enough data to analyze
distributions effectively.
3. Dataset Division :
1. Separate data into test, training, and validation sets. Resampling techniques are often used to
improve reliability.
2. Small datasets tend to yield high variance in results, making conclusions less significant or
inconclusive.
4. Real-world Data:
1. Use real-world datasets collected under authentic conditions for experiments.
2. Synthetic or low-dimensional datasets may provide intuition but do not reliably represent
algorithm performance in high-dimensional, practical scenarios.

In short, proper design, realistic data, and careful replication enhance the validity and
applicability of machine learning experiments.
E. Performing the Experiment
Emphasizes the importance of planning, testing, and adhering to best
practices for reliable and unbiased machine learning experimentation.
It combines careful preparation with professional standards to
minimize errors, promote reproducibility, and ensure objective
evaluations of algorithms. Some important points to be checked here
are
• Trial Runs: Conduct a few preliminary runs with random settings to
ensure everything is working as expected before committing to a
large-scale experiment. This step helps catch potential errors early,
avoiding wasted effort on flawed setups
• Reproducibility and Backup: Save intermediate results or the random number
generator seeds to allow partial reruns of the experiment. Ensure all results are
reproducible, an essential aspect of scientific rigor (see the seeding sketch after
this list).
• Software Aging: Be cautious of issues like software aging in long-running
experiments, where performance may degrade over time due to bugs, memory
leaks, or system errors.
• Unbiased Experimentation: Maintain objectivity, especially when comparing
algorithms. Both your algorithm and competitors' should receive equal effort and
diligence. In large-scale studies, consider separating the roles of testers and
developers to minimize bias.
• Use Reliable Code: Prefer reliable, well-tested libraries over custom-built solutions
to leverage the robustness and optimization of established codebases.
• Documentation: Properly document experiments to ensure clarity and facilitate
collaboration, especially in group projects. Use standard software engineering
practices for quality and maintainability in machine learning experiments.
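A minimal sketch of the seeding and backup idea mentioned above; run_experiment, the file name, and the result structure are hypothetical placeholders for whatever routine the study actually runs:

```python
# Sketch: record the seeds alongside the results so any individual run can be
# reproduced later. run_experiment() is a placeholder, not a real routine.
import json
import random
import numpy as np

def run_experiment(seed: int) -> float:
    """Placeholder for the study's actual training + evaluation routine."""
    return random.random()            # stand-in for a real validation error

results = []
for seed in range(10):
    random.seed(seed)                 # seed every source of randomness in use
    np.random.seed(seed)
    error = run_experiment(seed=seed)
    results.append({"seed": seed, "error": error})

with open("results.json", "w") as f:  # intermediate results saved for partial reruns
    json.dump(results, f, indent=2)
```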
F. Statistical Analysis of the Data
Emphasizes the importance of using statistical rigor and visual aids to ensure
findings in machine learning experiments are reliable and interpretable.
• Objective Analysis: The goal is to derive conclusions that are objective and
not influenced by chance or bias. This involves framing research questions as
statistical hypotheses.
• Hypothesis Testing: Questions like "Is algorithm A better than algorithm B?"
are translated into testable hypotheses, such as "The average error of
algorithm A is significantly lower than that of algorithm B." Statistical
methods are used to determine whether the data supports this hypothesis
(see the paired-test sketch after this list).
• Visualization:
• Visual tools like histograms, box-and-whisker plots, and range plots
are helpful for exploring data and understanding error distributions.
• These visualizations complement statistical tests, providing an
intuitive sense of variability and differences.
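A sketch of turning such a question into a test, assuming SciPy: the per-fold errors of two algorithms obtained on identical folds (as in the blocking example) are compared with a paired t-test. The error values below are illustrative numbers only:

```python
# Sketch: paired t-test on per-fold errors of two algorithms evaluated on
# identical folds. errors_a / errors_b are illustrative numbers, not real results.
import numpy as np
from scipy import stats

errors_a = np.array([0.12, 0.15, 0.10, 0.14, 0.11, 0.13, 0.12, 0.16, 0.10, 0.12])
errors_b = np.array([0.15, 0.17, 0.14, 0.16, 0.13, 0.18, 0.15, 0.17, 0.14, 0.16])

t_stat, p_value = stats.ttest_rel(errors_a, errors_b)    # paired test, same folds
print(f"t = {t_stat:.2f}, p = {p_value:.4f}")
if p_value < 0.05:
    print("difference in mean error is statistically significant at the 5% level")
else:
    print("no significant difference detected")
```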
G. Conclusions and Recommendation
• Iterative Nature of Experiments: Machine learning experiments are
iterative processes. Initial experiments are exploratory, and only a
fraction (e.g., 25%) of resources should be invested initially. Further
experimentation is often required to refine methods and results.
• Statistical Hypotheses: Statistical testing evaluates how well the sample
data supports a hypothesis but does not confirm its absolute truth. Small,
noisy datasets increase the risk of inconclusive or erroneous conclusions.
• Learning from Failures: When results do not meet expectations,
analyzing deficiencies can lead to insights for improvement.
Enhancements in algorithms often stem from identifying shortcomings in
prior versions.
• Thorough Analysis Before Next Steps: Before testing improved versions,
ensure that all insights from the current experiment have been
thoroughly explored and understood. Ideas are valuable only when
rigorously tested, and testing demands resources and effort.
