ML VN Unit1
Machine Learning
Varsha Nemade
Broad Categories of ML
Supervised learning
• Supervised
– In supervised learning, the training set fed to the algorithm includes the desired solutions, called labels.
– Analogy: a teacher telling the learner the correct answers.
– Training data comes with labels.
• Classification
– Classifying the output into categories/classes/labels.
» Applications:
• Classifying email as spam or not spam
• Classifying images (yes/no)
• Disease detection (yes/no)
• Regression
– Predicting a continuous output value.
– X: independent variable(s)
– Y: dependent variable
» Applications:
• Predicting price
• Predicting income
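The two supervised tasks above can be sketched in a few lines (a minimal illustration assuming scikit-learn is available; the toy data is made up):

```python
# Supervised learning sketch: a classifier learns class labels from
# labelled data, a regressor predicts a continuous value.
from sklearn.linear_model import LinearRegression, LogisticRegression

# Classification: X = feature vectors, y = class labels (spam = 1, not spam = 0)
X_cls = [[0.1, 0.9], [0.9, 0.1], [0.2, 0.8], [0.8, 0.3]]
y_cls = [1, 0, 1, 0]
clf = LogisticRegression().fit(X_cls, y_cls)
pred_class = clf.predict([[0.15, 0.85]])[0]

# Regression: predict a continuous value (e.g. price) from one
# independent variable X
X_reg = [[1], [2], [3], [4]]        # e.g. house size
y_reg = [100, 200, 300, 400]        # e.g. price
reg = LinearRegression().fit(X_reg, y_reg)
pred_price = reg.predict([[5]])[0]  # linear data, so close to 500
```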
Examples
• Linear Regression
• Logistic Regression (classification, despite its name)
• K-Nearest Neighbors (classification/regression)
• Support Vector Machine (SVC: classification, SVR: regression)
• Decision Tree
• Naïve Bayes
• Random Forest
• Ensemble Learning
• Neural Networks
• Advantages:
• Since supervised learning works with a labelled dataset, we have an exact idea of the classes of objects.
• These algorithms help predict outputs on the basis of prior experience.
• Disadvantages:
• These algorithms are not suited to highly complex tasks.
• They may predict the wrong output if the test data differs from the training data.
• Training can require a lot of computational time.
Unsupervised learning
• In unsupervised learning, the training data has no labels; the algorithm tries to find structure (such as clusters) in the data on its own.
Applications
• Network Analysis: unsupervised learning is used to identify plagiarism and copyright issues through document network analysis of text data in scholarly articles.
• Recommendation Systems: recommendation systems widely use unsupervised learning techniques to build recommendation features for web applications and e-commerce websites.
• Anomaly Detection: a popular application of unsupervised learning that identifies unusual data points within a dataset; used, for example, to discover fraudulent transactions.
• Singular Value Decomposition: SVD is used to extract particular information from a database, for example, extracting information about users located in a particular area.
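As a minimal unsupervised-learning sketch (assuming scikit-learn; the toy points are made up), k-means groups unlabelled points into clusters without being given any labels:

```python
# Unsupervised learning sketch: k-means clustering on unlabelled 2-D points.
from sklearn.cluster import KMeans

# Two visually obvious groups of points; no labels supplied
X = [[1.0, 1.0], [1.2, 0.9], [0.9, 1.1],
     [8.0, 8.0], [8.1, 7.9], [7.9, 8.2]]
km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)
labels = km.labels_
# Points within the same group receive the same cluster id
```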
• Advantages:
• These algorithms can be used for more complicated tasks than supervised ones, because they work on unlabelled data.
• Unsupervised algorithms are preferable for many tasks, as unlabelled datasets are easier to obtain than labelled ones.
• Disadvantages:
• The output can be less accurate, since the dataset is not labelled and the algorithm is not trained on the exact output in advance.
• Working with unsupervised learning is more difficult, as the unlabelled data does not map to a known output.
Semi-supervised Learning
• Semi-supervised learning is a type of machine learning that lies between supervised and unsupervised machine learning.
• The main aim of semi-supervised learning is to make effective use of all the available data, rather than only the labelled data as in supervised learning. Typically, similar data is first clustered with an unsupervised learning algorithm, which then helps label the unlabelled data. This is worthwhile because labelled data is considerably more expensive to acquire than unlabelled data.
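The idea of propagating labels to unlabelled data can be sketched with scikit-learn's SelfTrainingClassifier (a self-training variant rather than explicit clustering; unlabelled samples are marked with -1, and the toy data is made up):

```python
# Semi-supervised learning sketch: self-training on mostly unlabelled data.
from sklearn.linear_model import LogisticRegression
from sklearn.semi_supervised import SelfTrainingClassifier

X = [[0.0], [0.1], [0.2], [0.9], [1.0], [1.1]]
y = [0, -1, -1, -1, -1, 1]          # only the first and last points are labelled
model = SelfTrainingClassifier(LogisticRegression()).fit(X, y)
preds = model.predict([[0.05], [1.05]])
# Low values fall in class 0, high values in class 1
```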
• We can imagine these paradigms with an example. Supervised learning is where a student is under the supervision of an instructor at home and college. If that student instead analyses the same concept on their own, without any help from the instructor, it comes under unsupervised learning. Under semi-supervised learning, the student revises the concept on their own after first studying it under the guidance of an instructor at college.
Advantages and disadvantages of
Semi-supervised Learning
• Advantages:
• The algorithm is simple and easy to understand.
• It is highly efficient.
• It addresses drawbacks of both supervised and unsupervised learning algorithms.
• Disadvantages:
• Iteration results may not be stable.
• These algorithms cannot be applied to network-level data.
• Accuracy can be low.
Reinforcement Learning
– An agent learns by receiving rewards and updating its policy.
– The Mario game is an example.
– It is used by robots to learn how to walk.
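The reward-and-policy-update loop can be sketched with tabular Q-learning on a toy corridor environment (pure Python; the environment and hyperparameters are made up for illustration):

```python
# Reinforcement learning sketch: tabular Q-learning on a 5-cell corridor.
# The agent earns a reward of +1 for reaching the goal at the right end,
# and the policy is derived from the learned Q-values.
import random

n_states, goal = 5, 4
actions = [-1, +1]                  # move left / move right
Q = {(s, a): 0.0 for s in range(n_states) for a in actions}
alpha, gamma = 0.5, 0.9

random.seed(0)
for _ in range(2000):               # training episodes
    s = 0
    while s != goal:
        a = random.choice(actions)  # explore randomly; Q-learning is off-policy
        s2 = min(max(s + a, 0), n_states - 1)
        r = 1.0 if s2 == goal else 0.0
        # Q-learning update: move Q(s,a) toward r + gamma * max_a' Q(s',a')
        best_next = max(Q[(s2, b)] for b in actions)
        Q[(s, a)] += alpha * (r + gamma * best_next - Q[(s, a)])
        s = s2

# The learned policy prefers moving right in every non-goal state
policy = {s: max(actions, key=lambda a: Q[(s, a)]) for s in range(goal)}
```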
Advantages and Disadvantages of
Reinforcement Learning
• Advantages
• It helps in solving complex real-world problems that are difficult to solve with general techniques.
• The learning model of RL is similar to how human beings learn; hence highly accurate results can be achieved.
• It helps in achieving long-term results.
• Disadvantages
• RL algorithms are not preferred for simple problems.
• RL algorithms require huge amounts of data and computation.
• Too much reinforcement can lead to an overload of states, which can weaken the results.
Curse of dimensionality
• Handling high-dimensional data is very difficult in practice; this is commonly known as the curse of dimensionality. As the dimensionality of the input dataset increases, any machine learning model becomes more complex. As the number of features grows, the number of samples required to cover the input space also grows rapidly, and the chance of overfitting increases. A model trained on high-dimensional data therefore tends to overfit and perform poorly.
• Hence, it is often necessary to reduce the number of features, which can be done with dimensionality reduction.
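A minimal dimensionality-reduction sketch, assuming scikit-learn's PCA and a made-up toy matrix whose third feature is redundant:

```python
# Dimensionality reduction sketch: PCA projects correlated features
# onto fewer components.
from sklearn.decomposition import PCA

# 3 features, but the third is a multiple of the others (redundant)
X = [[1, 2, 3], [2, 4, 6], [3, 6, 9], [4, 8, 12]]
pca = PCA(n_components=1)
X_reduced = pca.fit_transform(X)    # shape (4, 1): one feature left
ratio = pca.explained_variance_ratio_[0]
# A single component captures essentially all of the variance here
```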
Benefits of applying Dimensionality Reduction
• By reducing the dimensions of the features, the space required to store the dataset is also reduced.
• Less computation and training time is required with fewer feature dimensions.
• Reduced feature dimensions help in visualizing the data quickly.
• It removes redundant features (if present) by taking care of multicollinearity.
• Disadvantages of Dimensionality Reduction
• Some information is lost, possibly degrading the performance of subsequent training algorithms.
• It makes the independent variables less interpretable.
• In the PCA technique, the number of principal components to retain is sometimes not known in advance.
Model Selection
• Model selection is the process of choosing the best model by comparing and validating candidates with various parameters, and picking the final one.
• We have to compare the relative performance between two or more models for the given, cleaned data set.
For model selection, bias and variance are important factors.
During model selection we should have sufficient data in hand. In an ideal situation, the data is split into three different sets:
• Training set: used to fit the models.
• Validation set: used to estimate the prediction error for each model.
• Test set: used to assess the generalization error of the final model.
Once this process has been completed, the final model can be selected from the list of candidate models.
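The three-way split can be sketched with two calls to scikit-learn's train_test_split (a common approach; the 60/20/20 proportions are an assumption for illustration):

```python
# Model selection sketch: carve the data into train / validation / test sets.
from sklearn.model_selection import train_test_split

X = list(range(100))
y = [v % 2 for v in X]

# First split off 60% for training; then split the remaining 40%
# evenly into validation and test sets.
X_train, X_tmp, y_train, y_tmp = train_test_split(
    X, y, test_size=0.4, random_state=0)
X_val, X_test, y_val, y_test = train_test_split(
    X_tmp, y_tmp, test_size=0.5, random_state=0)
```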
Bias: Bias is an error introduced in our model due to oversimplification of the machine learning algorithm used. The basic problem is that the algorithm is not strong enough to capture the patterns or trends in the data set; the root cause is data that is too complex for the algorithm to model. The model ends up with low accuracy, and this leads to underfitting.
Variance: Variance is an error introduced in our model due to the selection of an overly complex machine learning algorithm that fits the noise in the given dataset, resulting in high sensitivity and overfitting. You can observe that the model performs well on the training dataset but poorly on the testing dataset.
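The underfitting/overfitting contrast can be sketched with NumPy alone, fitting a low-degree and a high-degree polynomial to noisy samples of a sine curve (toy data; the degrees are chosen purely for illustration):

```python
# Bias vs variance sketch: a too-simple model underfits (high bias),
# a too-flexible model fits the training noise (high variance).
import numpy as np

rng = np.random.default_rng(0)
x_train = np.linspace(0, 1, 15)
x_test = np.linspace(0.02, 0.98, 15)
true_fn = lambda x: np.sin(2 * np.pi * x)
y_train = true_fn(x_train) + rng.normal(0, 0.2, x_train.size)
y_test = true_fn(x_test) + rng.normal(0, 0.2, x_test.size)

def errors(degree):
    """Fit a polynomial of the given degree; return (train MSE, test MSE)."""
    coeffs = np.polyfit(x_train, y_train, degree)
    mse = lambda x, y: float(np.mean((np.polyval(coeffs, x) - y) ** 2))
    return mse(x_train, y_train), mse(x_test, y_test)

train_lo, test_lo = errors(1)   # high bias: poor fit on both sets
train_hi, test_hi = errors(9)   # high variance: near-perfect on train,
                                # noticeably worse on test
```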
Types of Model Selection
There are two major techniques in model selection; as mentioned earlier, a model is a mathematical representation whose patterns are extracted from the given dataset.
• Resampling
• Probabilistic
Resampling: These are simple techniques that rearrange data samples and inspect whether the model performs well or poorly on the rearranged data set.
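Cross-validation is a standard resampling technique; a minimal sketch assuming scikit-learn and its bundled iris dataset:

```python
# Resampling sketch: 5-fold cross-validation repeatedly re-partitions the
# data into train/test folds and scores the model on each fold.
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

X, y = load_iris(return_X_y=True)
scores = cross_val_score(LogisticRegression(max_iter=1000), X, y, cv=5)
mean_score = scores.mean()      # average accuracy across the 5 folds
```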
No free lunch theorem
• The No Free Lunch Theorem is often thrown around in the fields of optimization and machine learning, often with little understanding of what it means or implies.
• The theorem states that all optimization algorithms perform equally well when their performance is averaged across all possible problems.
• It implies that there is no single best optimization algorithm. Because of the close relationship between optimization, search, and machine learning, it also implies that there is no single best machine learning algorithm for predictive modeling problems such as classification and regression.
• The no free lunch theorem suggests that the performance of all optimization algorithms is identical, under some specific constraints.
• There is provably no single best optimization algorithm or machine learning algorithm.
• The practical implications of the theorem may be limited, given that we are interested in only a small subset of all possible objective functions.