ML VN Unit1
Machine Learning
Varsha Nemade
Broad Categories of ML
Supervised learning
• Supervised
– In supervised learning, the training set fed to the algorithm includes the desired solutions, called labels.
– Analogy: a teacher telling the learner the correct answers.
– Training data comes with labels.
• Classification
– Classifying the output into categories/classes/labels.
» Applications:
• Classifying email as spam or not spam
• Classifying images (yes/no)
• Disease detection (yes/no)
• Regression
– Predicting a continuous output value.
– X: independent variable(s)
– Y: dependent variable
» Applications:
• Predicting price
• Predicting income
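The two supervised tasks above can be sketched in a few lines (a minimal illustration assuming scikit-learn is available; the toy data is made up):

```python
# Supervised learning sketch: a classifier learns class labels from
# labelled data, a regressor predicts a continuous value.
from sklearn.linear_model import LinearRegression, LogisticRegression

# Classification: X = feature vectors, y = class labels (spam = 1, not spam = 0)
X_cls = [[0.1, 0.9], [0.9, 0.1], [0.2, 0.8], [0.8, 0.3]]
y_cls = [1, 0, 1, 0]
clf = LogisticRegression().fit(X_cls, y_cls)
pred_class = clf.predict([[0.15, 0.85]])[0]

# Regression: predict a continuous value (e.g. price) from one
# independent variable X
X_reg = [[1], [2], [3], [4]]        # e.g. house size
y_reg = [100, 200, 300, 400]        # e.g. price
reg = LinearRegression().fit(X_reg, y_reg)
pred_price = reg.predict([[5]])[0]  # linear data, so close to 500
```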
Examples
• Linear Regression
• Logistic Regression (classification, despite its name)
• K-Nearest Neighbors (classification/regression)
• Support Vector Machine (SVC: classification, SVR: regression)
• Decision Tree
• Naïve Bayes
• Random Forest
• Ensemble Learning
• Neural Networks
• Advantages:
• Since supervised learning works with a labelled dataset, we have an exact idea of the classes of objects.
• These algorithms help predict outputs on the basis of prior experience.
• Disadvantages:
• These algorithms are not suited to highly complex tasks.
• They may predict the wrong output if the test data differs from the training data.
• Training can require a lot of computational time.
Unsupervised learning
• In unsupervised learning, the training data has no labels; the algorithm tries to find structure (such as clusters) in the data on its own.
Applications
• Network Analysis: unsupervised learning is used to identify plagiarism and copyright issues through document network analysis of text data in scholarly articles.
• Recommendation Systems: recommendation systems widely use unsupervised learning techniques to build recommendation features for web applications and e-commerce websites.
• Anomaly Detection: a popular application of unsupervised learning that identifies unusual data points within a dataset; used, for example, to discover fraudulent transactions.
• Singular Value Decomposition: SVD is used to extract particular information from a database, for example, extracting information about users located in a particular area.
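As a minimal unsupervised-learning sketch (assuming scikit-learn; the toy points are made up), k-means groups unlabelled points into clusters without being given any labels:

```python
# Unsupervised learning sketch: k-means clustering on unlabelled 2-D points.
from sklearn.cluster import KMeans

# Two visually obvious groups of points; no labels supplied
X = [[1.0, 1.0], [1.2, 0.9], [0.9, 1.1],
     [8.0, 8.0], [8.1, 7.9], [7.9, 8.2]]
km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)
labels = km.labels_
# Points within the same group receive the same cluster id
```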
• Advantages:
• These algorithms can be used for more complicated tasks than supervised ones, because they work on unlabelled data.
• Unsupervised algorithms are preferable for many tasks, as unlabelled datasets are easier to obtain than labelled ones.
• Disadvantages:
• The output can be less accurate, since the dataset is not labelled and the algorithm is not trained on the exact output in advance.
• Working with unsupervised learning is more difficult, as the unlabelled data does not map to a known output.
Semi-supervised Learning
• Semi-supervised learning is a type of machine learning that lies between supervised and unsupervised machine learning.
• The main aim of semi-supervised learning is to make effective use of all the available data, rather than only the labelled data as in supervised learning. Typically, similar data is first clustered with an unsupervised learning algorithm, which then helps label the unlabelled data. This is worthwhile because labelled data is considerably more expensive to acquire than unlabelled data.
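The idea of propagating labels to unlabelled data can be sketched with scikit-learn's SelfTrainingClassifier (a self-training variant rather than explicit clustering; unlabelled samples are marked with -1, and the toy data is made up):

```python
# Semi-supervised learning sketch: self-training on mostly unlabelled data.
from sklearn.linear_model import LogisticRegression
from sklearn.semi_supervised import SelfTrainingClassifier

X = [[0.0], [0.1], [0.2], [0.9], [1.0], [1.1]]
y = [0, -1, -1, -1, -1, 1]          # only the first and last points are labelled
model = SelfTrainingClassifier(LogisticRegression()).fit(X, y)
preds = model.predict([[0.05], [1.05]])
# Low values fall in class 0, high values in class 1
```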
• We can imagine these paradigms with an example. Supervised learning is where a student is under the supervision of an instructor at home and college. If that student instead analyses the same concept on their own, without any help from the instructor, it comes under unsupervised learning. Under semi-supervised learning, the student revises the concept on their own after first studying it under the guidance of an instructor at college.
Advantages and disadvantages of
Semi-supervised Learning
• Advantages:
• The algorithm is simple and easy to understand.
• It is highly efficient.
• It addresses drawbacks of both supervised and unsupervised learning algorithms.
• Disadvantages:
• Iteration results may not be stable.
• These algorithms cannot be applied to network-level data.
• Accuracy can be low.
Reinforcement Learning
– An agent learns by receiving rewards and updating its policy.
– The Mario game is an example.
– It is used by robots to learn how to walk.
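The reward-and-policy-update loop can be sketched with tabular Q-learning on a toy corridor environment (pure Python; the environment and hyperparameters are made up for illustration):

```python
# Reinforcement learning sketch: tabular Q-learning on a 5-cell corridor.
# The agent earns a reward of +1 for reaching the goal at the right end,
# and the policy is derived from the learned Q-values.
import random

n_states, goal = 5, 4
actions = [-1, +1]                  # move left / move right
Q = {(s, a): 0.0 for s in range(n_states) for a in actions}
alpha, gamma = 0.5, 0.9

random.seed(0)
for _ in range(2000):               # training episodes
    s = 0
    while s != goal:
        a = random.choice(actions)  # explore randomly; Q-learning is off-policy
        s2 = min(max(s + a, 0), n_states - 1)
        r = 1.0 if s2 == goal else 0.0
        # Q-learning update: move Q(s,a) toward r + gamma * max_a' Q(s',a')
        best_next = max(Q[(s2, b)] for b in actions)
        Q[(s, a)] += alpha * (r + gamma * best_next - Q[(s, a)])
        s = s2

# The learned policy prefers moving right in every non-goal state
policy = {s: max(actions, key=lambda a: Q[(s, a)]) for s in range(goal)}
```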
Advantages and Disadvantages of
Reinforcement Learning
• Advantages
• It helps in solving complex real-world problems that are difficult to solve with general techniques.
• The learning model of RL is similar to how human beings learn; hence highly accurate results can be achieved.
• It helps in achieving long-term results.
• Disadvantages
• RL algorithms are not preferred for simple problems.
• RL algorithms require huge amounts of data and computation.
• Too much reinforcement can lead to an overload of states, which can weaken the results.
Curse of dimensionality
• Handling high-dimensional data is very difficult in practice; this is commonly known as the curse of dimensionality. As the dimensionality of the input dataset increases, any machine learning model becomes more complex. As the number of features grows, the number of samples required to cover the input space also grows rapidly, and the chance of overfitting increases. A model trained on high-dimensional data therefore tends to overfit and perform poorly.
• Hence, it is often necessary to reduce the number of features, which can be done with dimensionality reduction.
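A minimal dimensionality-reduction sketch, assuming scikit-learn's PCA and a made-up toy matrix whose third feature is redundant:

```python
# Dimensionality reduction sketch: PCA projects correlated features
# onto fewer components.
from sklearn.decomposition import PCA

# 3 features, but the third is a multiple of the others (redundant)
X = [[1, 2, 3], [2, 4, 6], [3, 6, 9], [4, 8, 12]]
pca = PCA(n_components=1)
X_reduced = pca.fit_transform(X)    # shape (4, 1): one feature left
ratio = pca.explained_variance_ratio_[0]
# A single component captures essentially all of the variance here
```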
Benefits of applying Dimensionality Reduction
• By reducing the dimensions of the features, the space required to store the dataset is also reduced.
• Less computation and training time is required with fewer feature dimensions.
• Reduced feature dimensions help in visualizing the data quickly.
• It removes redundant features (if present) by taking care of multicollinearity.
• Disadvantages of Dimensionality Reduction
• Some information is lost, possibly degrading the performance of subsequent training algorithms.
• It makes the independent variables less interpretable.
• In the PCA technique, the number of principal components to retain is sometimes not known in advance.
Model Selection
• Model selection is the process of choosing the best model by comparing and validating candidates with various parameters, and picking the final one.
• We have to compare the relative performance between two or more models for the given, cleaned data set.
For model selection, bias and variance are important factors.
During model selection we should have sufficient data in hand. In an ideal situation, the data is split into three different sets:
• Training set: used to fit the models.
• Validation set: used to estimate the prediction error for each model.
• Test set: used to assess the generalization error of the final model.
Once this process has been completed, the final model can be selected from the list of candidate models.
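The three-way split can be sketched with two calls to scikit-learn's train_test_split (a common approach; the 60/20/20 proportions are an assumption for illustration):

```python
# Model selection sketch: carve the data into train / validation / test sets.
from sklearn.model_selection import train_test_split

X = list(range(100))
y = [v % 2 for v in X]

# First split off 60% for training; then split the remaining 40%
# evenly into validation and test sets.
X_train, X_tmp, y_train, y_tmp = train_test_split(
    X, y, test_size=0.4, random_state=0)
X_val, X_test, y_val, y_test = train_test_split(
    X_tmp, y_tmp, test_size=0.5, random_state=0)
```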
Bias: Bias is an error introduced in our model due to oversimplification of the machine learning algorithm used. The basic problem is that the algorithm is not strong enough to capture the patterns or trends in the data set; the root cause is data that is too complex for the algorithm to model. The model ends up with low accuracy, and this leads to underfitting.
Variance: Variance is an error introduced in our model due to the selection of an overly complex machine learning algorithm that fits the noise in the given dataset, resulting in high sensitivity and overfitting. You can observe that the model performs well on the training dataset but poorly on the testing dataset.
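The underfitting/overfitting contrast can be sketched with NumPy alone, fitting a low-degree and a high-degree polynomial to noisy samples of a sine curve (toy data; the degrees are chosen purely for illustration):

```python
# Bias vs variance sketch: a too-simple model underfits (high bias),
# a too-flexible model fits the training noise (high variance).
import numpy as np

rng = np.random.default_rng(0)
x_train = np.linspace(0, 1, 15)
x_test = np.linspace(0.02, 0.98, 15)
true_fn = lambda x: np.sin(2 * np.pi * x)
y_train = true_fn(x_train) + rng.normal(0, 0.2, x_train.size)
y_test = true_fn(x_test) + rng.normal(0, 0.2, x_test.size)

def errors(degree):
    """Fit a polynomial of the given degree; return (train MSE, test MSE)."""
    coeffs = np.polyfit(x_train, y_train, degree)
    mse = lambda x, y: float(np.mean((np.polyval(coeffs, x) - y) ** 2))
    return mse(x_train, y_train), mse(x_test, y_test)

train_lo, test_lo = errors(1)   # high bias: poor fit on both sets
train_hi, test_hi = errors(9)   # high variance: near-perfect on train,
                                # noticeably worse on test
```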
Types of Model Selection
There are two major techniques in model selection; as mentioned earlier, a model is a mathematical representation whose patterns are extracted from the given dataset.
• Resampling
• Probabilistic
Resampling: These are simple techniques that rearrange data samples and inspect whether the model performs well or poorly on the rearranged data set.
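Cross-validation is a standard resampling technique; a minimal sketch assuming scikit-learn and its bundled iris dataset:

```python
# Resampling sketch: 5-fold cross-validation repeatedly re-partitions the
# data into train/test folds and scores the model on each fold.
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

X, y = load_iris(return_X_y=True)
scores = cross_val_score(LogisticRegression(max_iter=1000), X, y, cv=5)
mean_score = scores.mean()      # average accuracy across the 5 folds
```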
No free lunch theorem
• The No Free Lunch Theorem is often thrown around in the fields of optimization and machine learning, often with little understanding of what it means or implies.
• The theorem states that all optimization algorithms perform equally well when their performance is averaged across all possible problems.
• It implies that there is no single best optimization algorithm. Because of the close relationship between optimization, search, and machine learning, it also implies that there is no single best machine learning algorithm for predictive modeling problems such as classification and regression.
• The no free lunch theorem suggests that the performance of all optimization algorithms is identical, under some specific constraints.
• There is provably no single best optimization algorithm or machine learning algorithm.
• The practical implications of the theorem may be limited, given that we are interested in only a small subset of all possible objective functions.