ML Unit-1
Unit – I
Introduction- Artificial Intelligence, Machine Learning, Deep learning, Types of Machine Learning
Systems, Main Challenges of Machine Learning. Statistical Learning: Introduction, Supervised and
Unsupervised Learning, Training and Test Loss, Trade-offs in Statistical Learning, Estimating Risk
Statistics, Sampling distribution of an estimator, Empirical Risk Minimization.
CODETREE.GRPHY.COM
Machine Learning:
• Machine learning is a growing technology which enables computers to learn automatically from
past data.
• Machine learning uses various algorithms for building mathematical models and making predictions
using historical data or information.
• Currently, it is used for tasks such as image recognition, speech recognition, email
filtering, Facebook auto-tagging, recommender systems, and many more.
Arthur Samuel
• The term machine learning was first introduced by Arthur Samuel in 1959. It can be
summarized as follows:
• Machine learning enables a machine to learn automatically from data, improve its performance
with experience, and make predictions without being explicitly programmed.
Deep Learning:
• Deep learning is a branch of machine learning, which is itself a subset of
artificial intelligence.
• Neural networks are loosely modelled on the human brain, and so is deep learning. In deep
learning, features are learned from the data rather than programmed explicitly.
• It is a class of machine learning methods that uses many layers of nonlinear
processing units to perform feature extraction and transformation.
• IDEA: Deep learning is implemented with neural networks, which are inspired by
biological neurons, that is, brain cells.
• Deep learning is a collection of statistical machine learning techniques for learning
feature hierarchies, based on artificial neural networks.
• Example of Deep Learning:
TOPIC-2: Types of Machine Learning Systems
There are so many different types of Machine Learning systems that it is useful to classify them in broad
categories, based on the following criteria:
1. Whether or not they are trained with human supervision (supervised, unsupervised, semi-supervised, and
reinforcement learning).
2. Whether or not they can learn incrementally on the fly (online versus batch learning).
3. Whether they work by simply comparing new data points to known data points, or instead by detecting
patterns in the training data and building a predictive model, much like scientists do (instance-based versus
model-based learning).
1. Supervised Machine Learning: As its name suggests, supervised machine learning is based on
supervision.
• In supervised learning, we train the machine on a labelled dataset, and the trained machine
then predicts the output for new inputs.
• The main goal of supervised learning is to learn a mapping from the input variable (x) to the output
variable (y). Some real-world applications of supervised learning are risk assessment, fraud
detection, and spam filtering.
Categories of Supervised Machine Learning:
• Supervised machine learning can be classified into two types of problems, which are given below:
• Classification
• Regression
Classification: Classification algorithms are used to solve problems in which the output
variable is categorical, such as "Yes" or "No", "Male" or "Female", "Red" or "Blue", etc.
• Some real-world applications of classification are spam detection and email filtering.
Some popular classification algorithms are given below:
• Random Forest Algorithm
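To make the idea of predicting a categorical output from labelled data concrete, here is a minimal sketch of a classifier. It uses a 1-nearest-neighbour rule rather than the Random Forest algorithm named above, purely because it fits in a few lines; the toy data, the feature meanings, and the name `predict_label` are assumptions made for illustration.

```python
import math

# Hypothetical labelled training set: (features, label) pairs.
# Features are assumed to be (number of links, number of capitalised words).
TRAINING_DATA = [
    ((0.0, 1.0), "ham"),
    ((1.0, 0.0), "ham"),
    ((8.0, 9.0), "spam"),
    ((9.0, 7.0), "spam"),
]


def predict_label(x, training_data=TRAINING_DATA):
    """Return the label of the training point closest to x (1-nearest neighbour)."""
    _, nearest_label = min(training_data, key=lambda pair: math.dist(pair[0], x))
    return nearest_label


print(predict_label((0.5, 0.5)))
print(predict_label((8.5, 8.0)))
```

The output variable here takes one of two categories, which is what makes this a classification problem rather than a regression problem.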
Advantages and Disadvantages of Supervised Learning:
Advantages:
• Since supervised learning works with a labelled dataset, we have an exact idea of the
classes of objects.
• These algorithms are helpful in predicting the output on the basis of prior experience.
Disadvantages:
• These algorithms may struggle with very complex tasks.
• They may predict the wrong output if the test data differs from the training data.
• Training the algorithm can require a lot of computation time.
Unsupervised Learning can be further classified into two types, which are given below:
• Clustering
• Association
1) Clustering:
• The clustering technique is used when we want to find the inherent groups from the data.
• It is a way to group the objects into a cluster such that the objects with the most similarities remain
in one group and have fewer or no similarities with the objects of other groups.
• An example of the clustering algorithm is grouping the customers by their purchasing behavior.
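The grouping described above can be sketched with k-means, one common clustering algorithm. The notes do not name a specific algorithm, so the choice of k-means, the customer data, and the starting centroids are all assumptions for illustration.

```python
import math


def k_means(points, centroids, iterations=10):
    """Cluster points with a basic k-means loop.

    `centroids` is the initial guess; returns (centroids, assignments).
    """
    centroids = list(centroids)
    assignments = []
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        assignments = [
            min(range(len(centroids)), key=lambda k: math.dist(p, centroids[k]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its points.
        for k in range(len(centroids)):
            members = [p for p, a in zip(points, assignments) if a == k]
            if members:  # keep the old centroid if a cluster is empty
                centroids[k] = tuple(sum(c) / len(members) for c in zip(*members))
    return centroids, assignments


# Hypothetical customers: (visits per month, average basket value).
customers = [(1, 10), (2, 12), (1, 11), (9, 80), (10, 85), (8, 78)]
centroids, groups = k_means(customers, centroids=[(0, 0), (10, 100)])
print(groups)
```

No labels are supplied at any point: the two groups emerge only from the similarity between the points.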
2) Association:
• Association rule learning is an unsupervised learning technique, which finds interesting relations
among variables within a large dataset.
• The main aim of this learning algorithm is to find the dependency of one data item on another data
item and map those variables accordingly so that it can generate maximum profit.
• Some popular algorithms of Association rule learning are Apriori Algorithm, Eclat, FP-growth
algorithm.
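The dependency of one item on another is usually quantified with support and confidence. The sketch below computes both for a toy set of baskets; it is not an implementation of Apriori, Eclat, or FP-growth, and the transactions and function names are assumptions for illustration.

```python
# Hypothetical shopping baskets.
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]


def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)


def confidence(antecedent, consequent, transactions):
    """Estimate of P(consequent | antecedent): support(both) / support(antecedent)."""
    both = set(antecedent) | set(consequent)
    return support(both, transactions) / support(antecedent, transactions)


print(support({"bread", "milk"}, transactions))
print(confidence({"butter"}, {"bread"}, transactions))
```

A rule such as "butter implies bread" is considered interesting when both its support and its confidence are high enough; algorithms like Apriori mainly differ in how efficiently they search for such rules.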
Disadvantages:
• The output of an unsupervised algorithm can be less accurate because the dataset is not labelled, and
the algorithm is not trained with the exact output in advance.
• Working with unsupervised learning is more difficult because the unlabelled dataset
does not map inputs to outputs.
3. Semi-Supervised Learning:
• Semi-Supervised learning is a type of Machine Learning algorithm that lies between
Supervised and Unsupervised machine learning.
• It is the middle ground between supervised learning (with labelled training data) and
unsupervised learning (with no labelled training data), and it uses a combination of
labelled and unlabelled data during training.
• Semi-supervised learning was introduced to overcome the drawbacks of supervised and
unsupervised learning algorithms.
• We can picture these algorithms with an example. Supervised learning is like a student who is under
the supervision of an instructor both at home and at college.
• If the student analyses the same concept on their own, without any help from the instructor, it
is like unsupervised learning.
• Under semi-supervised learning, the student first studies the concept under the guidance of an
instructor at college and then revises it on their own.
Advantages:
• The algorithm is simple and easy to understand.
• It is highly efficient.
• It is used to solve drawbacks of Supervised and Unsupervised Learning algorithms.
Disadvantages:
• Iteration results may not be stable.
• We cannot apply these algorithms to network-level data.
• Accuracy is low.
4. Reinforcement Learning:
• Reinforcement learning is a feedback-based process in which an AI agent (a software
component) automatically explores its surroundings by trial and error: taking actions, learning
from experience, and improving its performance.
• The agent is rewarded for each good action and punished for each bad action; hence the goal of a
reinforcement learning agent is to maximize its rewards.
• In reinforcement learning, there is no labelled data like supervised learning, and agents learn from
their experiences only.
• The reinforcement learning process is similar to how a human learns; for example, a child learns various
things through experience in day-to-day life.
• An example of reinforcement learning is playing a game, where the game is the environment, the moves
of the agent at each step define the states, and the goal of the agent is to get a high score.
• The agent receives feedback in the form of rewards and punishments.
• Because of the way it works, reinforcement learning is employed in fields such as game
theory, operations research, information theory, and multi-agent systems.
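The reward-and-punishment loop described above can be sketched with tabular Q-learning, one standard reinforcement learning algorithm. The notes do not prescribe an algorithm, so Q-learning, the five-cell corridor environment, the reward values, and all parameter values below are assumptions for illustration.

```python
import random

N_STATES = 5          # corridor cells 0..4; cell 4 is the goal
ACTIONS = (-1, +1)    # index 0 = move left, index 1 = move right


def step(state, action_index):
    """Environment: return (next_state, reward, done)."""
    next_state = min(max(state + ACTIONS[action_index], 0), N_STATES - 1)
    if next_state == N_STATES - 1:
        return next_state, 1.0, True      # reward for reaching the goal
    return next_state, -0.1, False        # small punishment for every other move


def train(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Tabular Q-learning with epsilon-greedy exploration."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _episode in range(episodes):
        state = 0
        for _move in range(100):  # cap on episode length
            if rng.random() < epsilon:
                action = rng.randrange(2)                        # explore
            else:
                action = max((0, 1), key=lambda a: q[state][a])  # exploit
            next_state, reward, done = step(state, action)
            # Move the estimate toward reward + discounted best future value.
            target = reward + (0.0 if done else gamma * max(q[next_state]))
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
            if done:
                break
    return q


q_table = train()
policy = ["left" if q[0] > q[1] else "right" for q in q_table[:-1]]
print(policy)
```

The agent is never told which move is correct; it only sees rewards and punishments, and the learned policy is read off from the action with the highest estimated value in each state.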
Categories of Reinforcement Learning:
• Reinforcement learning is categorized mainly into two types of methods/algorithms:
Real-world Use cases of Reinforcement Learning
• Video Games
• Robotics
• Text Mining
4) Talent Deficit
Although many people are drawn to the ML industry, there are still few experts who
can take complete control of this technology.
5) Implementation
Organizations often already have analytics engines in place when they decide to move to ML.
Integrating newer ML methods with existing processes is a complicated task.
6) Making The Wrong Assumptions
Many ML models cannot handle datasets that contain missing data points. Features with a large share
of missing data may therefore need to be removed.
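Removing features that are mostly missing can be sketched as follows. The 50% threshold, the record layout (a list of dicts with `None` for missing values), and the name `drop_sparse_features` are assumptions for illustration; in practice the threshold depends on the dataset.

```python
def drop_sparse_features(rows, max_missing_fraction=0.5):
    """Remove columns whose share of missing (None) values exceeds the threshold.

    `rows` is a list of dicts that all share the same keys.
    """
    n_rows = len(rows)
    keep = [
        name
        for name in rows[0]
        if sum(row[name] is None for row in rows) / n_rows <= max_missing_fraction
    ]
    return [{name: row[name] for name in keep} for row in rows]


# Hypothetical records; None marks a missing value.
records = [
    {"age": 34, "income": None, "city": "Pune"},
    {"age": 41, "income": None, "city": "Delhi"},
    {"age": None, "income": 52000, "city": "Mumbai"},
    {"age": 29, "income": None, "city": "Chennai"},
]
cleaned = drop_sparse_features(records)
print(list(cleaned[0]))
```

Here `income` is missing in three of four records and is dropped, while `age` is kept even though one value is missing; the remaining gap would still need to be handled, for example by imputation.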
7) Deficient Infrastructure
ML requires a tremendous amount of data-processing capability. Legacy systems often cannot handle the
workload and buckle under the pressure.
10) Customer Segmentation
Consider data on a user's behaviour during a trial period together with their relevant past
behaviour. An algorithm is needed to identify which customers will convert to
the paid version of a product and which will not.
[Figure: diagram with the labels Classification, Regression, Neural Networks, Naive Bayesian Model, Support Vector Machines, and Random Forest Model]
11) Complexity
Although Machine Learning and Artificial Intelligence are booming, most applications in these fields are still in
an experimental phase and are being developed by trial and error.
12) Slow Results
Another common issue in Machine Learning is slow execution. Machine
Learning models can produce accurate results, but those results take time to produce.
13) Maintenance
The results required for different actions are bound to change over time, and so is the data needed to
produce them.
TOPIC-4 Statistical Learning: Introduction
• Although structuring and visualizing data are important aspects of data science, the main challenge lies in the
mathematical analysis of the data.
• When the goal is to interpret the model and quantify the uncertainty in the data, this analysis is
usually referred to as statistical learning.
There are two major goals for modeling data:
• 1) To accurately predict some future quantity of interest, given some observed data, and
• 2) To discover unusual or interesting patterns in the data.
• In contrast to regression, when y can only take values in a finite set, say y ∈ {0, ..., c − 1}, predicting y is
conceptually the same as classifying the input x into one of c categories, and so prediction becomes
a classification problem.
• Loss function:
• We can measure the accuracy of a prediction y' with respect to a given response y using a loss
function Loss(y, y').
• In a regression setting the usual choice is the squared error loss $(y - y')^2$.
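As a small illustration, the squared error loss can be written as a function of the response and the prediction. The 0-1 loss shown next to it is a common choice for classification; it is not mentioned in these notes and is included as an assumption for comparison. The function names are illustrative.

```python
def squared_error_loss(y, y_pred):
    """Squared error loss, the usual choice for regression."""
    return (y - y_pred) ** 2


def zero_one_loss(y, y_pred):
    """0-1 loss: 0 for a correct class prediction, 1 otherwise."""
    return 0 if y == y_pred else 1


print(squared_error_loss(3.0, 2.5))
print(zero_one_loss(1, 1), zero_one_loss(0, 2))
```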
TOPIC-6 Training and Test Loss:
TOPIC-7 Tradeoffs in Statistical Learning:
TOPIC-8 Estimating Risk:
1. IN-SAMPLE RISK:
2. CROSS-VALIDATION
TOPIC-9 Sampling distributions of estimators
Since our estimators are statistics (particular functions of random variables), their distribution can be
derived from the joint distribution of X1 . . . Xn.
It is called the sampling distribution because it is based on the joint distribution of the random sample.
Given a sampling distribution, we can:
– calculate the probability that an estimator will not differ
from the parameter θ by more than a specified amount;
– obtain interval estimates rather than point estimates after we have a sample.
An interval estimate is a random interval such that the true parameter lies within this interval with
a given probability (say 95%);
– choose between two estimators: we can, for instance, calculate the mean-squared error of the
estimator, $E_\theta[(\hat{\theta} - \theta)^2]$, using the distribution of $\hat{\theta}$.
Sampling distributions of estimators depend on sample size, and we want to know exactly how the
distribution changes as we change this size so that we can make the right trade-offs between cost and
accuracy.
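The dependence on sample size can be checked by simulation. The sketch below approximates the sampling distribution of the sample mean by drawing many samples and estimates its mean-squared error for two sample sizes. The normal population, its parameters (mean 5, standard deviation 2), the number of repetitions, and the function names are assumptions for illustration.

```python
import random
import statistics


def simulate_sample_means(n, repetitions=2000, mu=5.0, sigma=2.0, seed=0):
    """Draw `repetitions` samples of size n from N(mu, sigma^2); return their means."""
    rng = random.Random(seed)
    return [
        statistics.fmean(rng.gauss(mu, sigma) for _ in range(n))
        for _ in range(repetitions)
    ]


def mean_squared_error(estimates, true_value):
    """Monte Carlo estimate of E[(theta_hat - theta)^2]."""
    return statistics.fmean((e - true_value) ** 2 for e in estimates)


mse_small = mean_squared_error(simulate_sample_means(n=10), 5.0)
mse_large = mean_squared_error(simulate_sample_means(n=100), 5.0)
print(mse_small, mse_large)
```

For the sample mean the theoretical mean-squared error is sigma^2 / n, so under these assumptions it is 0.4 for n = 10 and 0.04 for n = 100: a tenfold increase in sample size buys a tenfold reduction in mean-squared error.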
The ERM is a nice idea, if used with care
The plot below shows a regression problem with a training set of 15 points.
The ERM principle is an inference principle which consists in finding the model $\hat{f}$ by minimizing
the empirical risk:
$$\hat{f} = \arg\min_{f : \mathcal{X} \to \mathcal{Y}} R_{emp}(f)$$
where the empirical risk is an estimate of the risk, computed as the average of the loss function over
the training sample $D = \{(X_i, Y_i)\}_{i=1}^{N}$:
$$R_{emp}(f) = \frac{1}{N} \sum_{i=1}^{N} \ell(f(X_i), Y_i)$$
with the loss function $\ell$.
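The definition can be turned into a short sketch: compute the empirical risk of each candidate model and keep the one with the smallest value. To keep the search trivial, the hypothesis class is restricted here to a handful of lines through the origin, which is an assumption and much narrower than the class of all functions in the formula above. The training sample, the squared loss, and the function names are also illustrative.

```python
# Hypothetical training sample D = {(x_i, y_i)}; here y is roughly 2 * x.
training_sample = [(1.0, 2.1), (2.0, 3.9), (3.0, 6.2), (4.0, 7.8)]


def squared_loss(prediction, y):
    """Loss function l(f(x), y) used inside the empirical risk."""
    return (prediction - y) ** 2


def empirical_risk(f, sample, loss=squared_loss):
    """Average loss of model f over the training sample."""
    return sum(loss(f(x), y) for x, y in sample) / len(sample)


def erm(hypotheses, sample):
    """Return the hypothesis with the smallest empirical risk."""
    return min(hypotheses, key=lambda f: empirical_risk(f, sample))


# A small, finite hypothesis class: lines through the origin, f(x) = slope * x.
slopes = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0]
hypotheses = [lambda x, s=s: s * x for s in slopes]
best = erm(hypotheses, training_sample)
print(best(1.0), empirical_risk(best, training_sample))
```

Restricting the hypothesis class is one way of using ERM "with care": a very rich class can drive the empirical risk to zero on the training sample while predicting poorly on new data.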