Chap-6 Machine Learning Introduction
Machine Learning
Introduction
Why Machine Learning?
Concepts & Dimensions of Machine Learning
Concepts of Learning
Learning = improving at task “T”, with respect to performance measure “P”, based on experience “E”
Example: signature matching
• T: determine whether a signature belongs to the correct person
• P: % of signatures that were correctly matched; % of valid signatures that were incorrectly labelled as not matching
• E: a database of signatures known to be of that person
Dimensions Of Learning Systems
The Learning Process
The Learning Process
Normalization vs. standardization:
• Normalization is used when the data does not have a Gaussian distribution, whereas standardization is used on data that does.
• Normalization scales values to a range such as [0, 1] or [-1, 1]; standardization is not bounded to a range.
• Normalization is highly affected by outliers; standardization is only slightly affected by outliers.
• Normalization is considered when the algorithm makes no assumptions about the data distribution; standardization is used when it does.
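The contrast above can be sketched in a few lines of NumPy; this is an illustrative min-max normalizer and z-score standardizer, not code from the course:

```python
import numpy as np

def normalize(x):
    # Min-max normalization: rescales values into the range [0, 1].
    return (x - x.min()) / (x.max() - x.min())

def standardize(x):
    # Z-score standardization: zero mean, unit variance; not bounded to a range.
    return (x - x.mean()) / x.std()

data = np.array([1.0, 2.0, 3.0, 4.0, 100.0])  # note the outlier at 100
print(normalize(data))    # the outlier squashes the other values toward 0
print(standardize(data))
```

Running it on the example above shows the outlier effect: after normalization the first four values are crammed near 0, illustrating why normalization is highly sensitive to outliers.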
The Learning Process
Bagging attempts to reduce the chance of overfitting complex models.
[Diagram: testing data → trained model → predictions on future data]
Steps:
• Gather data from various sources
• Clean the data to achieve homogeneity
• Build the model (select the right machine learning algorithm)
• Gather insights from the model’s results
• Visualize: transform the results into visual graphs
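The bagging idea mentioned above can be sketched in plain Python: train several models on bootstrap samples and take a majority vote. The decision-stump "model" and the data here are hypothetical toys, not part of the slides:

```python
import random
from collections import Counter

def bagging_predict(train, x, build_model, n_models=10, seed=0):
    """Train n_models on bootstrap samples of `train`, then majority-vote on x."""
    rng = random.Random(seed)
    votes = []
    for _ in range(n_models):
        # Bootstrap sample: draw len(train) examples with replacement.
        sample = [rng.choice(train) for _ in train]
        votes.append(build_model(sample)(x))
    # Aggregate: return the most common prediction among the models.
    return Counter(votes).most_common(1)[0][0]

# Toy base learner: a decision stump that splits at the sample mean.
def build_stump(sample):
    threshold = sum(v for v, _ in sample) / len(sample)
    return lambda x: 1 if x > threshold else 0

train = [(i, 1 if i > 5 else 0) for i in range(11)]  # labels: value > 5
print(bagging_predict(train, 9.0, build_stump))  # 1
print(bagging_predict(train, 0.5, build_stump))  # 0
```

Averaging over many bootstrap-trained models smooths out the quirks any single model picks up from its sample, which is how bagging reduces overfitting.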
Performance Evaluation
Problems - Overfitting & Underfitting
Categorization of Machine Learning
Data Set for Supervised Learning
Data Set for Unsupervised Learning
Machine Learning Coordinates
• Discrete output: Classification (or Categorization) when supervised; Clustering when unsupervised
• Continuous output: Regression when supervised; Dimensionality Reduction when unsupervised
Classification & Clustering
Example: classifying customers as low-risk or high-risk from two attributes, income and savings:
IF Income > θ1 AND Savings > θ2 THEN Low-Risk ELSE High-Risk
[Figure: customers plotted in the income–savings plane, split into low-risk and high-risk regions by the two thresholds]
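The rule above translates directly into code; the threshold values here stand in for θ1 and θ2 and are purely illustrative:

```python
def credit_risk(income, savings, theta1=50_000, theta2=10_000):
    # Rule from the slide: low-risk only when both income and savings
    # exceed their thresholds (theta1, theta2 are made-up example values).
    if income > theta1 and savings > theta2:
        return "Low-Risk"
    return "High-Risk"

print(credit_risk(80_000, 20_000))  # Low-Risk
print(credit_risk(80_000, 5_000))   # High-Risk
```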
Regression
• A regression problem is one where the output variable is a real value, such as “dollars” or “weight”, instead of a class.
• Estimate the relationship between a dependent variable and one or more independent variables (or “predictors”).
Example: y is the sales figure and x covers factors like screen size, display type, brand, resolution, technology, etc. Here we consider just one attribute, x: screen size, plot the corresponding prices, and try to find the relation (function) y‘ = w0 + w1x1 that best matches these values.
[Figure: scatter of (screen size, sales figure) points with the fitted line y‘ = w0 + w1x1]
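The fit y‘ = w0 + w1x1 can be computed by ordinary least squares; the screen sizes and sales figures below are invented for illustration:

```python
import numpy as np

# Illustrative data: screen sizes (inches) vs. made-up sales figures.
x = np.array([4.0, 4.7, 5.0, 5.5, 6.1, 6.5])
y = np.array([10.0, 12.5, 13.0, 15.2, 17.0, 18.4])

# Design matrix with an intercept column: each row is [1, x1].
A = np.column_stack([np.ones_like(x), x])

# Least-squares solution minimizing ||A @ w - y||^2.
(w0, w1), *_ = np.linalg.lstsq(A, y, rcond=None)
print(f"y' = {w0:.2f} + {w1:.2f} * x1")
```

The resulting line is the one that best matches the plotted values in the least-squares sense.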
Dimensionality Reduction
• Dimensionality reduction is the process of reducing the
number of random variables under consideration by
obtaining a set of principal variables. It can be divided into
feature selection and feature extraction.
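Feature extraction can be sketched with principal component analysis (PCA) via the SVD; the 3-D data here is synthetic, generated to lie near a 1-D line so that a single component suffices:

```python
import numpy as np

def pca_reduce(X, k):
    # Feature extraction via PCA: project mean-centered data onto the
    # top-k principal directions (the leading right singular vectors).
    Xc = X - X.mean(axis=0)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:k].T

# Synthetic 3-D points lying near a 1-D line, plus a little noise.
rng = np.random.default_rng(0)
t = rng.normal(size=100)
X = np.column_stack([t, 2 * t, -t]) + 0.01 * rng.normal(size=(100, 3))

Z = pca_reduce(X, 1)   # 3 random variables reduced to 1 principal variable
print(Z.shape)         # (100, 1)
```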
Machine Learning Applications
Applications of Machine Learning
Classification Applications
Face Recognition
• Identify or verify a person from a digital image or a video frame
Character Recognition
Spam Detection
Medical Diagnosis
• Determine which disease or condition explains a person's symptoms and signs
Biometrics
• Authentication using physical and/or behavioral characteristics: face, iris, signature, etc.
Regression Applications
Epidemiology
• The incidence, distribution, and possible control of diseases and other factors relating to health
Manufacturing & Retail Industries
Manufacturing
Retail
• Predictive inventory planning
• Recommendation engines
• Upsell & cross-channel marketing
• Market segmentation & targeting
• Customer ROI & lifetime value
Healthcare & Life Science & Travel & Hospitality
• Aircraft scheduling
• Dynamic pricing
• Social media – consumer feedback & interaction analysis
• Customer complaint resolution
• Traffic patterns & congestion management
Financial Services & Energy, Feedstock & Utilities
Financial Services
Machine Learning Algorithms
Machine Learning Algorithms
Decision Trees
• A decision tree takes as input an object or situation described by a set of properties and outputs a yes/no decision.
• Each decision node tests the value of an input attribute.
• The branches from a node cover all possible values of that attribute.
• Leaf nodes supply the value (yes/no) to be returned if that leaf is reached.
Criteria used to choose the best nodes to build the most precise decision tree:
• Entropy: the degree of disorganization in the data. Entropy is 1 when the collection has equal numbers of positive and negative examples.
• Information gain: measures the goodness of a split. The attribute yielding the greatest entropy reduction is chosen.
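Entropy and information gain can be computed directly from their definitions. The weather-style toy dataset below is invented to show one attribute that splits perfectly and one that carries no information:

```python
from math import log2
from collections import Counter

def entropy(labels):
    # H(S) = -sum p_i * log2(p_i); equals 1.0 for a 50/50 binary split.
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())

def information_gain(rows, labels, attr):
    # Entropy before the split minus the weighted entropy of each branch.
    n = len(labels)
    gain = entropy(labels)
    for value in set(r[attr] for r in rows):
        subset = [y for r, y in zip(rows, labels) if r[attr] == value]
        gain -= (len(subset) / n) * entropy(subset)
    return gain

# Toy data: 'outlook' predicts the label perfectly, 'windy' does not.
rows = [{"outlook": "sunny", "windy": True},
        {"outlook": "sunny", "windy": False},
        {"outlook": "rainy", "windy": True},
        {"outlook": "rainy", "windy": False}]
labels = ["yes", "yes", "no", "no"]
print(information_gain(rows, labels, "outlook"))  # 1.0 (perfect split)
print(information_gain(rows, labels, "windy"))    # 0.0 (useless split)
```

A tree learner would pick "outlook" as the root, since it gives the greatest entropy reduction.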
Example: Decision Tree
• Decision tree for deciding whether to buy a mobile phone
[Tree: the root node tests “Price of Mobile” with branches <10000 and >10000; subsequent tests lead to Yes/No leaves]
Support Vector Machines (SVM)
[Figure: two classes in the X–Y plane separated by a hyperplane; the points nearest the boundary are the support vectors]
• Map data to a higher-dimensional space where they become linearly separable.
• Algorithm: plot each data item as a point in n-dimensional space (where n is the number of features), with the value of each feature being the value of a particular coordinate.
• Then perform classification into two classes by finding the hyperplane that best differentiates the classes.
• Support vectors are the coordinates of the individual observations that lie closest to the boundary. The Support Vector Machine finds the frontier (hyperplane/line) that best segregates the two classes.
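Once the separating hyperplane w·x + b = 0 is known, classification is just a sign test. This sketch omits the training step that finds the maximum-margin hyperplane; the weights and points below are hypothetical:

```python
import numpy as np

def classify(X, w, b):
    # Each point's class is the side of the hyperplane w.x + b = 0 it falls on.
    return np.sign(X @ w + b)

# Hypothetical separating hyperplane for 2-D points: x + y - 3 = 0.
w = np.array([1.0, 1.0])
b = -3.0
X = np.array([[0.0, 0.0], [1.0, 1.0], [3.0, 2.0], [4.0, 4.0]])
print(classify(X, w, b))  # [-1. -1.  1.  1.]
```

Training an SVM amounts to choosing w and b so that this margin between the two classes is as wide as possible.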
Bayesian Networks
K Nearest Neighbor Model (k-NN)
• Idea: properties of an input x are likely to be similar to those of points in the neighborhood of x.
• Find the k nearest neighbor(s) of x and infer the target attribute value(s) of x from their corresponding attribute value(s).
• In k-NN classification, the output is a class membership: an object is assigned to the class most common among its k nearest neighbors.
• In k-NN regression, the output is the property value for the object: the average of the values of its k nearest neighbors.
• To determine which of the k instances in the training dataset are most similar to a new input, a distance measure is used; common choices include Euclidean, Hamming, and Manhattan distance.
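The classification variant described above fits in a few lines: find the k closest training points under Euclidean distance and take a majority vote. The two clusters in the example are made up:

```python
import math
from collections import Counter

def knn_classify(train, query, k=3):
    # train: list of (feature_vector, label) pairs.
    # Sort by Euclidean distance to the query and keep the k nearest.
    nearest = sorted(train, key=lambda p: math.dist(p[0], query))[:k]
    # Majority vote among the k nearest neighbors.
    return Counter(label for _, label in nearest).most_common(1)[0][0]

train = [((1.0, 1.0), "A"), ((1.2, 0.8), "A"), ((0.9, 1.1), "A"),
         ((5.0, 5.0), "B"), ((5.2, 4.8), "B"), ((4.9, 5.1), "B")]
print(knn_classify(train, (1.1, 1.0)))  # A
print(knn_classify(train, (5.1, 5.0)))  # B
```

For k-NN regression, the majority vote would simply be replaced by the average of the k neighbors' values.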
Ensemble Learning
Deep Learning (Neural Networks)
A subset of machine learning that covers all three learning paradigms, using artificial neural networks (ANNs)