ML Module2-Chapter 1

Artificial Intelligence & Machine learning Module 2 ppt according to 21 series vtu syllabus

Uploaded by

ashwiniiseait
Copyright
© © All Rights Reserved
Machine Learning

Introduction to
Machine Learning
NEED FOR MACHINE LEARNING

• Business organizations use huge amounts of data for
their daily activities.
• Earlier, the full potential of this data was not utilized
due to two reasons.
1. The data was scattered across different archive systems,
and organizations were not able to integrate these
sources fully.
2. There was a lack of awareness about software tools that
could help to unearth useful information from data.
• Business organizations have now started to use the
latest technology, machine learning, for this purpose.
• Machine learning has become so popular because of three
reasons:
1. High volume of available data to manage: Big companies such
as Facebook, Twitter, and YouTube generate huge amounts of
data that grow at a phenomenal rate. It is estimated that the
data approximately doubles every year.
2. Reduced cost of storage: The hardware cost has also
dropped. Therefore, it is easier now to capture, process,
store, distribute, and transmit digital information.
3. Availability of complex algorithms: Especially with
the advent of deep learning, many algorithms are now available
for machine learning.
• Data: All facts are data. Data can be numbers or text that can be
processed by a computer. Today, organizations are accumulating vast
and growing amounts of data with data sources such as flat files,
databases, or data warehouses in different storage formats.
• Information: Processed data is called information. This includes
patterns, associations, or relationships among data. For example,
sales data can be analyzed to extract information such as which
product is the fastest selling.
• Knowledge: Condensed information is called knowledge. For
example, the historical patterns and future trends obtained in the
above sales data can be called knowledge. Unless knowledge is
extracted, data is of no use. Similarly, knowledge is not useful unless
it is put into action.
• Intelligence: An actionable form of knowledge is called intelligence.
Computer systems have been successful till this stage.

• Wisdom: The ultimate objective of the knowledge pyramid is wisdom,
which represents the maturity of mind that is, so far, exhibited only
by humans.

• Here comes the need for machine learning. The objective of
machine learning is to process this archival data so that organizations
can take better decisions, design new products, improve
business processes, and develop effective decision support
systems.
MACHINE LEARNING EXPLAINED
• Machine learning is an important sub-branch of
Artificial Intelligence (AI).
• “Machine learning is the field of study that gives
computers the ability to learn without being explicitly
programmed.” – Arthur Samuel
• The system should learn by itself without explicit
programming.
• Early artificial intelligence aimed to understand problems and develop
general-purpose rules manually.
• These rules were then formulated into logic and implemented in a
program to create intelligent systems.
• This idea of developing intelligent systems by using logic and reasoning,
converting an expert’s knowledge into a set of rules and programs, is
called an expert system.
• Example: MYCIN, an expert system designed for medical diagnosis by
encoding the expert knowledge of many doctors. This approach did not
progress, as the programs lacked real intelligence.


• Just as humans take decisions based on experience, computers
make models based on patterns extracted from the input data
and then use these data-filled models for prediction and for
taking decisions. For computers, the learnt model is equivalent
to human experience. This is shown in Figure 1.2.
• y = f(x)
• Here, f is the learning function that maps the input x to output
y.

• The learning program summarizes the raw data in a model. A
model is an explicit description of patterns within the data in
the form of:
– Mathematical equation
– Relational diagrams like trees/graphs
– Logical if/else rules, or
– Groupings called clusters
• In summary, a model can be a formula, procedure, or
representation that can generate decisions from data.
• “A computer program is said to learn from experience E, with
respect to task T and some performance measure P, if its
performance on T measured by P improves with experience E.” -Tom
Mitchell
• For example, the task T could be detecting an object in an image.
The machine can gain the knowledge of object using training
dataset of thousands of images. This is called experience E. So, the
focus is to use this experience E for this task of object detection T.
The ability of the system to detect the object is measured by
performance measures like precision and recall.
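The performance measure P in the object detection example can be made concrete with a small sketch. The ground-truth and predicted values below are hypothetical, made up only to show how precision and recall are computed; they do not come from the slides.

```python
def precision_recall(actual, predicted, positive=True):
    """Return (precision, recall) for the class treated as 'positive'."""
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a == positive and p == positive)  # true positives
    fp = sum(1 for a, p in pairs if a != positive and p == positive)  # false positives
    fn = sum(1 for a, p in pairs if a == positive and p != positive)  # false negatives
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall

# Hypothetical data: does each of eight images contain the object?
actual    = [True, True, True, True, False, False, False, False]
predicted = [True, True, True, False, True, False, False, False]

precision, recall = precision_recall(actual, predicted)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

Precision asks how many of the detections were correct, while recall asks how many of the real objects were found. In Mitchell's terms, a program "learns" if these numbers improve as it sees more training images.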


In systems, experience is gathered by these steps:
• Collection of data.
• Abstraction: once data is gathered, abstract concepts are formed out of
that data. For example, we have some idea of what an elephant looks like.
• Generalization converts the abstraction into an actionable form of
intelligence. It can be viewed as an ordering of all possible concepts. So,
generalization involves ranking concepts, drawing inferences from them, and
forming heuristics, an actionable aspect of intelligence. Heuristics
are educated guesses for all tasks.
• Heuristics normally work, but occasionally they fail. This is not the
fault of the heuristic, as it is just a ‘rule of thumb’. The course correction is
done by taking evaluation measures.
MACHINE LEARNING IN RELATION TO OTHER FIELDS

1. Machine Learning and Artificial Intelligence


• Machine learning is an important branch of AI, which is a much broader
subject.
• The aim of AI is to develop intelligent agents. An agent can be a robot,
a human, or any autonomous system.
Data Driven Systems: (Machine Learning)
• The aim is to find relations and regularities present in the data.
Machine learning is the subbranch of AI, whose aim is to
extract the patterns for prediction.

Deep learning
• Deep learning is a subbranch of machine learning. In deep
learning, the models are constructed using neural network
technology. Neural networks are based on the human neuron
models. Many neurons form a network connected with the
activation functions that trigger further neurons to perform
tasks.
2. Machine Learning, Data Science, Data Mining, and Data
Analytics
• Data science is an ‘Umbrella’ term that encompasses many
fields.
• Data science and machine learning are interlinked.
• Data science deals with gathering of data for analysis.
• It is a broad field that includes:
1. Big Data
2. Data Mining
3. Data Analytics
4. Pattern Recognition
Big Data
• Data science is concerned with the collection of data. Big data is a
field of data science that deals with the following characteristics
of data:
– Volume: Huge amounts of data are generated by big
companies like Facebook, Twitter, and YouTube.
– Variety: Data is available in a variety of forms like images,
videos, and in different formats.
– Velocity: It refers to the speed at which the data is
generated and processed.
• Big data is used by many machine learning algorithms for
applications such as language translation and image
recognition.
Data Mining
• Data mining has its origins in business.
• Just as mining the earth yields precious
resources, it is often believed that unearthing
data produces hidden information that otherwise
would have eluded the attention of the management.
• Nowadays, many consider data mining and
machine learning to be the same.
• There is no difference between these fields except that
data mining aims to extract the hidden patterns that
are present in the data, whereas machine learning
aims to use them for prediction.
Data Analytics

• Another branch of data science is data analytics.
• It aims to extract useful knowledge from crude data.
• There are different types of analytics.
• Predictive data analytics is used for making predictions.
• Machine learning is closely related to this branch of analytics
and shares almost all algorithms.
Pattern Recognition

• It is an engineering field.
• It uses machine learning algorithms to extract the features for
pattern analysis and pattern classification.
• One can view pattern recognition as a specific application of
machine learning
3. Machine Learning and Statistics
• Statistics is a branch of mathematics.
• Like ML, it can learn from data. The difference between
statistics and ML is that statistical methods look for regularities in
data, called patterns.
• Initially, statistics sets a hypothesis and performs experiments to
verify and validate the hypothesis in order to find relationships
among data.
• It is mathematics intensive and models are often complicated
equations and involve many assumptions.
• It has strong theoretical foundations and interpretations that
require a strong statistical knowledge.
• Machine learning, comparatively, makes fewer assumptions and
requires less statistical knowledge.
TYPES OF MACHINE LEARNING
• There are four types of machine learning as shown in Figure
1.5.
Labeled and Unlabelled Data
• Data is a raw fact.
• The data is represented in the form of a table.
• Each row of the table represents a data point, also referred to
as a sample or an example.
• Features are attributes or characteristics of an object.
• The columns of the table are attributes. One important
attribute present is called a label.
• Label is the feature that we aim to predict.
• Thus, there are two types of data – labeled and unlabelled.
Labeled Data
• To illustrate labeled data, let us take one example dataset
called Iris flower dataset or Fisher’s Iris dataset.
• The dataset has 150 samples of Iris (50 per class) with four
attributes: the length and width of the sepals and petals.
• The target variable is called class. There are three classes – Iris
setosa, Iris virginica, and Iris versicolor.
• A dataset need not be always numbers. It can be images or video
frames.
• Deep neural networks can handle images with labels. In the
following Figure 1.6, the deep neural network takes images of
dogs and cats with labels for classification.
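The terms data point, feature, and label can be illustrated with a tiny table in code. This is only a sketch: the rows are a few illustrative samples in the style of the Iris dataset, and the column names are assumptions chosen for readability.

```python
# Each row is one data point; the first four columns are features and
# the last column is the label (the class we aim to predict).
FEATURES = ["sepal_length", "sepal_width", "petal_length", "petal_width"]

rows = [
    # Illustrative rows in the style of the Iris dataset
    (5.1, 3.5, 1.4, 0.2, "Iris setosa"),
    (7.0, 3.2, 4.7, 1.4, "Iris versicolor"),
    (6.3, 3.3, 6.0, 2.5, "Iris virginica"),
    (6.3, 2.9, 5.6, 1.8, None),  # unlabelled: the class is unknown
]

labeled = [r for r in rows if r[-1] is not None]
unlabeled = [r for r in rows if r[-1] is None]

X = [r[:-1] for r in labeled]  # feature values only
y = [r[-1] for r in labeled]   # labels only
print(len(labeled), "labeled,", len(unlabeled), "unlabelled")
```

Splitting the table into `X` (features) and `y` (labels) is the usual starting point for the supervised methods described next.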
Supervised Learning
• Supervised algorithms use a labeled dataset.
• As the name suggests, there is a supervisor or teacher
component in supervised learning.
• A supervisor provides labeled data from which the model is
constructed; the model is then evaluated on test data.
• In supervised learning algorithms, learning takes place in two
stages.
• In simple terms, during the first stage, the teacher
communicates the information to the student that the student is
supposed to master. The student receives the information and
understands it.
• During this stage, the teacher has no knowledge of whether the
information is grasped by the student.
• This leads to the second stage of learning. The teacher then
asks the student a set of questions to find out how much
information has been grasped by the student. Based on these
questions, the student is tested, and the teacher informs the
student about his assessment.
• This kind of learning is called supervised learning.
• Supervised learning has two methods:
– Classification
– Regression
Classification (supervised learning)
• Classification is a supervised learning method.
• The input attributes of the classification algorithms are called
independent variables.
• The target attribute is called label or dependent variable.
• The relationship between the input and target variable is
represented in the form of a structure which is called a
classification model.
• The focus of classification is to predict the ‘label’ that is in a
discrete form (a value from the set of finite values).
• In classification, learning takes place in two stages.
• In the first stage, the training stage, the learning algorithm takes
a labeled dataset and starts learning. After the samples of the
training set are processed, the model is generated. In the second
stage, the constructed model is tested with a test or unknown
sample and assigns it a label.
This is the classification process.
• This is illustrated in the above Figure 1.7. Initially, the classification
learning algorithm learns with the collection of labeled data and
constructs the model. Then, a test case is selected, and the model
assigns a label.
• Similarly, in the case of Iris dataset, if the test is given as (6.3, 2.9,
5.6, 1.8, ?), the classification will generate the label for this. This is
called classification.
• The classification models can be categorized based on the
implementation technology like decision trees, probabilistic
methods, distance measures, and soft computing methods.
• Classification models can also be classified as generative
models and discriminative models.
• Generative models deal with the process of data generation
and its distribution. Probabilistic models are examples
of generative models.
• Discriminative models do not care about the generation of
data. Instead, they simply concentrate on classifying the given
data.
• Some of the key algorithms of classification are:
– Decision Tree
– Random Forest
– Support Vector Machines
– Naïve Bayes
– Artificial Neural Network and Deep Learning networks like
CNN
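As a sketch of the distance-measure family mentioned above, the Iris test sample (6.3, 2.9, 5.6, 1.8, ?) can be labeled with a 1-nearest-neighbour rule. The slides do not say which classifier was used, so this choice is an assumption, and the six training rows are a small illustrative subset in the style of the Iris dataset rather than the full data.

```python
import math

# Training stage: for nearest neighbour, "learning" is just storing
# the labeled samples (features, label).
TRAINING_SET = [
    ((5.1, 3.5, 1.4, 0.2), "Iris setosa"),
    ((4.9, 3.0, 1.4, 0.2), "Iris setosa"),
    ((7.0, 3.2, 4.7, 1.4), "Iris versicolor"),
    ((6.4, 3.2, 4.5, 1.5), "Iris versicolor"),
    ((6.3, 3.3, 6.0, 2.5), "Iris virginica"),
    ((5.8, 2.7, 5.1, 1.9), "Iris virginica"),
]

def classify(sample, training_set=TRAINING_SET):
    """Testing stage: assign the label of the closest training sample."""
    _, nearest_label = min(
        training_set, key=lambda row: math.dist(row[0], sample)
    )
    return nearest_label

predicted_label = classify((6.3, 2.9, 5.6, 1.8))
print(predicted_label)
```

The output is always one value from the finite set of class names, which is what distinguishes classification from regression.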
Regression Models (supervised learning)
• Regression models predict continuous variables like price (it is
a number). A fitted regression model is shown in Figure 1.8 for
a dataset in which the input x is the week and the output y is
the product sales.
• The regression model takes input x and generates a model in
the form of a fitted line of the form y = f(x).
• Here, x is the independent variable that may be one or more
attributes and y is the dependent variable.
• In Figure 1.8, linear regression takes the training set and tries
to fit it with a line – product sales = 0.66 * Week + 0.54.
• Here, 0.66 and 0.54 are the regression coefficients that are
learnt from data.
• The advantage of this model is that prediction for product
sales (y) can be made for unknown week data (x).
• For example, the prediction for unknown eighth week can be
made by substituting x as 8 in that regression formula to get y.
• Both regression and classification models are
supervised algorithms.
• The main difference is that regression models predict
continuous variables such as product price, while
classification concentrates on assigning labels such as
class.
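The fitted line quoted above, product sales = 0.66 * Week + 0.54, can be used directly for the eighth-week prediction. The sketch below also includes a least-squares fit for a single input. The training data behind Figure 1.8 is not reproduced in the slides, so the fit is only checked on points generated from the quoted line itself.

```python
def fit_line(xs, ys):
    """Ordinary least squares for one input: returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def predict_sales(week, slope=0.66, intercept=0.54):
    """Apply the fitted line y = slope * x + intercept."""
    return slope * week + intercept

# Prediction for the unknown eighth week: 0.66 * 8 + 0.54
week_8_sales = predict_sales(8)
print(round(week_8_sales, 2))

# Sanity check: fitting points that lie exactly on the quoted line
# should give back the same coefficients.
weeks = [1, 2, 3, 4, 5]
sales = [predict_sales(w) for w in weeks]
slope, intercept = fit_line(weeks, sales)
```

Unlike the classifier, the output here is a number on a continuous scale, not one of a fixed set of labels.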
Unsupervised Learning

• The second kind of learning is by self-instruction.
• As the name suggests, there is no supervisor or teacher
component.
• This process of self-instruction is based on the concept of trial
and error.
• Here, the program is supplied with objects, but no labels are
defined.
• The algorithm itself observes the examples and recognizes
patterns based on the principles of grouping.
• Grouping is done in such a way that similar objects form the same
group.
• Cluster analysis and dimensionality reduction algorithms are
examples of unsupervised algorithms.
Cluster Analysis
• Cluster analysis is an example of unsupervised learning.
• It aims to group objects into disjoint clusters or groups.
• Cluster analysis clusters objects based on their attributes.
• All the data objects within a partition are similar in some
aspect and differ significantly from the data objects in the
other partitions.
• Some of the examples of clustering processes are
– segmentation of a region of interest in an image,
– detection of abnormal growth in a medical image, and
– determining clusters of signatures in a gene database.
https://www.v7labs.com/blog/image-segmentation-guide
• An example of a clustering scheme is shown in Figure 1.9, where
the clustering algorithm takes a set of dog and cat images
and groups them into two clusters: dogs and cats.
Some of the key clustering algorithms are:
• k-means algorithm
• Hierarchical algorithms
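A minimal sketch of the k-means algorithm named above is given here. The six 2-D points and the two starting centroids are hypothetical values chosen so that the two groups are obvious; real implementations usually pick the starting centroids at random and stop when the assignments no longer change.

```python
import math

def k_means(points, centroids, iterations=10):
    """Plain k-means: alternate between assigning points and moving centroids."""
    for _ in range(iterations):
        # Assignment step: attach every point to its nearest centroid
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster
        # (an empty cluster keeps its old centroid)
        centroids = [
            tuple(sum(coord) / len(cluster) for coord in zip(*cluster))
            if cluster else centroid
            for cluster, centroid in zip(clusters, centroids)
        ]
    return centroids, clusters

# Hypothetical unlabelled objects described by two attributes
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5), (8.0, 8.0), (9.0, 8.5), (8.5, 9.0)]
centroids, clusters = k_means(points, centroids=[(0.0, 0.0), (10.0, 10.0)])
print(clusters)
```

No labels are supplied anywhere: the two groups emerge only from the distances between the points.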
Dimensionality Reduction

• Dimensionality reduction algorithms are examples of
unsupervised algorithms.
• They take higher-dimensional data as input and output the data
in a lower dimension by taking advantage of the variance of the
data.
• It is the task of reducing the dataset to fewer features without
losing generality.
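One way to "take advantage of the variance" is principal component analysis. The slides do not name a specific method, so this choice is an assumption. The sketch below handles only the simplest case, reducing 2-D points to 1-D by projecting them onto the direction of greatest variance; the sample points are hypothetical.

```python
import math

def project_to_1d(points):
    """Project 2-D points onto their direction of greatest variance."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    centered = [(x - mean_x, y - mean_y) for x, y in points]

    # Entries of the 2x2 covariance matrix [[sxx, sxy], [sxy, syy]]
    sxx = sum(x * x for x, _ in centered) / n
    syy = sum(y * y for _, y in centered) / n
    sxy = sum(x * y for x, y in centered) / n

    # Angle of the leading eigenvector of a symmetric 2x2 matrix
    theta = 0.5 * math.atan2(2 * sxy, sxx - syy)
    direction = (math.cos(theta), math.sin(theta))

    # Each point is reduced to one coordinate along that direction
    reduced = [x * direction[0] + y * direction[1] for x, y in centered]
    return direction, reduced

# Hypothetical data: two features that carry the same information
points = [(1.0, 1.0), (2.0, 2.0), (3.0, 3.0), (4.0, 4.0)]
direction, reduced = project_to_1d(points)
print(direction, reduced)
```

Here the two features are perfectly correlated, so one coordinate along the diagonal keeps all of the variation in the data.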
Semi-supervised Learning

• There are circumstances where the dataset has a huge collection of
unlabelled data and some labeled data.
• Labeling is a costly process and difficult for humans to perform.
• Semi-supervised algorithms use unlabelled data by assigning a
pseudo-label.
• Then, the labeled and pseudo-labeled datasets can be combined.
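The pseudo-labeling idea can be sketched as follows. The one-feature dataset is hypothetical, and using the nearest labeled example to produce the pseudo-label is an assumption; the slides do not say how pseudo-labels are assigned.

```python
def nearest_label(value, labeled):
    """Label of the labeled example closest to value (1-nearest neighbour)."""
    _, closest_label = min(labeled, key=lambda item: abs(item[0] - value))
    return closest_label

# Hypothetical dataset: a few labeled points and more unlabelled ones
labeled = [(1.0, "small"), (1.2, "small"), (9.0, "large")]
unlabeled = [0.8, 1.5, 8.4, 9.7]

# Assign a pseudo-label to every unlabelled point
pseudo_labeled = [(x, nearest_label(x, labeled)) for x in unlabeled]

# Combine both sets into one larger training set
combined = labeled + pseudo_labeled
print(combined)
```

The combined set can then be passed to any supervised algorithm, which is how the unlabelled data ends up contributing to training.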
Reinforcement Learning
• Reinforcement learning mimics human beings.
• Just as human beings use ears and eyes to perceive the world
and take actions, reinforcement learning allows the agent to
interact with the environment to get rewards.
• The agent can be a human, animal, robot, or any independent
program. The rewards enable the agent to gain experience.
The agent aims to maximize the reward.
• The reward can be positive or negative (punishment). When
the rewards are higher, the behavior gets reinforced and
learning becomes possible.
• Example: In this grid game, the gray tile indicates the danger, black is
a block, and the tile with diagonal lines is the goal.
• The aim is to start, say, from the bottom-left tile and use the actions
left, right, up, and down to reach the goal state.
• To solve this sort of problem, there is no data.
• The agent interacts with the environment to get experience.
• In the above case, the agent tries to create a model by simulating many
paths and finding rewarding paths.
• This experience helps in constructing a model.
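The grid game can be sketched in code. The figure is not reproduced here, so the 3 x 4 layout, the positions of the goal, danger, and block tiles, the reward values, and the discount factor are all assumptions made for illustration. The learning loop tries every action from every tile repeatedly, which is a deterministic, exhaustive simplification of Q-learning rather than the randomized version normally used.

```python
ROWS, COLS = 3, 4
# Assumed layout: start at bottom-left, goal at top-right
START, GOAL, DANGER, BLOCK = (2, 0), (0, 3), (1, 3), (1, 1)
ACTIONS = {"up": (-1, 0), "down": (1, 0), "left": (0, -1), "right": (0, 1)}
GAMMA = 0.9  # discount factor (assumed)

def step(state, action):
    """Environment: return (next_state, reward) for one move."""
    dr, dc = ACTIONS[action]
    nxt = (state[0] + dr, state[1] + dc)
    # Moves off the grid or into the block leave the agent where it is
    if not (0 <= nxt[0] < ROWS and 0 <= nxt[1] < COLS) or nxt == BLOCK:
        nxt = state
    if nxt == GOAL:
        return nxt, 1.0    # positive reward
    if nxt == DANGER:
        return nxt, -1.0   # negative reward (punishment)
    return nxt, -0.04      # small cost per move (assumed)

states = [(r, c) for r in range(ROWS) for c in range(COLS) if (r, c) != BLOCK]
q = {s: {a: 0.0 for a in ACTIONS} for s in states}

# Gather experience: try each action in each tile and record how
# rewarding it is, including the best reward reachable afterwards
for _ in range(100):
    for s in states:
        if s in (GOAL, DANGER):
            continue  # the game ends on these tiles
        for a in ACTIONS:
            nxt, reward = step(s, a)
            q[s][a] = reward + GAMMA * max(q[nxt].values())

# Follow the most rewarding action from the start tile
path, state = [START], START
while state not in (GOAL, DANGER) and len(path) < 20:
    best_action = max(q[state], key=q[state].get)
    state, _ = step(state, best_action)
    path.append(state)
print(path)
```

No dataset is supplied at any point: the table `q` is built purely from the rewards returned by `step`, which plays the role of the environment.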
Challenges of Machine Learning
• Ill-posed problems – problems whose specifications are not clear (see the
example table).
• Huge data – learning needs large amounts of quality data, with no
incorrect or missing values.
• Huge computation power – systems with a Graphics Processing Unit (GPU) or
even a Tensor Processing Unit (TPU) are often required.
• Complexity of algorithms – designing, selecting, and evaluating optimal
algorithms is difficult.
• Bias/variance – variance is the part of a model's error that comes from its
sensitivity to the particular training data, and it leads to a problem called
the bias/variance tradeoff. A model that fits the training data but fails on
test data lacks generalization; this is called overfitting. The reverse
problem is called underfitting, where the model is too simple to fit even the
training data well.
Machine Learning Process

• Understanding the business – This step involves understanding the
objectives and requirements of the business organization.
• Understanding the data – It involves the steps like data collection,
study of the characteristics of the data, formulation of hypothesis,
and matching of patterns to the selected hypothesis.
• Preparation of data – This step involves producing the final
dataset by cleaning the raw data (for example, handling missing
values) and preparing the data for the data mining process.
• Modelling – This step applies a data mining algorithm to the
data to obtain a model or pattern.
• Evaluate – This step involves the evaluation of the data mining
results using statistical analysis and visualization methods.
• Deployment – This step involves the deployment of results of
the data mining algorithm to improve the existing process or
for a new situation.
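The data preparation step mentions handling missing values. One simple option, shown here as an assumption since the slides do not name a method, is to replace each missing value with the mean of the observed values in the same column. The sales figures are hypothetical.

```python
def impute_missing(column):
    """Replace None entries with the mean of the observed values."""
    observed = [v for v in column if v is not None]
    mean = sum(observed) / len(observed)
    return [mean if v is None else v for v in column]

# Hypothetical weekly sales column with two missing entries
weekly_sales = [2.0, None, 3.0, 4.0, None, 3.0]
cleaned = impute_missing(weekly_sales)
print(cleaned)
```

Other choices, such as dropping incomplete rows or using the median, follow the same pattern and may suit a given dataset better.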
MACHINE LEARNING APPLICATIONS
• Sentiment analysis – This is an application of natural language processing
(NLP) in which the words of documents are mapped to sentiments such as
happy, sad, and angry, which are captured effectively by emoticons. For
movie or product reviews, ratings from one star to five stars are attached
automatically by sentiment analysis programs.
• Recommendation systems – These systems make personalized
purchases possible. For example, Amazon recommends related books,
or books bought by people with tastes similar to yours, and Netflix
suggests shows or movies that match your taste. Recommendation
systems are based on machine learning.
• Voice assistants – Products like Amazon Alexa, Microsoft Cortana, Apple
Siri, and Google Assistant are all examples of voice assistants. They take
speech commands and perform tasks. These assistants are the result of
machine learning technologies.
• Technologies like Google Maps and those used by Uber are examples
of machine learning; they locate positions and find the shortest paths to
reduce travel time.
S. No. Problem Domain – Applications
1. Business – Predicting the bankruptcy of a business firm
2. Banking – Prediction of bank loan defaulters and detecting credit
card frauds
3. Image Processing – Image search engines, object identification, image
classification, and generating synthetic images
4. Audio/Voice – Chatbots like Alexa and Microsoft Cortana. Developing
chatbots for customer support, speech to text, and text to voice
5. Telecommunication – Trend analysis and identification of bogus calls,
fraudulent calls and their callers, churn analysis
6. Marketing – Retail sales analysis, market basket analysis, product
performance analysis, market segmentation analysis, and study of travel
patterns of customers for marketing tours
7. Games – Game programs for Chess, GO, and Atari video games
8. Natural Language Translation – Google Translate, text summarization,
and sentiment analysis
9. Web Analysis and Services – Identification of access patterns, detection
of e-mail spam and viruses, personalized web services, search engines like
Google, detection of promotion of user websites, and finding loyalty
of users after web page layout modification
10. Medicine – Prediction of diseases such as cancer or diabetes, given
disease symptoms. Prediction of the effectiveness of a treatment using
patient history, and chatbots that interact with patients; IBM Watson,
for example, uses machine learning technologies.
11. Multimedia and Security – Face recognition/identification, biometric
projects like identification of a person from a large image or video
database, and applications involving multimedia retrieval
12. Scientific Domain – Discovery of new galaxies, identification of groups
of houses based on house type/geographical location, identification of
earthquake epicenters, and identification of similar land use