
PARUL INSTITUTE OF ENGINEERING & TECHNOLOGY
FACULTY OF ENGINEERING & TECHNOLOGY
PARUL UNIVERSITY

B.Tech, Sem-6th

Subject: Machine Learning
Unit 1: Introduction
Computer Science & Engineering
Kamini Sharma (Assistant Professor, PIET-CSE)
Outline
• Introduction to Machine Learning
• Basic learning problems
• Designing a learning system
• Issues with machine learning
• Concept Learning
• Version Spaces and Candidate Elimination
• Inductive bias
Introduction to Machine Learning

•Machine learning is a subset of artificial intelligence


•It enables the machine to automatically learn from data, improve performance
from past experiences, and make predictions.
•Machine learning contains a set of algorithms that work on a huge amount of
data.
•Data is fed to these algorithms to train them, and on the basis of training, they
build the model & perform a specific task.
•Machine learning uses various algorithms for building mathematical models
and making predictions using historical data or information.
•It is being used for various tasks such as image recognition, speech recognition,
email filtering, Facebook auto-tagging, recommender system, and many more.
Introduction to Machine Learning

What is Machine Learning?


In the real world, we are surrounded by humans who learn from their experiences, while computers and machines simply follow our instructions. But can a machine also learn from experience or past data the way a human does? This is where Machine Learning comes in.
Introduction to Machine Learning
•“Machine learning enables a machine to automatically learn from data,
improve performance from experiences, and predict things without being
explicitly programmed.”
•With the help of sample historical data, which is known as training data,
machine learning algorithms build a mathematical model that helps in making
predictions or decisions without being explicitly programmed. Machine learning
brings computer science and statistics together for creating predictive models.
Machine learning constructs or uses algorithms that learn from historical
data. The more information we provide, the better the performance.
Introduction to Machine Learning

How does Machine Learning work

A Machine Learning system learns from historical data, builds prediction models, and, whenever it receives new data, predicts the output for it. The accuracy of the predicted output depends on the amount of data: a larger amount of data helps build a better model that predicts the output more accurately.
Introduction to Machine Learning
How does Machine Learning work
• Suppose we have a complex problem where we need to make some predictions. Instead of writing code for it, we just feed the data to generic algorithms; with the help of these algorithms, the machine builds the logic from the data and predicts the output. Machine learning has changed the way we think about such problems.
[Block diagram: working of a Machine Learning algorithm]
Introduction…..

Features of Machine Learning


⮚ Machine learning uses data to detect various patterns in a given dataset.
⮚ It can learn from past data and improve automatically.
⮚ It is a data-driven technology.
⮚ Machine learning is quite similar to data mining, as it also deals with huge amounts of data.
Introduction…..

Need for Machine Learning


⮚ Rapid increase in the production of data
⮚ Solving complex problems that are difficult for a human
⮚ Decision making in various sectors, including finance
⮚ Finding hidden patterns and extracting useful information from
data.
Applications of Machine Learning
❑ Prediction — Machine learning can also be used in prediction systems. Considering the loan example, to compute the probability of a fault, the system needs to classify the available data into groups.
❑ Image recognition — Machine learning can be used for face detection in an
image as well. There is a separate category for each person in a database of
several people.
❑ Speech Recognition — This is the translation of spoken words into text. It is used in voice search and more. Voice user interfaces include voice dialing, call routing, and appliance control. It can also be used for simple data entry and the preparation of structured documents.
❑ Medical diagnoses — ML is trained to recognize cancerous tissues.
❑ Financial industry and trading — companies use ML in fraud investigations
and credit checks.
Applications of Machine Learning
❑ Traffic prediction — Machine learning can also be used in the prediction of
traffic conditions such as whether traffic is cleared, slow-moving, or heavily
congested.
❑ Product recommendations— Machine learning is widely used by various e-
commerce and entertainment companies such as Amazon, Netflix, etc., for
product recommendation to the user.
❑ Self-driving cars — Machine learning plays a significant role in self-driving cars. Tesla, a well-known car manufacturer, is working on self-driving cars and uses unsupervised learning methods to train the car models to detect people and objects while driving.
❑ Email Spam and Malware Filtering — Important mail arrives in our inbox marked with the important symbol, while spam emails land in the spam folder; the technology behind this is machine learning.
Applications of Machine Learning
❑ Online Fraud Detection — Whenever we perform an online transaction, fraud can take place in various ways, such as fake accounts, fake IDs, and money being stolen in the middle of a transaction.
❑ Stock Market Trading — In the stock market, there is always a risk of ups and downs in share prices, so machine learning's long short-term memory (LSTM) neural network is used for predicting stock market trends.
❑ Medical Diagnosis — Machine learning is used for disease diagnosis. Medical technology is growing very fast and is able to build 3D models that can predict the exact position of lesions in the brain.
❑ Virtual Personal Assistant — We have various virtual personal assistants such as Google Assistant, Alexa, Cortana, and Siri. As the name suggests, they help us find information using voice instructions.
Applications of Machine Learning
There are many more examples of ML in use.
Machine learning Life cycle
1. Gathering Data:
Data Gathering is the first step of the machine learning life cycle. The goal of this step is
to identify and obtain all data-related problems.
In this step, we need to identify the different data sources, as data can be collected from
various sources such as files, database, internet, or mobile devices. It is one of the most
important steps of the life cycle. The quantity and quality of the collected data determine the efficiency of the output. The more data there is, the more accurate the prediction will be.
This step includes the below tasks:
• Identify various data sources
• Collect data
• Integrate the data obtained from different sources
By performing the above tasks, we get a coherent set of data, also called a dataset. It
will be used in further steps.
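To make the "integrate the data obtained from different sources" task concrete, here is a minimal Python sketch, assuming two hypothetical CSV exports (the file and column names are made up for illustration):

import pandas as pd

# Hypothetical sources; in practice these could be files, database exports, or API dumps.
customers = pd.read_csv("customers.csv")        # e.g. columns: customer_id, age, income
transactions = pd.read_csv("transactions.csv")  # e.g. columns: customer_id, amount

# Integrate the two sources into one coherent dataset keyed on customer_id.
dataset = customers.merge(transactions, on="customer_id", how="inner")
print(dataset.shape)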
Machine learning Life cycle
2. Data preparation
After collecting the data, we need to prepare it for further steps. Data preparation is the step where we put our data into a suitable place and prepare it for use in machine learning training.
In this step, we first put all the data together and then randomize its ordering.
This step can be further divided into two processes:
Data exploration:
It is used to understand the nature of data that we have to work with. We need to
understand the characteristics, format, and quality of data.
A better understanding of data leads to an effective outcome. In this, we find
Correlations, general trends, and outliers.
Data pre-processing:
Now the next step is preprocessing of data for its analysis.
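As a minimal sketch of the "put all data together and randomize the ordering" idea, the snippet below shuffles and splits a dataset with scikit-learn (the column name "target" is a hypothetical label column):

import pandas as pd
from sklearn.model_selection import train_test_split

dataset = pd.read_csv("dataset.csv")        # hypothetical prepared dataset
X = dataset.drop(columns=["target"])
y = dataset["target"]

# Randomize the ordering and hold out a portion for later evaluation.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, shuffle=True, random_state=42
)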
Machine learning Life cycle
3. Data Wrangling
Data wrangling is the process of cleaning and converting raw data into a usable format. It is the process of cleaning the data, selecting the variables to use, and transforming the data into a proper format to make it more suitable for analysis in the next step. The collected data is not always useful to us, as some of it may be irrelevant. In real-world applications, collected data may have various issues, including:
Missing Values
Duplicate data
Invalid data
Noise
So, we use various filtering techniques to clean the data.
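A minimal pandas sketch of such filtering, using a tiny made-up table that exhibits missing values, duplicate rows, and invalid entries:

import pandas as pd
import numpy as np

raw = pd.DataFrame({
    "age":    [25, np.nan, 40, 40, -3],
    "income": [50_000, 60_000, 80_000, 80_000, 75_000],
})

clean = raw.drop_duplicates()                                   # remove duplicate rows
clean = clean[(clean["age"] > 0) | clean["age"].isna()].copy()  # drop invalid (negative) ages
clean["age"] = clean["age"].fillna(clean["age"].median())       # impute missing values
print(clean)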
Machine learning Life cycle
4. Data Analysis
Now the cleaned and prepared data is passed on to the analysis step. This step
involves:
Selection of analytical techniques
Building models
Review the result
The aim of this step is to build a machine learning model to analyze the data using various analytical techniques and review the outcome. It starts with determining the type of problem, where we select machine learning techniques such as Classification, Regression, Cluster analysis, Association, etc.; we then build the model using the prepared data and evaluate it.
Hence, in this step, we take the data and use machine learning algorithms to
build the model.
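A hedged end-to-end sketch of this step: a public toy dataset stands in for the prepared data, classification is chosen as the problem type, a model is built, and the result is reviewed on held-out data.

from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

X, y = load_breast_cancer(return_X_y=True)                 # stands in for the prepared data
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

model = RandomForestClassifier(n_estimators=100, random_state=0)  # chosen analytical technique
model.fit(X_train, y_train)

print("Test accuracy:", accuracy_score(y_test, model.predict(X_test)))  # review the result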
Machine learning Life cycle
7. Deployment

The last step of machine learning life cycle is deployment, where we deploy the
model in the real-world system.
If the prepared model produces accurate results as per our requirements with acceptable speed, we deploy it in the real system. Before deploying the project, we check whether it improves its performance using the available data. The deployment phase is similar to making the final report for a project.
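One common way to move a trained model into a real-world system is to persist it and load it inside the serving application; a minimal sketch with joblib (the file name is arbitrary):

import joblib
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression

# Train a model (stand-in for the model prepared in the earlier steps).
X, y = load_iris(return_X_y=True)
model = LogisticRegression(max_iter=1000).fit(X, y)

# Persist the model so the deployed system can load and serve it.
joblib.dump(model, "model.joblib")

# Inside the deployed application:
served_model = joblib.load("model.joblib")
print(served_model.predict(X[:1]))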
Learning Paradigms in Machine Learning
1) Supervised Learning

• Supervised learning is a type of machine learning method in which we provide sample labeled data to the machine learning system in order to train it, and on that basis, it predicts the output.
• The system creates a model using labeled data to understand the datasets and learn about each data point. Once the training and processing are done, we test the model by providing sample data to check whether it predicts the correct output.
Learning Paradigms in Machine Learning

• The goal of supervised learning is to map input data to output data. Supervised learning is based on supervision; it is like a student learning things under the supervision of a teacher. An example of supervised learning is spam filtering.
• The main goal of the supervised learning technique is to map the input
variable(x) with the output variable(y). Some real-world applications of
supervised learning are Risk Assessment, Fraud Detection, Spam filtering, etc.
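A minimal sketch of supervised learning with scikit-learn: the model is trained on labeled samples and then asked to predict labels for unseen test samples.

from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Labeled data: X holds the input features, y holds the known output labels.
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=1)

# Train under the "supervision" of the labels, then test on unseen samples.
clf = KNeighborsClassifier(n_neighbors=5).fit(X_train, y_train)
print("Predicted:", clf.predict(X_test[:5]))
print("Actual:   ", y_test[:5])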
Learning Paradigms in Machine Learning
2) Unsupervised Learning
• Unsupervised learning is a learning method in which a machine learns without
any supervision.
• The training is provided to the machine with the set of data that has not been
labeled, classified, or categorized, and the algorithm needs to act on that data
without any supervision. The goal of unsupervised learning is to restructure
the input data into new features or a group of objects with similar patterns.
• In unsupervised learning, we don't have a predetermined result. The machine
tries to find useful insights from the huge amount of data.
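A minimal sketch of unsupervised learning: no labels are given, and k-means groups the points into clusters with similar patterns on its own (the data is synthetic).

import numpy as np
from sklearn.cluster import KMeans

# Unlabeled data: two blobs of points with no class labels attached.
rng = np.random.default_rng(0)
data = np.vstack([rng.normal(0, 1, size=(50, 2)),
                  rng.normal(5, 1, size=(50, 2))])

# The algorithm acts on the data without supervision and groups similar points.
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(data)
print(kmeans.labels_[:10], kmeans.labels_[-10:])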
Learning Paradigms in Machine Learning
Advantages of Unsupervised Learning
• Unsupervised learning is used for more complex tasks as compared to supervised
learning because, in unsupervised learning, we don't have labeled input data.
• Unsupervised learning is preferable as it is easy to get unlabeled data in comparison to
labeled data.
Disadvantages of Unsupervised Learning
• Unsupervised learning is intrinsically more difficult than supervised learning as it does
not have corresponding output.
• The result of the unsupervised learning algorithm might be less accurate as input data
is not labeled, and algorithms do not know the exact output in advance.
Learning Paradigms in Machine Learning
3) Reinforcement Learning
Reinforcement learning is a feedback-based learning method, in which a
learning agent gets a reward for each right action and gets a penalty for each
wrong action. The agent learns automatically with these feedbacks and improves
its performance. In reinforcement learning, the agent interacts with the
environment and explores it. The goal of an agent is to get the most reward
points, and hence, it improves its performance.
A robotic dog that automatically learns the movement of its limbs is an example of reinforcement learning.
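A minimal sketch of the reward/penalty idea using tabular Q-learning on a made-up toy environment (a line of five states where reaching the rightmost state earns a reward and every other step earns a small penalty):

import random

N_STATES, GOAL = 5, 4
Q = [[0.0, 0.0] for _ in range(N_STATES)]   # Q-values for actions 0 = left, 1 = right
alpha, gamma, epsilon = 0.5, 0.9, 0.1

for episode in range(200):
    state = 0
    while state != GOAL:
        # Explore occasionally, otherwise act greedily on the current Q-values.
        action = random.randrange(2) if random.random() < epsilon else Q[state].index(max(Q[state]))
        next_state = max(0, state - 1) if action == 0 else min(GOAL, state + 1)
        reward = 1.0 if next_state == GOAL else -0.01        # reward for the right action, penalty otherwise
        # Learn from the feedback and improve performance.
        Q[state][action] += alpha * (reward + gamma * max(Q[next_state]) - Q[state][action])
        state = next_state

print([q.index(max(q)) for q in Q[:GOAL]])   # learned policy: should prefer "right" (1) in every state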
Learning Paradigms in Machine Learning
Advantages and Disadvantages of Reinforcement Learning

Advantages
• It helps in solving complex real-world problems which are difficult to be
solved by general techniques.
• The learning model of RL is similar to how human beings learn; hence, highly accurate results can be obtained.
• It helps in achieving long-term results.
Disadvantage
• RL algorithms are not preferred for simple problems.
• RL algorithms require huge data and computations.
• Too much reinforcement learning can lead to an overload of states which can
weaken the results.
Learning Problems
Most Common Types of Machine Learning Problems are:
⮚ Regression
⮚ Classification
⮚ Clustering
⮚ Time-series forecasting
⮚ Anomaly detection
⮚ Ranking
⮚ Recommendation
⮚ Data generation
⮚ Optimization
Learning Problems
• Regression — Used when the need is to predict numerical values, for example house price prediction. Algorithms: Linear regression, K-NN, random forest, neural networks.
• Classification — Used when the data must be classified into different classes, for example classifying whether a person is suffering from a disease or not. Algorithms: Logistic regression, random forest, K-NN, gradient boosting classifier, neural networks.
• Clustering — Used when data points must be categorized into similar groupings or clusters. Algorithms: K-Means, DBSCAN, hierarchical clustering, Gaussian mixture models, BIRCH.
Learning Problems
• Time-series forecasting — Used when a number must be predicted based on time-series data, for example forecasting the sales demand for a product from a set of input data. Algorithms: ARIMA, SARIMA, LSTM, exponential smoothing, Prophet, GARCH, TBATS, dynamic linear models.
• Anomaly detection — Used when a given record must be classified as an outlier or unexpected event/item, for example detecting credit card fraud transactions. Algorithms: Isolation Forest, minimum covariance determinant, local outlier factor, one-class SVM.
• Ranking — Used when the results of a request or query must be ordered based on some criteria; recommendation engines, for example, use ranking algorithms to recommend the next items. Algorithms: Bipartite ranking (Bipartite RankBoost, Bipartite RankSVM).
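As an illustration of the anomaly detection row above, a hedged sketch with Isolation Forest on synthetic transaction amounts:

import numpy as np
from sklearn.ensemble import IsolationForest

# Mostly "normal" transaction amounts plus a few extreme outliers.
rng = np.random.default_rng(42)
amounts = np.concatenate([rng.normal(50, 10, 200), [500, 750, 900]]).reshape(-1, 1)

detector = IsolationForest(contamination=0.02, random_state=42).fit(amounts)
labels = detector.predict(amounts)          # +1 = normal, -1 = outlier
print("Flagged as anomalies:", amounts[labels == -1].ravel())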
Learning Problems
• Recommendation — Used when there is a need to recommend, for example, the "next item" to buy, the "next video" to watch, or the "next song" to listen to. The solutions to such problems are called recommender systems. Algorithms: Content-based and collaborative filtering methods.
• Data generation — Used when there is a need to generate data such as images, videos, articles, or posts. Algorithms: Generative adversarial networks (GAN), hidden Markov models.
• Optimization — Used when there is a need to generate a set of outputs that optimize outcomes related to some objective (an objective function); this is called an optimization problem. Algorithms: Linear programming methods, genetic programming.
Designing a Learning system

• By following a number of steps, we can design a learning system:
Designing a Learning system

• Step 1 - Choosing the Training Experience: The first and most important task is to choose the training data or training experience that will be fed to the Machine Learning algorithm.
• The data or experience that we feed to the algorithm has a significant impact on the success or failure of the model.
• Step 2 - Choosing the Target Function: According to the knowledge fed to the algorithm, the system will choose a NextMove function which describes what type of legal moves should be taken.
Designing a Learning system

• Step 3 - Choosing a Representation for the Target Function: Once the algorithm knows all the possible legal moves, the next step is to choose an optimized move using some representation, i.e. linear equations, hierarchical graph representation, tabular form, etc.
• Step 4 - Choosing a Function Approximation Algorithm: An optimized move cannot be chosen just with the training data. The training data has to go through a set of examples, and through these examples the algorithm approximates which steps are likely to succeed.
Designing a Learning system

• Step 4 - Example: When training data for playing chess is fed to the algorithm, the algorithm may at first fail or succeed; from each failure or success it measures, for the next move, what step should be chosen and what its success rate is.
• Step 5 - Final Design: The final design is created once the system has gone through a number of examples, failures and successes, and correct and incorrect decisions, and has learned what the next step should be.
Example: Deep Blue, an intelligent ML-based computer, won a chess game against chess expert Garry Kasparov and became a famous milestone for intelligent systems.
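Steps 2 to 4 can be made concrete with a hedged sketch in the spirit of the classic board-game setting: the target function V(b) is represented as a linear combination of hand-chosen board features (step 3), and its weights are adjusted toward training estimates with an LMS-style update (step 4). The features and training values below are made up for illustration.

# Target function representation: V(b) = w0 + w1*x1 + w2*x2.
weights = [0.0, 0.0, 0.0]

def V(features, w):
    return w[0] + w[1] * features[0] + w[2] * features[1]

# Hypothetical training experience: (board features, training value) pairs.
training_data = [((3, 0), 1.0), ((0, 3), -1.0), ((2, 1), 0.5), ((1, 2), -0.5)]

# Function approximation via the LMS weight-update rule.
lr = 0.1
for _ in range(100):
    for features, v_train in training_data:
        error = v_train - V(features, weights)
        weights[0] += lr * error
        weights[1] += lr * error * features[0]
        weights[2] += lr * error * features[1]

print([round(w, 2) for w in weights])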
Issues with machine learning
1. Inadequate Training Data
The backbone of any ML algorithm is the data it is trained on. The
challenge arises when there is a shortage of both quality and quantity
in the training dataset.
2. Underfitting of Training Data
Underfitting occurs when the model is unable to establish an accurate relationship between the input and output variables. It is like trying to fit into undersized jeans: the model is too simple to capture a precise relationship.
Issues with machine learning

3. Overfitting of Training Data

Overfitting occurs when a model fits the training data too closely, including its noise, which negatively affects its performance on new data. It is like trying to fit into oversized jeans.
4. Monitoring and Maintenance
Regular monitoring and maintenance are essential to ensure the
continued effectiveness of ML models.
5. Process Complexity of Machine Learning
The complexity of the ML process, marked by experimental phases and trial-and-error development, can make ML projects hard to plan, reproduce, and maintain.
Issues with machine learning

6. Customer Segmentation
Accurate customer segmentation is crucial for effective ML algorithms.
Developing algorithms that recognize customer behavior and trigger
relevant recommendations based on past experiences is essential for
personalized user interactions.
7. Data Bias
Data bias introduces errors when certain elements in the dataset are given disproportionate weight. Detecting and mitigating bias requires careful examination of the dataset, regular analysis, and corrective measures.
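Issues 2 and 3 above can be seen in a minimal sketch that varies model complexity on synthetic data: a degree-1 polynomial underfits the curved relationship, while a very high-degree polynomial overfits and scores worse on held-out points.

import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import r2_score

rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(60, 1))
y = np.sin(X).ravel() + rng.normal(0, 0.2, size=60)     # noisy nonlinear relationship
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

for degree in (1, 4, 15):                               # underfit, reasonable, overfit
    model = make_pipeline(PolynomialFeatures(degree), LinearRegression()).fit(X_tr, y_tr)
    print(degree,
          "train R2:", round(r2_score(y_tr, model.predict(X_tr)), 2),
          "test R2:",  round(r2_score(y_te, model.predict(X_te)), 2))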
Concept Learning
▪ Concept learning, as a broader term, includes both case-based and instance-
based learning.
▪ Concept learning involves the extraction of general rules or patterns from specific instances in order to make predictions on new, unseen data.
▪ Concept learning in machine learning is not confined to a single pattern; it spans various approaches, including rule-based learning, neural networks, decision trees, and more. The process of concept learning is iterative: the model learns from examples, refines its understanding of the underlying concepts, and continually updates its representation as new examples are seen.
Concept Learning

▪ Concept learning is one of the essential building blocks of machine learning


that helps machines classify objects or events based on their features.
▪ Concept learning is a subfield of machine learning that aims to learn generalizable concepts from data, such as in classification and regression, where we want models to generalize well to unseen examples.
▪ Concept learning is inferring a Boolean-valued function from training examples of its input and output.
▪ Concept learning is also known as binary classification.
Concept Learning

Example: “EnjoySport”
• Suppose we want to learn the target concept "days on which Fred enjoys his favorite water sport".
• The input is a set of examples, one per day, describing the day in terms of a set of attributes and indicating (yes/no) whether Fred enjoyed his sport that day.
Concept Learning

• Task: learn to predict the value of EnjoySport for an arbitrary day, given the values of its other attributes.
• Suppose hypotheses take the form of conjunctions of constraints on instance attributes, e.g. specifying allowed values of Sky, Temp, Humid, Wind, Water, Forecast, and suppose each constraint takes one of three forms:
– ? : any value is acceptable
– a specified value : a specific value is required, e.g. Warm for the Temp attribute
– ø : no value is acceptable
Concept Learning

• Then, hypotheses can be represented as vectors of such constraints.

E.g. ⟨?, Cold, High, ?, ?, ?⟩ represents the hypothesis that Fred enjoys sport only on cold days with high humidity.
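A hedged sketch of this representation in Python: a hypothesis is a tuple of constraints ('?' for any value, a literal for a required value), and it classifies an instance as positive only if every constraint is satisfied. The example days are made up for illustration.

# Hypothesis: one constraint per attribute (Sky, Temp, Humid, Wind, Water, Forecast).
h = ("?", "Cold", "High", "?", "?", "?")   # Fred enjoys sport only on cold days with high humidity

def satisfies(instance, hypothesis):
    """True if the instance is classified positive by the hypothesis."""
    return all(c == "?" or c == v for c, v in zip(hypothesis, instance))

day1 = ("Sunny", "Cold", "High", "Strong", "Warm", "Same")
day2 = ("Sunny", "Warm", "High", "Strong", "Warm", "Same")
print(satisfies(day1, h))   # True  - Cold and High constraints are met
print(satisfies(day2, h))   # False - the Temp constraint is violated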

Benefits of Concept Learning for Machine Learning

▪ Concept learning enables machines to generalize from examples and improve their performance considerably.
▪ Concept learning can help machines better understand the data they are working with.
▪ Using concepts as a guide for prediction can lead to more accurate results.
Concept Learning
Limitations of Concept Learning in Machine Learning
▪ Concept learning is a supervised learning technique that uses a set of labeled
examples to learn about the relationships between concepts.
▪ Concept learning is able to generalize beyond the training data, but it has
limitations.
▪ Concept learning may not be able to learn about new concepts that have not been seen in the training data.
▪ Concept learning may be slower than other supervised machine learning techniques and may require more training data.
Version Space

h1 = ⟨?, ?, No, ?, Many⟩ is a consistent hypothesis, as it is consistent with all the training examples.
Version Space
The version space VS_{H,D} is the subset of hypotheses from H consistent with the training examples in D:

VS_{H,D} = { h ∈ H | Consistent(h, D) }

where H is the hypothesis space, D is the set of training examples, and a hypothesis h is consistent with D if h(x) = c(x) for every training example ⟨x, c(x)⟩ in D.


Algorithm to obtain Version Space

List-Then-Eliminate algorithm

1. VersionSpace := a list containing every hypothesis in H
2. For each training example ⟨x, c(x)⟩, remove from VersionSpace any hypothesis h for which h(x) ≠ c(x)
3. Output the list of hypotheses in VersionSpace.


Algorithm to obtain Version Space

List-Then-Eliminate algorithm
Example:
F1 → A, B
F2 → X, Y
Here F1 and F2 are two features (attributes), each with two possible values.

Instance Space: (A, X), (A, Y), (B, X), (B, Y) – 4 instances

Hypothesis Space: (A, X), (A, Y), (A, ø), (A, ?), (B, X), (B, Y), (B, ø), (B, ?), (ø, X), (ø, Y), (ø, ø), (ø, ?), (?, X), (?, Y), (?, ø), (?, ?) – 16 hypotheses

Semantically Distinct Hypotheses: (A, X), (A, Y), (A, ?), (B, X), (B, Y), (B, ?), (?, X), (?, Y), (?, ?), and one all-negative hypothesis (ø, ø) – 10 hypotheses
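A hedged sketch of List-Then-Eliminate for this two-feature example, using hypothetical training examples; it enumerates the hypotheses and removes those inconsistent with each example (ø-hypotheses are omitted since they all classify every instance as negative):

from itertools import product

values_f1, values_f2 = ["A", "B", "?"], ["X", "Y", "?"]
hypotheses = list(product(values_f1, values_f2))

def predict(h, x):
    return all(c == "?" or c == v for c, v in zip(h, x))

# Hypothetical training examples: (instance, label).
D = [(("A", "X"), True), (("B", "Y"), False)]

version_space = [h for h in hypotheses
                 if all(predict(h, x) == label for x, label in D)]
print(version_space)   # [('A', 'X'), ('A', '?'), ('?', 'X')]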
Inductive bias
What is Inductive Bias?
Inductive bias is the set of assumptions or preferences that a learning algorithm
uses to make predictions beyond the data it has been trained on.
Without inductive bias, machine learning algorithms would be unable to
generalize from training data to unseen situations, as the possible hypotheses or
models could be infinite.
Inductive bias
In a classification problem, if the model is trained on data that suggests a linear
relationship between features and outcomes, the inductive bias of the model
might favor a linear hypothesis. This preference guides the model to choose
simpler, linear relationships rather than complex, nonlinear ones, even if such
relationships might exist in the data.
Examples:
•Inductive bias in decision trees: A preference for shorter trees with fewer splits.
•Inductive bias in linear regression: The assumption that the data follows a linear
trend.
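A minimal sketch of how inductive bias shows up in practice: on the same nonlinear data, a linear model (biased toward linear trends) and a k-NN model (biased toward "nearby points behave similarly") give different predictions for the same query point.

import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.neighbors import KNeighborsRegressor

rng = np.random.default_rng(0)
X = np.sort(rng.uniform(0, 6, size=(40, 1)), axis=0)
y = np.sin(X).ravel()                                  # truly nonlinear relationship

linear = LinearRegression().fit(X, y)                  # inductive bias: data follows a linear trend
knn = KNeighborsRegressor(n_neighbors=3).fit(X, y)     # inductive bias: similar inputs, similar outputs

x_new = np.array([[1.5]])
print("true value:       ", float(np.sin(1.5)))
print("linear prediction:", float(linear.predict(x_new)[0]))
print("k-NN prediction:  ", float(knn.predict(x_new)[0]))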
Inductive bias
Types of Inductive Bias
1. Language Bias: Language bias refers to the constraints placed on the
hypothesis space, which defines the types of models a learning algorithm can
consider.
2. Search Bias: Search bias refers to the preferences that an algorithm has when
selecting hypotheses from the available options.
3. Algorithm-Specific Biases: Certain algorithms have specific biases based on
their structure:
•Linear Models: Assume that the data has linear relationships.
•k-Nearest Neighbors (k-NN): Assumes that similar data points exist in close
proximity.
•Decision Trees: Typically biased towards choosing splits that result in the most
homogeneous subgroups.
Thank You
www.paruluniversity.ac.in
