
What Is Machine Learning-UNIT III


AI & ML UNIT-3

What is Machine Learning


In the real world, we are surrounded by humans who can learn from their experiences, and we have
computers or machines that work on our instructions. But can a machine also learn from experience or
past data as a human does? This is where machine learning comes in.

Machine learning is a subset of artificial intelligence that is mainly concerned with the development of
algorithms which allow a computer to learn from data and past experience on its own. The term machine
learning was first introduced by Arthur Samuel in 1959. We can summarize it as:

Machine learning enables a machine to automatically learn from data, improve performance from
experiences, and predict things without being explicitly programmed.

With the help of sample historical data, known as training data, machine learning algorithms build a
mathematical model that makes predictions or decisions without being explicitly programmed to do so.
Machine learning brings computer science and statistics together to create predictive models. It
constructs or uses algorithms that learn from historical data. In general, the more relevant, good-quality
data we provide, the better the model tends to perform.

How does Machine Learning work

A machine learning system learns from historical data, builds prediction models, and, whenever it
receives new data, predicts the output for it. The accuracy of the predicted output depends in part on
the amount of data: a larger amount of representative data usually helps to build a better model, which
predicts the output more accurately.

Suppose we have a complex problem that requires some predictions. Instead of writing code for it by
hand, we feed the data to generic algorithms, and the machine builds the logic from the data and predicts
the output. Machine learning has changed the way we think about such problems. The block diagram below
explains the working of a machine learning algorithm:

Features of Machine Learning:

o Machine learning uses data to detect various patterns in a given dataset.
o It can learn from past data and improve automatically.
o It is a data-driven technology.
o Machine learning is similar to data mining in that it also deals with large amounts of data.

By Dr. Megha Mishra

Classification of Machine Learning

Concepts of Learning

Learning is the process of converting experience into expertise or knowledge.


Learning can be broadly classified into three categories, as mentioned below, based on the nature of the
learning data and interaction between the learner and the environment.

● Supervised Learning

● Unsupervised Learning

● Semi-supervised Learning

Similarly, machine learning algorithms fall into four categories, as shown below:

● Supervised learning algorithm

● Unsupervised learning algorithm

● Semi-supervised learning algorithm

● Reinforcement learning algorithm

However, the most commonly used ones are supervised and unsupervised learning.

Supervised Learning
Supervised learning is commonly used in real-world applications such as face and speech recognition,
product or movie recommendations, and sales forecasting. Supervised learning can be further classified
into two types: regression and classification.
Regression trains on and predicts a continuous-valued response, for example real estate prices.
Classification attempts to find the appropriate class label, for example positive or negative sentiment,
male or female persons, benign or malignant tumors, or secured or unsecured loans.
In supervised learning, learning data comes with description, labels, targets or desired outputs and the
objective is to find a general rule that maps inputs to outputs. This kind of learning data is called labeled
data. The learned rule is then used to label new data with unknown outputs.
Supervised learning involves building a machine learning model that is based on labeled samples. For
example, if we build a system to estimate the price of a plot of land or a house based on various features,
such as size, location, and so on, we first need to create a database and label it. We need to teach the
algorithm what features correspond to what prices. Based on this data, the algorithm will learn how to
calculate the price of real estate using the values of the input features.
Supervised learning deals with learning a function from available training data. Here, a learning algorithm
analyzes the training data and produces a derived function that can be used for mapping new examples.
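
As a minimal sketch of the real-estate example above, the following fits a straight line
price = slope * size + intercept to a few labeled samples using ordinary least squares. The sizes and
prices are made-up illustrative values, and `fit_line` and `predict` are hypothetical helper names, not
part of any library.

```python
def fit_line(xs, ys):
    """Ordinary least squares for one input feature: y ~ slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(model, x):
    """Apply the learned function to a new, unlabeled input."""
    slope, intercept = model
    return slope * x + intercept


# Labeled training data (illustrative numbers): size in square metres -> price
sizes = [50.0, 80.0, 100.0, 120.0]
prices = [150.0, 240.0, 300.0, 360.0]

model = fit_line(sizes, prices)
estimate = predict(model, 90.0)  # price estimate for a house the model has not seen
```

The learned pair (slope, intercept) plays the role of the "derived function" described above: it is
computed once from labeled data and then reused for new examples.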

There are many supervised learning algorithms such as Logistic Regression, Neural networks, Support
Vector Machines (SVMs), and Naive Bayes classifiers.
Common examples of supervised learning include classifying e-mails into spam and not-spam categories,
labeling webpages based on their content, and voice recognition.
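
The spam example can be sketched with a small Naive Bayes classifier, one of the algorithms named
above. This is only an illustration under simplifying assumptions: messages are split on whitespace,
add-one (Laplace) smoothing is used, and the four training messages are invented. The function names
`train_naive_bayes` and `classify` are hypothetical.

```python
import math
from collections import Counter


def train_naive_bayes(samples):
    """samples: list of (text, label). Returns the counts needed for prediction."""
    word_counts = {}          # label -> Counter of words seen with that label
    label_counts = Counter()  # label -> number of training messages
    vocabulary = set()
    for text, label in samples:
        words = text.lower().split()
        word_counts.setdefault(label, Counter()).update(words)
        label_counts[label] += 1
        vocabulary.update(words)
    return word_counts, label_counts, vocabulary


def classify(model, text):
    """Return the label with the highest log-probability for the given text."""
    word_counts, label_counts, vocabulary = model
    total = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label in label_counts:
        # log prior + sum of log likelihoods with add-one smoothing
        score = math.log(label_counts[label] / total)
        denom = sum(word_counts[label].values()) + len(vocabulary)
        for word in text.lower().split():
            score += math.log((word_counts[label][word] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Labeled data: each message comes with its desired output
training = [
    ("win money now", "spam"),
    ("free prize win", "spam"),
    ("meeting at noon", "not-spam"),
    ("lunch meeting tomorrow", "not-spam"),
]
model = train_naive_bayes(training)
label_a = classify(model, "win free money")
label_b = classify(model, "meeting tomorrow")
```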

Unsupervised Learning
Unsupervised learning is used to detect anomalies or outliers, such as fraud or defective equipment, or
to group customers with similar behaviors for a sales campaign. Unlike supervised learning, it uses no
labeled data.
When learning data contains only some indications without any description or labels, it is up to the coder
or to the algorithm to find the structure of the underlying data, to discover hidden patterns, or to
determine how to describe the data. This kind of learning data is called unlabeled data.
Suppose that we have a number of data points, and we want to classify them into several groups. We may
not exactly know what the criteria of classification would be. So, an unsupervised learning algorithm tries
to classify the given dataset into a certain number of groups in an optimum way.
Unsupervised learning algorithms are extremely powerful tools for analyzing data and for identifying
patterns and trends. They are most commonly used for clustering similar input into logical groups.
Unsupervised learning algorithms include k-means, hierarchical clustering, and so on.
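
The grouping idea can be sketched with a plain k-means loop. This is a minimal illustration, not a
production implementation: the points are invented, and the starting centroids are passed in explicitly
to keep the run deterministic (random initialization is more common in practice). The name `k_means`
and its parameters are hypothetical.

```python
def squared_distance(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))


def k_means(points, centroids, max_iterations=100):
    """Plain k-means: alternate assignment and centroid update until stable."""
    centroids = [tuple(c) for c in centroids]
    assignments = []
    for _ in range(max_iterations):
        # Assignment step: each point goes to its nearest centroid
        assignments = [
            min(range(len(centroids)), key=lambda i: squared_distance(p, centroids[i]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its assigned points
        new_centroids = []
        for i, old in enumerate(centroids):
            members = [p for p, a in zip(points, assignments) if a == i]
            if members:
                new_centroids.append(tuple(sum(c) / len(members) for c in zip(*members)))
            else:
                new_centroids.append(old)  # an empty cluster keeps its centroid
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return assignments, centroids


# Unlabeled data: no group names are given, only coordinates
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 0.5), (8.0, 8.0), (9.0, 9.0), (8.0, 9.5)]
assignments, centroids = k_means(points, centroids=[points[0], points[1]])
```

The algorithm returns group indices only; it never learns what the groups mean, which is the sense in
which the data is unlabeled.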

Semi-supervised Learning
If some learning samples are labeled and others are not, the task is semi-supervised learning.
It makes use of a small amount of labeled data together with a large amount of unlabeled data during
training. Semi-supervised learning is applied in cases where it is expensive to acquire a fully labeled
dataset but practical to label a small subset. For example, it often requires skilled experts to label
certain remote sensing images, and many field experiments to locate oil at a particular location, while
acquiring unlabeled data is relatively easy.
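
One simple way to combine the two kinds of data is self-training, sketched below. Self-training is
only one of several semi-supervised techniques and is chosen here as an assumption for illustration;
the text above does not prescribe it. The classifier is a nearest-mean rule on one-dimensional values,
the data is invented, and `self_train` is a hypothetical name.

```python
def self_train(labeled, unlabeled):
    """Self-training with a nearest-mean classifier on 1-D values.

    labeled: list of (value, label); unlabeled: list of values.
    Each round pseudo-labels the unlabeled value the model is most sure about.
    """
    labeled = list(labeled)
    remaining = list(unlabeled)
    while remaining:
        # Fit: one mean per class from everything labeled so far
        means = {}
        for label in {lab for _, lab in labeled}:
            values = [v for v, lab in labeled if lab == label]
            means[label] = sum(values) / len(values)
        # Pick the unlabeled value closest to any class mean (most confident)
        value, label = min(
            ((v, lab) for v in remaining for lab in means),
            key=lambda pair: abs(pair[0] - means[pair[1]]),
        )
        labeled.append((value, label))
        remaining.remove(value)
    return labeled


# Two expensive labels, six cheap unlabeled measurements
labeled_points = [(1.0, "small"), (10.0, "large")]
unlabeled_points = [1.5, 2.0, 3.0, 8.0, 9.0, 9.5]
result = dict(self_train(labeled_points, unlabeled_points))
```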

Reinforcement Learning
Here the learner receives feedback from its environment so that the system adjusts to dynamic conditions
in order to achieve a certain objective. The system evaluates its performance based on the feedback and
reacts accordingly. The best-known instances include self-driving cars and AlphaGo, the program that
mastered the board game Go.
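
The feedback loop can be illustrated with a very small epsilon-greedy agent on a multi-armed bandit.
This is a toy sketch under stated assumptions: each action pays a fixed reward (real environments are
usually noisy), every action is tried once at the start, and `run_bandit` with its parameters is a
hypothetical name.

```python
import random


def run_bandit(rewards, steps=200, epsilon=0.1, seed=0):
    """Epsilon-greedy agent on a bandit whose arms pay fixed rewards."""
    rng = random.Random(seed)
    n_arms = len(rewards)
    estimates = [0.0] * n_arms   # running average reward per action
    pulls = [0] * n_arms

    def pull(arm):
        reward = rewards[arm]    # feedback from the environment
        pulls[arm] += 1
        # Incremental mean: move the estimate toward the observed reward
        estimates[arm] += (reward - estimates[arm]) / pulls[arm]

    for arm in range(n_arms):    # try every action once to start
        pull(arm)
    for _ in range(steps):
        if rng.random() < epsilon:   # explore: pick a random action
            arm = rng.randrange(n_arms)
        else:                        # exploit: pick the best action so far
            arm = max(range(n_arms), key=lambda a: estimates[a])
        pull(arm)
    return estimates, pulls


estimates, pulls = run_bandit([1.0, 5.0, 2.0])
best_action = max(range(len(estimates)), key=lambda a: estimates[a])
```

No labels are involved: the agent only sees the reward that follows each action and adjusts its
estimates accordingly.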

Purpose of Machine Learning


Machine learning can be seen as a branch of AI (artificial intelligence), since the ability to turn
experience into expertise or to detect patterns in complex data is a mark of human or animal intelligence.
As a field of science, machine learning shares common concepts with other disciplines such as statistics,
information theory, game theory, and optimization.
As a subfield of information technology, its objective is to program machines so that they will learn.
However, it should be noted that the purpose of machine learning is not building an automated duplication of
intelligent behavior, but using

What is training data?


Training data is also known as training dataset, learning set, and training set. It's an essential component
of every machine learning model and helps them make accurate predictions or perform a desired task.

Simply put, training data builds the machine learning model. It teaches what the expected output looks
like. The model analyzes the dataset repeatedly to deeply understand its characteristics and adjust itself
for better performance.

In a broader sense, training data can be classified into two categories: labeled data and unlabeled data.

What is labeled data?


Labeled data is a group of data samples tagged with one or more meaningful labels. It's also called
annotated data, and its labels identify specific characteristics, properties, classifications, or contained
objects.

For example, the images of fruits can be tagged as apples, bananas, or grapes.

Labeled training data is used in supervised learning. It enables ML models to learn the characteristics
associated with specific labels, which can be used to classify newer data points. In the example above,
this means that a model can use labeled image data to understand the features of specific fruits and use
this information to group new images.
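
A minimal sketch of how labeled samples can be used to classify a new data point, here with a
1-nearest-neighbor rule. The features (length and width in centimetres) and all values are invented for
illustration; a real image model would work on pixels or learned features instead.

```python
import math

# Labeled data: each sample is (features, label).
# Features are hypothetical: (length in cm, width in cm).
labeled_fruits = [
    ((7.0, 7.5), "apple"),
    ((8.0, 8.0), "apple"),
    ((18.0, 3.5), "banana"),
    ((20.0, 4.0), "banana"),
    ((2.0, 1.8), "grape"),
    ((2.5, 2.0), "grape"),
]


def nearest_neighbor_label(sample, labeled_data):
    """Return the label of the closest labeled example (1-nearest-neighbor)."""
    _, label = min(labeled_data, key=lambda item: math.dist(sample, item[0]))
    return label


new_fruit = (19.0, 3.8)
predicted = nearest_neighbor_label(new_fruit, labeled_fruits)
```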

Data labeling or annotation is a time-consuming process as humans need to tag or label the data points.
Labeled data collection is challenging and expensive. It isn't easy to store labeled data when compared to
unlabeled data.

What is unlabeled data?


As expected, unlabeled data is the opposite of labeled data. It's raw data or data that's not tagged with
any labels for identifying classifications, characteristics, or properties. It's used in unsupervised machine
learning, and the ML models have to find patterns or similarities in the data to reach conclusions.

Going back to the previous example of apples, bananas, and grapes, in unlabeled training data, the
images of those fruits won't be labeled. The model will have to evaluate each image by looking at its
characteristics, such as color and shape.

After analyzing a considerable number of images, the model will be able to differentiate new images (new
data) into the fruit types of apples, bananas, or grapes. Of course, the model wouldn't know that the
particular fruit is called an apple. Instead, it knows the characteristics needed to identify it.

There are hybrid models that use a combination of supervised and unsupervised machine learning.

How training data is used in machine learning

Unlike machine learning algorithms, traditional programming algorithms follow a set of instructions to
accept input data and provide output. They don't rely on historical data, and every action they make is
rule-based. This also means that they don't improve over time, which isn't the case with machine learning.

For machine learning models, historical data is fodder. Just as humans rely on past experiences to make
better decisions, ML models look at their training dataset with past observations to make predictions.

Predictions could include classifying images as in the case of image recognition, or understanding the
context of a sentence as in natural language processing (NLP).

Think of a data scientist as a teacher, the machine learning algorithm as the student, and the training
dataset as the collection of all textbooks.

The teacher’s aspiration is that the student must perform well in exams and also in the real world. In the
case of ML algorithms, testing is like exams. The textbooks (training dataset) contain several examples of
the type of questions that’ll be asked in the exam.

Training data vs. test data vs. validation data

Training data is used in model training; in other words, it is the data used to fit the model. In
contrast, test data is used to evaluate the performance or accuracy of the model. It is a sample of data
used to make an unbiased evaluation of the final model fit on the training data.

A training dataset is an initial dataset that teaches the ML models to identify desired patterns or perform a
particular task. A testing dataset is used to evaluate how effective the training was or how accurate the
model is.

If an ML algorithm is trained on a particular dataset and then tested on the same dataset, it is likely
to show high accuracy because the model has already seen those examples. That would be fine only if the
training dataset contained every possible value the model might encounter in the future; in practice it
does not, so a separate test set gives a more honest estimate.

Then there's validation data. This is a dataset used for frequent evaluation during the training phase.
Although the model sees this dataset occasionally, it doesn't learn from it. The validation set is also
referred to as the development set or dev set. It helps protect models from overfitting and underfitting.
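
The three datasets described above are usually produced by shuffling the available samples and cutting
them into disjoint parts. The sketch below assumes a 70/15/15 split and a fixed random seed; both are
illustrative choices rather than fixed rules, and `split_dataset` is a hypothetical helper name.

```python
import random


def split_dataset(samples, train_fraction=0.7, validation_fraction=0.15, seed=42):
    """Shuffle a copy of the data and cut it into train/validation/test parts."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    n_train = round(len(shuffled) * train_fraction)
    n_validation = round(len(shuffled) * validation_fraction)
    train = shuffled[:n_train]                              # used to fit the model
    validation = shuffled[n_train:n_train + n_validation]   # used to tune during training
    test = shuffled[n_train + n_validation:]                # used once, for the final check
    return train, validation, test


train, validation, test = split_dataset(range(100))
```

Because the three parts never overlap, the accuracy measured on the test part is not inflated by
examples the model has already seen.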

What Is Function Approximation


Function approximation is a technique for estimating an unknown underlying function using historical or
available observations from the domain.
Artificial neural networks learn to approximate a function.

In supervised learning, a dataset is comprised of inputs and outputs, and the supervised learning algorithm
learns how to best map examples of inputs to examples of outputs.

We can think of this mapping as being governed by a mathematical function, called the mapping
function, and it is this function that a supervised learning algorithm seeks to best approximate.

Neural networks are an example of a supervised learning algorithm and seek to approximate the function
represented by your data. This is achieved by calculating the error between the predicted outputs and the
expected outputs and minimizing this error during the training process.

We say “approximate” because although we suspect such a mapping function exists, we don’t know
anything about it.
The true function that maps inputs to outputs is unknown and is often referred to as the target function. It
is the target of the learning process, the function we are trying to approximate using only the data that is
available. If we knew the target function, we would not need to approximate it, i.e. we would not need a
supervised machine learning algorithm. Therefore, function approximation is only a useful tool when the
underlying target mapping function is unknown.
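
The idea of minimizing the error between predicted and expected outputs can be sketched with gradient
descent on a straight-line model. This is a simplified stand-in for a neural network, chosen as an
assumption to keep the example short: the "unknown" target function is y = 2x + 1, and the learning
rate and step count are illustrative values.

```python
def fit_by_gradient_descent(xs, ys, learning_rate=0.05, steps=2000):
    """Approximate y = w * x + b by minimizing mean squared error."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        # Error between predicted and expected outputs for every sample
        errors = [(w * x + b) - y for x, y in zip(xs, ys)]
        # Gradients of the mean squared error with respect to w and b
        grad_w = 2.0 / n * sum(e * x for e, x in zip(errors, xs))
        grad_b = 2.0 / n * sum(errors)
        # Step against the gradient to reduce the error
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b


# Observations produced by the target function; the learner only sees these pairs
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [2.0 * x + 1.0 for x in xs]
w, b = fit_by_gradient_descent(xs, ys)
```

The learner never sees the formula, only the input/output pairs, yet w and b end up close to the
coefficients of the target function.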

Applications of Machine learning


Machine learning is a buzzword in today's technology and is growing rapidly. We use machine learning in
our daily lives, often without knowing it, through services such as Google Maps, Google Assistant, and
Alexa. Below are some popular real-world applications of machine learning:

1. Image Recognition:

Image recognition is one of the most common applications of machine learning. It is used to identify objects,
persons, places, and so on in digital images. A popular use case of image recognition and face detection
is automatic friend tagging suggestion:

Facebook provides a feature of automatic friend tagging suggestions. Whenever we upload a photo with our
Facebook friends, we automatically get a tagging suggestion with a name, and the technology behind this
is machine learning's face detection and recognition algorithm.

It is based on the Facebook project named "DeepFace," which is responsible for face recognition and person
identification in the picture.

2. Speech Recognition

While using Google, we get an option to "Search by voice." This comes under speech recognition, a
popular application of machine learning.

Speech recognition is the process of converting voice instructions into text; it is also known as "speech to
text" or "computer speech recognition." At present, machine learning algorithms are widely used in speech
recognition applications. Google Assistant, Siri, Cortana, and Alexa use speech recognition
technology to follow voice instructions.

3. Traffic prediction:

If we want to visit a new place, we take the help of Google Maps, which shows us the shortest route and
predicts the traffic conditions.

It predicts whether traffic is clear, slow-moving, or heavily congested in two ways:

o Real-time location of vehicles from the Google Maps app and sensors
o Average time taken on past days at the same time of day

Everyone who uses Google Maps helps to make the app better. It takes information from the user
and sends it back to its database to improve performance.

4. Product recommendations:

Machine learning is widely used by e-commerce and entertainment companies such as Amazon and Netflix
for recommending products to users. Whenever we search for a product on Amazon, we start getting
advertisements for the same product while surfing the internet in the same browser, and this is because
of machine learning.

Google understands the user's interests using various machine learning algorithms and suggests products
accordingly.

Similarly, when we use Netflix, we find recommendations for series, movies, and so on, and this is also
done with the help of machine learning.

5. Self-driving cars:

One of the most exciting applications of machine learning is self-driving cars, in which machine learning
plays a significant role. Tesla, a well-known car manufacturer, is working on self-driving cars and trains
its models to detect people and objects while driving.

6. Email Spam and Malware Filtering:

Whenever we receive a new email, it is automatically filtered as important, normal, or spam. Important
mail arrives in our inbox marked with the important symbol, and spam goes to our spam box; the
technology behind this is machine learning. Below are some spam filters used by Gmail:

o Content Filter
o Header filter
o General blacklists filter
o Rules-based filters
o Permission filters

Some machine learning algorithms such as Multi-Layer Perceptron, Decision tree, and Naïve Bayes
classifier are used for email spam filtering and malware detection.

7. Virtual Personal Assistant:

We have various virtual personal assistants such as Google Assistant, Alexa, Cortana, and Siri. As the name
suggests, they help us find information using voice instructions. These assistants can help us in
various ways through voice instructions alone, such as playing music, calling someone, opening an email,
or scheduling an appointment.

Machine learning algorithms are an important part of these virtual assistants.

The assistants record our voice instructions, send them to a server in the cloud, decode them using ML
algorithms, and act accordingly.

8. Online Fraud Detection:

Machine learning makes our online transactions safer by detecting fraudulent transactions. Whenever we
perform an online transaction, fraud can take place in various ways, such as fake accounts, fake IDs, or
money stolen in the middle of a transaction. To detect this, a feed-forward neural network checks whether
a transaction is genuine or fraudulent.

For each genuine transaction, the output is converted into hash values, and these values become the
input for the next round. Genuine transactions follow a specific pattern, which changes for a fraudulent
transaction; the network detects this change and so makes our online transactions more secure.

9. Stock Market trading:

Machine learning is widely used in stock market trading. In the stock market, there is always a risk of ups
and downs in share prices, so long short-term memory (LSTM) neural networks are used for the
prediction of stock market trends.

10. Medical Diagnosis:

In medical science, machine learning is used for diagnosing diseases. With it, medical technology is
growing very fast and is able to build 3D models that can predict the exact position of lesions in the brain.

It helps in finding brain tumors and other brain-related diseases easily.

11. Automatic Language Translation:

Nowadays, if we visit a new place and do not know the language, it is not a problem: machine learning
helps us by converting text into languages we know. Google's GNMT (Google Neural Machine Translation)
provides this feature; it is a neural machine translation system that translates text into our familiar
language, and this is called automatic translation.

The technology behind automatic translation is a sequence-to-sequence learning algorithm, which is used
together with image recognition to translate text from one language to another.

Machine learning Life cycle


Machine learning gives computer systems the ability to learn automatically without being explicitly
programmed. But how does a machine learning system work? It can be described using the machine learning
life cycle, a cyclic process for building an efficient machine learning project. The main purpose of the
life cycle is to find a solution to the problem or project.

Machine learning life cycle involves seven major steps, which are given below:

o Gathering Data
o Data preparation
o Data Wrangling
o Analyse Data
o Train the model
o Test the model
o Deployment

The most important thing in the complete process is to understand the problem and to know its purpose.
Therefore, before starting the life cycle, we need to understand the problem, because a good result
depends on a good understanding of the problem.

In the complete life cycle process, to solve a problem, we create a machine learning system called a "model",
and this model is created by "training". But to train a model we need data; hence, the life cycle starts
by collecting data.

1. Gathering Data:
Data gathering is the first step of the machine learning life cycle. The goal of this step is to identify
the data sources and obtain the data needed for the problem.

In this step, we need to identify the different data sources, as data can be collected from various sources
such as files, databases, the internet, or mobile devices. It is one of the most important steps of the life
cycle. The quantity and quality of the collected data determine the efficiency of the output. In general,
the more good-quality data we have, the more accurate the prediction will be.

This step includes the below tasks:

o Identify various data sources
o Collect data
o Integrate the data obtained from different sources

By performing the above tasks, we get a coherent set of data, also called a dataset. It will be used in
further steps.

2. Data preparation
After collecting the data, we need to prepare it for further steps. Data preparation is a step where we put
our data into a suitable place and prepare it for use in machine learning training.

In this step, we first put all the data together and then randomize its ordering.

This step can be further divided into two processes:

o Data exploration:
It is used to understand the nature of the data that we have to work with. We need to understand the
characteristics, format, and quality of the data.
A better understanding of the data leads to a more effective outcome. In this step, we find correlations,
general trends, and outliers.
o Data pre-processing:
The next step is preprocessing the data for analysis.
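
Two of the exploration tasks named above, finding correlations and finding outliers, can be sketched
with the standard library. The z-score threshold of 2.0 and all data values are assumptions made for
illustration; `pearson_correlation` and `find_outliers` are hypothetical helper names.

```python
import statistics


def pearson_correlation(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    mean_x, mean_y = statistics.fmean(xs), statistics.fmean(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def find_outliers(values, threshold=2.0):
    """Values whose distance from the mean exceeds `threshold` standard deviations."""
    mean = statistics.fmean(values)
    spread = statistics.pstdev(values)
    return [v for v in values if abs(v - mean) / spread > threshold]


sizes = [50.0, 60.0, 70.0, 80.0, 90.0]
prices = [100.0, 120.0, 140.0, 160.0, 180.0]
correlation = pearson_correlation(sizes, prices)  # near 1.0 means a strong linear trend

readings = [10.0, 11.0, 9.0, 10.0, 12.0, 10.0, 11.0, 9.0, 10.0, 50.0]
outliers = find_outliers(readings)
```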

3. Data Wrangling
Data wrangling is the process of cleaning and converting raw data into a usable format. It involves
cleaning the data, selecting the variables to use, and transforming the data into a proper format to make
it more suitable for analysis in the next step. It is one of the most important steps of the complete
process. Cleaning of data is required to address quality issues.

Not all of the data we collect is necessarily useful. In real-world applications, collected data may have
various issues, including:

o Missing Values
o Duplicate data
o Invalid data
o Noise

So, we use various filtering techniques to clean the data.

It is important to detect and remove the above issues because they can negatively affect the quality of
the outcome.
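
A minimal sketch of such filtering for three of the issues listed above (missing values, duplicates,
and invalid data); noise is not handled here. The record fields "size" and "price" and the rule that
values must be positive are assumptions for illustration, and dropping bad records is only one option
(filling in missing values is another).

```python
def clean_records(records):
    """Drop records with missing, invalid, or duplicated values.

    Each record is a dict with the hypothetical fields "size" and "price".
    """
    cleaned = []
    seen = set()
    for record in records:
        size, price = record.get("size"), record.get("price")
        if size is None or price is None:   # missing value
            continue
        if size <= 0 or price <= 0:         # invalid value
            continue
        key = (size, price)
        if key in seen:                     # duplicate record
            continue
        seen.add(key)
        cleaned.append(record)
    return cleaned


raw = [
    {"size": 50, "price": 150},
    {"size": 50, "price": 150},     # duplicate
    {"size": None, "price": 200},   # missing size
    {"size": 80, "price": -1},      # invalid price
    {"size": 100, "price": 300},
]
cleaned = clean_records(raw)
```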

4. Data Analysis
Now the cleaned and prepared data is passed on to the analysis step. This step involves:

o Selection of analytical techniques
o Building models
o Review the result

The aim of this step is to build a machine learning model that analyzes the data using various analytical
techniques, and to review the outcome. It starts with determining the type of problem, where we select a
machine learning technique such as classification, regression, cluster analysis, or association; we then
build the model using the prepared data and evaluate it.

Hence, in this step, we take the data and use machine learning algorithms to build the model.

5. Train Model
The next step is to train the model. In this step, we train our model to improve its performance and
obtain a better outcome for the problem.

We use datasets to train the model using various machine learning algorithms. Training is required
so that the model can understand the various patterns, rules, and features.

6. Test Model
Once our machine learning model has been trained on a given dataset, we test it. In this step,
we check the accuracy of our model by providing a test dataset to it.

Testing the model determines its percentage accuracy with respect to the requirements of the project or
problem.
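
Percentage accuracy is simply the share of test examples the model labels correctly. The sketch below
uses invented predictions and expected labels; `accuracy` is a hypothetical helper name.

```python
def accuracy(predictions, expected):
    """Fraction of predictions that match the expected labels."""
    correct = sum(1 for p, e in zip(predictions, expected) if p == e)
    return correct / len(expected)


# Expected labels from the test dataset and the model's predictions for them
expected = ["spam", "spam", "not-spam", "not-spam", "spam"]
predictions = ["spam", "not-spam", "not-spam", "not-spam", "spam"]

score = accuracy(predictions, expected)   # fraction between 0 and 1
percentage = 100 * score                  # the same value as a percentage
```

Whether a given percentage is good enough depends on the requirements of the project, as noted above.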

7. Deployment
The last step of the machine learning life cycle is deployment, where we deploy the model in a real-world
system.

If the model prepared above produces accurate results that meet our requirements at an acceptable speed,
we deploy it in the real system. But before deploying the project, we check whether it is improving its
performance using the available data. The deployment phase is similar to making the final report for a
project.
