0% found this document useful (0 votes)
25 views12 pages

Ai 4

Machine learning is a subset of artificial intelligence that enables systems to learn from data and experience to improve their performance. It builds mathematical models to make predictions or decisions without being explicitly programmed. This document discusses the basics of machine learning, including how it works, common techniques like supervised and unsupervised learning, applications, and the machine learning lifecycle. Key applications mentioned include image recognition, speech recognition, and recommender systems.

Uploaded by

Knal
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
25 views12 pages

Ai 4

Machine learning is a subset of artificial intelligence that enables systems to learn from data and experience to improve their performance. It builds mathematical models to make predictions or decisions without being explicitly programmed. This document discusses the basics of machine learning, including how it works, common techniques like supervised and unsupervised learning, applications, and the machine learning lifecycle. Key applications mentioned include image recognition, speech recognition, and recommender systems.

Uploaded by

Knal
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 12

Machine learning is a growing technology which enables computers to learn automatically from past

data. Machine learning uses various algorithms for building mathematical models and making
predictions using historical data or information. Currently, it is being used for various tasks such
as image recognition, speech recognition, email filtering, Facebook auto-tagging, recommender system,
and many more.

This machine learning tutorial gives you an introduction to machine learning along with the wide
range of machine learning techniques such as Supervised, Unsupervised, and Reinforcement learning.
You will learn about regression and classification models, clustering methods, hidden Markov models,
and various sequential models.

What is Machine Learning

In the real world, we are surrounded by humans who can learn everything from their experiences
with their learning capability, and we have computers or machines which work on our instructions.
But can a machine also learn from experiences or past data like a human does? So here comes the
role of Machine Learning.

Machine Learning is said as a subset of artificial intelligence that is mainly concerned with the
development of algorithms which allow a computer to learn from the data and past experiences on
their own. The term machine learning was first introduced by Arthur Samuel in 1959. We can define
it in a summarized way as:

Machine learning enables a machine to automatically learn from data, improve

performance from experiences, and predict things without being explicitly

programmed.

With the help of sample historical data, which is known as training data, machine learning algorithms
build a mathematical model that helps in making predictions or decisions without being explicitly
programmed. Machine learning brings computer science and statistics together for creating predictive
models. Machine learning constructs or uses the algorithms that learn from historical data. The more
we will provide the information, the higher will be the performance.

A machine has the ability to learn if it can improve its performance by gaining more data.

How does Machine Learning work

A Machine Learning system learns from historical data, builds the prediction models, and whenever it
receives new data, predicts the output for it. The accuracy of predicted output depends upon the
amount of data, as the huge amount of data helps to build a better model which predicts the output
more accurately.
Suppose we have a complex problem, where we need to perform some predictions, so instead of
writing a code for it, we just need to feed the data to generic algorithms, and with the help of these
algorithms, machine builds the logic as per the data and predict the output. Machine learning has
changed our way of thinking about the problem. The below block diagram explains the working of
Machine Learning algorithm:

Features of Machine Learning:

o Machine learning uses data to detect various patterns in a given dataset.

o It can learn from past data and improve automatically.

o It is a data-driven technology.

O Machine learning is much similar to data mining as it also deals with the huge amount of the
data.

Need for Machine Learning

The need for machine learning is increasing day by day. The reason behind the need for machine
learning is that it is capable of doing tasks that are too complex for a person to implement directly.
As a human, we have some limitations as we cannot access the huge amount of data manually, so for
this, we need some computer systems and here comes the machine learning to make things easy for
us.

We can train machine learning algorithms by providing them the huge amount of data and let them
explore the data, construct the models, and predict the required output automatically. The
performance of the machine learning algorithm depends on the amount of data, and it can be
determined by the cost function. With the help of machine learning, we can save both time and
money.

The importance of machine learning can be easily understood by its uses cases, Currently, machine
learning is used in self-driving cars, cyber fraud detection, face recognition, and friend suggestion by
Facebook, etc. Various top companies such as Netflix and Amazon have build machine learning models
that are using a vast amount of data to analyze the user interest and recommend product
accordingly.

Following are some key points which show the importance of Machine Learning:

o Rapid increment in the production of data


o Solving complex problems, which are difficult for a human

o Decision making in various sector including finance

o Finding hidden patterns and extracting useful information from data.

Classification of Machine Learning


At a broad level, machine learning can be classified into three types:

1. Supervised learning
2. Unsupervised learning
3. Reinforcement learning

Applications of Machine learning


Machine learning is a buzzword for today's technology, and it is growing very rapidly day by day. We
are using machine learning in our daily life even without knowing it such as Google Maps, Google
assistant, Alexa, etc. Below are some most trending real-world applications of Machine Learning:

Machine learning Life cycle


Machine learning has given the computer systems the abilities to automatically learn without being
explicitly programmed. But how does a machine learning system work? So, it can be described using
the life cycle of machine learning. Machine learning life cycle is a cyclic process to build an efficient
machine learning project. The main purpose of the life cycle is to find a solution to the problem or
project.

Machine learning life cycle involves seven major steps, which are given below:

o Gathering Data
o Data preparation

o Data Wrangling

o Analyse Data

o Train the model

o Test the model

o Deployment

What is a dataset?

A dataset is a collection of data in which data is arranged in some order. A dataset can contain any
data from a series of an array to a database table. Below table shows an example of the dataset:

Country Age Salary Purchased

India 38 48000 No

France 43 45000 Yes

Germany 30 54000 No

France 48 65000 No

Germany 40 Yes

India 35 58000 Yes

A tabular dataset can be understood as a database table or matrix, where each column corresponds to
a particular variable, and each row corresponds to the fields of the dataset. The most supported file
type for a tabular dataset is "Comma Separated File," or CSV. But to store a "tree-like data," we can
use the JSON file more efficiently.

Types of data in datasets

o Numerical data:Such as house price, temperature, etc.

o Categorical data:Such as Yes/No, True/False, Blue/green, etc.

o Ordinal data:These data are similar to categorical data but can be measured on the basis of
comparison.

Note: A real-world dataset is of huge size, which is difficult to manage and process at the initial
level. Therefore, to practice machine learning algorithms, we can use any dummy dataset.

Supervised Machine Learning


Supervised learning is the types of machine learning in which machines are trained using well
"labelled" training data, and on basis of that data, machines predict the output. The labelled data
means some input data is already tagged with the correct output.

In supervised learning, the training data provided to the machines work as the supervisor that teaches
the machines to predict the output correctly. It applies the same concept as a student learns in the
supervision of the teacher.

Supervised learning is a process of providing input data as well as correct output data to the machine
learning model. The aim of a supervised learning algorithm is to find a mapping function to map the
input variable(x) with the output variable(y).

In the real-world, supervised learning can be used for Risk Assessment, Image classification, Fraud
Detection, spam filtering, etc.

How Supervised Learning Works?

In supervised learning, models are trained using labelled dataset, where the model learns about each
type of data. Once the training process is completed, the model is tested on the basis of test data (a
subset of the training set), and then it predicts the output.

The working of Supervised learning can be easily understood by the below example and diagram:

Suppose we have a dataset of different types of shapes which includes square, rectangle, triangle, and
Polygon. Now the first step is that we need to train the model for each shape.

o If the given shape has four sides, and all the sides are equal, then it will be labelled as a Square.

o If the given shape has three sides, then it will be labelled as a triangle.

o If the given shape has six equal sides then it will be labelled as hexagon.

Now, after training, we test our model using the test set, and the task of the model is to identify the
shape.

The machine is already trained on all types of shapes, and when it finds a new shape, it classifies the
shape on the bases of a number of sides, and predicts the output.
Steps Involved in Supervised Learning:

o First Determine the type of training dataset

o Collect/Gather the labelled training data.

o Split the training dataset into training dataset, test dataset, and validation dataset.

o Determine the input features of the training dataset, which should have enough knowledge so
that the model can accurately predict the output.

o Determine the suitable algorithm for the model, such as support vector machine, decision tree,
etc.

o Execute the algorithm on the training dataset. Sometimes we need validation sets as the
control parameters, which are the subset of training datasets.

O Evaluate the accuracy of the model by providing the test set. If the model predicts the correct
output, which means our model is accurate.

Types of supervised Machine learning Algorithms:

Supervised learning can be further divided into two types of problems:

1. Regression

Regression algorithms are used if there is a relationship between the input variable and the output
variable. It is used for the prediction of continuous variables, such as Weather forecasting, Market
Trends, etc. Below are some popular Regression algorithms which come under supervised learning:

o Linear Regression

o Regression Trees

o Non-Linear Regression

o Bayesian Linear Regression

o Polynomial Regression

2. Classification

Classification algorithms are used when the output variable is categorical, which means there are two
classes such as Yes-No, Male-Female, True-false, etc.

Spam Filtering,
o Random Forest

o Decision Trees

o Logistic Regression

o Support vector Machines

Note: We will discuss these algorithms in detail in later chapters.

Advantages of Supervised learning:

o With the help of supervised learning, the model can predict the output on the basis of prior
experiences.

o In supervised learning, we can have an exact idea about the classes of objects.

o Supervised learning model helps us to solve various real-world problems such as fraud detection,
spam filtering, etc.

Disadvantages of supervised learning:

o Supervised learning models are not suitable for handling the complex tasks.

o Supervised learning cannot predict the correct output if the test data is different from the
training dataset.

o Training required lots of computation times.

o In supervised learning, we need enough knowledge about the classes of object.

Unsupervised Machine Learning


In the previous topic, we learned supervised machine learning in which models are trained using
labeled data under the supervision of training data. But there may be many cases in which we do not
have labeled data and need to find the hidden patterns from the given dataset. So, to solve such types
of cases in machine learning, we need unsupervised learning techniques.

What is Unsupervised Learning?

As the name suggests, unsupervised learning is a machine learning technique in which models are not
supervised using training dataset. Instead, models itself find the hidden patterns and insights from the
given data. It can be compared to learning which takes place in the human brain while learning new
things. It can be defined as:

Unsupervised learning is a type of machine learning in which models are trained using
unlabeled dataset and are allowed to act on that data without any supervision.
Unsupervised learning cannot be directly applied to a regression or classification problem because
unlike supervised learning, we have the input data but no corresponding output data. The goal of
unsupervised learning is to find the underlying structure of dataset, group that data according to
similarities, and represent that dataset in a compressed format.

Example: Suppose the unsupervised learning algorithm is given an input dataset containing images of
different types of cats and dogs. The algorithm is never trained upon the given dataset, which means
it does not have any idea about the features of the dataset. The task of the unsupervised learning
algorithm is to identify the image features on their own. Unsupervised learning algorithm will perform
this task by clustering the image dataset into the groups according to similarities between images.

Why use Unsupervised Learning?

Below are some main reasons which describe the importance of Unsupervised Learning:

o Unsupervised learning is helpful for finding useful insights from the data.

o Unsupervised learning is much similar as a human learns to think by their own experiences,
which makes it closer to the real AI.

o Unsupervised learning works on unlabeled and uncategorized data which make unsupervised
learning more important.

o In real-world, we do not always have input data with the corresponding output so to solve
such cases, we need unsupervised learning.

Working of Unsupervised Learning

Working of unsupervised learning can be understood by the below diagram:


Here, we have taken an unlabeled input data, which means it is not categorized and corresponding
outputs are also not given. Now, this unlabeled input data is fed to the machine learning model in
order to train it. Firstly, it will interpret the raw data to find the hidden patterns from the data and
then will apply suitable algorithms such as k-means clustering, Decision tree, etc.

Once it applies the suitable algorithm, the algorithm divides the data objects into groups according to
the similarities and difference between the objects.

Types of Unsupervised Learning Algorithm:

The unsupervised learning algorithm can be further categorized into two types of problems:

o Clustering: Clustering is a method of grouping the objects into clusters such that objects with
most similarities remains into a group and has less or no similarities with the objects of
another group. Cluster analysis finds the commonalities between the data objects and
categorizes them as per the presence and absence of those commonalities.

O Association: An association rule is an unsupervised learning method which is used for finding
the relationships between variables in the large database. It determines the set of items that
occurs together in the dataset. Association rule makes marketing strategy more effective. Such
as people who buy X item (suppose a bread) are also tend to purchase Y (Butter/Jam) item. A
typical example of Association rule is Market Basket Analysis.

Note: We will learn these algorithms in later chapters.

Unsupervised Learning algorithms:

Below is the list of some popular unsupervised learning algorithms:

o K-means clustering

o KNN (k-nearest neighbors)

o Hierarchal clustering

o Anomaly detection

o Neural Networks

o Principle Component Analysis

o Independent Component Analysis

o Apriori algorithm

o Singular value decomposition


Advantages of Unsupervised Learning

o Unsupervised learning is used for more complex tasks as compared to supervised learning
because, in unsupervised learning, we don't have labeled input data.

o Unsupervised learning is preferable as it is easy to get unlabeled data in comparison to labeled


data.

Disadvantages of Unsupervised Learning

o Unsupervised learning is intrinsically more difficult than supervised learning as it does not have
corresponding output.

o The result of the unsupervised learning algorithm might be less accurate as input data is not
labeled, and algorithms do not know the exact output in advance.

What is a Decision Tree?


A decision tree is a flowchart-like tree structure where each internal node denotes the feature,
branches denote the rules and the leaf nodes denote the result of the algorithm. It is a
versatile supervised machine-learning algorithm, which is used for both classification and
regression problems. It is one of the very powerful algorithms. And it is also used in Random
Forest to train on different subsets of training data, which makes random forest one of the
most powerful algorithms in machine learning.

Naïve Bayes Classifier Algorithm


o Naïve Bayes algorithm is a supervised learning algorithm, which is based on Bayes theorem and
used for solving classification problems.

o It is mainly used in text classification that includes a high-dimensional training dataset.

o Naïve Bayes Classifier is one of the simple and most effective Classification algorithms which
helps in building the fast machine learning models that can make quick predictions.

o It is a probabilistic classifier, which means it predicts on the basis of the probability of an object.

o Some popular examples of Naïve Bayes Algorithm are spam filtration, Sentimental analysis,
and classifying articles.
Reinforcement learning
Reinforcement learning is an area of Machine Learning. It is about taking suitable action to
maximize reward in a particular situation. It is employed by various software and machines to
find the best possible behavior or path it should take in a specific situation. Reinforcement
learning differs from supervised learning in a way that in supervised learning the training data
has the answer key with it so the model is trained with the correct answer itself whereas in
reinforcement learning, there is no answer but the reinforcement agent decides what to do to
perform the given task. In the absence of a training dataset, it is bound to learn from its
experience.

Reinforcement Learning (RL) is the science of decision making. It is about learning the optimal
behavior in an environment to obtain maximum reward. In RL, the data is accumulated from
machine learning systems that use a trial-and-error method. Data is not part of the input
that we would find in supervised or unsupervised machine learning.

Example:

The problem is as follows: We have an agent and a reward, with many hurdles in between. The
agent is supposed to find the best possible path to reach the reward. The following problem
explains the problem more easily.

The above image shows the robot, diamond, and fire. The goal of the robot is to get the reward
that is the diamond and avoid the hurdles that are fired. The robot learns by trying all the
possible paths and then choosing the path which gives him the reward with the least hurdles.
Each right step will give the robot a reward and each wrong step will subtract the reward of
the robot. The total reward will be calculated when it reaches the final reward that is the
diamond.

Main points in Reinforcement learning –

 Input: The input should be an initial state from which the model will start
 Output: There are many possible outputs as there are a variety of solutions to a
particular problem
 Training: The training is based upon the input, The model will return a state and the
user will decide to reward or punish the model based on its output.
 The model keeps continues to learn.
 The best solution is decided based on the maximum reward.
Expectation-Maximization (EM) algorithm :-
 The Expectation-Maximization (EM) algorithm is a general algorithm for
maximum likelihood estimation where the data are incomplete or the likelihood
function involves latent variables.
 Incomplete data and latent variables are related: when we have a latent
variable, we may regard our data as being incomplete since we do not observe
values of the latent variables; similarly, when our data are incomplete, we often
can also associate some latent variable with the missing data.
 For language modeling, the EM algorithm is often used to estimate
parameters of a mixture model, in which the exact component model from which
a data point is generated is hidden from us.

 The EM algorithm starts with randomly assigning values to all the parameters
to be estimated. It then iteratively alternates between two steps, called the
expectation step (i.e., the “E-step”) and the maximization step (i.e., the “M-
step”), respectively.
 In the E-step, it computes the expected likelihood for the complete data
(the so-called Q function) where the expectation is taken w.r.t. the computed
conditional distrib

You might also like