AIUNIT4
LEARNING :-
Machine Learning (ML) is automated learning with little or no human intervention. It involves
programming computers so that they learn from the available inputs. The main purpose of machine learning
is to explore and construct algorithms that can learn from previous data and make predictions on new
input data.
The input to a learning algorithm is training data, representing experience, and the output is any expertise,
which usually takes the form of another algorithm that can perform a task. The input data to a machine
learning system can be numerical, textual, audio, visual, or multimedia.
Concepts of Learning
Learning is the process of converting experience into expertise or knowledge.
Learning can be broadly classified into three categories, as mentioned below, based on the nature of the
learning data and interaction between the learner and the environment.
● Supervised Learning
● Unsupervised Learning
● Semi-supervised Learning
Similarly, there are four categories of machine learning algorithms as shown below −
● Supervised learning algorithm
● Unsupervised learning algorithm
● Semi-supervised learning algorithm
● Reinforcement learning algorithm
However, the most commonly used ones are supervised and unsupervised learning.
Supervised Learning
Supervised learning is commonly used in real world applications, such as face and speech recognition,
products or movie recommendations, and sales forecasting.
Supervised learning can be further classified into two types - Regression and Classification.
Regression trains on and predicts a continuous-valued response, for example predicting real estate
prices.
Classification attempts to find the appropriate class label, for example analyzing positive/negative sentiment,
distinguishing male and female persons, benign and malignant tumors, or secure and insecure loans.
In supervised learning, learning data comes with description, labels, targets or desired outputs and the
objective is to find a general rule that maps inputs to outputs. This kind of learning data is called labeled
data. The learned rule is then used to label new data with unknown outputs.
Supervised learning involves building a machine learning model that is based on labeled samples. For
example, if we build a system to estimate the price of a plot of land or a house based on various features,
such as size and location, we first need to train it on examples of properties whose prices are known.
There are many supervised learning algorithms such as Logistic Regression, Neural networks, Support
Vector Machines (SVMs), and Naive Bayes classifiers.
Common examples of supervised learning include classifying e-mails into spam and not-spam categories,
labeling webpages based on their content, and voice recognition.
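Supervised learning on labeled data can be sketched with a tiny nearest-neighbour classifier in plain Python. The two-feature "spam"/"not-spam" points below are invented purely for illustration, not drawn from any real dataset.

```python
# Minimal sketch of supervised learning: a 1-nearest-neighbour classifier.
# Training data is labeled; prediction copies the label of the closest sample.

def euclidean(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

def predict(train, x):
    # Return the label of the training sample closest to x.
    nearest = min(train, key=lambda pair: euclidean(pair[0], x))
    return nearest[1]

# Labeled training data: (feature vector, label). Values are made up.
train = [((1.0, 1.0), "spam"), ((1.2, 0.8), "spam"),
         ((5.0, 5.0), "not-spam"), ((4.8, 5.2), "not-spam")]

print(predict(train, (1.1, 0.9)))   # -> spam
print(predict(train, (5.1, 4.9)))   # -> not-spam
```

The "general rule" here is implicit: new inputs are labeled by their proximity to known, labeled examples.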
Unsupervised Learning
Unsupervised learning is used to detect anomalies and outliers, such as fraud or defective equipment, or to
group customers with similar behaviors for a sales campaign. It is the opposite of supervised learning:
there is no labeled data here.
When learning data contains only some indications, without any description or labels, it is up to the coder
or the algorithm to find the structure of the underlying data, to discover hidden patterns, or to determine how
to describe the data. This kind of learning data is called unlabeled data.
Unsupervised learning algorithms include K-means, hierarchical clustering, principal component analysis (PCA), and so on.
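As a sketch of unsupervised learning, the short K-means routine below groups unlabeled 1-D points into two clusters. The data points and starting centres are made up for illustration.

```python
# Minimal sketch of unsupervised learning: K-means on 1-D points.
# No labels are given; the algorithm discovers the two groups on its own.

def kmeans_1d(points, centers, iters=10):
    for _ in range(iters):
        # Assignment step: attach each point to its nearest centre.
        clusters = {c: [] for c in centers}
        for p in points:
            nearest = min(centers, key=lambda c: abs(c - p))
            clusters[nearest].append(p)
        # Update step: move each centre to the mean of its cluster.
        centers = [sum(pts) / len(pts) if pts else c
                   for c, pts in clusters.items()]
    return sorted(centers)

data = [1.0, 1.2, 0.8, 8.0, 8.3, 7.7]   # two obvious groups
print([round(c, 3) for c in kmeans_1d(data, centers=[0.0, 5.0])])  # -> [1.0, 8.0]
```

The discovered centres summarize the hidden structure of the data without any labels being supplied.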
Semi-supervised Learning
If some learning samples are labeled but others are not, then it is semi-supervised learning. It
makes use of a large amount of unlabeled data together with a small amount of labeled data for
training. Semi-supervised learning is applied in cases where it is expensive to acquire a fully labeled dataset
while more practical to label a small subset. For example, it often requires skilled experts to label certain
remote sensing images, and lots of field experiments to locate oil at a particular location, while acquiring
unlabeled data is relatively easy.
Reinforcement Learning
Here learning data gives feedback so that the system adjusts to dynamic conditions in order to achieve a
certain objective. The system evaluates its performance based on the feedback responses and reacts
accordingly. The best-known instances include self-driving cars and the Go-playing program AlphaGo.
Purpose of Machine Learning
Machine learning can be seen as a branch of AI or Artificial Intelligence, since the ability to
change experience into expertise or to detect patterns in complex data is a mark of human or
animal intelligence.
As a field of science, machine learning shares common concepts with other disciplines such as
statistics, information theory, game theory, and optimization.
As a subfield of information technology, its objective is to program machines so that they will
learn.
However, it should be noted that the purpose of machine learning is not building an automated
duplication of intelligent behavior, but using the power of computers to complement and
supplement human intelligence. For example, machine learning programs can scan and process
huge databases, detecting patterns that are beyond the scope of human perception.
Rote learning - Rote learning refers to the act of learning things by repetition. The
idea behind rote learning is that the more one repeats something, the better one will be able to recall
it. Anytime you have crammed or 'mugged up' for an exam, or learnt something by heart, you have
used rote learning.
Rote learning methods have been used in primary and secondary education for over a century
and still are one of the main ways in which learning takes place in the Indian education system.
However, these traditional systems are now seeing the rise of new-age curriculum standards such
as meaningful learning, active learning, and associative learning.
The Q-learning update rule is:
Q(st, at) ← Q(st, at) + αt(st, at) [rt + γ max_a Q(st+1, a) − Q(st, at)]
● where rt is the real reward at time t, αt(s, a) are the learning rates such that 0 ≤ αt(s, a) ≤ 1, and γ is
the discount factor such that 0 ≤ γ < 1.
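The update rule above can be sketched directly in Python. The states, actions, reward, and learning-rate values below are hypothetical, chosen only to show one application of the formula.

```python
# Sketch of the Q-learning update rule, with made-up states and actions.

def q_update(Q, s, a, r, s_next, actions, alpha=0.5, gamma=0.9):
    # Q(s,a) <- Q(s,a) + alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))
    best_next = max(Q[(s_next, a2)] for a2 in actions)
    Q[(s, a)] += alpha * (r + gamma * best_next - Q[(s, a)])

actions = ["left", "right"]
Q = {(s, a): 0.0 for s in [0, 1] for a in actions}

# One hypothetical transition: in state 0, action "right" earned reward 1.0
# and moved the agent to state 1.
q_update(Q, 0, "right", 1.0, 1, actions)
print(Q[(0, "right")])   # -> 0.5
```

Starting from an all-zero table, the update moves Q(0, "right") halfway toward the observed reward, exactly as alpha = 0.5 prescribes.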
Inductive Learning Algorithm (ILA) is an iterative and inductive machine learning algorithm that is used
for generating a set of classification rules, which produces rules of the form “IF-THEN”, for a set of
examples, producing rules at each iteration and appending to the set of rules.
There are basically two methods for knowledge extraction: firstly from domain experts, and secondly with
machine learning. For a very large amount of data, domain experts are not very useful or reliable, so
we move towards the machine learning approach for this work. One machine learning method is
to replicate the expert's logic in the form of algorithms, but this work is very tedious, time-consuming, and
expensive. So we move towards inductive algorithms, which generate the strategy for performing a
task without needing separate instruction at each step.
Why should you use Inductive Learning?
The ILA is a newer algorithm that was needed even when other inductive learning algorithms like ID3 and AQ
were available.
● The need was due to the pitfalls present in the previous algorithms; one of
the major pitfalls was the lack of generalization of rules.
● ID3 and AQ used the decision tree production method, which was too specific: the
resulting trees were difficult to analyze and very slow to apply for basic short
classification problems.
● The decision-tree-based algorithms were unable to work on a new problem if some
attributes were missing.
● The ILA produces a general set of rules instead of decision
trees, which overcomes the above problems.
1. List the examples in the form of a table ‘T’ where each row corresponds to an example
and each column contains an attribute value.
2. Create a set of m training examples, each example composed of k attributes and a
class attribute with n possible decisions.
3. Create a rule set, R, having the initial value false.
4. Initially, all rows in the table are unmarked.
● Step 1: divide the table ‘T’ containing m examples into n sub-tables (t1, t2,…..tn). One
table for each possible value of the class attribute. (repeat steps 2-8 for each sub-table)
● Step 2: Initialize the attribute combination count ‘ j ‘ = 1.
● Step 3: For the sub-table on which work is going on, divide the attribute list into
distinct combinations, each combination with ‘j ‘ distinct attributes.
● Step 4: For each combination of attributes, count the number of occurrences of
attribute values that appear under the same combination of attributes in unmarked rows
of the sub-table under consideration, and that at the same time do not appear under the
same combination of attributes in other sub-tables. Call the first combination with the
maximum number of occurrences the max-combination 'MAX'.
● Step 5: If ‘MAX’ == null, increase ‘ j ‘ by 1 and go to Step 3.
● Step 6: Mark all rows of the sub-table where working, in which the values of ‘MAX’
appear, as classified.
● Step 7: Add a rule (IF attribute = “XYZ” –> THEN decision is YES/ NO) to R whose
left-hand side will have attribute names of the ‘MAX’ with their values separated by AND,
and its right-hand side contains the decision attribute value associated with the sub-table.
● Step 8: If all rows are marked as classified, then move on to process another sub-table and
go to Step 2. Else, go to Step 4. If no sub-tables are available, exit with the set of rules
obtained till then.
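The steps above can be sketched in Python. This is a simplified reading of the procedure (it takes the best value combination per attribute combination rather than a single global 'MAX' per pass), and the three-row dataset is invented for illustration; it is not the seven-example table discussed next.

```python
# Compact sketch of ILA: generate IF-THEN rules from labeled examples.
from itertools import combinations

def ila(rows, attrs, target):
    # Step 1: split the table into one sub-table per decision value.
    subtables = {}
    for row in rows:
        subtables.setdefault(row[target], []).append(dict(row))
    rules = []
    for decision, sub in subtables.items():
        others = [r for d, s in subtables.items() if d != decision for r in s]
        unmarked = list(sub)
        j = 1                                   # Step 2: combination size
        while unmarked and j <= len(attrs):
            progressed = False
            for combo in combinations(attrs, j):        # Step 3
                # Step 4: value combinations unique to this sub-table.
                counts = {}
                for r in unmarked:
                    key = tuple(r[a] for a in combo)
                    if not any(tuple(o[a] for a in combo) == key for o in others):
                        counts[key] = counts.get(key, 0) + 1
                if not counts:
                    continue
                max_key = max(counts, key=counts.get)   # 'MAX'
                # Steps 6-7: mark matching rows, emit an IF-THEN rule.
                unmarked = [r for r in unmarked
                            if tuple(r[a] for a in combo) != max_key]
                rules.append((dict(zip(combo, max_key)), decision))
                progressed = True
            if not progressed:
                j += 1                          # Step 5: widen the combination
    return rules

rows = [
    {"weather": "warm", "location": "Shimla", "decision": "yes"},
    {"weather": "warm", "location": "Goa", "decision": "yes"},
    {"weather": "windy", "location": "Mumbai", "decision": "no"},
]
for conds, decision in ila(rows, ["weather", "location"], "decision"):
    print("IF", conds, "THEN", decision)
```

On this toy table the sketch emits two rules: warm weather implies yes, and windy weather implies no, since each value appears only in its own sub-table.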
As an example showing the use of ILA, suppose an example set having the attributes place type, weather,
and location, a decision attribute, and seven examples. Our task is to generate a set of rules determining
under what conditions each decision is made.
[Example table: seven examples with columns example no., place type, weather, location, and decision, divided into Subset 1 (decision = yes) and Subset 2 (decision = no).]
● At iteration 1, the weather column is selected for rows 3 and 4, and rows 3 and 4 are marked.
The rule added to R is: IF the weather is warm THEN the decision is yes.
● At iteration 2, the place type column is selected for row 1, and row 1 is marked. The rule added
to R is: IF the place type is hilly THEN the decision is yes.
● At iteration 3, the location column is selected for row 2, and row 2 is marked. The rule added to
R is: IF the location is Shimla THEN the decision is yes.
● At iteration 4, the location column is selected for rows 5 and 6, and rows 5 and 6 are marked. The rule
added to R is: IF the location is Mumbai THEN the decision is no.
● At iteration 5, the place type and weather columns are selected for row 7, and row 7 is marked. The
rule added to R is: IF the place type is beach AND the weather is windy THEN the decision
is no.
Finally, we get the following rule set:
● Rule 1: IF the weather is warm THEN the decision is yes.
● Rule 2: IF the place type is hilly THEN the decision is yes.
● Rule 3: IF the location is Shimla THEN the decision is yes.
● Rule 4: IF the location is Mumbai THEN the decision is no.
● Rule 5: IF the place type is beach AND the weather is windy THEN the decision is no.
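The final rule set can be written directly as an ordered chain of IF-THEN checks; a minimal sketch, returning None when no rule fires:

```python
# The five derived rules, applied in order.

def decide(place_type, weather, location):
    if weather == "warm":                              # Rule 1
        return "yes"
    if place_type == "hilly":                          # Rule 2
        return "yes"
    if location == "Shimla":                           # Rule 3
        return "yes"
    if location == "Mumbai":                           # Rule 4
        return "no"
    if place_type == "beach" and weather == "windy":   # Rule 5
        return "no"
    return None

print(decide("beach", "windy", "Goa"))   # -> no   (Rule 5)
print(decide("hilly", "cold", "Kullu"))  # -> yes  (Rule 2)
```

Because the left-hand sides are disjoint on the training examples, the order of the rules does not change the decisions for those examples.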
3. Goal: Bucket
B is a bucket if B is liftable, stable and open-vessel.
4. Description of Concept: These are expressed in purely structural forms like Deep, Flat,
rounded etc.
Given a training example and a functional description, we want to build a general structural
description of a bucket. In practice, there are two reasons why explanation-based
learning is important.
Dendrites from Biological Neural Network represent inputs in Artificial Neural Networks, cell
nucleus represents Nodes, synapse represents Weights, and Axon represents Output.
Relationship between the biological neural network and the artificial neural network:
Dendrites → Inputs
Cell nucleus → Nodes
Synapse → Weights
Axon → Output
An Artificial Neural Network is a system in the field of Artificial Intelligence that attempts to mimic
the network of neurons that makes up the human brain, so that computers can
understand things and make decisions in a human-like manner. The artificial neural network is
designed by programming computers to behave simply like interconnected brain cells.
There are on the order of 100 billion neurons in the human brain. Each neuron has
somewhere in the range of 1,000 to 100,000 connection points. In the human brain, data is stored in a
distributed manner, and we can extract more than one piece of this data in parallel when necessary
from our memory. We can say that the human brain is made up of incredibly amazing
parallel processors.
We can understand the artificial neural network with an example. Consider a digital logic
gate that takes an input and gives an output, such as an "OR" gate, which takes two inputs. If
one or both of the inputs are "On," then we get "On" as output. If both the inputs are "Off," then we
get "Off" as output. Here the output depends upon the input. Our brain does not perform the same
task: the relationship between outputs and inputs keeps changing, because the neurons in our brain
are "learning."
The architecture of an artificial neural network:
To understand the architecture of an artificial neural network, we have to
understand what a neural network consists of. A neural network consists of
a large number of artificial neurons, termed units, arranged in a sequence of layers.
Let us look at the various types of layers available in an artificial neural network.
Artificial Neural Network primarily consists of three layers:
Input Layer:
As the name suggests, it accepts inputs in several different formats provided by the programmer.
Hidden Layer:
The hidden layer sits between the input and output layers. It performs all the calculations to
find hidden features and patterns.
Output Layer:
The input goes through a series of transformations using the hidden layer, which finally results in
output that is conveyed using this layer.
The artificial neural network takes the inputs, computes the weighted sum of the inputs, and
includes a bias. This computation is represented in the form of a transfer function.
The weighted total is then passed as an input to an activation function to produce the output.
Activation functions decide whether a node should fire or not. Only those nodes that fire make it
to the output layer. There are distinctive activation functions available that can be applied
depending on the sort of task we are performing.
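A single neuron's computation (weighted sum of inputs plus a bias, passed through an activation function) can be sketched as follows. The step activation and the particular weights, chosen here so the neuron reproduces the earlier "OR"-gate behavior, are illustrative choices.

```python
# One artificial neuron: weighted sum + bias, then a step activation.

def step(x):
    # Fire (1) when the weighted total reaches the threshold, else 0.
    return 1 if x >= 0 else 0

def neuron(inputs, weights, bias):
    weighted_sum = sum(i * w for i, w in zip(inputs, weights)) + bias
    return step(weighted_sum)

# With these weights and bias the neuron behaves like an OR gate.
print(neuron([1.0, 0.0], weights=[0.6, 0.6], bias=-0.5))  # -> 1
print(neuron([0.0, 0.0], weights=[0.6, 0.6], bias=-0.5))  # -> 0
```

Swapping the step function for a sigmoid or ReLU changes how the node fires without changing the weighted-sum structure.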
Advantages of Artificial Neural Network (ANN)
Parallel processing capability:
Artificial neural networks can perform more than one task simultaneously.
Storing data on the entire network:
Unlike traditional programming, where data is stored in a database, in an ANN information is
stored across the whole network. The disappearance of a couple of pieces of data in one place
doesn't prevent the network from working.
Capability to work with incomplete knowledge:
After training, an ANN may produce output even with inadequate input data. The loss of
performance here depends on the significance of the missing data.
Having a memory distribution:
For an ANN to be able to adapt, it is important to choose the examples and to teach the
network according to the desired output by demonstrating these examples to it. The
success of the network is directly proportional to the chosen instances; if the event cannot
be shown to the network in all its aspects, the network can produce false output.
Having fault tolerance:
Corruption of one or more cells of an ANN does not prevent it from generating output, and this
feature makes the network fault-tolerant.
Disadvantages of Artificial Neural Network:
Assurance of proper network structure:
There is no particular guideline for determining the structure of artificial neural networks. The
appropriate network structure is accomplished through experience, trial, and error.
Unrecognized behavior of the network:
It is the most significant issue of ANN. When ANN produces a testing solution, it does not
provide insight concerning why and how. It decreases trust in the network.
Hardware dependence:
Artificial neural networks need processors with parallel processing power, as per their structure.
Therefore, they are dependent on suitable hardware.
Difficulty of showing the issue to the network:
ANNs can only work with numerical data, so problems must be converted into numerical values
before being introduced to the ANN. The presentation mechanism chosen here directly impacts
the performance of the network and relies on the user's abilities.
The duration of the network is unknown:
Training is stopped when the error is reduced to a specific value, and this value does not
guarantee optimal results.
Artificial neural networks, which stepped into the world in the mid-20th century, are
developing exponentially. Here we have reviewed the pros of artificial neural
networks and the issues encountered in the course of their use. It should not be overlooked
that the cons of ANNs, a flourishing branch of science, are being eliminated
one by one, while their pros are increasing day by day. This means that artificial neural networks
will progressively become an irreplaceable part of our lives.
Afterward, each of the inputs is multiplied by its corresponding weight (these weights are the
details utilized by the artificial neural network to solve a specific problem). In general terms,
these weights represent the strength of the interconnections between neurons inside the
artificial neural network. All the weighted inputs are summed inside the computing unit.
If the weighted sum is zero, a bias is added to make the output non-zero, or to otherwise
scale up the system's response. The bias behaves like an extra input with a fixed value of 1 and
its own weight. Here the total of weighted inputs can be in the range of 0 to positive infinity. To
keep the response within the limits of the desired value, a certain maximum value is benchmarked,
and the total of weighted inputs is passed through the activation function.
The activation function refers to the set of transfer functions used to achieve the desired output.
There is a different kind of the activation function, but primarily either linear or non-linear sets
of functions.
In a feedforward neural network, signals travel only one way, from input to output. There
is no feedback or loops: the output of any layer does not affect that same layer in such networks.
Feedforward neural networks are straightforward networks that associate inputs with outputs.
They have fixed inputs and outputs. They are mostly used in pattern generation, pattern
recognition, and classification.
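A minimal sketch of a feedforward pass, where each layer's output feeds only the next layer, never back. The layer sizes and weights below are arbitrary made-up numbers.

```python
# Feedforward pass: input -> hidden layer -> output layer, one direction only.

def layer(inputs, weights, biases):
    # One dense layer: weighted sum + bias per neuron, with ReLU activation.
    return [max(0.0, sum(i * w for i, w in zip(inputs, ws)) + b)
            for ws, b in zip(weights, biases)]

def feedforward(x):
    hidden = layer(x, weights=[[0.5, -0.2], [0.3, 0.8]], biases=[0.1, 0.0])
    output = layer(hidden, weights=[[1.0, -1.0]], biases=[0.0])
    return output

print(feedforward([1.0, 2.0]))
```

Note that `feedforward` never revisits a layer: there is no state and no loop, which is exactly what distinguishes it from the feedback networks described next.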
Feedback Neural Network
Signals can travel in both directions in feedback neural networks. Feedback neural
networks are very powerful and can get very complicated. They are dynamic:
the 'state' of such a network keeps changing until it reaches an equilibrium
point, where it remains until the input changes and a new equilibrium
needs to be found. Feedback neural network architecture is also referred to as interactive
or recurrent, although the latter term is often used to denote feedback connections in
single-layer organisations. Feedback loops are allowed in such networks. They are used
in content-addressable memories.
1. Initialization
The process of a genetic algorithm starts by generating a set of individuals, called the
population. Each individual is a candidate solution for the given problem. An individual
is characterized by a set of parameters called genes; genes are joined into a string to form a
chromosome, which encodes the solution to the problem. One of the most popular techniques
for initialization is the use of random binary strings.
2. Fitness Assignment
The fitness function is used to determine how fit an individual is, that is, its ability to
compete with other individuals. In every iteration, individuals are evaluated with
the fitness function, which gives each individual a fitness score. This
score determines the probability of being selected for reproduction: the higher the fitness
score, the better the chances of getting selected for reproduction.
3. Selection
The selection phase involves selecting individuals for the reproduction of offspring. The
selected individuals are arranged in pairs, and these
individuals transfer their genes to the next generation.
There are three types of Selection methods available, which are:
● Roulette wheel selection
● Tournament selection
● Rank-based selection
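Roulette-wheel selection, the first method listed, can be sketched as follows: each individual's chance of being picked is proportional to its fitness score. The population and scores below are made up.

```python
# Roulette-wheel selection: pick proportionally to fitness.
import random

def roulette_select(population, fitnesses):
    total = sum(fitnesses)
    pick = random.uniform(0, total)       # spin the wheel
    running = 0.0
    for individual, fit in zip(population, fitnesses):
        running += fit                    # each slice is as wide as its fitness
        if running >= pick:
            return individual
    return population[-1]                 # guard against float rounding

random.seed(0)                            # for a reproducible demonstration
pop = ["A", "B", "C"]
fit = [1.0, 1.0, 8.0]                     # "C" holds 80% of the wheel
picks = [roulette_select(pop, fit) for _ in range(1000)]
print(picks.count("C") / 1000)            # roughly 0.8
```

Tournament and rank-based selection differ only in how this picking step is performed; the idea of biasing reproduction toward fitter individuals is the same.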
4. Reproduction
After the selection process, the creation of a child occurs in the reproduction step. In this step,
the genetic algorithm uses two variation operators that are applied to the parent population. The
two operators involved in the reproduction phase are given below:
● Crossover: Crossover plays the most significant role in the reproduction phase of the
genetic algorithm. In this process, a crossover point is selected at random within the
genes. The crossover operator then swaps the genetic information of two parents from the current
generation to produce a new individual representing the offspring: the genes of the
parents are exchanged among themselves up to the crossover point.
The newly generated offspring are added to the population. This process is also called
recombination. Types of crossover styles available:
o One-point crossover
o Two-point crossover
o Uniform crossover
● Mutation
The mutation operator inserts random genes into the offspring (new child) to maintain
diversity in the population. This can be done by flipping some bits in the
chromosome.
Mutation helps in solving the issue of premature convergence and enhances
diversification.
Types of mutation styles available:
o Flip-bit mutation
o Gaussian mutation
o Exchange/swap mutation
5. Termination
After the reproduction phase, a stopping criterion is applied to decide termination. The
algorithm terminates once a solution with the threshold fitness is reached, and it returns
the best individual in the population as the final solution.
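The five phases can be combined into an end-to-end sketch on a toy problem: evolving a 10-bit string toward all ones, with fitness equal to the number of 1 bits. The population size, generation count, tournament size, and mutation rate are arbitrary illustrative choices, and a fixed generation budget stands in for the fitness-threshold stopping criterion.

```python
# Toy genetic algorithm: maximize the number of 1 bits in a 10-bit string.
import random

random.seed(1)                          # reproducible run
BITS, POP, GENS = 10, 20, 40

def fitness(ind):                       # 2. fitness assignment
    return sum(ind)

def select(pop):                        # 3. tournament selection (size 3)
    return max(random.sample(pop, 3), key=fitness)

def crossover(p1, p2):                  # 4a. one-point crossover
    point = random.randrange(1, BITS)
    return p1[:point] + p2[point:]

def mutate(ind, rate=0.05):             # 4b. flip-bit mutation
    return [1 - b if random.random() < rate else b for b in ind]

# 1. initialization: a population of random binary strings.
pop = [[random.randint(0, 1) for _ in range(BITS)] for _ in range(POP)]

for _ in range(GENS):                   # 5. terminate after a fixed budget
    pop = [mutate(crossover(select(pop), select(pop))) for _ in range(POP)]

best = max(pop, key=fitness)
print(fitness(best))
```

Replacing `fitness` with any other scoring function (and the bit strings with another gene encoding) turns this same loop into an optimizer for a different problem.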
General Workflow of a Simple Genetic Algorithm
Advantages of Genetic Algorithm
● Genetic algorithms have strong parallel capabilities.
● It helps in optimizing various problems such as discrete functions, multi-objective
problems, and continuous functions.
● It provides a solution for a problem that improves over time.
● A genetic algorithm does not need derivative information.