
Industrial Instrumentation and Control
INSTR F343
BITS Pilani

Programmable logic controllers (PLC): Motivation

• Conveyor belt operation
• Automatic drill control

Note to students

These slides should only be considered as supporting material. To have a thorough understanding of the course, these must be accompanied by textbooks, reference materials and lecture notes.

Programmable logic controllers (PLC)

Basic architecture of PLC

According to NEMA (National Electrical Manufacturers Association), a PLC can be defined as:

"A digital electronic device that uses a programmable memory to store instructions and to implement functions such as logic, sequencing, timing, counting and arithmetic in order to control machines and processes."

Input & output devices

• Input devices: proximity switches, pushbuttons, TTL signals, encoders, ADCs, thermocouples, load cells, RTDs, etc.
• Output devices: motor starters, indicators, alarms, relays, variable speed drives, etc.

Input/output unit

• The input/output unit provides the interface between the system and the outside world, allowing connections to be made through input/output channels to input devices such as sensors and output devices such as motors and solenoids.
• Every input/output point has a unique address which can be used by the CPU.
• Each I/O device has to be identified with a unique address for the exchange of data. Different manufacturers adopt different schemes to identify I/O devices. For example, in the address X1 X2 X3 X4 X5: X1 is input (1) or output (0); X2 is the I/O rack number; X3 is the module slot number; X4X5 is the terminal number.

PLC operation

• The operation is not simultaneous for the entire ladder diagram.
• The process of reading the inputs, executing the control application program, and updating the outputs is known as the SCAN.
• The scan cycle repeats continuously (figure): read inputs → execute program → update outputs → diagnostics & communication.
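As an illustration of the scan cycle, here is a minimal MATLAB sketch; the two input images and the single AND rung are assumed purely for illustration (a real PLC runs this loop in firmware):

% Minimal simulation of a PLC scan cycle.
inputs = [true false; true true];      % two scans' worth of input images (assumed)
for scan = 1:size(inputs, 1)
    image = inputs(scan, :);           % 1) read inputs into the input image
    Y430  = image(1) && image(2);      % 2) execute the program (one AND rung)
    fprintf('Scan %d: Y430 = %d\n', scan, Y430);   % 3) update outputs
    % 4) diagnostics & communication would run here
end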

Example

A complex manufacturing program results in a 30 ms PLC scan time. The PLC must detect 2 cm individual objects moving on a high-speed conveyor belt. If an object breaks a light beam, providing a high input to the PLC, calculate the highest speed of the conveyor at which an object is still sure to be detected.

The beam is blocked for a time

T = 2/S

where S is the conveyor speed in cm/s. To guarantee detection, the input must stay high for at least one scan, so with T = 1 scan time = 30 ms,

S = 2/0.03 = 66.7 cm/s.

PLC Programming

• The use of a high-level language to write programs for a microprocessor requires some skill in programming, and PLCs are intended to be used by users without any great knowledge of programming.
• As a consequence, ladder programming was developed. This is a means of writing programs which can then be converted into machine code by software for use by the PLC microprocessor.
• An international standard has been adopted for ladder programming and indeed all the methods used for programming PLCs. The standard, published in 1993, is IEC 1131-3 (International Electrotechnical Commission).
• The IEC 1131-3 programming languages are ladder diagrams (LAD), instruction list (IL), sequential function charts (SFC), structured text (ST), and function block diagrams (FBD).

Contd…

(Figures: scanning a ladder diagram; use of a relay and switch to start the motor.)

Examples

LATCH (figure): a latching rung with contacts A and B.

EX-OR gate (figure): a rung with inputs X400 and X401 driving output Y430.
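The EX-OR rung is conventionally built from two parallel branches of NO/NC contacts. A minimal MATLAB truth-table check of that contact arrangement (illustrative only, not vendor code):

% XOR rung: (X400 AND NOT X401) OR (NOT X400 AND X401)
for X400 = [false true]
    for X401 = [false true]
        Y430 = (X400 && ~X401) || (~X400 && X401);
        fprintf('X400=%d X401=%d -> Y430=%d\n', X400, X401, Y430);
    end
end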

Example

Develop the physical ladder diagram AND a programmable ladder diagram for a motor with the following: NO start button, NC stop button, thermal overload limit switch that opens on high temperature, green light when running, red light for thermal overload.

Timers and Counters

• PLC timers and counters are internal instructions that provide the same functions as hardware timers and counters.
• They activate or deactivate a device after a time interval has expired or a count has reached a preset value.
• Timer and counter instructions are generally considered internal outputs.
• These are generally represented by blocks in the ladder diagrams.

Timers

The commonly used timer types are:

• TP: generate pulse
• TON: ON-delay timer (most common)
• TOF: OFF-delay timer
• RTO: retentive timer ON

A timer uses a preset count and a time base to implement its function:

Preset time = Count × Time base

Timers: TP (Generate pulse)

• The instruction Generate pulse sets output Q for the duration PT.
• The instruction is started when the result of logic operation (RLO) at input IN changes from 0 to 1 (positive signal edge).
• The programmed time PT begins when the instruction starts.
• Output Q is set for the duration PT, regardless of the subsequent course of the input signal.
• Even if a new positive signal edge is detected, the signal state at output Q is not affected as long as the time PT is running.
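A minimal MATLAB sketch of the TP behaviour described above, sampling IN once per scan (the scan time, PT and the input sequence are assumed values):

% Simulate a TP (generate pulse) timer: Q is set for PT seconds after a
% rising edge on IN; further edges are ignored while the pulse is running.
scan_dt = 0.01;                        % assumed 10 ms scan time
PT      = 0.05;                        % assumed 50 ms pulse duration
IN      = [0 1 1 0 1 0 0 0 0 0 0 0];   % input samples, one per scan (assumed)
prevIN  = 0; remaining = 0;
for k = 1:numel(IN)
    if IN(k) && ~prevIN && remaining <= 0
        remaining = PT;                % a rising edge starts the pulse
    end
    Q = remaining > 0;                 % Q stays set while PT is running
    remaining = remaining - scan_dt;
    prevIN = IN(k);
    fprintf('scan %2d: IN=%d Q=%d\n', k, IN(k), Q);
end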

TON: ON-Delay Timer (figure)

• The output Q is set after the input IN has remained 1 for the programmed time PT, and is reset when IN returns to 0.

TOF: OFF-Delay Timer (figure)

• The output Q is set as soon as IN becomes 1, and remains set for the time PT after IN returns to 0.

RTO or TONR: Retentive Timer ON (figure)

• The retentive timer accumulates elapsed time across interruptions of the input, and must be cleared by an explicit reset instruction.

Counters

• Counter up (CTU)
• Counter down (CTD)
• Counter up/down (CTUD)

CTU counter (figure)

CTD counter (figure)
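A minimal MATLAB sketch of the CTU behaviour (the preset value PV and the count-input sequence are assumed): the count CV increments on each rising edge of CU, and output Q is set once CV reaches PV.

% Simulate a CTU (count-up) counter with preset PV.
PV = 3;                                % assumed preset value
CU = [0 1 0 1 1 0 1 0];                % count input, one sample per scan (assumed)
prevCU = 0; CV = 0;
for k = 1:numel(CU)
    if CU(k) && ~prevCU
        CV = CV + 1;                   % increment on rising edges only
    end
    prevCU = CU(k);
    Q = CV >= PV;                      % output set once the preset is reached
    fprintf('scan %d: CU=%d CV=%d Q=%d\n', k, CU(k), CV, Q);
end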

CTUD (up-down counter) (figure)

BITS Pilani

Thank You


Industrial Instrumentation and Control
INSTR F343
BITS Pilani

Fuzzy systems

• "Fuzzy" is usually described as: ill-defined, indefinite, indistinct, murky, obscure, unclear, vague, and so on…
• Fuzzy systems are capable of dealing with very complex problems, problems that would be impossible to model mathematically, such as deciding when, where, and how much money to invest.
• Take another instance: a young boy can easily balance an upside-down cricket bat in the palm of his hand for any desired length of time.
• In engineering parlance, this is known as the two-dimensional inverted pendulum problem.
• "Expert knowledge"…

Fuzzy Sets

• A fuzzy set is a collection of real numbers having partial membership in the set.
• This is in contrast to a conventional or crisp set.
• For example, consider the set of "all heights of people 6 ft tall or taller" (a crisp set).
• Take another example: consider the set of "heights of tall people" (a fuzzy set).
Fuzzy Sets…

• More formally, consider a variable with universe of discourse X ⊆ ℜ, and let x be a real number (i.e., x ∈ X).
• Let M denote a fuzzy set defined on X. A membership function μM(x) associated with M is a function that maps X into [0, 1].
• We say that the fuzzy set M is characterized by μM. Then the fuzzy set M is defined as

M = {(x, μM(x)) : x ∈ X}

Fuzzy Sets: Example… (figure)

Fuzzy Sets: Example…

• Consider a fuzzy set "WARM" for the linguistic variable "OUTDOOR TEMPERATURE" (figures: a triangular membership function and a Gaussian membership function).
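A minimal MATLAB sketch evaluating a triangular and a Gaussian membership function for WARM (the parameter values are assumed, since the original figure is not reproduced):

% Triangular and Gaussian membership functions for WARM
% (assumed: triangle 15-25-35 degC; Gaussian centre 25 degC, sigma 5).
T = 0:0.5:40;                                      % outdoor temperature, degC
mu_tri   = max(min((T - 15)/10, (35 - T)/10), 0);  % triangular MF
mu_gauss = exp(-(T - 25).^2 / (2*5^2));            % Gaussian MF
plot(T, mu_tri, T, mu_gauss);
legend('triangular','Gaussian'); xlabel('T (degC)'); ylabel('\mu_{WARM}(T)');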

Fuzzy Sets: Example… (figure)

Examples…

Fuzzy complement:

μ_M'(x) = 1 − μ_M(x), where M' denotes the complement of M.

Examples…

Fuzzy subset (figure)

Fuzzy intersection:

μ_A∩B(x) = min{μ_A(x), μ_B(x)} : x ∈ X
OR
μ_A∩B(x) = {μ_A(x)·μ_B(x) : x ∈ X}

Examples…

Fuzzy union:

μ_A∪B(x) = max{μ_A(x), μ_B(x)} : x ∈ X
OR
μ_A∪B(x) = {μ_A(x) + μ_B(x) − μ_A(x)·μ_B(x) : x ∈ X}
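A minimal MATLAB demonstration of these operations on two fuzzy sets sampled over the same universe (the membership values are assumed):

% Complement, intersection (min / product) and union (max / algebraic sum).
muA = [0.0 0.3 0.7 1.0 0.6];           % assumed sampled membership values
muB = [0.2 0.5 0.5 0.4 0.9];
comp  = 1 - muA;                       % complement of A
i_min = min(muA, muB);                 % intersection, min T-norm
i_prd = muA .* muB;                    % intersection, product T-norm
u_max = max(muA, muB);                 % union, max S-norm
u_sum = muA + muB - muA .* muB;        % union, algebraic sum
disp([comp; i_min; i_prd; u_max; u_sum]);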

Examples…

Fuzzy Cartesian product (figure)

Fuzzy System

• Mamdani fuzzy system
• Takagi-Sugeno (TS) fuzzy system

Wind Chill example: Rule Base from Expert Knowledge (figure: rule table)

Center of Gravity (COG) defuzzification: the crisp output is the center of area of the aggregated output membership function,

y* = ∫ y·μ(y) dy / ∫ μ(y) dy

Let's calculate the wind chill using the minimum T-norm and COG defuzzification.

Say Temperature (T) = 7 °C and Wind Speed (S) = 22 knots.

Defuzzification example: Wind chill calculation… (figure)

Wind chill calculation example (defuzzification using COG)…

Since Temperature (T) = 7 °C:

μ_T1(7) = 0.3, μ_T2(7) = 0.7, μ_T3(7) = 0, μ_T4(7) = 0

The two active temperature membership functions are:

μ_T1(T) = 1,             T < 0
        = −T/10 + 1,     0 ≤ T < 10
        = 0,             T ≥ 10

μ_T2(T) = 0,             T < 0
        = T/10,          0 ≤ T < 10
        = −(T − 20)/10,  10 ≤ T < 20
        = 0,             T ≥ 20

Wind chill calculation example (defuzzification using COG)…

Since Wind Speed (S) = 22 knots:

μ_S1(22) = 0, μ_S2(22) = 0.44, μ_S3(22) = 0.76

The two active wind-speed membership functions are:

μ_S2(S) = 0,               S < 2.5
        = S/12.5 − 0.2,    2.5 ≤ S < 15
        = −S/12.5 + 2.2,   15 ≤ S < 27.5
        = 0,               S ≥ 27.5

μ_S3(S) = 0,               S < 12.5
        = S/12.5 − 1,      12.5 ≤ S < 25
        = 1,               S ≥ 25

Defuzzification: COG method

Let's find the implied output membership functions (figure: implied fuzzy sets).
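A quick MATLAB check of the membership grades above, using the reconstructed piecewise functions:

% Evaluate the temperature and wind-speed membership functions at the
% operating point T = 7 degC, S = 22 knots.
T = 7; S = 22;
mu_T1 = max(min(1, -T/10 + 1), 0);                  % -> 0.30
mu_T2 = max(min(T/10, -(T - 20)/10), 0);            % -> 0.70
mu_S2 = max(min(S/12.5 - 0.2, -S/12.5 + 2.2), 0);   % -> 0.44
mu_S3 = min(max(S/12.5 - 1, 0), 1);                 % -> 0.76
fprintf('mu_T1=%.2f mu_T2=%.2f mu_S2=%.2f mu_S3=%.2f\n', mu_T1, mu_T2, mu_S2, mu_S3);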

Defuzzification: Finding rule strength (Min T-norm)…

The rule strengths follow from the antecedent membership grades:

μ_T1(7) = 0.3, μ_T2(7) = 0.7, μ_T3(7) = 0, μ_T4(7) = 0
μ_S1(22) = 0, μ_S2(22) = 0.44, μ_S3(22) = 0.76

Center of Gravity (COG) (figure)
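A minimal MATLAB sketch of the remaining steps. The rule table and the output (wind-chill) membership functions are not reproduced in these notes, so the consequent centres below are assumed placeholders; only the mechanics (min T-norm rule strengths, clipping, max aggregation, discretized COG) follow the slides:

% Rule strengths: min T-norm over every (temperature, speed) antecedent pair.
mu_T = [0.3 0.7 0 0];                       % grades at T = 7 degC
mu_S = [0 0.44 0.76];                       % grades at S = 22 knots
strength = min(mu_T.' * ones(1,3), ones(4,1) * mu_S);   % 4x3 rule strengths

y = linspace(-20, 20, 401);                 % assumed wind-chill universe
centers = linspace(-15, 15, 12);            % assumed consequent centres
agg = zeros(size(y));
for r = 1:numel(strength)
    cons = max(1 - abs(y - centers(r))/5, 0);  % assumed triangular consequent
    agg  = max(agg, min(cons, strength(r)));   % clip at strength, aggregate by max
end
ycog = trapz(y, y .* agg) / trapz(y, agg);  % centre of gravity
fprintf('Defuzzified output = %.2f\n', ycog);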

Center of Gravity (COG) (figure)

Defuzzification: Product T-norm (COG)… (figure)

Input-output characteristic for the wind-chill fuzzy system (min T-norm and COG defuzzification) (figure)

Implied fuzzy sets: Product T-norm (COG) (figure)

Singleton membership function: Example (figure)

Defuzzification: Product T-norm (COG) (figure)

Input-output characteristic for the wind-chill fuzzy system (product T-norm and COG defuzzification) (figure)

T-S Fuzzy system

Then, the crisp output of the system will be

y = Σi μi(x)·gi(x) / Σi μi(x)

or,

y = Σi ξi(x)·gi(x)

where the fuzzy basis functions are

ξi(x) = μi(x) / Σj μj(x).

Example: T-S Fuzzy System (figure)

Finding the crisp output of a TS fuzzy system (figure)
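A minimal MATLAB sketch of a two-rule TS system (the membership functions and the consequent lines g1, g2 are assumed for illustration):

% Two-rule Takagi-Sugeno system: weighted average of linear consequents.
x  = 0.6;                        % crisp input
mu = [max(1 - x, 0), max(x, 0)]; % assumed complementary memberships
g  = [2*x + 1, -x + 4];          % assumed rule consequents g1(x), g2(x)
xi = mu / sum(mu);               % fuzzy basis functions
y  = sum(xi .* g);               % crisp output
fprintf('y = %.4f\n', y);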

Membership function variation: X1 and X2 (figure)

Crisp output generation… (figure)

Crisp output generation… (figure)

A simple Fuzzy PID structure (figure)

A fuzzy control rule base

Rule table for Δu(nT) (rows: e(nT); columns: ė(nT)):

              ė(nT) = N   ė(nT) = Z   ė(nT) = P
e(nT) = N         N           N           Z
e(nT) = Z         N           Z           P
e(nT) = P         Z           P           P

OR, with finer output sets:

              ė(nT) = N   ė(nT) = Z   ė(nT) = P
e(nT) = N        NB          NS           Z
e(nT) = Z        NS           Z          PS
e(nT) = P         Z          PS          PB

(Should the rule consequent be the increment Δu(nT) or the control u(nT)?)

BITS Pilani

Thank You
ARTIFICIAL NEURAL NETWORK (ANN): Concepts & Applications in Process Modeling & Control

WHAT IS AN ARTIFICIAL NEURAL NETWORK?

• A computer program designed to recognize patterns and learn "like" the human brain.
• Neural networks are a class of software that has the potential to simulate biological thinking and learning.

ANNs are an attempt at mimicking the working of the human mind.

NEURON

• Neuron = nerve cell
• Soma = cell body (where decisions are made)
• Dendrites carry information into the cell
• Axon carries output to other neurons

The brain is one of the most complex adaptive networks, where learning occurs by modifying the link weights.

Learning is viewed as the establishment of new connections between neurons or the modification of existing connections.

Each of the 100 billion neurons in the human brain has, on average, 7,000 connections to other neurons. On average, there are 100 trillion neural connections across the 100 billion neurons in the brain.

ANNs, like people, learn by example.

• NNs are well suited for domains of non-linearity and high complexity that are ill-defined, unknown, or just too complex for standard programming practices.
• Rather than explicit programming, learning algorithms are used.

(Figure: model of an artificial neuron. The neuron computes net = Σi wi·xi + b and output y = f(net).)

• b: bias weight, whose input is fixed at 1
• f: the activation function

Structure of an ANN is composed of:

• Architecture (topology of how the network is structured)
• Activation function (how a neuron responds)
• Learning rule (how the weights of the connections change w.r.t. input, output and error)

A neural network without an activation function would simply be a linear regression model. Activation functions give a NN:

1. The ability to learn and model images, videos, audio, speech, etc.
2. The ability to represent non-linear, complex, arbitrary functional mappings between inputs and outputs.

Linear systems are alike; nonlinear systems are nonlinear in their own way.

Activation Functions

Two types of activation functions:

1. Linear activation function
2. Non-linear activation functions

(Figure: threshold, linear, piece-wise linear and sigmoid activation functions, each plotted as output y versus net.)

Unipolar Binary Step Function / Unipolar Threshold Logic Unit (TLU)

The TLU forms the weighted sum of its n inputs,

net = Σ_{i=1}^{n} wi·xi

and applies a threshold θ to produce the output:

y = 1 if net ≥ θ
y = 0 if net < θ

With the threshold absorbed into the activation: f(net) = 1 if net ≥ 0, and 0 otherwise ("hardlim" in MATLAB).

IDENTIFY GATES: with binary inputs/outputs and a unipolar TLU activation, a SINGLE NEURON computes

net = w1·x1 + w2·x2
output = 1 if net ≥ T, output = 0 if net < T

Bipolar Binary Step Function ("hardlims" in MATLAB; the 's' stands for symmetric)

• The slope is zero everywhere except at the origin, where the function is not differentiable.
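A minimal MATLAB sketch (assumed weights and thresholds) showing how a single TLU realizes AND or OR, depending on the threshold:

% One TLU neuron: net = w1*x1 + w2*x2, output = (net >= T).
X = [0 0; 0 1; 1 0; 1 1];          % all binary input pairs
w = [1 1];                         % assumed weights
for T = [1.5 0.5]                  % T = 1.5 gives AND, T = 0.5 gives OR
    y = (X * w.' >= T).';
    fprintf('T=%.1f -> outputs: %d %d %d %d\n', T, y);
end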

Log sigmoid / sigmoid activation function

• Output varies between 0 and 1: an S-shaped curve.
• An important desirable feature of an activation function is that it should be differentiable.
• Many learning algorithms, like the BPA (Back Propagation Algorithm), compute gradients of the error (loss) with respect to the weights in order to optimize the weights.

Log sigmoid:  y = f(net) = 1 / (1 + exp(−λ·net)),  where λ is the slope parameter.

Calculate y' = f'(net) w.r.t. net in terms of y:

y' = f'(net) = λ·y·(1 − y)

A reason for the sigmoid's popularity in neural networks is that its derivative is expressed in terms of the function itself, so it is computationally easy to evaluate. (Figure: log sigmoid activation function.)

Tan sigmoid activation function

tan sigmoid:  y = f(net) = 2 / (1 + e^(−λ·net)) − 1 = (1 − e^(−λ·net)) / (1 + e^(−λ·net))

tanh:  y = f(net) = (1 − e^(−2·net)) / (1 + e^(−2·net))

• The output is zero-centered because its range is between −1 and 1.
• The tan sigmoid function is mainly used for classification between two classes.

Find f'(net) w.r.t. net in terms of y. For the tan sigmoid,

f'(net) = 0.5·λ·(1 − y²)

and for tanh (x is net, f(x) is f(net)) the derivative is (1 − y²).
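A quick MATLAB check of the two derivative identities (λ assumed 1 for the log sigmoid), comparing each closed form against a central finite difference:

% Verify y' = lambda*y*(1-y) for the log sigmoid and y' = 1 - y^2 for tanh.
lambda = 1; net = 0.7; h = 1e-6;
f  = @(n) 1 ./ (1 + exp(-lambda*n));     % log sigmoid
y  = f(net);
fd = (f(net + h) - f(net - h)) / (2*h);  % finite-difference slope
fprintf('sigmoid: analytic %.6f, numeric %.6f\n', lambda*y*(1-y), fd);
yt  = tanh(net);
fdt = (tanh(net + h) - tanh(net - h)) / (2*h);
fprintf('tanh:    analytic %.6f, numeric %.6f\n', 1 - yt^2, fdt);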

Logsigmoid and its derivative (figure); tanh and its derivative (figure)

• Sigmoids saturate and kill (vanish) gradients.
• Sigmoids have slow convergence.
• exp() is a bit computationally expensive.

Linear activation function:

y = f(net) = c·net,  y' = f'(net) = c;  if c = 1, f'(net) = 1.

Currently the most popular activation function for neural networks is the rectified linear unit (ReLU):

f(net) = max(0, net), i.e.
if net < 0, f(net) = 0, and
if net ≥ 0, f(net) = net.

UNIPOLAR HARDLIMITER

What is the output of the network? (figure exercise)

UNIPOLAR HARDLIMITER (figure)

• Perceptrons can only perform accurately with linearly separable classes (a linear hyperplane can place one class of objects on one side of the plane and the other class on the other side).
• Additional (hidden) layers of neurons give the MLP architecture, which is able to solve non-linear classification problems (figure).

CLASSIFICATION OF ANN ARCHITECTURE

On the basis of topology:

• Feed-forward ANN: feed-forward ANNs allow signals to travel one way only.
• Feedback networks: feedback networks can have signals traveling in both directions by introducing loops in the network.

Feed-forward nets

• Information flow is unidirectional: data is presented to the input layer, passed on to the hidden layer, and passed on to the output layer.
• Information processing is parallel.

FEEDBACK / RECURRENT NETWORKS (figure)

Learning methods

• Supervised learning
• Unsupervised learning
• Reinforcement learning

Supervised learning:

• In supervised learning, the network uses a set of training data.
• The training data contains examples of inputs together with the corresponding outputs.
• The network learns to infer the relationship between the two.

Unsupervised training:

• Only inputs are supplied.
• The neural network adjusts its own weights so that similar inputs cause similar outputs; it discovers patterns.
• The network identifies the patterns and differences in the inputs without any external assistance.

LEARNING ALGORITHMS IN NEURAL NETWORKS

NEURAL NETWORK LEARNING RULES

• PERCEPTRON LEARNING RULE (PLR) (hard-limiter activation function)
• DELTA LEARNING RULE (continuous activation function)
• BACK PROPAGATION ALGORITHM (BPA) (multilayer perceptron model, continuous activation function)

Perceptron learning rule (figure): wi(new) = wi(old) + η·ε·xi, where ε is the error between the desired and actual output.

EXAMPLE: AND gate realization using perceptron learning

Step activation function with threshold = 0.5; learning rate η = 0.5; initial weights w1 = w2 = 0.9.

1st sample: x1 = 0 and x2 = 0.
net = x1·w1 + x2·w2 = 0·0.9 + 0·0.9 = 0 < 0.5, so output = 0; desired output = 0, error = 0. Do not update weights.

2nd sample: x1 = 0 and x2 = 1.
net = 0·0.9 + 1·0.9 = 0.9 > 0.5, so output = 1, while the desired output is 0. Update weights:
ε = error = actual − prediction = 0 − 1 = −1
w1 = w1 + η·ε·x1 = 0.9 + 0.5·(−1)·0 = 0.9
w2 = w2 + η·ε·x2 = 0.9 + 0.5·(−1)·1 = 0.9 − 0.5 = 0.4

3rd sample: x1 = 1 and x2 = 0.
net = 1·0.9 + 0·0.4 = 0.9 > 0.5, so the activation unit returns output 1; desired output = 0. Update weights:
ε = actual − prediction = 0 − 1 = −1
w1 = w1 + η·ε·x1 = 0.9 + 0.5·(−1)·1 = 0.9 − 0.5 = 0.4
w2 = w2 + η·ε·x2 = 0.4 + 0.5·(−1)·0 = 0.4

4th sample: x1 = 1 and x2 = 1.
net = 1·0.4 + 1·0.4 = 0.8 > 0.5. The activation unit returns output 1 = desired output. Do not update weights.

EPOCH 1 over.
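A minimal MATLAB sketch of the same training loop, run epoch by epoch until the AND gate is learned (same threshold, learning rate and initial weights as above):

% Perceptron learning of the AND gate.
X = [0 0; 0 1; 1 0; 1 1]; d = [0 0 0 1];   % samples and desired outputs
w = [0.9 0.9]; eta = 0.5; theta = 0.5;
for epoch = 1:20
    nerr = 0;
    for p = 1:4
        y   = (X(p,:) * w.' >= theta);     % step activation
        err = d(p) - y;                    % error = actual - prediction
        w   = w + eta * err * X(p,:);      % perceptron learning rule
        nerr = nerr + (err ~= 0);
    end
    if nerr == 0, break, end               % stop on an error-free epoch
end
fprintf('converged in epoch %d: w = [%.2f %.2f]\n', epoch, w);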

What is a hyper-parameter?

A hyper-parameter is a parameter in machine learning whose value is initialized before the learning takes place. Hyper-parameters are like settings that we can change and alter to control the algorithm's behavior.

For a neural network, some of the hyper-parameters are:

• Number of hidden layers
• Number of hidden units/neurons in each layer
• Choice of activation function used in the hidden layers
• Learning rate (alpha)
• Momentum factor
• Number of iterations/epochs

What is a cost/loss/error function?

• Defines how costly our mistakes are.
• Calculates how poorly our model is performing by comparing what the model is predicting with the actual value it is supposed to output.
• Used to learn the parameters.
• A single value, not a vector: it rates how well the neural network did as a whole.
• If Y_pred is very far off from Y, the loss value will be very high; if both values are almost similar, the loss value will be very low.

LOSS FUNCTION

• The loss function evaluates the performance of the network by comparing the output 'y' with the corresponding target 't'.
• Selection of the loss function is problem dependent. Some popular loss functions are:
  1. Euclidean distance (y − t)², for forecasting of real values
  2. Cross-entropy, for classification problems

To reach minimum error, the weights are moved against the error gradient:

wi(new) = wi(old) − η·(∂E/∂wi)

• If the gradient is positive (+), the weight should decrease.
• If the gradient is negative (−), the weight should increase.

GENERALIZED DELTA LEARNING RULE (valid for any continuous activation function)

wi(new) = wi(old) − η·(∂E/∂wi)

• If the error gradient is negative, wi(new) > wi(old).
• If the error gradient is positive, wi(new) < wi(old).
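A minimal MATLAB illustration of this rule on a one-weight quadratic loss E(w) = (w − 2)², with an assumed step size:

% Gradient descent on E(w) = (w - 2)^2, whose gradient is 2*(w - 2).
eta = 0.3; w = -1;              % assumed learning rate and starting weight
for k = 1:20
    grad = 2 * (w - 2);         % error gradient at the current weight
    w = w - eta * grad;         % w(new) = w(old) - eta * dE/dw
end
fprintf('w converged to %.4f (the minimum is at w = 2)\n', w);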

Learning rate

• Determines how far to move in the direction of the gradient of the surface over the weight space defined by an error function.
• A small learning rate will lead to slower learning, but a large one may cause a move through weight space that "overshoots" the solution vector.
• Too small: convergence is extremely slow. Too large: the training may not converge.

Backpropagation ANNs

• Most widely used type of network
• Feedforward, supervised
• Simple, slow, prone to local-minima issues
• Error is propagated backwards
• Used for data modelling, classification, forecasting, data and image compression, and pattern recognition.

BACK-PROPAGATION LEARNING ALGORITHM

Training involves three stages:

1) Feed-forward propagation of the input training pattern
2) Calculation and back-propagation of the error
3) Adjustment of the weights

MODES OF TRAINING

1. Pattern mode
2. Batch mode
3. Mini-batch

• In pattern mode, the whole sequence of forward and backward computation is performed, resulting in a weight adjustment for each pattern, from X(1)/D(1) to X(N)/D(N).
• Pattern mode of training is online and requires less local storage.

BATCH MODE OF TRAINING

• In batch mode, the weight updating is done after all N patterns are presented.
• In batch gradient descent, all the training data is taken into consideration to take a single step: the average of the gradients of all the training examples (the mean gradient) is used to update the parameters. That is just one step of gradient descent in one epoch.
• One complete presentation of the entire training set is called an EPOCH.
• Batch mode is more accurate.

Mini-batch learning

• A combination of the batch and pattern (on-line) modes of training can provide a good trade-off between the two methods.
• Training cases are separated into several portions (mini-batches), and within each portion batch training is performed.

GENERAL BPA: MORE THAN ONE OUTPUT AND HIDDEN NEURONS (figure)

Randomly select a vector pair (xp, yp) from the training set X and compute the outputs of all neurons in the network (at the hidden and output layers) – forward pass.

Calculate the error vectors E3, E2 and E1:

E3 = (yd − y)·f'(neto)
E2 = f'(neth2)·E3·g(t)
E1 = f'(neth1)·E2·w(t)

Calculate the new weights:

g(t+1) = g(t) + η·E3·f(neth2)
w(t+1) = w(t) + η·E2·f(neth1)
h(t+1) = h(t) + η·E1·x

Calculate the error component at the kth output node for the pth pattern:

δo_kp = (y_kp − o_kp)·f'(neto_kp)

Write the expression for the error component at the jth hidden node for pattern p:

δh_jp = f'(neth_jp)·Σ_{k=1}^{K} δo_kp·w_kj

Compute the error for pattern p across all K nodes at the output layer:

δo_p = Σ_{k=1}^{K} δo_kp

GENERAL BPA

5. Update the connection-weight values to the output layer:

Δw_kj = η·δo_kp·f(neth_jp)

6. Update the connection-weight values from the input to the hidden layer:

Δw_ji = η·δh_jp·x_i

• Repeat steps 1 to 6 for all vector pairs in the training set; this is called an EPOCH.
• Run as many epochs as required to reduce the network error E below a threshold.
• Expression for the SSE:

E = Σ_{p=1}^{P} Σ_{k=1}^{K} (y_kp − o_kp)²

Worked example

A single-input feedforward neural network is to be trained to learn the unit step response of a first-order system in the time domain, described by the equation [1 − e^(−t/τ)], for three different time constants (τ = 0.5 s, 1 s, and 2 s), from time t = 0 to 10 s. There is one hidden layer having two nodes.

i) Draw the network architecture. Identify the number of input/output nodes.
ii) Perform one forward and one backward pass for training the network at t = 1 for the response at the three time constants (τ = 0.5 s, 1 s, and 2 s), and calculate the change in weights.

[Learning rate is 0.1; the weights between the input and hidden layer are 0.4, and between the hidden and output layer 0.2; no bias input. The activation function at the hidden layer is logsigmoid with a slope of 2, and at the output layer ReLU.] Truncate all values to 4 places after the decimal.

DRAW THE NETWORK ARCHITECTURE (figure: one input node, two hidden nodes, three output nodes)

CALCULATE DESIRED OUTPUTS

yd1 = 1 − e^(−t/τ1) = 1 − e^(−1/0.5) = 0.8646
yd2 = 1 − e^(−t/τ2) = 1 − e^(−1/1) = 0.6321
yd3 = 1 − e^(−t/τ3) = 1 − e^(−1/2) = 0.3934

netH  netH1  netH2 10.4  0.4 Error vector at output layer


1
H1out  H2out  f (netH )  ;  2   y1   ( yd 1  y1out )  f '(netO1 ) 
1exp(.netH )  
 output   y 2   ( yd 2  y1out )  f '(netO 2 ) 
1  y 3   ( yd 3  y1out ) f '(netO 3 ) 
  0.6899  
1 exp(20.4)
f '(netH )  2 f (netH )(1 f (netH ))  y1  (0.8646  0.2759) f '(netO )   0.5887 1  .5887 
 
 output   y 2    (0.6321  0.2759) f '(netO )    0.3562 1  .3562 
O1in  O2in  O3in  20.20.6899  0.2759  netO    
 y 3  (0.3934  0.2759) f '(netO )   0.1175 1  .1175 
y1out  y2out  y3out  f (neto )  Relu(neto )  Relu(0.2759)  0.2759  

wH1  y1   y1 H 1out (  f ( net H ))  0.1 0.5887  0.6899  0.0406


wH 2  y1   y1 H 2out (  f (net H ))  0.1 0.5887  0.6899  0.0406

wH1  y2   y2 H 1out (  f ( net H ))  0.1 0.3562  0.6899  0.0245


wH 2  y2   y2 H 2out (  f ( net H ))  0.1 0.3562  0.6899  0.0245

   ( y1wH1 y1   y 2 wH1 y 2   y3wH1 y3 )  f '(netH ) 


 H   H1     wH1  y3   y3 H1out (  f (net H ))  0.1  0.1175  0.6899  0.0081
 H 2   ( y1wH 2 y1   y 2 wH 2 y 2   y3wH 2 y3 )  f '(netH )
wH 2  y3   y3 H 2out ( f (net H ))  0.1 0.1175  0.6899  0.0081
f (net H )  0.6899, f '(net H )  2  f (netH )  (1  f (netH ));

  ( y1   y 2   y3 )  0.2  2  0.6899(1 0.6899)  0.0909 wx  H1   H1 x  0.2  0.0909  1.0  0.0181
 H   H1    
 H 2  ( y1   y 2   y3 )  0.2  2  0.6899(1 0.6899)  0.0909 wx  H 2   H 2 x  0.2  0.0909  1.0  0.0181

19
5/4/2022
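A minimal MATLAB sketch reproducing the forward and backward pass above (1 input, 2 logsigmoid hidden nodes with slope 2, 3 ReLU outputs, η = 0.1):

% One forward/backward pass for the 1-2-3 network of the worked example.
x = 1; eta = 0.1; lam = 2;
wIH = [0.4; 0.4];                  % input -> hidden weights
wHO = 0.2 * ones(3, 2);            % hidden -> output weights
yd  = 1 - exp(-1 ./ [0.5; 1; 2]);  % desired outputs at t = 1

netH = wIH * x;                    % forward pass
Hout = 1 ./ (1 + exp(-lam * netH));
netO = wHO * Hout;
y    = max(netO, 0);               % ReLU output layer

dO   = (yd - y) .* (netO > 0);     % output deltas (f'(net) = 1 for net > 0)
dH   = (wHO.' * dO) .* (lam * Hout .* (1 - Hout));  % hidden deltas
dwHO = eta * dO * Hout.';          % hidden -> output changes (~0.0406, 0.0245, 0.0081)
dwIH = eta * dH * x;               % input -> hidden changes (~0.0090)
disp(dwHO); disp(dwIH);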

• Normalizing both the input & target data sets during the pre-processing stage ensures that the network output always falls into a normalized range.
• During the post-processing stage, the network output is transformed back into the units of the original target data.

Standardization

• Standardization transforms the data to have mean 0 and variance 1, having no units:

X_standardized = (X_current − X̄) / σ_X

• Standardization is useful for comparing variables expressed in different units.

Normalizing in the interval [X_min-new, X_max-new]: see the formula below.

Postprocessing (denormalization of output data)

X_normalized = (X_current − X_min) / (X_max − X_min) × (X_max-new − X_min-new) + X_min-new

Y_denorm = (Y_current − Y_min) / (Y_max − Y_min) × (Y_denorm-max − Y_denorm-min) + Y_denorm-min

where Y_max and Y_min are the max and min activation values of the output neurons, and Y_denorm-max and Y_denorm-min are the maximum and minimum values of the denormalized Y.

Example: Input: 0–100 °C. Output (valve opening): 10–100%. Both input and output have been normalized to (−1 to +1). Temp = 68 °C; output of the network = −0.5. Find the normalized input and the denormalized output.

Temp_norm (−1 to +1) = (68 − 0) / (100 − 0) × [1 − (−1)] + (−1) = 0.36

Valve opening (denormalized) = (−0.5 − (−1)) / (1 − (−1)) × [100 − 10] + 10 = 32.5%
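The same mapping written once as reusable MATLAB anonymous functions (the names are illustrative):

% Min-max mapping between an original range [lo, hi] and a new range [nlo, nhi].
normalize   = @(x, lo, hi, nlo, nhi) (x - lo) ./ (hi - lo) .* (nhi - nlo) + nlo;
denormalize = @(y, nlo, nhi, lo, hi) (y - nlo) ./ (nhi - nlo) .* (hi - lo) + lo;

t_norm = normalize(68, 0, 100, -1, 1)       % -> 0.36
valve  = denormalize(-0.5, -1, 1, 10, 100)  % -> 32.5 (%)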

ADVANTAGES OF NEURAL NETWORKS

• Good fit for non-linear models.
• Ability to adapt, learn, generalize and extrapolate; ability to operate in high-noise environments; ease of maintenance.
• Very good learning ability.
• Good fault and uncertainty tolerance.

DISADVANTAGES OF NEURAL NETWORKS

• Need lots of training data sets.
• Need lots of CPU power and time (horsepower).
• Unpredictable for utilization in untrained areas.

Data-based / experimental / black-box models

• Models based on first principles (white models) for complex processes become difficult because of poor knowledge of process kinetics, order, and parameters.
• Black-box models are data dependent, and model parameters are determined by experimental results / wet labs.
• Black-box / data-based models do not describe the mechanistic phenomena of the process.
• Black-box models, based on input-output data only, describe the overall behavior of the process.
• Data-based models are especially appropriate for problems that are data rich but hypothesis- and/or information-poor.
• In all cases, the availability of a sufficient number of quality data points is required to propose a good model.
• Quality data (noise-free data, free of outliers) is ensured by data mining and pre-conditioning/processing.

• ANNs accurately recognize the inherent relationship between any set of inputs and outputs without formulating a mathematical model of the phenomenon under consideration.
• The ability of an ANN, trained on a finite set of examples, to predict accurately on unseen inputs is called Generalization.
• From a function-approximation point of view, such new examples can be regarded as either interpolation or extrapolation of the training data.
• With respect to neural networks, the operation of interpolation is preferred over extrapolation.

WHICH CASE LEADS TO POOR GENERALIZATION? (figure)

• Over-fitting results in poor generalization: the network fails to respond well when tested and simulated with an unseen data set.
• The network actually memorizes the samples it is trained with, instead of learning to generalize the process to respond to unknown conditions.

A complex plant is like an unknown function (figure: forward modeling).

Single-hidden-layer feed-forward network (MATLAB):

[x,t] = simplefit_dataset;
net = feedforwardnet(10);    % one hidden layer of 10 neurons
net = train(net,x,t);        % automatically configures the network so that its input,
                             % output, weight, and bias dimensions match the data
view(net)
y = net(x);
perf = perform(net,y,t)      % perf = 1.4639e-04

net = feedforwardnet([10]);        % equivalent single-hidden-layer form
net = feedforwardnet([10,10,10]);  % multilayer: three hidden layers of 10 neurons each

net = feedforwardnet;              % default: 10 neurons in a single hidden layer
net = feedforwardnet([10 11 12]);  % 3 hidden layers of size 10, 11 and 12
net = configure(net,X,T);          % match input, output, weight and bias dimensions
                                   % to the input and target data
net = train(net,Xc,Tc);
net = train(net,X,T,'useGPU','yes');
Yc = net(Xc);

Example: a process model has three inputs a, b and c and generates an output y:

y = 5a + b·c + 7c

We take this model for data generation. In actual cases, you don't have the mathematical model; you generate the data by running the real system.

a = rand(1,1000);
b = rand(1,1000);
c = rand(1,1000);
n = rand(1,1000)*0.05;        % n is noise, added deliberately to make the data
                              % more like real data
y = a*5 + b.*c + 7*c + n;
P = [a; b; c];                % inputs: the set of a, b and c
T = y;                        % target output
net = feedforwardnet(4);      % hidden layer size 4
net = train(net,P,T);

Input nodes = 3 (a, b, c); hidden nodes in the single hidden layer = 4; output nodes = 1 (y).
The trained weight matrix from input to hidden layer is 4×3; the weight matrix from hidden to output is 1×4.

y1 = sim(net,[1 1 1]');
% OR
y1 = net([1 1 1]');

y1 = 13.0279, which is close to 13, the actual output (5·1 + 1·1 + 7·1).

INTERACTING TANKS

H2(s)/Qi(s) = R2 / [τ1·τ2·s² + (τ1 + τ2 + A1·R2)·s + 1]

Three inputs: 1) time t; 2) time constant τ1; 3) time constant τ2.
One output: height of tank 2, h2.
Generate input/output data for (τ1, τ2) = (1,1), (1,2), (2,1), (2,2).

clear all;
close all;
clc;
t = 0:1:10;                 % 0 1 2 3 4 5 6 7 8 9 10
[m, n] = size(t);           % m (rows) = 1, n (columns) = 11
% R2 = 1, A1 = 2

sys1 = tf([1],[1 4 1]);     % training data for H2 with T1=1, T2=1
r11 = step(sys1,t);

sys1 = tf([1],[2 5 1]);     % training data for H2 with T1=1, T2=2
r12 = step(sys1,t);

sys1 = tf([1],[2 5 1]);     % training data for H2 with T1=2, T2=1
r21 = step(sys1,t);

sys1 = tf([1],[4 6 1]);     % training data for H2 with T1=2, T2=2
r22 = step(sys1,t);

a  = ones(2,n);             % 2 rows, n columns of ones
a1 = a;                     % (T1,T2) = (1,1)
a2 = [1 0; 0 2]*a;          % (T1,T2) = (1,2): row of 1s over row of 2s
a3 = [2 0; 0 1]*a;          % (T1,T2) = (2,1): row of 2s over row of 1s
a4 = 2*a;                   % (T1,T2) = (2,2)

A = [a1 a2 a3 a4];
T = [t t t t];
p = [T; A];                 % 3 x 44 pattern matrix
r1 = [r11; r12; r21; r22];  % the four step responses stacked
r  = transpose(r1);         % 1 x 44 target vector

% The final pattern matrix p is:
% time: 0.....10  0.....10  0.....10  0.....10
% T1:   1......1  1......1  2......2  2......2
% T2:   1......1  2......2  1......1  2......2

Inverse models of dynamical systems are used in designing controllers.

An inverse model maps outputs (behaviors) onto the controls (actions) that produce them.

After training, the controller is a neural network (figure).

• When neural networks are used for modeling inverse dynamics, we are designing direct neural controllers.
• Direct design means that a neural network directly implements the controller.
• The network must be trained as the controller, using numerical input-output data or a mathematical model of the system.

END
