Neural Network

The document outlines the structure and syllabus for a course on Neural Networks and Applications, including various types of neural networks and their applications. It provides model questions and answers to help students understand university question patterns. Key topics include supervised and unsupervised learning, activation functions, and the implementation of logical functions using neural networks.

Uploaded by

Tridip kundu
NEURAL NETWORK & APPLICATIONS

Introduction to Neural Networks
Single-Layer Perceptrons
Radial Basis Function Networks
Associative Memory Networks
Applications

NOTE: The WBUT course structure and syllabus of the 8th semester has been changed from 2014. NEURAL NETWORK & APPLICATIONS [EC 802A] has been introduced in the present curriculum as a new subject. We are providing chapterwise model questions and answers along with the complete solutions of new university papers, so that students can get an idea of university question patterns.

POPULAR PUBLICATIONS

INTRODUCTION TO NEURAL NETWORKS

Multiple Choice Type Questions

1. In a neural net, if for the training input vectors the target output is not known, the training method adopted is called [WBUT 2014, 2015]
a) supervised training b) unsupervised training c) reinforcement training d) none of these
Answer: (b)

2. The gradient descent rule is mostly used in [WBUT 2014]
a) M-P Neural Learning b) Hebb Neural Learning c) Back-Propagation Neural Learning d) Adaline Neural Learning
Answer: (c)

3. ADALINE stands for [WBUT 2014, 2017]
a) Additive Linear Neuron b) Adaptive Linear Neuron c) Associative Linear Neuron d) Adaptive De…
Answer: (b)

4. Which of the following neural networks uses supervised learning? [WBUT 2014, 2015]
a) simple recurrent network b) self-organizing feature map c) Hopfield network d) all of these
Answer: (a)

5. Which of the following algorithms can be used to train a single-layer feedforward network? [WBUT 2014, 2015, 2017]
a) hard competitive learning b) soft competitive learning c) a genetic algorithm d) all of these
Answer: (d)

6. Supervised learning means [WBUT 2015]
a) having a teacher b) having a class c) having a feedback d) none of these
Answer: (a)

7. 
Bias is [WBUT 2015]
a) weight on a connection from a unit having activation 1 b) weight on a network having activation 2 c) weight on a function having activation 1 d) none of these
Answer: (a)

8. The McCulloch model uses [WBUT 2015]
a) Sigmoid function b) Step function c) Signum function d) Tan hyperbolic function
Answer: (b)

9. The competitive rule is suited for [WBUT 2015]
a) unsupervised network training b) … c) reinforced network training d) supervised network training
Answer: (a)

10. The learning algorithm continues until no further change in [WBUT 2016]
a) … b) non-linearity c) … d) learning rate
Answer: (a)

11. The synapse of a neuron is modeled by a [WBUT 2016, 2018]
a) linear function b) non-linear function c) non-linear rough function d) linear/non-linear function
Answer: (d)

12. Functional value of the bipolar sigmoid function is [WBUT 2016]
a) 0 to 1 b) -1 to 1 c) any positive value d) none of these
Answer: (b)

13. The Hebbian rule is ... type of learning [WBUT 2017, 2018]
a) supervised b) unsupervised c) competitive d) reinforced
Answer: (b)

14. What are the advantages of neural networks over conventional computers? [WBUT 2017]
(I) They have the ability to learn by example
(II) They are more fault tolerant
(III) They are suited for real time operation due to their 'computational' rates
a) (I) and (II) are true b) (I) and (III) are true c) (II) and (III) are true d) all of these are true
Answer: (d)

15. Which of the following is/are true for neural networks? [WBUT 2017]
(I) The training time depends on the size of the network
(II) Neural networks can be simulated on a conventional computer
(III) Artificial neurons are identical in operation to biological ones
a) all of these are true b) (II) is true c) (II) and (III) are true d) (I) and (II) are true
Answer: (d)

16. 
Artificial Neural Networks are inspired by
a) swarm intelligence b) high speed parallel processing c) human brain d) All of these
Answer: (c)

17. Which of the following is an application of Neural Network? [MODEL QUESTION]
a) Sales forecasting b) Data validation c) Risk management d) All of these
Answer: (d)

18. The Adaline neural network can be used as an adaptive filter for echo cancellation in telephone circuits. For the telephone circuit given in the above figure, which one of the following signals carries the corrected message sent from the human speaker on the left to the human listener on the right? (Assume that the person on the left transmits an outgoing voice signal and receives an incoming voice signal from the person on the right.) [MODEL QUESTION]
a) The outgoing voice signal, $s$.
b) The delayed incoming voice signal, $n$.
c) The contaminated outgoing signal, $s + n_0$.
d) The output of the adaptive filter, $y$.
e) The error of the adaptive filter, $\varepsilon = s + n_0 - y$.
Answer: (e)

19. What is classification? [MODEL QUESTION]
a) Deciding which features to use in a pattern recognition problem
b) Deciding which class an input pattern belongs to
c) Deciding which type of neural network to use
Answer: (b)

20. What is a pattern vector? [MODEL QUESTION]
a) A vector of weights $w = [w_1, w_2, \ldots, w_n]^T$ in a neural network.
b) A vector of measured features $x = [x_1, x_2, \ldots, x_n]^T$ of an input example.
c) A vector of outputs $y = [y_1, y_2, \ldots, y_n]^T$ of a classifier.
Answer: (b)

Short Answer Type Questions

1. Implement AND function using McCulloch Pitts neuron (take binary data). [WBUT 2014]
Answer:
The AND function returns a true value only if both the inputs are true, else it returns a false value. '1' represents a true value and '0' represents a false value. The truth table for the AND function is:

x1   x2   y
1    1    1
1    0    0
0    1    0
0    0    0

A McCulloch-Pitts neuron to implement the AND function is shown in Fig. 1. 
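The truth table for this neuron can be checked with a short sketch. It assumes unit excitatory weights on both inputs and a firing threshold of 2, which is the usual choice for a two-input AND; the function names `mcculloch_pitts` and `and_gate` are illustrative only.

```python
def mcculloch_pitts(inputs, weights, threshold):
    """Return 1 when the weighted input sum reaches the threshold, else 0."""
    net_input = sum(w * x for w, x in zip(weights, inputs))
    return 1 if net_input >= threshold else 0


def and_gate(x1, x2):
    # Assumed parameters: both weights equal to 1, threshold equal to 2.
    return mcculloch_pitts((x1, x2), (1, 1), threshold=2)


# Rows are (x1, x2, y), in the same order as the truth table.
truth_table = [(x1, x2, and_gate(x1, x2)) for x1 in (1, 0) for x2 in (1, 0)]
for row in truth_table:
    print(row)
```

Only the input pair (1, 1) produces a net input of 2, so it is the only pattern for which the neuron fires.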
The threshold on unit Y is 2. The output Y is

$Y = f(y_{in})$

The net input is given by

$y_{in} = \sum \text{weights} \times \text{inputs} = 1 \times x_1 + 1 \times x_2 = x_1 + x_2$

Fig. 1: McCulloch-Pitts neuron to perform logical AND function

From this the activations of the output neuron can be formed:

$Y = f(y_{in}) = 1$ if $y_{in} \ge 2$, and $0$ if $y_{in} < 2$

Now present the inputs:
(i) $x_1 = x_2 = 1$: $y_{in} = x_1 + x_2 = 1 + 1 = 2$, so $Y = f(y_{in}) = 1$ since $y_{in} = 2$.
(ii) $x_1 = 1, x_2 = 0$: $y_{in} = x_1 + x_2 = 1 + 0 = 1$, so $Y = f(y_{in}) = 0$ since $y_{in} = 1 < 2$. This is the same when $x_1 = 0$ and $x_2 = 1$.
(iii) $x_1 = 0, x_2 = 0$: $y_{in} = x_1 + x_2 = 0 + 0 = 0$, so $Y = f(y_{in}) = 0$ since $y_{in} = 0 < 2$.

2. What is the necessity of an activation function? List commonly used activation functions. [WBUT 2014, 2015]
OR, Discuss the different activation functions used for training artificial neural networks. [WBUT 2016]
OR, Discuss different Activation Functions that are used in Artificial Neural Network. [WBUT 2017]
Answer:
In a neural network each neuron has an activation function which specifies the output of a neuron for a given input. Neurons are switches that output a 1 when they are sufficiently activated and a 0 when not.

Commonly used activation functions:

Step Function: A step function is a function like that used by the original Perceptron. The output is a certain value, A1, if the input sum is above a certain threshold and A0 if the input sum is below a certain threshold. The values used by the Perceptron were A1 = 1 and A0 = 0.

Linear Combination: A linear combination is where the weighted sum input of the neuron plus a linearly dependent bias becomes the system output.

Continuous Log-Sigmoid Function: A log-sigmoid function, also known as a logistic function, is given by

$\sigma(t) = \dfrac{1}{1 + e^{-t}}$

Softmax Function: The softmax activation function is useful predominantly in the output layer of a clustering system. Softmax functions convert a raw value into a posterior probability. This provides a measure of certainty.

3. What is Adaline? Draw the model of an Adaline network. [WBUT 2014]
OR, What is Adaline? 
What type of learning is used in Adaline? [WBUT 2018]
Answer:
ADALINE (Adaptive Linear Neuron or later Adaptive Linear Element) is an early single-layer artificial neural network and the name of the physical device that implemented this network. Adaline is a single layer neural network with multiple nodes where each node accepts multiple inputs and generates one output. Given the following variables:
• $x$ is the input vector
• $w$ is the weight vector
• $n$ is the number of inputs
• $\theta$ is some constant
• $y$ is the output of the model
we find that the output is $y = \sum_{j=1}^{n} x_j w_j + \theta$. If we further assume that $x_{n+1} = 1$ and $w_{n+1} = \theta$, then the output reduces to the dot product of $x$ and $w$: $y = x \cdot w$.

Consider a single ADALINE with two inputs. The diagram for this network is shown below.

[Figure: Simple ADALINE with two inputs]

It uses supervised learning.

4. Implement XOR function using McCulloch-Pitts neuron (consider binary data). [WBUT 2014, 2016]
OR, Design a Hebb net to implement logical AND function with bipolar inputs and target. [WBUT 2017]
Answer:
The exclusive OR (XOR) has the truth table:

V1   V2   XOR
0    0    0
0    1    1
1    0    1
1    1    0

It cannot be represented with a single neuron, but the relationship XOR = (V1 OR V2) AND NOT (V1 AND V2) suggests that it can be represented with a network. The network is as shown below:

[Figure: network computing V1 XOR V2]

5. What is the impact of weight in an artificial neural network? [WBUT 2015]
OR, What is the role of weight and bias in an ANN model? [WBUT 2016]
How does a momentum factor make faster convergence of a network? [WBUT 2015]
Answer:
Individual nodes in a neural network emulate biological neurons by taking input data and performing simple operations on the data, selectively passing the results on to other neurons. The output of each node is called its "activation" (the terms "node values" and "activations" are used interchangeably here). 
Weight values are associated with each vector and node in the network, and these values constrain how input data (e.g., satellite image values) are related to output data (e.g., land-cover classes). Weight values associated with individual nodes are also known as biases. Weight values are determined by the iterative flow of training data through the network (i.e., weight values are established during a training phase in which the network learns how to identify particular classes by their typical input data characteristics).

Gradient descent is very slow if the learning rate $\alpha$ is small and oscillates widely if $\alpha$ is too large. One very efficient and commonly used method that allows a larger learning rate without oscillations is to add a momentum factor to the normal gradient descent method. The momentum factor is denoted by $\eta \in [0, 1]$, and the value of 0.9 is often used. A momentum factor can be used with either pattern-by-pattern updating or batch-mode updating. In the case of batch mode, it has the effect of complete averaging over the patterns. Even though the averaging is only partial in the pattern-by-pattern mode, it leaves some useful information for weight updating.

6. Define Delta rule. Write down the error function for delta rule. [WBUT 2016, 2019]
Answer:
1st Part:
The delta rule, also called the Least Mean Square (LMS) method, is one of the most commonly used learning rules. For a given input vector, the output vector is compared to the correct answer. If the difference is zero, no learning takes place; otherwise, the weights are adjusted to reduce this difference. The activation function in this case is called a linear activation function, in which the output node's activation is simply equal to the sum of the network's respective input/weight products. 
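A single delta-rule update for one linear output node can be sketched as follows. This is a minimal illustration, not the book's own procedure: the learning rate of 0.1, the training pattern, and the name `delta_rule_step` are all assumptions made for the example.

```python
def delta_rule_step(weights, inputs, target, learning_rate=0.1):
    """One delta-rule (LMS) update for a single linear output node."""
    # Linear activation: the output is the sum of the input/weight products.
    output = sum(w * x for w, x in zip(weights, inputs))
    error = target - output
    # Each weight moves in proportion to the error and to its own input.
    new_weights = [w + learning_rate * error * x for w, x in zip(weights, inputs)]
    return new_weights, error


weights = [0.0, 0.0]
sample_inputs, sample_target = [1.0, 2.0], 1.0  # made-up training pattern
errors = []
for _ in range(50):
    weights, error = delta_rule_step(weights, sample_inputs, sample_target)
    errors.append(abs(error))

print(weights, errors[-1])
```

When the error is zero the update term vanishes, which matches the statement that no learning takes place once the output equals the correct answer.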
The strengths of the network's connections (i.e., the values of the weights) are adjusted to reduce the difference between target and actual output activation (i.e., error).

2nd Part:
The Delta Rule employs the error function for what is known as gradient descent learning, which involves the modification of weights along the most direct path in weight-space to minimize error; the change applied to a given weight is proportional to the negative of the derivative of the error with respect to that weight. The error function is commonly given as the sum of the squares of the differences between all target and actual node activations for the output layer. For a particular training pattern (i.e., training case), error is thus given by:

$E_p = \dfrac{1}{2} \sum_n (t_{jn} - a_{jn})^2$

where $E_p$ is the total error over the training pattern, $\frac{1}{2}$ is a value applied to simplify the function's derivative, $n$ represents all output nodes for a given training pattern, $t_{jn}$ represents the target value for node $n$ in output layer $j$, and $a_{jn}$ represents the actual activation for the same node. This particular error measure is attractive because its derivative, whose value is needed in the employment of the Delta Rule, is easily calculated. Error over an entire set of training patterns (i.e., over one iteration, or epoch) is calculated by summing all $E_p$:

$E = \sum_p E_p = \dfrac{1}{2} \sum_p \sum_n (t_{jn} - a_{jn})^2$

where $E$ is the total error, and $p$ represents all training patterns.

7. Discuss different categories of learning rules. [WBUT 2017]
Answer:
Different categories of learning rules are:
• Supervised Learning: The learning algorithm falls under this category if the desired output for the network is also provided with the input while training the network. By providing the neural network with both an input and output pair it is possible to calculate an error based on its target output and actual output. It can then use that error to make corrections to the network by updating its weights. 
• Unsupervised Learning: In this paradigm the neural network is only given a set of inputs and it is the neural network's responsibility to find some kind of pattern within the inputs provided without any external aid. This type of learning paradigm is often used in data mining and is also used by many recommendation algorithms due to their ability to predict a user's preferences based on the preferences of other similar users it has grouped together.
• Reinforcement Learning: Reinforcement learning is similar to supervised learning in that some feedback is given; however, instead of providing a target output, a reward is given based on how well the system performed. The aim of reinforcement learning is to maximize the reward the system receives through trial and error. This paradigm relates strongly to how learning works in nature; for example, an animal might remember the actions it has previously taken which helped it to find food (the reward).

8. Compare biological neuron and ANN. [WBUT 2017]
Answer:
Artificial neural nets were originally designed to model in some small way the functionality of the biological neural networks which are a part of the human brain. Our brains contain about $10^{11}$ neurons. Each biological neuron consists of a cell body, a collection of dendrites which bring electrochemical information into the cell, and an axon which transmits electrochemical information out of the cell. A neuron produces an output along its axon, i.e., it fires, when the collective effect of its inputs reaches a certain threshold. The axon from one neuron can influence the dendrites of another neuron across junctions called synapses. Some synapses will generate a positive effect in the dendrite, i.e. one which encourages its neuron to fire, and others will produce a negative effect, i.e. one which discourages the neuron from firing. A single neuron receives inputs from perhaps $10^{5}$ synapses and the total number of synapses in our brains may be of the order of $10^{16}$. 
It is still not clear exactly how our brains learn and remember, but it appears to be associated with the interconnections between the neurons (i.e. at the synapses).

Artificial neural nets try to model this low level functionality of the brain. This contrasts with high level symbolic reasoning in artificial intelligence, which tries to model the high level reasoning processes of the brain. When we think, we are conscious of manipulating concepts to which we attach names (or symbols), e.g. for people or objects. We are not conscious of the low level electrochemical processes which are going on underneath. The argument for the neural net approach to AI is that, if we can model the low level activities correctly, the high level functionality may be produced as an emergent property.

It can be seen from the above that there is an analogy between biological (human) and artificial neural nets. The analogy is summarized below.

Human        Artificial
Neuron       Processing Element
Dendrites    Combining Function
Cell Body    Transfer Function
Axons        Element Output
Synapses     Weights

However, it should be stressed that the analogy is not a strong one. Biological neurons and neuronal activity are far more complex than might be suggested by studying artificial neurons. Real neurons do not simply sum the weighted inputs, and the dendritic mechanisms in biological systems are much more elaborate. Also, real neurons do not stay on until the inputs change, and the outputs may encode information using complex pulse arrangements.

9. What is Boltzmann learning? How does it differ from Error-Correction learning? [WBUT 2017]
Answer:
1st Part:
Boltzmann learning is statistical in nature, and is derived from the field of thermodynamics. It is similar to error-correction learning and is used during supervised training. In this algorithm, the state of each individual neuron, in addition to the system output, is taken into account. In this respect, the Boltzmann learning rule is significantly slower than the error-correction learning rule. 
Neural networks that use Boltzmann learning are called Boltzmann machines.
Boltzmann learning is similar to an error-correction learning rule, in that an error signal is used to train the system in each iteration. However, instead of a direct difference between the result value and the desired value, we take the difference between the probability distributions of the system.

2nd Part:
Boltzmann learning is similar to an error-correction learning rule, in that an error signal is used to train the system in each iteration. However, instead of a direct difference between the result value and the desired value, we take the difference between the probability distributions of the system. It is also significantly slower than the error-correction learning rule.

10. How can a neural network be applied for pattern classification and clustering? [WBUT 2017]
Answer:
Classification
• The assignment of each object to a specific "class"
• We are provided with a "training set"
• Example: recognizing printed or handwritten characters
Clustering
• Clustering requires grouping together objects that are similar to each other

11. a) What is the delta learning rule? [WBUT 2017]
b) Compare the delta learning rule and the Perceptron learning rule.
Answer:
a) Refer to Question No. 6 (1st Part) of Short Answer Type Questions.
b) There are two differences between the perceptron rule and the delta rule. The perceptron rule is based on an output from a step function, whereas the delta rule uses the linear combination of inputs directly. The perceptron rule is guaranteed to converge to a consistent hypothesis assuming the data is linearly separable. The delta rule converges only in the limit, but it does not need the condition of linearly separable data.

12. What are the parameters to increase the efficiency of a Hebbian Synapse as a function of the correlation between the pre-synaptic and post-synaptic activities? 
How do the parameters influence the Hebbian Synapse? [WBUT 2018]
Answer:
A Hebbian synapse is a synapse that uses a time-dependent, highly local, and strongly interactive mechanism to increase synaptic efficiency as a function of the correlation between the presynaptic and postsynaptic activities.
• Time-dependent mechanism. This mechanism refers to the fact that the modifications in a Hebbian synapse depend on the exact time of occurrence of the presynaptic and postsynaptic activities.
• Local mechanism. By its very nature, a synapse is the transmission site where information-bearing signals (representing ongoing activity in the presynaptic and postsynaptic units) are in spatiotemporal contiguity. This locally available information is used by a Hebbian synapse to produce a local synaptic modification that is input-specific. It is this local mechanism that enables a neural network made up of Hebbian synapses to perform unsupervised learning.
• Interactive mechanism. Here we note that the occurrence of a change in a Hebbian synapse depends on activity levels on both sides of the synapse. That is, a Hebbian form of learning depends on a "true interaction" between presynaptic and postsynaptic activities in the sense that we cannot make a prediction from either one of these two activities by itself. Note also that this dependence or interaction may be deterministic or statistical in nature.
• Conjunctional or correlational mechanism. One interpretation of Hebb's postulate of learning is that the condition for a change in synaptic efficiency is the conjunction of presynaptic and postsynaptic activities. Thus, according to this interpretation, the co-occurrence of presynaptic and postsynaptic activities (within a short interval of time) is enough to produce the synaptic modification. It is for this reason that a Hebbian synapse is sometimes referred to as a conjunctional synapse. For another interpretation of 
Hebb's postulate of learning, we may think of the interactive mechanism characterizing a Hebbian synapse in statistical terms. The correlation over time between presynaptic and postsynaptic activities is viewed as being responsible for a synaptic change.

13. What are the differences between Supervised and Unsupervised learning? How does Reinforcement learning differ from Supervised Learning? [WBUT 2018]
Answer:
Refer to Question No. 7 of Short Answer Type Questions.

14. How is Competitive Learning different from Hebbian Learning? [WBUT 2018]
Answer:
Competitive learning is a form of unsupervised learning in artificial neural networks, in which nodes compete for the right to respond to a subset of the input data. The significant difference between competitive learning and Hebbian learning is in the number of active neurons at any one time. Whereas in a neural network based on Hebbian learning several output neurons may be active simultaneously, in competitive learning only a single output neuron is active at any one time. Because of this feature, competitive learning is highly suitable for discovering statistically salient features, which makes it useful for classification of input patterns.

15. Describe the main differences between the human brain and today's computers (such as your desktop PC) in terms of information processing. [MODEL QUESTION]
Answer:
• The brain works in a highly parallel fashion, but in the PC, everything has to go through one or several processors.
• Neurons compute slowly (several ms per computation), electronic elements compute fast ( [...]

[...]

$Sum = \sum_i I_i W_i$

$y = f(Sum)$

$W_1, W_2, \ldots, W_m$ are weight values normalized in the range of either (0, 1) or (-1, 1) and associated with each input line, $Sum$ is the weighted sum, and $T$ is a threshold constant. The function $f$ is a linear step function at threshold $T$ as shown in figure (a) below. The symbolic representation of the linear threshold gate is shown in figure (b).

Fig. (a): Linear Threshold Function
Fig. (b): Symbolic Illustration of Linear Threshold Gate (inputs, weights, threshold T)

The McCulloch-Pitts model of a neuron is simple yet has substantial computing potential. It also has a precise mathematical definition. However, this model is so simplistic that it only generates a binary output, and the weight and threshold values are fixed.

3. Write short notes on the following:
a) Memory based learning [WBUT 2014]
b) Supervised learning [WBUT 2016]
c) Neural network architecture [WBUT 2016]
d) Gradient descent learning [WBUT 2017]
e) Competitive Learning [WBUT 2017]
f) Boltzmann Learning [WBUT 2018]
g) Reinforcement learning [WBUT 2018]
Answer:
a) Memory based learning:
Memory Based Learning (MBL) is based on the idea that intelligent behavior can be obtained by analogical reasoning, rather than by the application of abstract mental rules as in rule induction and rule-based processing. In particular, MBL is founded on the hypothesis that the extrapolation of behavior from stored representations of earlier experience to new situations, based on the similarity of the old and the new situation, is of key importance. MBL algorithms take a set of examples (fixed-length patterns of feature-values and their associated class) as input, and produce a classifier which can classify new, previously unseen, input patterns. MBL can in principle be applied to any kind of classification task with symbolic or numeric features and discrete (non-continuous) classes for which training data is available.

b) Supervised learning:
Supervised learning is the machine learning task of inferring a function from supervised training data. The training data consist of a set of training examples. In supervised learning, 
each example is a pair consisting of an input object (typically a vector) and a desired output value (also called the supervisory signal). A supervised learning algorithm analyzes the training data and produces an inferred function, which is called a classifier (if the output is discrete) or a regression function (if the output is continuous). The inferred function should predict the correct output value for any valid input object. This requires the learning algorithm to generalize from the training data to unseen situations in a "reasonable" way (see inductive bias). The parallel task in human and animal psychology is often referred to as concept learning.

c) Neural network architecture:
Humans and other animals process information with neural networks. These are formed from billions of neurons (nerve cells) exchanging brief electrical pulses called action potentials. Computer algorithms that mimic these biological structures are formally called artificial neural networks to distinguish them from the squishy things inside of animals. However, most scientists and engineers are not this formal and use the term neural network to include both biological and nonbiological systems.

[Figure: Neural network architecture, showing the information flow from the input layer (passive nodes) through the hidden layer (active nodes) to the output layer (active nodes)]

This is the most common structure for neural networks: three layers with full interconnection. The input layer nodes are passive, doing nothing but relaying the values from their single input to their multiple outputs. In comparison, the nodes of the hidden and output layers are active, modifying the signals in accordance with the figure. The action of this neural network is determined by the weights applied in the hidden and output nodes.

d) Gradient descent learning:
Gradient descent is an optimization algorithm used to find the values of parameters (coefficients) of a function (f) that minimizes a cost function (cost). 
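As a minimal sketch of this idea, the loop below repeatedly moves one coefficient against the derivative of a cost. The quadratic cost, the starting value, the learning rate `alpha` and the step count are assumptions chosen only to make the example concrete; `gradient_descent`, `cost` and `cost_derivative` are illustrative names.

```python
def gradient_descent(derivative, start=0.0, alpha=0.1, steps=100):
    """Minimize a one-parameter cost by stepping against its derivative."""
    coefficient = start
    for _ in range(steps):
        delta = derivative(coefficient)               # slope of the cost here
        coefficient = coefficient - alpha * delta     # step downhill
    return coefficient


# Hypothetical cost with its minimum at coefficient = 3.
def cost(c):
    return (c - 3.0) ** 2


def cost_derivative(c):
    return 2.0 * (c - 3.0)


best = gradient_descent(cost_derivative)
print(best, cost(best))
```

With this cost and `alpha = 0.1`, each step shrinks the distance to the minimum by a factor of 0.8, so the loop settles near 3 well within 100 steps. A much larger `alpha` would overshoot instead of converging.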
Gradient descent is best used when the parameters cannot be calculated analytically (e.g. using linear algebra) and must be searched for by an optimization algorithm.
The procedure starts off with initial values for the coefficient or coefficients of the function. These could be 0.0 or a small random value.

    coefficient = 0.0

The cost of the coefficients is evaluated by plugging them into the function and calculating the cost.

    cost = f(coefficient) or cost = evaluate(f(coefficient))

The derivative of the cost is calculated. The derivative is a concept from calculus and refers to the slope of the function at a given point. We need to know the slope so that we know the direction (sign) to move the coefficient values in order to get a lower cost on the next iteration.

    delta = derivative(cost)

Now that we know from the derivative which direction is downhill, we can update the coefficient values. A learning rate parameter (alpha) must be specified that controls how much the coefficients can change on each update:

    coefficient = coefficient - (alpha * delta)

This process is repeated until the cost of the coefficients (cost) is 0.0 or close enough to zero to be good enough.

e) Competitive Learning:
In competitive learning the following properties hold true:
• Nodes compete for inputs
• The node with the highest activation is the winner
• The winner neuron adapts its tuning (pattern of weights) even further towards the current input
• Individual nodes specialize to win competition for a set of similar inputs
• The process leads to the most efficient neural representation of the input space
• Typical for unsupervised learning

f) Boltzmann Learning:
Refer to Question No. 9 of Short Answer Type Questions.

g) Reinforcement learning:
Reinforcement Learning is a type of Machine Learning, and thereby also a branch of Artificial Intelligence. 
It allows machines and software agents to automatically determine the ideal behaviour within a specific context, in order to maximize performance. Simple reward feedback is required for the agent to learn its behaviour; this is known as the reinforcement signal.

There are many different algorithms that tackle this issue. Reinforcement Learning is defined by a specific type of problem, and all its solutions are classed as Reinforcement Learning algorithms. In the problem, an agent is supposed to decide the best action to select based on its current state. When this step is repeated, the problem is known as a Markov Decision Process.

This automated learning scheme implies that there is little need for a human expert who knows about the domain of application. Much less time will be spent designing a solution, since there is no need for hand-crafting complex sets of rules as with Expert Systems, and all that is required is someone familiar with Reinforcement Learning.

The possible applications of Reinforcement Learning are abundant, due to the generality of the problem specification. As a matter of fact, a very large number of problems in Artificial Intelligence can be fundamentally mapped to a decision process. This is a distinct advantage, since the same theory can be applied to many different domain-specific problems with little effort. 
In practice, this ranges from controlling robotic arms to find the most efficient motor combination, to robot navigation where collision avoidance behaviour can be learnt by negative feedback from bumping into obstacles. Logic games are also well-suited to Reinforcement Learning, as they are traditionally defined as a sequence of decisions.

4. What is the principle of learning of the Adaline? Fully explain the Adaline architecture and learning algorithm. [MODEL QUESTION]
Answer:
• Learning method: delta rule (another form of error-driven learning), also called the Widrow-Hoff learning rule
• Try to reduce the mean squared error (MSE) between the net input and the desired output

Algorithm LMS-Adaline:
    Start with a randomly chosen weight vector $w_0$;
    Let $k = 1$;
    while MSE is unsatisfactory and computational bounds are not exceeded, do
        Let $i$ be an input vector (chosen randomly or in some sequence) for which $d$ is the desired output value;
        Update the weight vector to $w_k = w_{k-1} + \eta (d - w_{k-1} \cdot i)\, i$;
        Increment $k$;
    end-while.

5. What is Hebbian Learning? Explain using mathematical terms. [MODEL QUESTION]
Answer:
Hebb's postulate of learning is the oldest and most famous of all learning rules:
1. If two neurons on either side of a synapse (connection) are activated simultaneously (i.e. synchronously), then the strength of that synapse is selectively increased.
2. If two neurons on either side of a synapse are activated asynchronously, then that synapse is selectively weakened or eliminated.

Fig: Synaptic connection (input $x_j(n)$, weight $w_{kj}(n)$, output $y_k(n)$)

To formulate Hebb's postulate of learning in mathematical terms, consider a synaptic weight $w_{kj}$ with presynaptic and postsynaptic activities denoted by $x_j$ and $y_k$, respectively. According to Hebb's postulate, the adjustment applied to the synaptic weight $w_{kj}$ at time $n$ is

$\Delta w_{kj}(n) = F(y_k(n), x_j(n))$

As a special case we may use the activity product rule

$\Delta w_{kj}(n) = \eta\, y_k(n)\, x_j(n)$

where $\eta$ is a positive constant that determines the rate of learning. 
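The activity product rule can be tried out numerically with a short sketch. It assumes a single linear neuron whose output is the weight times the input, plus made-up values for the initial weight, the input and the learning rate; `hebb_update` is an illustrative name.

```python
def hebb_update(weight, x, y, eta):
    """Activity product rule: the weight changes by eta * y * x."""
    return weight + eta * y * x


weight, eta, x = 0.1, 0.5, 1.0  # assumed starting weight, rate and input
history = [weight]
for _ in range(5):
    y = weight * x  # assumption: linear neuron with a single input
    weight = hebb_update(weight, x, y, eta)
    history.append(weight)

print(history)
```

Under these assumptions every update multiplies the weight by (1 + eta * x * x), so with a constant input the weight keeps growing and never settles.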
This rule clearly emphasizes the correlational nature of a Hebbian synapse. From this representation we see that the repeated application of the input signal $x_j$ leads to an exponential growth that finally drives the synaptic weight $w_{ij}$ into saturation. For a linear connection with $y_i(n) = w_{ij}(n)\,x_j(n)$:
$w_{ij}(n+1) = w_{ij}(n) + \eta\, y_i(n)\, x_j(n) = w_{ij}(n)\,(1 + \eta x_j^2)$
If $x_j$ stays constant then
$w_{ij}(n+N) = w_{ij}(n)\,(1 + \eta x_j^2)^N$
To avoid such a situation from arising, we need to impose a limit on the growth of synaptic weights. One method for doing this is to introduce a nonlinear forgetting factor into the formula for the synaptic adjustment $\Delta w_{ij}(n)$. Specifically, we redefine $\Delta w_{ij}(n)$ as a generalized activity product rule:
$\Delta w_{ij}(n) = \eta\, y_i(n)\, x_j(n) - \alpha\, y_i(n)\, w_{ij}(n) = \alpha\, y_i(n)\,[c\, x_j(n) - w_{ij}(n)]$
where $c = \eta/\alpha$. If the weight $w_{ij}(n)$ increases to the point where $c\, x_j(n) - w_{ij}(n) = 0$, a balance point is reached and the weight update stops.

SINGLE-LAYER PERCEPTRONS

Multiple Choice Type Questions

1. A perceptron is [WBUT 2014]
a) a single layer feed-forward neural network with preprocessing
b) an autoassociative neural network
c) a double layer autoassociative neural network
d) all of these
Answer: (a)

2. In back-propagation algorithm ______ is propagated backward through the network. [WBUT 2014, 2016, 2018]
a) error b) signal c) error + signal d) signal - error
Answer: (a)

3. A perceptron is: [WBUT 2015]
a) a single layer feed-forward neural network with pre-processing
b) an auto-associative neural network
c) a double layer auto-associative neural network
d) Hebb network
Answer: (a)

4. A 3-input neuron is trained to output a zero when the input is 110 and a one when the input is 111. After generalization, the output will be zero when and only when the input is [WBUT 2016]
a) 000 or 110 or 011 or 101
b) 010 or 100 or 110 or 101
c) 000 or 010 or 110 or 100
Answer: (c)

5. The madaline network is
[WBUT 2016, 2018]
a) The combination of two single layered feed forward neural networks
b) A type of multilayered feed forward neural network with multiple neurons in output layer
c) The combination of adaline networks and multilayered feed forward network with one neuron in output layer
d) A type of feedback network
Answer: (b)

6. Single layer Perceptron is used for [WBUT 2017, 2018]
a) linear separability b) error minimization c) back propagation d) annealing
Answer: (a)

7. For a three input neuron representing a Perceptron where $\{x_1, x_2, x_3\} = \{0.8, 0.6, 0.4\}$, weights $\{w_1, w_2, w_3\} = \{0.1, 0.3, -0.2\}$ and bias [...], the output using the bipolar sigmoid activation function is
a) 0.265 b) 0.746 c) 0.346 d) 0.259
Answer: (a)

8. A 4-input neuron has weights 1, 2, 3 and 4. The transfer function is linear with the constant of proportionality being equal to [...]. The inputs are [...] respectively. The output will be: [WBUT 2017]
a) 238 b) 76 c) 119
Answer: (b)

9. For the four inputs (0, 0), (0, 1), (1, 0), (1, 1), a neuron representing a Perceptron with $w_1 = w_2 = 1$ and [...].5 does the classification of a [WBUT 2018]
a) AND classifier b) OR classifier c) XOR classifier d) None of these
Answer: (a)

10. The network of figure below is: [MODEL QUESTION]
a) a single layer feed-forward neural network
b) an autoassociative neural network
c) a multiple layer neural network
[Figure: network diagram with inputs x1 and x2]
Answer: (a)

11. A single perceptron can compute the XOR function. [MODEL QUESTION]
a) True b) False
Answer: (b)

12. A perceptron adds up all the weighted inputs it receives, and if the sum exceeds a certain value, it outputs a 1, otherwise it just outputs a 0. [MODEL QUESTION]
a) True
b) False
c) Sometimes; it can also output intermediate values as well
d) Can't say
Answer: (a)

13.
"The XOR problem can be solved by a multi-layer perceptron, but a multi-layer perceptron with bipolar step activation functions cannot learn to do this." [MODEL QUESTION]
a) True b) False
Answer: (a)

14. The Perceptron Learning Rule states that "for any data set which is linearly separable, the Perceptron Convergence Theorem is guaranteed to find a solution in a finite number of steps." [MODEL QUESTION]
a) True b) False
Answer: (b)

15. A perceptron with a unipolar step function has two inputs with weights $w_1 = 0.5$ and $w_2 = -0.2$, and a threshold $\theta = 0.3$ ($\theta$ can therefore be considered as a weight for an extra input which is always set to -1). For a given training example $x = [0, 1]^T$, the desired output is 1. Does the perceptron give the correct answer (that is, is the actual output the same as the desired output)? [MODEL QUESTION]
a) Yes b) No
Answer: (b)

16. A perceptron is guaranteed to perfectly learn a given linearly separable function within a finite number of training steps. [MODEL QUESTION]
a) True b) False
Answer: (a)

17. In backpropagation learning, we should start with a small learning parameter $\eta$ and slowly increase it during the learning process. [MODEL QUESTION]
a) True b) False
Answer: (b)

Short Answer Type Questions

1. The Exclusive-OR function is not linearly separable and hence a single-layer perceptron cannot simulate it. Justify it. [WBUT 2014]
OR,
Define the single layer perceptron net and its linear separability. [WBUT 2016]
Answer:
Consider the two input neuron shown in the figure below.

[Figure: two-input neuron with inputs X and Y, weights $W_1$ and $W_2$, a summing stage and a threshold]

The output from the summing stage of the neuron is:
$S = X W_1 + Y W_2$
We can re-arrange this equation into the equation of a straight line:
$Y = -\frac{W_1}{W_2} X + \frac{S}{W_2}$
which can be compared with $Y = mX + c$.
If the neuron is a simple threshold of the type shown above, then the physical meaning of this line is that it is the divider between the region in which the output is logic '1' and the region in which the output is logic '0', as shown in the figure below.

[Figure: line representing the neuron function (equation of the straight line given above), plotted together with the points of the XOR truth table]

The inputs for which we would like to produce a '0' output are shown as empty circles, and those which produce an output of '1' are shown as filled circles. It can be clearly seen that no matter where the line is plotted on the graph, the '0' outputs cannot be separated from the '1' outputs by the line, and hence a simple neuron cannot simulate an XOR gate. This class of problem is called Linear Separability, and we say that the XOR function is not linearly separable by a single Perceptron type neuron.

2. A 4-input neuron has weights 0.1, 0.2, 0.3 and 0.4. The transfer function is linear with the constant of proportionality being equal to 5. The inputs are 5, 10, 15 and 20 respectively. Find the output. [WBUT 2016]
Answer:
The output is found by multiplying the weights with their respective inputs, summing the results and multiplying by the constant of proportionality of the transfer function.
$5 \times (0.1 \times 5 + 0.2 \times 10 + 0.3 \times 15 + 0.4 \times 20) = 5 \times 15 = 75$

3. Which type of Activation Function is commonly used in Back Propagation algorithm? [WBUT 2018]
Answer:
The activation function is non-linear and differentiable. A commonly used activation function is the logistic function: $1/(1 + e^{-x})$.

4. Explain briefly the difference between a perceptron and a feed-forward, back-propagation neural network. [MODEL QUESTION]
Answer:
A neural network consists of a collection of perceptrons organized in layers, so that the output of one layer is the input to the next layer. The perceptrons are modified so that the output is a continuous function of the inputs, rather than being a step function. The learning algorithm is similar in principle to the perceptron learning algorithm, but involves a system of message passing from the output layer backwards to the input layer.

5. Consider the following data set T.
A and B are numerical attributes and Z is a Boolean classification. [MODEL QUESTION]

A   B   Z
1   2   T
2   1   F
3   2   T
1   1   F

a) Let P be the perceptron with weights $w_A = 2$, $w_B = 1$ and threshold $T = 4.5$. What is the value of the standard error function for this perceptron?
b) Find a set of weights and a threshold that categorizes all this data correctly.
Answer:
a) Perceptron P gets the first instance wrong, with an error of $|(2 \times 1) + (1 \times 2) - 4.5| = 0.5$, and the second instance wrong, with an error of $|(2 \times 2) + (1 \times 1) - 4.5| = 0.5$. The total error is therefore 1.0.
b) $w_A = 0$, $w_B = 1$, $T = 1.5$ will do fine.

6. What is the advantage of Adalines over perceptrons? How is it achieved? [MODEL QUESTION]
Answer:
The advantage of Adalines is that they do not simply find any solution that leads to perfect classification of the training set; they try to optimize their computation so that it works as well as possible with new (untrained) inputs. This is achieved by using a continuous, differentiable error function and minimizing this error using gradient descent. This error minimum is the best estimate for the optimal classification function with regard to the entire data set (of which the training data are usually only a small subset).

Long Answer Type Questions

1. List the stages involved in training of back propagation algorithm. [WBUT 2014]
Answer:
The back propagation learning algorithm can be divided into two phases: propagation and weight update.

Phase 1: Propagation
Each propagation involves the following steps:
1. Forward propagation of a training pattern's input through the neural network in order to generate the propagation's output activations.
2. Backward propagation of the propagation's output activations through the neural network using the training pattern target in order to generate the deltas of the output and hidden neurons.

Phase 2: Weight update
For each weight-synapse follow the following steps:
1. Multiply its output delta and input activation to get the gradient of the weight.
2. Subtract a ratio (percentage) of the gradient from the weight.
This ratio (percentage) influences the speed and quality of learning; it is called the learning rate. The greater the ratio, the faster the neuron trains; the lower the ratio, the more accurate the training is. The sign of the gradient of a weight indicates where the error is increasing; this is why the weight must be updated in the opposite direction.
Repeat phase 1 and 2 until the performance of the network is satisfactory.

2. a) Define perceptron learning rule. How the linear separability concept is implemented using perceptron network training? [WBUT 2015]
Answer:
1st Part:
Algorithm
  Start with a randomly chosen weight vector $w_0$;
  Let $k = 1$;
  while there exist input vectors that are misclassified by $w_{k-1}$, do
    Let $i_j$ be a misclassified input vector;
    Let $x_k = \mathrm{class}(i_j)\, i_j$, implying that $w_{k-1} \cdot x_k < 0$;
    Update the weight vector to $w_k = w_{k-1} + \eta x_k$;
    Increment $k$;
  end-while;
For example, consider some input $i$ with $\mathrm{class}(i) = -1$. If $w \cdot i > 0$, then we have a misclassification. Then the weight vector needs to be modified to $w + \Delta w$ so as to improve classification. We can choose $\Delta w = -\eta i$, because
$(w + \Delta w) \cdot i = (w - \eta i) \cdot i = w \cdot i - \eta\, i \cdot i < w \cdot i$,
and $i \cdot i$ is the square of the length of vector $i$ and is thus positive.
If $\mathrm{class}(i) = 1$, things are the same but with opposite signs; we introduce $x$ to unify these two cases.
$(w + \Delta w) \cdot i$