Introduction to Neural Networks Using MATLAB

S. N. Sivanandam, S. Sumathi, S. N. Deepa
Information contained in this work has been obtained by Tata McGraw-Hill, from sources believed to be reliable. However, neither Tata McGraw-Hill nor its authors guarantee the accuracy or completeness of any information, including the program listings, published herein, and neither Tata McGraw-Hill nor its authors shall be responsible for any errors, omissions, or damages arising out of use of this information. This work is published with the understanding that Tata McGraw-Hill and its authors are supplying information but are not attempting to render engineering or other professional services. If such services are required, the assistance of an appropriate professional should be sought.

Copyright © 2006, by Tata McGraw-Hill Publishing Company Limited
Second reprint 2006

No part of this publication may be reproduced or distributed in any form or by any means, electronic, mechanical, photocopying, recording, or otherwise, or stored in a database or retrieval system without the prior written permission of the publishers. The program listings (if any) may be entered, stored and executed in a computer system, but they may not be reproduced for publication. This edition can be exported from India only by the publishers, Tata McGraw-Hill Publishing Company Limited.

ISBN 0-07-059112-1

Published by the Tata McGraw-Hill Publishing Company Limited, 7 West Patel Nagar, New Delhi 110 008, typeset in Times at Script Makers, 19, A1-B, DDA Market, Pashchim Vihar, New Delhi 110 063 and printed at S.P. Printers, E-120, Sector-7, Noida. Cover: Shree Ram Enterprises. Cover Design: Kapil Gupta, Delhi.

Contents

Preface
Acknowledgements

1. Introduction to Neural Networks
   1.1 Neural Processing
   1.3 The Rise of Neurocomputing
   1.4 MATLAB: An Overview
   Review Questions

2. Introduction to Artificial Neural Networks
   2.3 Historical Development of Neural Networks
   2.4 Biological Neural Networks
   2.5 Comparison Between the Brain and the Computer
   2.6 Comparison Between Artificial and Biological Neural Network
   2.7 Basic Building Blocks of Artificial Neural Networks
       2.7.1 Network Architecture
       2.7.2 Setting the Weights
       2.7.3 Activation Function
   2.8 Artificial Neural Network (ANN) Terminologies
       2.8.1 Weights
       2.8.2 Activation Functions
       2.8.3 Sigmoidal Functions
       2.8.4 Calculation of Net Input Using Matrix Multiplication Method
       2.8.5 Bias
       2.8.6 Threshold
   2.9 Summary of Notations Used
   Summary
   Review Questions

3. Fundamental Models of Artificial Neural Networks
   3.2 McCulloch-Pitts Neuron Model
       3.2.1 Architecture
   3.3 Learning Rules
       3.3.1 Hebbian Learning Rule
       3.3.2 Perceptron Learning Rule
       3.3.3 Delta Learning Rule (Widrow-Hoff Rule or Least Mean Square (LMS) Rule)
       3.3.4 Competitive Learning Rule
       3.3.5 Out Star Learning Rule
       3.3.6 Boltzmann Learning
       3.3.7 Memory Based Learning
       3.4.2 Algorithm
       3.4.3 Linear Separability
   Summary
   Review Questions

4. Perceptron Networks
   4.2 Single Layer Perceptron
       4.2.1 Architecture
       4.2.2 Algorithm
       4.2.3 Application Procedure
       4.2.4 Perceptron Algorithm for Several Output Classes
   4.3 Brief Introduction to Multilayer Networks
   Summary
   Review Questions

5. Adaline and Madaline
   5.1 Introduction
   5.2 Adaline
       5.2.2 Algorithm
       5.2.3 Application Algorithm
   5.3 Madaline
   Review Questions
   Exercise Problems
6. Associative Memory Networks
   6.2 Algorithms for Pattern Association
       6.2.1 Hebb Rule for Pattern Association
       6.2.3 Extended Delta Rule
   6.3 Hetero Associative Memory Neural Networks
       6.3.2 Application Algorithm
   6.4 Auto Associative Memory Network
       6.4.1 Architecture
       6.4.2 Training Algorithm
       6.4.3 Application Algorithm
   6.5 Bi-directional Associative Memory
       6.5.2 Types of Bi-directional Associative Memory Net
       6.5.3 Application Algorithm
       6.5.4 Hamming Distance
   Summary
   Review Questions
   Exercise Problems

7. Feedback Networks
   7.2 Discrete Hopfield Net
       7.2.2 Training Algorithm
       7.2.3 Application Algorithm
       7.2.4 Analysis
   7.3 Continuous Hopfield Net
   7.4 Relation Between BAM and Hopfield Nets
   Summary
   Review Questions

8. Feed Forward Networks
   8.1 Introduction
   8.2 Back Propagation Network (BPN)
       8.2.1 Generalized Delta Learning Rule (or) Back Propagation Rule
       8.2.3 Training Algorithm
       8.2.4 Selection of Parameters
       8.2.5 Learning in Back Propagation
       8.2.6 Application Algorithm
       8.2.7 Local Minima and Global Minima
       8.2.8 Merits and Demerits of Back Propagation Network
       8.2.9 Applications
   8.3 Radial Basis Function Network (RBFN)
       8.3.1 Architecture
       8.3.2 Training Algorithm for an RBFN with Fixed Centers
   Summary
   Review Questions

9. Self Organizing Feature Map
   9.2 Methods Used for Determining the Winner
   9.3 Kohonen Self Organizing Feature Maps (SOM)
       9.3.1 Architecture
       9.3.2 Training Algorithm
   9.4 Learning Vector Quantization (LVQ)
       9.4.2 Training Algorithm
       9.4.3 Variants of LVQ
   9.5 Max Net
       9.5.2 Application Procedure
   9.6 Mexican Hat
       9.6.2 Training Algorithm
   9.7 Hamming Net
       9.7.2 Application Procedure
   Summary
   Review Questions

10. Counter Propagation Network
    10.1 Introduction
    10.2 Full Counter Propagation Network (Full CPN)
        10.2.1 Architecture
        10.2.2 Training Phases of Full CPN
        10.2.4 Application Procedure
    10.3 Forward Only Counter Propagation Network
        10.3.1 Architecture
        10.3.2 Training Algorithm
        10.3.3 Application Procedure
    Summary
    Review Questions
    Exercise Problems

11. Adaptive Resonance Theory
    11.1 Introduction
        11.2.2 Basic Operation
        11.2.3 Learning in ART
        11.2.4 Basic Training Steps
    11.3 ART 1
        11.3.1 Architecture
        11.3.2 Algorithm
    11.4 ART 2
        11.4.1 Architecture
        11.4.2 Training Algorithm
    Summary
    Review Questions
    Exercise Problems

12. Special Networks
    12.1 Introduction
        12.2.1 Architecture
        12.2.2 Training Algorithm
        12.2.3 Application Algorithm
    12.3 Cognitron
        12.3.2 Training
        12.3.3 Excitatory Neuron
        12.3.4 Inhibitory Neuron
        12.3.5 Training Problems
    12.4 Neocognitron
        12.4.1 Architecture
        12.4.4 Algorithm Calculations
        12.4.5 Training
    12.5 Boltzmann Machine
        12.5.1 Architecture
        12.5.2 Application Algorithm
    12.6 Boltzmann Machine with Learning
        12.6.1 Architecture
        12.6.2 Algorithm
    12.7 Gaussian Machine
    12.8 Cauchy Machine
    12.9 Optical Neural Networks
        12.9.1 Electro-optical Matrix Multipliers
        12.9.2 Holographic Correlators
    12.10 Simulated Annealing
        12.10.1 Algorithm
        12.10.2 Structure of Simulated Annealing Algorithm
        12.10.3 When to Use Simulated Annealing
    12.11 Cascade Correlation
    12.12 Spatio-temporal Neural Network
        12.12.1 Input Dimensions
        12.12.2 Long- and Short-term Memory in Spatio-temporal Connectionist Networks
        12.12.3 Output, Teaching, and Error
        12.12.4 Taxonomy for Spatio-temporal Connectionist Networks
        12.12.5 Computing the State Vector
        12.12.6 Computing the Output Vector
        12.12.7 Initializing the Parameters
        12.12.8 Updating the Parameters
    12.13 Support Vector Machines
        12.13.1 Need for SVMs
        12.13.2 Support Vector Machine Classifiers
        12.14.1 Pulsed Neuron Model (PN Model)
    12.15 Neuro-dynamic Programming
        12.15.1 Example of Neuro-dynamic Programming
        12.15.2 Applications of Neuro-dynamic Programming
    Summary
    Review Questions

13. Applications of Neural Networks
    13.1 Applications of Neural Networks in Arts
        13.1.1 Neural Networks
        13.1.2 Applications
        13.1.3 Conclusion
    13.2 Applications of Neural Networks in Bioinformatics
        13.2.1 A Bioinformatics Application for Neural Networks
    13.3 Use of Neural Networks in Knowledge Extraction
        13.3.1 Artificial Neural Networks on Transputers
        13.3.2 Knowledge Extraction from Neural Networks
    13.4 Neural Networks in Forecasting
        13.4.1 Operation of a Neural Network
        13.4.2 Advantages and Disadvantages of Neural Networks
        13.4.3 Applications in Business
    13.5 Neural Networks Applications in Bankruptcy Forecasting
        13.5.1 Observing Data and Variables
        13.5.2 Neural Architecture
        13.5.3 Conclusion
    13.6 Neural Networks in Healthcare
        13.6.1 Clinical Diagnosis
        13.6.2 Image Analysis and Interpretation
        13.6.3 Signal Analysis and Interpretation
        13.6.4 Drug Development
    13.7 Application of Neural Networks to Intrusion Detection
        13.7.1 Classification of Intrusion Detection Systems
        13.7.2 Commercially Available Tools
        13.7.3 Application of Neural Networks to Intrusion Detection
        13.7.4 DARPA Intrusion Detection Database
        13.7.5 Georgia University Neural Network IDS
        13.7.6 MIT Research in Neural Network IDS
        13.7.7 UBILAB Laboratory
        13.7.8 Research of RST Corporation
        13.7.9 Conclusion
        13.8.1 Using Intelligent Systems
        13.8.2 Application Areas of Artificial Neural Networks
        13.8.3 European Initiatives in the Field of Neural Networks
        13.8.4 Application of Neural Networks in Efficient Design of RF and ...
        13.8.5 Neural Network Models of Non-linear Sub-systems
        13.8.6 Modeling the Passive Elements
        13.8.7 Conclusion
        13.9.1 Natural Landmark Recognition using Neural Networks for Autonomous Vacuuming Robots
        13.9.2 Conclusions
    13.10 Neural Network in Image Processing and Compression
        13.10.1 Windows based Neural Network Image Compression and Restoration
        13.10.2 Application of Artificial Neural Networks for Real Time Data Compression
        13.10.3 Image Compression using Direct Solution Method Based Neural Network
        13.10.4 Application of Neural Networks to Wavelet Filter Selection in Multispectral Image Compression
        13.10.6 Rotation Invariant Neural Network-based Face Detection
        13.11.1 Neural Network Applications in Stock Market Predictions: A Methodology Analysis
        13.11.2 Search and Classification of "Interesting" Business Applications in the World Wide Web Using a Neural Network Approach
    13.12 Neural Networks in Control
        13.12.1 Basic Concept of Control Systems
        13.12.2 Applications of Neural Network in Control Systems
    13.13 Neural Networks in Pattern Recognition
        13.13.1 Handwritten Character Recognition
    13.14 Hardware Implementation of Neural Networks
        13.14.1 Hardware Requirements of Neuro-Computing
        13.14.2 Electronic Circuits for Neurocomputing
        13.14.3 Weights Adjustment using Integrated Circuit

14. Applications of Special Networks
    14.1 Temporal Updating Scheme for Probabilistic Neural Network with Application to Satellite Cloud Classification
        14.1.1 Temporal Updating for Cloud Classification
    14.2 Application of Knowledge-based Cascade-correlation to Vowel Recognition
        14.2.1 Description of KBCC
        14.2.2 Demonstration of KBCC: Peterson-Barney Vowel Recognition
        14.2.3 Discussion
    14.3 Rate-coded Restricted Boltzmann Machines for Face Recognition
        14.3.1 Applying RBMs to Face Recognition
        14.3.2 Comparative Results
        14.3.3 Receptive Fields Learned by RBMrate
        14.3.4 Conclusion
    14.4 MPSA: A Methodology to Parallelize Simulated Annealing and its Application to the Traveling Salesman Problem
        14.4.1 Simulated Annealing Algorithm and the Traveling Salesman Problem
        14.4.2 Parallel Simulated Annealing Algorithms
        14.4.3 Methodology to Parallelize Simulated Annealing
        14.4.4 TSP-Parallel SA Algorithm Implementation
        14.4.5 TSP-Parallel SA Algorithm Test
    14.5 Application of "Neocognitron" Neural Network for Integral Chip Images Processing
    14.6 Generic Pretreatment for Spiking Neuron Application on Lip Reading with STANN (Spatio-Temporal Artificial Neural Networks)
        14.6.1 STANN
        14.6.2 General Classification System with STANN
        14.6.3 A Generic Pretreatment
        14.6.4 Results
    14.7 Optical Neural Networks in Image Recognition
        14.7.1 Optical MVM
        14.7.2 Input Test Patterns
        14.7.3 Mapping TV Gray-levels to Weights
        14.7.4 Recall in the Optical MVM
        14.7.5 LCLN Spatial Characteristics
        14.7.6 Thresholded Recall
        14.7.7 Discussion of Results

15. Neural Network Projects with MATLAB
    15.1 Brain Maker to Improve Hospital Treatment using ADALINE
        15.1.1 Symptoms of the Patient
        15.1.2 Need for Estimation of Stay
        15.1.3 ADALINE
        15.1.4 Problem Description
        15.1.5 Digital Conversion
        15.1.6 Data Sets
        15.1.7 Sample Data
        15.1.8 Program for ADALINE Network
        15.1.9 Program for Digitising ...
        15.1.10 Program for Digitising the Target
        15.1.11 Program for Testing the Data
        15.1.12 Simulation and Results
        15.1.13 Conclusion
    15.2 Breast Cancer Detection Using ART Network
        15.2.1 ART 1 Classification Operation
        15.2.2 Data Representation Schemes
        15.2.3 Program for Data Classification using ART1 Network
        15.2.4 Simulation and Results
        15.2.5 Conclusion
    15.3 Access Control by Face Recognition using Backpropagation Neural Network
        15.3.1 Approach
        15.3.2 Face Training and Testing Images
        15.3.3 Data Description
        15.3.4 Program for Discrete Training Inputs
        15.3.5 Program for Discrete Testing Inputs
        15.3.6 Program for Continuous Training Inputs
        15.3.7 Program for Continuous Testing Inputs
        15.3.8 Simulation
        15.3.9 Results
        15.3.10 Conclusion
    15.4 Character Recognition using Kohonen Network
        15.4.1 Kohonen's Learning Law
        15.4.2 Winner-take-all
        15.4.3 Kohonen Self-organizing Maps
        15.4.4 Data Representation Schemes
        15.4.5 Description of Data
        15.4.6 Sample Data
        15.4.7 Kohonen's Program
        15.4.8 Simulation Results
        15.4.9 Kohonen Results
        15.4.10 Observation
        15.4.11 Conclusion
    15.5 Classification of Heart Disease Database using Learning Vector Quantization Artificial Neural Network
        15.5.1 Vector Quantization
        15.5.2 Learning Vector Quantization
        15.5.3 Data Representation Scheme
        15.5.4 Sample of Heart Disease Data Sets
        15.5.5 LVQ Program
        15.5.6 Input Format
        15.5.7 Output Format
        15.5.8 Simulation Results
        15.5.9 Observation
        15.5.10 Conclusion
    15.6 Data Compression using Backpropagation Network
        15.6.1 Back Propagation Network
        15.6.2 Data Compression
        15.6.3 Conventional Methods of Data Compression
        15.6.4 Data Representation Schemes
        15.6.5 Sample Data
        15.6.6 Program for Bipolar Coding
        15.6.7 Program for Implementation of Backpropagation Network for Data Compression
        15.6.8 Program for Testing
        15.6.9 Results
        15.6.10 Conclusion
    15.7 System Identification using CMAC
        15.7.1 Overview of System Identification
        15.7.2 Applications of System Identification
        15.7.3 Need for System Identification
        15.7.4 Identification Schemes
        15.7.5 Least Squares Method for Self Tuning Regulators
        15.7.6 Neural Networks in System Identification
        15.7.7 Cerebellar Model Arithmetic Computer (CMAC)
        15.7.8 Properties of CMAC
        15.7.9 Design Criteria
        15.7.10 Advantages and Disadvantages of CMAC
        15.7.11 Algorithms for CMAC
        15.7.12 Program
        15.7.13 Results
        15.7.14 Conclusion
    15.8 Neuro-fuzzy Control Based on the Nefcon-model under MATLAB/SIMULINK
        15.8.1 Learning Algorithms
        15.8.2 Optimization of a Rule Base
        15.8.3 Description of System Error
        15.8.4 Example
        15.8.5 Conclusion
16. Fuzzy Systems
    16.1 Introduction
    16.2 History of the Development of Fuzzy Logic
    16.3 Operation of Fuzzy Logic
    16.4 Fuzzy Sets and Traditional Sets
    16.5 Membership Functions
    16.6 Fuzzy Techniques
    16.7 Applications
    16.8 Introduction to Neuro Fuzzy Systems
        16.8.1 Fuzzy Neural Hybrids
        16.8.2 Neuro Fuzzy Hybrids
    Summary

Appendix: MATLAB Neural Network Toolbox
    A.1 A Simple Example
        A.1.1 Prior to Training
        A.1.2 Training Error
        A.1.3 Neural Network Output Versus the Targets
    A.2 A Neural Network to Model sin(x)
        A.2.1 Prior to Training
        A.2.2 Training Error
        A.2.3 Neural Network Output Versus the Targets
    A.3 Saving Neural Objects in MATLAB
        A.3.1 Examples
    A.4 Neural Network Object
        A.4.1 Example
    A.5 Supported Training and Learning Functions
        A.5.1 Supported Training Functions
        A.5.2 Supported Learning Functions
        A.5.3 Transfer Functions
        A.5.4 Transfer Derivative Functions
        A.5.5 Weight and Bias Initialization Functions
        A.5.6 Weight Derivative Functions
        A.6.1 Introduction to GUI
    Summary

Index

Preface

The world we live in is becoming ever more reliant on the use of electronic gadgets and computers to control the behavior of real world resources. For example, an increasing amount of commerce is performed without a single bank note or coin ever being exchanged. Similarly, airports can safely land and send off aeroplanes without even looking out of a window. Another, more individual, example is the increasing use of electronic personal organizers for organizing meetings and contacts. All of these examples share a similar structure: multiple parties (e.g. aeroplanes, or people) come together to coordinate their activities in order to achieve a common goal. It is not surprising, then, that a lot of research is being done on how the mechanics of the coordination process can be automated using computers. This is where neural networks come in.

Neural networks are important for their ability to adapt. Neural nets represent entirely different models from those related to other symbolic systems. The difference occurs in the way the nets store and retrieve information. The information in a neural net is found to be distributed throughout the network and not localized. The nets are capable of making memory associations. They can handle a large amount of data, fast and efficiently. They are also fault tolerant, i.e. even if a few neurons fail, it will not disable the entire system. The paradigm of artificial neural networks, developed to emulate some of the capabilities of the human brain, has demonstrated great potential for various low-level computations and embodies salient features such as learning, fault-tolerance, parallelism and generalization. Neural networks, comprising processing elements called neurons, are capable of coping with computational complexity, non-linearity and uncertainty. In view of this versatility of neural networks, it is believed that they hold great potential as building blocks for a variety of behaviors associated with human cognition. However, subjective phenomena such as reasoning and perception are often regarded as beyond the domain of neural network theory. Neural networks can deal with imprecise data and ill-defined activities; thus they offer low-level computational features.

About the Book

Neural networks is, at present, a much sought-after topic among academicians as well as program developers.
This book is designed to give a broad, yet in-depth, overview of the field of neural networks. The principles of neural networks are discussed in detail, including information and useful knowledge available for various network processes. The various algorithms and solutions to the problems given in the book are well balanced and pertinent to neural network research projects and labs, and to college and university level studies. The modern aspects of neural networks have been introduced right from the basic principles and discussed in an easy-to-understand manner, so that a beginner to the subject is able to grasp the concepts with minimal effort.

The wide variety of worked-out examples relevant to the neural network area will help in reinforcing the concepts explained. The solutions to the problems are programmed using MATLAB 6.0 and the simulated results are given. The MATLAB neural network toolbox is provided in the Appendix for easy reference. This book provides the neural network architecture, algorithms and application procedure-oriented structures to help the reader move into the world of neural networks with ease. It also presents applications of neural networks to a wide variety of fields of current interest. A few field projects are also included.

Who will Benefit

This book would be an ideal text for undergraduate students of Computer Science, Information Technology, Electrical and Electronics, and Electronics and Communication engineering for their course on Neural Networks. Those pursuing MCA and taking a course on Neural Networks will find the book useful. Programmers involved in neural network applications programming will also benefit from this book.

Organization

The book includes 16 chapters altogether. The chapters are organized as follows:

Chapter 1 gives an introduction to neural network techniques. An overview of MATLAB is also discussed.

The preliminaries of the Artificial Neural Network are described in Chapter 2. The discussion is based on the development of the artificial neural net, comparison between the biological neuron and the artificial neuron, the basic building blocks of a neural net and the terminologies used in neural nets. The summary of notations is given at the end of the chapter.

Chapter 3 deals with the fundamental models of an artificial neural net. The basics of the McCulloch-Pitts neuron and the Hebb net, along with the concept of linear separability, are given. The learning rules used in neural networks are also described in detail in this chapter.

Chapter 4 provides information regarding the Perceptron neural net. The architecture and algorithm of the perceptron neural net are explained along with suitable example problems. An introduction to the multilayer perceptron is given.

The basic architecture and algorithm along with examples for Adaline and Madaline nets are described in Chapter 5.

Chapter 6 discusses pattern association nets. Pattern association nets include auto association, hetero association and bi-directional associative memory nets. The learning rules used for pattern association are also given.

Feedback networks are described in Chapter 7. The chapter mainly provides information regarding Discrete Hopfield and Continuous Hopfield nets. Their architecture, algorithm and application procedure, along with solved examples, are discussed in this chapter.

Chapter 8 gives details on feed forward nets. The feed forward nets described here are the Back Propagation Network and the Radial Basis Function Network.
Both networks are described with their architecture, algorithm and example problems. The merits and demerits of the back propagation algorithm are also included.

Chapter 9 deals with competitive nets. The nets that come under this category are the self-organizing feature map, learning vector quantization, Max net, Mexican Hat and Hamming net. All these networks are discussed in detail with their features in this chapter.

The Counter Propagation Net (CPN), used for data compression, is discussed in Chapter 10. The two types of CPN, full CPN and forward only CPN, are discussed along with their architecture and algorithms.

Chapter 11 describes the features of Adaptive Resonance Theory (ART). The two types of ART, the ART1 network and the ART2 network, are described with their respective architecture, algorithms and example problems.

Information regarding special nets like the Boltzmann machine, cascade correlation, spatio-temporal network, simulated annealing, optical neural net, Cauchy machine, Gaussian machine, cognitron, neocognitron, Boltzmann machine with learning, etc. is given in Chapter 12.

Chapter 13 discusses the applications of neural networks in arts, biomedicine, the industrial and control area, data mining, robotics, pattern recognition, etc. with case studies.

Chapter 14 presents the applications of the various special networks dealt with in Chapter 12.

A few projects related to pattern classification and system identification using different networks, with MATLAB programs, are discussed in Chapter 15.

Chapter 16 gives a brief introduction to Fuzzy Systems and Hybrid Systems (Fuzzy Neural Hybrid and Neural Fuzzy Hybrid).

The appendices include the MATLAB neural network toolbox.

In conclusion, we hope that the reader will find this book a truly helpful guide and a valuable source of information about neural network principles and their numerous practical applications. Critical comments and suggestions from the readers are welcome, as they will help us improve future editions of the book.

S N Sivanandam
S Sumathi
S N Deepa

Acknowledgements

First of all, the authors would like to thank the Almighty for granting them perseverance and achievements. Dr S N Sivanandam, Dr S Sumathi and S N Deepa wish to thank Mr V Rajan, Managing Trustee, PSG Institutions, Mr C R Swaminathan, Chief Executive, and Dr S Vijayarangan, Principal, PSG College of Technology, Coimbatore, for their whole-hearted cooperation and encouragement provided for this endeavor. The authors are grateful for the support received from the staff members of the Electrical and Electronics Engineering and Computer Science and Engineering departments of their college.

Dr Sumathi owes much to her daughter S Priyanka, who cooperated even when her time was being monopolized with book work. She feels happy and proud of the steel-frame support rendered by her husband. She would like to extend whole-hearted thanks to her parents and parents-in-law for their constant support. She is thankful to her brother, who has always been the "stimulator" for her progress.

Mrs S N Deepa wishes to thank her husband Mr T S Anand, her daughter Nivethitha T S Anand and her family for the support provided by them.

Thanks are also due to the editorial and production teams at Tata McGraw-Hill Publishing Company Limited for their efforts in bringing out this book.

This chapter covers:
* Scope of neural networks and MATLAB.
* How a neural network is used to learn patterns and relationships in data.
* The aim of neural networks.
* About fuzzy logic.
* Use of MATLAB to develop applications based on neural networks.
1. Introduction to Neural Networks

Artificial neural networks are the result of academic investigations that use mathematical formulations to model nervous system operations. The resulting techniques are being successfully applied in a variety of everyday business applications.

Neural networks (NNs) represent a meaningfully different approach to using computers in the workplace. A neural network is used to learn patterns and relationships in data. The data may be the results of a market research effort, a production process given varying operational conditions, or the decisions of a loan officer given a set of loan applications. Regardless of the specifics involved, applying a neural network is substantially different from traditional approaches. Traditionally, a programmer or an analyst specifically 'codes' for every facet of the problem for the computer to 'understand' the situation. Neural networks do not require explicit coding of the problems. For example, to generate a model that performs a sales forecast, a neural network needs to be given only raw data related to the problem. The raw data might consist of the history of past sales, prices, competitors' prices and other economic variables. The neural network sorts through this information and produces an understanding of the factors impacting sales. The model can then be called upon to provide a prediction of future sales given a forecast of the key factors.

These advancements are due to the creation of neural network learning rules, which are the algorithms used to 'learn' the relationships in the data. The learning rules enable the network to 'gain knowledge' from available data and apply that knowledge to assist a manager in making key decisions.

What are the Capabilities of Neural Networks?

In principle, NNs can compute any computable function, i.e. they can do everything a normal digital computer can do. In particular, anything that can be represented as a mapping between vector spaces can be approximated to arbitrary precision by feedforward NNs (which is the most often used type). In practice, NNs are especially useful for mapping problems which are tolerant of some errors and have lots of example data available, but to which hard and fast rules cannot easily be applied. However, NNs are, as of now, difficult to apply successfully to problems that concern manipulation of symbols and memory.

Who is Concerned with Neural Networks?

Neural networks are of interest to quite a lot of people from different fields:

* Computer scientists want to find out about the properties of non-symbolic information processing with neural networks and about learning systems in general.
* Engineers of many kinds want to exploit the capabilities of neural networks in many areas (e.g. signal processing) to solve their application problems.
* Cognitive scientists view neural networks as a possible apparatus to describe models of thinking and consciousness (high-level brain function).
* Neuro-physiologists use neural networks to describe and explore medium-level brain function (e.g. memory, sensory system).
* Physicists use neural networks to model phenomena in statistical mechanics and for a lot of other tasks.
* Biologists use neural networks to interpret nucleotide sequences.
* Philosophers and some other people may also be interested in neural networks to gain knowledge about human systems, namely behavior, conduct, character, intelligence, brilliance and other psychological traits.

Environmental systems and their functioning, marketing and business, as well as the design of any such systems, can be implemented via neural networks.

The development of Artificial Neural Networks started 50 years ago. Artificial neural networks (ANNs) are gross simplifications of real (biological) networks of neurons. The paradigm of neural networks, which began during the 1940s, promises to be a very important tool for studying the structure-function relationship of the human brain. Due to the complexity and incomplete understanding of biological neurons, various architectures of artificial neural networks have been reported in the literature. Most of the ANN structures used commonly for many applications often consider the behavior of a single neuron as the basic computing unit for describing neural information processing operations. Each computing unit, i.e. the artificial neuron in the neural network, is based on the concept of an ideal neuron. An ideal neuron is assumed to respond optimally to the applied inputs. However, experimental studies in neuro-physiology show that the response of a biological neuron appears random, and only by averaging many observations is it possible to obtain predictable results. Inspired by this observation, some researchers have developed neural structures based on the concept of neural populations.

In common with biological neural networks, ANNs can accommodate many inputs in parallel and encode the information in a distributed fashion. Typically, the information that is stored in a neural net is shared by many of its processing units. This type of coding is in sharp contrast to traditional memory schemes, where a particular piece of information is stored in only one memory location, the recall process is time consuming, and generalization is usually absent. The distributed storage scheme provides many advantages, the most important of them being the redundancy in information representation. Thus, an ANN can undergo partial destruction of its structure and still be able to function well. Although redundancy can also be built into other types of systems, the ANN has a natural way of implementing this. The result is a natural fault-tolerant system which is very similar to biological systems.

The aim of neural networks is to mimic the human ability to adapt to changing circumstances and the current environment. This depends heavily on being able to learn from events that have happened in the past and to be able to apply this to future situations. For example, the decisions made by doctors are rarely based on a single symptom because of the complexity of the human body, since one symptom could signify any number of problems. An experienced doctor is far more likely to make a sound decision than a trainee, because from his past experience he knows what to look out for and what to ask, and may have etched on his mind a past mistake, which he will not repeat. Thus the senior doctor is in a superior position to the trainee.
Similarly, it would be beneficial if machines, too, could use past events as part of the criteria on which their decisions are based, and this is the role that artificial neural networks seek to fill.

Artificial neural networks consist of many nodes, i.e. processing units analogous to neurons in the brain. Each node has a node function associated with it, which, along with a set of local parameters, determines the output of the node given an input. Modifying the local parameters may alter the node function.

Artificial Neural Networks thus form an information-processing system. In this information-processing system, the elements called neurons process the information. The signals are transmitted by means of connection links. The links possess an associated weight, which is multiplied along with the incoming signal (net input) for any typical neural net. The output signal is obtained by applying activations to the net input. The neural net can generally be a single layer or a multilayer net. The structure of the simple artificial neural net is shown in Fig. 1.1.

Fig. 1.1 A Simple Artificial Neural Net

Figure 1.1 shows a simple artificial neural net with two input neurons (x1, x2) and one output neuron (y). The interconnecting weights are given by w1 and w2. In a single layer net there is a single layer of weighted interconnections.

A typical multi-layer artificial neural network, abbreviated as MNN, comprises an input layer, output layer and hidden (intermediate) layer of neurons. MNNs are often called layered networks. They can implement arbitrary complex input/output mappings or decision surfaces separating different patterns. A three-layer MNN is shown in Fig. 1.2: a layer of input units is connected to a layer of hidden units, which is connected to the layer of output units.

Fig. 1.2 A Densely Interconnected Three-layered Static Neural Network. Each Shaded Circle, or Node, Represents an Artificial Neuron

Fig. 1.3 A Block Diagram Representation of a Three-layered MNN (Input Layer, Hidden Layer, Output Layer)

The activity of neurons in the input layer represents the raw information that is fed into the network. The activity of neurons in the hidden layer is determined by the activities of the input neurons and the connecting weights between the input and hidden units. Similarly, the behavior of the output units depends on the activity of the neurons in the hidden layer and the connecting weights between the hidden and the output layers. This simple neural structure is interesting because neurons in the hidden layers are free to construct their own representation of the input.

MNNs provide an increase in computational power over a single-layer neural network only when there is a nonlinear activation function between layers. Many capabilities of neural networks, such as nonlinear functional approximation, learning, generalization, etc. are in fact performed due to the nonlinear activation function of each neuron.
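The layered computation just described can be sketched in a few lines of MATLAB. The layer sizes, the random weights and the logistic sigmoid used here are illustrative assumptions, not values from the text; the point is only that each layer's activity is a nonlinear function of the weighted activity of the layer below.

    x  = [0.5; -0.2; 0.1];           % activities of three input units
    W1 = rand(4, 3) - 0.5;           % input-to-hidden connecting weights (assumed random)
    W2 = rand(2, 4) - 0.5;           % hidden-to-output connecting weights (assumed random)
    h  = 1 ./ (1 + exp(-(W1 * x)));  % hidden-layer activity: sigmoid of weighted sums
    y  = 1 ./ (1 + exp(-(W2 * h)))   % output-layer activity

If the sigmoids are removed, y collapses to the single linear map W2*W1*x, which is why the nonlinear activation function is essential for the extra computational power mentioned above.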
ANNs have become a technical folk legend. The market is flooded with new, increasingly technical software and hardware products, and many more are sure to come. Among the most popular implementations are Hopfield, Multilayer Perceptron, Self-organizing Feature Map, Learning Vector Quantization, Radial Basis Function, Cellular Neural, and Adaptive Resonance Theory (ART) networks, Counter Propagation networks, Back Propagation networks, the Neocognitron, etc. As a result of the existence of all these networks, the application of neural networks is increasing tremendously.

Thus artificial neural networks represent a major extension to computation. They perform operations similar to those of the human brain. Hence it is reasonable to expect a rapid increase in our understanding of artificial neural networks, leading to improved network paradigms and a host of application opportunities.

1.3 The Rise of Neurocomputing

A majority of information processing today is carried out by digital computers. This has led to the widely held misperception that information processing is dependent on digital computers. However, if we look at cybernetics and the other disciplines that form the basis of information science, we see that information processing originates with living creatures in their struggle to survive in their environments, and that the information being processed by computers today accounts for only a small part (the automated portion) of this. Viewed in this light, we can begin to consider the possibility of information processing devices that differ from conventional computers. In fact, research aimed at realizing a variety of different types of information processing devices is already being carried out, albeit in the shadows of the major successes achieved in the realm of digital computers. One direction that this research is taking is toward the development of an information processing device that mimics the structures and operating principles found in the information processing systems possessed by humans and other living creatures.

Digital computers developed rapidly in and after the late 1940s. After originally being applied to the field of mathematical computations, they have found expanded applications in a variety of areas, like text (word), symbol, image and voice processing, i.e. pattern information processing, robotic control and artificial intelligence. However, the fundamental structure of digital computers is based on the principle of sequential (serial) processing, which has little if anything in common with the human nervous system. The human nervous system, it is now known, consists of an extremely large number of nerve cells, or neurons, which operate in parallel to process various types of information. By taking a hint from the structure of the human nervous system, we should be able to build a new type of advanced parallel information processing device.

In addition to the increasingly large volumes of data that we must process as a result of recent developments in sensor technology and the progress of information technology, there is also a growing requirement to simultaneously gather and process huge amounts of data from multiple sensors and other sources. This situation is creating a need in various fields to switch from conventional computers that process information sequentially to parallel computers equipped with multiple processing elements aligned to operate in parallel.

Besides the social requirements just cited, a number of other factors have been at work during the 1980s to prompt research on new forms of information processing devices.
For instance, recent neuro-physiological experiments have shed considerable light on the structure of the brain, and even in fields such as cognitive science, which study human information processing at the macro level, we are beginning to see proposals for models that call for multiple processing elements aligned to operate in parallel. Research in the fields of mathematical science and physics is also concentrating more on the mathematical analysis of systems comprising multiple elements that interact in complex ways. These factors gave birth to a major research trend aimed at clarifying the structures and operating principles inherent in the information processing systems of human beings and other animals, and constructing an information processing device based on these structures and operating principles. The term 'neurocomputing' is used to refer to the information engineering aspects of this research.

1.4 MATLAB: An Overview

Dr. Cleve Moler, chief scientist at MathWorks, Inc., originally wrote MATLAB to provide easy access to matrix software developed in the LINPACK and EISPACK projects. The first version was written in the late 1970s for use in courses in matrix theory, linear algebra, and numerical analysis. MATLAB is therefore built upon a foundation of sophisticated matrix software, in which the basic data element is a matrix that does not require predimensioning.

MATLAB is a product of The MathWorks, Inc. and is an advanced interactive software package specially designed for scientific and engineering computation. The MATLAB environment integrates graphical illustrations with precise numerical calculations, and is a powerful, easy-to-use, and comprehensive tool for performing all kinds of computations and scientific data visualization. MATLAB has proven to be a very flexible and useful tool for solving problems in many areas. MATLAB is a high-performance language for technical computing. It integrates computation, visualization and programming in an easy-to-use environment where problems and solutions are expressed in familiar mathematical notation. Typical areas of application of MATLAB include:

* Math and computation
* Algorithm development
* Modeling, simulation and prototyping
* Data analysis, exploration, and visualization
* Scientific and engineering graphics
* Application development, including graphical user interface building

MATLAB is an interactive system whose basic element is an array that does not require dimensioning. This helps in solving many computing problems, especially those with matrix and vector formulations, in a fraction of the time it would take to write a program in a scalar non-interactive language such as C or FORTRAN.

Mathematics is the common language of science and engineering. Matrices, differential equations, arrays of data, plots and graphs are the basic building blocks of both applied mathematics and MATLAB. It is the underlying mathematical base that makes MATLAB accessible and powerful. MATLAB allows an entire algorithm to be expressed in a few dozen lines and the solution to be computed with great accuracy in about a second. It is therefore especially helpful for technical analysis, algorithm prototyping and application development. MATLAB's two- and three-dimensional graphics are object oriented. MATLAB is thus both an environment and a matrix/vector-oriented programming language, which enables the user to build his own reusable tools. The user can create his own customized functions and programs (known as M-files) in MATLAB code.
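As an illustration of this matrix-oriented style, consider the following short session. It is a generic sketch, not an example from the book: a linear system is solved and a function is plotted, each in a single line.

    A = [4 -2; 1 3];      % a 2-by-2 matrix; no predimensioning required
    b = [2; 5];           % a column vector
    x = A \ b             % solve the linear system A*x = b
    t = 0:0.1:2*pi;       % a row vector of sample points
    plot(t, sin(t))       % two-dimensional graphics in one call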
The Toolbox is a specialized collection of M-files for working on particular classes of problems. The MATLAB Documentation Set has been written, expanded and put online for ease of use. The set includes online help, as well as hypertext-based and printed manuals. The commands in MATLAB are expressed in a notation close to that used in mathematics and engineering. There is a very large set of these commands and functions, known as MATLAB M-files. As a result, solving problems through MATLAB is faster than with other traditional programming methods. It is easy to modify the functions, since most of the M-files can be opened and modified. For ensuring high performance, the MATLAB software has been written in optimized C and coded in assembly language.

The main features of MATLAB can be summarized as:

* Advanced algorithms for high-performance numerical computations, especially in the field of matrix algebra.
* A large collection of predefined mathematical functions and the ability to define one's own functions.
* Two- and three-dimensional graphics for plotting and displaying data.
* A complete online help system.
* Powerful, matrix/vector-oriented, high-level programming language for individual applications.
* Ability to cooperate with programs written in other languages and for importing and exporting formatted data.
* Toolboxes available for solving advanced problems in several application areas.

Figure 1.4 shows the main features and capabilities of MATLAB.

Fig. 1.4 Features and Capabilities of MATLAB (built-in and user-written functions; graphics, including 2-D and 3-D graphics, color and lighting, and animation; computations such as linear algebra, data analysis, signal processing and quadrature; external interfaces to C and FORTRAN programs; and toolboxes such as signal processing, image processing, control system, optimization, neural networks, communications, robust control, statistics and splines)

An optional extension of the core of MATLAB called SIMULINK is also available. SIMULINK stands for SIMUlating and LINKing the environment. SIMULINK is an environment for simulating linear and non-linear dynamic systems by constructing block diagram models with an easy-to-use graphical user interface. SIMULINK is a MATLAB toolbox designed for the dynamic simulation of linear and non-linear systems as well as continuous and discrete-time systems. It can also display information graphically.

MATLAB is an interactive package for numerical analysis, matrix computation, control system design, and linear system analysis and design, available on most CAEN (Computer Aided Engineering Network) platforms (Macintosh, PCs, Sun, and Hewlett-Packard). In addition to the standard functions provided by MATLAB, there exists a large set of Toolboxes, or collections of functions and procedures, available as part of the MATLAB package. Toolboxes are libraries of MATLAB functions used to customize MATLAB for solving particular classes of problems. Toolboxes are the result of some of the world's top researchers in specialized fields. They are equivalent to a prepackaged "off-the-shelf" software solution for a particular class of problem or technique. A toolbox is a collection of special files called M-files that extend the functionality of the base program. The various Toolboxes available are:

* Control System: Provides several features for advanced control system design and analysis.
* Communications: Provides functions to model the components of a communication system's physical layer.
* Signal Processing: Contains functions to design analog and digital filters and apply these filters to data and analyze the results.
* System Identification: Provides features to build mathematical models of dynamical systems based on observed system data.
* Robust Control: Allows users to create robust multivariable feedback control system designs based on the concept of the singular-value Bode plot.
* Simulink: Allows you to model dynamic systems graphically.
* Neural Network: Allows you to simulate neural networks.
* Fuzzy Logic: Allows for manipulation of fuzzy systems and membership functions.
* Image Processing: Provides access to a wide variety of functions for reading, writing, and filtering images of various kinds in different ways.
* Analysis: Includes a wide variety of system analysis tools for varying matrices.
* Optimization: Contains basic tools for use in constrained and unconstrained minimization problems.
* Spline: Can be used to find approximate functional representations of data sets.
* Symbolic: Allows for symbolic (rather than purely numeric) manipulation of functions.
* User Interface Utilities: Includes tools for creating dialog boxes, menu utilities, and other user interactions for script files.

MATLAB has been used as an efficient tool throughout this text to develop the applications based on neural nets.

Review Questions

1.1 How did neurocomputing originate?
1.2 What is a multilayer net? Describe with a neat sketch.
1.3 State some of the popular neural networks.
1.4 Briefly discuss the key characteristics of MATLAB.
1.5 List the basic arithmetic operations that can be performed in MATLAB.
1.6 What is the necessity of the SIMULINK package available in MATLAB?
1.7 Discuss in brief the GUI toolbox feature of MATLAB.
1.8 What is meant by a toolbox, and list some of the toolboxes available for MATLAB.

2. Introduction to Artificial Neural Networks

This chapter covers:
* The preliminaries of Artificial Neural Networks.
* Definition of an artificial neuron.
* The development of neural networks.
* Comparison between the biological neuron and artificial neuron based on speed, fault tolerance, memory, control mechanism, etc.
* The method of setting the values for the weights.
* Various methods of training.
* Basic building blocks of the artificial neural network, i.e. network architecture, setting the weights, activation function, etc.
* The activation function used to calculate the output response of a neuron.
* Summary of notations used all over in this text.

The basic preliminaries involved in the Artificial Neural Network (ANN) are described in this chapter. A brief summary of the history of neural networks is given in terms of the development of architectures and algorithms, and the structure of the biological neuron is discussed and compared with the artificial neuron. The basic building blocks and the various terminologies of the artificial neural network are explained towards the end of the chapter. The chapter concludes by giving the summary of notations which are used in all the network algorithms, architectures, etc. discussed in the forthcoming chapters.

Artificial neural networks are nonlinear information (signal) processing devices, which are built from interconnected elementary processing devices called neurons.
An Artificial Neural Network (ANN) is an information-processing paradigm that is inspired by the way biological nervous systems, such as the brain, process information. The key element of this paradigm is the novel structure of the information processing system. It is composed of a large number of highly interconnected processing elements (neurons) working in union to solve specific problems. ANNs, like people, learn by example. An ANN is configured for a specific application, such as pattern recognition or data classification, through a learning process. Learning in biological systems involves adjustments to the synaptic connections that exist between the neurons. This is true of ANNs as well.

ANNs are a type of artificial intelligence that attempts to imitate the way a human brain works. Rather than using a digital model, in which all computations manipulate zeros and ones, a neural network works by creating connections between processing elements, the computer equivalent of neurons. The organization and weights of the connections determine the output.

A neural network is a massively parallel distributed processor that has a natural propensity for storing experiential knowledge and making it available for use. It resembles the brain in two respects:

1. Knowledge is acquired by the network through a learning process, and
2. Inter-neuron connection strengths known as synaptic weights are used to store the knowledge.

Neural networks can also be defined as parameterized computational nonlinear algorithms for (numerical) data/signal/image processing. These algorithms are either implemented on a general-purpose computer or are built into dedicated hardware.

Artificial Neural Networks thus form an information-processing system. In this information-processing system, the elements called neurons process the information. The signals are transmitted by means of connection links. The links possess an associated weight, which is multiplied along with the incoming signal (net input) for any typical neural net. The output signal is obtained by applying activations to the net input.

An artificial neuron is characterized by:

1. Architecture (connection between neurons)
2. Training or learning (determining weights on the connections)
3. Activation function

All these are discussed in detail in the forthcoming subsections. The structure of the simple artificial neural network is shown in Fig. 2.1.

Fig. 2.1 A Simple Artificial Neural Net

Figure 2.1 shows a simple artificial neural network with two input neurons (x1, x2) and one output neuron (y). The interconnecting weights are given by w1 and w2. An artificial neuron is a p-input single-output signal-processing element, which can be thought of as a simple model of a non-branching biological neuron. In Fig. 2.1, the various inputs to the network are represented by the mathematical symbol x(n). Each of these inputs is multiplied by a connection weight; these weights are represented by w(n). In the simplest case, these products are simply summed, fed through a transfer function to generate a result, and then delivered as output. This process lends itself to physical implementation on a large scale in a small package. This electronic implementation is still possible with other network structures, which utilize different summing functions as well as different transfer functions.
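A minimal numerical sketch of this weighted-sum-and-transfer computation follows. The input values, the weights and the choice of a binary sigmoid transfer function are illustrative assumptions, not values from the text.

    x = [0.6; 0.8];            % inputs x1, x2 (assumed values)
    w = [0.4 0.7];             % connection weights w1, w2 (assumed values)
    net = w * x;               % net input: w1*x1 + w2*x2
    y = 1 / (1 + exp(-net))    % output after a binary sigmoid transfer function

Substituting a different transfer function (a hard threshold, a linear function, and so on) in the last line changes the neuron's behavior without altering the summing step, which is exactly the separation of summing function and transfer function described above.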
Why Artificial Neural Networks?

The long course of evolution has given the human brain many desirable characteristics not present in Von Neumann or modern parallel computers. These include:

* Massive parallelism,
* Distributed representation and computation,
* Learning ability,
* Generalization ability,
* Adaptivity,
* Inherent contextual information processing,
* Fault tolerance, and
* Low energy consumption.

It is hoped that devices based on biological neural networks will possess some of these desirable characteristics. Modern digital computers outperform humans in the domain of numeric computation and related symbol manipulation. However, humans can effortlessly solve complex perceptual problems (like recognizing a man in a crowd from a mere glimpse of his face) at such a high speed and extent as to dwarf the world's fastest computer. Why is there such a remarkable difference in their performance? The biological neural system architecture is completely different from the Von Neumann architecture (see Table 2.1). This difference significantly affects the type of functions each computational model can best perform.

Numerous efforts to develop "intelligent" programs based on Von Neumann's centralized architecture have not resulted in any general-purpose intelligent programs. Inspired by biological neural networks, ANNs are massively parallel computing systems consisting of an extremely large number of simple processors with many interconnections. ANN models attempt to use some "organizational" principles believed to be used in the human brain.

Table 2.1 Von Neumann Computer Versus Biological Neural System

                          Von Neumann computer        Biological neural system
  Processor               Complex                     Simple
                          High speed                  Low speed
                          One or a few                A large number
  Memory                  Separate from a processor   Integrated into processor
                          Localized                   Distributed
                          Non-content addressable     Content addressable
  Computing               Centralized                 Distributed
                          Sequential                  Parallel
                          Stored programs             Self-learning
  Reliability             Very vulnerable             Robust
  Expertise               Numerical and symbolic      Perceptual problems
                          manipulations
  Operating environment   Well-defined,               Poorly defined,
                          well-constrained            unconstrained

Neural networks, with their remarkable ability to derive meaning from complicated or imprecise data, can be used to extract patterns and detect trends that are too complex to be noticed by either humans or other computer techniques. A trained neural network can be thought of as an "expert" in the category of information it has been given to analyze. This expert can then be used to provide projections given new situations of interest and to answer "what if" questions. Other advantages include:

1. Adaptive learning: An ability to learn how to do tasks based on the data given for training or initial experience.
2. Self-organization: An ANN can create its own organisation or representation of the information it receives during learning time.
3. Real-time operation: ANN computations may be carried out in parallel, using special hardware devices designed and manufactured to take advantage of this capability.
4. Fault tolerance via redundant information coding: Partial destruction of a network leads to a corresponding degradation of performance. However, some network capabilities may be retained even after major network damage due to this feature.
2.3 Historical Development of Neural Networks

The historical development of neural networks can be traced as follows:

• 1943—McCulloch and Pitts: start of the modern era of neural networks
This work forms a logical calculus of neural networks. A network consisting of a sufficient number of neurons (using a simple model) with properly set synaptic connections can compute any computable function. A simple logic function is performed by a neuron in this case, based upon the weights set in the McCulloch-Pitts neuron. The arrangement of neurons in this case may be represented as a combination of logic functions. The most important feature of this type of neuron is the concept of threshold. When the net input to a particular neuron is greater than the threshold specified by the user, the neuron fires. Logic circuits are found to use this type of neuron extensively.

• 1949—Hebb's book "The Organization of Behavior"
An explicit statement of a physiological learning rule for synaptic modification was presented for the first time. Hebb proposed that the connectivity of the brain is continually changing as an organism learns differing functional tasks, and that neural assemblies are created by such changes. Hebb's work was immensely influential among psychologists. The concept behind the Hebb theory is that if two neurons are found to be active simultaneously, the strength of the connection between the two neurons should be increased. This concept is similar to that of correlation matrix learning.

• 1958—Rosenblatt introduces the Perceptron (Block [1962], Minsky and Papert [1988])
In the Perceptron network the weights on the connection paths can be adjusted. A method of iterative weight adjustment can be used in the Perceptron net. The Perceptron net is found to converge if the weights obtained allow the net to reproduce exactly all the training input and target output vector pairs.

• 1960—Widrow and Hoff introduce the Adaline
ADALINE, abbreviated from Adaptive Linear Neuron, uses a learning rule called the Least Mean Square rule or Delta rule. This rule adjusts the weights so as to reduce the difference between the net input to the output unit and the desired output. The convergence criterion in this case is the reduction of the mean square error to a minimum value. This delta rule for a single layer net can be called a precursor of the backpropagation rule used for multi-layer nets. The multi-layer extension of Adaline formed the Madaline [Widrow and Lehr, 1990].

• 1982—John Hopfield's networks
Hopfield showed how to use an "Ising spin glass" type of model to store information in dynamically stable networks. His work paved the way for physicists to enter neural modeling, thereby transforming the field of neural networks. These nets are widely used as associative memory nets. The Hopfield nets are found to be both continuous valued and discrete valued. This net provides an efficient solution for the "Travelling Salesman Problem".

• 1972—Kohonen's Self-Organizing Maps (SOM)
Kohonen's Self-Organizing Maps are capable of reproducing important aspects of the structure of biological neural nets. They make use of data representation using topographic maps, which are common in the nervous system. SOM also has a wide range of applications. It shows how the output layer can pick up the correlational structure (from the inputs) in the form of the spatial arrangement of units. These nets are applied to many recognition problems.
• 1985—Parker, 1986—LeCun
During this period the backpropagation net paved its way into neural networks. This method propagates the error information at the output units back to the hidden units using a generalized delta rule. This net is basically a multilayer feed-forward net trained by means of backpropagation. Although the work was performed by Parker (1985), the credit for publishing this net goes to Rumelhart, Hinton and Williams (1986). The backpropagation net emerged as the most popular learning algorithm for the training of multilayer perceptrons and has been the workhorse for many neural network applications.

• 1988—Grossberg
Grossberg developed a learning rule similar to that of Kohonen, which is widely used in the Counterpropagation net. This Grossberg type of learning is also known as outstar learning. This learning occurs for all the units in a particular layer; no competition among these units is assumed.

• 1987, 1990—Carpenter and Grossberg
Carpenter and Grossberg invented the Adaptive Resonance Theory (ART). ART was designed for both binary inputs and continuous valued inputs. The design for binary inputs formed ART1, and ART2 came into being when the design became applicable to continuous valued inputs. The most important feature of these nets is that the input patterns can be presented in any order.

• 1988—Broomhead and Lowe developed Radial Basis Functions (RBF). This is also a multilayer net that is quite similar to the backpropagation net.

• 1990—Vapnik developed the support vector machine.

2.4 Biological Neural Networks

A biological neuron or nerve cell consists of synapses, dendrites, the cell body (or hillock), and the axon. These "building blocks" are discussed as follows:
• The synapses are elementary signal-processing devices.
  - A synapse is a biochemical device which converts a pre-synaptic electrical signal into a chemical signal and then back into a post-synaptic electrical signal.
  - The input pulse train has its amplitude modified by parameters stored in the synapse. The nature of this modification depends on the type of the synapse, which can be either inhibitory or excitatory.
• The post-synaptic signals are aggregated and transferred along the dendrites to the nerve cell body.
• The cell body generates the output neuronal signal, a spike, which is transferred along the axon to the synaptic terminals of other neurons.
• The frequency of firing of a neuron is proportional to the total synaptic activities and is controlled by the synaptic parameters (weights).
• The pyramidal cell can receive 10^4 synaptic inputs and it can fan out the output signal to thousands of target cells, a connectivity difficult to achieve in artificial neural networks.

In general, the functions of the main elements can be given as:
Dendrite - Receives signals from other neurons
Soma - Sums all the incoming signals
Axon - When a particular amount of input is received, the cell fires; it transmits the signal through the axon to other cells.

The fundamental processing element of a neural network is a neuron. This building block of human awareness encompasses a few general capabilities. Basically, a biological neuron receives inputs from other sources, combines them in some way, performs a generally nonlinear operation on the result, and then outputs the final result. Figure 2.2 shows the relationship of these four parts.
Fig. 2.2 | A Biological Neuron. Parts of a typical nerve cell: Dendrites: accept inputs; Soma: processes the inputs; Axon: turns the processed inputs into outputs; Synapses: the electrochemical contact between neurons.

The properties of the biological neuron impose some features on the artificial neuron. They are:
1. Signals are received by the processing elements.
2. The processing element sums the weighted inputs.
3. The weight at the receiving end has the capability to modify the incoming signal.
4. The neuron fires (transmits output) when sufficient input is obtained.
5. The output produced from one neuron may be transmitted to other neurons.
6. The processing of information is found to be local.
7. The weights can be modified by experience.
8. Neurotransmitters for the synapse may be excitatory or inhibitory.
9. Both artificial and biological neurons have inbuilt fault tolerance.

Figure 2.3 and Table 2.2 indicate how the biological neural net is associated with the artificial neural net.

Fig. 2.3 | Association of the Biological Net with the Artificial Net

Table 2.2 | Associated Terminologies of the Biological and Artificial Neural Net

Biological Neural Network      Artificial Neural Network
Cell Body                      Neurons
Dendrite                       Weights or interconnections
Soma                           Net input
Axon                           Output

2.5 Comparison Between the Brain and the Computer

The main differences between the brain and the computer are:
• Biological neurons, the basic building blocks of the brain, are slower than silicon logic gates. The neurons operate in milliseconds, which is about six orders of magnitude slower than the silicon gates, which operate in the nanosecond range.
• The brain makes up for the slow rate of operation with two factors:
  - a huge number of nerve cells (neurons) and interconnections between them. The human brain contains approximately 10^14 to 10^15 interconnections.
  - the function of a biological neuron seems to be much more complex than that of a logic gate.
• The brain is very energy efficient. It consumes only about 10^-16 joules per operation per second, compared with 10^-6 joules per operation per second for a digital computer.
• The brain is a highly complex, non-linear, parallel information-processing system. It performs tasks like pattern recognition, perception and motor control many times faster than the fastest digital computers.
• Consider the efficiency of the visual system, which provides a representation of the environment that enables us to interact with it. For example, a complex task of perceptual recognition, e.g. recognition of a familiar face embedded in an unfamiliar scene, can be accomplished in 100-200 ms, whereas tasks of much lesser complexity can take hours if not days on conventional computers.
  As another example, consider the efficiency of the SONAR system of a bat. SONAR is an active echolocation system. A bat's SONAR provides information about the distance from a target, its relative velocity and size, the size of various features of the target, and its azimuth and elevation. The complex neural computations needed to extract all this information from the target echo occur within a brain that has the size of a plum. The precision and success rate of the target location is rather impossible to match by RADAR or SONAR engineers.

Table 2.3 shows the major differences between the biological and the artificial neural network.
Feed Forward Net

Feed forward networks may have a single layer of weights, where the inputs are directly connected to the outputs, or multiple layers with intervening sets of hidden units (see Fig. 2.4). Neural networks use hidden units to create internal representations of the input patterns. In fact, it has been shown that, given enough hidden units, it is possible to approximate any function arbitrarily well with a simple feed forward network. This result has encouraged people to use neural networks to solve many kinds of problems.

1. Single layer net: It is a feed forward net. It has only one layer of weighted interconnections. The inputs may be connected fully to the output units. But there is a chance that none of the input units and output units are connected with other input and output units respectively. There is also a case where the input units are connected with other input units, and output units with other output units. In a single layer net, the weights from one output unit do not influence the weights for other output units.

2. Multi layer net: It is also a feed forward net, i.e., a net where the signals flow from the input units to the output units in a forward direction. The multi layer net has one or more layers of nodes between the input and output units. This is advantageous over the single layer net in the sense that it can be used to solve more complicated problems.

Competitive Net

The competitive net is similar to a single-layered feed forward network except that there are connections, usually negative, between the output nodes. Because of these connections the output nodes tend to compete to represent the current input pattern. Sometimes the output layer is completely connected and sometimes the connections are restricted to units that are close to each other (in some neighborhood). With an appropriate learning algorithm the latter type of network can be made to organize itself topologically. In a topological map, neurons near each other represent similar input patterns. Networks of this kind have been used to explain the formation of topological maps that occur in many animal sensory systems, including vision, audition, touch and smell.

Recurrent Net

The fully recurrent network is perhaps the simplest of neural network architectures. All units are connected to all other units and every unit is both an input and an output. Typically, a set of patterns is instantiated on all of the units, one at a time. As each pattern is instantiated the weights are modified. When a degraded version of one of the patterns is presented, the network attempts to reconstruct the pattern.

Recurrent networks are also useful in that they allow networks to process sequential information. Processing in recurrent networks depends on the state of the network at the last time step. Consequently, the response to the current input depends on previous inputs. Figure 2.4 shows two such networks: the simple recurrent network and the Jordan network.
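The forward signal flow of the multi layer feed forward net described above can be illustrated with a short MATLAB sketch. All values below (the input vector, the two weight matrices and the use of a logistic transfer function) are assumed only for illustration; they are not taken from any worked example in this chapter.

%Forward pass through a two-layer feed forward net
x = [1; -1; 0.5];            %assumed input vector (3 input units)
V = [0.2 -0.4 0.1;           %assumed input-to-hidden weights (2 hidden units)
     0.5  0.3 -0.2];
W = [0.7 -0.6];              %assumed hidden-to-output weights (1 output unit)
f = @(n) 1./(1+exp(-n));     %logistic sigmoid transfer function
z = f(V*x);                  %hidden layer activations
y = f(W*z);                  %output of the net
disp(y);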
2.7.2 Setting the Weights

The method of setting the values of the weights enables the process of learning or training. The process of modifying the weights in the connections between network layers with the objective of achieving the expected output is called training a network. The internal process that takes place when a network is trained is called learning.

Generally, there are three types of training, as follows.

Binary Step Function

The function is given by
f(y_in) = 1, if y_in ≥ θ
        = 0, if y_in < θ
For a neuron receiving k excitatory inputs of weight w and inhibitory inputs of weight -p, the threshold must satisfy θ > kw - p; this is the condition for absolute inhibition. The McCulloch-Pitts neuron will fire if it receives k or more excitatory inputs and no inhibitory inputs, where kw ≥ θ > (k - 1)w.

Example 3.1 Generate the output of the logic AND function by the McCulloch-Pitts neuron model.

Solution The AND function returns a true value only if both the inputs are true, else it returns a false value. '1' represents a true value and '0' represents a false value. The truth table for the AND function is:

x1    x2    y
1     1     1
1     0     0
0     1     0
0     0     0

A McCulloch-Pitts neuron to implement the AND function is shown in Fig. 3.2. The threshold on unit Y is 2.

Fig. 3.2 | McCulloch-Pitts Neuron to Perform the Logical AND Function

The output Y is
Y = f(y_in)
The net input is given by the sum of the weighted inputs:
y_in = x1 + x2
From this, the activation of the output neuron can be formed:
y = f(y_in) = 1, if y_in ≥ 2
            = 0, if y_in < 2

For the XOR function, realized as y = z1 OR z2 with z1 = x1 AND NOT x2 and z2 = x2 AND NOT x1, the activations of z1 and z2 are given as
z_j = f(z_inj) = 1, if z_inj ≥ 1
               = 0, if z_inj < 1
The calculation of the net inputs and activations of z1 and z2 is shown below.

z1 = (x1 AND NOT x2), z_in1 = x1·w11 + x2·w21

x1    x2    z_in1    z1
1     1     0        0
1     0     1        1
0     1     -1       0
0     0     0        0

z2 = (x2 AND NOT x1), z_in2 = x1·w12 + x2·w22

x1    x2    z_in2    z2
1     1     0        0
1     0     -1       0
0     1     1        1
0     0     0        0

The activation for the output unit y, with y_in = z1 + z2, is
y = f(y_in) = 1, if y_in ≥ 1
            = 0, if y_in < 1

The MATLAB program for the AND NOT function using the McCulloch-Pitts neuron is given as follows.

Program

%ANDNOT function using McCulloch-Pitts neuron
clear;
clc;
%Getting weights and threshold value
disp('Enter weights');
w1=input('Weight w1=');
w2=input('weight w2=');
disp('Enter Threshold Value');
theta=input('theta=');
y=[0 0 0 0];
x1=[0 0 1 1];
x2=[0 1 0 1];
z=[0 0 1 0];
con=1;
while con
    zin=x1*w1+x2*w2;
    for i=1:4
        if zin(i)>=theta
            y(i)=1;
        else
            y(i)=0;
        end
    end
    disp('Output of Net');
    disp(y);
    if y==z
        con=0;
    else
        disp('Net is not learning enter another set of weights and Threshold value');
        w1=input('weight w1=');
        w2=input('weight w2=');
        theta=input('theta=');
    end
end
disp('McCulloch-Pitts Net for ANDNOT function');
disp('Weights of Neuron');
disp(w1);
disp(w2);
disp('Threshold value');
disp(theta);

With the patterns as defined above, entering w1 = 1, w2 = -1 and theta = 1 gives zin = [0 -1 1 0] and hence y = [0 0 1 0] = z, so the net stops prompting for new values.

This type of synapse is called a Hebbian synapse. The four key mechanisms that characterize a Hebbian synapse are a time-dependent mechanism, a local mechanism, an interactive mechanism and a correlational mechanism. The simplest form of Hebbian learning is described by
Δw_i = x_i y
This Hebbian learning rule represents a purely feed forward, unsupervised learning. It states that if the cross product of output and input is positive, this results in an increase of the weight; otherwise the weight decreases. In some cases, the Hebbian rule needs to be modified to counteract the unconstrained growth of weight values, which takes place when excitations and responses consistently agree in sign. This corresponds to the Hebbian learning rule with saturation of the weights at a certain preset level.
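The Hebb update above can be traced in a few lines of MATLAB. The bipolar training pairs below are assumed for illustration; they describe the logic AND function used in the examples of this chapter.

%Hebb learning: one pass through four bipolar training pairs
x = [1 1 1; 1 -1 1; -1 1 1; -1 -1 1];  %inputs x1, x2 and a constant bias input
t = [1 -1 -1 -1];                      %bipolar targets (logic AND)
w = [0 0 0];                           %weights w1, w2 and bias b start at zero
for i = 1:4
    w = w + t(i)*x(i,:);               %Hebb rule: delta_w = x*y with y = t
end
disp(w);                               %trained values [w1 w2 b] = [2 2 -2]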
3.3.2 Perceptron Learning Rule

For the perceptron learning rule, the learning signal is the difference between the desired and the actual neuron's response. This type of learning is supervised. The fact that the weight vector is perpendicular to the plane separating the input patterns during the learning process can be used to interpret the degree of difficulty of training a perceptron for different types of input.

The perceptron learning rule states that for a finite 'n' number of input training vectors
x(n), where n = 1 to N
each with an associated target value
t(n), where n = 1 to N
which is +1 or -1, and an activation function y = f(y_in), where
y = 1, if y_in > θ
  = 0, if -θ ≤ y_in ≤ θ
  = -1, if y_in < -θ
the weight updation is given by
if y ≠ t, then w(new) = w(old) + α t x
if y = t, then there is no change in the weights.

The perceptron learning rule is of central importance for supervised learning of neural networks. The weights can be initialized at any values in this method. There is a perceptron learning rule convergence theorem which states: "If there is a weight vector w* such that f(x(p)·w*) = t(p) for all p, then for any starting vector w, the perceptron learning rule will converge to a weight vector that gives the correct response for all training patterns, and this will be done in a finite number of steps".

3.3.3 Delta Learning Rule (Widrow-Hoff Rule or Least Mean Square (LMS) Rule)

The delta learning rule is also referred to as the Widrow-Hoff rule, named after its originators (Widrow and Hoff, 1960). The delta learning rule is valid only for continuous activation functions and in the supervised training mode.
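As a preview of how this rule operates, the sketch below performs a single delta-rule update. All numeric values are assumed for illustration. Note that, unlike the perceptron rule above, the error is computed from the continuous net input rather than from a thresholded output.

%One delta-rule (LMS) weight update
x = [1 -1 1];               %assumed input vector (last entry is the bias input)
t = 1;                      %assumed target
w = [0.2 -0.1 0.05];        %assumed current weights (bias weight last)
alpha = 0.1;                %assumed learning rate
y_in = x*w';                %net input (identity activation)
w = w + alpha*(t - y_in)*x; %delta rule: reduces the squared error (t - y_in)^2
disp(w);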
This learning is characterized by an energy function, E, the value of which is determined by the particular states occupied by the individual neurons of the machine, given by, = Lhwaan ii where x; is the state of neuron i and wyis the weight from neuron i to neuron j. The value i#/ means that none of the neurons in the machine has self feedback. The operation of machine is performed by choos- ing a neuron at random. The neurons of this learning process are divided into two groups; visible and hidden. In visible neurons there is an interface between the network and the environment in which it operates but in hidden neurons, they operates independent of the environment. The visible neurons might be clamped ‘onto specific states determined by the environment, called as clamped condition. On the other hand, there is free-running condition, in which all the neurons are allowed to operate freely. 3.3.7 Memory Based Learning In memory based learning, all the previous experiences are stored in a large memory of correctly classi- fied input-output examples: (x,44,){4, where 4; is the input vector and 4 is the desired response. The desired response is a scalar, aa You have either reached a page that is unavailable for viewing or reached your viewing limit for this book. aa You have either reached a page that is unavailable for viewing or reached your viewing limit for this book. aa You have either reached a page that is unavailable for viewing or reached your viewing limit for this book. 52 Introduction to Neural Networks For the 3rd and 4th epoch the separating line remains the same, hence this line separates the boundary regions as shown in Fig. 3.9. Xe \ (IRGIR] Hote Net tor AND Function ‘The same procedure can be repeated for generating the logic function OR, NOT, AND NOT ete. Example 3.10 Apply the Hebb netto the training patterns that define XOR function with bipolar input and targets. Soluation Input Target 8 b y 1 1 1-1 1-1 1 1 -1 1 1 1 -1 -1 1 -1 By Hebb training algorithm, assigning initial values of the weights w,, & w, to be zero and bias to be zero. wy = 0 =0 and b= 0 Input Target ‘Weight Changes Weights & x by yaw AW, Ab wow o 0 9 1 1 1 -1 -1 -1 -Loo-1 0 =b oo =1 1 -1 1 1 +1 -1 1 0 -2 0 -1 1 1 1 -1 1 1-1) =1 1 -1 -1 1 -1 1 1 -1 0 o 0 The weight changes are called using, Aw, =x, yand Ab=y aa You have either reached a page that is unavailable for viewing or reached your viewing limit for this book. aa You have either reached a page that is unavailable for viewing or reached your viewing limit for this book. aa You have either reached a page that is unavailable for viewing or reached your viewing limit for this book. 56 Introduction to Neural Networks Solution The ‘** symbol indicates that there exist a “+ I’ and ‘.’ Symbol indicates that there exist a ‘- 1". The inputis givenby, E=[11111-1-1-111111-1-1-11111) Fs(l1ii1-1-1-111111-1-1-11-1-1-1}; ‘The MATLAB program is given as follows Program Hebb Net to classify two dimensional input patterns clear; cic: ‘Input Patterns G-[hibit-L-l -11 Fe[l1111-1-1-11 x(1,1:20)=E: (2,1:20)=F w(1:20)=0; tefl -1]; be0; for in1:2 wewex (4 .1:20)*tC): bebet(i): end disp( ‘Weight matrix’): displw): disp( Bias’); disp(b): Summary ‘The fundamental models of the artificial neural network was discussed in this chapter. The models were used to generate the logic functions like AND, OR, XOR etc. The linear seperability concept to obtain the decision boundary of the regions and the Hebb rule for the pattern classification problem was illus- trated. 
Summary

The fundamental models of the artificial neural network were discussed in this chapter. The models were used to generate logic functions like AND, OR, XOR etc. The linear separability concept for obtaining the decision boundary of the regions and the Hebb rule for the pattern classification problem were illustrated. The learning rules used in various networks for the weight updation process were also derived in this chapter.

4
Perceptron Networks

This chapter covers the following:
• How the perceptron learning rule is better than the Hebb rule
• Layer structure in the original perceptrons
• Learning and training algorithms in the perceptron network
• Architecture, algorithm and the application procedure of the perceptron net
• Derivation of the perceptron algorithm for several output classes
• Applications of the multilayer perceptron

Frank Rosenblatt [1962], and Minsky and Papert [1988], developed a large class of artificial neural networks called perceptrons. The perceptron learning rule uses an iterative weight adjustment that is more powerful than the Hebb rule. The perceptrons use a threshold output function and the McCulloch-Pitts model of a neuron. Their iterative learning converges to correct weights, i.e. the weights that produce the exact output value for the training input pattern. The original perceptron is found to have three layers, namely sensory, associator and response units, as shown in Fig. 4.1.

Step 5: Compute the activation output of each output unit:
y_j = f(y_inj) = 1, if y_inj > θ
               = 0, if -θ ≤ y_inj ≤ θ
               = -1, if y_inj < -θ
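This three-valued activation can be written as a small MATLAB helper. The function below is a sketch added for illustration; the name perceptron_activation and the vectorized form are not from the original text.

%Three-valued perceptron activation applied elementwise to net inputs
function y = perceptron_activation(yin, theta)
    y = zeros(size(yin));
    y(yin > theta) = 1;    %fire positively above the threshold
    y(yin < -theta) = -1;  %fire negatively below the negative threshold
    %values in [-theta, theta] remain 0 (undecided band)
end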
input output) ; yesim(net, test) ays aa You have either reached a page that is unavailable for viewing or reached your viewing limit for this book. aa You have either reached a page that is unavailable for viewing or reached your viewing limit for this book. aa You have either reached a page that is unavailable for viewing or reached your viewing limit for this book. Example 4.8 With a suitable example simulate the perceptron learning network and separate the bound- aries. Plot the points assumed in the respective quadrants using different symbols for identification. Solution Plot the elements as square in the first quadrant, as star in the second quadrant, as diamond in the third quadrant, as circle in the fourth quadrant. Based on the learning rule draw the decision boundaries. Progran Clear; pl=(1 1]': p2=(12]': %- class 1, first quadrant when we plot the elements. square p3"[2 -1]': p4=[2 -2]'; %- class 2, 4th quadrant when we plot the elements, circle p5=[-12]': p6=[-2 1]'; %- class 3, 2nd quadrant when we plot the elements.star p7*[-1 -1]'; p8=[-2 -2]':% - class 4, 3rd quadrant when we plot the elements diamond Now, lets plot the vectors hold on plot(pi(1).p1(2), 'ks" .p2(1).p2(2). "ks" .p3(1) .p3(2). "ko" .p4(1).p4(2). "ko" ) plot(p5(1).p5(2), 'k*" .p6(1) .p6(2), *k*" .p7(1).p7(2), "kd" .p8(1).p8(2). "kd" ) grid hold axis([-3 3 -3.3])&set nice axis on the figure t1*[0 0]': t2[0 0]': %- class 1. first quadrant when we plot the elements. square t3-[0 1]': t4=[0 1]"; %- class 2. 4th quadrant when we plot the elements. circle t5=[1 0]'; t6=[10]': %- class 3, 2nd quadrant when we plot the elements. star t7#[1 1]'; t8=[11]':% - class 4, 3rd quadrant when we plot the elements, diamond ‘lets simulate perceptron learning Elaments sous droits d'auteur aa You have either reached a page that is unavailable for viewing or reached your viewing limit for this book. aa You have either reached a page that is unavailable for viewing or reached your viewing limit for this book. aa You have either reached a page that is unavailable for viewing or reached your viewing limit for this book. 80 Introduction to Neural Networks Example 4.9 Write a MATLAB program for pattern classification using perceptron network. Test train the net with a noisy pattem. Form the input vectors and noisy vectors from patterns as shown below and store it in a* mat file, a + * ++ * 7 * * + ae * +e * He aa <1-4t 0 -4te -111 ae + +e * * + ” ee ate * oe ee * + 1-14 41-1 114 Input vectors Noisy vectors. Solution ‘The input vectors and the noisy vectors are stored ina mat file, say class.mal, and the required data is taken from the file. Here a subfunction called charplot.m is used. The MATLAB program for this is given below. Program Perceptron for pattern classification clear: cle Get the data from file data=open( ‘class mat") Sinput pattern Blarget tsedata.ts; Testing pattern nels; BInitialize the Weight matrix wezeros(n.m) bezeros(a. 1. EIntitalize alphae1 earning rate and threshold value aa You have either reached a page that is unavailable for viewing or reached your viewing limit for this book. aa You have either reached a page that is unavailable for viewing or reached your viewing limit for this book. aa You have either reached a page that is unavailable for viewing or reached your viewing limit for this book. | 84 | Introduction to Neural Neworks Noisy Input Pattern used for Training mies me Pr ae Classified Output Pattern 4.3. 
Brief Introduction to |Networks Multilayer perceptron networks is an important class of neural networks. The network consists of a set of sensory units that constitute the input layer and one or more hidden layer of computation modes. The input signal passes through the network in the forward direction. The network of this type is called multilayer perceptron (MLP). ‘The multilayer perceptrons are used with supervised leaming and have led to the successful back- propagation algorithm, The disadvantage of the single layer perceptron is that it cannot be extended to multi-layered version. In MLP networks there exists a non-linear activation function. The widely used non-linear activstion function is logistic sigmoid function. The MLP network also has various layers of hidden neurons. The hidden neurons make the MLP network active for highly complex tasks. The layers of the network are connecied by synaptic weights, The MLP thus has a high computational efficiency. aa You have either reached a page that is unavailable for viewing or reached your viewing limit for this book. aa You have either reached a page that is unavailable for viewing or reached your viewing limit for this book. aa You have either reached a page that is unavailable for viewing or reached your viewing limit for this book. 88 Introduction to Neural Neiworks rule and are applied to various neural network applications. The weights on the interconnections between the adaline and madaline networks are adjustable, The adaline and madaline networks are discussed in detail in this chapter Adaline, developed by Widrow and Hoff [1960], is found to use bipolar activations for its input signals and target output. The weights and the bias of the adaline are adjustable. The learning rule used canbe called as Delta rule, Least Mean Square rule or Widrow-Hoffrule. The derivation of this rule with single output unit, Several output units and its extension has been dealt already in Section 3.3.3. Since the activation function is an identity function, the activation of the unit is its net input. ‘When adaline is to be used for pattern classification, then, after training, a threshold function is applied to the net input to obtain the activation The activation is, y= @ @e@ The adaline unit can solve the problem with linear separability if it occurs. 5.2.1 Architecture § ‘The architecture of an adaline is shown in Fig. 5.1. th 7 ‘The adaline has only one output unit. This output unit receives input from several units and also from bias: whose (y,) wy, activation is always +1. The adaline also resembles asingle —\. aa a We layer network as discussed in Section 2.7. It receives input from several neurons. It should be noted that it also receives input from the unit which is always ‘+1’, called as bias. The @)— bias weights are also trained in the same manner as the other jor weights, In Fig. 5.1, an input layer with xy..-xj--.x,and bias, aan output layer with only one output neuron is present. The link between the input and output neurons possess weighted Cy interconnections. These weights get changed as the trai progresses. y Fig 6:4] Architecture of an Adaline 5.2.2 Algorithm Basically, the initial weights of adaline network have to be set to small random values and not to zero as discussed in Hebb or perceptron networks, because this may influence the error factor to be considered. After the initial weights are assumed, the activations for the input unit are set. 
The net input is calculated based on the training input patterns and the weights. By applying delta leaming rule discussed in 3.3.3, the weight updation is being carried out. The training process is continued until the error, which is the difference between the target and the net input becomes minimum. The step based training algorithm for an adaline is as follows:
