AI Applications

This document discusses artificial intelligence and its applications. It covers expert system techniques, artificial neural networks, fuzzy rule-based systems, genetic algorithms, and hybrid soft computing systems. The key components of an expert system are described as the knowledge base, inference engine, explanation facility, and user interface. Expert systems capture human expertise to provide intelligent advice and have been applied in clinical decision making, diagnostics, and other domains.

Artificial Intelligence

Applications
References
1. "Neural Networks and Artificial Intelligence for Biomedical Engineering", Donna L. Hudson, 2003
2. "Fuzzy Sets and Systems: Theory and Applications" (ebook)
3. "Neural Networks, Fuzzy Logic and Genetic Algorithms: Synthesis and Applications", S. Rajasekaran and G.A. Vijayalakshmi, 2005
4. "Neuro-Fuzzy and Soft Computing: A Computational Approach to Learning and Machine Intelligence", J.-S. R. Jang, C.-T. Sun and E. Mizutani, 1997
5. "Introduction to Artificial Neural Systems", Jacek M. Zurada, 1994 (ebook)
6. "Intelligent Hybrid Systems: Fuzzy Logic, Neural Networks, and Genetic Algorithms", Da Ruan, 1997
7. "Fuzzy Logic and Expert Systems Applications", Cornelius T. Leondes, 1998
Contents
• Introduction to AI
• An Expert System Techniques
• Artificial Neural Networks
• AI in clinical decision systems
• Fuzzy rule based systems (or fuzzy inference systems)
• Genetic Algorithms
Introduction to AI
• What is AI
• Intelligent behavior
• Major branches of AI
• Overview of expert systems
• What is soft computing
• Soft computing techniques
• Hybrid systems
• From AI to soft computing
What is AI?

• "AI is the area of computer science concerned with designing intelligent computer systems; that is, systems that exhibit the characteristics associated with intelligence in human behavior" (Avron Barr and Feigenbaum, 1981)
• "AI is a branch of computer science that is concerned with the automation of intelligent behavior" (Luger and Stubblefield, 1993)
• "AI is the art of making computers do smart things" (Waldrop)
Intelligent Behavior

• Learn from experience
• Apply knowledge acquired from experience
• Handle complex situations
• Solve problems when important information is missing
• Determine what is important
• React quickly and correctly to a new situation
• Understand visual images
• Process and manipulate symbols
• Be creative and imaginative
• Use heuristics
Major Branches of AI
• Perceptive system: a system that approximates the way a human sees, hears, and feels objects
• Vision system: captures, stores, and manipulates visual images and pictures
• Robotics: mechanical and computer devices that perform tedious tasks with high precision
• Expert system: stores knowledge and makes inferences
Major Branches of AI (2)
• Learning system: the computer changes how it functions or reacts to situations based on feedback
• Natural language processing: computers understand and react to statements and commands made in a "natural" language
• Neural network: a computer system that can act like or simulate the functioning of the human brain
[Figure: branches of artificial intelligence — vision systems, learning systems, robotics, expert systems, neural networks, and natural language processing]
Overview of Expert Systems
Definition: a program that uses available information, heuristics, and inference to suggest solutions to problems in a particular discipline.
• Expert systems can:
  – Explain their reasoning or suggested decisions
  – Display intelligent behavior
  – Draw conclusions from complex relationships
  – Provide portable knowledge
• Expert system shell (definition): a collection of software packages and tools used to develop expert systems (it is an expert system without a knowledge base, i.e., it is not restricted to a specific domain)
Limitations of Expert Systems

• Not widely used or tested
• Limited to relatively narrow problems
• Cannot readily deal with "mixed" knowledge
• Possibility of error
• Cannot refine own knowledge base
• Difficult to maintain
• May have high development costs
Expert System Achievements
• Capture and preserve irreplaceable human expertise
• Provide expertise needed at a number of locations at the same time, or in a hostile environment that is dangerous to human health
• Provide expertise that is expensive or rare
• Develop a solution faster than human experts can
• Provide expertise needed for training and development, to share the wisdom of human experts with a large number of people
Components of an Expert System

• Knowledge base: stores all relevant information, data, rules, cases, and relationships used by the expert system
  – Rule: a conditional statement that links given conditions to actions or outcomes
• Inference engine: seeks information and relationships from the knowledge base and provides answers, predictions, and suggestions in the way a human expert would
  – To produce reasoning, the inference engine is based on logic. There are several kinds of logic: propositional logic, predicate logic of order 1 or more, epistemic logic, modal logic, temporal logic, fuzzy logic, etc.
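The rule and knowledge-base components described above can be sketched in a few lines of Python; a minimal illustration (the diagnostic-style rule contents are invented for this sketch, not taken from the source):

```python
# A minimal rule-based knowledge base: each rule links a set of
# conditions (facts that must all hold) to a conclusion. The rule
# contents below are invented purely for illustration.
rules = [
    {"if": {"fever", "cough"}, "then": "flu_suspected"},
    {"if": {"flu_suspected", "high_wbc"}, "then": "order_blood_culture"},
]

def matching_rules(facts):
    """Return the conclusions of every rule whose conditions all hold."""
    return [r["then"] for r in rules if r["if"] <= facts]

print(matching_rules({"fever", "cough"}))  # -> ['flu_suspected']
```

The inference engine's job, sketched in the next sections, is to apply such rules repeatedly rather than just once.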
Components of an Expert System (cont.)
• Explanation facility: a part of the expert system that allows a user or decision maker to understand how the expert system arrived at certain conclusions or results
• User interface
Components of an Expert System (cont.)

• Knowledge acquisition facility: provides a convenient and efficient means of capturing and storing all components of the knowledge base

[Figure: expert system architecture — the expert feeds the knowledge acquisition facility, which builds the knowledge base; the inference engine and explanation facility draw on the knowledge base and communicate with the user through the user interface]
Participants in Expert Systems Development and Use
• Domain expert: the individual or group whose expertise and knowledge is captured for use in an expert system
• Knowledge user: the individual or group who uses and benefits from the expert system
• Knowledge engineer: someone trained or experienced in the design, development, implementation, and maintenance of an expert system
[Figure: schematic — the knowledge engineer works with the domain expert to build the expert system, which serves the knowledge user]
Evolution of Expert Systems Software
• Expert system shell: a collection of software packages and tools to design, develop, implement, and maintain expert systems

[Figure: ease of use of development tools over time — traditional programming languages (before 1980), special and 4th-generation languages (1980s), and expert system shells (1990s), in increasing order of ease of use]
Expert Systems Development Alternatives

[Figure: development cost versus time to develop — using an existing package is cheapest and fastest, developing from a shell is intermediate, and developing from scratch has the highest cost and longest development time]
Overview of the Expert Systems Development Life Cycle: The Knowledge Engineering Process

1.1 Components of an Expert System
Expert systems, as defined before, are built to capture the expertise of human experts in a well-defined, narrow area of knowledge. The early expert systems took 20 to 50 person-years to build; today's complex expert systems are still apt to take about 10 person-years. With the use of an expert system shell, however, expert systems can be built in 5 person-years, and simple expert system prototypes in only 3 person-months.

1.2 Building an Expert System

1.3 Implementing an Expert System

This chapter briefly discusses the components of expert systems and their building process. More detailed explanation of each of these components and their development is provided in later chapters.
1.1 Components of an Expert System
The components of an expert system, as shown in Figure 1.1, can be described in terms of:
• a user interface – the dialog structure;
• a control structure – the inference engine; and
• a fact base – the knowledge base.

[Figure 1.1: Components of an Expert System — the user interacts through the dialog structure with the inference engine, which operates on the knowledge base]
Dialog Structure
The dialog structure is the language interface through which the user accesses the expert system. Usually the user interacts with the expert system in a consultative mode. An explanation module is also included in the expert system. The explanation module allows the user to query and challenge the expert system and examine its reasoning process.
According to Michie, there are three different user modes for an expert system:
1. Getting answers to problems – the user as client;
2. Improving or increasing the system's knowledge – the user as tutor; and
3. Harvesting the knowledge base for human use – the user as pupil.
Each of these three modes requires interaction through the dialog structure.

Inference Engine
The inference engine is a program that allows hypotheses to be generated based upon the information in the knowledge base. It is the control structure that manipulates the knowledge in the knowledge base to arrive at various solutions.
Three major methods are incorporated in the inference engine to search a space efficiently for deriving hypotheses from the knowledge base: forward chaining, backward chaining, and forward and backward processing combined. Later chapters will explain each of these methods in detail.
Forward chaining, often described as event- or data-driven reasoning, is used for problem
solving when data or basic ideas are a starting point. With this method, the system does not
start with any particular goals defined for it. It works through the facts to arrive at
conclusions, or goals. One drawback of forward chaining is that one derives everything
possible, whether one needs it or not. Forward chaining has been used in expert systems for
data analysis, design, and concept formulation.
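Forward chaining as described above can be sketched in a few lines of Python (the rules here are abstract placeholders, invented for illustration):

```python
# Forward chaining: start from known facts and repeatedly fire any rule
# whose conditions are all satisfied, until no new conclusions appear.
# Note it derives everything derivable, needed or not, as the text says.
def forward_chain(rules, facts):
    facts = set(facts)
    changed = True
    while changed:
        changed = False
        for conditions, conclusion in rules:
            if conditions <= facts and conclusion not in facts:
                facts.add(conclusion)  # a new fact was derived
                changed = True
    return facts

# Hypothetical rules: {a, b} => c, then {c} => d.
rules = [({"a", "b"}, "c"), ({"c"}, "d")]
print(forward_chain(rules, {"a", "b"}))  # derives c, then d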
Backward chaining, called goal-directed reasoning, is another inference engine technique. This method entails having a goal or a hypothesis as a starting point and then working backward along certain paths to see if the conclusion is true. A problem with backward chaining involves conjunctive subgoals, in which a combinatorial explosion of possibilities could result. Expert systems employing backward chaining are those used for diagnosis and planning.
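A matching sketch of backward chaining (again with invented placeholder rules; handling of cyclic rule sets is omitted for brevity):

```python
# Backward chaining: start from a goal and work backward, trying to
# prove each rule's conditions as subgoals. The goal is established if
# it reduces to known facts. Rule contents are hypothetical.
def backward_chain(rules, facts, goal):
    if goal in facts:
        return True
    for conditions, conclusion in rules:
        if conclusion == goal and all(
            backward_chain(rules, facts, c) for c in conditions
        ):
            return True
    return False

# Hypothetical rules: {a, b} => c, then {c} => d.
rules = [({"a", "b"}, "c"), ({"c"}, "d")]
print(backward_chain(rules, {"a", "b"}, "d"))  # True: d <- c <- {a, b}
```

Unlike forward chaining, only the subgoals relevant to the stated goal are ever examined, which is why this style suits diagnosis and planning.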
Forward and backward processing combined is another method used for search direction in
the inference engine. This approach is used for a large search space, so that bottom-up and
top-down searching can be appropriately combined. This combined search is applicable to
complex problems incorporating uncertainties, such as speech understanding.
Most inference engines have the ability to reason in the presence of uncertainty. Different
techniques have been used for handling uncertainty – namely, Bayesian statistics, certainty
factors and fuzzy logic. In many cases, one must deal with uncertainty in data or in
knowledge. Thus, an expert system should be able to handle this uncertainty.
Knowledge Base
The knowledge base is the last and most important component of an expert system. It is
composed of domain facts and heuristics based upon experience. According to Duda and
Gaschnig, the most powerful expert systems are those containing the most knowledge.
3.2 Building an Expert System
Building an expert system typically involves the rapid prototyping approach. This
approach entails “building a little and testing a little” until the knowledge base is
refined to meet the expected acceptance rate and users’ needs. Expert systems
development is an iterative process in which, after testing, knowledge is
reacquired, represented, encoded, and tested again until the knowledge base is
refined. A principal refinement for expert systems continues to be the need for
effective filtering out of the biases created by the particular values and views of the
experts who are the sources of the knowledge that constitute knowledge bases.
The first step in building an expert system is to select the problem, define the
expert system’s goal(s), and identify the sources of knowledge. The criteria for
expert system problem selection can be broken down into three components:
problem criteria, expert criteria, and domain area personnel criteria. The following
criteria should be followed when selecting an expert system problem:
Problem Criteria
• The task involves mostly symbolic processing.
• Test cases are available.
• The problem task is well bounded.
• The task must be performed frequently.
• Written materials exist explaining the task.
• The task requires only cognitive skills.
• The experts agree on the solutions.
Expert Criteria
• An expert exists.
• The expert is cooperative.
• The expert is articulate.
• The expert's knowledge is based on experience, facts, and judgment.
• Other experts in the task area exist.
Domain Area Personnel Criteria
• A need exists to develop an expert system for that task.
• The task would be provided with the necessary financial support.
• Top management supports the project.
• The domain area personnel have realistic expectations for the use of an expert system.
• Users would welcome the expert system.
• The knowledge is not politically sensitive or controversial.
Once these criteria are met and the problem is selected, the next step is to acquire the knowledge from the expert in order to develop the knowledge base. Knowledge acquisition is an iterative process in which many meetings with the expert are needed to gather all the relevant and necessary information for the knowledge base. Before being able to acquire the knowledge from the expert, however, the knowledge engineer must be familiar with the domain. Reading documentation and manuals and observing experts in the domain are needed to obtain a fundamental background on which the knowledge engineer's knowledge will be based. This background is needed so that the knowledge engineer can ask the appropriate questions and understand what the expert is saying.
Knowledge can be acquired by various methods. The most commonly used method is to have the knowledge engineer (i.e., the developer of the expert system) interview the expert. Various structured and unstructured techniques can be used for interviewing. Some of the methods are as follows:
• Method of familiar tasks – analysis of the tasks that the expert usually performs.
• Limited information tasks – a familiar task is performed, but the expert is not given certain information that is typically available.
• Constrained processing tasks – a familiar task is performed, but the expert must do so under time or other constraints.
• Method of tough cases – analysis of a familiar task that is conducted for a set of data that presents a tough case for the expert.
After the knowledge is acquired, the next step is to select a knowledge representation approach. These approaches include predicate calculus, frames, scripts, semantic networks, and production rules, as described earlier. According to Software Architecture and Engineering, Inc., rule-based deduction would be an appropriate method to use if (1) the underlying knowledge is already organized as rules, (2) the classification is predominantly categorical, and (3) there is not much context dependence. Frames, scripts, and semantic networks are best used when the knowledge preexists as descriptions.
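As an illustration of one of these representations, frames can be modeled as slot-filler structures with inheritance from a parent frame; a minimal sketch (the frame contents are invented for this example):

```python
# Frames group descriptive knowledge about a concept into named slots.
# A child frame inherits any slot it does not define from its parent.
# The frames below are illustrative, not from the source.
frames = {
    "bird":    {"is_a": None,   "locomotion": "flies", "covering": "feathers"},
    "penguin": {"is_a": "bird", "locomotion": "swims"},
}

def get_slot(frames, name, slot):
    """Look up a slot value, inheriting from the parent frame if absent."""
    frame = frames[name]
    if slot in frame:
        return frame[slot]
    return get_slot(frames, frame["is_a"], slot) if frame["is_a"] else None

print(get_slot(frames, "penguin", "locomotion"))  # local value: swims
print(get_slot(frames, "penguin", "covering"))    # inherited: feathers
```

This is the sense in which frames suit knowledge that "preexists as descriptions": the structure stores defaults and exceptions rather than condition-action rules.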
The next step consists of programming the knowledge by using a text editor in an expert system shell or by using LISP, Prolog, C, or some other appropriate programming language. An expert system shell contains a generalized dialog structure and an inference engine. A knowledge base can be designed for a specific problem domain and linked to the expert system shell to form a new expert system for a particular application. Some of the more popular expert system shells are these:

• ART – Inference Corporation
• KEE – IntelliCorp
• Knowledge Craft – The Carnegie Group
• TIRS – IBM Corporation
• Nexpert Object – Neuron Data
• Level 5 Object – Information Builders, Inc.
• VP-Expert – WordTech, Inc.
• Exsys – Exsys, Inc.
• KBMS/ADS – Trinzic Corporation
• G2 – Gensym Corporation
An expert system is not just a group of rules. It is a program that should be able to explain its reasoning, and human factors design should be incorporated into the system. Human factors should be designed into the expert system to include such features as messages, menus, free-text comments, rules against undesirable input, and a high-level English-like representation language for encoding the knowledge. Explanation capabilities are needed so that the user can question and challenge the results of the expert system as well as understand how the results were achieved. In MYCIN, an expert system designed to diagnose bacterial infections in the blood, an explanation can be arrived at as shown here, with the user asking the questions:

** HOW DID YOU DECIDE THAT ORGANISM-1 MIGHT BE AN E. COLI?
I used RULE021 to conclude that the identity of ORGANISM-1 is e. coli. This gave a cumulative CF of (.47). Then I used RULE084 to conclude that the identity of ORGANISM-1 is e. coli. This gave a cumulative CF of (.55). Finally, I used RULE003 to conclude that the identity of ORGANISM-1 is e. coli. This gave a cumulative CF of (.74).
** DID YOU USE RULE163 TO FIND OUT ANYTHING ABOUT ORGANISM-1?
RULE163 was tried in the context of ORGANISM-1, but it failed because it is not true that the patient has had a genitourinary tract manipulative procedure (clause 3).
** WHY DIDN'T YOU CONSIDER STREPTOCOCCUS AS A POSSIBILITY FOR ORGANISM-1?
The following rule could have been used to determine that the identity of ORGANISM-1 was streptococcus: RULE033. But clause 2 ("the morphology of the organism is coccus") was already known to be false for ORGANISM-1, so the rule was never tried.
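The cumulative CFs in a dialogue like this follow the standard combination formula for positive certainty factors, CF = CF_old + CF_rule * (1 - CF_old). A sketch (the per-rule CFs 0.151 and 0.422 are back-solved from the cumulative values above purely for illustration, not taken from MYCIN):

```python
# MYCIN-style combination of two positive certainty factors:
# each new supporting rule moves the cumulative CF part of the
# remaining distance toward 1.
def combine_cf(cf_old, cf_rule):
    return cf_old + cf_rule * (1 - cf_old)

# Per-rule CFs below are hypothetical, back-solved for illustration.
cf = 0.47                   # after RULE021
cf = combine_cf(cf, 0.151)  # after RULE084 -> ~0.55
cf = combine_cf(cf, 0.422)  # after RULE003 -> ~0.74
print(round(cf, 2))
```

This is why evidence from several weak rules can accumulate into a fairly confident conclusion, as in the e. coli example.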
The last step in the expert system building process consists of validating, testing, and evaluating the system. Validity assessment is necessary to check the knowledge base and the expert system as a whole. Validity can also be confirmed by other experts knowledgeable in the problem domain.
Testing is an important area to be examined because when the expert system finally runs, it typically produces a variety of unexpected results. These unexpected results are summarized by Hayes-Roth et al.:
• Excess generality – special cases overlooked.
• Excess specificity – generality undetected.
• Concept poverty – useful relationships not detected and exploited.
• Invalid knowledge – misstatement of facts or approximations.
• Ambiguous knowledge – implicit dependencies not adequately articulated.
• Inadequate integration – dependencies among multiple pieces of advice incompletely integrated.
• Limited horizon – consequences of recent, past, or probable future events not exploited.
• Egocentricity – little attention paid to the probable meaning of others' actions.
In order to correct these problems, knowledge refinement and maintenance are
needed. Also, many test cases involving hard and soft data, gray areas, and special
cases must be used to refine the knowledge base.
After testing of the knowledge base and the resulting expert system, an
evaluation of the expert system can be made by users and experts currently
working in the problem domain. This evaluation process is a post-audit to see if the
expert system meets the objectives for which it was developed. Evaluation criteria
such as input/output content, quality of advice, correct reasoning approach, cost
effectiveness, and ease of use should be considered. With these considerations
built into its design, the success of the expert system will be greatly enhanced.
3.3 Implementing an Expert System
After building the expert system there are various implementation obstacles that may
have to be faced. There might be resistance to change in the organization, either because of
the need to learn a new procedure / tool or the belief that “we have been doing things all
right so far, so why change?” Another obstacle might come from the expert who might
think that he or she is being replaced. Still another obstacle might be a reluctance to use the
expert system due to difficulty in learning how to operate it. A last major obstacle is that the
expert system is unusable because it is not maintained and therefore not current.
There are several ways to overcome these barriers. A resistance to change could be
eliminated by incorporating users’ comments into the development process. The users’
inputs and feedback would be helpful in designing the user interface and determining how
best to represent the output. Also, if the expert system is properly validated and tested, and
proves to perform at the level of a human expert, then the system could be relied upon and
confidence in using it would increase.
The obstacle of taking the competitive advantage away from the expert can be
eliminated by recognizing that the expert system can free the expert’s time to tackle other
projects of interest that the expert never had time to do.
Reluctance to use the expert system can be overcome by training sessions, good
documentation, and knowledge engineering consulting. If an expert system shell is used, the
vendor usually provides training, documentation, a hot line, and knowledge engineering
consulting. Even if a shell is not used, good user documentation and training should be
provided by the developers to reduce the reluctance to use the expert system.
The last obstacle, not keeping the expert system current, is remedied by
designating an individual or group of individuals to maintain the system. In most
applications a good percentage of the knowledge in the knowledge base is
dynamic. It needs to be constantly refined and updated. For example, Digital
Equipment Corporation has a group of people whose only task is to maintain XCON,
an expert system for configuring VAX computer systems. New component
descriptions and configuration knowledge need to be put into the system on a
regular basis. Likewise, expert systems designed for tax planning need to be
updated regularly, as the tax laws frequently change. Expert systems should be
maintained, and their knowledge bases should be designed to facilitate change.
By employing these implementation strategies, the obstacles will be eliminated
and expert systems will gain wide acceptance and usage.
Problem Selection for Expert Systems Development
2.1 Problem Selection Guidelines
2.2 Methodologies for Expert System Problem Selection

Someone once said that there are three important rules in developing expert systems. The first rule is "pick the right problem," the second is "pick the right problem," and the third is "pick the right problem." In expert systems development, as in software development or scientific research, the most critical step is selecting the problem. Selecting too large a problem or one with few test cases could lead to disastrous results when building an expert system. Picking too trivial a problem could leave managers and users unimpressed. If the problem is not properly identified and researched, then complications will most likely occur later in the knowledge engineering (i.e., expert systems development) process. By spending time up front identifying the problem, time and money will ultimately be saved.
This chapter will discuss guidelines to use in selecting a problem for expert systems development. First, a description of expert systems and their building process will be presented. Then, after characteristics of problem selection are identified, some methodologies will be presented to help one select a problem suitable for expert systems development. Last, the application areas in which expert systems have been built and used will be reviewed.
2.1 Problem Selection Guidelines
The first step in building an expert system is to select the problem. This step can be discussed in terms of the type of problem, the expert, and the domain area personnel. Each of these areas will now be explained.

Type of Problem
The following guidelines might be followed by the knowledge engineering team in order to select an appropriate problem:
• The task primarily requires symbolic reasoning.
• The task requires the use of heuristics.
• The task may require decisions to be based upon incomplete or uncertain information.
• The task does not require knowledge from a large number of areas.
• The purpose of development is either to create a system for actual use or to make major advances in the state of the art of expert system technology; however, it does not attempt to achieve both of these goals simultaneously.
• The task is defined very clearly: at the outset of the project, there should be a precise definition of the inputs and outputs of the system to be developed.
• A good set of test cases exists.
• Some small systems will be developed to solve problems that are amenable to conventional techniques simply because the users need the systems quickly and decide that they can develop workable solutions by themselves, using shells, rather than waiting for their data processing groups to help them.
• A few key individuals are in short supply.
• Corporate goals are compromised by scarce human resources.
• Competitors appear to have an advantage because they can perform the task consistently better.
• The domain is one where expertise is generally unavailable, scarce, or expensive.
• The task does not depend heavily on common sense.
• The outcomes can be evaluated.
• The task is decomposable, allowing relatively rapid prototyping for a closed small subset of the complete task and then slow expansion to the complete task.
The Expert
An essential part of expert systems development is having an expert to work with the knowledge engineering team. Here are some guidelines for selecting an expert:
• There is an expert who will work with the project.
• The expert's knowledge and reputation must be such that if the system captures a portion of the expert's expertise, the system's output will have credibility and authority.
• The expert has built up expertise over a long period of task performance.
• The expert will commit a substantial amount of time to the development of the system.
• The expert is capable of communicating his or her knowledge, judgment, and experience, as well as the methods used to apply them to the particular task.
• The expert is cooperative.
• The expertise for the system, at least that pertaining to one particular subdomain, is to be obtained primarily from one expert.
• If multiple experts contribute in a particular subdomain, one of them should be the primary expert with final authority.
• It is always nice to have a backup expert.
• The expert is the person the company can least afford to do without.
• The expert should have a vested interest in obtaining a solution.
• The expert must also understand what the problem is and should have solved it quite often.
• Experts must agree on the solutions.
Domain Area Personnel
Besides the knowledge engineer who builds the expert system and the domain expert whose knowledge and experiential learning are captured in it, the domain area personnel must be considered when selecting a problem for expert systems development. The domain area personnel are the users and management.
The following are guidelines relating to these when selecting a problem:
• Personnel in the domain area are realistic, understanding the potential uses and limitations of an expert system for their domain.
• Domain area personnel understand that even a successful system will likely be limited in scope and, like a human expert, may not produce optimal or correct results all the time.
• There is strong managerial support from the domain area, especially regarding the large commitment of time by the expert(s) and their possible travel or temporary relocation, if required.
• The specific task within the domain is jointly agreed upon by the system developers and the domain area personnel.
• Managers in the domain area have previously identified the need to solve the problem.
• The project is strongly supported by a senior manager for protection and follow-up.
• Potential users would welcome the completed system.
• The system can be introduced with minimal disturbance of the current practice.
• The user group is cooperative and patient.
• The introduction of the system will not be politically sensitive or controversial.
• The knowledge contained by the system will not be politically sensitive or controversial.

2.2 Methodologies for Expert System Problem Selection

To ensure proper problem selection for expert system development, the criteria and guidelines just presented should be carefully considered. One method that helps the knowledge engineer select and scope a problem is the Analytic Hierarchy Process (AHP). This method is helpful in quantifying subjective judgments used in decision making. A microcomputer software package called Expert Choice embodies this process.
The AHP was developed by Saaty and has been applied successfully in numerous situations, ranging from selecting an appropriate expert system shell to choosing the best house to buy. The AHP breaks down a problem into its constituents and then calls for simple pairwise comparison judgments to develop priorities in each hierarchy.
The steps of the AHP are as follows:
1. The problem is defined, and you determine what you want to know.
2. The hierarchy is structured from the top (the objectives from a general viewpoint) through the intermediate levels (criteria on which subsequent levels depend) to the lowest level (which is usually a list of the alternatives).
3. A set of pairwise comparison matrices is constructed for each of the lower levels – one matrix for each element in the level immediately above.
4. After all the pairwise comparisons have been made and the data entered, the consistency is determined using the eigenvalue.
5. Steps 3 and 4 are performed for all levels in the hierarchy.
6. Hierarchical composition is now used to weight the eigenvectors by the entries corresponding to those in the next lower level of the hierarchy.
7. The consistency of the entire hierarchy is found by multiplying each consistency index by the priority of the corresponding criterion and adding them together.
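The numerical core of steps 3 and 4 can be sketched by extracting the principal eigenvector of a pairwise comparison matrix with power iteration and reading off the priorities (the 3x3 matrix below is an invented, perfectly consistent example):

```python
# Principal eigenvector of a pairwise comparison matrix via power
# iteration; the normalized eigenvector entries are the priorities.
def priorities(A, iters=100):
    n = len(A)
    w = [1.0 / n] * n
    for _ in range(iters):
        w = [sum(A[i][j] * w[j] for j in range(n)) for i in range(n)]
        s = sum(w)
        w = [x / s for x in w]  # normalize so priorities sum to 1
    return w

# Entry A[i][j] says element i is judged A[i][j] times as important
# as element j; the matrix is reciprocal (A[j][i] = 1 / A[i][j]).
# This illustrative matrix is perfectly consistent.
A = [[1, 2, 4],
     [1 / 2, 1, 2],
     [1 / 4, 1 / 2, 1]]
w = priorities(A)
print([round(x, 3) for x in w])  # -> [0.571, 0.286, 0.143]
```

For a consistent matrix like this one, the priorities are exactly proportional to the underlying weights (here 4:2:1); with real, slightly inconsistent judgments, the eigenvalue-based consistency check in step 4 measures how far the matrix deviates from this ideal.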
Mathematically speaking, priorities are calculated by the process of principal
eigenvector extraction and hierarchical weighting. Suppose that we have a
matrix of pairwise comparisons of weights that have n objects A1,….An whose
vector of corresponding weights is w = (w1,….wn). The problem Aw =(maximum
eigenvalue ) (w) should be solved to obtain an estimate of the weights w. A
pairwise comparison reciprocal matrix is used to compare the relative
contribution of the elements in each level of the hierarchy to an element in the
adjacent upper level.
The principal eigenvector of this matrix is then derived and weighted by the priority of the
property with respect to which the comparison is made. That weight is determined by comparing the
properties in terms of their contribution to the criteria of a still higher level. The weighted eigenvectors
can next be added componentwise to obtain an overall weight or priority of the contribution of each
element to the hierarchy. Bazaraa et al. provide a further explanation, in terms of linear algebra, of the
derivation of an eigenvalue.
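The eigenvector extraction described above can be sketched in code. The following Python sketch (an illustration, not Expert Choice's actual implementation) approximates the principal eigenvector of a reciprocal pairwise comparison matrix by power iteration; the 3×3 example matrix comparing three hypothetical criteria is an assumption for demonstration.

```python
# Illustrative sketch of AHP priority derivation via power iteration.
# The pairwise matrix A is reciprocal: A[j][i] == 1 / A[i][j],
# with entries drawn from Saaty's 1-9 judgment scale.

def principal_eigenvector(A, iters=100):
    """Approximate the principal eigenvector of A, normalized to sum to 1."""
    n = len(A)
    w = [1.0 / n] * n                       # start from a uniform vector
    for _ in range(iters):
        # multiply A by the current estimate, then renormalize
        w = [sum(A[i][j] * w[j] for j in range(n)) for i in range(n)]
        s = sum(w)
        w = [x / s for x in w]
    return w

# Hypothetical comparison of three criteria (e.g., PRO TYPE, EXPERT, DOM PERS).
A = [[1.0,   3.0,   5.0],
     [1/3.0, 1.0,   3.0],
     [1/5.0, 1/3.0, 1.0]]
weights = principal_eigenvector(A)
```

For this matrix the derived priorities are roughly 0.64, 0.26, and 0.10, preserving the ordering implied by the pairwise judgments.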
The AHP has been rigorously tested and successfully applied in numerous diverse
applications. It provided remarkably accurate results when validated in situations where
numerical measures are known.
In one experiment, four chairs were arranged in a straight line from a light source,
and pairwise verbal judgments from subjects were then made about the relative brightness
of the chairs. The results, when analyzed, showed a remarkable conformity to the inverse
square law of brightness as a function of distance, as seen by the following numbers:

TRIAL 1    TRIAL 2    INVERSE SQUARE LAW
 .61        .61        .61
 .24        .22        .22
 .10        .10        .11
 .05        .05        .06
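The inverse-square column can be reproduced by normalizing 1/d² over the chair distances. The distances used below (9, 15, 21, and 28 units) are an assumption for illustration and may differ from the original experiment's setup.

```python
# Normalize 1/d^2 over assumed chair distances to get relative brightness
# priorities comparable to the table's inverse-square-law column.
distances = [9, 15, 21, 28]          # assumed distances from the light source
raw = [1 / d**2 for d in distances]  # inverse-square brightness
total = sum(raw)
priorities = [round(x / total, 2) for x in raw]
```

With these assumed distances the normalized values come out to .61, .22, .11, and .06, matching the table's third column.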
Another validation experiment involved estimating the relative areas of two-
dimensional figures by using pairwise judgments. Subjects were asked to estimate the
relative areas of five geometrically shaped objects by using Expert Choice. When different
subjects provided verbal judgments in using Expert Choice, the accuracy of their results was
amazing:

FIGURE    ACTUAL PERCENTAGE    EXPERT CHOICE ESTIMATE
A         .47                  .45
B         .05                  .07
C         .23                  .24
D         .15                  .15
E         .10                  .09
Expert Choice
Expert Choice represents a significant contribution to the decision-making
process, as it is able to quantify subjective judgments in complex decision-making
environments. This program enables decision makers to structure a multifaceted
problem visually in the form of a hierarchy. Expert Choice, which uses the AHP, is
helpful in selecting an appropriate expert system problem.
In using Expert Choice, the user (i.e., decision maker) first constructs a
hierarchy of the goal, criteria, and alternatives for the application. At the top level
of the hierarchy, the goal is defined – which, in this case, is to select an expert
system problem. At the next level, the criteria used in selecting an appropriate
expert system problem are defined. These criteria are based on those discussed in
Section 2.1. At this level, the criteria are as follows:
PRO TYPE – type of problem criteria
EXPERT – expert criteria
DOM PERS – domain area personnel criteria
Subcriteria under each of these headings can be defined at the next level of the hierarchy. These are:
PRO TYPE
SYMBOLIC – task involves mostly symbolic processing
TEST CAS – test cases are available
WELL-BND – problem task is well bounded
FREQUENT – task must be performed frequently
WRIT MAT – written materials exist explaining the task
COG SKLS – task requires only cognitive skills
EXP AGRE – experts agree on the solutions
EXPERT
EXP EXST – an expert exists
COOPERTE – the expert is cooperative
ARTICULT – the expert is articulate
EXPERNCE – the expert’s knowledge is based on
experience, facts, and judgment
OTHER EX – other experts in the task exist
DOM PERS
NEED EXI—a need exists to develop an expert system for that task
FIN SPRT—the task would be provided with the necessary financial support
TOP MGMT—top management supports the project
REAL EXP—the domain area personnel have realistic expectations for the use of an expert system
USERS WL—users would welcome the expert system
NOT POL—the knowledge is not politically sensitive or controversial
The last level of the hierarchy consists of the alternatives—in this case, the possible problems (or
tasks) to be worked on for expert systems development. In this example, there are three
alternatives:
MACROECO—develop an expert system for determining macroeconomic policy in the United States
NUCLEAR—develop an expert system for determining what the United States should do in case of a nuclear war
BID/NO—develop an expert system for determining whether to bid on a request for proposal.

Figure 2.1 shows the Expert Choice hierarchy before pairwise comparisons (i.e., before weighting takes place) for this expert system problem selection application.
After constructing this hierarchy, the evaluation process begins. Expert Choice will first
question the user in order to assign priorities (i.e., weights) to the criteria. Expert Choice allows
the user to provide judgment verbally, so that no numerical guesses are required (it also allows
the user to give a numerical answer). Thus, the first question would be: “With respect to the goal
of selecting an expert system problem, is criterion one (i.e., PRO TYPE) just as important as
criterion two (i.e., EXPERT)?” If the user’s answer to the question is “Yes,” then criterion one is
compared to criterion three (PRO TYPE vs. DOM PERS). If the answer is “NO,” as it is at the top of
Figure 2.2, then Expert Choice will ask, “Is PRO TYPE more important than EXPERT?” and it will ask
for the level of importance. The number of pairwise comparisons is shown in a triangle of dots, as
displayed in the right-hand corner of Figure 2.2. The relative importance, as shown at the top of
Figure 2.2, is moderately more important, strongly more important, very strongly more important, extremely more important, or a degree within the range. Based upon the user’s verbal judgments,
Expert Choice will calculate the relative importance on the following scale, based on Saaty’s work:
1. Equal importance
3. Moderate importance of one over another
5. Essential or strong importance
7. Very strong importance
9. Extremely important
2,4,6,8. Intermediate values between the two adjacent judgments
This procedure is followed to obtain relative priorities of the criteria in which eigenvalues are
calculated based upon pairwise comparisons of one criterion versus another, as discussed in
the previous section. These pairwise comparisons are made for each of the criteria and subcriteria. Figures 2.2 and 2.3 show the priorities after the pairwise comparisons are made.
Also, an inconsistency index is calculated after each set of pairwise comparisons to show
how consistent the user’s judgments are. An overall inconsistency index is calculated at the
end of the synthesis as well. This measure is zero when all judgments are perfectly
consistent with one another and becomes larger when the inconsistency is greater. The
inconsistency is tolerable if it is 0.10 or less.
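The consistency check can be sketched concretely. In Saaty's method, λmax is estimated from the derived priorities, the consistency index is CI = (λmax − n)/(n − 1), and the consistency ratio CR = CI/RI divides CI by a random index RI tabulated for matrices of size n; CR ≤ 0.10 is tolerable. The code below is an illustrative implementation, not Expert Choice's; the RI values are Saaty's commonly tabulated ones.

```python
# Sketch of Saaty's consistency ratio for a reciprocal comparison matrix.
RANDOM_INDEX = {3: 0.58, 4: 0.90, 5: 1.12, 6: 1.24, 7: 1.32}

def priorities(A, iters=100):
    """Principal eigenvector of A by power iteration, normalized to sum to 1."""
    n = len(A)
    w = [1.0 / n] * n
    for _ in range(iters):
        w = [sum(A[i][j] * w[j] for j in range(n)) for i in range(n)]
        s = sum(w)
        w = [x / s for x in w]
    return w

def consistency_ratio(A):
    n = len(A)
    w = priorities(A)
    Aw = [sum(A[i][j] * w[j] for j in range(n)) for i in range(n)]
    lam = sum(Aw[i] / w[i] for i in range(n)) / n   # estimate of lambda_max
    ci = (lam - n) / (n - 1)                        # consistency index
    return ci / RANDOM_INDEX[n]                     # consistency ratio

A = [[1.0,   3.0,   5.0],
     [1/3.0, 1.0,   3.0],
     [1/5.0, 1/3.0, 1.0]]
cr = consistency_ratio(A)
```

For this hypothetical matrix the ratio is about 0.03, comfortably inside the 0.10 tolerance.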
After the criteria and subcriteria are weighted, the next step involves pairwise comparisons between the alternatives and the subcriteria. For example, one question would be: “With respect to SYMBOLIC, are MACROECO and NUCLEAR equally preferable?” After all the pairwise comparisons have been entered, Expert Choice performs synthesis by adding the global priorities (global priorities indicate the contribution to the overall goal) at each level of the tree hierarchy. Figure 2.5 shows the synthesis of the results and, finally, the ranking of the alternatives. In this example, after taking all the pairwise comparisons into account, the best problem to select for expert systems development is the BID/NO (i.e., develop an expert system for determining whether a company should bid on a request for proposal). Its priority is .578, followed by MACROECO (.266) and, last, NUCLEAR (.156). The overall inconsistency index is 0.06, which is within the tolerable range.
Through this methodology, the best problem for expert systems development, given these three alternatives, is to determine whether or not to bid on a request for proposal. Expert Choice also allows for sensitivity analysis, if the user so desires.
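The synthesis step just described can be sketched as a weighted sum: each alternative's global priority is the sum, over all criteria, of the criterion's weight times the alternative's local priority under that criterion. The weights below are hypothetical placeholders, not the actual values behind Figure 2.5.

```python
# Sketch of AHP hierarchical synthesis with hypothetical weights.
criterion_weights = {"PRO TYPE": 0.5, "EXPERT": 0.3, "DOM PERS": 0.2}
local = {  # local priority of each alternative under each criterion
    "PRO TYPE": {"MACROECO": 0.2, "NUCLEAR": 0.1, "BID/NO": 0.7},
    "EXPERT":   {"MACROECO": 0.3, "NUCLEAR": 0.2, "BID/NO": 0.5},
    "DOM PERS": {"MACROECO": 0.4, "NUCLEAR": 0.1, "BID/NO": 0.5},
}

def synthesize(weights, local):
    """Global priority of each alternative = sum of weight * local priority."""
    alternatives = next(iter(local.values())).keys()
    return {a: sum(weights[c] * local[c][a] for c in weights)
            for a in alternatives}

global_priorities = synthesize(criterion_weights, local)
best = max(global_priorities, key=global_priorities.get)
```

With these placeholder judgments the global priorities still sum to 1, and BID/NO again comes out on top, mirroring the ranking in the text.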
Other Approaches for Expert System Problem Selection
If one does not want to use the AHP/Expert Choice approach to select an expert system problem, another technique is to develop a checklist of important problem criteria based on those in Section 2.1 and see how many of them fit the problem under consideration. This is an unsophisticated approach, but it is effective and is probably the one most knowledge engineers use when selecting a possible problem task for expert systems development.
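The checklist approach can be sketched in a few lines. The criteria names and the 80 percent acceptance threshold below are illustrative assumptions, not a standard prescribed by the text.

```python
# Minimal sketch of checklist-based problem screening.
CRITERIA = ["symbolic processing", "test cases available", "well bounded",
            "performed frequently", "written materials exist",
            "cognitive skills only", "experts agree",
            "expert exists and is cooperative", "management support"]

def checklist_score(satisfied):
    """Return the fraction of criteria met and a go/no-go recommendation."""
    met = sum(1 for c in CRITERIA if c in satisfied)
    fraction = met / len(CRITERIA)
    return fraction, fraction >= 0.8   # assumed acceptance threshold

score, go = checklist_score({"symbolic processing", "test cases available",
                             "well bounded", "performed frequently",
                             "written materials exist", "cognitive skills only",
                             "experts agree",
                             "expert exists and is cooperative"})
```

Here eight of nine criteria are met, so the candidate problem passes the (assumed) threshold.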
Cost-benefit analysis should also be conducted to determine whether it is technically, economically, and operationally feasible and wise to develop an expert system for a particular problem. Costs include the expert’s time, as well as that of the knowledge engineer [1,4]. Additional costs include possible acquisition of hardware, possible acquisition of software like expert system shells, overhead, the expert’s travel and lodging expenses, and computing time. The benefits of an expert system might include reduced costs, increased productivity, increased training productivity and effectiveness, preservation of knowledge, enhanced products or services, or even the development of new products and services [4,26,27,28,29].
The costs and benefits between the status quo and the proposed expert system could then be compared. Expert Choice could even be used for this cost-benefit analysis to see if the expert system would be more cost-effective than the status quo.
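A minimal sketch of such a comparison follows; every cost and benefit figure here is a hypothetical placeholder, and the line items merely echo the categories listed above.

```python
# Hypothetical cost-benefit comparison: status quo vs. proposed expert system.
def net_benefit(costs, benefits):
    """Annual net benefit = total benefits minus total costs."""
    return sum(benefits.values()) - sum(costs.values())

status_quo = net_benefit(
    costs={"manual analysis labor": 120_000},
    benefits={"bids won": 150_000})

expert_system = net_benefit(
    costs={"expert time": 30_000, "knowledge engineer": 60_000,
           "shell license": 5_000, "hardware": 10_000},
    benefits={"bids won": 150_000, "faster turnaround": 40_000,
              "knowledge preserved": 15_000})

prefer_expert_system = expert_system > status_quo
```

Under these made-up figures the expert system's net benefit exceeds the status quo's, so the project would be recommended.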
2.4 Conclusion
Problem selection is a critical step in the expert systems development process.
Selecting the wrong problem or failing to reduce the problem to a manageable size
will create problems later when constructing the expert system.
The guidelines for problem selection criteria presented in this chapter should
be followed closely when considering the type of problem, domain expert, and
domain area personnel. The AHP is a useful structured methodology for
incorporating these criteria in selecting an expert system problem. Expert Choice
easily facilitates the use of the AHP. Whether this structured methodology or some
other approach is used in the problem selection process, the important point is to
make sure that the method used will help to identify the right problem for expert
systems development.
Knowledge Acquisition
3.1 Knowledge Acquisition Problems
3.2 How to Reduce the Expert’s Boredom
3.3 Possible Solutions to the Knowledge Acquisition Bottleneck
3.4 In the Future
Acquiring knowledge from an expert for expert systems development is a
difficult process. One reason is that an expert, talking continuously, will produce
about 10,000 words an hour; this is equivalent to 300 to 500 pages of transcript
from a single day’s session with an expert. This voluminous information must
somehow be digested by the knowledge engineer and appropriately represented
for expert systems development. In addition, the knowledge engineer, when
interviewing the expert, must pay attention to the expert’s intonation and use of
words, which might influence the meaning of some of the information.
There are many problems in extracting information from an expert. This
chapter will survey the problems that may be encountered during the knowledge
acquisition process. Then some possible solutions to these problems will be presented.
3.1 Knowledge Acquisition Problems
Knowledge acquisition is the most difficult part of expert systems development. There are
numerous problems associated with this process. The following are some of the major
problems that may be incurred:
• Human biases in judgment on the part of the knowledge engineer and expert, which might inhibit transmission of the correct information. These biases include the following:
  – Recency—people are influenced by the most recent events.
  – Availability—people use only the information that is available to them.
  – Imaginability—people use information only in the form that is presented to them.
  – Correlation—people make correlations where none exist.
  – Causality—people assign causes where none exist.
  – Anchoring and adjustment—people use an anchor point and argue around that point.
  – Statistical intuition—people do not fully understand the effects of variance and sample size.
• The expert could cancel or defer knowledge acquisition appointments, fail to answer questions, or neglect to supply information.
• It is difficult for the knowledge engineer to extract, and for the expert to convey, knowledge and heuristics that have been acquired over many years of professional experience; for example, information that the expert considers common sense may not be so to the knowledge engineer.
• The expert might, consciously or unconsciously, use the knowledge engineer to experiment with different models of the knowledge domain that he or she has developed.
• The process of knowledge elicitation requires many hours from an expert who is already busy and has many demands on his or her time.
• Some knowledge engineers may not be good at interviewing, causing them to interrupt, not listen to the way the expert uses knowledge, misinterpret information, or not ask the right questions.
• Some knowledge engineers might start feeling expert and then think that they are the experts.
• Even an expert may not be right 100 percent of the time.
• A knowledge engineer may not listen fully to the language that the expert uses to represent his or her experience; the knowledge engineer should be sensitive to auditory thoughts, visual thoughts, sensory memories, or feelings.
• The knowledge engineer may assume or project his or her favored modes of thinking into the expert’s verbal reports.
• The knowledge engineer may not be cognizant of the body language that the expert is using.
• It might be difficult to uncover the expert’s ability that is hidden at the gut level.
• The knowledge engineer may not be organized in his or her approach to eliciting knowledge from the expert.
• The knowledge engineer may not be skilled in or knowledgeable about the different methods that can be used to extract the expert’s knowledge. Some of these methods are as follows:
  – Method of familiar tasks—analyze the tasks that the expert usually performs.
  – Structured and unstructured interviews—the expert is queried with regard to knowledge of facts and procedures.
  – Limited information tasks—a familiar task is performed, but the expert is not given certain information that is typically available.
  – Constrained processing tasks—a familiar task is performed, but the expert must work under time or other constraints.
  – Method of tough cases—analysis of a familiar task is conducted for a set of data that presents a tough case for the expert.
• Some knowledge engineers may not be very familiar with the domain and may not know what questions to ask or understand what the expert is saying.
• The expert may not be cooperative or articulate.
These are typical problems that may surface during the knowledge
acquisition process. One way to help ensure success during the knowledge
acquisition process is to make sure that the problem selected is appropriate and
well scoped for expert systems development. Additionally, the knowledge engineer
should become knowledgeable about the domain before interacting frequently
with the expert. The expert selected should be willing to participate, cooperative,
and articulate.
The next sections survey solutions that are being developed to improve the
knowledge acquisition process .
3.2 How to Reduce the Expert’s Boredom
It is said that those closest to the technology may have a clouded view of where it is going. They may be too close to the technology to take a step back and see the forest for the trees. The same situation may be true of the knowledge engineer
during the sessions with the expert. The knowledge engineer may be caught up in
questioning the expert, without realizing that the expert is bored with the process.
This issue was recently raised by a domain expert who was being interviewed by a
first-time knowledge engineer, who kept telling the expert to give him information
in terms of IF-THEN rules. After about an hour of this exercise, the domain expert
became bored and frustrated, and eventually withdrew from the project.
The question, then, is how to reduce the chance of the expert’s becoming
bored during the knowledge acquisition process. There are 10 methods that may
be used:
1. The knowledge engineer needs to vary the methods used in
acquiring knowledge. Alternatives include scenario building, observation, and
limited information task interviewing to allow flexibility on the part of the expert to
explain his or her reasoning. Using a variety of methods should prevent the expert
from getting into a rut.
2. Each knowledge acquisition session should last no more than 2
hours. Studies have shown that this time limit is optimal.
3. It is helpful to have two knowledge engineers present when
interviewing the expert. Thus, during the questioning/listening process, the expert
can bounce ideas off both of them, instead of dealing with the same person every
day.
4. It might be helpful to deal with the expert away from the office, in
an informal setting. Meeting at a restaurant or pub might make the expert more
relaxed and at ease in answering the knowledge engineer’s questions.
5. Let the expert explain his or her reasoning by running through
typical scenarios. Asking about familiar tasks will make the expert feel more
comfortable with the interviewing process. Later on, in order to obtain some of the
heuristics, the method of tough cases, constrained time, or limited information
tasks might be used.
6. Don’t require or force the expert to reason or talk in a certain
way, such as in the form of IF-THEN rules. This will be unnatural, awkward, and
annoying to the expert. The knowledge engineer should listen to the way the
expert is using his or her knowledge and should later determine the best way for
representing it.
7. Early on, show the expert the interactive expert system in order
to capture the expert’s attention. In this manner, the expert can better visualize the
expert system instead of looking at hard copies of rules. This will also show the
expert that his or her time is not being wasted; there is a substantive result of the
knowledge acquisition sessions. One caveat, however, is to be careful in showing
the system to the expert too early. If there is not much in the system, the expert
may consider it trivial and meaningless.
8. The knowledge engineer needs the expert to feel ownership of
the expert system. One way to do this is to name the system after the expert or
include the expert’s name on the opening screen as the “expert consultant.”
9. According to Earl Sacerdoti at Copernican, let the expert do his or her normal daily activities, as well as spend up to 2 hours a session on
knowledge acquisition. With this arrangement, the expert will not feel that the
knowledge engineer is monopolizing his or her time.
10. As a corollary to the previous guideline, remember that, typically, 1 day of the expert’s time for every 4 days of the knowledge engineer’s time will be
needed to develop the expert system. This formula should be kept in mind in
projecting the expert’s involvement in the expert system project.
Hopefully, by following these guidelines, the knowledge engineer will create a fruitful
and enjoyable relationship with the expert. Ultimately this should lead to an improved
chance of building and implementing a successful expert system.

3.3 Possible Solutions to the Knowledge Acquisition Bottleneck
Traditional ways of acquiring knowledge have been based on interviewing techniques, scenario building, questionnaires, and protocol analysis [2,3,8,9].
Interviewing, that is, asking questions, is the most frequently used technique.
Scenario building is a descriptive technique in
which the knowledge engineer describes a scenario about the domain and then
records how the expert solves the problem. Questionnaires are sometimes used to obtain specific information, particularly if the expert has limited time that day. Protocol analysis involves identifying phrases with high information content, grouping these phrases into areas of knowledge, showing the interrelationships between them, and then representing this knowledge.
Quinlan [10] claims that there are three general knowledge acquisition methods: the descriptive, the intuitive, and the observational. He explains each method and its weakness as follows:
• Descriptive method—the expert tells the knowledge engineer how he or she solves a problem, either one from past experience or a new one in the domain.
  – Weakness—the knowledge is so “compiled” that the expert cannot describe the process accurately.
• Intuitive method—through introspection, the expert tries to build theories about his or her own problem-solving methods and incorporate them directly into a computer system.
  – Weakness—frequently, the expert will, perhaps unknowingly, resort to the reconstructive method of creating plausible reasoning that may not reflect the actual technique.
• Observational method—as the expert works on a problem, the knowledge engineer records the expert’s problem-solving method based on “thinking-aloud protocols,” the expert talking to himself or herself while working.
  – Weakness—the expert frequently leaves huge gaps in the description of the process.
Quinlan [10] also describes the following specific knowledge acquisition techniques:
• On-site observation—staying in the background and watching as the expert handles real problems on the job; there is no interruption from the knowledge engineer.
• Problem discussion—an informal talk about a set of representative problems.
• Problem description—the expert describes a typical case for each type of problem that may arise; this is especially useful for diagnostic problems with relatively few solutions.
• Problem analysis—the expert solves problems while the knowledge engineer questions each step of the process.
• System refinement—the knowledge engineer solves problems based on the concepts and strategies learned from the expert; then the expert critiques the solutions to help in further refinement.
• System examination or inspection—the expert directly examines and critiques the prototype’s rules and control strategies.
• System validation or review—comparing solutions achieved by the prototype system and by the expert to those developed by other experts.
Besides these direct methods for acquiring knowledge, there are also indirect methods. Olson et al. describe some of them:
• Multidimensional scaling (MDS)—the expert provides similarity judgments on all pairs of objects or concepts in the domain of inquiry; these judgments are assumed to be symmetric and graded (i.e., A is as similar to B as B is to A), and the similarities are assumed to take on a variety of continuous values, not just 0 or 1.
• Johnson hierarchical clustering—this technique begins with a half-matrix of similarity judgments and assumes that an item either is or is not a member of a cluster.
• General weighted networks—the expert gives symmetric distance judgments on all possible pairs of objects; these distances are assumed to arise from the expert’s traversal of a network of associations.
• Ordered trees from recall—this technique is built on a model of how the data are produced by the expert; it assumes that people recall all items from a stored cluster before recalling items from another cluster.
• Repertory grid analysis—this includes an initial dialog with the expert, a rating session, and analyses that cluster both the objects and the dimensions on which the items are rated.
To improve these knowledge acquisition techniques, two main avenues of research have emerged. The first technique is an induction system in which expert system rules are inferred automatically from collections of data and facts or case histories. Quinlan’s ID3/ID4 algorithms follow this line of research. The second technique consists of automated knowledge acquisition aids or intelligent editors that the expert can use to elicit his or her own knowledge. An example of this approach is ETS (Expertise Transfer System)/AQUINAS, developed by Boeing Computer Services. This approach uses a rating/repertory grid technique based on personal construct theory to elicit knowledge from an expert. Based on this knowledge, ETS then constructs the knowledge base structure and will accordingly infer the rules. Other automated knowledge acquisition systems are currently being worked on, like KREME, KRITON, MOLE, KNACK, YAK, SALT, and INFORM. Still other aids are being developed that only partially involve the expert. This new version of the process, which involves intelligent text retrieval, has been worked on by companies such as ICF/Phase Linear, Inc.
ICF has developed the Knowledge Acquisition Module (KAM), a PC-based program
that scans text and creates rules, relationships, IF-THEN statements, and heuristics
based on constraints set by the user or expert. KAM can even scan the text of an
interview with an expert by a knowledge engineer and create rules.
Work is also underway to develop knowledge acquisition tools based on
learning by analogy. ThinkBack is one such tool; its core is an analogical inference
engine and its connecting control agent. ThinkBack is a convenient means of
collecting verbal protocols and has helped human problem solvers during analogy
sessions. Lenat’s CYC is being built as a large knowledge base of real-world facts and
heuristics, as well as methods for reasoning efficiently about the knowledge base.
Hopefully, it will be able to use commonsense knowledge and reasoning to draw
analogies and overcome the knowledge acquisition bottleneck. Acquire (by Acquired Intelligence Inc., Canada) is another knowledge acquisition tool. It is
commercially available, PC based, and acts as an interactive knowledge acquisition
aid for certain kinds of tasks. Other automated knowledge acquisition tools are
Nextra (Neuron Data), AC2 (ISoft, France), and KnAcq tool (England).
3.4 In the Future
Work will continue to develop methods and tools to overcome
the knowledge acquisition bottleneck. There are two schools of
thought on how to address the knowledge acquisition bottleneck.
One side believes that the expert knows the problem area best and
should therefore be the direct link to the software. The other side
believes that since so much of an expert’s ability is hidden at the gut
level, intense interviewing and probing by an informed outsider (i.e.,
the knowledge engineer) is needed to get to the heart of the matter.
Additionally, there is an overriding “knowledge engineering paradox” that states: “The more expert an expert is, the more compiled is his or her knowledge and the more difficult it is to extract this knowledge.” This attitude emphasizes the knowledge engineer’s role
in eliciting knowledge from the expert. There is an ongoing debate on
this issue.
This chapter has presented a survey of the problems in
knowledge acquisition and possible solutions for overcoming them.
With continued efforts in both basic and applied research, this
problem will be greatly minimized in the future.
Knowledge Representation and Inferencing

4.1 Predicate Calculus
4.2 Production Rules
4.3 Frames
4.4 Scripts
4.5 Semantic Networks
4.6 Case-Based Reasoning
4.7 Object-Oriented Programming
4.8 Control Structure in an Expert System—Inference Engine
4.9 Conclusions
After acquiring knowledge from the expert, the next step in building an expert system is
deciding on the knowledge representation approach. According to Duda and Gaschnig:
The power of the expert system lies in the specific knowledge of the problem domain,
with potentially the most powerful systems being the ones containing the most
knowledge.
This suggests that the knowledge base, which is the set of domain facts and heuristics, is
probably the most important component of the expert system.
In deciding which knowledge representation method to incorporate in the expert
system, a good rule of thumb is to select the one that seems most natural to the expert.
In other words, the knowledge should be represented in the expert system in the same
way the expert uses it when explaining a domain or task to the knowledge engineer. For
example, suppose that an expert system is being developed to determine whether one
has a cold. In an interview, the expert on colds presents some knowledge in the following
manner:
If one has a runny nose, watery eyes, and a sore throat, then there is a very good
likelihood that the individual has a cold. Of course, this is not certain, as the
person might have an allergy.
This description suggests two conclusions. One conclusion is that the expert is
thinking of IF-THEN or SITUATION-ACTION kinds of rules. Thus, the use of
production rules which will be explained later, might be an appropriate way of
representing the knowledge for that expert system. The other conclusion is that
the expert is thinking in terms of probability or possibility. Such terms as “a very good likelihood” suggest that the expert system should incorporate uncertainty in this case. It turns out that, in most cases, expert system tasks involve uncertainty.
There are five major ways of representing knowledge in an expert system:
predicate calculus, production or inference rules, frames, scripts, and semantic
networks. Each of these methods will be discussed.
4.1 Predicate Calculus
Predicate calculus is one way of representing knowledge in an expert system. This is a reasoning approach built on formal logic that incorporates mathematical properties (transitive and associative laws). It is made up of constants (called terms), predicates (called atomic formulas), functions (called mappings), and logical connectives (∧ for and, ∨ for or, → for implies, and ~ for not). For example, the statement “All football players are big” can be expressed in predicate calculus as follows:
(ALL (X) (IF (IS-A X FOOTBALL-PLAYER) (BIG X)))
This means that for all x, if x is a football player, then x is big.
Let’s take another example, derived from Rich and Knight and Kaisler, which uses
predicate calculus to represent knowledge in a knowledge base. Suppose that we have
the following facts, with their predicate calculus representations:

1- Harry is a man.
MAN(HARRY)
2- Harry is a tennis player.
TENNISPLAYER(HARRY)
3- All tennis players are athletes.
(FORALL X) [TENNISPLAYER(X) → ATHLETE(X)]
4- Bob is a coach.
COACH(BOB)
5- All athletes either obey or disobey the coach.
(FORALL X) [ATHLETE(X) → OBEYS(X, COACH) OR DISOBEYS(X, COACH)]
6- Everyone is loyal to someone.
(FORALL X) (EXISTS Y) LOYALTO(X, Y)
7- Athletes only disobey coaches they aren’t loyal to.
(FORALL X)(FORALL Y) [ATHLETE(X) AND COACH(Y) AND DISOBEYS(X, Y)] → NOT LOYALTO(X, Y)
8- Harry was disobedient to Bob.
DISOBEYS(HARRY, BOB)
If we want to prove “Is Harry loyal to Bob?”, the following proof could be done using
predicate calculus:
SHOW: Is Harry loyal to Bob? That is, prove NOT LOYALTO(HARRY, BOB).
NOT LOYALTO(HARRY, BOB)
  ← ATHLETE(HARRY) AND COACH(BOB) AND DISOBEYS(HARRY, BOB)   (7, substitution)
  ← ATHLETE(HARRY) AND DISOBEYS(HARRY, BOB)                  (4)
  ← ATHLETE(HARRY)                                           (8)
  ← true                                                     (2, 3, substitution)
True, so Harry is not loyal to Bob.
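The same conclusion can be reached by a tiny forward-chaining program over these facts and rules. The tuple encoding and the two hand-translated rules below are my own illustrative rendering of the example, not a general predicate-calculus prover.

```python
# Forward chaining over the Harry/Bob example. Facts are tuples:
# ("PREDICATE", arg1[, arg2]). Only rules 3 and 7 are needed for the proof.
facts = {("MAN", "HARRY"), ("TENNISPLAYER", "HARRY"),
         ("COACH", "BOB"), ("DISOBEYS", "HARRY", "BOB")}

def chain(facts):
    """Apply the rules until no new facts can be derived."""
    facts = set(facts)
    while True:
        new = set()
        for f in facts:
            # Rule 3: all tennis players are athletes.
            if f[0] == "TENNISPLAYER":
                new.add(("ATHLETE", f[1]))
            # Rule 7: athletes only disobey coaches they aren't loyal to.
            if (f[0] == "DISOBEYS" and ("ATHLETE", f[1]) in facts
                    and ("COACH", f[2]) in facts):
                new.add(("NOT-LOYAL-TO", f[1], f[2]))
        if new <= facts:          # fixed point reached: nothing new derived
            return facts
        facts |= new

result = chain(facts)
```

After chaining, the derived facts include ATHLETE(HARRY) and NOT-LOYAL-TO(HARRY, BOB), matching the hand proof.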

A major drawback to predicate calculus is that it provides only a skeleton of a representation scheme. Its main disadvantage is that the relevant information is not collected together, as it is under frames or semantic networks.
4.2 Production Rules
Another technique for representing knowledge is through production rules.
This is probably the most popular approach for knowledge representation in
expert systems. Production rules were popularized by Newell and Simon and by
Davis and King. They are used for procedural representation—knowledge that
can be executed. Production rules take the form IF (antecedent) THEN
(consequent) or SITUATION-ACTION. The rules also can have some measure of
uncertainty associated with them, as will be explained in the next chapter. They
have been used extensively in expert systems, particularly those created for
diagnosis and planning.
A typical production rule in MYCIN, an expert system used to diagnose bacterial
infections in the blood, is as follows:

IF:   1. The site of the culture is blood, and
      2. The identity of the organism is not known with certainty, and
      3. The stain of the organism is gramneg, and
      4. The morphology of the organism is rod, and
      5. The patient has been seriously burned,
THEN: There is weakly suggestive evidence (.4) that the identity of the organism is pseudomonas.
In order for the consequent of this rule to be true, each of the antecedents
must be true. If one of the antecedents is false, then the consequent, using this rule,
would not be concluded. Another production rule from R1, an expert system used to
configure VAX computers (now called XCON), is as follows:
IF:   1. The current context is assigning devices to unibus, and
      2. There is an unassigned dual port disc drive, and
      3. The type of controller it requires is known, and
      4. There are two such controllers,
      5. Neither of which has any devices assigned to it, and
      6. The number of devices that these controllers can support is known,
THEN: Assign the disc drive to each of the controllers and note that the two
      controllers have been associated and that each supports one device.
According to Reggia and Perricone and Software Architecture and Engineering, there
are three criteria to use in selecting a knowledge representation approach: (1)
preexisting format of the knowledge, (2) type of classification desired, and (3)
context dependence of the inference process. Production rules are usually used
when the preexisting format of the knowledge is already organized as rules or
expressed in terms of rules when the expert is explaining the task to the knowledge
engineer. In this case, using production rules keeps the knowledge in the
same form presently used, thus creating intuitive appeal. Production rules are also
used when the classification of knowledge is predominantly categorical.
If most of the decisions in the expert system task can be answered by “yes” or “no,”
then production rules would be appropriate. Last, if the knowledge has little
context dependence, then production rules are a good form for representing it
because there is not much descriptive knowledge. For descriptive knowledge, other
knowledge representation methods, such as frames, are better.
There are several advantages to using production rules. First, rules are a
natural expression of what-to-do knowledge, that is, procedural knowledge.
Second, all knowledge for a problem is uniformly presented as rules. Third, rules
are comprehensible units of knowledge. Fourth, rules are modular units of
knowledge that can be easily deleted or added. Last, rules may be used to
represent how-to-do knowledge, that is, metaknowledge. Metaknowledge refers to
knowledge about knowledge and can be represented as metarules. A metarule is a
production rule that controls the application of object-level knowledge. It gives
another layer of sophistication to the expert system because it adds control
over the search space, helping decide what to do next. A disadvantage of
production rules is that there is a limit to the amount of knowledge that can be
expressed conveniently in a single rule. This is not a severe limitation because even
when using a microcomputer-based expert system shell like Exsys, a rule can have
up to 126 conditions in the IF part and up to 126 conditions in its THEN part.
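The IF-THEN pattern described above can be sketched as a tiny forward-chaining interpreter. This is an illustrative sketch, not MYCIN's or Exsys's actual engine: the function names are invented, the rules are simplified, and real systems also attach certainty measures to each rule.

```python
# Minimal forward-chaining production rule interpreter (illustrative sketch).
# Each rule is a pair (antecedents, consequent); a rule fires only when
# every antecedent is already present in working memory.

def forward_chain(rules, facts):
    facts = set(facts)
    changed = True
    while changed:
        changed = False
        for antecedents, consequent in rules:
            if consequent not in facts and all(a in facts for a in antecedents):
                facts.add(consequent)   # fire the rule, assert its consequent
                changed = True
    return facts

# Hypothetical rule in the spirit of the MYCIN example above
rules = [
    ({"site is blood", "stain is gramneg", "morphology is rod",
      "patient seriously burned"},
     "organism may be pseudomonas (weakly suggestive, .4)"),
]

facts = {"site is blood", "stain is gramneg",
         "morphology is rod", "patient seriously burned"}
print(forward_chain(rules, facts))
```

Because every antecedent must be in working memory before the rule fires, dropping any one of the four facts leaves the consequent unconcluded, mirroring the behavior described for the MYCIN rule.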
4.3 Frames
A third knowledge representation method used in expert systems is frames.
Frames, developed by Minsky and Kuipers, are used for declarative knowledge.
Declarative knowledge, in contrast to procedural knowledge, is knowledge that
can’t be immediately executed but can be retrieved and stored. Frames were
developed because there was evidence that people do not analyze new situations
from scratch and then build new knowledge structures to describe those situations.
Instead, people use analogical reasoning and take a large collection of structures,
available in memory, to represent previous experience with objects, locations,
situations, and people [2]. Frames (1) contain information about many aspects
of the objects or situations they describe; (2) contain attributes that must be true
of objects that will be used to fill individual slots; and (3) describe typical instances
of the concepts they represent. Frames are used in situations where there is a large
amount of context dependence, implying the use of descriptive knowledge. Frames
are represented like cookbook recipes, where “slots” are
filled with the ingredients needed for the recipe, and then procedural attachments
(e.g., if-added, if-needed, and to-establish procedures) are used to manipulate the
data (i.e., to fill the slots) within and among the frames, such as going through the
steps on how to actually “prepare” the “recipe.” Default values may be provided
with frames.
Each frame corresponds to one entity and contains a number of
labeled slots for things pertinent to that entity. Slots, in turn, may be
blank, or may be specified by terminals referring to other frames, so
that the collection of frames is linked together in a network. This
keeps the knowledge modular and accessible.
Attempts to design general knowledge structures based on the frame
concept were made by Bobrow and Winograd via the Knowledge
Representation Language and by Roberts and Goldstein via the Frame
Representation Language.
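The slot-and-filler idea, with defaults inherited from a parent frame and an if-needed procedural attachment that fills a slot on demand, might be sketched as follows. The `Frame` class and the room/kitchen slots are invented for illustration.

```python
# Illustrative frame: named slots, defaults inherited from a parent frame,
# and an "if-needed" procedure that fills a slot only when it is asked for.

class Frame:
    def __init__(self, name, parent=None, **slots):
        self.name = name
        self.parent = parent
        self.slots = slots
        self.if_needed = {}        # slot -> procedure computing its value

    def get(self, slot):
        if slot in self.slots:                 # slot filled locally
            return self.slots[slot]
        if slot in self.if_needed:             # procedural attachment
            value = self.if_needed[slot](self)
            self.slots[slot] = value           # cache the computed filler
            return value
        if self.parent is not None:            # inherit a default value
            return self.parent.get(slot)
        return None

# Hypothetical example: a generic "room" frame and a specific instance
room = Frame("room", walls=4, has_door=True)
kitchen = Frame("kitchen", parent=room, appliances=["stove", "sink"])
kitchen.if_needed["area"] = lambda f: 3 * 4    # computed only when asked

print(kitchen.get("walls"))   # inherited default -> 4
print(kitchen.get("area"))    # filled by the if-needed procedure -> 12
```

The lookup order (local slot, then if-needed procedure, then parent) is one common design; real frame languages such as KRL offered richer attachment types (if-added, to-establish) along the same lines.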
4.4 Scripts
A special kind of frame is sometimes called a script. Clusters of facts can
have useful special-purpose structures that exploit specific properties of their
restricted domains. A script, developed by Schank and Abelson in 1977, is such a
structure that describes a stereotyped sequence of events in a particular context.
The components of a script include the following:
• Entry conditions
• Results—conditions that will generally be true after the events described in the
  script have occurred.
• Props—slots representing objects.
• Roles—slots representing people.
• Track—specific variation on a more general pattern represented by a particular
  script.
• Scenes—actual sequences of events that occur.
In part of a Restaurant Script, the track can be a coffee shop. The entry
conditions are given where the customer is hungry and has money. The scenes of
entering the coffee shop, ordering, eating, and exiting are displayed. The results of
this script are that the customer has less money, is not hungry (hopefully!), and is
pleased (optional). Another result is that the owner of the coffee shop has more
money. Scripts are helpful in situations where there are many causal relationships
between events. Figure 4.1 shows examples of the knowledge representation
methods discussed.
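The components listed above can be collected into a small data structure. The restaurant values follow the example in the text; the props list is an assumption added for completeness.

```python
from dataclasses import dataclass

# A script as a stereotyped event sequence (after Schank and Abelson, 1977).
@dataclass
class Script:
    track: str
    entry_conditions: list
    props: list
    roles: list
    scenes: list
    results: list

restaurant = Script(
    track="coffee shop",
    entry_conditions=["customer is hungry", "customer has money"],
    props=["tables", "menu", "food", "money"],     # illustrative props
    roles=["customer", "owner"],
    scenes=["entering", "ordering", "eating", "exiting"],
    results=["customer has less money", "customer is not hungry",
             "owner has more money"],
)

print(restaurant.scenes)
```

A reasoner could match an observed event sequence against `scenes` to infer unobserved steps, which is the usual motivation for scripts.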
4.5 Semantic Networks
The last major way of representing knowledge in an expert system is by
semantic networks. Semantic networks were developed by Quillian and Raphael in
1968 and are used to represent declarative knowledge. With semantic networks,
knowledge is organized around the objects being described, but objects are
represented as nodes on a graph and relations among them are represented by
labeled arcs. A semantic network is a collection of nodes and arcs where the
following conditions occur:
• Nodes represent classes, objects, concepts, situations, events, and so on.
• Nodes have attributes with values that describe the characteristics of the thing
  they represent.
• Arcs represent relationships between nodes.
• Arcs allow us to organize knowledge within a network hierarchically.
For example, Figure 4.2 shows a fragment of a semantic network on computers.
This figure shows the following associations:
MICROCOMPUTER Isa COMPUTER
DISK DRIVE Ispart MICROCOMPUTER
IBM PC Isa MICROCOMPUTER
IBM PC Color BEIGE
IBM PC Isattached OKIDATA 192
IBM PC Owner ME
OKIDATA 192 Isa PRINTER
ME Isa PERSON

Figure 4.2 Fragment of a Semantic Network on Computers
This is only a fragment of a semantic network because we haven’t included all the
relevant nodes and arcs relating to microcomputers, nor have we indicated
those nodes and arcs associated with other kinds of computers, like
minicomputers, mainframes, superminicomputers, supercomputers, and
laptop-size computers.
The reasoning in a semantic network depends on the procedures used to
manipulate the network. The following steps are usually accomplished:
• Matching—match patterns against one or more nodes to retrieve information;
  heuristics may be needed to tell where to begin matching.
• Inference—derive general properties by examining a set of nodes for common
  features and relations.
• Deduction—follow paths through a set of nodes to derive a conclusion.
The main advantage of using semantic networks is that for each object,
event, or concept, all the relevant information is collected together. Semantic
networks are used to represent specific events or experiences, as well as for tasks
that have a large amount of context dependence.
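The associations above can be encoded as (node, arc, node) triples, with deduction performed by following Isa arcs through the network. The traversal below is one simple strategy among many; the data comes directly from Figure 4.2.

```python
# Semantic network as (node, relation, node) triples, using the
# associations from Figure 4.2.

triples = [
    ("MICROCOMPUTER", "Isa", "COMPUTER"),
    ("DISK DRIVE", "Ispart", "MICROCOMPUTER"),
    ("IBM PC", "Isa", "MICROCOMPUTER"),
    ("IBM PC", "Color", "BEIGE"),
    ("IBM PC", "Isattached", "OKIDATA 192"),
    ("IBM PC", "Owner", "ME"),
    ("OKIDATA 192", "Isa", "PRINTER"),
    ("ME", "Isa", "PERSON"),
]

def isa_chain(node):
    """Deduction by following Isa arcs to increasingly general classes."""
    chain = [node]
    while True:
        nxt = [t for s, r, t in triples if s == chain[-1] and r == "Isa"]
        if not nxt:
            return chain
        chain.append(nxt[0])

print(isa_chain("IBM PC"))   # ['IBM PC', 'MICROCOMPUTER', 'COMPUTER']
```

Following the Isa path lets IBM PC inherit anything asserted about MICROCOMPUTER or COMPUTER, which is the hierarchical organization the bullet list above describes.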
4.6 Case-Based Reasoning
Another paradigm of increasing interest is called case-based reasoning. Case based
reasoning is built on the premise of analogical reasoning, whereby one relies on
past episodes or experiences and modifies an old plan to fit the new situation. It
assumes a memory model for representing, indexing, and organizating past cases
and a process model for retrieving and modifying old cases and assimilating new
ones.
Many computer programs use case-based reasoning for problem solving or
interpretation: MEDIATOR and PERSUADER use cases to resolve disputes. CLAVIER
and KRITIK use case-based reasoning for design, HYPO uses cases for legal
reasoning, and MEDIC utilizes case-based reasoning for diagnosis. With today’s
growing interest in case-based reasoning, several case-based reasoning shells are
being sold commercially. These include ReMind (Cognitive Systems), Esteem
(Esteem Software, Inc.), and CBR Express (Inference Corporation). Many research
issues still need to be resolved to advance case-based reasoning. These include the
representation of episodic knowledge, memory organization, indexing, case
modification, and learning.
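A minimal retrieve-and-reuse cycle might look like this. The case features, the similarity measure, and the omission of the adaptation and learning steps are all simplifications for illustration; none of this reflects the internals of the systems named above.

```python
# Case-based reasoning sketch: retrieve the most similar past case,
# then reuse its solution for the new problem.

cases = [
    ({"fever": 1, "cough": 1, "rash": 0}, "treat as flu"),
    ({"fever": 0, "cough": 0, "rash": 1}, "treat as allergy"),
]

def similarity(a, b):
    # count of matching feature values (a deliberately crude measure)
    return sum(1 for k in a if a[k] == b.get(k))

def solve(new_problem):
    best_features, best_solution = max(
        cases, key=lambda c: similarity(new_problem, c[0]))
    return best_solution   # adaptation step omitted in this sketch

print(solve({"fever": 1, "cough": 1, "rash": 0}))
```

A fuller system would index cases for fast retrieval, adapt the retrieved solution to the differences between the old and new situations, and store the solved problem as a new case — exactly the open research issues listed above.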
4.7 Object-Oriented Programming
Object-oriented programming (OOP) is another popular paradigm that is gaining
worldwide interest. Object-oriented representation of knowledge can be used in
expert systems. Some of the main features of OOP involve the use of objects,
inheritance and specialization, and methods and message passing.
An object is a data structure that contains both data and procedures. The data
structures contained within an object are called attributes (or slots or
variables). Inheritance is the technique that ensures that the child of any object will
include the data structures and methods associated with its parents. Inheritance
means that a developer does not have to re-create slots or methods that have been
created. Specialization is the idea that one can specialize (or override) information
that is inherited. A method is a function or procedure attached to an object. A
method is activated by a message sent to the object. Message passing (or binding)
is the process involved in sending messages to objects.
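These features — objects, inheritance, specialization, methods, and message passing — can be illustrated in a few lines of Python; the class and method names here are made up for the example.

```python
# Object = data (attributes) + procedures (methods); inheritance gives the
# child its parent's attributes and methods; specialization overrides them.

class Vehicle:
    def __init__(self, wheels):
        self.wheels = wheels        # attribute (slot)

    def describe(self):             # method attached to the object
        return f"vehicle with {self.wheels} wheels"

class Car(Vehicle):                 # inheritance: Car is a child of Vehicle
    def __init__(self):
        super().__init__(wheels=4)

    def describe(self):             # specialization: override the parent
        return "car, " + super().describe()

car = Car()
print(car.describe())   # message passing: sending "describe" to the object
```

Calling `car.describe()` is the message-passing step: the message name is looked up on the object, the specialized method wins, and it can still delegate to the inherited one via `super()`.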
Pure object-oriented languages include Smalltalk, Simula 67, and Eiffel.
Hybrid object-oriented languages, which are built on an existing high-level
language, include C++, Turbo Pascal 5.5, and CLOS. It is expected that object-
oriented language and development tools, expert system tools, and current CASE
(computer-aided software engineering) tools will all merge into a single product by
the mid-1990s.
Soft Computing
Definition
According to Zadeh (who introduced this term), soft computing differs
from hard (conventional) computing in its tolerance for:
• Imprecision
• Uncertainty
• Partial truth
The role model of soft computing is the human mind, which has the ability to
reason and learn in an environment of uncertainty and imprecision.
What is Soft Computing
Achievement
Soft computing exploits this tolerance to achieve:
• Tractability
• Robustness
• Low-cost solutions
Soft computing techniques
• Neural Networks
• Fuzzy Logic
• Derivative-Free Optimization Methods:
  – Genetic algorithms
  – Simulated annealing
  – Random search
Individual methodologies and their attributes:

Methodology                           Attributes
Neural networks                       Learning
Fuzzy systems                         Knowledge representation through if-then rules
Genetic algorithms and                Systematic random search
simulated annealing
Conventional AI                       Symbolic manipulation

The first three items are the soft computing constituents.
Hybrid systems
• Hybrid intelligent systems integrate two or more of the technologies to
  provide a more effective problem solver.
• Examples of hybrids: Neuro-GA (NN + GA), Neuro-Fuzzy (NN + fuzzy logic),
  GA-Fuzzy (GA + fuzzy logic), and Neuro-Fuzzy-GA (all three).
From AI to soft computing
• Conventional AI symbolisms
• New trends of AI
• Is soft computing a part of AI?
Conventional AI symbolisms
• Conventional AI research focuses on an attempt to mimic human intelligent
  behavior by expressing it in language forms or symbolic forms.
• In practice, symbolic manipulation limits the application of AI because
  knowledge acquisition and representation are by no means easy.
New trends of AI
• More attention has been directed toward biologically inspired methodologies
  such as brain modeling, evolutionary algorithms, and immune modeling.
• These methodologies simulate biological mechanisms responsible for generating
  natural intelligence.
• These techniques are orthogonal to conventional AI and generally compensate
  for the shortcomings of symbolism.
Is soft computing a part of AI?
• Calling soft computing a part of modern AI depends on personal judgment.
• AI is steadily expanding, and the boundary between AI and soft computing is
  becoming indistinct.
AI in CDSS
• A clinical decision support system (CDSS or CDS) is interactive computer
  software designed to assist physicians and other health professionals with
  decision-making tasks, such as determining a diagnosis from patient data.
• CDSSs link health observations with health knowledge to influence health
  choices by clinicians for improved health care.
Role & Characteristics of CDSS
• A clinical decision support system has been described as an “interactive
  knowledge system which uses two or more items of patient data to generate
  case-specific advice.”

Purpose/Goal
• The main purpose of modern CDSS is to assist clinicians at the point
of care. This means that a clinician would interact with a CDSS to
help determine diagnosis, analysis, etc. of patient data
• Previous theories of CDSS were to use the CDSS to literally make
decisions for the clinician .
• The new methodology of using CDSS forces the clinician to interact with the
  CDSS, utilizing both his own knowledge and the CDSS, to make a better
  analysis of the patient’s data than either the human or the CDSS could make
  on their own.
Functions of Computer-Based Clinical Decision
Support Systems
• Alerting: highlighting out-of-range laboratory values
• Reminding: reminding the clinician to schedule a drug dose
• Interpreting: interpreting the electrocardiogram
• Predicting: predicting risk of mortality from a severity-of-illness score
• Diagnosing: listing a differential diagnosis for a patient with chest pain
• Assisting: tailoring the antibiotic choices for liver transplant and renal
  failure
• Suggesting: generating suggestions for adjusting the mechanical ventilator
Types of CDSS:
There are two types of CDSS:
1. Knowledge-Based (expert systems and fuzzy inference systems)
It consists of three parts:
o The knowledge base: contains the rules and associations of compiled data,
  which most often take the form of IF-THEN rules. If this were a system for
  determining drug interactions, then a rule might be: IF drug X is taken AND
  drug Y is taken THEN alert user.
o The inference engine: combines the rules from the knowledge base with the
  patient’s data (using logic) to come out with a recommendation or advice.
o The communication mechanism: allows the system to show the results to the
  user as well as have input into the system.
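The drug-interaction rule given above could be sketched as follows. The drug names are placeholders and a real knowledge base would be far richer; the three parts (knowledge base, inference engine, communication mechanism) map onto the three pieces of the code.

```python
# Knowledge base: pairs of interacting drugs
# (IF drug X is taken AND drug Y is taken THEN alert user).
interactions = {frozenset({"drug X", "drug Y"})}   # placeholder names

def inference_engine(patient_drugs):
    """Combine the rules with the patient's data to produce alerts."""
    alerts = []
    drugs = sorted(patient_drugs)
    for i in range(len(drugs)):
        for j in range(i + 1, len(drugs)):
            if frozenset({drugs[i], drugs[j]}) in interactions:
                alerts.append(f"ALERT: {drugs[i]} interacts with {drugs[j]}")
    return alerts

# Communication mechanism (here: simply printing the result to the user)
print(inference_engine({"drug X", "drug Y", "aspirin"}))
```

Using `frozenset` pairs makes the rule order-independent, so the alert fires whether the drugs are recorded as (X, Y) or (Y, X).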
2. Non-Knowledge-Based
This type uses a form of artificial intelligence called machine learning, which
allows computers to learn from past experiences and/or find patterns in clinical
data. Two types of non-knowledge-based systems are:
• Artificial neural networks
• Genetic algorithms
CDSS
  Knowledge-based CDSS
    Rule based (fuzzy logic, others)
    Evidence based
  Non-knowledge-based CDSS
    Neural networks
    Genetic algorithms

To close the gap between physicians and CDSSs, the evidence-based approach
appears to be an ideal technique. It has proved to be a very powerful tool for
improving clinical care and patient outcomes, with the potential to improve
quality and safety as well as reduce cost.
Characteristics of a Successful Rule-based System for CDSS
• Clear and available interventions
• For high-volume areas in the hospital
• Addresses problems with high mortality or morbidity
• In clinical areas of high importance to the organization
• Cost-benefits are clear
• Health care professionals are not overloaded with alerts
• Easy for clinicians to satisfy the alerts (e.g., simply ordering a test)
• Willingness of clinicians to accept the alert
• Timeliness of the alert
• Rules differ depending on the audience (nurse or physician, for example)
• Easiest rules to implement are those that satisfy clinicians and maintain
  their interest
• Need for partnership between clinical and IT groups
What Clinicians Want in a CDSS
• Efficient
• Not time consuming
• Alerts are triggered only for eligible patients
• Exceptions can be indicated
• Repetition is minimized
• User-friendly interface and presentation
• Easy to see alerts
• Easy to respond to alerts
• Content is accurate and robust
• Easy to access additional information from the alert
• Integrated into the workflow
• Alert appears at an appropriate time
• Alert appears to the appropriate person
Introduction to Neural Networks
6.1 Biological Basis
6.2 Neurocomputing Fundamentals
6.3 Neural Network Architectures
6.4 Learning Paradigms
6.5 Advantages and Limitations of Neural Networks
2/5/2019 95
6.1 Biological Basis
Biological Neural Networks
The brain is composed of over 100 different •
kinds of special cells called neurons.
The number of neurons in the brain is •
estimated to range from 50 billion to over 100
billion.
6.1 Biological Basis (cont.)
Neurons are divided into interconnected •
groups called networks and provide
specialized functions
Each group contains several thousand neurons •
that are highly interconnected with each
other
6.1 Biological Basis (cont.)
The brain can be viewed as a collection of
neural networks. A portion of a network
composed of two cells is shown in Figure 6.1.
Human intelligence is used to understand the •
various visual features that are extracted and
stored in memory.
6.1 Biological Basis (cont.)
The ability to learn from and react to changes •
in our environment requires intelligence.
An example is the optical path in visual •
systems. External stimuli are transformed via
cone cells and rod cells into signals that map
features of the visual image into internal
memory.
6.1 Biological Basis (cont.)
An artificial neural network (ANN) is a model •
that emulates a biological neural network.
The nodes in an ANN are based on the •
simplistic mathematical representation of
what we think real neurons look like.
Biological Basis (cont.)
Today’s neural computing uses a limited set of
concepts from biological neural systems to
implement software simulations of massively
parallel processes involving processing
elements (also called artificial neurons or
neurodes) interconnected in a network
architecture
Biological Basis (cont.)
The neurode is analogous to the biological •
neuron, receiving inputs that represent the
electrical impulses that the dendrites of
biological neurons receive from other
neurons.
The output of the neurode corresponds to a •
signal set out from a biological neuron over its
axon
6.1 Biological Basis (cont.)
The axon of the biological neuron branches to •
the dendrites of other neurons, and the
impulses are transmitted over synapses. A
synapse is able to increase or decrease its
strength, thus affecting the level of signal
propagation and is said to cause excitation or
inhibition of a subsequent neuron.
Artificial Neural Networks
The state of the art in neural computing is •
inspired by our current understanding of
biological neural networks .
However, despite the extensive research in •
neurobiology and psychology, important
questions remain about how the brain and the
mind work
The "basic" biological neuron: dendrites, soma, and axon with branches and
synaptic terminals.
Artificial Neural Networks (cont.)
Research and development in the area of ANN is •
producing interesting and useful systems that borrow
some features from the biological systems, even
though we are far from having an artificial brain-like
machine. The field of neural computing is in its
infancy, with much research and development
required in order to mimic the brain and mind.
However, many useful techniques inspired by the
biological systems have already been developed and
are finding use in real-world applications.
Artificial Neural Networks (cont.)
More recently, neural network development •
systems and tools have become commercially
available. As with expert systems, the
availability of a convenient development
method is allowing the spread of
neurocomputing and is putting
neurocomputing on the road to becoming part
of the standard repertoire of systems
developers.
6.2 Neurocomputing
Fundamentals
The key concepts needed to understand •
artificial neural networks will now be
discussed.
Neurode
An ANN is composed of basic units called artificial neurons, or neurodes, which
are the processing elements (PEs) in a network. Each neurode receives input
data, processes it, and delivers a single output. This process is
shown in Figure 6.2. The input can be raw data
or output of other PEs. The output can be the
final product or it can be an input to another
neurode.
Figure 6.2 Model of the artificial neurode: each input (the dendrites) is
multiplied by its weight (e.g., input 4 × weight 3 = 12), the products are
summed, and the sum is passed through an activation function f to produce the
output (the axon).
Networks
An ANN is composed of a collection of •
interconnected neurons that are often
grouped in layers; however, in general, no
specific architecture should be assumed. The
various possible neural network topologies are
the subject of research and development.
Networks
In terms of layered architectures, two basic •
structures are shown in Figure 6.3. In part (a)
we see two layers: input and output. In part
(b) we see three layers: input, intermediate
(called hidden), and output. An input layer
receives data from the outside world and
sends signals to subsequent layers.
Networks
The output layer interprets signals from the previous layer to produce a result
that is transmitted to the outside world; this result is the network’s
interpretation of the input data.
Figure 6.3 Taxonomy of ANN architectures and learning algorithms

Learning algorithms
  Discrete/binary input
    Supervised: simple Hopfield, outerproduct AM
    Unsupervised: ART-1
  Continuous input
    Supervised: delta rule, gradient descent, competitive learning, neocognitron
    Unsupervised: ART-3, SOFM
Architectures
  Supervised
    Recurrent: Hopfield, Boltzmann
    Feed-forward: backpropagation, ML perceptron
  Unsupervised
    Estimators: SOFM
    Extractors: ART-1, ART-2
Input
Each input corresponds to a single attribute of a •
pattern or other data in the external world. The
network can be designed to accept sets of input
values that are either binary-valued or continuously
valued. For example, if the problem is to decide
whether or not to approve a loan, an attribute can
be income level, age, and so on. Note that in
neurocomputing, we can only process numbers.
Therefore, if a problem involves qualitative attributes
or graphics, the information must be preprocessed to
a numerical equivalent before it can be interpreted
by the ANN.
Input (cont.)
Examples of inputs to neural networks are pixel •
values of characters and other graphics, digitized
images and voice patterns, digitized signals from
monitoring equipment, and coded data from loan
applications. In all cases, an important initial step is
to design a suitable coding system so that the data
can be presented to the neural networks, commonly
as sets of 1s and 0s. For example, a 6x8-pixel
character would be a 48-bit vector input to the
network.
Output
The output of the network is the solution to the
problem. For example, in the loan application case it
may be yes or no.
The ANN, again, will assign numerical values (e.g., + 1 •
means yes; zero means no). The purpose of the
network is to compute the value of the output. In the
supervised type of ANN, the initial output of the
network is usually incorrect and the network must be
adjusted or tuned until it gives the proper output.
Hidden Layers
In multilayered architectures, the inner (hidden) •
layers do not directly interact with the outside world,
but rather add a degree of complexity to allow the
ANN to operate on more interesting problems.
The hidden layer adds an internal representation of •
the problem that gives the network the ability to
deal robustly with inherently nonlinear and complex
problems.
Weights
The weights in an ANN express the relative strengths •
(or mathematical values) of the various connections
that transfer data from layer to layer. In other words,
the weights express the relative importance of each
input to a PE.
Weights are crucial to ANN because they are the •
means by which we repeatedly adjust the network to
produce desired outputs and thereby allow the
network to learn.
Weights (cont.)
The objective in training a neural network is •
to find a set of weights that will correctly
interpret all the sets of input values that are of
interest for a particular problem. Such a set of
weights is possible if the number of neurodes,
the architecture, and the corresponding
number of weights form a sufficiently complex
system to provide just enough parameters to
adjust (or “tune”) to produce all the desired
outputs.
Summation Function
The summation function finds the weighted sum of all the input elements. A
simple summation function multiplies each input value Xj by its weight Wj and
totals the products for a weighted sum, Si. The formula for N input elements is:

        Si = Σ (j = 1 to N) Wj Xj
Summation Function (cont.)
The neurodes in a neural network thus have •
very simple processing requirements. Mainly,
they need to monitor the incoming signals
from other neurodes, compute the weighted
sums, and determine a corresponding signal
to send to other neurodes.
Transformation Function
The summation function computes the internal •
stimulation or activation level of the neuron. Based
on this level, the neuron may or may not produce an
output. The relationship between the internal
activation levels may be either linear or nonlinear.
Such relationships are expressed by a transformation
function. The sigmoid function, which is commonly
and effectively used, is discussed here.
From Logical Neurons to Finite Automata
(Source: Brains, Machines, and Mathematics, 2nd Edition, 1987)

The figure shows threshold logic neurons realizing Boolean functions: two
inputs of weight 1 with threshold 1.5 implement AND, threshold 0.5 implements
OR, and a single input of weight -1 implements NOT. Networks of such Boolean
neurons can implement a finite automaton.
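The thresholds in the figure can be checked directly with a simple threshold neuron — a sketch of the McCulloch-Pitts model, assuming the convention that the output is 1 when the weighted sum reaches the threshold.

```python
# McCulloch-Pitts style logical neuron: output 1 iff the weighted
# input sum reaches the threshold.

def neuron(inputs, weights, threshold):
    s = sum(w * x for w, x in zip(weights, inputs))
    return 1 if s >= threshold else 0

AND = lambda x, y: neuron([x, y], [1, 1], 1.5)   # threshold 1.5
OR  = lambda x, y: neuron([x, y], [1, 1], 0.5)   # threshold 0.5
NOT = lambda x:    neuron([x], [-1], 0)          # weight -1

print(AND(1, 1), AND(1, 0))   # 1 0
print(OR(0, 1), OR(0, 0))     # 1 0
print(NOT(0), NOT(1))         # 1 0
```

Since AND, OR, and NOT are all realizable, any Boolean function — and hence the transition logic of a finite automaton — can be built from such neurons.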
Transformation Function (cont.)
The selection of the specific summation function, as well
as of the transformation function, is one of
the variables considered in choosing a
network architecture and learning paradigm.
Although many different functions are
possible, a very useful and popular nonlinear
transfer function is the sigmoid (or logical
activation) function.
Transformation Function (cont.)
Its formula is:

        YT = 1 / (1 + e^(-S))

where S is the weighted sum of the inputs to the neurode.
Transformation Function (cont.)
The collective action of a neural network is like that •
of a committee or other group making a decision.
Individuals interact and affect each other in the
process of arriving at a group decision. The global
average or consensus of the group is more significant
than an individual opinion and can remain the same
even if some individuals drop out. Also, a group can
have different mechanisms for arriving at the
collective decision.
Learning
The sets of weight values for a given neural •
network represent different states of its
memory or understanding of the possible
inputs. In supervised networks, training
involves adjustment of the weights to produce
the correct outputs. Thus, the network learns
how to respond to patterns of data presented
to it. In other types of ANN, the networks self-
organize and learn categories of input data
(Figure 6.3).
Learning (cont.)
An important function of the artificial neuron is •
the evaluation of its inputs and the production of an
output response. A weighted sum of the inputs from
the simulated dendrites is evaluated to determine
the level of the output on the simulated axon. Most
artificial systems use threshold values, and a
common activation function is the sigmoid
function,, that can squash the total input
summation to a bounded output value, as shown in
Figure 6.2.
Learning (cont.)
This model of the neuron, or basic perceptron, •
requires a learning algorithm for deriving the
weights that correctly represent the
knowledge to be stored. A fundamental
concept in that regard is Hebbian learning,
based on Donald Hebb’s work in 1949 on
biological systems, which postulates that
active connections should be reinforced.
Learning (cont.)
This means that the strengths (weights) of the •
interconnections increase if the prior node
repeatedly stimulates the subsequent node to
generate an output signal. In some algorithms,
the weights of connections may also be
decreased if they are not involved in
stimulating a node, and negative weights can
also be used to represent inhibiting actions.
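Hebbian reinforcement of active connections is often written as the weight update Δw = η·x·y, where η is a learning rate; η is a parameter added here for illustration rather than part of Hebb's original formulation.

```python
# Hebbian learning sketch: a connection weight grows when the input and
# output units on either side of the connection are active together.

def hebbian_update(w, x, y, eta=0.1):
    """delta_w = eta * x * y; co-active pairs reinforce the connection."""
    return w + eta * x * y

w = 0.0
for _ in range(5):                  # both units repeatedly active together
    w = hebbian_update(w, x=1, y=1)
print(round(w, 2))                  # the weight has grown to 0.5
```

When either unit is inactive (x = 0 or y = 0) the product is zero and the weight is unchanged; with negative activations or an added decay term the same rule can weaken unused connections, matching the variations described above.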
Learning (cont.)
For more complex neural computing applications, •
neurodes are combined together in various
architectures useful for information processing
(Figure 6.4).
A common arrangement has layers of neurodes with forward connections from
each neurode to every neurode except those in the same or a prior layer. Useful
applications require multiple (hidden) layers between the input and output
neurodes and a correspondingly large number of connections.
Learning (cont.)
Information processing with neural computers consists of •
analyzing patterns of data, with learned information stored as
neurode connection weights.
A common characteristic is the ability of the system to classify •
streams of input data without the explicit knowledge of rules
and to use arbitrary patterns of weights to represent the
memory of categories.
During the learning stages, the interconnection weights •
change in response to training data presented to the system.
In contrast, during recall, the weights are fixed at the trained
values.
Learning (cont.)
Although most applications use software •
simulations, neural computing will eventually use
parallel networks of simple processors that use the
strengths of the interconnections to represent
memory.
Each processor will compute node outputs from the •
weights and input signals from other processors.
Together the network of neurons can store •
information that can be recalled to interpret and
classify future inputs to the network.
6.3 Neural Network Architectures
Many different neural network models and implementations are being developed
and studied. Three representative architectures (with appropriate learning
paradigms) are shown in Figure 6.4 and are discussed next.
Network Architectures
1. Associative memory systems. These systems •
correlate input data with information stored in
memory. Information can be recalled from
incomplete or noisy input, and the performance
degrades slowly as neurons fail. Associative memory
systems can detect similarities between new input
and stored patterns. Most neural network
architectures can be used as associative memories;
a prime example is the Hopfield network.
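The associative-recall behavior described above (restoring a stored pattern from a noisy version of it) can be sketched with a minimal Hopfield network in Python. This is an illustration, not a production implementation; the two stored patterns, the network size, and the synchronous update rule are assumptions made for the example:

```python
import numpy as np

# Two bipolar (+1/-1) patterns to store; values chosen only for illustration.
patterns = np.array([
    [ 1, -1,  1, -1,  1, -1,  1, -1],
    [ 1,  1, -1, -1,  1,  1, -1, -1],
])

# Hebbian weight matrix: sum of outer products, with a zero diagonal.
W = sum(np.outer(p, p) for p in patterns).astype(float)
np.fill_diagonal(W, 0)

def recall(state, steps=10):
    """Update all neurons synchronously until the state stops changing."""
    state = state.copy()
    for _ in range(steps):
        new = np.where(W @ state >= 0, 1, -1)
        if np.array_equal(new, state):
            break
        state = new
    return state

# Flip one bit of the first pattern and let the network restore it.
noisy = patterns[0].copy()
noisy[2] = -noisy[2]
print(recall(noisy))
```

With these two orthogonal patterns, the corrupted input settles back onto the stored pattern, which is the similarity-detection and noise-tolerant recall property the text describes.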

Multi-layer Network

Multi-layer Perceptron
Classifier

Network Architectures
2. Multiple-layer systems. Associative memory •
systems can have one or more intermediate
(hidden) layers. An example of a simple
network is shown in Figure 6.4. The most
common learning algorithm for this
architecture is back propagation, which is a
kind of credit-blame approach to correcting
and reinforcing the network as it adjusts to
the training data presented to it.
Network Architectures
Another type of supervised learning, •
competitive filter associative memory, can
learn by changing its weights in recognition of
categories of input data without being
provided examples by an external trainer. A
leading example of such a self-organizing
system for a fixed number of classes in the
inputs is the Kohonen network.
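The competitive, self-organizing behavior can be sketched as follows. This is a bare winner-take-all sketch in the spirit of a Kohonen network, not a full self-organizing map (it omits the neighborhood function); the three clusters, the learning-rate schedule, and the initialization are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical 2-D inputs drawn from three well-separated clusters.
centers = np.array([[0.0, 0.0], [5.0, 5.0], [0.0, 5.0]])
samples = [c + 0.1 * rng.standard_normal((30, 2)) for c in centers]
data = np.vstack(samples)
rng.shuffle(data)

# One weight vector per output unit; initializing each unit from a
# different cluster sample avoids "dead" units in this tiny sketch.
weights = np.array([s[0] for s in samples])

alpha = 0.5
for epoch in range(20):
    for x in data:
        winner = np.argmin(np.linalg.norm(weights - x, axis=1))
        # Winner-take-all: move only the winning unit toward the input.
        weights[winner] += alpha * (x - weights[winner])
    alpha *= 0.9  # decay the learning rate over time

print(np.round(weights, 1))
```

No class labels are supplied; the units organize themselves so that each weight vector ends up near one cluster center, and a human must still decide what those categories mean.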

Network Architectures
3. Double-layer structures. A double layer structure, •
exemplified by the adaptive resonance theory (ART)
approach, does not require the knowledge of a precise
number of classes in the training data but uses feed-forward
and feedback to adjust parameters as data is analyzed to
establish arbitrary numbers of categories that represent the
data presented to the system.
Parameters can be adjusted to tune the sensitivity of the •
system and produce meaningful categories.

6.4 Learning Paradigms
An important consideration in ANNs is the
appropriate use of algorithms for learning.
ANNs have been designed for different types
of learning.
Heteroassociation—mapping one set of data •
to another. This produces output that
generally is different in form from the input
pattern. It is used, for example, in stock
market prediction applications.

6.4 Learning Paradigms
Autoassociation—storing patterns for error •
tolerance. It reproduces an output pattern
similar to or exactly the same as the input
pattern. It is used in optical character
recognition systems.
Regularity detection—looking for useful •
features in data (feature extraction). It is
used in sonar signal identification systems.

6.4 Learning Paradigms
Reinforcement learning—acting on feedback. •
This is a supervised form of learning in which
the teacher is more of a critic than an
instructor. It is used, for example, in controllers
for hypersonic spaceplanes.
Two basic approaches to learning in an ANN •
exist: supervised and unsupervised. These
approaches will now be discussed.

Supervised Learning
In the supervised learning approach, we use a set of inputs for •
which the appropriate outputs are known.
In one type, the difference between the desired and actual •
outputs is used to calculate corrections to the weights of the
neural network (learning with a teacher).
A variation on that approach simply acknowledges for each •
input trial whether or not the output is correct as the network
adjusts weights in an attempt to achieve correct results
(reinforcement learning).

Unsupervised Learning
In unsupervised learning, the neural network self-organizes to •
produce categories into which a series of inputs fall. No
knowledge is supplied about what classifications are correct,
and those that the network derives may or may not be
meaningful to the person using the network. However, the
number of categories into which the network classifies the
inputs can be controlled by varying certain parameters in the
model. In any case, a human must examine the final
categories to assign meaning and determine the usefulness of
the results. Examples of this type of learning are the ART and
the Kohonen self-organizing feature maps.

Perceptron learning
As a simple example of learning, consider how
a single neuron learns the inclusive OR
operation.
The neuron must be trained to recognize the •
input patterns and classify them to give the
corresponding outputs.

Perceptron learning (cont.)
The procedure is to present to the neuron the •
sequence of input patterns and adjust the weights
after each one.
This step is repeated until the weights converge to •
one set of values that allow the neuron to classify
correctly each of the four inputs.
The results shown in the following example were •
produced using Excel spreadsheet calculations.

Perceptron learning (cont.)
In this simple case of perceptron learning, the •
following example uses a step function to evaluate
the summation of input values. After outputs are
calculated, a measure of the error between the
output and the desired values is used to update the
weights, subsequently reinforcing correct results. At
any step in the process,
Delta = Z – Y •
where Z and Y are the desired and actual outputs,
respectively.

Perceptron learning (cont.)
Then the updated weights are wi = wi + alpha •
* delta * xi, where alpha is a parameter that
controls how rapidly the learning takes place.

Perceptron learning (cont.)
As shown in Table 6.1, each calculation uses one of the x1 and •
x2 pairs and the corresponding value for the OR operation,
along with initial values, w1 and w2 , of the neurode weights.
In this example, the weights are assigned random values at
the beginning and a learning rate, alpha, is set to be relatively
low. The value Y is the result of calculation using the equation
just described, and delta is the difference between the
desired result and Y. Delta is used to derive the final weights, which
then become the initial weights in the next row.

Perceptron learning (cont.)
The initial values of weights for each input are transformed, •
using the previous equation, to values that are used with the
next input. The threshold value causes the Y value to be 1 if
the weighted sum of inputs is greater than 0.5; otherwise, the
output is set to 0. In this example, after the first step, two of
the four outputs are incorrect and no consistent set of
weights has been found. In the subsequent steps, the
learning algorithm produces a set of weights that can give the
correct results. Once determined, a neuron with those weight
values can quickly perform the OR operation.
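The spreadsheet procedure described above can be reproduced in a short Python script. The step threshold of 0.5 and the update rule wi = wi + alpha * delta * xi follow the text; the random seed, learning rate, and epoch limit are illustrative choices:

```python
# Single-neuron perceptron learning the inclusive OR, mirroring the
# spreadsheet procedure: step-function output with a 0.5 threshold,
# and the update wi = wi + alpha * delta * xi.
import random

random.seed(1)
inputs = [(0, 0), (0, 1), (1, 0), (1, 1)]
targets = [0, 1, 1, 1]                   # inclusive OR

w = [random.random(), random.random()]   # random initial weights
alpha = 0.2                              # learning rate

for epoch in range(20):
    errors = 0
    for (x1, x2), z in zip(inputs, targets):
        y = 1 if w[0] * x1 + w[1] * x2 > 0.5 else 0  # step threshold
        delta = z - y
        if delta != 0:
            errors += 1
            w[0] += alpha * delta * x1
            w[1] += alpha * delta * x2
    if errors == 0:   # weights have converged; stop training
        break

print(w)
```

After a few epochs the weights converge to values that classify all four OR patterns correctly, at which point the trained neuron computes OR in a single weighted-sum-and-threshold step.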

Back Propagation
Although many supervised learning •
examples exist, other important cases, such as
the exclusive OR, cannot be handled with a
simple neural network. Patterns must be
linearly separable—that is, in the x-y plot of
pattern space, it must be possible to draw a
straight line that divides the clusters of
input-output points that belong to the desired
categories.
Back Propagation (cont.)
In the previous example, the input-output •
pairs (0,1), (1,0), and (1,1) are linearly
separable from (0,0). Although the
requirement of a linearly separable input
pattern space caused initial disillusionment
with neural networks, recent models such as
back propagation in multilayer networks have
greatly broadened the range of problems that
can be addressed.
Back Propagation (cont.)
Back propagation, a popular technique that is •
relatively easy to implement, requires training data
to provide the network with experience before using
it for processing other data. Externally provided
correct patterns are compared with the neural
network output during training, and feedback is used
to adjust the weights until all training patterns are
correctly categorized by the network. In some cases,
a disadvantage of this approach is prohibitively large
training times.

Back Propagation (cont.)
For any output neuron j, the error •
delta_j = (Z_j - Y_j) · φ′(x_j), where Z_j and Y_j are
the desired and actual outputs, respectively, and φ′
is the slope of a sigmoid function φ evaluated at the
jth neuron. If φ is chosen to be the logistic
function, then φ′ = dφ/dx = φ(1 - φ), where
φ(x) = [1 + exp(-x)]^-1 and x_j is proportional to
the sum of the weighted inputs of the jth neuron. A
more complicated expression can be derived to work
backward from the output neurons through the inner
layers to calculate the corrections to their
associated weights.
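The output-neuron error term above can be written directly in code. A small sketch of the logistic function, its derivative, and the resulting delta (names follow the text's notation):

```python
import math

def phi(x):
    """Logistic sigmoid: phi(x) = 1 / (1 + exp(-x))."""
    return 1.0 / (1.0 + math.exp(-x))

def phi_prime(x):
    """Derivative expressed through phi itself: phi'(x) = phi(x) * (1 - phi(x))."""
    s = phi(x)
    return s * (1.0 - s)

def output_delta(z, x):
    """Error term for an output neuron: delta = (Z - Y) * phi'(x),
    where x is the neuron's weighted input sum and Y = phi(x)."""
    y = phi(x)
    return (z - y) * phi_prime(x)

print(output_delta(1.0, 0.3))
```

Note that phi'(x) peaks at x = 0 (where phi = 0.5), so neurons whose weighted sums sit near the middle of the sigmoid receive the largest corrections.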

Back Propagation (cont.)
The procedure for executing the learning •
algorithm is as follows: Initialize weights and other
parameters to random values, read in the input
vector and desired output, calculate the actual
output by working forward through the layers, and
change the weights by calculating errors backward
from the output layer through the hidden layers. This
procedure is repeated for all the input vectors until
the desired and actual outputs agree within some
predetermined tolerance.
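Putting these steps together, here is a compact sketch of the procedure applied to the exclusive OR, the very problem a single-layer network cannot handle. The network size (four hidden neurons), random seed, learning rate, and epoch count are illustrative assumptions, not values from the text:

```python
import numpy as np

rng = np.random.default_rng(42)

# XOR: the classic pattern set that is not linearly separable.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
Z = np.array([[0], [1], [1], [0]], dtype=float)

def phi(x):
    """Logistic sigmoid."""
    return 1.0 / (1.0 + np.exp(-x))

# Step 1: initialize weights to random values (2-input, 4-hidden, 1-output).
W1 = rng.normal(0.0, 1.0, (2, 4)); b1 = np.zeros((1, 4))
W2 = rng.normal(0.0, 1.0, (4, 1)); b2 = np.zeros((1, 1))
alpha = 1.0

for epoch in range(10000):
    # Step 2: calculate actual outputs by working forward through the layers.
    H = phi(X @ W1 + b1)
    Y = phi(H @ W2 + b2)
    # Step 3: calculate errors backward, output layer first, then hidden.
    d2 = (Z - Y) * Y * (1 - Y)
    d1 = (d2 @ W2.T) * H * (1 - H)
    # Step 4: change the weights from the errors.
    W2 += alpha * (H.T @ d2); b2 += alpha * d2.sum(axis=0)
    W1 += alpha * (X.T @ d1); b1 += alpha * d1.sum(axis=0)

print(np.round(Y.ravel(), 2))
```

If training converges, the four outputs approach the XOR targets 0, 1, 1, 0; otherwise more epochs or a different initialization may be needed, illustrating the long training times noted as a disadvantage of this approach.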

6.5 Advantages and Limitations of
Neural Networks
Although ANNs offer exciting possibilities, they also have certain •
limitations.
Traditional AI approaches have in their favor the more transparent •
mechanisms, often expressed in terms such as logic operations and rule-
based representations, that are meaningful to us in our everyday lives.
By comparison, ANNs do not use structured knowledge with symbols •
used by humans to express reasoning processes.
Furthermore, ANNs have so far been used for classification problems and, •
although quite effective in that task, need to be expanded to other types
of intelligent activities.

6.5 Advantages and Limitations of
Neural Networks
An ANN’s weights, even though quite effective, are •
just a set of numbers that in most cases have no
obvious meaning to humans. Thus, an ANN is a black
box solution to problems, and an explanation system
cannot be constructed to justify a given result. As
noted before, another limitation can be excessive
training times, for example, in ANNs using back
propagation.

6.5 Advantages and Limitations of
Neural Networks
Neurocomputing is a relatively new field, and •
continued research and development will surely
minimize the limitations and find the further
strengths of this approach. The fault tolerance
aspects of ANNs will be improved, allowing them to
remain effective even as individual neurodes fail or receive
incorrect input. The exciting prospects of self-
organizing networks will be exploited to produce
systems that learn on their own how to categorize
input data.

6.5 Advantages and Limitations of
Neural Networks
Future systems will improve in the areas of •
generalization and abstraction, with the ability to go
beyond the training data to interpret patterns not
explicitly seen before. Finally, the collaboration
between scientists in neurocomputing and
neurobiology should lead to advances in each field as
computers mimic what we understand about human
thinking and as neuroscientists learn from computer
simulations of the theories of human cognition.

