371CPE Lectures Part2
Mohammad Alshamri
Artificial Intelligence
AI is a computer program that mimics some level of human intelligence.
AI algorithms can tackle:
• Knowledge,
• Learning,
• Perception,
• Problem-solving,
• Language understanding and/or
• Logical reasoning.
Turing Test
Imitation Game
It is an operational test for intelligent behavior. The following illustrates the game:
Robots
• Robots are programmable machines that carry out a series of actions
autonomously, or semi-autonomously (initially, they perform tedious tasks with high
precision)
Industrial Robots
Industrial Robot is a programmed robot that carries out a repetitive series of movements
on two or more axes. Repetitive movements do not require artificial intelligence.
Telerobotic
Telerobotic is the area of robotics concerned with the control of semi-autonomous robots
from a distance.
Learning system: A computer system that learns how to function or how to react to situations based on some feedback.
Natural language processing (NLP): A computer system that understands and reacts to statements and commands made in a natural language, such as English. NLP is the automatic processing of human language to enable communication between people and computers.
• Types of Features:
1. Numerical Features: They represent numerical values (e.g., age, income).
2. Categorical Features: They represent discrete categories or labels (e.g., gender, city).
3. Text Features: They represent textual information (product descriptions, tweets).
4. Temporal Features: They represent timestamps or time-related information.
• Types of Datasets:
1. Structured Datasets: Organized into rows and columns, often represented as
tables or spreadsheets.
2. Unstructured Datasets: Lack a predefined structure, such as text data, images,
audio, or video.
• Dataset Dimensionality: the number of features in the dataset.
Example 1
Dataset                 Features
Housing prices          Square footage, number of bedrooms, location, and proximity to amenities
Cancer classification   Gender, age, weight, tumor size, tumor shape, blood pressure, etc.
Customer behavior       Purchase history, time spent on a website, and demographic information
Diabetes dataset        As below
Important Terminologies
Feature Selection:
• Feature selection is a process of selecting a subset of relevant features from the
original set of features to reduce the dimensionality of the feature space, simplify
the model, and improve its generalization performance.
• Feature selection aims to retain the most informative features while discarding less
important ones.
Feature Extraction:
• Feature extraction is a process of transforming the original features into a new set
of features that are more informative and compact.
• The new features still capture the essential information from the original data but
represent it in a lower-dimensional feature space.
• Feature extraction is usually used when the original data is too raw or complex to use directly (when you could not use the raw data).
• If the original data were images, you could extract the redness value, or a description of the shape of an object in the image.
Feature Scaling
• Feature scaling is a preprocessing technique in ML that involves standardizing or
normalizing the range of independent variables or features of a dataset.
• The goal is to ensure that all features contribute equally to the modeling process,
preventing certain features from dominating due to their scale.
Scaling formula                   Formula                                        Range
Min-Max Scaling (Normalization)   x_normalized = (x − x_min) / (x_max − x_min)   between 0 and 1
Z-score normalization             x_normalized = (x − μ) / σ                     roughly −3 to +3 standard deviations
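The two scaling formulas above can be sketched in plain Python (a minimal illustration; a real pipeline would typically use a library scaler):

```python
# Minimal sketch of Min-Max scaling and Z-score normalization.
def min_max_scale(values):
    """Min-Max scaling: maps each value into [0, 1]."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]

def z_score_scale(values):
    """Z-score normalization: subtract the mean, divide by the standard deviation."""
    n = len(values)
    mu = sum(values) / n
    sigma = (sum((v - mu) ** 2 for v in values) / n) ** 0.5
    return [(v - mu) / sigma for v in values]

ages = [20, 30, 40, 50, 60]
print(min_max_scale(ages))  # [0.0, 0.25, 0.5, 0.75, 1.0]
```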
Feature Engineering
• Feature engineering is the careful preprocessing of raw features into more meaningful ones, even if you could have used the original features directly.
• Feature engineering involves selecting, transforming, normalization, one-hot
encoding, and creating new features based on existing ones.
• For example:
Instead of using the dataset variables x, y, z you decide to use log(𝑥) − 𝑧 × sqrt(𝑦)
instead, because the derived quantity is more meaningful to solve your problem. You get
better results than without.
Example 3
An expert tells you that the price per square foot cannot be less than $3,400. Show how feature engineering can help to identify errors in the dataset below.
    Area (square feet)   Price (million dollars)
1   2400                 9
2   3200                 15
3   2500                 10
4   2100                 1.5
5   2500                 8.9
Answer 3
We add a new column to display the cost per square foot (price divided by area):
    Area (square feet)   Price (million dollars)   Cost per square foot
1   2400                 9                         3750
2   3200                 15                        4688
3   2500                 10                        4000
4   2100                 1.5                       714
5   2500                 8.9                       3560
House 4's cost per square foot ($714) is far below the expert's minimum of $3,400, so the data of house 4 has a problem.
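The feature-engineering check of Example 3 can be sketched as a few lines of Python (the dollar prices are the table values written out in full; the $3,400 threshold comes from the expert rule):

```python
# Derive a price-per-square-foot feature and flag rows that violate the expert rule.
MIN_PRICE_PER_SQFT = 3400

houses = [  # (house id, area in square feet, price in dollars)
    (1, 2400, 9_000_000),
    (2, 3200, 15_000_000),
    (3, 2500, 10_000_000),
    (4, 2100, 1_500_000),
    (5, 2500, 8_900_000),
]

def flag_suspicious(rows, minimum=MIN_PRICE_PER_SQFT):
    """Return the ids of houses whose derived price/sqft is below the minimum."""
    return [hid for hid, area, price in rows if price / area < minimum]

print(flag_suspicious(houses))  # [4]
```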
Learning System
A computer program is said to learn from experience E with respect to some class
of tasks T and performance measure P, if its performance at tasks in T, as measured by P,
improves with experience E.
Example 4
Assume a learning system for playing the tic-tac-toe game (also called noughts and crosses). Describe the elements of this learning system.
Answer 4
T: Play tic-tac-toe;
P: Percentage of games won;
E: Playing practice games (e.g., against itself).
Supervised Learning
There are two main types:
• Classification: It maps an
input data point to an output
label.
• Regression: It maps an
input data point to a
continuous output value.
Example 5
Assume an instructor is training an agent to become a taxi driver. Then:
• Every time the instructor shouts "Brake!", the agent can learn a condition-action rule for when to brake.
After learning the "Brake" rules, the agent applies them and finds the following case:
• Braking hard on a wet road causes something bad.
Then the agent will learn the effects of its actions.
Supervised Classification
• The goal is to learn a functional mapping between the input data (patterns or
examples) 𝑋, to a class label 𝑌, i.e., 𝑌 = 𝑓(𝑋).
• The function approximates the relationship between input data and output label.
• There are three phases in supervised classification:
1. Training stage: The input is 𝑋, and 𝑌. The output is the mapping, 𝑌 = 𝑓(𝑋).
2. Classification stage: The input is a new example 𝑋𝑡 . The output is the predicted
value, 𝑦𝑡 = 𝑓(𝑋𝑡 ).
3. Output stage: Define the level of the classification. The input is the predicted value
𝑦𝑡 . The output is the predicted label, 𝑌𝑡 .
• Noisy, or incorrect, data labels will clearly reduce the effectiveness of the model.
• The error for any supervised ML algorithm comprises three parts:
1. Bias error.
2. Variance error.
3. The noise.
• The main considerations for supervised learning are:
1. Model complexity: It refers to the complexity of the function you are attempting
to learn — like the degree of a polynomial.
2. Bias-Variance tradeoff.
• As we move away from the bullseye, predictions get worse and worse.
• Different individual realizations result in a scatter of hits on the target due to
repeating the model on different training data.
Example 6
Consider the following data concerning credit default. Age and Loan are two numerical variables (predictors), and Default is the target. Use 1-nearest neighbors and 3-nearest neighbors with Euclidean distance to classify the unknown case (Age = 48 and Loan = $142,000).
Age   Loan (SR)   Default
25    40,000      N
35    60,000      N
45    80,000      N
20    20,000      N
35    120,000     N
52    18,000      N
23    95,000      Y
40    62,000      Y
60    100,000     Y
48    220,000     Y
33    150,000     Y
Answer 6
Calculate the Euclidean distance between the test example X_t = (48, 142,000) and each training example X_TR:
D_e(X_TR, X_t) = √( Σ_{i=1}^{2} (x_TR,i − x_t,i)² ) = √( (x_TR,1 − x_t,1)² + (x_TR,2 − x_t,2)² )
Because the Loan values are much larger than the Age values, Loan dominates the distance. The nearest neighbor is (33, 150,000, Y), so 1-NN classifies the test case as Y. The three nearest neighbors are (33, 150,000, Y), (35, 120,000, N), and (60, 100,000, Y), so 3-NN also classifies it as Y by majority vote.
KNN disadvantages are:
• KNN is computationally expensive.
• KNN requires a large memory to store the training data.
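The classification in Example 6 can be reproduced with a short k-NN sketch (note that, as in the lecture, the features are not scaled, so the Loan feature dominates the Euclidean distance):

```python
# A small k-NN classifier for the credit default data of Example 6.
from math import dist            # Euclidean distance (Python 3.8+)
from collections import Counter

train = [  # (age, loan, default)
    (25, 40_000, "N"), (35, 60_000, "N"), (45, 80_000, "N"),
    (20, 20_000, "N"), (35, 120_000, "N"), (52, 18_000, "N"),
    (23, 95_000, "Y"), (40, 62_000, "Y"), (60, 100_000, "Y"),
    (48, 220_000, "Y"), (33, 150_000, "Y"),
]

def knn_classify(query, k):
    # Sort training rows by Euclidean distance to the query, keep the k nearest
    neighbors = sorted(train, key=lambda r: dist(query, r[:2]))[:k]
    # Majority vote over the k nearest labels
    return Counter(label for *_, label in neighbors).most_common(1)[0][0]

query = (48, 142_000)
print(knn_classify(query, 1), knn_classify(query, 3))  # Y Y
```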
Example 7
Unsupervised Learning
• Unsupervised Learning learns patterns in the input data when no specific output
values are supplied.
• A cluster refers to a collection of data points aggregated together because of
certain similarities.
• For instance – a taxi agent might gradually develop a concept of "good traffic
days" and "bad traffic days" without ever being given labels.
• A purely unsupervised learning agent cannot learn what to do, because it has no
information as to what constitutes a correct action or a desirable state.
• This time the agent does not know anything about fruits; it is the first time it has seen these fruits, so how will it group fruits of the same type?
• The agent will take on a fruit and will select any physical character of that fruit
(suppose it is the color).
• The agent will arrange fruits by the color. The groups will be something like this.
1. RED COLOR GROUP: apples & cherry fruits.
2. GREEN COLOR GROUP: bananas & grapes.
• If the agent adds another physical character as size, so now the groups will be
something like this.
1. RED COLOR AND BIG SIZE: apple.
2. RED COLOR AND SMALL SIZE: cherry fruits.
3. GREEN COLOR AND BIG SIZE: bananas.
4. GREEN COLOR AND SMALL SIZE: grapes.
Clustering Algorithms
1. K-Means Clustering: It divides the data into a specific number of groups or clusters by
minimizing the total squared distances between the data points and the centers of each
cluster.
2. Hierarchical Clustering: It develops a hierarchy of clusters by merging or splitting them
depending on their similarity.
3. Density-Based Spatial Clustering of Applications with Noise: DBSCAN identifies
clusters as dense regions of data points separated by sparser regions.
Update Step: The algorithm updates the position of each centroid as the average of the data points belonging to that cluster:
m_i^(t+1) = (1 / |S_i^(t)|) Σ_{x_j ∈ S_i^(t)} x_j
• It repeats the process until no centroid moves more than a given threshold.
Example 8
Example 9
Assume eight location points represented by (𝑥, 𝑦):
𝐴1(2, 10), 𝐴2(2, 5), 𝐴3(8, 4), 𝐴4(5, 8), 𝐴5(7, 5), 𝐴6(6, 4), 𝐴7(1, 2), 𝐴8(4, 9)
Assume initial cluster centers are 𝐴1(2, 10), 𝐴4(5, 8) and 𝐴7(1, 2) and the used distance
function is Manhattan distance. Use K-Means Algorithm to find the three cluster centers
after the second iteration.
Answer 9
Calculate the Manhattan distance of each point from each of the three cluster centers:
dis(C, A_i) = |x_C − x_{A_i}| + |y_C − y_{A_i}|
Iteration 1 (centers C1 = (2, 10), C2 = (5, 8), C3 = (1, 2)):
Point       dis to C1   dis to C2   dis to C3   Cluster
A1(2, 10)       0           5           9         C1
A2(2, 5)        5           6           4         C3
A3(8, 4)       12           7           9         C2
A4(5, 8)        5           0          10         C2
A5(7, 5)       10           5           9         C2
A6(6, 4)       10           5           7         C2
A7(1, 2)        9          10           0         C3
A8(4, 9)        3           2          10         C2
Re-compute the new cluster centers (each new center is the average of all the points contained in that cluster):
C1: A1(2, 10) → (2, 10)
C2: A3(8, 4), A4(5, 8), A5(7, 5), A6(6, 4), A8(4, 9) → ((8+5+7+6+4)/5, (4+8+5+4+9)/5) = (6, 6)
C3: A2(2, 5), A7(1, 2) → ((2+1)/2, (5+2)/2) = (1.5, 3.5)
Iteration 2 (centers C1 = (2, 10), C2 = (6, 6), C3 = (1.5, 3.5)):
Point       dis to C1   dis to C2   dis to C3   Cluster
A1(2, 10)       0           8           7         C1
A2(2, 5)        5           5           2         C3
A3(8, 4)       12           4           7         C2
A4(5, 8)        5           3           8         C2
A5(7, 5)       10           2           7         C2
A6(6, 4)       10           2           5         C2
A7(1, 2)        9           9           2         C3
A8(4, 9)        3           5           8         C1
The cluster centers after the second iteration are:
C1: A1(2, 10), A8(4, 9) → ((2+4)/2, (10+9)/2) = (3, 9.5)
C2: A3(8, 4), A4(5, 8), A5(7, 5), A6(6, 4) → ((8+5+7+6)/4, (4+8+5+4)/4) = (6.5, 5.25)
C3: A2(2, 5), A7(1, 2) → ((2+1)/2, (5+2)/2) = (1.5, 3.5)
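The two K-Means iterations of Answer 9 can be checked with a short script (Manhattan distance for the assignment step, coordinate-wise mean for the update step, as in the example):

```python
# K-Means on the eight points of Example 9, two iterations.
points = {"A1": (2, 10), "A2": (2, 5), "A3": (8, 4), "A4": (5, 8),
          "A5": (7, 5), "A6": (6, 4), "A7": (1, 2), "A8": (4, 9)}

def manhattan(p, q):
    return abs(p[0] - q[0]) + abs(p[1] - q[1])

def kmeans_step(centers):
    """One assignment + update step (assumes no cluster ends up empty)."""
    clusters = [[] for _ in centers]
    for p in points.values():
        nearest = min(range(len(centers)), key=lambda i: manhattan(p, centers[i]))
        clusters[nearest].append(p)
    # Update: each new center is the coordinate-wise mean of its cluster
    return [(sum(x for x, _ in c) / len(c), sum(y for _, y in c) / len(c))
            for c in clusters]

centers = [points["A1"], points["A4"], points["A7"]]  # initial centers
for _ in range(2):
    centers = kmeans_step(centers)
print(centers)  # [(3.0, 9.5), (6.5, 5.25), (1.5, 3.5)]
```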
Example 10
Consider the following transactions:
Transaction ID   Items
T1               Milk, Bread, Coffee, Tea
T2               Milk, Bread
T3               Milk, Coffee
T4               Bread, Ketchup
T5               Milk, Tea, Sugar
1. What are the possible itemsets?
2. What is the support count for {Milk}, {Milk, Bread}, and {Milk, Ketchup}?
Answer 10
The possible itemsets are all non-empty combinations of the items, for example:
{Milk} (support 4), {Ketchup} (support 1), {Milk, Tea} (support 2), {Bread, Ketchup} (support 1), {Milk, Bread, Tea} (support 1)
The requested support counts are: {Milk} = 4, {Milk, Bread} = 2, and {Milk, Ketchup} = 0.
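Support counting for Example 10 can be sketched as follows (the support count of an itemset is the number of transactions that contain every item in it):

```python
# Support counting over the five transactions of Example 10.
transactions = [
    {"Milk", "Bread", "Coffee", "Tea"},
    {"Milk", "Bread"},
    {"Milk", "Coffee"},
    {"Bread", "Ketchup"},
    {"Milk", "Tea", "Sugar"},
]

def support(itemset):
    """Number of transactions containing all items of the itemset."""
    return sum(1 for t in transactions if set(itemset) <= t)

print(support({"Milk"}), support({"Milk", "Bread"}), support({"Milk", "Ketchup"}))
# 4 2 0
```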
Example 11
Suppose you are building a Face recognition program using the following:
1. Supervised Learning (Classification):
• You have a dataset of faces image and other images.
• The dataset is labeled (classified) into faces images, and other images.
• The program task is to classify new images based on the available dataset.
2. Unsupervised learning (Clustering):
• The goal is to infer the natural structure present within a set of data points.
• The program clusters similar data points for a new given dataset into different groups,
e.g., it can distinguish that faces are very different from landscapes, which are very
different from horses.
Apriori Algorithm
• Apriori algorithm is the most used
algorithm for association rule
mining.
• It uses a breadth-first search strategy
to generate frequent item sets and
then generates association rules from
these item sets.
Example 12
Use the Apriori algorithm on the grocery
store data with minimum support 33.34%
and confidence 60%.
Indicate the association rules that are
generated and highlight the strong ones,
sort them by confidence.
Answer 12
The minimum support of 33.34% means that the minimum support count is 2.
Pass k = 1: Candidates (with support): HotDogs(4), Buns(2), Ketchup(2), Coke(3), Chips(4). Frequent 1-itemsets: HotDogs, Buns, Ketchup, Coke, Chips.
Pass k = 2: Candidates: {HotDogs, Buns}(2), {HotDogs, Ketchup}(1), {HotDogs, Coke}(2), {HotDogs, Chips}(2), {Buns, Ketchup}(1), {Ketchup, Chips}(1), {Coke, Chips}(3). Frequent 2-itemsets: {HotDogs, Buns}, {HotDogs, Coke}, {HotDogs, Chips}, {Coke, Chips}.
Pass k = 3: Candidates: {HotDogs, Coke, Chips}(2). Frequent 3-itemsets: {HotDogs, Coke, Chips}.
Pass k = 4: Candidates: {} (none, so the algorithm stops).
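A compact Apriori sketch is shown below. The six transactions are a hypothetical reconstruction consistent with the support counts in the table above (the lecture gives only the counts, not the raw transaction table), so treat them as illustrative:

```python
# Apriori sketch: generate frequent k-itemsets level by level.
from itertools import combinations

transactions = [  # hypothetical transactions matching the support counts above
    {"HotDogs", "Buns", "Ketchup"},
    {"HotDogs", "Buns"},
    {"HotDogs", "Coke", "Chips"},
    {"Chips", "Coke"},
    {"Chips", "Ketchup"},
    {"HotDogs", "Coke", "Chips"},
]
MIN_COUNT = 2  # 33.34% of 6 transactions

def support(itemset):
    return sum(1 for t in transactions if itemset <= t)

def apriori():
    items = sorted({i for t in transactions for i in t})
    frequent, k = [], 1
    current = [frozenset([i]) for i in items if support(frozenset([i])) >= MIN_COUNT]
    while current:
        frequent.extend(current)
        k += 1
        # Candidate generation: unions of frequent (k-1)-itemsets that have size k
        candidates = {a | b for a, b in combinations(current, 2) if len(a | b) == k}
        current = [c for c in candidates if support(c) >= MIN_COUNT]
    return frequent

freq = apriori()
print(frozenset({"HotDogs", "Coke", "Chips"}) in freq)  # True
```

This finds the same frequent itemsets as the table: five 1-itemsets, four 2-itemsets, and one 3-itemset.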
Reinforcement Learning
Example 13
Example 14
How can human knowledge of all kinds be represented by a computer language so that computers can use this knowledge for purposes of reasoning?
Knowledge-based Systems
Data
• Data refers to factual, discrete, and static things and raw observations of the given
area of interest that are not organized to convey any specific meaning.
• Data can be numbers, letters, figures, sounds, or images.
Information
• It is data within a context (data that has been shaped into a form meaningful and
useful to human beings).
For example,
1. a grade point average (GPA) is data, but
2. a student’s name coupled with his or her GPA is information.
• The recipient interprets the meaning and draws conclusions and implications from
the information.
• Information is only as good as the data from which it is derived; otherwise, it is 'garbage in, garbage out', or simply GIGO.
Knowledge
It consists of data and/or information that has been organized and processed to convey
understanding, experience, or accumulated learning.
• For example, a company has found over time that students with a grade point average
over 3.0 have had the most success in its management program.
• Based on its experience, that company may decide to interview only those students with
GPAs over 3.0.
Information Types
Information type Meaning Example
Permanent It never changes, like physical laws The earth moves around the Sun
Static It is constant over a period of time Policies and procedures
Dynamic It is continuously changing Prices of shares and gold
Knowledge Components
1. Facts
2. Rules
3. Heuristics
Facts
• Facts represent claims: raw observations, alphabets, symbols, or statements that can be true or false at the time they are used. For example:
"Fire is hot"
"The earth moves around the Sun"
"Every car has a battery"
tap is open
Joe Bloggs works for ACME
• The fact has three parts:
1. An object (also called a linguistic object).
2. The value of the linguistic object.
3. An operator to assign a value to the linguistic object (like, for example, is, are, or
mathematical operators)
Example 1
A collection of facts about my car are:
Example 2
An example of a semantic network with an overridden default is given below. Here
attributes are treated in the same way as relationships.
Example 3
Show some examples of multiple antecedents joined by logical operators.
Answer 3
Multiple antecedents combined by AND
IF (antecedent1 AND antecedent2 … AND antecedentN) THEN consequent
IF (the season is winter AND the temperature is <0 degrees AND it is windy) THEN the weather is cold
Example 4
Apply Rule 1 to Fact 1 to get a derived fact.
Fact 1 Joe Bloggs works for ACME
Rule 1 IF ?x works for ACME THEN ?x earns a large salary
Answer 4
Fact 2 Joe Bloggs earns a large salary
Rules Classification
• Some ways to classify the rules are based on their: function, structure, or behavior.
• In terms of structure, rules can be logic, definition, or constraint rules.
Logic Rules
Logic rules are rules with a clearly recognizable IF condition and THEN
conclusion.
• The conclusion of the logic rule changes the value of something in the system.
• Example of logic rule is: IF (x = 1 AND y = 2) THEN z = 3
Definition Rules
Definition rules are rules without any fact in the IF part (the condition is always true).
Constraint Rules
Constraint rules are rules without any fact in the THEN part.
Note that
• Based on the conclusion or the consequent of a rule, rules can express:
Relation IF (x > 0) THEN (x is positive)
Recommendation IF (it is rainy) THEN take an umbrella
Directive IF (phone battery signals AND phone battery is empty) THEN (charge the phone)
Heuristic IF phone light is off THEN battery is flat
Example 5
The derived fact may satisfy, or partially satisfy, another rule, such as:
Rule 1 IF ?x works for ACME THEN ?x earns a large salary
Rule 2 IF (?x earns a large salary OR ?x has job satisfaction) THEN ?x has
professional contentment
• Rules 1 and 2 are interdependent since the conclusion of one can satisfy the
condition of the other.
Heuristics
▪ They are solutions that experts have employed in similar situations.
1. IF there is a total eclipse of the Sun THEN there is no daylight.
(even though the Sun is in the sky).
2. IF (It is rainy season AND a car was driven through water) THEN (The car
silencer would have water in it AND the car may not start).
Inference Network
Inference network is a network structure that represents the logical relationships between facts, rules, or pieces of knowledge within a knowledge base.
• Inference network represents a closed world that facilitates the process of drawing
conclusions or making logical inferences based on the available information.
• Each node represents a possible state of some aspect of the world, hence a model
of the current overall state of the world can be maintained.
• Such a model is dependent on the extent of the relationships between the nodes.
• If a change occurs in one aspect of the world, many other nodes could be affected.
• Frame problem: It is the problem of determining what else has been changed in
the world model because of changing one thing.
Example 6
An example of an inference
network is given here.
Example 7
For the inference network of Example 6, assume Joe Bloggs gets a new job. What are the
changes that could happen to the network?
Answer 7
If Joe Bloggs gets a new job, the inference network suggests that the only direct change is his
salary, which could change his professional contentment and happiness.
However, in a more complex model of Joe Bloggs's world, many other nodes could also be affected.
Rule
Example 8
Use the inference network of Example 6 to generate a deduction about Joe Bloggs who
works for ACME and has a stable relationship.
Answer 8
Deduction:
IF Joe Bloggs works for ACME AND is in a stable relationship (the causes) THEN he is happy
(the effect).
Abduction: Given the observation that Joe Bloggs is happy, we can infer that Joe Bloggs
enjoys domestic bliss and professional contentment.
Example 9
A boiler control system produces steam to drive a turbine and generator. Water is heated
in the boiler tubes to produce a steam and water mixture that rises to the steam drum,
which is a cylindrical vessel mounted horizontally near the top of the boiler. The purpose
of the drum is to separate steam from the water. Steam is taken from the drum, passed
through the superheater, and applied to the turbine that turns the generator. Many sensors
are fitted to the drum to monitor the following parameters
1. The temperature of the steam
in the drum.
2. The level of water in the
drum (monitored by the voltage
output from a transducer).
3. The status of the pressure
release valve (open or closed).
4. The water flow rate through
the control valve
Suggest a rule-based system to monitor the state of a power station boiler and to advise
appropriate actions.
Answer 9
Rule 1 IF transducer output is low THEN water level is low
Rule 2 IF water level is low THEN open the control valve
Rule 3 IF steam pressure is low THEN start the boiler tubes
Rule 4 IF (temperature is high AND water level is low)
THEN (open control valve AND shutdown boiler tubes)
Rule 5 IF steam outlet is blocked THEN replace the outlet pipe
Rule 6 IF pressure release valve is stuck THEN steam outlet is blocked
Rule 7 IF (temperature is high AND NOT (water level is low)) THEN steam pressure is
high
Rule 8 IF steam pressure is high THEN shutdown boiler tubes
Rule 9 IF (pressure is high AND pressure release valve is closed) THEN pressure release
valve is stuck
Rule 10 IF (pressure release valve is open AND water flow rate is high) THEN steam is
escaping
Rule 11 IF steam is escaping THEN steam outlet is blocked
Rule 12 IF water flow rate is low THEN control valve is closed
• The input data to the system (sensor readings) are low-level facts; higher-level facts
are facts derived from them.
• Rules 2, 3, 4, 5, and 8 give recommendations to the boiler operators.
In a fully automated system, such rules would be able to perform their recommended actions
rather than simply making a recommendation.
• The remaining rules involve taking a low-level fact, such as a transducer reading,
and deriving a higher-level fact, such as the quantity of water in the drum.
• Rule 1 is a low-level rule since it depends on a transducer reading.
• Rule 5 is a high-level rule that uses more abstract information (It relates the
occurrence of a steam outlet blockage to a recommendation to replace a pipe).
• Most of the rules of this system are specific to one boiler arrangement and would
not apply to other situations.
Example 10
RBS has access to the transducer output and the temperature readings of the boiling
control system. What is the applicable rule if the temperature is high, and transducer level
is found to be LOW?
Answer 10
A sensible set of rules to examine would be Rules 1, 4, and 7, as these rules are
conditional on the boiler temperature and transducer output.
Closed-World Assumption
This assumption assumes a proposition is FALSE if we do not know it is TRUE.
Example 11
Consider a rule-based system in a medical diagnosis application.
IF patient has a fever AND patient has a cough, THEN recommend a flu test
Show the rule-firing process of this system.
Answer 11
If the input data indicates that the patient has a fever and cough, the conditions of the rule
are satisfied, and the rule is fired. The system then recommends a flu test.
Inference Chain
• It is the sequence of logical steps or reasoning processes that are followed by an
intelligent system to derive a conclusion or make an inference.
• It represents the flow of rule evaluations and activations that lead to a final
decision or outcome.
Example 12
An RBS has the facts A, C, D, and E, and a rule base as given:
Rule 1: IF (A is TRUE AND C is TRUE) THEN B is TRUE
Rule 2: IF (C AND D) THEN F
Answer 12
Inference Engine
1. Forward-chaining (data-driven): Rules are selected and applied in response to the current
fact base.
2. Backward-chaining (goal-driven):
Example 13
Show how forward chaining is applied for the system of Example 12 to conclude Z. The
given facts are A, C, D, and E. If multiple rules can fire at a time, then fire the first rule,
which was not fired before.
Answer 13
Cycle 1:
• Matching for generating the conflict set: Match the IF part of each rule against
facts in the working memory (A, C, D, E).
Rule 1: IF (A AND C) THEN B Yes, both A and C are in the database
Rule 2: IF (C AND D) THEN F Yes, since both C and D are in the database
Rule 3: IF (C AND D AND E) THEN X Yes, since all C, D, and E are in the database
Rule 4: IF (A AND B AND X) THEN Y No, X is not in the database at this moment
Rule 5: IF (D AND Y) THEN Z No, Y is not in the database at this moment
• Conflict resolution: among Rule 1, Rule 2, and Rule 3, select the first one if not
applied earlier. Thus, Rule 1 will be fired first.
• Apply the rule (If new facts are obtained add them to working memory).
The consequent of rule 1 is B which is not in the database, so add the new fact.
• Stop condition: Z has not been reached yet, so go again to the first step.
Cycle 2
Working memory A, B, C, D, E
Conflict set Rule 1, Rule 2, and Rule 3
Conflict resolution Rule 2
Apply the rule The consequent of rule 2 is F which is not in the database, so add the new
fact to the database
Stop (or exit) condition Our conclusion, Z, has not been reached yet
Cycle 3
Working memory A, B, C, D, E, F
Conflict set Rule 1, Rule 2, and Rule 3
Conflict resolution Rule 3
Apply the rule The consequent of rule 3 is X which is not in the database, so add the new
fact to the database
Stop (or exit) condition Our conclusion, Z, has not been reached yet
Cycle 4
Working memory A, B, C, D, E, F, X
Conflict set Rule 1, Rule 2, Rule 3, and Rule 4
Conflict resolution Rule 4
Apply the rule The consequent of rule 4 is Y, which is not in the database, so add the new fact to the database
Stop (or exit) condition Our conclusion, Z, has not been reached yet
Cycle 5
Working memory A, B, C, D, E, F, X, Y
Conflict set Rule 1, Rule 2, Rule 3, Rule 4, and Rule 5
Conflict resolution Rule 5
Apply the rule The consequent of rule 5 is Z which is not in the database, so add the new
fact to the database
Stop (or exit) condition Our conclusion, Z, has been reached. Stop
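The five forward-chaining cycles above can be sketched as a small engine. The conflict-resolution strategy is the one stated in Example 13: fire the first applicable rule that has not fired before:

```python
# Forward-chaining sketch for the rule base of Examples 12 and 13.
rules = [  # (rule number, premises, conclusion)
    (1, {"A", "C"}, "B"),
    (2, {"C", "D"}, "F"),
    (3, {"C", "D", "E"}, "X"),
    (4, {"A", "B", "X"}, "Y"),
    (5, {"D", "Y"}, "Z"),
]

def forward_chain(initial_facts, goal):
    facts, fired = set(initial_facts), []
    while goal not in facts:
        # Matching: the conflict set holds applicable rules not yet fired
        conflict = [r for r in rules if r[0] not in fired and r[1] <= facts]
        if not conflict:
            break  # nothing left to fire; goal unreachable
        num, _, conclusion = conflict[0]  # conflict resolution: first rule
        fired.append(num)
        facts.add(conclusion)            # apply the rule
    return facts, fired

facts, fired = forward_chain({"A", "C", "D", "E"}, "Z")
print(fired)  # [1, 2, 3, 4, 5]
```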
• To backward chain from a goal in the working memory, the inference engine must
follow the steps:
1. Select rules with conclusions matching the goal.
2. Replace the goal by the rule's premises. These become sub-goals.
3. Work backwards until all sub-goals are known to be true. The backtracking takes
place if:
a. The goal cannot be satisfied by the set of rules currently under consideration; or
b. The goal has been satisfied but the user wants to investigate other ways of
achieving the goal (i.e., to find other solutions).
• Example: A backward-chaining system might be presented with the proposition: a
plan exists for manufacturing a widget.
It will then attempt to ascertain the truth of this proposition by generating the plan, or it may
conclude that the proposition is false, and no plan is possible.
• This strategy is appropriate when a more tightly focused solution is required.
Example 14
Show how backward chaining is applied for the system of Example 12 to conclude Z.
The given facts are A, C, D, and E.
Answer 14
Cycle 1
Rule matching the goal The only rule with a conclusion matching the goal is Rule 5.
Rule 5: IF (D AND Y) THEN Z
Rule's premises D is in the database, but we do not have Y. Add Y as a sub-goal
Stop (or exit) condition The sub-goal is not true, so we back-chain again.
New Working memory A, C, D, E Goals Z, Y
Cycle 2
Rule matching the goal Rule 4 has Y as a conclusion.
Rule 4: IF (A AND B AND X) THEN Y
Rule's premises A is in the database, but we do not have B or X. Add B and X as sub-goals.
Stop (or exit) condition All goals are not true, so we back-chain again
New Working memory A, C, D, E Goals Z, Y, B, X
Cycle 3
Rule matching the goal Rule 3 has X as a conclusion.
Rule 3: IF (C AND D AND E) THEN X
Rule's premises All premises C, D, and E are in the database. Remove X from goals, add it to
the database, and fire Rule 3.
Stop (or exit) condition The remaining goals are not true, so we back-chain again
New Working memory A, C, D, E, X Goals Z, Y, B
Cycle 4
Rule matching the goal Rule 1 has B as a conclusion.
Rule 1: IF (A AND C) THEN B
Rule's premises All premises A and C are in the database. Remove B from goals, add it to the database, and fire Rule 1.
Stop (or exit) condition The remaining goals are not true, so we back-chain again
New Working memory A, C, D, E, X, B Goals Z, Y
Cycle 5
Rule matching the goal Rule 4 has Y as a conclusion; all its premises A, B, and X are now in the database. Remove Y from goals, add it to the database, and fire Rule 4.
New Working memory A, C, D, E, X, B, Y Goals Z
Cycle 6
Rule matching the goal Rule 5 has Z as a conclusion; its premises D and Y are both in the database. Fire Rule 5 and add Z to the database.
Stop (or exit) condition The goal Z has been reached. Stop
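Backward chaining over the same rule base can be sketched recursively (a simplified version of the cycle-by-cycle trace above: a goal is proved either from the known facts or by proving the premises of a rule that concludes it):

```python
# Recursive backward-chaining sketch for the rule base of Example 12.
rules = [  # (premises, conclusion)
    ({"A", "C"}, "B"),
    ({"C", "D"}, "F"),
    ({"C", "D", "E"}, "X"),
    ({"A", "B", "X"}, "Y"),
    ({"D", "Y"}, "Z"),
]
facts = {"A", "C", "D", "E"}

def prove(goal, known):
    """Try to prove the goal from known facts, back-chaining through the rules."""
    if goal in known:
        return True
    for premises, conclusion in rules:
        if conclusion == goal and all(prove(p, known) for p in premises):
            known.add(goal)  # remember the derived fact
            return True
    return False

print(prove("Z", set(facts)))  # True
```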
Meta-Rules
Meta knowledge is extra knowledge about the knowledge the system possesses to improve
its performance
Meta-rules are rules which are not specifically concerned with knowledge about the
application at hand, but rather with knowledge about how it should be applied.
• Meta-rules define how conflict resolution will be used, and how other aspects of
the system itself will run.
• Meta-rules are “rules about rules” (or more generally, “rules about knowledge”).
• Some examples of meta-rules might be:
Meta-Rule 1: PREFER rules about shutdown TO rules about control valves
Meta-Rule 2: PREFER high-level rules TO low-level rules
Explanation Module
• This module is made in support of RBS to explain its reasoning. This gives users
of the system confidence in the accuracy or wisdom of the system’s decisions.
• The explanation can be divided into two categories:
1. How has the conclusion been derived? (would normally be applied when the system has
completed its reasoning)
2. Why a particular line of reasoning is being followed (It is applicable while the system is
carrying out its reasoning process). This type is appropriate for an interactive intelligent
system, which involves a dialogue between a user and the computer. During such a
dialogue the user will often want to establish why particular questions are being asked.
• If either type of explanation is incorrect or impenetrable, the user is likely to
distrust or ignore the system’s findings.
• Explanation facilities are desirable for increasing user confidence in the system, as
a teaching aid and as an aid to debugging.
• The quality of explanation can be improved by placing an obligation on the rule-
writer to provide an explanatory note for each rule.
Example 15
For the boiler control system, the following would be a typical explanation for a
recommendation to replace the outlet pipe.
Answer 15
Replace outlet pipe
BECAUSE (Rule 5) steam outlet is blocked
steam outlet is blocked
BECAUSE (Rule 6) pressure release valve is stuck
pressure release valve is stuck
BECAUSE (Rule 9) pressure is high AND pressure release valve is closed
pressure is high
BECAUSE (Rule 7) temperature is high AND NOT(water level is low)
Decision Trees
Decision Tree: It is a tree-structured classifier for getting all the possible
solutions to a problem based on given conditions. Each internal node corresponds
to an attribute (feature), and every terminal node corresponds to a class (label).
• Decision tree identifies the best possible course of action using a set of hierarchical
decisions on the features.
• Decision tree is constructed by recursively partitioning the input data into subsets
based on the values of the input variables.
• Decision tree is used for classification of unseen test instances with the use of top-
down traversal from the root to a unique leaf.
• The algorithm stops the growth of the tree based on a stopping criterion.
• The stopping criterion could be:
1. a maximum depth for the tree,
2. a minimum number of samples in a node, or
3. a pure node (all samples in the node belong to the same class).
Example 1
Suppose a candidate has a job offer and wants to decide whether to Accept the offer or Not based on three features. Assume the features' order is Salary (> $50,000), Distance from the office, and Cab facility. Suggest a decision tree for solving this problem.
Answer 1
The decision tree starts with the root
node (Salary) that splits further into the
next decision node and one leaf node
(Declined offer).
The next decision node further splits
into one decision node and one leaf
node.
Finally, the decision node splits into
two leaf nodes (Accepted offers and
Declined offer).
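The tree of Answer 1 can be hand-coded as nested conditions. The concrete thresholds (a $50,000 salary cut-off, a 10 km commute) and the "near the office means accept" leaf are illustrative assumptions; the lecture fixes only the feature order:

```python
# Hand-coded decision tree for the job-offer example (thresholds are assumptions).
def decide(salary, distance_km, cab_facility):
    if salary <= 50_000:       # root node: Salary
        return "Declined"
    if distance_km <= 10:      # second decision node: Distance (assumed leaf: near = accept)
        return "Accepted"
    # third decision node: far from the office, accept only if a cab is provided
    return "Accepted" if cab_facility else "Declined"

print(decide(60_000, 30, True))   # Accepted
print(decide(60_000, 30, False))  # Declined
```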
Example 2
A person will try to decide if he/she should go to a comedy show or not based on the registered information about the comedian. The decision tree can be used to decide if any new shows are worth attending.
Age   Experience   Rank   Nationality   Go
36    10           9      UK            NO
42    12           4      USA           NO
23    4            6      N             NO
52    4            4      USA           NO
43    21           8      USA           YES
44    14           5      UK            NO
66    3            7      N             YES
35    14           9      UK            YES
52    13           7      N             YES
35    5            9      N             YES
24    3            5      USA           NO
18    3            7      UK            YES
45    9            9      UK            YES
Example 3
Assume Ali has recorded various attributes of the weather and whether his friend Basel played tennis or not over two weeks.
• For each example, we have five feature values: day, outlook, temperature,
humidity, and wind.
• In fact, Day is not a useful feature since it is different for every example. So, we
will focus on the other four input features.
• Given a data set, we can generate many different decision trees.
Entropy
• Entropy measures the: (all having the same meaning)
1. amount of information contained in a class.
2. class impurity associated with a given attribute.
3. randomness of a given feature.
• Entropy is highest for a feature with equally probable classes and decreases as some classes appear more often (if entropy is high, the randomness is high).
• In general, the entropy for a given feature having S classes is:
H = Σ_{i=1}^{S} p_i log2(1/p_i) = − Σ_{i=1}^{S} p_i log2(p_i)
where p_i = m_i / M is the probability of the i-th class.
• The weighted entropy of a feature F whose values split the data into subsets s = 1, …, S is:
H_F = Σ_{s=1}^{S} w_s × H_{F:s}
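The entropy formula above can be sketched in a few lines of Python (the helper name is mine, not from the lecture; classes with zero probability are skipped, following the convention 0 · log 0 = 0):

```python
import math

def entropy(counts):
    """H = -sum(p_i * log2(p_i)) for a list of class counts."""
    total = sum(counts)
    return -sum((c / total) * math.log2(c / total) for c in counts if c > 0)

print(entropy([7, 7]))  # equally probable classes -> 1.0 (maximum randomness)
print(entropy([4, 0]))  # pure node -> 0.0
```

As the bullet above states, entropy peaks at 1 bit for two equally probable classes and drops to 0 when one class dominates completely.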
Information Gain
• Information gain measures the change in entropy after the segmentation of a
dataset (𝐷) based on an attribute, 𝐹 (How much uncertainty in 𝐷 was reduced after
splitting it based on attribute 𝐹 ).
• The information gain is:
Gain(D, F) = H_D − H_F = H_D − Σ_{s∈S} w_s × H_{F:s}
where
• D_{F:s} represents the data subset created from splitting the dataset on the value s of the attribute F, such that D = ⋃_{s∈S} D_{F:s}.
• w_s = |D_{F:s}| / |D| is the proportion of the number of elements in D_{F:s} to the number of elements in D.
• H_{F:s} is the entropy of the subset D_{F:s}.
Gini Index
• The Gini index measures the class impurity of a node with S classes:
G = 1 − Σ_{i=1}^{S} p_i²
• The overall Gini impurity of a split is the weighted sum over its branches:
G_split = Σ_{s=1}^{S} w_s × G_s
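These two formulas can be sketched directly in Python (function names are my own):

```python
def gini(counts):
    """Gini index G = 1 - sum(p_i^2) from a list of class counts."""
    total = sum(counts)
    return 1 - sum((c / total) ** 2 for c in counts)

def gini_split(branches):
    """Overall Gini of a split: weighted sum w_s * G_s over branch class counts."""
    total = sum(sum(b) for b in branches)
    return sum(sum(b) / total * gini(b) for b in branches)

print(gini([1, 1]))                             # 0.5 (maximally impure binary node)
print(gini([4, 0]))                             # 0.0 (pure node)
print(round(gini_split([[1, 1], [2, 1]]), 2))   # 0.47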
Example 4
Suppose a binary classification problem for predicting whether a person will buy a particular product based on age and income. Find the overall Gini impurity for the first split if the ASM (attribute selection measure) chooses age with a threshold of 35.
Answer 4
• The left child node contains the data where age is less than or equal to 35, and the right child node contains the data where age is greater than 35.
• With age ≤ 35, there are two data points, one of which buys the product and one of which does not (the probability of a randomly chosen element being labeled "Yes" or "No" is 1/2, p_Yes = p_No = 0.5). Therefore, the Gini index for this branch is:
G_left = 1 − (0.5² + 0.5²) = 0.5
• With age > 35, there are three data points, two of which buy the product (the probability of a randomly chosen element being labeled "Yes" is 2/3, p_Yes ≈ 0.67) and one of which does not (the probability of being labeled "No" is 1/3, p_No ≈ 0.33). Therefore, the Gini index for this branch is:
G_right = 1 − (0.67² + 0.33²) ≈ 0.44
• The overall Gini impurity of the split is the weighted sum:
G_split = (2/5) × 0.5 + (3/5) × 0.44 ≈ 0.47
Example 5
Assume a dataset of patients with information about their age, gender, blood pressure,
and cholesterol level to identify whether patients have heart disease.
1. Calculate the Gini impurity of the entire dataset.
Answer 5
• The Gini impurity of the entire dataset (p_Yes = 200/500 = 0.4, p_No = 300/500 = 0.6) is:
G = 1 − (0.4² + 0.6²) = 0.48
• The Gini impurity of the split age ≤ 50 (p_Yes = 100/300 ≈ 0.33, p_No = 200/300 ≈ 0.67) is:
G = 1 − (0.33² + 0.67²) ≈ 0.44
• The Gini impurity of the split age > 50 (p_Yes = p_No = 100/200 = 0.5) is:
G = 1 − (0.5² + 0.5²) = 0.5
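The three values can be verified with a few lines of arithmetic (the counts are the ones used above):

```python
# Gini impurity G = 1 - sum(p_i^2), computed from the class proportions:
g_all   = 1 - ((200/500)**2 + (300/500)**2)   # whole dataset
g_young = 1 - ((100/300)**2 + (200/300)**2)   # age <= 50
g_old   = 1 - ((100/200)**2 + (100/200)**2)   # age > 50
print(round(g_all, 2), round(g_young, 2), round(g_old, 2))  # 0.48 0.44 0.5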
ID3 Steps
1. Calculate the Information Gain of each feature.
2. Split the dataset into subsets using the feature for which the Information Gain is maximum.
3. Make a decision tree node using that feature.
4. If all rows of a subset belong to the same class, make the corresponding node a leaf node labeled with that class.
5. Repeat with the remaining features until you run out of features or the decision tree has all leaf nodes.
Example 6
Build the decision tree for the following Golf dataset.
Outlook   Temp  Humidity  Windy  Play Golf
Rainy     Hot   High      FALSE  No
Rainy     Hot   High      TRUE   No
Overcast  Hot   High      FALSE  Yes
Sunny     Mild  High      FALSE  Yes
Sunny     Cool  Normal    FALSE  Yes
Sunny     Cool  Normal    TRUE   No
Overcast  Cool  Normal    TRUE   Yes
Rainy     Mild  High      FALSE  No
Rainy     Cool  Normal    FALSE  Yes
Sunny     Mild  Normal    FALSE  Yes
Rainy     Mild  Normal    TRUE   Yes
Overcast  Mild  High      TRUE   Yes
Overcast  Hot   Normal    FALSE  Yes
Sunny     Mild  High      TRUE   No
Answer 6
The entropy of the dataset is:
H_D = −(5/14) log2(5/14) − (9/14) log2(9/14) = 0.94
There are three categories for Outlook: Sunny, Overcast, and Rainy.
The entropy for each category is:
Outlook   No  Yes  H
Sunny     2   3    H_{Outlook:Sunny} = −(2/5) log2(2/5) − (3/5) log2(3/5) = 0.971
Overcast  0   4    H_{Outlook:Overcast} = −(0/4) log2(0/4) − (4/4) log2(4/4) = 0
Rainy     3   2    H_{Outlook:Rainy} = −(3/5) log2(3/5) − (2/5) log2(2/5) = 0.971
The total entropy for this feature is:
H_Outlook = Σ_s w_s × H_{Outlook:s} = (5/14) × 0.971 + (4/14) × 0 + (5/14) × 0.971 = 0.693
The entropy for each category of Humidity (Normal and High) is:
Humidity  No  Yes  H
Normal    4   3    H_{Humidity:Normal} = −(4/7) log2(4/7) − (3/7) log2(3/7) = 0.985
High      1   6    H_{Humidity:High} = −(1/7) log2(1/7) − (6/7) log2(6/7) = 0.592
The total entropy for this feature is:
H_Humidity = Σ_s w_s × H_{Humidity:s} = (7/14) × 0.985 + (7/14) × 0.592 = 0.7885
The three categories for Temp are: Cool, Mild, and Hot. The entropy for each category is:
Temp  No  Yes  H
Cool  1   3    H_{Temp:Cool} = −(1/4) log2(1/4) − (3/4) log2(3/4) = 0.811
Mild  2   4    H_{Temp:Mild} = −(2/6) log2(2/6) − (4/6) log2(4/6) = 0.918
Hot   2   2    H_{Temp:Hot} = −(2/4) log2(2/4) − (2/4) log2(2/4) = 1
The total entropy for this feature is:
H_Temp = Σ_s w_s × H_{Temp:s} = (4/14) × 0.811 + (6/14) × 0.918 + (4/14) × 1 = 0.91
The two categories for Windy are TRUE and FALSE. The entropy for each category is:
Windy  No  Yes  H
TRUE   3   3    H_{Windy:TRUE} = −(3/6) log2(3/6) − (3/6) log2(3/6) = 1
FALSE  2   6    H_{Windy:FALSE} = −(2/8) log2(2/8) − (6/8) log2(6/8) = 0.811
The total entropy for this feature is:
H_Windy = Σ_s w_s × H_{Windy:s} = (6/14) × 1 + (8/14) × 0.811 = 0.892
Outlook has the maximum information gain (0.94 − 0.693 = 0.247), so it becomes the root node. The Overcast branch is pure (all Yes), while the Rainy and Sunny subsets need further splitting:
Rainy subset:
Outlook  Temp  Humidity  Windy  Play Golf
Rainy    Hot   High      FALSE  No
Rainy    Hot   High      TRUE   No
Rainy    Mild  High      FALSE  No
Rainy    Cool  Normal    FALSE  Yes
Rainy    Mild  Normal    TRUE   Yes
Sunny subset:
Outlook  Temp  Humidity  Windy  Play Golf
Sunny    Mild  High      FALSE  Yes
Sunny    Cool  Normal    FALSE  Yes
Sunny    Cool  Normal    TRUE   No
Sunny    Mild  Normal    FALSE  Yes
Sunny    Mild  High      TRUE   No
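The four information gains can be checked with a short Python sketch (the helper name H is my own; the class counts are (No, Yes) pairs read from the Golf dataset):

```python
import math

def H(no, yes):
    """Binary entropy from class counts (0 * log 0 treated as 0)."""
    total = no + yes
    return -sum(c / total * math.log2(c / total) for c in (no, yes) if c)

H_D = H(5, 9)  # dataset entropy: 5 "No" and 9 "Yes" out of 14 -> 0.94

# Per feature: a list of (category size, (No, Yes)) entries.
features = {
    "Outlook":  [(5, (2, 3)), (4, (0, 4)), (5, (3, 2))],
    "Humidity": [(7, (4, 3)), (7, (1, 6))],
    "Temp":     [(4, (1, 3)), (6, (2, 4)), (4, (2, 2))],
    "Windy":    [(6, (3, 3)), (8, (2, 6))],
}
for name, cats in features.items():
    H_F = sum(n / 14 * H(*c) for n, c in cats)
    print(name, round(H_D - H_F, 3))  # information gain Gain(D, F)
# Outlook has the largest gain (0.247), so ID3 picks it as the root.
```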
The complete decision tree is:
Tree Pruning
• Pruning prevents overfitting by restricting or reducing tree growth.
• Pruning can be done using pre-pruning or post-pruning.
• Pre-pruning occurs before or during the growth of the tree.
• Post-pruning: It allows the tree to grow as deep as the data will allow, and then
trim (prune) branches that do not effectively change the classification error rates.
1. Advantage: Post-pruning may not miss significant relationships between attribute
values and classes if the tree is allowed to reach its maximum depth.
2. Disadvantage: It requires additional computations, which may be wasted when
the tree needs to be trimmed back.
Rule Induction
• Rule induction is the process of deducing IF-THEN rules from a dataset.
• Decision rules explain an inherent relationship between the attributes and class
labels in a dataset.
• There are two ways:
1. Direct approach: Direct extraction from the dataset. This can be done using:
1. Sequential Covering using Repeated Incremental Pruning to Produce Error
Reduction (RIPPER) algorithm
2. Sequential Covering using Learn-One-Rule.
2. Indirect (Passive) approach: Derived from previously built decision trees from
the same dataset which is the easiest way to extract rules.
Example 7
Induce the set of rules from the decision tree of the Golf dataset of Example 6.
Answer 7
Rule 1 IF (Outlook = Overcast) THEN Play = Yes
Rule 2 IF (Outlook = Rainy) AND (Windy = FALSE) THEN Play = Yes
Rule 3 IF (Outlook = Rainy) AND (Windy = TRUE) THEN Play = No
Rule 4 IF (Outlook = Sunny) AND (Humidity = High) THEN Play = No
Rule 5 IF (Outlook = Sunny) AND (Humidity = Normal) THEN Play = Yes
Rule accuracy: It is the ratio of the correct records covered by the rule (N_c) to all records covered by the rule (N):
R_acc = N_c / N
Pruning metric (it evaluates the need for pruning the rule): It is the difference between the positive (N_p) and negative (N_n) validation records covered by the rule, divided by their total:
P_metric = (N_p − N_n) / (N_p + N_n)
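Both metrics translate directly into code (a minimal sketch; function names and the example counts are mine):

```python
def rule_accuracy(n_correct, n_covered):
    """R_acc = N_c / N."""
    return n_correct / n_covered

def pruning_metric(n_pos, n_neg):
    """P_metric = (N_p - N_n) / (N_p + N_n)."""
    return (n_pos - n_neg) / (n_pos + n_neg)

# A hypothetical rule covering 8 positive and 2 negative validation records:
print(rule_accuracy(8, 10))  # 0.8
print(pruning_metric(8, 2))  # 0.6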
RIPPER Steps
1. The algorithm starts with the selection of class labels one by one (The first class is
usually the least-frequent class label)
2. Training stage: Develop all the rules for the selected class.
3. Validation stage: The rule model of the selected class is evaluated with a
validation dataset used for pruning to reduce generalization errors such that:
a. Iteratively remove a conjunct if it improves the pruning metric.
b. Aggregate all rules that identify the class data points to form a rule group.
4. In multi-class problems, steps 2 and 3 are repeated for the next class label.
Learn-One-Rule
1. Learn-one-rule starts with an empty rule condition set:
IF {} THEN first class
Obviously, the accuracy of this rule is the same as the proportion of “first class” data points in
the dataset.
2. Then the algorithm greedily adds conjuncts until the rule accuracy reaches 100%.
If the addition of a conjunct decreases the accuracy, then the algorithm:
a. looks for other conjuncts or
b. stops and starts the iteration of the next rule.
3. After a rule is developed, then all the data points covered by that rule are
eliminated from the dataset and then
4. If some data points are left, start creating the second rule (Step 1) and so on.
Example 8
For the following rule, A, B, C, D are called conjuncts and Y is the class.
IF (A AND B AND C AND D) THEN Y
Discuss how rule pruning is applied for this rule.
Answer 8
Rule pruning first removes conjunct D and measures the metric value.
1. If the metric value improves, conjunct D is removed.
2. If not, pruning is checked for the conjunct sequences CD, BCD, and so on.
Example 9
Assume a dataset of two attributes (dimensions) on the
X and Y axis and two-class labels marked by “+” and
“-”. Illustrate how learn-one-rule is applied for this
dataset.
Answer 9
• The least-frequent class is “+”, therefore the
algorithm focuses on generating all rules for “+”
class.
• Learn-One-Rule starts developing the first rule
such that it should cover all “+” data points
using a rectilinear box with none or as few “-”
as possible.
A thermostat agent senses the temperature of a physical system and performs actions to
maintain the temperature near a desired set point (a "closed loop" control device).
Percept
Percept is the agent’s perceptual input at any given instant.
Percept Sequence
It is the complete history of everything the agent has ever perceived.
Example 3
Agent Sensors Actuators
Robotic agent Cameras, infrared range finder, various motors, grippers, wheels,
microphone, accelerometers, … speakers, …
Software agent keystrokes, file contents, Displaying on the screen, writing
network packets, … files, sending network packets, ….
Example 4
Consider a hand-held calculator as an agent. For 2 + 5 = 7,
1. Specify the percept sequence.
2. Specify the action.
Answer 4
1. Percept sequence “2 + 5 =”
2. Action is displaying “7”
Agent Program
Agent program implements an agent function (accepts percepts, combines them with any
stored knowledge (internal state), and returns an action).
Example 5
Assume a two-location vacuum cleaner
1. Specify the percept?
2. What are the possible actions?
3. Write the agent function?
Answer 5
1. The vacuum agent perceives which square it is in (A or B) and whether that square contains dirt; for example, the percept [A, Dirty].
2. The agent can choose to move Left, move Right, Suck up the dirt, or do nothing.
3. A simple agent function: if the current square is dirty, then Suck; otherwise, move to the other square.
Example 6
What will happen if the set of actions is Left, Right, and Suck?
Answer 6
• Once all the dirt is cleaned up, the agent will oscillate needlessly back and forth.
• If the performance measure includes a penalty of one point for each movement left or right, the agent will fare poorly.
Example 7
List the set of actions for a thermostat agent.
Answer 7
The actions are turning the heat ON or turning the heat OFF or taking NO action.
• Factored state:
✓ Each state is defined by a set of features and each of
which has a value.
✓ Example: GPS location, amount of gas in the tank.
• Structured state:
✓ Each state is expressed in the form of objects and relations between them.
✓ Example: Natural language processor
Example 8
How many possible states are there for the two-square vacuum cleaner?
Answer 8
The agent is in one of two locations, each of
which might or might not contain dirt.
P_s = n × 2^n = 2 × 2² = 8
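The state count generalizes to any number of squares, as a one-line sketch shows (function name is mine):

```python
def vacuum_states(n):
    """States = n agent locations x 2^n dirt configurations."""
    return n * 2 ** n

print(vacuum_states(2))  # 8, as in the example
print(vacuum_states(3))  # 24 for a three-square world
```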
Environment Types
• Fully observable: the agent can access the complete state of the environment at each point in time. vs. Partially observable: the agent can observe only a subset of the environment due to noisy, inaccurate, or incomplete sensor data.
• Deterministic: the next state of the environment is completely determined by the current state and the agent's action. (Strategic: the environment is deterministic except for the actions of other agents.) vs. Stochastic: the next state of the environment is random in nature; it is not unique and cannot be completely determined by the agent. A partially observable environment can appear to be stochastic.
• Episodic: the agent's experience is divided into independent, atomic episodes in which the agent perceives and performs a single action. vs. Sequential: previous and current decisions affect all future decisions.
• Static: the environment is unchanged while the agent is deliberating (the agent does not need to keep sensing while deciding). vs. Dynamic: the environment keeps changing while the agent is deciding.
• Known: the outcomes of all probable actions are given. vs. Unknown: the agent must gain knowledge about how the environment works.
• Single agent: an agent operating by itself in an environment. vs. Multiagent: many agents affect each other's performance measures.
Example 9
A Pick and Place robot is used to detect defective parts on conveyor belts. The environments of some common agents can be classified as follows:
• Crossword Puzzle: fully observable, deterministic, sequential, static, discrete, competitive, single agent.
• Chess: fully observable (the board and the opponent's moves are visible), deterministic (only a few moves are possible at the current state, and their outcomes can be determined), sequential, static, discrete (a finite number of moves per game), competitive (the agents compete to win the game, which is the output), single agent.
• Self-Driving Car: partially observable, stochastic, sequential, dynamic, continuous, collaborative, multiagent.
Performance Measure
1. It is an objective criterion for the success of an agent's behavior.
2. There is no fixed performance measure for all tasks and agents.
3. Intelligent agents are supposed to maximize their performance measure.
Example 10
List some performance measures for the vacuum cleaner.
Answer 10
1. Amount of dirt cleaned up,
2. Amount of time taken,
3. Amount of electricity consumed,
4. Amount of noise generated.
Example 11
Discuss the following two performance measures for vacuum cleaner
1. “The amount of dirt cleaned up in a single eight-hour shift.”
Answer 11
1. “The amount of dirt cleaned up in a single eight-hour shift.”
A rational agent can maximize this performance measure by cleaning up the dirt,
then dumping it all on the floor, then cleaning it up again, and so on.
2. “Clean floor: average cleanliness over time.”
This rewards the agent for having a clean floor. For example, one point could be
awarded for each clean square at each time step (perhaps with a penalty for
electricity consumed and noise generated).
Example 12
What is the PEAS description for an automated taxi driver?
Answer 12
Agent: Taxi driver
Performance measure: Safe, fast, legal, comfortable trip, maximize profits
Environment: Roads, other traffic, pedestrians, customers
Actuators: Steering, accelerator, brake, signals, horn, display
Sensors: Cameras, sonar, speedometer, GPS, odometer, accelerometer, engine sensors, keyboard
Example 13
Which is a more complex problem, an automated vacuum cleaner or an automated taxi
driver? Why?
Answer 13
An automated taxi driver because there is no limit to the novel combinations of
circumstances that can arise.
Rational Agent
• Rational agent is one that does the right thing—conceptually speaking, every entry
in the table for the agent function is filled out correctly.
• The “right thing” can be specified by a performance measure defining a
numerical value for any environment history.
• Rational agent will choose actions to maximize some performance measure.
Rational Action
Whichever action maximizes the expected value of the
performance measure given the percept sequence to date.
Agent Architectures
Reactive agent The decision-making is implemented in some form of direct mapping
from situation to action.
Logic-based agent The decision about what action to perform is made via logical deduction.
Belief-Desire- The decision-making depends upon the manipulation of data structures
Intention agents representing the beliefs, desires, and intentions of the agent.
Layered The decision-making is realized via various software layers, each of
architecture which is explicitly reasoning about the environment at different levels of
abstraction.
Agent Types
Agents can be grouped into five classes based on their degree of perceived intelligence
and capability.
• Simple Reflex Agent
• Model-based Agent.
• Goal-based Agent.
• Utility-based Agent.
• Learning Agent.
Simple Reflex Agent
• A simple reflex agent selects actions based only on the current percept, ignoring the rest of the percept history.
• This agent is rational only if a correct decision can be made based on the current percepts.
• Example:
✓ Robotic vacuum cleaner that deliberates in an infinite loop, each percept
contains a state of a current location [clean] or [dirty] and accordingly it decides
the action whether to [suck] or [continue moving].
Abstraction
• Abstraction is a process of simplification by removing detail from a representation
and replacing it with concepts.
• For example, King Khalid University, without saying its position, state, or country.
Model-based Agent
• Model-based agent maintains an internal state via a model of the world to choose
the actions.
1. Model: The knowledge about “how things happen in the world”.
2. Internal State represents percept history which is the history of all that an agent
has perceived to date.
3. Model-based agent needs memory for storing the percept history.
• Model-based agent can handle partially observable environments by keeping
track of the part of the world it cannot see now (using a model about the world).
• Example: self-steering mobile vision where it is necessary to check the percept
history to fully understand how the world is evolving.
• To update the state, the model requires information about −
✓ How the world evolves independently of the agent.
✓ How the agent’s actions affect the world.
Goal-based Agent
• Goal-based agent has a goal and a strategy to reach that goal.
• The agent program combines the goal information with the environment model to
choose the action that improves the progress towards the goal (not necessarily the
best one).
• Goal-based agent is proactive, not reactive in its decision-making.
• Two important aspects for goal-based agents are searching and planning.
• Example:
✓ GPS system to find a path to a certain destination.
✓ Any search robot that has an initial location and wants to reach a destination.
Utility-based Agent
• A utility-based agent uses a utility function that maps a state (or sequence of states) to a real number describing the associated degree of satisfaction, and chooses the action that maximizes the expected utility.
Agent Learning
The idea behind learning is that percepts should be used not only for acting, but also for
improving the agent's ability to act in the future:
• Learning is essential for unknown environments, i.e., when designer lacks
omniscience.
• Learning is useful as a system construction method, i.e., exposing the agent to
reality rather than trying to write it down.
• Learning modifies the agent's decision mechanisms to improve performance.
Learning Agent
• Learning agent learns from its past experiences to improve its performance and has
learning capabilities via machine learning techniques.
• Critic: It is designed to tell the learning element how well the agent is doing with
respect to a fixed performance standard.
1. The critic employs a fixed standard of performance which is necessary because the
percepts themselves do not indicate the agent's success.
2. For example, a chess program may receive a percept indicating that it has
checkmated its opponent, but it needs a performance standard to know that this is
a good thing; the percept itself does not say so.
3. It is important that the performance standard is a fixed measure that is
conceptually outside the agent. Otherwise, the agent could adjust its performance
standards to meet its behavior.
• Performance element: It is responsible for selecting and executing external actions (It
decides what actions to take) based on the information from the learning element.
• Problem generator: It is responsible for suggesting actions that will lead to new and
informative experiences for the learning element to improve its performance.
Important Terms
1. States: The possible world states, 𝑆 = {𝑠1 , 𝑠2 , 𝑠3 , … }
2. Initial state: s0
3. Actions: 𝐴 = {𝑎1 , 𝑎2 , 𝑎3 , … }
Given a state 𝑠, 𝐀𝐂𝐓𝐈𝐎𝐍𝐒(𝐬) returns the set of actions that can be executed for a
state 𝐬. We say that each of these actions is applicable in 𝑠.
4. Transition model (𝝆): This model describes what each action does for transiting
the agent from one state to another (Actions cause transitions between states).
𝜌: 𝑆 × 𝐴 → 𝑆
• Transition model is specified by a function 𝐑𝐄𝐒𝐔𝐋𝐓(𝐬, 𝐚) that returns the
state that results from doing action 𝐚 in state 𝐬.
• 𝐑𝐄𝐒𝐔𝐋𝐓(𝐬, 𝐚) means the agent knows the consequences of its actions.
5. Successors: 𝑠𝑢𝑐(𝐬) is the set of states reachable from a given state, 𝐬, by a single
action (Each action changes the state).
6. StepCost(s, a, s ′ ): It is the cost of taking action 𝐚 in state 𝐬 to reach state 𝐬 ′ .
7. Path (𝑷): It is a sequence of states connected by a sequence of actions from one
𝑎1 𝑎2 𝑎3 𝑎𝑁
state to another; 𝑠0 → 𝑠1 → 𝑠2 → … → 𝑠𝑁 : such that 𝑠𝑁 is a goal state.
𝑃 = [𝑠0 𝑎1 𝑠1 𝑎2 𝑠2 … 𝑎𝑁 𝑠𝑁 ] ∀𝑖 ∈ {1, … , 𝑁} 𝜌(𝑠𝑖−1 , 𝑎𝑖 ) = 𝑠𝑖
8. Goal test (G) function: It determines whether a given state is a goal state or not (G: S → bool).
9. Path cost: It is a function that assigns a cost to each path (The cost of a path is the
sum of the costs of individual actions along the path).
10. Solution: It is the sequence of actions and states the agent takes from the initial
state to the final (goal) state.
• The solution quality is measured by the path cost function.
11. Optimal solution: It is the solution that has the lowest path cost among all
solutions.
12. Search: The process of looking for a solution sequence, involving a systematic
exploration of alternative actions.
13. State accessibility: It describes if the agent can determine via its sensors in which
state it is or not.
14. State space: the set of all states reachable from the initial state by any sequence of actions. It is defined by the initial state, the actions, and the transition model.
Example 1
Water Jug Problem: You have a two-gallon jug and a one-gallon jug; neither has any measuring marks on it at all. Initially both are empty. You need to get exactly one gallon into the two-gallon jug. Formulate this problem.
Answer 1
• A state is defined by the content of each jug, 𝑠 = (2_𝑔𝑎𝑙𝑙𝑜𝑛 𝑗𝑢𝑔, 1_𝑔𝑎𝑙𝑙𝑜𝑛 𝑗𝑢𝑔).
• 𝑺 = {0,1,2} × {0,1} = {(0,0), (1,0), (2,0), (0,1), (1,1), (2,1)}
• Initial state is 𝑠0 = (0,0)
• Goal: 𝐺 = {(1,0), (1,1)}
• A = {f2, f1, e2, e1, t21, t12}: f2 fills jug 2, f1 fills jug 1, e2 empties jug 2, e1 empties jug 1, t21 pours from the 2-gallon jug into the 1-gallon jug, and t12 pours from the 1-gallon jug into the 2-gallon jug.
• 𝝆 is given by the following diagram and table.
A graphical view of the transition function (initial state shaded, goal states outlined bold):
• Path cost is the number of actions in the path.
• There are an infinite number of solutions. Example solutions are:
[𝑓1, 𝑓2, 𝑒2, 𝑡12] [𝑓1, 𝑒1, 𝑓2, 𝑡21, 𝑡12, 𝑓1, 𝑒2, 𝑡12] [𝑓2, 𝑡21]
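Under this formulation, a breadth-first search finds a shortest solution automatically. The sketch below is my own (function names assumed), encoding states as (2-gallon jug, 1-gallon jug) tuples:

```python
from collections import deque

def succ(state):
    """Yield (action, next_state) pairs for the water jug problem."""
    big, small = state  # contents of the 2-gallon and 1-gallon jugs
    yield "f2", (2, small)            # fill jug 2
    yield "f1", (big, 1)              # fill jug 1
    yield "e2", (0, small)            # empty jug 2
    yield "e1", (big, 0)              # empty jug 1
    pour = min(big, 1 - small)        # t21: pour jug 2 into jug 1
    yield "t21", (big - pour, small + pour)
    pour = min(small, 2 - big)        # t12: pour jug 1 into jug 2
    yield "t12", (big + pour, small - pour)

def solve(start=(0, 0)):
    """BFS for a shortest action sequence reaching 1 gallon in the big jug."""
    frontier, seen = deque([(start, [])]), {start}
    while frontier:
        state, path = frontier.popleft()
        if state[0] == 1:             # goal states (1, 0) and (1, 1)
            return path
        for action, nxt in succ(state):
            if nxt not in seen:
                seen.add(nxt)
                frontier.append((nxt, path + [action]))

print(solve())  # ['f2', 't21'] — a shortest two-action solution from the example
```

BFS explores the six-state space level by level, so the first goal state dequeued lies on a shortest path.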
Example 2
Problem States Actions
8-puzzle Tile configurations Up, Down, Left, Right
8-queens Partial board configurations Add queen, remove queen
(incremental formulation)
8-queens Board configurations Move queen
(complete-state formulation)
TSP Partial tours Add next city, pop last city
• A complete state formulation starts with all 8 queens on the board and moves them around
Problem’s Types
There are four essentially different types of problems.
• Single state problem.
• Multiple state problem.
• Contingency problem.
• Exploration problem.
Contingency Problem
State State is unknown in advance, may depend on the outcome of
actions and changes in the environment
State accessibility Some essential information may be obtained through sensors only
at execution time
Consequences of actions Consequences of action may not be known at planning time
Goal Instead of single action sequences, there are trees of actions
Prediction Exact prediction is impossible: It is impossible to define a complete
sequence of actions that constitute a solution in advance because information
about the intermediary states is unknown.
Examples Nondeterministic and/or partially observable problems
Exploration Problem
State The set of possible states may be unknown
State accessibility Some essential information may be obtained through sensors only
at execution time
Consequences of actions Consequences of actions may not be known at planning time
Goal Goal cannot be completely formulated in advance because states
and consequences may not be known at planning time
Prediction Effects of actions are unknown
Examples For problems with unknown state space.
Example 3
1. What are ACTIONS(𝑠5 ) and ACTIONS(𝑠6 )?
2. What is 𝐑𝐄𝐒𝐔𝐋𝐓𝐒(𝐬𝟓 , 𝐫𝐢𝐠𝐡𝐭)?
3. What are 𝑠𝑢𝑐(𝐬𝟓 ) and 𝑠𝑢𝑐(𝐬𝟔 )?
𝑠5 𝑠6
Answer 3
ACTIONS(s5) = {right, suck}
ACTIONS(s6) = {left, suck}
RESULTS(s5, right) = s6
suc(s5) = {s5, s6}
suc(s6) = {s5, s8}
Example 4
What is the state space of the
Vacuum World domain if
Actions are: Left, Right, and
Suck?
Navigation Problem
Given an initial and a goal state(s) defined in the same environment, a system should use
its knowledge (prior knowledge if available, or accumulated knowledge) to plan and execute
a feasible trajectory from a start to a goal state.
Example 5
Draw the Problem Space Graph of navigation in Romania.
Example 6
Assume you are in Arad and you want to reach Bucharest
What is a state? 𝑠 = IN(city)
What is an action? a = GO(city)
What is the initial state? s0 = IN(Arad)
What is ACTIONS(IN(Arad))? The applicable actions are:
{Go(Sibiu), Go(Timisoara), Go(Zerind)}
What is the goal? s𝑁 = {IN(Bucharest)}
Transition model RESULT(IN(Arad), Go(Zerind)) = IN(Zerind)
Successors (𝐈𝐍(𝐀𝐫𝐚𝐝)) {IN(Sibiu), IN(Timisoara), IN(Zerind)}
Example 7
Calculate the path cost of [Oradea → Sibiu → Fagaras → Bucharest] if the step cost is:
1. 1 unit per step.
2. Measured in Km?
Answer 7
1. Path cost = 1+1+1=3 units
2. Path cost = 151+99+211=361 Km
Search Process
Search is the process of examining different possible sequences of actions that lead to
goal state(s) and then choosing the best one.
Search Terminologies
• Problem Space − It is the environment in which the search takes place. (A set of
states and set of operators to change those states)
• Problem Instance − It is Initial state + Goal state.
• Depth of a problem − Length of the shortest path or the shortest sequence of
operators from initial state to goal state.
• Space Complexity − The maximum number of nodes that are stored in memory.
• Time Complexity − The maximum number of nodes that are created.
• Search cost: It is the time and storage requirements to find a solution.
Example 8
For 8-puzzle problem, specify
the problem components?
Answer 8
1. States: Tile configurations (location of blank and location of the 8 tiles)
2. Initial State: Initial configuration of the puzzle.
3. Goal formulation: as shown in the figure.
4. Goal test: Match the given state to the Goal state
5. Actions: move blank to Left, Right, Up, or Down.
6. Transition model: Move one tile to the blank. This will move the blank.
7. Path Cost: The total cost is the length of path as each step costs 1 unit.
Example 9
The 5-Queens problem requires arranging 5 queens on a 5 × 5 (chess) board such that the queens do not attack each other. Specify the problem components.
Answer 9
1. States: 0 to 5 queens arranged on the chess board.
2. State is s = (p1, p2, p3, p4, p5), where p1 is the position of the queen in column 1.
Example: s_i = (3,5,0,0,0): there are two queens, in row 3 of column 1 and row 5 of column 2.
3. Initial state: No queen on board, 𝑠0 = (0,0,0,0,0).
4. Goal formulation: A configuration where no queen attacks another.
5. Goal test: 5 queens on the board such that no queen attacks another.
6. Actions: place a queen on an empty square.
7. Transition model: place a queen on an empty square such that no queen attacks
another.
8. Step cost: 0 (we are only interested in the solution). Note that every goal state is
reached after exactly 5 actions.
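The incremental formulation above maps naturally onto backtracking search: place one queen per column, trying each row that no earlier queen attacks. A minimal sketch (function names are my own):

```python
def n_queens(n=5):
    """Backtracking over the incremental formulation: one queen per column."""
    solutions = []

    def safe(rows, r):
        # A new queen in row r of column len(rows) must share no row or diagonal.
        c = len(rows)
        return all(r != rr and abs(r - rr) != c - cc for cc, rr in enumerate(rows))

    def place(rows):
        if len(rows) == n:
            solutions.append(tuple(rows))
            return
        for r in range(1, n + 1):
            if safe(rows, r):
                place(rows + [r])

    place([])
    return solutions

print(len(n_queens(5)))  # 10 distinct solutions on the 5x5 board
```

Note how pruning unsafe placements early keeps the search far below the 5^5 complete assignments.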
Example 10
Three missionaries and three cannibals are on the left bank of a river with a boat that carries at most two people. Cannibals must never outnumber missionaries on either bank. Specify the problem components.
Answer 10
1. States: A state is described by:
(𝑀𝐿 , 𝐶𝐿 , 𝐵) 𝑴𝑳 : Number of missionaries on the left bank.
𝑪𝑳 : Number of cannibals on the left bank.
𝑩: Location of boat (𝐿, 𝑅).
2. Initial state: (3,3, 𝐿).
3. Goal: (0, 0, R)
4. Goal test: All missionaries are safe in the right bank.
5. Operator: A move is represented by the number of missionaries and the number
of cannibals taken in the boat at one time. There are 5 possible combinations:
(2 Missionaries, 0 Cannibals)
(1 Missionary, 0 Cannibals)
(1 Missionary, 1 Cannibal)
(0 Missionary, 1 Cannibal)
(0 Missionary, 2 Cannibals)
6. Transition model: move the boat with someone on it.
7. Path cost: The number of crossings.
Example 11
What is the Goal test for chess?
Answer 11
The goal is to reach a state called “checkmate,” where the opponent’s King is under
attack and cannot escape.
Example 12
How can 5 be reached from 4 in Knuth's problem?
Answer 12
The problem definition is very simple:
▪ States: Positive numbers.
▪ Initial state: 4.
▪ Goal: 5
▪ Goal test: Check whether the current state equals the desired positive integer.
▪ Actions: Apply factorial, square root, or floor operation (factorial for integers
only).
▪ Transition model: As given by the mathematical definitions of the operations.
▪ Path cost: number of factorials and square roots.
Fuzzy Sets
"A set is a Many that allows itself to be thought of as a One." (Georg Cantor)
Example 2
Suppose we want to represent the following with classical set theory:
• Intelligent students in a class.
• Tall persons.
• Healthy persons.
• Comfortable houses.
• Temperature.
Example 3
Comparison between bipolar and MOS technology Bipolar MOS
Integration Low Very high
is fuzzy Power High Low
Cost Low Low
Universe of Discourse
▪ The universe of discourse, 𝑋, is the space of all elements which can be either
continuous or discrete.
▪ Any fuzzy set 𝐴 defined on a universe of discourse 𝑋 is a subset of that universe.
(Typical membership function shapes: trapezoidal, bell-shaped.)
Example 4
Represent the group of young people using crisp and fuzzy sets.
Example 5
Triangle membership function in
graphical form and mathematical
form is:
Example 6
The identity (characteristic) function of a crisp set versus the membership function of a fuzzy set; graphical representation of a crisp set and a fuzzy set.
Example 7
A person of height 1.79m would belong to both tall and short fuzzy sets with a particular degree of
membership.
Important Terminology
▪ Height of 𝑨 [ℎ(𝐴)]: it is the upper bound of the codomain of its membership
function.
▪ Support of A [supp(A)]: It is the set of elements of X that belong to A to at least some degree.
𝑠𝑢𝑝𝑝(𝐴) = {𝑥 ∈ 𝑋 ∶ 𝜇𝐴 (𝑥) > 0}
▪ Kernel of A: It is the set of elements of 𝑋 belonging entirely to 𝐴.
𝑘𝑒𝑟𝑛𝑒𝑙(𝐴) = {𝑥 ∈ 𝑋 ∶ 𝜇𝐴 (𝑥) = 1}
▪ 𝜶-cut of A: It is the classical subset of elements with a membership degree greater
than or equal to 𝛼.
𝐴𝛼 = {𝑥 ∈ 𝑋 ∶ 𝜇𝐴 (𝑥) ≥ 𝛼}
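The four definitions can be sketched in Python over a small discrete fuzzy set (the set A and all names here are illustrative, not from the lecture):

```python
# A fuzzy set as a dict of element -> membership degree.
A = {"x1": 0.2, "x2": 0.7, "x3": 1.0, "x4": 0.0}

def height(A):
    """Upper bound of the membership function's codomain."""
    return max(A.values())

def support(A):
    """Elements with membership degree > 0."""
    return {x for x, mu in A.items() if mu > 0}

def kernel(A):
    """Elements belonging entirely to A (membership degree = 1)."""
    return {x for x, mu in A.items() if mu == 1}

def alpha_cut(A, alpha):
    """Crisp subset with membership degree >= alpha."""
    return {x for x, mu in A.items() if mu >= alpha}

print(height(A))          # 1.0
print(support(A))         # {'x1', 'x2', 'x3'}
print(kernel(A))          # {'x3'}
print(alpha_cut(A, 0.5))  # {'x2', 'x3'}
```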
Example 8
Example 9
Find 𝐴 ∪ 𝐵, 𝐴 ∩ 𝐵, and
𝐴𝑐 ?
Example 10
𝑋 = {1,2,3,4}
𝐴 = {(1,0.4), (2,0.6), (3,0.7), (4,0.8)}
𝐵 = {(1,0.3), (2,0.65), (3,0.4), (4,0.1)}
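Using the standard max/min/complement operators (μ_{A∪B} = max, μ_{A∩B} = min, μ_{A^c} = 1 − μ_A), the sets of Example 10 can be combined as follows; the code is a sketch, and the operator choice assumes the standard fuzzy operations:

```python
A = {1: 0.4, 2: 0.6, 3: 0.7, 4: 0.8}
B = {1: 0.3, 2: 0.65, 3: 0.4, 4: 0.1}

union        = {x: max(A[x], B[x]) for x in A}       # mu max
intersection = {x: min(A[x], B[x]) for x in A}       # mu min
complement_A = {x: round(1 - A[x], 2) for x in A}    # 1 - mu

print(union)         # {1: 0.4, 2: 0.65, 3: 0.7, 4: 0.8}
print(intersection)  # {1: 0.3, 2: 0.6, 3: 0.4, 4: 0.1}
print(complement_A)  # {1: 0.6, 2: 0.4, 3: 0.3, 4: 0.2}
```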
Example 11
𝑋 = {𝑥1 , 𝑥2 , 𝑥3 , 𝑥4 }
𝐴 = {(𝑥1 , 0.2), (𝑥2 , 0.7), (𝑥3 , 1)}
𝐵 = {(𝑥1 , 0.5), (𝑥2 , 0.3), (𝑥3 , 1), (𝑥4 , 0.1)}
Example 12
𝑋 = {𝑥1 , 𝑥2 , 𝑥3 , 𝑥4 }
𝐴 = {(𝑥1 , 0.2), (𝑥2 , 0.7), (𝑥3 , 1)}
𝐵 = {(𝑥1 , 0.5), (𝑥2 , 0.3), (𝑥3 , 1), (𝑥4 , 0.1)}
Disjoint Sets
Two fuzzy sets 𝐴, and 𝐵 are disjoint if and only if: (all the following are equivalent)
∀𝑥 ∈ 𝑋 ∶ 𝜇𝐴 (𝑥) = 0 ∨ 𝜇𝐵 (𝑥) = 0 Every element has zero membership value to A or B
∀𝑥 ∈ 𝑋 ∶ min(𝜇𝐴 (𝑥), 𝜇𝐵 (𝑥)) = 0 For every element, the minimum membership value is zero
∄𝑥 ∈ 𝑋 ∶ 𝜇𝐴 (𝑥) > 0 ∧ 𝜇𝐵 (𝑥) > 0 No element has positive membership value to both sets
Linguistic Variables
• Linguistic variables are variables whose values are words or sentences in a natural
or artificial language.
• Example: Linguistic variable: speed - Linguistic values: slow, medium, fast
Example 13
Linguistic Hedges
• Linguistic hedges are special linguistic terms by which other linguistic terms are
modified.
• Hedges modify the shape of fuzzy sets, fuzzy truth values, fuzzy probabilities, or
fuzzy predicates.
• Hedges include adverbs such as very, somewhat, more or less, fairly, and slightly.
Example 14
Use hedge “very” for the proposition “x is tall” and “x is short”
Answer 14
Applying the hedge “very” modifies the two propositions as follows:
“x is very tall”
“x is very short”
Example 15
The hedge, shape function, and graphical representations of some hedges in FL are.
Hedge Shape function
Little ℎ𝑙𝑖𝑡𝑡𝑙𝑒 (𝑥) = (𝜇(𝑥))^1.3
Slightly ℎ𝑠𝑙𝑖𝑔ℎ𝑡𝑙𝑦 (𝑥) = (𝜇(𝑥))^1.7
Very ℎ𝑣𝑒𝑟𝑦 (𝑥) = (𝜇(𝑥))^2
Extremely ℎ𝑒𝑥𝑡𝑟𝑒𝑚𝑒𝑙𝑦 (𝑥) = (𝜇(𝑥))^3
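Since each hedge is just a pointwise power of the membership degree, applying one is a one-liner. A sketch in Python (the degree 0.8 is an illustrative value, not from the lecture):

```python
# Hedges as pointwise powers of a membership degree, with the exponents
# from the table above (little 1.3, slightly 1.7, very 2, extremely 3).
HEDGES = {"little": 1.3, "slightly": 1.7, "very": 2, "extremely": 3}

def apply_hedge(mu, hedge):
    """Modify a membership degree mu in [0, 1] with the given hedge."""
    return mu ** HEDGES[hedge]

mu_tall = 0.8  # illustrative degree for "x is tall"
print(round(apply_hedge(mu_tall, "very"), 3))       # 0.64  ("x is very tall")
print(round(apply_hedge(mu_tall, "extremely"), 3))  # 0.512
```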
Fuzzy Logic
Fuzzy Logic is a form of multi-valued logic derived from fuzzy set theory to deal with
reasoning that is approximate rather than precise (resembles human reasoning).
• Human decision making includes a range of possibilities between YES and NO,
such as: CERTAINLY YES, POSSIBLY YES, CANNOT SAY, POSSIBLY NO,
CERTAINLY NO.
Fuzzy Rules
• Fuzzy rules are a collection of linguistic statements (IF-THEN rules) that describe
how the FLS should decide regarding classifying an input or controlling an output.
IF (temperature is high AND humidity is high) THEN room is hot
IF wind is strong THEN sailing is good.
IF project duration is long THEN completion risk is high.
IF speed is slow THEN stopping distance is short.
• The number of fuzzy rules required is dependent on:
1. The number of variables, 2. The number of fuzzy sets, and
3. The ways in which the variables are combined in the fuzzy rule conditions.
• If a fuzzy rule has multiple antecedents, the fuzzy operator (AND or OR) is used to
obtain a single number that represents the result of the antecedent evaluation.
This number (the truth value) is then applied to the consequent membership function.
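With the common choice of min for AND and max for OR, the antecedent evaluation above looks like this (the membership degrees are illustrative):

```python
# Evaluating multi-antecedent fuzzy rules: AND is commonly the min operator
# and OR the max operator. The degrees below are illustrative.
mu_temp_high = 0.75
mu_humidity_high = 0.60

# IF temperature is high AND humidity is high THEN room is hot
truth_and = min(mu_temp_high, mu_humidity_high)  # 0.6

# IF temperature is high OR humidity is high THEN ...
truth_or = max(mu_temp_high, mu_humidity_high)   # 0.75

print(truth_and, truth_or)
```

The resulting truth value is then applied to the consequent membership function, e.g. by clipping it at that level.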
Example 1
Assume a fuzzy system with the following fuzzy sets.
What is the fuzzifier output if the temperature is 350°C and the water level is 1.2 m?
Answer 1
• The temperature, 350°C, is a member of both fuzzy sets high and medium.
• The possibility that the temperature is high is 𝜇𝐻𝑇 = 0.75 and the possibility that
the temperature is medium is 𝜇𝑀𝑇 = 0.25.
• The water level of 1.2m is a member of both fuzzy sets low and medium.
• The possibility that the water level is low is 𝜇𝐿𝑊 = 0.6 and the possibility that the
water level is medium is 𝜇𝑀𝑊 = 0.4.
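The degrees in Answer 1 are read off membership plots that are not reproduced here. A generic triangular membership function, with hypothetical breakpoints chosen so that 𝜇𝐻𝑇 (350) = 0.75, can be sketched as:

```python
# A generic triangular membership function, the kind used to read off degrees
# such as mu_HT = 0.75 above. The breakpoints (a, b, c) are assumptions,
# since the lecture's actual plots are not reproduced here.
def triangular(x, a, b, c):
    """Membership rises linearly from a to the peak at b, then falls to c."""
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)

# Hypothetical "high temperature" set peaking at 400 degrees C:
print(triangular(350, 200, 400, 600))  # 0.75
```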
Example 2
• The common technique for de-fuzzifying is Centroid technique which takes the
output distribution and finds its center of mass to come up with one crisp number.
• Centroid technique [Center of gravity (COG)] finds the point where a vertical
line would slice the aggregate set into two equal masses.
𝐶𝑂𝐺 = ∫𝑎𝑏 𝑥 𝜇𝐴 (𝑥) 𝑑𝑥 / ∫𝑎𝑏 𝜇𝐴 (𝑥) 𝑑𝑥 = ∑𝑏𝑥=𝑎 𝑥 𝜇𝐴 (𝑥) / ∑𝑏𝑥=𝑎 𝜇𝐴 (𝑥)
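In the discrete form, COG is simply a weighted average of the sampled domain points. A sketch in Python (the output distribution is illustrative):

```python
# Discrete center-of-gravity defuzzification: a weighted average of the
# sampled domain points, weighted by their aggregated membership degrees.
def cog(points, mu):
    """COG = sum(x * mu(x)) / sum(mu(x)) over the sampled points."""
    num = sum(x * m for x, m in zip(points, mu))
    den = sum(mu)
    return num / den

# Illustrative aggregated output distribution:
xs  = [0, 10, 20, 30, 40]
mus = [0.0, 0.2, 0.5, 0.5, 0.2]
print(round(cog(xs, mus), 6))  # 25.0
```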
Example 3
Assume the following fuzzy control system.
Rule 1: IF project funding is adequate OR project staffing is small THEN risk is low
IF x is A3 OR y is B1 THEN z is C1
Rule 2: IF project funding is marginal AND project staffing is large THEN risk is normal
IF x is A2 AND y is B2 THEN z is C2
Rule 3: IF project funding is inadequate THEN risk is high
IF x is A1 THEN z is C3
Show how the Mamdani system works if the inputs are project funding (x1) and project staffing (y1).
Answer 3
Step 1: Fuzzification
1. Take the crisp inputs, project funding (x1), and project staffing (y1)
2. Determine the degree to which these inputs belong to each of the appropriate
fuzzy sets.
3. The fuzzified inputs are: 𝜇(𝑥=𝐴1) = 0.5, 𝜇(𝑥=𝐴2) = 0.2, 𝜇(𝑦=𝐵1) = 0.1, and
𝜇(𝑦=𝐵2) = 0.7
Step 4: Defuzzification
Divide the aggregated curve
into slots and then apply
COG.
𝐶𝑂𝐺 = ∑𝑏𝑥=𝑎 𝑥 𝜇𝐴 (𝑥) / ∑𝑏𝑥=𝑎 𝜇𝐴 (𝑥)
Example 4
Assume a fuzzy system with the following rules.
Rule 1: IF temperature is high THEN pressure is high
Rule 2: IF temperature is medium THEN pressure is medium
Rule 3: IF temperature is low THEN pressure is low
Rule 4: IF temperature is high AND water level is NOT low THEN pressure is high
What is the possibility of the pressure variable if the measured temperature is 350°C and
the water level is 1.2 m?
Answer 4
Step 1: Fuzzification
• The temperature, 350°C, is a member of both fuzzy sets high and medium (𝜇𝐻𝑇 =
0.75, 𝜇𝑀𝑇 = 0.25).
• For a water level of 1.2m, the possibility that the water level is low is 𝜇𝐿𝑊 = 0.6
and the possibility that the water level is medium is 𝜇𝑀𝑊 = 0.4.
• The possibility of the water level not being low is: 𝜇̅𝐿𝑊 (1.2 m) = 1 − 𝜇𝐿𝑊 (1.2 m) = 1 − 0.6 = 0.4.
• Thus, Rule 4 sets the possibility that the pressure is high to 𝜇𝐻𝑃 = 0.4, unless it
has already been set to a higher value by another rule.
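The arithmetic of Answer 4 can be written out directly: AND maps to min, NOT to 1 − 𝜇, and repeated conclusions about the same consequent keep the maximum:

```python
# The rule evaluation in Answer 4: AND -> min, NOT -> 1 - mu, and repeated
# conclusions about the same consequent combine with max.
mu_HT, mu_MT = 0.75, 0.25   # temperature high / medium at 350 C
mu_LW = 0.6                 # water level low at 1.2 m

rule1 = mu_HT                   # IF temp high THEN pressure high
rule4 = min(mu_HT, 1 - mu_LW)   # IF temp high AND level NOT low THEN pressure high

mu_HP = max(rule1, rule4)       # pressure-high keeps the higher value
print(rule4, mu_HP)  # 0.4 0.75
```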
Example 6
Show how to build a rule base for a simple Air Conditioner FLS that controls the AC by
comparing the room temperature with the target temperature value.
Answer 6
• Typically, the air conditioner has a fan which blows/cools/circulates fresh air and
has a cooler which is under thermostatic control.
• The amount of air being compressed is proportional to the ambient temperature.
Step 4: Build a set of rules into the knowledge base in the form of IF-THEN structures.
Rule 1: IF (temperature is very_cold AND target is cold) THEN heat
Rule 2: IF ((temperature is very_cold OR temperature is cold) AND target is warm) THEN heat
Rule 3: IF ((temperature is very_cold OR temperature is cold OR temperature is warm) AND
target is hot) THEN heat
Rule 4: IF ((temperature is very_cold OR temperature is cold OR temperature is warm OR
temperature is hot) AND target is very_hot) THEN heat
Rule 5: IF (temperature is very_hot AND target is hot) THEN cool
Rule 6: IF ((temperature is very_hot OR temperature is hot) AND target is warm) THEN cool
Rule 7: IF ((temperature is very_hot OR temperature is hot OR temperature is warm) AND
target is cold) THEN cool
Rule 8: IF ((temperature is very_hot OR temperature is hot OR temperature is warm OR
temperature is cold) AND target is very_cold) THEN cool
Rule 9: IF ((temperature is hot OR temperature is very_hot) AND target is warm) THEN cool
Rule 10: IF temperature is very_cold AND target is very_cold THEN nochange
Rule 11: IF temperature is cold AND target is cold THEN nochange
Rule 12: IF temperature is warm AND target is warm THEN nochange
Rule 13: IF temperature is hot AND target is hot THEN nochange
Rule 14: IF temperature is very_hot AND target is very_hot THEN nochange
Answer 7
1. The room temperature may be defined with five fuzzy sets, cold, cool, pleasant,
warm, and hot.
2. The corresponding speeds of the motor controlling the fan on the air-conditioner
have five graduations: minimal, slow, medium, fast, and blast fuzzy sets.
Example 8
What is the output if the air-conditioner is required to operate at 16°C?
Answer 8
1. Fuzzification: 16°C corresponds to the Cool and Pleasant fuzzy sets, 𝜇𝐶𝑜𝑜𝑙 = 0.3, and
𝜇𝑃𝑙𝑒𝑎𝑠𝑎𝑛𝑡 = 0.3.
2. Inference: Check the rules which contain the above linguistic values. Rule 2 and
rule 3 will be fired. The clipped outputs of the speed fuzzy variable are 𝜇𝑆𝑙𝑜𝑤 =
0.3, and 𝜇𝑀𝑒𝑑𝑖𝑢𝑚 = 0.3.
3. Composition: Create new membership function of the alpha levelled functions for
Cool and Pleasant.
4. Defuzzification: Examine the fuzzy sets of Slow and Medium and obtain a speed
value.
𝐶𝑂𝐺 = ∑𝑏𝑥=𝑎 𝑥 𝜇𝐴 (𝑥) / ∑𝑏𝑥=𝑎 𝜇𝐴 (𝑥) = (20 + 30 + 40 + 50) × 0.3 / (4 × 0.3) = 35
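The final computation checks out numerically:

```python
# Verifying Answer 8's center of gravity: four slots at speeds 20, 30, 40,
# and 50, each clipped to a membership degree of 0.3.
cog = (20 + 30 + 40 + 50) * 0.3 / (4 * 0.3)
print(round(cog, 6))  # 35.0
```

Because all four degrees are equal, the weights cancel and the COG is just the mean of the slot speeds.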