AI Programming CAT 1 & CAT 2 (Muchiri)

UNIVERSITY EXAMINATIONS: 2022/2023

20/04672
BSD 3102 ARTIFICIAL INTELLIGENCE PROGRAMMING

ASSIGNMENT 1

Explain the following as applied in LISP programming, giving an example of each. (5 marks)

LISP is a list-processing programming language, widely used for manipulating data in the form of lists and strings, and it provides an input and output library.

LISP also provides a macro system and well-developed control structures for the manipulation of data.

i. Atoms:

An atom is a fundamental data element in LISP that stands for a single value. Atoms can be boolean values, symbols, or numbers, and they cannot be broken down into smaller parts.

Example: numbers are atoms in LISP; the number 5 is represented by the atom 5.

ii. Strings:

A string in LISP is a sequence of characters enclosed in double quotation marks. It is used to represent textual data.

Example: a string may be defined using the double-quote notation; "Hello, World!" is a string representing the text Hello, World!.

iii. Lists:

A list is a basic data structure in LISP consisting of an ordered collection of elements enclosed in parentheses. Each element of a list can be an atom, a string, or another list.

Example: a list is built with parentheses; the expression (1 2 3) is a list comprising the integers 1, 2, and 3.
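The short session below (an added illustration, not part of the original answer) shows all three data types at a Common Lisp prompt:

5                            ; a numeric atom, evaluates to itself
'x                           ; a symbolic atom
"Hello, World!"              ; a string of 13 characters
(length "Hello, World!")     ; => 13
'(1 2 3)                     ; a list of three numeric atoms
(first '(1 2 3))             ; => 1
'(1 "two" (3 4))             ; a list mixing an atom, a string, and a nested list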
Explain the following components of a planning system (5 marks)

i. States:

In a planning system, states are the different configurations or conditions in which a system or environment can exist. A state describes the system at a given point in time through variables, attributes, or properties, and states change as actions are executed, leading to different subsequent states.

Example: In a navigation planning system, states can represent various aspects, such as the location of the
user, the destination, the current traffic conditions, and any other relevant information related to the
navigation process.

ii. Goal:

The goal in a planning system defines the desired outcome or state that the system aims to achieve. It
represents the final objective that the planning process is working towards. The goal provides a target or a
condition that needs to be satisfied for the planning system to be considered successful.

Example: In a puzzle-solving planning system, the goal could be to arrange a set of puzzle pieces in a
specific order to form a complete picture. The planning system would aim to find a series of actions or steps
that will lead to the desired arrangement of the puzzle pieces.

iii. Precondition:

In a planning system, a precondition refers to the conditions or requirements that must be met or true before a
certain action can be executed. Preconditions ensure that actions are executed only when the necessary
conditions are satisfied, avoiding any potential issues or inconsistencies.

Example: In a manufacturing planning system, a precondition for the action "Start production" could be
that all required raw materials are available in the required quantities. The action of starting production
can only be executed if this precondition is met.

NOTE: These components - states, goal, and preconditions - are crucial elements in a planning system as they
help define the starting point, the desired outcome, and the conditions under which actions can be executed to
reach the goal.
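As a rough sketch of how these components fit together, the fragment below encodes the manufacturing example in a STRIPS-like style. The representation (a state as a list of facts, an action as a property list) is an assumption made for illustration, not something prescribed by the question.

(defvar *state* '((raw-materials-available) (machine-idle)))   ; current state
(defvar *goal*  '((production-running)))                       ; desired state

(defvar *start-production*
  '(:name start-production
    :preconditions ((raw-materials-available) (machine-idle))
    :effects ((production-running))))

;; An action may be executed only when every one of its preconditions
;; holds in the current state.
(defun applicable-p (action state)
  (every (lambda (pre) (member pre state :test #'equal))
         (getf action :preconditions)))

(applicable-p *start-production* *state*)   ; => T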

ASSIGNMENT 3
Discuss the main features of LISP that make it useful in artificial intelligence programming

(5 marks)

LISP (List Processing) is a high-level programming language that is widely used in artificial intelligence
programming because of its key features.

LISP provides flexibility through its dynamic typing, which makes it easier to prototype and modify code.

It also supports symbolic computation and is well-suited for handling complex data structures.

LISP has a powerful macro system that allows for language extension and customization. These features make
LISP particularly useful in artificial intelligence programming where flexibility and complex data manipulation
are essential.

Garbage Collection:

One of Lisp's advantageous features in artificial intelligence programming is its automatic garbage collection,
which manages memory without manual intervention. This capability is particularly beneficial in AI
programming, where complex data structures are prevalent, and manual memory management can be error-prone.

Functional Programming:

Lisp supports and emphasizes functional programming paradigms, empowering developers to treat functions as
first-class citizens. This support allows for the creation of concise and expressive code, facilitating the
development of intricate algorithms and AI-related computations.
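For instance, functions can be passed around like any other value; the two lines below are a small added illustration of this style:

(mapcar (lambda (x) (* x x)) '(1 2 3 4))               ; => (1 4 9 16)
(reduce #'+ (mapcar (lambda (x) (* x x)) '(1 2 3 4)))  ; => 30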

Code as Data and Data as Code:

Lisp's unique ability to treat code as data and vice versa enables powerful metaprogramming and the creation of
domain-specific languages. This characteristic is especially vital in AI programming, where the ability to
dynamically manipulate and generate code is often necessary.
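As a brief added illustration, a quoted expression is plain list data until it is handed to the evaluator:

(defvar *expr* '(+ 1 2 3))   ; code stored as an ordinary list
(first *expr*)               ; => +  (inspected as data)
(eval *expr*)                ; => 6  (executed as code)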

Macro System:

Lisp boasts a robust macro system that enables developers to define their own language constructs. This capability
facilitates the creation of domain-specific languages tailored to the specific requirements of AI tasks, enhancing
productivity and code readability.
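For example (an added sketch, not from the original answer), a macro can introduce a control construct the language does not already have:

;; UNLESS-ZERO runs BODY only when N is non-zero.
(defmacro unless-zero (n &body body)
  `(if (zerop ,n)
       nil
       (progn ,@body)))

(unless-zero 5
  (print "n is non-zero"))   ; prints "n is non-zero"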

Interactive Development:
Lisp provides an interactive development environment that empowers programmers to experiment, test, and
modify code interactively. This feature proves advantageous in AI research and development, where rapid
prototyping, experimentation, and iterative development cycles are common.

Extensibility:

Lisp is highly extensible, allowing programmers to modify the language itself. This extensibility is invaluable in
AI programming, as researchers and developers may need to adapt the language to meet the unique requirements
of their projects, enabling greater flexibility and customization.

Historical Significance:

Lisp has a rich historical presence in AI research and development. Many AI algorithms and techniques were
initially implemented in Lisp, and a wide array of libraries and tools are available for AI programming in Lisp,
allowing developers to leverage existing knowledge and resources.

Portability:

Lisp has been implemented on various platforms, ensuring portability across different systems. This portability is
crucial in AI programming, where algorithms and models often need to be deployed on diverse hardware and
software environments.

CAT 1
Problem Statement: Construct a decision tree classifier for a given data set using ID3.

Dataset description: In this assignment, we'll be using a sample dataset of COVID-19 infection. A preview of the entire dataset is shown below.

ID   Fever   Cough   Breathing_Issue   Infected
1    No      No      No                No
2    Yes     Yes     Yes               Yes
3    Yes     Yes     No                No
4    Yes     No      Yes               Yes
5    Yes     Yes     Yes               Yes
6    No      Yes     No                No
7    Yes     No      Yes               Yes
8    Yes     No      Yes               Yes
9    No      Yes     Yes               Yes
10   Yes     Yes     No                Yes
11   No      Yes     No                No
12   No      Yes     Yes               Yes
13   No      Yes     Yes               No
14   Yes     Yes     No                No
Implementation on our Dataset:

The first step is to find the best feature, i.e. the one with the maximum Information Gain (IG). We will calculate the IG for each of the features, but first we need to calculate the entropy of the entire dataset S.

Entropy(S) = - (8/14) * log₂(8/14) - (6/14) * log₂(6/14) = 0.99
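The helper below reproduces this calculation in Common Lisp (the course language); it is an illustrative sketch rather than part of the exam answer, and it uses the convention that 0 * log₂(0) = 0.

(defun entropy (&rest counts)
  "Entropy of a node, given the number of examples in each class."
  (let ((total (reduce #'+ counts)))
    (- (reduce #'+
               (mapcar (lambda (c)
                         (if (zerop c)
                             0                       ; 0 * log2(0) taken as 0
                             (let ((p (/ c total)))  ; class probability
                               (* p (log p 2)))))
                       counts)))))

(entropy 8 6)   ; => ~0.985, i.e. Entropy(S) = 0.99 to two decimal places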


IG calculation for Fever:

In the Fever feature there are 8 rows with value YES and 6 rows with value NO. As shown below, among the 8 rows with Fever = YES, 6 rows have target value YES and 2 rows have target value NO.
ID   Fever   Cough   Breathing_Issue   Infected
1    Yes     Yes     Yes               Yes
2    Yes     Yes     No                No
3    Yes     No      Yes               Yes
4    Yes     Yes     Yes               Yes
5    Yes     No      Yes               Yes
6    Yes     No      Yes               Yes
7    Yes     Yes     No                Yes
8    Yes     Yes     No                No
As shown below, among the 6 rows with Fever = NO, 2 rows have target value YES and 4 rows have target value NO.

ID   Fever   Cough   Breathing_Issue   Infected
1    No      No      No                No
2    No      Yes     No                No
3    No      Yes     Yes               Yes
4    No      Yes     No                No
5    No      Yes     Yes               Yes
6    No      Yes     Yes               No
The Calculation of Information Gain for Fever.

|F| = 14

For F = YES, |FYes| = 8
Entropy(FYes) = - (6/8) * log₂(6/8) - (2/8) * log₂(2/8) = 0.81

For F = NO, |FNo| = 6
Entropy(FNo) = - (2/6) * log₂(2/6) - (4/6) * log₂(4/6) = 0.92

IG(S, Fever) = Entropy(S) - (|FYes| / |F|) * Entropy(FYes) - (|FNo| / |F|) * Entropy(FNo)

∴ IG(S, Fever) = 0.99 - (8/14) * 0.81 - (6/14) * 0.92 = 0.13
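The weighted-average step can be scripted the same way; this sketch reuses the entropy helper defined above and is, again, an illustration rather than part of the answer.

(defun information-gain (parent-entropy total &rest branch-counts)
  "BRANCH-COUNTS holds one (yes-count no-count) list per branch."
  (- parent-entropy
     (reduce #'+
             (mapcar (lambda (branch)
                       (let ((size (reduce #'+ branch)))
                         ;; weight each branch entropy by its share of rows
                         (* (/ size total) (apply #'entropy branch))))
                     branch-counts))))

;; Fever = YES covers 8 rows split 6/2; Fever = NO covers 6 rows split 2/4.
(information-gain (entropy 8 6) 14 '(6 2) '(2 4))   ; => ~0.13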


IG calculation for Cough:

In the Cough feature there are 10 rows with value YES and 4 rows with value NO. As shown below, among the 10 rows with Cough = YES, 5 rows have target value YES and 5 rows have target value NO.
ID   Fever   Cough   Breathing_Issue   Infected
1    Yes     Yes     Yes               Yes
2    Yes     Yes     No                No
3    Yes     Yes     Yes               Yes
4    No      Yes     No                No
5    No      Yes     Yes               Yes
6    Yes     Yes     No                Yes
7    No      Yes     No                No
8    No      Yes     Yes               Yes
9    No      Yes     Yes               No
10   Yes     Yes     No                No
As shown below, among the 4 rows with Cough = NO, 3 rows have target value YES and 1 row has target value NO.

ID   Fever   Cough   Breathing_Issue   Infected
1    No      No      No                No
2    Yes     No      Yes               Yes
3    Yes     No      Yes               Yes
4    Yes     No      Yes               Yes
The Calculation of Information Gain for Cough.

|C| = 14

For C = YES, |CYes| = 10
Entropy(CYes) = - (5/10) * log₂(5/10) - (5/10) * log₂(5/10) = 1

For C = NO, |CNo| = 4
Entropy(CNo) = - (3/4) * log₂(3/4) - (1/4) * log₂(1/4) = 0.81

IG(S, Cough) = Entropy(S) - (|CYes| / |C|) * Entropy(CYes) - (|CNo| / |C|) * Entropy(CNo)

∴ IG(S, Cough) = 0.99 - (10/14) * 1 - (4/14) * 0.81 = 0.04


IG calculation for Breathing_Issue:

In the Breathing_Issue feature there are 8 rows with value YES and 6 rows with value NO. As shown below, among the 8 rows with Breathing_Issue = YES, 7 rows have target value YES and 1 row has target value NO.
ID   Fever   Cough   Breathing_Issue   Infected
1    Yes     Yes     Yes               Yes
2    Yes     No      Yes               Yes
3    Yes     Yes     Yes               Yes
4    Yes     No      Yes               Yes
5    Yes     No      Yes               Yes
6    No      Yes     Yes               Yes
7    No      Yes     Yes               Yes
8    No      Yes     Yes               No
As shown below, among the 6 rows with Breathing_Issue = NO, 1 row has target value YES and 5 rows have target value NO.

ID   Fever   Cough   Breathing_Issue   Infected
1    No      No      No                No
2    Yes     Yes     No                No
3    No      Yes     No                No
4    Yes     Yes     No                Yes
5    No      Yes     No                No
6    Yes     Yes     No                No
The Calculation of Information Gain for Breathing_Issue.

|B| = 14

For B = YES, |BYes| = 8
Entropy(BYes) = - (7/8) * log₂(7/8) - (1/8) * log₂(1/8) = 0.168 + 0.375 = 0.544

For B = NO, |BNo| = 6
Entropy(BNo) = - (1/6) * log₂(1/6) - (5/6) * log₂(5/6) = 0.43 + 0.22 = 0.65

IG(S, Breathing_Issue) = Entropy(S) - (|BYes| / |B|) * Entropy(BYes) - (|BNo| / |B|) * Entropy(BNo)

∴ IG(S, Breathing_Issue) = 0.99 - (8/14) * 0.544 - (6/14) * 0.65 = 0.40


Since the feature Breathing_Issue has the highest Information Gain, it is used to create the root node. Hence, after this initial step our tree looks like this:

Breathing_Issue
|-- Yes
`-- No

Next, from the remaining two unused features, namely Fever and Cough, we decide which one is best for the left branch of Breathing_Issue. Since the left branch of Breathing_Issue denotes YES, we work with the subset of the original data consisting of the rows having YES in the Breathing_Issue column. These 8 rows are shown below:
ID   Fever   Cough   Breathing_Issue   Infected
1    Yes     Yes     Yes               Yes
2    Yes     No      Yes               Yes
3    Yes     Yes     Yes               Yes
4    Yes     No      Yes               Yes
5    Yes     No      Yes               Yes
6    No      Yes     Yes               Yes
7    No      Yes     Yes               Yes
8    No      Yes     Yes               No

Next, we calculate the IG for the features Fever and Cough using the subset SBY (the set of rows with Breathing_Issue = YES) shown above:

Entropy(SBY) = - (7/8) * log₂(7/8) - (1/8) * log₂(1/8) = 0.168 + 0.375 = 0.544
The Calculation of Information Gain for Fever.

|F| = 8

For F = YES, |FYes| = 5
Entropy(FYes) = - (5/5) * log₂(5/5) - (0/5) * log₂(0/5) = 0 (taking 0 * log₂(0) = 0)

For F = NO, |FNo| = 3
Entropy(FNo) = - (2/3) * log₂(2/3) - (1/3) * log₂(1/3) = 0.92

IG(SBY, Fever) = Entropy(SBY) - (|FYes| / |F|) * Entropy(FYes) - (|FNo| / |F|) * Entropy(FNo)

∴ IG(SBY, Fever) = 0.544 - (5/8) * 0 - (3/8) * 0.92 = 0.20

The Calculation of Information Gain for Cough.

|C| = 8

For C = YES, |CYes| = 5
Entropy(CYes) = - (4/5) * log₂(4/5) - (1/5) * log₂(1/5) = 0.72

For C = NO, |CNo| = 3
Entropy(CNo) = - (3/3) * log₂(3/3) - (0/3) * log₂(0/3) = 0

IG(SBY, Cough) = Entropy(SBY) - (|CYes| / |C|) * Entropy(CYes) - (|CNo| / |C|) * Entropy(CNo)

∴ IG(SBY, Cough) = 0.544 - (5/8) * 0.72 - (3/8) * 0 = 0.09

The IG of Fever is greater than that of Cough, so we select Fever for the left branch of Breathing_Issue. Our tree now looks like this:

Breathing_Issue
|-- Yes --> Fever
|             |-- Yes
|             `-- No
`-- No

Only Cough now remains unused, and on the Fever branches it provides no further separation, so we stop here and jump to the final step of creating the leaf nodes.

For the left leaf node of Fever, we take the subset of rows from the original data set that have both Breathing_Issue and Fever equal to YES.

ID   Fever   Cough   Breathing_Issue   Infected
1    Yes     Yes     Yes               Yes
2    Yes     No      Yes               Yes
3    Yes     Yes     Yes               Yes
4    Yes     No      Yes               Yes
5    Yes     No      Yes               Yes
Since all the values in the target column are YES, we label the left leaf node as YES; to make it more meaningful, we label it Infected.

Similarly, for the right node of Fever we take the subset of rows from the original data set that have Breathing_Issue = YES and Fever = NO.

ID   Fever   Cough   Breathing_Issue   Infected
1    No      Yes     Yes               Yes
2    No      Yes     Yes               Yes
3    No      Yes     Yes               No
We repeat the same process for the node Cough on the right branch of Breathing_Issue; however, here both the left and right leaves turn out to be the same (Not Infected).

Our tree now looks like this:

Breathing_Issue
|-- Yes --> Fever
|             |-- Yes --> Infected
|             `-- No  --> Not Infected
`-- No  --> Cough
              |-- Yes --> Not Infected
              `-- No  --> Not Infected
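Transcribed into Common Lisp (an illustrative rendering of the diagram above; the function and symbol names are my own), classifying a new case is a pair of nested tests:

(defun classify (fever cough breathing-issue)
  "Classify a case with the tree above; each argument is YES or NO."
  (if (eq breathing-issue 'yes)
      (if (eq fever 'yes)
          'infected
          'not-infected)
      ;; both Cough branches carry the same label
      (if (eq cough 'yes)
          'not-infected
          'not-infected)))

(classify 'yes 'no 'yes)   ; => INFECTED
(classify 'no 'yes 'no)    ; => NOT-INFECTED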


CAT 2
(a) Explain three knowledge representation techniques other than predicate logic (6 marks)
1. Semantic Networks:

Semantic networks represent knowledge in the form of nodes and links, where nodes
represent objects or concepts, and links represent relationships between them. This
technique allows for the representation of complex relationships and hierarchical
structures. For example, in a semantic network representing a family tree, nodes would represent individuals, and links would represent parent-child relationships; a small encoding of this idea appears in the sketch after this list.

2. Frames:
Frames organize knowledge using a schema-like structure. Each frame represents an
entity or concept and contains slots that hold attribute-value pairs. Frames provide a way
to represent attributes and their values, as well as inheritance between frames. This
technique is useful for representing structured knowledge domains. For instance, a frame representing a car would have slots for attributes like color, model, and manufacturer; see the frame sketch after this list.
3. Neural Networks:

Neural networks represent knowledge in a connectionist fashion, inspired by the structure and functioning of the human brain. They consist of interconnected nodes (neurons) that simulate learning and inference. Neural networks are particularly effective at recognizing patterns in data and making decisions based on learned knowledge, and they are commonly used in areas like machine learning and pattern recognition.
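The two sketches referenced above are shown below; both use small, assumed Lisp encodings chosen purely for illustration.

;; Semantic network: a family-tree fragment as (node link node) triples.
(defvar *family-net*
  '((john parent-of mary)
    (john parent-of tom)
    (mary parent-of alice)))

(defun children-of (person net)
  "Follow PARENT-OF links out of PERSON."
  (mapcar #'third
          (remove-if-not (lambda (triple)
                           (and (eq (first triple) person)
                                (eq (second triple) 'parent-of)))
                         net)))

(children-of 'john *family-net*)   ; => (MARY TOM)

;; Frame: a car frame as slot/value pairs, with an :IS-A inheritance link.
(defvar *car-frame*
  '(:is-a vehicle
    :color "red"
    :model "Corolla"
    :manufacturer "Toyota"))

(getf *car-frame* :color)   ; => "red"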

(b) Consider the search space below, where S is the start node and G1 and G2 are goal nodes.
Arcs are labeled with the value of a cost function; the number gives the cost of traversing the
arc. Above each node is the value of a heuristic function; the number gives the estimate of the
distance to the goal. Assume that uninformed search algorithms always choose the left branch
first when there is a choice. Assume that the algorithms do not keep track of and recognize
repeated states.

For each of the following search strategies, indicate which goal state is reached first (if any) and list, in order, all the states that are popped off the OPEN list.

(i) Depth-first (2 marks)


(ii) Breadth-first (2 marks)
