
AI Summary

Based on the course:


Artificial Intelligence
2nd Semester 2022

Course provided by:


Dr. Kareem Badawi

Notes prepared by:


Mohamed Alaa Eldeen Shehata

This summary is heavily based on the course and written content provided by Dr. Kareem Badawi, though
not entirely.
Artificial Intelligence
1. Introduction
Artificial intelligence is the science of making machines that have the ability to act rationally.

1.1. Different methods of developing an AI:


An AI that can think like people:

Such an AI would be assumed to think like people in both good ways and bad ways, which is not ideal because machines cannot differentiate between the good and the bad.

This approach also requires us to get inside the human mind to see how it works and then create a model that attempts to copy it, which is a difficult task and is more related to cognitive science.

An AI that can act like people:

Alan Turing proposed the Turing test, which aims to identify a machine by having an interrogator talk to it for about an hour; if the AI manages to go unidentified, it means that a perfect AI has been built.
However, an AI that can pass this test has not yet been created, because interrogators focus on things like not answering too quickly, having a favorite movie, and not being able to calculate the square root of 1412 in the blink of an eye.

An AI that can think rationally:

A concept introduced long ago by Aristotle, but it still isn't the recommended approach because:

1. It’s difficult to encode how to think.


2. It’s not about how an AI thinks but about how it ends up acting.
3. Giving the AI the ability to think rationally can lead to unwanted decisions/solutions made by the
AI, which is not acceptable in many applications such as bank management.
1.2. An AI that can act rationally:
Acting rationally means acting so as to achieve one’s goals given one’s beliefs, or maximally achieving pre-
defined goals.

Rationality only concerns what decisions are made not the thought process behind them.

Since goals are expressed in terms of the utility (benefit) of outcomes, being rational means
maximizing your expected utility.

One can argue that to act rationally, one must think logically; because an AI is basically a computer, and computers can reason logically, it can work toward its goal using logic and trial and error.

Since an AI is also a computer, it can store past experiences in memory or use databases to achieve its goals.

Therefore, we can deduce that memory and simulation are the keys to decision making.

The study of AI as a rational agent is advantageous but it’s also not perfect.

Achieving perfect rationality in complex environments is not possible because the computational
demands are too high.

Note: There's a huge difference between programming a computer to do a certain thing given all the steps
to achieve the goal, and programming a computer with certain decision-making tools that allow it to
learn how to achieve that goal by trial and error.

1.3. AI History:
1.4. Examples of what an AI can do:
Action | Can it be implemented?
Play a decent game of table tennis. | Yes
Play a decent game of Jeopardy. | Yes
Drive safely along a curving mountain road. | Yes
Drive safely in Cairo. | Maybe
Buy a week's worth of groceries on the web. | Yes
Buy a week's worth of groceries at a hypermarket. | No
Discover and prove a new mathematical theorem. | Maybe
Converse successfully with another person for an hour. | No
Perform a surgical operation. | No
Put away the dishes and fold the laundry. | Yes
Translate spoken Chinese into spoken English in real time. | Yes
Write an intentionally funny story. | No
When the answer is No, it is because the environment is too complex, and this complexity MUST
be handled by the AI since the action is too sensitive. Achieving that level of complexity requires unachievable
computational demands.

1.5. AI Applications:
Natural Language:
• Speech technologies:
o Automatic Speech Recognition (ASR).
o Text-To-Speech Synthesis (TTS).
o Dialog Systems.
• Language processing technologies:
o Question answering.
o Machine translation.
o Web search.
o Text classification and spam filtering.

Vision (Perception):
• Object and face recognition.
• Scene segmentation.
• Image classification.

Robotics:
• Home assistant robots.
• Soccer robots.
• Self-driving cars. (Google cars)
• Advanced motion robots.
1.6. Designing Rational Agents:
• An agent is an entity that perceives and acts.
• A percept is the agent's perceptual input at a given instant: what its sensors report about the
environment at that moment.
• A rational agent selects actions that maximize its expected utility.
• An agent is anything that can be viewed as perceiving its environment through sensors and acting
upon that environment through actuators.
• Characteristics of the sensors, actuators, and environment dictate techniques for selecting
rational actions.
• Humans can be considered as agents.

Agents can be grouped into five classes based on their degree of perceived intelligence and capability.

With Waymo, for example, the model-based agent uses GPS to understand its location and to predict the behavior
of the drivers ahead of it. You and I take for granted that, when the brake lights of the car ahead of us come on,
the driver has hit the brakes and so the car in front of us is going to slow down. But there's no reason to
associate a red light with the deceleration of a vehicle, unless you are used to seeing those two things
happen at the same time. So Waymo can learn that it needs to hit the brakes by drawing on its
perceptual history: it learns to associate red brake lights just ahead with the need to slow itself down.
2.1. Types of agents:
Reflex Agents:

Reflex agents ignore percept history and act only on the current percept.

The agent function is based on condition-action rules: if a condition (state) holds, the corresponding action is taken; otherwise it is not.

We can deduce from these characteristics that a reflex agent doesn't consider the future consequences
of its actions; therefore, it doesn't need a model of the world's current state.

This kind of agent can work well in a fully observable environment, but in partially observable environments
infinite loops are often unavoidable unless the agent can randomize its actions.

Example: Blinking your eye, and vacuum cleaner moving towards the nearest dirt.
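As a minimal illustration of a condition-action rule (a hypothetical Python sketch, not part of the course material), a reflex vacuum-cleaner agent can be written as a single function of the current percept:

```python
# A minimal sketch of a simple reflex agent (hypothetical vacuum-world example).
# The agent looks only at the current percept; it keeps no percept history.

def reflex_vacuum_agent(percept):
    """percept is a (location, status) pair, e.g. ('A', 'Dirty')."""
    location, status = percept
    if status == 'Dirty':        # condition-action rule: dirty -> suck
        return 'Suck'
    elif location == 'A':        # clean at A -> move right
        return 'Right'
    else:                        # clean at B -> move left
        return 'Left'

print(reflex_vacuum_agent(('A', 'Dirty')))  # -> 'Suck'
```

Note that the agent consults nothing but the current percept; it has no memory and no model of the world.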

Planning (Model Based Reflex) Agents:

It works by asking “what if” questions.

Model based agents must have a model of the environment and how the environment evolves in response
to its actions (percept history).

The model of the environment must be given to them initially but how the environment evolves in
response to their actions is percept history learned by trial and error or through given data.

Therefore, a planning agent can replan to carry out its task more precisely, but at a higher
computational cost, by evolving its percept history.

They must have a goal (test).

Re-planning Agents

Mastermind Agents
2.2. Environment Classification:
Fully or Partially Observable:

Fully: Meaning that the AI can see the whole environment clearly, as in chess where it can see all the
pieces, the tiles, and where the pieces currently are.

Partially: Meaning that the AI can see only part of the environment, as in dominoes where it can't see the
pieces in the opponent's hand, in medical diagnosis, and in self-driving cars where the car can only see the road
as far as its sensors can reach.

Adversarial:

Is the AI in a hostile environment?

Deterministic or Stochastic:

Deterministic: Meaning that the current state and the action taken can completely determine the next
state of the environment.

Stochastic: A stochastic environment is a random one that cannot be determined by the agent.

HOW IS A SELF-DRIVING CAR DETERMINISTIC?

Discrete or Continuous:

Discrete: Meaning that the environment has a finite number of percepts and actions that can be
performed within it.

Continuous: Meaning that the environment has an infinite number of percepts and actions that can be
performed within it.

Single or Multi-Agent
2.3. PEAS:
PEAS stands for Performance measure, Environment, Actuators, Sensors.

• Performance Measure: it’s the unit to define the success of an agent in what it does.
• Environment: it’s the surroundings of an agent at every instant.
• Actuator: It’s the part of the agent that delivers the output of action to the environment.
• Sensor: It’s the receptive part of an agent that takes in the input for the agent.

2.4. Search Problems:


A search problem consists of:

1. A state space (or a group of states).


2. A successor function.
3. A start state.
4. A goal test/state.

A world state includes every last detail of the environment while a search state keeps only the details
needed for planning or to solve the search problem.

Problem: Path
• States: Location (X, Y)
• Actions: Move (NSEW)
• Successor: Update location
• Goal Test: Is location = (X, Y)END?

Problem: Eat all dots
• States: Location, Dot Booleans
• Actions: Move (NSEW)
• Successor: Update location, and possibly a dot Boolean
• Goal Test: Are all dots false?
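As a rough sketch of how the "Path" problem above might be encoded (the class name, grid representation, and helper methods are illustrative assumptions, not from the course):

```python
# A minimal sketch of a search problem: states, successor function,
# start state, and goal test (illustrative grid-pathing example).

class PathProblem:
    def __init__(self, start, goal, walls):
        self.start = start          # start state, e.g. (0, 0)
        self.goal = goal            # goal location, e.g. (3, 4)
        self.walls = walls          # set of blocked (x, y) cells

    def successors(self, state):
        """Yield (action, next_state) pairs for moving N/S/E/W."""
        x, y = state
        moves = {'N': (x, y + 1), 'S': (x, y - 1),
                 'E': (x + 1, y), 'W': (x - 1, y)}
        for action, nxt in moves.items():
            if nxt not in self.walls:
                yield action, nxt

    def is_goal(self, state):
        return state == self.goal
```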
2.5. State Space Sizes:
World State:
Agent Positions: 120
Food Count: 30
Ghost Positions: 12
Agent Facing: NSEW (4)
Size Calculations:
World States Count: 120 × 2^30 × 12^2 × 4 ≈ 7.4 × 10^13
States for Path: 120
States for Eat all dots: 120 × 2^30
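As a quick arithmetic check of the counts above (a throwaway Python sketch; the factors are simply the components listed for the world state):

```python
# World states: agent positions x food-dot subsets x two ghost positions x facing
world_states = 120 * 2**30 * 12**2 * 4        # roughly 7.4e13
path_states = 120                              # pathing only needs the location
eat_all_dots_states = 120 * 2**30              # location plus one Boolean per food dot
print(f"{world_states:.2e}", path_states, f"{eat_all_dots_states:.2e}")
```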

Therefore, we can understand that we have to be picky when designing an AI; choosing the states with
the greatest benefits to the AI’s functionality and neglecting the ones with minimum benefits.
2.6. State Space Representations:
2.6.1. State Space Graph:
• It’s a mathematical representation of a search
problem.
• Nodes are abstracted world configurations.
• Arcs represent successors (action results).
• The goal test is a set of goal nodes (or just one
node).
• In a state space graph, each state occurs only
once.

A state space graph cannot be fully built as it’s too big; usually a partial graph is built to solve the problem.

2.6.2. Search Tree:


A search tree is a tree that connects successive nodes in the same path together such that each child node
is a successor of its parent node.

• The start state is the root node.


• Children nodes correspond to successor
nodes.
• Nodes show states but correspond to the
plans that achieve those states, not the
states themselves.
• The set of leaf nodes available for
expansion at any moment is called the fringe.
• Each node in the search tree corresponds
to an entire path in state space graph.

It’s a “what if” tree of plans and their outcomes.

• The whole idea is to ask the question of “what if” on each node and if the goal state isn’t met at
that node, then the tree expands according to the search algorithm.

A search tree only expands the states that are needed by the agent to achieve its goals.

Definitions:

• Fringe: The set of nodes that have been generated but not yet expanded (the current candidates for exploration).
• Expansion: Generating a node's successors and adding them to the fringe.
• Exploration strategy: The strategy used to select and expand nodes.
3. Search Algorithms:
A search algorithm is the approach that an agent uses in order to solve a search problem and reach that
goal state.

Properties:

• Complete: Guaranteed to find a solution if one exists.


• Optimal: Guaranteed to find the least cost path.
• Time complexity: How much time it requires. (How heavy it is on the CPU)
• Space complexity: How much memory it requires. (How heavy it is on memory)

For a search tree:

The number of nodes in an entire tree:

1 + b + b^2 + b^3 + ⋯ + b^m = O(b^m)   [Complexity]

Such that:

• b is the branching factor.


• m is the maximum depth.

In this chapter, the search problem of travelling from the node “s” to the node “e” will be used to
demonstrate different search algorithms.
3.1. Uninformed Search:

3.1.1. Depth-First Search (DFS)

Expand the deepest node first, starting from the left.

Fringe is a LIFO stack.

Only needs to store the path from the root to the goal, as well as the unexpanded nodes, to allow for backtracking.

Properties:
• Complete: Only if cycles are prevented in the case of infinite m.
• Non-Optimal: It finds the leftmost solution regardless of depth or cost.
• Time complexity disadvantage: It takes time O(b^m) as it explores the whole depth.
• Space complexity advantage: It only keeps nodes on the path to the root, O(b*m).

3.1.2. Breadth-First Search (BFS)

Expand the shallowest node first.

Fringe is a FIFO queue.

Needs to store all the explored nodes.

https://www.youtube.com/watch?v=PZO3Ue9ySsY

Properties:
• Complete: Yes, provided the shallowest solution depth s is finite.
• Non-Optimal: Because the depth doesn't represent the cost; it's only optimal if the cost is 1 for each step.
• Time complexity advantage: It takes less time, O(b^s), as it only explores down to the shallowest solution depth and reaches its goal state the earliest.
• Space complexity disadvantage: It keeps all the explored nodes as mentioned, and therefore requires more space, O(b^s).
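A compact sketch of both strategies on a small explicit graph (the graph and function name are illustrative assumptions); note that the only difference between DFS and BFS here is whether the fringe behaves as a LIFO stack or a FIFO queue:

```python
from collections import deque

# Minimal search sketch: DFS uses a LIFO stack, BFS uses a FIFO queue.
def graph_search(graph, start, goal, use_stack):
    fringe = deque([[start]])                 # fringe holds partial paths (plans)
    visited = set()
    while fringe:
        path = fringe.pop() if use_stack else fringe.popleft()
        node = path[-1]
        if node == goal:
            return path
        if node in visited:                   # cycle prevention (needed for completeness)
            continue
        visited.add(node)
        for child in graph.get(node, []):
            fringe.append(path + [child])
    return None

graph = {'s': ['a', 'b'], 'a': ['c'], 'b': ['c', 'e'], 'c': ['e']}
print(graph_search(graph, 's', 'e', use_stack=True))   # DFS
print(graph_search(graph, 's', 'e', use_stack=False))  # BFS
```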
3.1.3. Iterative Deepening:
The whole idea is to get a shallow solution with lower memory requirements.

Idea of work:

1. Run a DFS with depth limit 1, if no solution…


2. Run a DFS with depth limit 2, if no solution…
3. Run a DFS…

Since iterative deepening visits states multiple times, it may seem wasteful, but it turns out to be not so
costly, since in a tree, most of the nodes are in the bottom level, so it does not matter much if the upper
levels are visited multiple times.

Its advantage is that the first solution found is a shallowest one, i.e. the solution with the fewest actions.

Properties:

• Complete.
• Optimal if all costs are 1.
• Time complexity is proven to be similar to BFS (advantage).
• Space complexity similar to DFS (advantage).
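A minimal sketch of the idea, assuming a small adjacency-list graph like the one used in the DFS/BFS sketch above (names and structure are illustrative):

```python
# Minimal iterative-deepening sketch: depth-limited DFS run with limit 0, 1, 2, ...
def depth_limited_dfs(graph, node, goal, limit, path=None):
    path = [node] if path is None else path
    if node == goal:
        return path
    if limit == 0:
        return None
    for child in graph.get(node, []):
        result = depth_limited_dfs(graph, child, goal, limit - 1, path + [child])
        if result is not None:
            return result
    return None

def iterative_deepening(graph, start, goal, max_depth=20):
    for limit in range(max_depth + 1):        # increase the limit until a solution appears
        result = depth_limited_dfs(graph, start, goal, limit)
        if result is not None:
            return result
    return None
```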
3.2. Informed Search:
3.2.1. Uniform Cost Search:

Algorithm:

• It processes all nodes with cost less than the cheapest solution by exploring increasing cost
contours.
• If that solution costs C* and arcs cost at least ε, then the effective depth is roughly C*/ε.
• Expand the cheapest node first.
• Fringe is a priority queue depending on cumulative cost.

Properties:
• Complete: Assuming that the best
solution has a finite cost and minimum
arc cost is positive.
• Optimal: Chooses the path with the
lowest cost.

• Time complexity: Takes time O(b^(C*/ε)). (Exponential in the effective depth)
• Space Complexity: Has roughly the last tier, so O(b^(C*/ε)).

Disadvantages:

• It explores options in every direction.


• No information about goal location.
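A hedged sketch of UCS with a cumulative-cost priority queue (the weighted-graph format and names are illustrative assumptions):

```python
import heapq

# Minimal uniform-cost search sketch: the fringe is a priority queue
# ordered by cumulative path cost g(n).
def uniform_cost_search(graph, start, goal):
    fringe = [(0, start, [start])]            # (cumulative cost, node, path)
    explored = set()
    while fringe:
        cost, node, path = heapq.heappop(fringe)   # cheapest node first
        if node == goal:
            return cost, path
        if node in explored:
            continue
        explored.add(node)
        for child, step_cost in graph.get(node, []):
            heapq.heappush(fringe, (cost + step_cost, child, path + [child]))
    return None

graph = {'s': [('a', 1), ('b', 4)], 'a': [('e', 8)], 'b': [('e', 2)]}
print(uniform_cost_search(graph, 's', 'e'))   # -> (6, ['s', 'b', 'e'])
```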
4. Search Heuristics:
A heuristic is a function that estimates how close a state is to a goal. (Am I getting closer to a goal or not?)

A heuristic is designed for a particular search problem. (Meaning that it changes according to the search
problem)

Examples:
For the following Pac-man game, assume the search problem of pathing where we want to reach the dot.
The heuristic here would be the distance between the Pac-man and the dot. The distance may be the
Manhattan distance (horizontal and vertical distance summed up), or Euclidean distance (directly from
the Pac-man to the dot).

For the following map of Romania, assume the search problem of travelling from Arad to Bucharest. The
heuristic here would be the straight-line distance to Bucharest (Euclidean distance to Bucharest).

Choosing the heuristic for the search problem is a critical design step as a different heuristic will lead to a
solution with different properties (time complexity, space complexity, …)
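A small sketch of the two distances mentioned above, written as heuristic functions of a position and a goal (grid coordinates are an illustrative assumption):

```python
import math

# Two common heuristics for grid pathing: both estimate distance to the goal.
def manhattan_distance(pos, goal):
    return abs(pos[0] - goal[0]) + abs(pos[1] - goal[1])   # horizontal + vertical steps

def euclidean_distance(pos, goal):
    return math.hypot(pos[0] - goal[0], pos[1] - goal[1])  # straight-line distance

print(manhattan_distance((1, 1), (4, 5)))  # 7
print(euclidean_distance((1, 1), (4, 5)))  # 5.0
```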
4.1. Informed Search using Search Heuristics:
4.1.1. Greedy Search:
Expand the node that seems closest (with the least heuristic).

Example:
For the following map of Romania, assume the search problem of travelling from Arad to Bucharest. The
heuristic here would be the straight-line distance to Bucharest (Euclidean distance to Bucharest).

Notice how in terms of cost, the algorithm chose a path with a cost equal to (140+99+211=450) while the
optimal path has a cost equal to (140+80+97+101=418), therefore, the greedy search didn’t choose the
optimal solution.
Properties:
• Heuristic: Estimate of distance to nearest goal for each state.
• Classified as a single shot cost (doesn’t accumulate the total cost).
• Classified as a very fast search algorithm.
• Common-case: Can easily take you straight to the (wrong/not the optimal) goal.
• Worst-case: A bad greedy search algorithm can be classified as a badly-guided DFS.
• Greedy search is complete but not optimal.
4.1.2. A* Search
The UCS algorithm orders by path cost (cumulative) or backward cost 𝑔(𝑛).

The GS algorithm orders by goal proximity, i.e. the forward cost ℎ(𝑛).

The A* Search algorithm combines both and orders by the sum of the backward and forward cost.

𝑓(𝑛) = 𝑔(𝑛) + ℎ(𝑛)


The combination of both the UCS and GS algorithms leads to the A* algorithm which is an algorithm that
isn’t as fast as the greedy search but not as slow as the UCS, requires as much space as UCS, and is optimal
through admissible heuristics.

Example:

Algorithm:
• Combines the speed of greedy search with the cost effectiveness of UCS.
• Only stops when we dequeue a goal.
• Expands in a fairly direct route toward the goal.
• Expands mainly toward the goal but does hedge its bets to ensure optimality.

Properties:
• Complete.
• Optimal through admissible/consistent heuristics.
• Time Complexity: Less than UCS.
• Space Complexity: Almost the same as UCS.
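A minimal A* sketch along the lines of the UCS sketch above; the only change is that the priority queue is ordered by f(n) = g(n) + h(n) instead of g(n) alone (the graph format and the heuristic argument are illustrative assumptions):

```python
import heapq

# Minimal A* sketch: the fringe is ordered by f(n) = g(n) + h(n).
def a_star_search(graph, start, goal, h):
    fringe = [(h(start), 0, start, [start])]      # (f, g, node, path)
    explored = set()
    while fringe:
        f, g, node, path = heapq.heappop(fringe)
        if node == goal:                          # only stop when a goal is dequeued
            return g, path
        if node in explored:
            continue
        explored.add(node)
        for child, step_cost in graph.get(node, []):
            g2 = g + step_cost
            heapq.heappush(fringe, (g2 + h(child), g2, child, path + [child]))
    return None
```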
Termination of A* Search:
An A* search algorithm only stops/terminates when we dequeue a goal.

To put it simply, this means that the algorithm only terminates when no other nodes exist on the fringe
with a total cost equal to or less than the total cost of the current candidate solution. When that happens,
the algorithm explores the goal which removes it from the fringe.

Examples:

In this example, the red path is not optimal, however, the algorithm will not terminate once it finds that
goal, it’ll first explore 𝑆 → 𝐴 as 𝐴 is a node on the fringe that has a less total cost than the current
candidate solution’s cost.

It’ll then proceed to find the optimal solution 𝑆 → 𝐴 → 𝐺.

In this example, the red path is not optimal but will be chosen as the solution anyway because the node
𝐴 on the fringe has a total cost more than the total cost of the candidate solution, therefore, it won’t be
explored.

Therefore, the node 𝐴 where the optimal goal lies beyond is trapped on the fringe by the bad heuristic.

This is where admissibility is required to solve the problem that may be caused by the heuristic function.
Admissibility of a Heuristic:

A heuristic is inadmissible when it breaks optimality (leads to the non-optimal solution) by trapping good
plans on the fringe.

A heuristic is admissible when it slows down bad plans but never outweighs true costs. This means that
the heuristic at any node is lower than or equal to the true cost to a nearest goal.

0 ≤ ℎ(𝑛) ≤ ℎ∗ (𝑛)

Where ℎ∗ (𝑛) is the true cost to a nearest goal.

An example of an admissible heuristic would be taking the Euclidean distance as the heuristic in a Pac-
man pathing search problem, since the heuristic at any location will always be less than or equal to the actual
cost to the nearest goal, due to the presence of barriers.
Optimality of A* Search Algorithm:

In the following example, we’ll prove that the A* search algorithm is optimal using the heuristic ℎ by
showing that 𝐴 will exit the fringe before 𝐵 under the following assumptions:

• 𝐴 is an optimal goal node.


• 𝐵 is a suboptimal goal node.
• ℎ is admissible.
Proof:

1. Assume that B is on the fringe while an ancestor of A which is called n is also on the fringe.
2. Since 𝐴 is the optimal goal, then its cost is the true cost to the nearest goal.
o 𝑓(𝐴) = 𝑔(𝐴) + ℎ(𝐴) = 𝑔(𝐴) [ℎ(𝐴) = 0 𝑎𝑠 𝑡ℎ𝑒 ℎ𝑒𝑢𝑟𝑖𝑠𝑡𝑖𝑐 𝑎𝑡 𝑡ℎ𝑒 𝑔𝑜𝑎𝑙 𝑖𝑠 𝑧𝑒𝑟𝑜]
3. The total cost of reaching the node (n) is:
o 𝑓(𝑛) = 𝑔(𝑛) + ℎ(𝑛)
4. Since the heuristic is admissible and n is an ancestor of the optimal goal A, then:
o 𝑓(𝑛) = 𝑔(𝑛) + ℎ(𝑛) ≤ 𝑓(𝐴) = 𝑔(𝐴)
5. The total cost of reaching the candidate goal (B) is:
o 𝑓(𝐵) = 𝑔(𝐵) + ℎ(𝐵) = 𝑔(𝐵) [ℎ(𝐵) = 0 𝑎𝑠 𝑡ℎ𝑒 ℎ𝑒𝑢𝑟𝑖𝑠𝑡𝑖𝑐 𝑎𝑡 𝑡ℎ𝑒 𝑔𝑜𝑎𝑙 𝑖𝑠 𝑧𝑒𝑟𝑜]
6. Since the heuristic is admissible and B is a suboptimal goal, then:
o 𝑔(𝐴) < 𝑓(𝐵) = 𝑔(𝐵)
7. Therefore:
o 𝑓(𝑛) ≤ 𝑓(𝐴) < 𝑓(𝐵)
o n expands before B.
o A expands before B, therefore, A leaves the fringe (dequeues).
o A* search algorithm is optimal.

A* Applications:
• Video games.
• Pathing/routing problems.
• Resource planning problems.
• Robot motion planning.
• Language analysis.
• Machine translation.
• Speech recognition.
5. Constraint Satisfaction Problems:
5.1. A comparison between SSPs & CSPs
5.1.1. Standard Search Problems:
In standard search problems we assumed that:

• We’re dealing with a single agent.


• We’re performing deterministic actions.
• We’re in a fully observed environment.
• We’re in a discrete environment.

In standard search, we deal with a planning problem.

Planning: Sequences of actions

• The path to the goal is important but the goal itself is really not the center of attention.
• Paths have various costs, depths.
• Heuristics give problem-specific guidance.

5.1.2. Constraint Satisfaction Problems:


In CSPs, we deal with an identification problem.

An identification problem is one where we must simply identify whether a state is a goal state or not,
with no regard to how we arrive at that goal.

Identification: Assignments to variables

• The goal itself is important not the path or plan.


• All paths at the same depth (for some formulations)
• CSPs are a specialized class of identification problems.
5.1.3. EXTENSION: Standard Search Problems:
In standard search problems, if you remember, we mentioned in search trees that nodes do not
represent the states themselves but represent plans that allow us to reach the goal state.

From the AI’s perspective:

• In standard search problems, a state is a “black box”: an arbitrary data structure given to the agent
to solve the problem; the AI agent doesn’t have any information about what each state represents.
• Goal test can be any function over states.
• Successor function can also be anything.

5.1.4. EXTENSION: Constraint Search Problems:


• A CSP is a special subset of search problems
• In constraint search problems, a state is defined by variables 𝑋𝑖 , with values from a domain 𝐷
(sometimes 𝐷 depends on 𝑖).
• Goal test is a set of constraints specifying allowable combinations of values for subsets of
variables.

CSPs allow for useful general-purpose algorithms with more power than standard search algorithms.

• Variables: CSPs possess a set of N variables 𝑋1 , … , 𝑋𝑁 that can each take on a single value from
some defined set of values.
• Domain: A set {𝑥1 , … , 𝑥𝑑 } representing all possible values that a CSP variable can take on.
• Constraints: Constraints define restrictions on the values of variables, potentially with regard to
other variables.

In total, there are O(d^N) possible complete assignments.

EX: for IPv4 you have 32 bits in the IP, so there are 2^32 variations of that IP; the domain is {0, 1}, and the
variables are the 32 bits (each identified by its position).
5.2. Examples of CSPs
5.2.1. Map Coloring Example:
The idea is that we want each state (i.e., each region of the map) to have a color that is different from its neighbors'.

• Variables: WA, NT, Q, NSW, V, SA, T.


• Domain: D = {red, green, blue}
• Constraints: Adjacent regions must have different colors.
o Implicit: 𝑊𝐴 ≠ 𝑁𝑇
o Explicit: (𝑊𝐴, 𝑁𝑇) ∈ {(𝑟𝑒𝑑, 𝑔𝑟𝑒𝑒𝑛), (𝑟𝑒𝑑, 𝑏𝑙𝑢𝑒), … }
• Solutions are assignments satisfying all constraints as:
o {𝑊𝐴 = 𝑟𝑒𝑑, 𝑁𝑇 = 𝑔𝑟𝑒𝑒𝑛, 𝑄 = 𝑟𝑒𝑑, 𝑁𝑆𝑊 = 𝑔𝑟𝑒𝑒𝑛, 𝑉 = 𝑟𝑒𝑑, 𝑆𝐴 = 𝑏𝑙𝑢𝑒, 𝑇 = 𝑔𝑟𝑒𝑒𝑛}
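A sketch of how this map-colouring CSP could be written down as plain data (the dictionary/list representation is an illustrative assumption; the adjacency pairs follow the standard Australia map):

```python
# The Australia map-colouring CSP written down as plain data structures.
variables = ['WA', 'NT', 'Q', 'NSW', 'V', 'SA', 'T']
domains = {v: ['red', 'green', 'blue'] for v in variables}

# Binary constraints: each pair of adjacent regions must differ in colour.
adjacent = [('WA', 'NT'), ('WA', 'SA'), ('NT', 'SA'), ('NT', 'Q'),
            ('SA', 'Q'), ('SA', 'NSW'), ('SA', 'V'), ('Q', 'NSW'), ('NSW', 'V')]

def satisfies_constraints(assignment):
    """True if no two adjacent, already-assigned regions share a colour."""
    return all(assignment[a] != assignment[b]
               for a, b in adjacent
               if a in assignment and b in assignment)

print(satisfies_constraints({'WA': 'red', 'NT': 'green', 'SA': 'blue'}))  # True
```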

5.2.2. Sudoku Example:


• Variables: Each (open) square.
• Domain: {1, 2, … , 9}
• Constraints:
o A 9-way “all different” constraint for each row.
o A 9-way “all different” constraint for each column.
o A 9-way “all different” constraint for each region.
5.2.3. N-Queens Example:
The idea is that we want to place the queens so that no two queens threaten each other, i.e. no two queens
share the same row, column, or diagonal.

Formulation One:

• Variables: 𝑋𝑖𝑗
o The variables represent each tile in the board with 1 ≤ 𝑖, 𝑗 ≤ 𝑁, where N is the number
of rows/columns and the number of queens on the board.
• Domain: D = {0, 1}
o The domain represents whether there’s a queen or not.

If a variable/tile is given the value 0 from the domain, there's no queen in that tile; if it's given 1, there is.

• Constraints:
• ∑ 𝑋𝑖𝑗 = 𝑁 (summed over all 𝑖, 𝑗)
▪ This constraint states that we must have exactly N grid positions marked
with a 1, and all others marked with a 0, capturing the requirement that
there are exactly N queens on the board.
• ∀𝑖, 𝑗, 𝑘 (𝑋𝑖𝑗 , 𝑋𝑖𝑘 ) ∈ {(0, 0), (0, 1), (1,0)}
▪ This constraint states that if two variables have the same value for i, only
one of them can take on a value of 1, encapsulating the condition that no
two queens can be in the same row.
• ∀𝑖, 𝑗, 𝑘 (𝑋𝑖𝑗 , 𝑋𝑘𝑗 ) ∈ {(0, 0), (0, 1), (1,0)}
▪ This constraint states that if two variables have the same value for j, only
one of them can take on a value of 1, encapsulating the condition that no
two queens can be in the same column.
• ∀𝑖, 𝑗, 𝑘 (𝑋𝑖𝑗 , 𝑋𝑖+𝑘,𝑗+𝑘 ) ∈ {(0, 0), (0, 1), (1,0)}
• ∀𝑖, 𝑗, 𝑘 (𝑋𝑖𝑗 , 𝑋𝑖+𝑘,𝑗−𝑘 ) ∈ {(0, 0), (0, 1), (1,0)}
▪ With similar reasoning as above, we can see that the previous two
constraints represent the conditions that no two queens can be in the same
major or minor diagonals, respectively.
Formulation Two:

• Variables: 𝑄𝑘
o The variable represents each row.
• Domain: {1, 2, 3, … , 𝑁}
o The domain represents the tiles in the row.
• Constraints:
o Implicit: ∀𝑖, 𝑗 𝑁𝑜𝑛 − 𝑡ℎ𝑟𝑒𝑎𝑡𝑒𝑛𝑖𝑛𝑔(𝑄𝑖 , 𝑄𝑗 )
o Explicit: (𝑄1 , 𝑄2 ) ∈ {(1, 3), (1, 4), … }, (𝑄2 , 𝑄3 ) ∈ ⋯
▪ The explicit constraint means that, for rows 1 and 2, we may place the queen in the first tile of
row 1 and the third tile of row 2, and so on; we're listing all the combinations of positions
where the queens in rows 1 and 2 are safe from each other.
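A hedged sketch of the implicit Non-threatening constraint from formulation two: with one variable per row holding a column number, two queens threaten each other if they share a column or a diagonal (the function name and signature are illustrative assumptions):

```python
# Formulation two: Q[i] is the column of the queen in row i (1..N).
def non_threatening(q_i, q_j, i, j):
    """True if the queens in rows i and j do not attack each other."""
    same_column = (q_i == q_j)
    same_diagonal = (abs(q_i - q_j) == abs(i - j))
    return not same_column and not same_diagonal

# Example: queens at (row 1, col 1) and (row 2, col 3) are safe.
print(non_threatening(1, 3, 1, 2))  # True
print(non_threatening(1, 2, 1, 2))  # False: they share a diagonal
```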
5.3. Expanding Knowledge on CSPs:
5.3.1. Constraint Graphs:
Constraint satisfaction problems are often represented as constraint graphs, where nodes represent
variables and edges represent constraints between them.

Algorithms use the constraint graph structure to speed up the search process.

5.3.2. Types of Constraints:


Unary Constraints:
• Involve a single variable in the CSP, e.g. 𝑆𝐴 ≠ 𝑔𝑟𝑒𝑒𝑛.
• They are not represented in constraint graphs, instead simply being used to prune the domain of the variable they constrain when necessary.

Binary Constraints:
• Involve two variables, e.g. 𝑆𝐴 ≠ 𝑊𝐴.
• They're represented in constraint graphs as traditional graph edges.

Higher-order Constraints:
• Constraints involving three or more variables, e.g. cryptarithmetic column constraints.
• They can also be represented with edges in a CSP graph, they just look slightly unconventional.

Soft Constraints

• Often represented by a cost for each variable assignment.


• Gives constrained optimization problems.
• E.g., red is better than green.
5.3.3. Types of Variables:
Discrete Variables

A discrete variable can take only a specific value from the set of all possible values; in other words, if the
value does not "keep counting" (it comes from a countable set), it is a discrete variable, also known as a categorical variable.

They can be categorized into:

• Finite Domains:
o The number of values in the domain is limited.
o E.g., Boolean CSPs, including Boolean satisfiability.
o Size d means O(d^n) complete assignments.
• Infinite Domains (integers, strings, etc.)
o The number of values in the domain is vast.
o E.g., job scheduling, variables are start/end times for each job.
o Linear constraints are solvable while nonlinear are undecidable.

Continuous Variables

A continuous variable can take any value in a range. Think of it like this: if the number in the variable can keep
counting (being refined to ever finer values), then it's a continuous variable.

• The variables in the domain are periodic.


• E.g., start/end times for Hubble telescope observations.
• Linear constraints are solvable in polynomial time by LP methods.

5.4. Real-World CSPs Applications:


• Assignment Problems: e.g., who teaches what class.
• Timetabling Problems: e.g., which class is offered when and where.
• Hardware Configuration.
• Transportation Scheduling.
• Factory Scheduling.
• Circuit Layout.
• Fault Diagnosis
5.5. Enhancing Standard Search Formulation/Algorithm by CSPs:
Standard search formulation utilizing CSPs is an improvement to the standard search algorithm.

The whole idea is implementing the methods used in CSPs on standard search algorithms to improve
their performance.

For understanding:

A State is defined by the value assigned so far from the domain to the variables in the state:

• Initial State: The state contains variables; those variables don’t have any assigned values from
the domain.
• Successor Function: Assign a value from the domain to an unassigned variable in the current
state.
• Goal Test: Is the current state complete, and does it satisfy all the constraints?

For memorizing:

States are defined by the values assigned so far (partial assignment):

• Initial State: The empty assignment/variable, {}.


• Successor Function: Assign a value to an unassigned variable.
• Goal Test: Is the current assignment complete, and does it satisfy all the constraints?
6. Backtracking Search:
6.1. What is Backtracking Search:
Constraint satisfaction problems are traditionally solved using a search algorithm known as backtracking
search. Backtracking search is an optimization on depth first search used specifically for the problem of
constraint satisfaction. Backtracking search is the basic uninformed algorithm for solving CSPs with
improvements on DFS coming from two main principles:

Improvement 1: One Variable at a time.

When assigning values from the domain to variables, only do one variable at a time.

• Only need to consider assignments to a single variable at each step.


• Value assignments are commutative, so fix ordering.
o I.e., [WA = red then NT = green] is the same as [NT = green then WA = red].

Improvement 2: Check constraints as you go.

When selecting values for a variable from the domain, only select values that don’t conflict with any
previously assigned values. If no such values exist, backtrack and return to the previous variable,
changing its value.

• Consider only values which do not conflict with previous assignments.


• “Incremental goal test”
• Might have to do some computation to check the constraints.

DFS with the previous two improvements is called backtracking search.

Backtracking = DFS + Variable-Ordering + Fail-On-Violation
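A minimal backtracking-search sketch that applies both improvements, one variable at a time and checking constraints as you go (the representation matches the illustrative map-colouring data sketched in section 5.2.1; the function is an assumption, not the course's code):

```python
# Minimal backtracking search: assign one variable at a time and
# only keep values that do not violate any constraint so far.
def backtracking_search(variables, domains, consistent, assignment=None):
    assignment = {} if assignment is None else assignment
    if len(assignment) == len(variables):          # goal test: complete assignment
        return assignment
    var = next(v for v in variables if v not in assignment)  # fixed ordering
    for value in domains[var]:
        assignment[var] = value
        if consistent(assignment):                 # incremental goal test
            result = backtracking_search(variables, domains, consistent, assignment)
            if result is not None:
                return result
        del assignment[var]                        # fail -> backtrack
    return None
```

For example, backtracking_search(variables, domains, satisfies_constraints) with the map-colouring data from the earlier sketch returns one complete, consistent colouring.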


6.2. Improving on Backtracking Search:
Though backtracking search is a vast improvement over the brute-forcing of depth first search, we can
get more gains in speed still with further improvements through filtering, variable/value ordering, and
structural exploitation.

6.2.1. Filtering:
Filtering involves keeping track of the domain for unassigned variables and crossing off bad options.

This can be done using two methods:

Forward Checking:

After assigning a value from the domain to a variable, remove, from the domains of the unassigned variables that share
a constraint with the assigned variable, the values that would cause a violation.

• Cross off values that violate a constraint when added to the existing assignment.
• Disadvantage is that it doesn’t provide early detection for all failures.

Forward Checking: Enforcing consistency of arcs pointing to each new assignment.

Consistency of a single Arc:

• For an arc X→Y, it's consistent if for every value “x” remaining in the tail X, there is some value “y” in the head Y which
could be assigned without violating a constraint.
o In other words, every value still assignable to the tail variable has at least one compatible value
assignable to the head variable. If that is true, then the arc is consistent.

Forward checking enforces that all arcs pointing from the unassigned variables that share a constraint
with the assigned variable are consistent.
Arc-Consistency of an entire CSP:

In arc-consistency filtering algorithm, we make sure that all arcs are consistent.

• If “X” loses a value, neighbors of “X” need to be rechecked.


• Arc-Consistency detects failure earlier than forward checking.
• Can be run as a preprocessor or after each assignment.

Extra Knowledge:

For arc consistency, we interpret each undirected edge of the constraint graph for a CSP as two directed edges pointing in opposite directions. Each of these directed edges is called an arc. The arc consistency algorithm works as follows:

• Begin by storing all arcs in the constraint graph for the CSP in a queue 𝑄.
• Remove arcs from 𝑄 such that in each removed arc, 𝑋𝑗 → 𝑋𝑖 , for every value for the tail variable
𝑋𝑗 , there exists at least one value for the head variable 𝑋𝑖 that does not violate any constraints.
• If one value for the tail variable 𝑋𝑗 would not work with any values in the head variable 𝑋𝑖 , we
remove that value from the set of possible values tail variable 𝑋𝑗 .
• If a value is removed for 𝑋𝑗 when enforcing arc-consistency for an arc 𝑋𝑗 → 𝑋𝑖 , add arcs of the
form 𝑋𝑘 → 𝑋𝑗 to 𝑄 where 𝑋𝑘 represents all unassigned variables.
• Continue until all arcs are removed from 𝑄.
• Assign a value to a variable, then repeat the previous steps.
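A compact AC-3-style sketch of the filtering loop described above (it covers only the arc-processing steps, not the assignment step; the CSP representation with domain lists, a neighbour map, and an allowed(Xi, x, Xj, y) constraint predicate is an illustrative assumption):

```python
from collections import deque

# AC-3-style filtering sketch: repeatedly remove tail values that have
# no consistent head value, re-queuing neighbouring arcs when a domain shrinks.
def enforce_arc_consistency(domains, neighbors, allowed):
    queue = deque((xi, xj) for xi in domains for xj in neighbors[xi])
    while queue:
        xi, xj = queue.popleft()
        removed = False
        for x in list(domains[xi]):
            # keep x only if some y in the head's domain satisfies the constraint
            if not any(allowed(xi, x, xj, y) for y in domains[xj]):
                domains[xi].remove(x)
                removed = True
        if removed:
            if not domains[xi]:
                return False                      # a domain emptied: no solution
            for xk in neighbors[xi]:
                if xk != xj:
                    queue.append((xk, xi))
    return True
```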

Limitations of Arc-Consistency:

• After enforcing arc consistency, you may have one solution left, multiple solutions, or no
solutions at all and not know it.
• Arc-Consistency can lead to no solution for the problem due to the arc structure.

EX: All Arcs are consistent but there is still no solution.


6.2.2. Ordering:
Ordering is deciding which variable should be assigned next, and in which order the values should be tried
for assignment.

Apply the following to improve backtracking search:

Variable Ordering: Minimum Remaining Values (MRV)

• Choose the variable with the fewest legal values left in its domain.
• That variable is also called “most constrained variable”.

Value Ordering: Least Constraining Value (LCV)

• Given a choice of variable, choose the least constraining value, i.e. the value that rules out the fewest values in
the domains of the remaining variables.
• It may take more computation.

6.2.3. Structure:
Structure is exploiting the problem’s structure to improve the performance of backtracking search.
