
Search in Complex Environments

Local search
Definition
Local search algorithms operate by starting from an initial state and then searching
through neighboring states without keeping track of the paths or the set of states that
have already been reached. This means these algorithms are not systematic; they may
never explore certain areas of the search space where a solution could exist. However,
they offer two significant advantages:

They require very little memory.


They can often find reasonable solutions in large or infinite state spaces where
systematic algorithms may not be suitable.

Key Characteristics
Neighborhood-Based: These methods explore solutions that are close to the
current one.
Iterative Improvement: They progress to a better solution at each step.
Local Optima: They can become stuck in a local optimum, which may not be the
global optimum.
Efficiency: They are computationally efficient, making them suitable for large
search spaces.

Examples:
Hill-Climbing: Moves to the best neighboring solution.
Simulated Annealing: Allows for moves to worse solutions based on a probability
mechanism.
Tabu Search: Utilizes memory to prevent revisiting recent solutions.
Genetic Algorithms: Combines multiple solutions to explore the search space
effectively.

Applications
Solving optimization problems such as the Traveling Salesman Problem (TSP).
Tuning hyperparameters in machine learning.
Resource allocation and scheduling tasks.

Local search algorithms can also solve optimization problems, in which the aim is to
find the best state according to an objective function.
Optimization problem
Finding the best state according to some objective function.
In the state-space landscape, we can solve an optimization problem by locating the
lowest valley or the highest peak, under the assumption that each state is defined by its
location and its elevation (the value of the cost or objective function).

If elevation represents cost, the goal is to locate the lowest valley, which corresponds
to the global minimum. Conversely, if elevation represents an objective function to be
maximized, the goal is to find the highest peak, which corresponds to the global maximum.

Local search algorithms explore this landscape. A complete local search algorithm
always finds a goal if one exists, and an optimal local search algorithm always finds a
global minimum or maximum.

Hill-climbing algorithm
Hill climbing is a local search algorithm used to solve optimization problems. This
iterative method begins with a random solution and seeks to improve it through
incremental changes. The name "hill climbing" comes from the analogy of climbing a
hill, where the objective is to reach the highest point (the optimal solution) by
consistently moving upward toward better solutions. The algorithm moves in the
direction that offers the steepest ascent.

Unlike other algorithms, hill climbing does not maintain a search tree; therefore, the
data structure for the current state only needs to record the current state and the
value of the objective function. Additionally, hill climbing only considers the immediate
neighbors of the current state and does not look beyond them.
Key Concepts of Hill-Climbing Search:
Objective Function: The algorithm aims to maximize an objective function, which
evaluates the quality of a solution.
Neighborhood: At each step, the algorithm explores the "neighborhood" of the
current solution, which consists of solutions that can be reached by making small
changes (e.g., flipping a bit, swapping elements, etc.).
Greedy Approach: Hill-climbing always moves to a neighboring solution that
improves the objective function, making it a greedy algorithm.
Local Optima: The algorithm can get stuck in a local optimum, where no
neighboring solution is better, but the global optimum (the best possible solution)
has not been reached.

Steps of Hill-Climbing Search:


1. Start with an initial solution: This can be randomly generated or based on some
heuristic.
2. Evaluate the current solution: Calculate the value of the objective function for the
current solution.
3. Generate neighboring solutions: Create a set of solutions that are close to the
current solution (e.g., by making small modifications).
4. Evaluate neighboring solutions: Calculate the objective function for each
neighbor.
5. Move to the best neighbor: If a neighbor has a better objective value, move to
that solution and repeat the process.
6. Terminate: If no better neighbor exists, the algorithm stops, and the current
solution is returned as the result.

Pseudocode for Hill-Climbing Search:

Algorithm Hill-Climbing Search
Input: Initial node with its state and objective value objf
Output: Best solution found
function hillClimbingSearch(Initial_node)
    current ← Initial_node              //Initialize with the starting solution
    while True do
        //Generate the neighbors of current and keep the best one according to the objective function
        n ← bestNeighborOf(current)
        if n.objf ≤ current.objf then
            return current              //No neighbor improves: return the best solution found
        else
            current ← n
        end if
    end while
end function
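To make the pseudocode concrete, here is a minimal Python sketch of the same loop. It assumes the problem supplies two hypothetical functions, objective(state) and neighbors(state); these names are placeholders for a specific problem formulation, not part of any library.

PYTHON

def hill_climbing(initial_state, objective, neighbors):
    """Steepest-ascent hill climbing: move to the best neighbor until
    no neighbor improves the objective value (a local optimum)."""
    current = initial_state
    current_value = objective(current)
    while True:
        candidates = neighbors(current)
        if not candidates:
            return current, current_value
        best = max(candidates, key=objective)   # best neighbor of the current state
        best_value = objective(best)
        if best_value <= current_value:
            return current, current_value       # no improvement: local optimum reached
        current, current_value = best, best_value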

Example: 8-Queens problem

To illustrate how the hill-climbing algorithm works, we will consider the 8-queens
problem with a complete-state formulation.

Problem formulation

State-space description: Placing 8 queens on 64 distinct squares gives 64·63·...·57
ordered placements; with the complete-state formulation used here (one queen per
column), the state space contains 8^8 = 16,777,216 states.

PYTHON

nsts = 64*63*62*61*60*59*58*57   # ordered placements of 8 queens on 64 squares
print(f"The size of state-space is: {nsts}")

Initial State:

PYTHON

import sys
sys.stdout.reconfigure(encoding='utf-8')
from tabulate import tabulate
import numpy as np
import random
random.seed(0)

grid = np.full((8,8), "-", dtype=np.str_)

for j in range(8):
    grid[random.randint(0,7), j] = 'Q'   # place one queen in a random row of column j
print(tabulate(grid, tablefmt="grid"))

Neighborhood: Neighboring states are generated by moving a single queen up or down
within its column, which yields 8 × 7 = 56 successors.
Objective function: We use the negative of the number of pairs of queens that attack
each other; a solution is reached when f_obj = 0.
Iteration: Move to the neighbor with the highest objective value.
Termination: Stop when no further improvement can be made.
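As a concrete illustration of this formulation, here is a hedged Python sketch of the objective function and neighbor generator, assuming a state is a tuple of 8 row indices (one queen per column); it can be plugged into the hill_climbing sketch given earlier.

PYTHON

import random

def objective(state):
    """Negative number of attacking queen pairs; a solution has value 0."""
    attacks = 0
    for c1 in range(8):
        for c2 in range(c1 + 1, 8):
            same_row = state[c1] == state[c2]
            same_diagonal = abs(state[c1] - state[c2]) == c2 - c1
            if same_row or same_diagonal:
                attacks += 1
    return -attacks

def neighbors(state):
    """The 56 states obtained by moving one queen to another row in its column."""
    result = []
    for col in range(8):
        for row in range(8):
            if row != state[col]:
                result.append(state[:col] + (row,) + state[col + 1:])
    return result

start = tuple(random.randint(0, 7) for _ in range(8))   # random complete state
# best_state, best_value = hill_climbing(start, objective, neighbors)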
Hill climbing is often referred to as greedy local search because it chooses a
favorable neighboring solution without considering the long-term path ahead. While
greed is traditionally seen as one of the seven deadly sins, greedy algorithms
frequently yield good results. Hill climbing can quickly advance towards a solution, as
it is typically straightforward to enhance a poor state.

Pros of Hill-Climbing Search:

Simplicity: Easy to understand and implement.


Efficiency: Requires minimal memory and computational resources.
Scalability: Works well for problems with large search spaces.

Cons of Hill-Climbing Search:


Local Optima: The algorithm can get stuck in a local optimum and fail to find the
global optimum.
Plateaus: If the objective function is flat (no improvement in any direction), the
algorithm may stop prematurely.
Ridges: If the search space has ridges, the algorithm may oscillate and fail to
make progress.

For the 8-queens problem, steepest-ascent hill climbing gets stuck in 86% of cases and
reaches a solution state in only 14% of cases.

Variants of Hill-Climbing Search:

Simple Hill-Climbing:
Moves to the first neighbor that improves the objective function.
Faster, but it may not find the best solution.
Steepest-Ascent Hill-Climbing:
Evaluates all neighbors and moves to the one with the best improvement.
More computationally expensive but more thorough.
Stochastic Hill-Climbing:
Randomly selects a neighbor and moves to it if it improves the objective
function.
Introduces randomness to escape local optima.
Random-Restart Hill-Climbing:
Runs hill-climbing multiple times with different initial solutions.
Increases the chances of finding the global optimum.
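The random-restart variant can be sketched as a thin wrapper around the hill_climbing function from the earlier sketch; random_state is a hypothetical callable that produces a fresh random initial solution.

PYTHON

def random_restart_hill_climbing(restarts, objective, neighbors, random_state):
    """Run hill climbing from several random initial states and keep the best
    local optimum found; with enough restarts a global optimum is eventually hit."""
    best_state, best_value = None, float("-inf")
    for _ in range(restarts):
        state, value = hill_climbing(random_state(), objective, neighbors)
        if value > best_value:
            best_state, best_value = state, value
        if best_value == 0:          # for the 8-queens objective, 0 means a solution
            break
    return best_state, best_value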
Analysis of Hill-Climbing Search Algorithm performance

To evaluate the Hill-Climbing Search Algorithm, we will analyze it based on three key
criteria:

Completeness
Quality of Solution
Time and spatial complexity

Completeness

Completeness refers to whether an algorithm is guaranteed to find a solution, if one


exists, given sufficient time and resources.

Analysis:

Hill-Climbing is NOT Complete:


It can get stuck in local optima, where no neighboring solution is better, but
the global optimum has not been reached.
It does not explore the entire search space systematically.
If the initial solution is far from the global optimum, the algorithm may never
find it.

Exceptions:

Random-Restart Hill-Climbing:
By restarting the algorithm multiple times with different initial solutions, it
becomes probabilistically complete. Given infinite time, it will eventually find
the global optimum.

Quality of Solution

Quality of Solution refers to how close the algorithm's output is to the optimal
solution.

Analysis:

Hill-Climbing Provides Local Optima:


The quality of the solution depends on the initial solution and the structure
of the search space.
If the search space has many local optima, the algorithm may return a
suboptimal solution.
In problems with a smooth and convex objective function, hill-climbing can
find a high-quality solution.

Factors Affecting Quality:

Initial Solution: A good initial solution increases the likelihood of finding a high-
quality local optimum.
Neighborhood Structure: A well-defined neighborhood can help the algorithm
explore better solutions.
Objective Function: If the objective function has few local optima, the solution
quality is likely to be high.

Complexity

Complexity refers to the time and space requirements of the algorithm.

Time Complexity:

Depends on the Problem:


The time complexity of hill-climbing depends on:
1. The size of the neighborhood (number of neighboring solutions
generated per iteration).
2. The number of iterations required to reach a local optimum.
Worst Case:
If the algorithm explores a large number of neighbors or gets stuck in a
plateau, the time complexity can be exponential.
Average Case:
For most problems, the time complexity is O(n ⋅ k), where:
n = number of iterations.
k = number of neighbors evaluated per iteration.

Space Complexity:
O(1) for Basic Hill-Climbing:
The algorithm only needs to store the current solution and its evaluation.
Higher for Variants:
Variants like tabu search or random-restart hill-climbing may require
additional memory to store visited solutions or restart points.
Summary of Analysis

Completeness: Not complete (can get stuck in local optima). Random-restart is probabilistically complete.
Quality of Solution: Returns local optima. Quality depends on the initial solution and problem structure. Variants like simulated annealing improve quality.
Time Complexity: Average case O(n · k); worst case exponential.
Space Complexity: O(1) for basic hill-climbing; higher for variants like tabu search.

Recommendations for Using Hill-Climbing

Use Hill-Climbing When:


The search space is large, and exhaustive search is impractical.
A good-enough solution (local optimum) is acceptable.
The problem has a smooth and continuous objective function.
Avoid Hill-Climbing When:
The problem has many local optima.
The global optimum is required.
Improve Hill-Climbing with Variants:
Use random-restart hill-climbing or simulated annealing to increase the
chances of finding the global optimum.
Use tabu search to avoid cycling and explore more of the search space.

Conclusion
Hill-climbing is a simple and efficient algorithm for optimization problems, but it has
limitations like getting stuck in local optima. By understanding its completeness,
solution quality, and complexity, you can decide when and how to use it effectively.
Variants like stochastic hill-climbing, random-restart hill-climbing, and simulated
annealing can help overcome its limitations.
Simulated annealing
Simulated Annealing (SA) is a probabilistic search method used to solve optimization
problems, especially in large and complex search spaces where traditional methods
may struggle. This technique is inspired by the physical process of annealing in
metallurgy, where a material is heated and then gradually cooled to reduce defects
and achieve a stable, low-energy state. Similarly, SA explores the search space by
occasionally allowing "worse" solutions, with a certain probability that decreases over
time. This approach helps the method escape local optima and ultimately converge
toward a global optimum.

Key Concepts
Objective Function:
SA aims to minimize (or maximize) an objective function f(S) , where S
represents a solution in the search space.
Temperature (T ):
A control parameter that influences the probability of accepting worse
solutions.
Starts at a high value and gradually decreases over time according to a
cooling schedule.
Cooling Schedule:
Defines how the temperature T is reduced over time.
Common schedules include exponential decay, e.g. T = T_0 · α^k,

where α is a decay rate and k is the iteration number.


Acceptance Probability:
The probability of accepting a worse solution is given by the Metropolis
criterion:

P(ΔE) = 1 if ΔE ≤ 0, and P(ΔE) = e^(−ΔE/T) if ΔE > 0,

where ΔE = f(x_new) − f(x_current) is the change in the objective function.
Exploration vs. Exploitation:


At high temperatures, SA explores the search space more freely, accepting
worse solutions to escape local optima.
As temperature decreases, SA transitions to exploitation, favoring better
solutions and converging toward the global optimum.
Algorithm Steps
Initialization:
Start with an initial solution x_current.
Set an initial temperature T_0 and define the cooling schedule.
Iteration:
Repeat until a stopping condition is met (e.g., the temperature reaches a
minimum, or a maximum number of iterations is reached):
a. Generate a Neighbor:
Create a new solution x_new by perturbing x_current (e.g., small random changes).
b. Evaluate the New Solution:
Compute ΔE = f_obj(x_new) − f_obj(x_current).
c. Accept or Reject the New Solution:
If ΔE ≤ 0, accept x_new as the new current solution.
If ΔE > 0, accept x_new with probability e^(−ΔE/T).
d. Update Temperature:
Reduce the temperature according to the cooling schedule.
3. Termination:
Return the best solution found during the search.

Simulated annealing algorithm


Algorithm Simulated Annealing Search
procedure simulatedAnnealingSearch(S0, T0, Tend, n)
    S ← S0                                  //Initial state
    T ← T0                                  //Initial annealing temperature
    while T > Tend do
        for i ∈ range(n) do                 //Number of iterations at a constant temperature T
            S′ ← randomNeighbor(S)          //A random neighbor of the current state
            ΔE ← S′.f_obj − S.f_obj         //Difference of energy: how much better or worse the new state is
            if ΔE < 0 then
                S ← S′                      //Better candidate: always accepted
            else
                if e^(−ΔE/T) > uniformRand() then   //Worse candidate accepted with probability P = e^(−ΔE/T)
                    S ← S′
                end if
            end if
        end for
        T ← updateAnnealingTemperature(T)   //Temperature update rule (cooling schedule)
    end while
    return S
end procedure
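The procedure above can be sketched in Python as follows; f_obj and random_neighbor are problem-specific placeholders, and exponential decay is assumed as the temperature update rule.

PYTHON

import math
import random

def simulated_annealing(initial, f_obj, random_neighbor,
                        T0=100.0, T_end=1e-3, n=100, alpha=0.95):
    """Minimize f_obj starting from `initial`, following the pseudocode above."""
    current = initial
    T = T0
    while T > T_end:
        for _ in range(n):                       # iterations at constant temperature T
            candidate = random_neighbor(current)
            delta_E = f_obj(candidate) - f_obj(current)
            if delta_E < 0:                      # better candidate: always accept
                current = candidate
            elif math.exp(-delta_E / T) > random.random():
                current = candidate              # worse candidate: accept with prob e^(-ΔE/T)
        T *= alpha                               # exponential cooling schedule
    return current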

Performance Analysis of Simulated Annealing


Simulated Annealing (SA) performance is influenced by several factors, including
convergence speed, solution quality, computational complexity, and sensitivity to
parameter settings. Below is a detailed performance analysis:

Convergence and Solution Quality

Global vs. Local Optima: Simulated Annealing (SA) differs from simple local search
methods like Hill Climbing because it has the ability to escape local optima. It does this
by accepting worse solutions based on a probability determined by a temperature-
dependent function. When properly tuned, SA can converge to near-optimal or even
globally optimal solutions in many practical scenarios.

Cooling Schedule Impact: The cooling schedule (temperature decay function)


significantly affects convergence.
- Fast Cooling (Exponential Decay, T_{k+1} = α·T_k): Leads to faster convergence but
increases the risk of getting stuck in local optima.
- Slow Cooling (Logarithmic Decay, T_k = c / ln(k+1)): Theoretically guarantees
convergence to the global optimum, but is computationally expensive.

Acceptance Probability: The probability of accepting a worse solution follows the
Metropolis criterion P(ΔE, T) = exp(−ΔE/T), where ΔE is the energy difference
(objective function difference). A well-tuned acceptance function ensures exploration
in the early stages and exploitation in the later stages.

Computational Complexity

Time Complexity: The worst-case complexity depends on the number of inner-loop
iterations n (moves per temperature level), the number of outer-loop iterations N
(temperature levels), and the neighborhood evaluation cost C. For combinatorial
problems, each iteration typically requires evaluating a move in the solution space,
leading to O(n × N × C). The choice of n and N is critical:

Small values lead to premature convergence.

Large values increase computational cost but improve solution quality.

Space Complexity: SA generally requires O(1) additional space beyond the input data
and solution representation. It stores the current state, candidate state, and
temperature value, making it memory-efficient.
Robustness and Parameter Sensitivity

Initial Temperature Selection:

If T 0 is too low, the algorithm behaves like Hill Climbing.


If T 0 is too high, the algorithm initially accepts too many bad solutions, leading to
inefficiency.

Cooling Rate α Sensitivity (exponential decay case):

If α is too high (e.g., 0.99), the system cools too slowly, increasing runtime.
If α is too low (e.g., 0.8), SA cools too quickly, leading to suboptimal solutions.

Problem Dependency:

SA performs well for problems with rugged search landscapes, such as the
Traveling Salesman Problem (TSP) and Job Scheduling.
For smooth landscapes, gradient-based methods might outperform SA.

Advantages of Simulated Annealing


Escapes Local Optima: By probabilistically accepting less desirable solutions,
simulated annealing (SA) can escape local optima and explore the search space
more efficiently.
Flexibility: Can be applied to a wide range of optimization problems, including
combinatorial and continuous domains.
Simplicity: Easy to implement and requires minimal problem-specific knowledge.

Disadvantages of Simulated Annealing


Parameter Sensitivity: Performance depends on the choice of initial temperature,
cooling schedule, and neighbor generation strategy.
Computational Cost: May require many iterations to converge, especially for large
search spaces.
No Guarantee of Optimality: While SA often finds good solutions, it does not
guarantee finding the global optimum.

Comparison with Hill-Climbing

Hill Climbing: fast convergence rate, poor at escaping local optima, low computational cost; best suited to simple convex problems.
Simulated Annealing: moderate convergence rate, good at escaping local optima, moderate computational cost; best suited to NP-hard combinatorial problems.

Optimization Strategies for SA

Adaptive Temperature Scheduling: Adjusts T dynamically based on solution


improvement rate.
Hybrid Approaches: Combine SA with Genetic Algorithms to enhance efficiency.
Parallel Simulated Annealing: Runs multiple SA instances with different cooling
schedules to accelerate convergence.

Application example: Traveling Salesman Problem (TSP)


1. Objective Function: Minimize the total distance of the tour.
2. Neighbor Generation: Swap two cities in the tour.
3. Cooling Schedule: Exponential decay with α = 0.99.
4. Initial Temperature: High enough to allow exploration.
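As an illustration of these four ingredients, here is a hedged sketch of the TSP-specific pieces (tour-length objective and two-city swap neighbor). The coords mapping from city names to (x, y) coordinates is a hypothetical input, and the two functions could be passed to a simulated-annealing routine such as the one sketched earlier.

PYTHON

import math
import random

def tour_length(tour, coords):
    """Total length of a closed tour; coords maps each city to an (x, y) pair."""
    total = 0.0
    for i in range(len(tour)):
        x1, y1 = coords[tour[i]]
        x2, y2 = coords[tour[(i + 1) % len(tour)]]   # wrap around to close the tour
        total += math.hypot(x2 - x1, y2 - y1)
    return total

def swap_two_cities(tour):
    """Neighbor generation: swap two randomly chosen cities in the tour."""
    i, j = random.sample(range(len(tour)), 2)
    new_tour = list(tour)
    new_tour[i], new_tour[j] = new_tour[j], new_tour[i]
    return new_tour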

Summary
The Simulated Annealing algorithm is a powerful and versatile search method for solving
optimization problems. By balancing exploration and exploitation through a
temperature-controlled probabilistic acceptance criterion, it can effectively navigate
complex search spaces and find high-quality solutions. However, careful tuning of
parameters and problem-specific adaptations are often necessary for optimal
performance.

Genetic algorithm for search method


Genetic Algorithms (GAs) are a class of evolutionary algorithms inspired by the
process of natural selection. They are used to find approximate solutions to
optimization and search problems. GAs operate on a population of potential solutions,
applying principles such as selection, crossover (recombination), and mutation to
evolve the population over generations.

Components of Genetic Algorithms


Population: A set of candidate solutions (individuals) to the problem.
Chromosome: A representation of a solution, often encoded as a string of bits,
numbers, or other data structures.
Fitness Function (Objective function): A function that evaluates how close a
given solution is to the optimum.
Selection: The process of choosing individuals based on their fitness to produce
offspring.
Crossover: Combining parts of two or more parent solutions to create new
offspring.
Mutation: Randomly altering parts of an individual's chromosome to introduce
variability.

Genetic Algorithms as Local Search Methods


While GAs are typically considered global search methods due to their ability to
explore a wide search space, they can also be adapted for local search by focusing on
refining existing solutions. This can be achieved through:

Elitism: Preserving the best solutions from one generation to the next.
Local Mutation: Applying small, localized changes to solutions to explore nearby
regions.
Restricted Crossover: Using crossover operators that produce offspring close to
the parents.

Algorithm steps
Initialization: Generate an initial population of random solutions
Evaluation: Calculate the fitness of each individual solution
Selection: Choose individuals for reproduction based on fitness
Crossover: Create new solutions by combining selected parents
Mutation: Randomly alter some solutions to maintain diversity
Replacement: Form a new population from offspring and possibly some parents
Termination: Stop when a termination condition is met (e.g., fitness threshold,
generation limit)

Pseudocode
Algorithm Genetic Algorithm
procedure GeneticAlgorithm(populationSize, maxGenerations, crossoverRate,
mutationRate)
population ← InitializePopulation(populationSize)
generation ← 0
while generation < maxGenerations and not TerminationCondition() do
EvaluateFitness(population)
newPopulation ← ∅
while |newPopulation| < populationSize do
parent1 ← SelectParent(population)
parent2 ← SelectParent(population)
if Random() < crossoverRate then
child1, child2 ← Crossover(parent1, parent2)
else
child1, child2 ← parent1, parent2
end if
if Random() < mutationRate then
child1 ← Mutate(child1)
end if
if Random() < mutationRate then
child2 ← Mutate(child2)
end if
Add child1, child2 to newPopulation
end while
population ← Replace(population, newPopulation)
generation ← generation + 1
end while
return BestSolution(population)
end procedure

Performance Analysis
Exploration vs. Exploitation: GAs balance exploration (searching new areas) and
exploitation (refining existing solutions). As local search methods, they lean more
towards exploitation.
Convergence: GAs can converge to local optima, especially if the population
diversity is low. Techniques like maintaining population diversity and adaptive
mutation rates can mitigate this.
Scalability: GAs can handle large, complex search spaces, but their performance
may degrade with very high-dimensional problems due to the curse of
dimensionality.
Parallelism: GAs are inherently parallel, allowing for efficient use of computational
resources.
Robustness: GAs are robust to noisy and dynamic environments, making them
suitable for real-world problems where the fitness landscape may change over
time.

Complexity Analysis of Genetic Algorithms (GAs)

The complexity of a Genetic Algorithm (GA) is influenced by several factors, including


population size, the number of generations, the complexity of the fitness function, and
the genetic operators, which include selection, crossover, and mutation. Below is a
detailed analysis of the complexity associated with each component of the GA.

Population Initialization

Time Complexity: O(N ⋅ L)


N : Population size.
L: Length of each chromosome (solution representation).
Each chromosome is generated and initialized, which takes O(L) time per
individual.

Fitness Evaluation

Time Complexity: O(N ⋅ T f )


T f : Time complexity of the fitness function for a single individual.
The fitness of all N individuals is evaluated in each generation.

Selection

Time Complexity: O(N) (for roulette wheel, tournament selection or stochastic


universal sampling)
Selection typically involves iterating over the population and selecting
individuals based on their fitness.
More advanced selection methods (e.g., rank-based selection) may have
higher complexity.

Crossover

Time Complexity: O(N ⋅ L)


Crossover is applied to pairs of individuals, and each crossover operation
takes O(L) time.
In the worst case, N/2 pairs are processed.

Mutation

Time Complexity: O(N ⋅ L)


Mutation is applied to each individual with probability p m .
Each mutation operation takes O(L) time.

Replacement

Time Complexity: O(N)


Replacing the old population with the new population involves copying N
individuals.

Overall Complexity

The overall time complexity of a GA depends on the number of generations (G) and
the cost of each generation. The total time complexity is:

O(G ⋅ (N ⋅ T f + N ⋅ L))
Dominant Terms:
Fitness evaluation (O(N ⋅ T f )) is often the most expensive operation,
especially if T f is large.
Genetic operations (crossover and mutation) scale linearly with N and L.

Space Complexity

Population Storage: O(N ⋅ L)


The population of N individuals, each of length L , must be stored.
Auxiliary Space: O(N ⋅ L)
Additional space is required for storing offspring and intermediate results
during crossover and mutation.

Factors Affecting Complexity


Fitness Function: If the fitness function is computationally expensive (e.g.,
involves simulations or complex calculations), it dominates the overall complexity.
Population Size: Larger populations increase both time and space complexity but
improve exploration.
Chromosome Length: Longer chromosomes increase the cost of crossover and
mutation.
Number of Generations: More generations improve solution quality but increase
runtime.

Optimization Techniques
Parallelization: Fitness evaluation and genetic operations can be parallelized to
reduce runtime.
Elitism: Preserving the best solutions reduces the need for excessive generations.
Adaptive Operators: Dynamically adjusting crossover and mutation rates can
improve efficiency.

Summary of Complexity

Population Initialization: time O(N · L), space O(N · L)
Fitness Evaluation: time O(N · T_f), space O(1)
Selection: time O(N), space O(1)
Crossover: time O(N · L), space O(N · L)
Mutation: time O(N · L), space O(1)
Replacement: time O(N), space O(N · L)
Total: time O(G · (N · T_f + N · L)), space O(N · L)

Pros and Cons


Pros:
Flexibility: Can be applied to a wide range of problems with different
representations and fitness functions.
Global Search Capability: While adapted for local search, they retain the
ability to escape local optima.
Parallelism: Can be easily parallelized for faster computation.
Cons:
Computational Cost: Can be computationally expensive, especially for large
populations and complex fitness functions.
Parameter Sensitivity: Performance can be highly dependent on the choice
of parameters (e.g., mutation rate, crossover rate).
Premature Convergence: Risk of converging too quickly to suboptimal
solutions if diversity is not maintained.

Application example: 8-Queens problem.

Let us solve the 8-Queens problem with the genetic algorithm approach, using binary or
numeric chromosome encodings. A sketch with numeric chromosomes is given below.
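The following is a compact sketch of the problem-specific operators under the numeric-chromosome encoding (one row index per column); the operator names follow the pseudocode above, and the surrounding GA loop is assumed to be the one already described.

PYTHON

import random

def fitness(chrom):
    """Number of non-attacking queen pairs; the maximum 28 means a solution."""
    attacks = sum(1 for c1 in range(8) for c2 in range(c1 + 1, 8)
                  if chrom[c1] == chrom[c2] or abs(chrom[c1] - chrom[c2]) == c2 - c1)
    return 28 - attacks

def select_parent(population):
    """Tournament selection of size 2 based on fitness."""
    a, b = random.sample(population, 2)
    return a if fitness(a) >= fitness(b) else b

def crossover(p1, p2):
    """One-point crossover of two numeric chromosomes."""
    point = random.randint(1, 7)
    return p1[:point] + p2[point:], p2[:point] + p1[point:]

def mutate(chrom):
    """Move one randomly chosen queen to a random row in its column."""
    col = random.randrange(8)
    return chrom[:col] + [random.randrange(8)] + chrom[col + 1:]

# Initial population of numeric chromosomes (lists of 8 row indices)
population = [[random.randrange(8) for _ in range(8)] for _ in range(50)]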

Conclusion

Genetic Algorithms, when adapted for local search, offer a powerful and flexible
approach to problem-solving. They balance exploration and exploitation, making them
suitable for a wide range of optimization problems. However, careful tuning of
parameters and strategies to maintain diversity are crucial for their effective
performance. Their robustness and parallelism further enhance their applicability in
complex and dynamic environments.

Local search in continuous state-space


What happens if the state space of our problem is continuous? Can we use the same
algorithm, or must we change our strategy to find a solution? To address these
questions, let's consider an example to determine the best approach.
Example: gradient descent
We plan to construct three new football stadiums in northern Morocco, minimizing the
distance to the nine major cities of the region, provided that road accessibility is
guaranteed.

The objective is to identify the best locations for three new football stadiums in
northern Morocco. The goal is to minimize the total distance to nine major cities while
ensuring that these locations are accessible by road. This is a classic optimization
problem within a continuous domain, where the solution space consists of geographic
coordinates (latitude and longitude).

Problem Formulation
Objective: Minimize the total distance from the three stadiums to the nine major
cities in northern Morocco. Let’s assume "distance" refers to road distance (though
Euclidean distance could be a simplifying proxy if road data is unavailable).
State Space: Continuous 2D coordinates (x_i, y_i) for each stadium i = 1, 2, 3, where
x_i and y_i are longitude and latitude within northern Morocco's geographic bounds
(roughly 34°N to 36°N and 2°W to 7°W).
Variables: Six coordinates total (two per stadium): (x 1 , y 1 ), (x 2 , y 2 ), (x 3 , y 3 ).
Cities: Assume the nine major cities in northern Morocco are Tangier, Tetouan,
Chefchaouen, Al Hoceima, Nador, Oujda, Fez, Meknes, and Taza (common urban
centers in the region). Their coordinates are fixed.
Objective Function:

f(x_1, y_1, x_2, y_2, x_3, y_3) = Σ_{c=1}^{9} min_{i=1,2,3} d(c, (x_i, y_i))

where d(c, (x_i, y_i)) is the distance from city c to stadium i. This assumes each city is
"assigned" to its nearest stadium, so the total travel distance for residents is minimized.
Constraint: Locations must have road accessibility (e.g., within a certain distance
of existing highways or feasible road-building zones). For simplicity, we’ll assume
all points in the region could be made accessible with reasonable infrastructure,
refining this later if needed.

Local Search Approach: Gradient Descent with Clustering


Given the continuous nature of the problem, a gradient-based local search method like
gradient descent, combined with a clustering heuristic, is well-suited. Here’s how we
could apply it:
Step 1: Initialization
Randomly place the three stadiums at initial coordinates within northern Morocco.
For example:
Stadium 1: (35.5°N, -5.5°W) near Tangier.
Stadium 2: (34.5°N, -4.0°W) near Fez.
Stadium 3: (35.0°N, -2.5°W) near Nador.
These starting points are guesses based on geographic spread, but randomness
ensures exploration.

Step 2: Assign Cities to Stadiums


For each city, compute the distance to all three stadiums (using Euclidean distance
as a proxy or road distance if data is available). Assign each city to its nearest
stadium. This forms three clusters:
Cluster 1 (Stadium 1): Tangier, Tetouan, Chefchaouen.
Cluster 2 (Stadium 2): Fez, Meknes, Taza.
Cluster 3 (Stadium 3): Al Hoceima, Nador, Oujda.
Initial clustering depends on the starting positions and will adjust as locations shift.

Step 3: Optimize Stadium Locations


For each stadium, minimize the sum of distances to its assigned cities by adjusting
its position. The sub-objective for stadium i is:

f_i(x_i, y_i) = Σ_{c ∈ C_i} d(c, (x_i, y_i))

where C_i is the cluster of cities assigned to stadium i.

Compute the gradient of f_i with respect to (x_i, y_i). For the Euclidean distance
d(c, (x_i, y_i)) = √((x_c − x_i)² + (y_c − y_i)²), the partial derivatives are:

∂f_i/∂x_i = Σ_{c ∈ C_i} (x_i − x_c) / √((x_c − x_i)² + (y_c − y_i)²),
∂f_i/∂y_i = Σ_{c ∈ C_i} (y_i − y_c) / √((x_c − x_i)² + (y_c − y_i)²).

Update each stadium's position:

(x_i, y_i) ← (x_i, y_i) − η · (∂f_i/∂x_i, ∂f_i/∂y_i)

where η is the step size (e.g., 0.01).

Step 4: Iterate
Reassign cities to the nearest stadium based on updated positions.
Recalculate gradients and update positions.
Repeat until convergence (e.g., when position changes or objective function
improvement falls below a threshold, like 0.001).
Step 5: Road Accessibility Check
After convergence, verify each stadium’s location against a road network map
(e.g., proximity to highways like A1, A2, or N13). If a location is too far (say, >10 km
from a major road), perturb it toward the nearest accessible point and reoptimize
locally.

Pseudocode
Algorithm Gradient Descent for Stadium Placement in Northern Morocco
Input: City coordinates C = {(c_jx, c_jy)} for j = 1..9, learning rate α, max iterations T, tolerance ϵ
Output: Optimal stadium coordinates S = {(x_1, y_1), (x_2, y_2), (x_3, y_3)}
Initialize S^(0) = {(x_1^(0), y_1^(0)), (x_2^(0), y_2^(0)), (x_3^(0), y_3^(0))} randomly
Set t ← 0
Compute initial cost f(S^(0)) = Σ_{j=1}^{9} min_{i=1,2,3} √((c_jx − x_i^(0))² + (c_jy − y_i^(0))²)
while t < T and change in cost > ϵ do
    Assign each city j to its nearest stadium: k_j = argmin_i √((c_jx − x_i^(t))² + (c_jy − y_i^(t))²)
    for i = 1 to 3 do                      //Update each stadium's position
        Compute gradient components:
            ∂f/∂x_i^(t) ← Σ_{j : k_j = i} (x_i^(t) − c_jx) / √((c_jx − x_i^(t))² + (c_jy − y_i^(t))²)
            ∂f/∂y_i^(t) ← Σ_{j : k_j = i} (y_i^(t) − c_jy) / √((c_jx − x_i^(t))² + (c_jy − y_i^(t))²)
        Update: x_i^(t+1) ← x_i^(t) − α · ∂f/∂x_i^(t)
        Update: y_i^(t+1) ← y_i^(t) − α · ∂f/∂y_i^(t)
    end for
    Compute new cost f(S^(t+1))
    if |f(S^(t+1)) − f(S^(t))| < ϵ then
        break
    end if
    t ← t + 1
end while
return S^(t)
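A NumPy sketch of this pseudocode is given below. It uses Euclidean distance in degree coordinates as a simplifying proxy for road distance (the road-accessibility check of Step 5 is not included), and the city coordinates are the approximate values listed in the walkthrough that follows.

PYTHON

import numpy as np

# Approximate (latitude, longitude) of the nine cities (see the walkthrough below)
cities = np.array([
    [35.76, -5.80],  # Tangier
    [35.57, -5.37],  # Tetouan
    [35.17, -5.27],  # Chefchaouen
    [35.25, -3.93],  # Al Hoceima
    [35.17, -2.93],  # Nador
    [34.68, -1.91],  # Oujda
    [34.03, -5.00],  # Fez
    [33.89, -5.55],  # Meknes
    [34.21, -4.01],  # Taza
])

def gradient_descent_stadiums(cities, alpha=0.01, max_iter=1000, eps=1e-6, seed=0):
    """Place 3 stadiums by gradient descent on the sum of city-to-nearest-stadium distances."""
    rng = np.random.default_rng(seed)
    stadiums = rng.uniform(cities.min(axis=0), cities.max(axis=0), size=(3, 2))
    prev_cost = np.inf
    cost = prev_cost
    for _ in range(max_iter):
        diff = cities[:, None, :] - stadiums[None, :, :]      # shape (9, 3, 2)
        dist = np.linalg.norm(diff, axis=2) + 1e-12           # shape (9, 3)
        nearest = dist.argmin(axis=1)                         # city -> nearest stadium
        cost = dist[np.arange(len(cities)), nearest].sum()
        if abs(prev_cost - cost) < eps:
            break
        prev_cost = cost
        for i in range(3):                                    # one gradient step per stadium
            assigned = nearest == i
            if assigned.any():
                grad = ((stadiums[i] - cities[assigned])
                        / dist[assigned, i][:, None]).sum(axis=0)
                stadiums[i] -= alpha * grad
    return stadiums, cost

locations, total_distance = gradient_descent_stadiums(cities)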

Example Walkthrough

Cities’ Coordinates (approx.):


Tangier (35.76°N, -5.80°W), Tetouan (35.57°N, -5.37°W), Chefchaouen (35.17°N,
-5.27°W), Al Hoceima (35.25°N, -3.93°W), Nador (35.17°N, -2.93°W), Oujda
(34.68°N, -1.91°W), Fez (34.03°N, -5.00°W), Meknes (33.89°N, -5.55°W), Taza
(34.21°N, -4.01°W).
Initial Positions: As above.
Iteration 1:
Cluster 1 (Tangier, Tetouan, Chefchaouen) pulls Stadium 1 toward their
centroid, say (35.5°N, -5.5°W) → (35.4°N, -5.4°W).
Cluster 2 (Fez, Meknes, Taza) shifts Stadium 2 to (34.0°N, -4.8°W).
Cluster 3 (Al Hoceima, Nador, Oujda) moves Stadium 3 to (35.0°N, -2.9°W).
Iteration 2: Reassign cities (e.g., Taza might switch to Stadium 3 if closer),
recompute gradients, and adjust.
Convergence: After several iterations, positions stabilize, e.g., near Tetouan, Fez,
and Nador, adjusted for road access (e.g., near A1 for Tetouan, N13 for Fez, N15
for Nador).

Final Locations (Hypothetical)


Stadium 1: Near Tetouan (35.6°N, -5.4°W), accessible via A1.
Stadium 2: Near Fez (34.1°N, -5.0°W), near N13.
Stadium 3: Near Nador (35.1°N, -2.9°W), near N15.
These minimize distances to their clusters while aligning with Morocco’s road
network.

Example: Newton's method


To address the challenge of constructing three football stadiums in northern Morocco
while minimizing the distance from nine major cities and ensuring road accessibility,
we can utilize a local search approach based on Newton's method. This strategy
involves optimizing the stadium locations by minimizing a cost function that considers
both distance and road accessibility.

Problem Formulation
Objective:
Minimize the total distance between the stadiums and the nine major cities.
Ensure that the stadiums are located on or near accessible roads.
Decision Variables:
Let (x i , y i ) be the coordinates of the i-th stadium (i = 1, 2, 3).
Constraints:
Each stadium must be located on or near a road (represented as a set of road
segments or a road network).
Cost Function:
The cost function f(x_1, y_1, x_2, y_2, x_3, y_3) can be defined as:

f = Σ_{i=1}^{3} Σ_{j=1}^{9} d_ij + λ · RoadPenalty

where:
d_ij is the Euclidean distance between the i-th stadium and the j-th city.
RoadPenalty is a penalty term that increases if a stadium is far from a
road.
λ is a weighting factor to balance the distance and road accessibility
terms.
Local Search Based on Newton's Method
Newton's method is an iterative optimization technique that uses the gradient and
Hessian of the cost function to find a local minimum. Here's how we can apply it:

Steps
1. Initialization:
Start with initial guesses for the stadium locations (x_1^0, y_1^0, x_2^0, y_2^0, x_3^0, y_3^0).
These can be random or based on heuristic locations (e.g., centroid of cities).
2. Gradient and Hessian Calculation:
Compute the gradient ∇f and Hessian H of the cost function f.

3. Update Rule:

Update the stadium locations using Newton's method:

p_{k+1} = p_k − H^{−1}(p_k) · ∇f(p_k)

where p_k = (x_1^k, y_1^k, x_2^k, y_2^k, x_3^k, y_3^k) is the current solution at iteration k.

4. Constraints Handling:

After each update, project the stadium locations onto the nearest road
segment to ensure road accessibility.

5. Termination:

Stop when the change in the cost function f is below a threshold or a


maximum number of iterations is reached.

Implementation Details

Cost Function

Distance Term:
Compute the Euclidean distance between each stadium and each city:

d_ij = √((x_i − c_jx)² + (y_i − c_jy)²)

where (c_jx, c_jy) are the coordinates of the j-th city.


Road Penalty Term:
Compute the distance from each stadium to the nearest road segment and
apply a penalty if the distance exceeds a threshold.
Gradient and Hessian
The gradient ∇f and Hessian H can be computed numerically or analytically,
depending on the complexity of the cost function.

Projection onto Roads


Use a road network representation (e.g., a graph of road segments) and project
each stadium location onto the nearest road segment.

Pseudocode
Algorithm Newton's Method for Stadium Placement
Input: City coordinates {(c_jx, c_jy)} for j = 1..9, road network R, initial stadium locations p_0, max iterations G, threshold ϵ
Output: Optimized stadium locations p*
Initialize:
    Set p ← p_0
    Set k ← 0
while k < G and ‖∇f(p)‖ > ϵ do
    Gradient and Hessian Calculation:
        Compute ∇f(p) and H(p)
    Update Rule:
        p ← p − H^(−1)(p) · ∇f(p)
    Projection onto Roads:
        for each stadium location (x_i, y_i) in p do
            Project (x_i, y_i) onto the nearest road segment in R
        end for
    k ← k + 1
end while
Return: Optimized stadium locations p* ← p
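As a rough sketch of the update rule, the following helpers compute one Newton step for an arbitrary cost function using finite-difference derivatives; the cost function itself and the projection onto the road network are assumed to be supplied separately.

PYTHON

import numpy as np

def numerical_gradient(f, p, h=1e-5):
    """Central-difference estimate of the gradient of f at p."""
    grad = np.zeros_like(p, dtype=float)
    for i in range(len(p)):
        e = np.zeros_like(p, dtype=float)
        e[i] = h
        grad[i] = (f(p + e) - f(p - e)) / (2 * h)
    return grad

def numerical_hessian(f, p, h=1e-4):
    """Finite-difference estimate of the Hessian of f at p, symmetrized."""
    n = len(p)
    H = np.zeros((n, n))
    for i in range(n):
        e = np.zeros(n)
        e[i] = h
        H[:, i] = (numerical_gradient(f, p + e) - numerical_gradient(f, p - e)) / (2 * h)
    return (H + H.T) / 2

def newton_step(f, p):
    """One update p <- p - H^(-1) grad f(p); a pseudo-inverse guards against
    a singular or ill-conditioned Hessian."""
    g = numerical_gradient(f, p)
    H = numerical_hessian(f, p)
    return p - np.linalg.pinv(H) @ g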

Example
Input:

Cities: Coordinates of 9 major cities in northern Morocco.


Roads: A road network represented as a set of line segments.
Initial stadium locations: Random or heuristic-based.

Output:
Optimized stadium locations that minimize the total distance to cities and ensure
road accessibility.

On the concept of convexity: A convex function is a mathematical function that has a


specific geometric property: if you draw a straight line between any two points on its
graph, the line segment will always lie above or on the graph of the function itself. In
simpler terms, a convex function curves upward or is flat, never dipping downward
between two points.
Formally, a function f(x) defined on an interval I is convex if, for any two points x_1 and
x_2 in that interval and any λ with 0 ≤ λ ≤ 1, the following inequality holds:

f(λ x_1 + (1 − λ) x_2) ≤ λ f(x_1) + (1 − λ) f(x_2)

This is known as Jensen's inequality. It basically says that the function's value at a
weighted average of two points is less than or equal to the weighted average of the
function's values at those points.

Local search with uncertain actions


Local search is an optimization technique used to explore a state space iteratively in
order to find a goal state, such as an optimal solution. This method typically assumes
deterministic actions, meaning that if you are in a state S and take an action A, you will
end up in a predictable state S ′ . However, when actions involve uncertainty, the
outcome of an action A does not guarantee a single result; instead, it leads to a set of
possible next states, each associated with a certain probability. This uncertainty adds
complexity, as we must not only optimize a cost function but also manage risks and
expected outcomes.

In the context of state space, we define:

State Space: A collection of possible configurations (for example, different


positions for stadiums).
Actions: Moves or transformations that can be made (such as shifting the
coordinates of a stadium).
Uncertainty: Each action may succeed with a probability P or fail (resulting in
other outcomes) with a probability of 1 − P . This uncertainty can be influenced by
factors like noisy data, environmental changes, or incomplete models. This
situation reflects real-world scenarios, such as planning stadium locations where
factors like construction delays, weather, or road conditions can unpredictably
affect outcomes.

Key Concepts
Stochastic Transitions: Instead of a deterministic transition S →_A S′, applying action A
in state S leads to S′ with probability P(S′ | S, A), forming a probabilistic transition model.


Objective: Often, we aim to minimize an expected cost (e.g., expected distance to
cities) or maximize an expected utility, rather than a deterministic value.
Exploration vs. Exploitation: Local search must balance trying new moves
(exploration) with refining promising solutions (exploitation), made trickier by
uncertainty.

Adapting Local Search Algorithms


Let’s explore how classic local search methods adapt to uncertainty:
Hill Climbing with Uncertainty

Standard hill climbing picks the neighbor with the best value. With uncertain actions:

Approach: Evaluate neighbors based on expected value. For each action A,


compute

E[cost(S′)] = Σ_{S′} P(S′ | S, A) · cost(S′)

Challenge: You might need to sample outcomes repeatedly to estimate P and


cost, especially if the transition model is unknown.
Issue: Still prone to local optima, and uncertainty can make “uphill” moves
misleading.

Simulated Annealing with Uncertainty

Simulated annealing uses a temperature parameter to escape local optima by


occasionally accepting worse moves. With uncertainty:

Modification: Accept a move based on the expected cost difference, weighted by
temperature: P(accept) = e^(−ΔE/T), where ΔE = E[cost(S′)] − cost(S).
Strength: Naturally handles noisy evaluations and can explore broadly early on,
narrowing as T decreases.

Gradient Descent with Uncertainty

From our prior discussion, gradient descent adjusts variables continuously. With
uncertain actions:

Stochastic Gradient Descent (SGD): Introduce noise into the gradient (e.g., from
sampling or model uncertainty) and update based on
x t+1 = x t − α ⋅ (∇f(x t ) + noise).
Advantage: Robust to small perturbations, common in machine learning with noisy
data.
Limitation: Assumes continuous space and may struggle with discrete,
probabilistic transitions.

Example: Stadium Placement with Uncertainty


Let’s revisit the Morocco stadium problem. Suppose moving a stadium’s coordinates is
uncertain—e.g., due to imprecise GPS or shifting land availability:

State: S = {(x 1 , y 1 ), (x 2 , y 2 ), (x 3 , y 3 )}.


Action: Move stadium 1 by (Δx, Δy).
Uncertainty: With 70% probability, it moves to (x 1 + Δx, y 1 + Δy); with 20%, it
overshoots to (x 1 + 1.5Δx, y 1 + 1.5Δy); with 10%, it fails to move.
Cost: Expected distance to cities, E[distance] = Σ_{outcomes} P(outcome) · distance(outcome).

Using stochastic hill climbing:

Sample the outcome of each possible move.


Estimate expected cost based on samples.
Move to the neighbor with the lowest expected cost.
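A hedged sketch of this sampling-based evaluation is shown below; cost and apply_move are hypothetical problem-specific callables, and the 70% / 20% / 10% outcome model is the one described in the example.

PYTHON

import random

def sampled_expected_cost(state, move, cost, apply_move, n_samples=30):
    """Monte Carlo estimate of the expected cost after an uncertain move."""
    total = 0.0
    for _ in range(n_samples):
        r = random.random()
        if r < 0.7:
            outcome = apply_move(state, move, factor=1.0)   # intended displacement
        elif r < 0.9:
            outcome = apply_move(state, move, factor=1.5)   # overshoot
        else:
            outcome = state                                  # the move fails
        total += cost(outcome)
    return total / n_samples

def stochastic_hill_climb_step(state, candidate_moves, cost, apply_move):
    """Choose the candidate move with the lowest sampled expected cost."""
    return min(candidate_moves,
               key=lambda m: sampled_expected_cost(state, m, cost, apply_move))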

Challenges and Solutions


Modeling Uncertainty: Requires a transition probability model. If unknown, use
Monte Carlo sampling or learn it online (reinforcement learning-style).
Computational Cost: Evaluating expected values over many outcomes is
expensive.
Local Optima: Uncertainty can trap you in deceptive regions. Solution: add
randomness (e.g., annealing) or restart from multiple points.

Local search with partial observability


Let's explore the impact of a partially observed environment on local search methods
—an interesting aspect that adds considerable complexity to our optimization toolkit. I
will clarify how partial observability affects local search, connecting it to the uncertain
actions we previously discussed, and providing practical insights to enhance
understanding.

Partially Observed Environment


In a fully observed environment, local search methods (like hill climbing, gradient
descent, or simulated annealing) assume complete knowledge of the current state and
the effects of actions. You know exactly where you are in the state space and can
evaluate the cost or utility of each neighbor. In a partially observed environment, you
only get incomplete or noisy information about the current state. This could be due to
sensor limitations, hidden variables, or delayed feedback.

Formally:

State Space: S, the set of all possible states.


Observations: O, a set of possible observations, where each observation O ∈ O
corresponds to some subset of states via a probabilistic mapping P (O|S) .
Belief State: Instead of knowing S, you maintain a probability distribution over
possible states, b(S) = P (S|O 1 , O 2 , …), updated with each observation.

Impact on Local Search


Partial observability disrupts the core assumptions of local search, which relies on
evaluating and comparing neighboring states. Let’s dissect the effects:
Uncertain State Evaluation
Problem: You can’t compute the exact cost f(S) of the current state because S
isn’t fully known. Instead, you get an observation O and must estimate
E[f(S)|O] = ∑ S P (S|O)f(S).
Impact: Cost evaluations become probabilistic, introducing noise and variance. A
“better” neighbor might look worse due to misleading observations.
Example: In our Morocco stadium problem, suppose city populations (affecting
distance weights) are only partially known due to outdated census data. The cost
of a stadium placement varies depending on the true, hidden population.

Ambiguity in Neighbor Selection


Problem: Moving to a neighbor assumes you know the current state to define
“neighbors.” With partial observability, you’re unsure of your starting point, so the
neighborhood itself is ambiguous.
Impact: Algorithms like hill climbing might greedily chase a phantom optimum
based on a flawed belief about the current state.
Mitigation: Use a belief state b(S) to define an expected neighborhood, averaging
over possible current states.

Exploration Becomes Critical


Problem: Without full visibility, you risk over-optimizing based on limited data,
missing better regions of the state space.
Impact: Purely exploitative methods (e.g., steepest ascent hill climbing) falter, as
they rely on accurate gradients or comparisons. Randomness or broader
exploration (e.g., simulated annealing) gains importance.
Intuition: It’s like searching for treasure in a dark cave with a flickering torch—you
need to probe more widely, not just follow the brightest glimmer.

Convergence Slows or Fails


Problem: Local search assumes iterative improvement toward a goal. Partial
observability can lead to oscillations or getting stuck, as updates based on noisy
observations may conflict with the true optimum.
Impact: Gradient descent, for instance, might follow a “gradient” distorted by
observation noise, zigzagging instead of converging.
Example: If stadium construction costs are partially observed (e.g., hidden soil
quality issues), gradient steps might overshoot or undershoot the true cost
minimum.

Adapting Local Search Methods


To handle partial observability, we modify traditional approaches:
Hill Climbing with Belief States
Adaptation: Maintain a belief state b(S) and compute the expected cost over
possible states: E[f] = ∫ f(S)b(S)dS. For each action, estimate the expected cost
of the resulting belief state.
Process:
From current belief b, sample possible states.
Simulate actions and update b with new observations (e.g., via Bayes’ rule).
Move to the neighbor with the lowest expected cost.
Challenge: Sampling and updating beliefs are computationally expensive,
especially in continuous spaces.

Stochastic Gradient Descent (SGD)


Adaptation: Treat partial observability as noise in the gradient. Use noisy
observations to approximate ∇f, updating parameters iteratively:
S_{t+1} = S_t − α · ∇̃f(S_t, O_t), where ∇̃f is a gradient estimate built from observation O_t.
Strength: Robust to observation noise.
Limitation: Assumes the noise averages out over time, which may not hold if
observations systematically mislead (e.g., biased sensors).

Simulated Annealing with Exploration


Adaptation: Use temperature to encourage exploration of uncertain regions.
Accept moves based on expected cost differences, integrating over the belief
state.
Process:
1. Propose a move and predict the resulting observation distribution.
2. Accept with probability P = e −ΔE/T , where ΔE is the expected cost change.
Advantage: Naturally explores ambiguous areas early, refining as observations
accumulate.

Example: Stadiums with Partial Observability


Revisiting our problem:

Scenario: City distances are fully known, but road accessibility (e.g., traffic delays
or construction feasibility) is only partially observed via noisy surveys.
State: Stadium coordinates S = {(x 1 , y 1 ), (x 2 , y 2 ), (x 3 , y 3 )}.
Observation: A noisy indicator of accessibility (e.g., “good” or “poor” with some
error rate).
Cost: Minimize expected distance, weighted by uncertain accessibility penalties.
Approach: Use SGD over the belief state. For each iteration:
Observe noisy accessibility data.
Update belief about each site’s true cost.
Adjust coordinates with a noisy gradient step.

Let’s explore online local search, a dynamic and adaptive approach to optimization
that’s particularly relevant when the environment evolves over time or information is
revealed incrementally. As an AI expert, I’ll frame this in the context of our prior
discussions—building on uncertain actions and partially observed environments—
while highlighting its unique characteristics and practical implications.

Online Local Search


Online local search refers to optimization where the problem itself unfolds or changes
as the search progresses, requiring real-time decision-making without full prior
knowledge of the state space, cost function, or constraints. Unlike offline local search
(e.g., precomputing stadium locations with all city data available), online search
operates in a setting where:

Data arrives sequentially: You learn about the environment step-by-step.


Actions are irrevocable (or costly to reverse): Each move commits you to a path,
like exploring a maze without a map.
Adaptation is key: The algorithm must adjust to new information mid-process.

This mirrors scenarios like robotics (navigating unknown terrain), online resource
allocation, or even our Morocco stadium problem if city data or construction feedback
trickles in over time.

Key Features
Incremental Information: You don’t have the full cost function or state space
upfront—observations arrive as you act.
Real-Time Decisions: Moves are made before the problem is fully defined,
balancing immediate gains with future adaptability.
Feedback-Driven: The algorithm uses outcomes of past actions to refine its
strategy, often under time pressure.

Online vs Offline local search


Offline: Solve min f(x) with a fixed f and known constraints (e.g., gradient descent
on a static map).
Online: Solve a sequence of subproblems min f t (x t ), where f t evolves with time t
based on new data or actions.

Online Local Search in State Space


In a state space:

States: Configurations (e.g., current positions of resources or agents).


Actions: Local moves (e.g., shift a position, adjust a parameter).
Observations: Partial or noisy feedback about the current state or action
outcomes.
Goal: Minimize cumulative cost or regret over time, rather than a single static
optimum.

Algorithms for Online Local Search


Let’s adapt some local search methods to the online setting:

Online Hill Climbing


Approach: At each step, observe the current state (or a noisy version of it),
evaluate available neighbors based on immediate feedback, and move to the best
one.
Process:
1. Start at state S 0 .
2. At time t, observe O t (e.g., cost or partial state info).
3. Estimate neighbor costs using O t and prior knowledge.
4. Move to the neighbor with the lowest observed cost.
Challenge: Myopic—focuses on short-term gains, risking long-term traps.
Example: A robot adjusts its path based on real-time sensor data, unaware of
obstacles beyond its current view.

Online Simulated Annealing


Approach: Introduce randomness to explore despite limited information, adapting
the temperature dynamically based on observed progress.
Process:
1. Propose a move from S t to S t+1 .
2. Use feedback O t to estimate cost difference ΔE.
3. Accept with probability e^(−ΔE/T_t), where T_t decreases as confidence in the
environment grows.
Strength: Balances exploration and exploitation, useful when the problem shifts
(e.g., new constraints emerge).
Example: Allocating stadium resources as construction bids arrive sequentially.

Online Gradient Descent


Approach: Update continuous parameters based on a stream of noisy gradient
estimates, akin to stochastic gradient descent in machine learning.
Process:
1. At time t, receive observation O_t (e.g., a noisy gradient estimate ∇̃f_t).
2. Update: S_{t+1} = S_t − α_t · ∇̃f_t, where the step size α_t adapts to uncertainty.
Advantage: Naturally fits problems with continuous variables and streaming data.
Limitation: Assumes the cost function’s structure is somewhat stable over time.
Connection to Partial Observability and Uncertainty
Online local search often assumes a partially observed environment by default. New
observations refine your understanding of the state space. It also overlaps with
uncertain actions:

Partial Observability: You act based on a belief state that updates with each
observation (e.g., a robot learns the map as it moves).
Uncertain Actions: Outcomes are probabilistic, but in online settings, you don’t
precompute expectations—you observe and react.

The key difference is time: Online search doesn’t wait to model the full distribution; it
acts and learns concurrently.

Example: Online Stadium Placement


Let’s apply this to our Morocco problem in an online context:

Scenario: You’re placing three stadiums, but city population data and road
conditions are revealed one city at a time as surveys complete.
State: Current coordinates S t = {(x 1 , y 1 ), (x 2 , y 2 ), (x 3 , y 3 )}.
Action: Adjust one stadium’s position slightly.
Feedback: At time t, learn the exact distance and accessibility for city t.
Cost: Minimize total distance to known cities, anticipating future data.
Algorithm: Online gradient descent:
1. Start with random S 0 .
2. For each city t = 1 to 9:
Observe city t’s coordinates and accessibility.
Compute a partial gradient based on known cities.
Update S t with a small step.
3. Output final S 9 .
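A minimal NumPy sketch of this online loop is given below; cities are revealed one at a time, and each newly surveyed city triggers a partial gradient step computed from all cities known so far. The regional bounds used for the random start are an assumption based on the coordinates listed earlier.

PYTHON

import numpy as np

def online_stadium_placement(city_stream, alpha=0.05, seed=0):
    """Online gradient descent: update stadium positions as city data arrives."""
    rng = np.random.default_rng(seed)
    # Rough (latitude, longitude) bounds for northern Morocco, assumed here
    stadiums = rng.uniform([33.8, -6.0], [36.0, -1.9], size=(3, 2))
    known = []
    for city in city_stream:                   # cities revealed one at a time
        known.append(city)
        pts = np.array(known)
        diff = pts[:, None, :] - stadiums[None, :, :]
        dist = np.linalg.norm(diff, axis=2) + 1e-12
        nearest = dist.argmin(axis=1)          # assign known cities to nearest stadium
        for i in range(3):
            assigned = nearest == i
            if assigned.any():
                grad = ((stadiums[i] - pts[assigned])
                        / dist[assigned, i][:, None]).sum(axis=0)
                stadiums[i] -= alpha * grad    # partial gradient step on known data
    return stadiums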
