Design and Implementation of Students Time Table Management System

The document discusses the significance of timetables in educational institutions, highlighting the complexity of the lecture timetabling problem and the challenges faced in optimizing schedules. It reviews various methods and technologies, particularly focusing on genetic algorithms as a prominent solution for generating near-optimal timetables due to their adaptability and effectiveness in handling complex constraints. The document also traces the historical development of genetic algorithms and their application across various fields, emphasizing their potential in solving practical optimization problems like timetabling.


CHAPTER TWO

REVIEW OF RELATED LITERATURE

A timetable is an organized list, usually set out in tabular form, providing information about a series of arranged events, in particular the times at which those events are planned to take place. Timetables are applicable to any institution where activities have to be carried out by various individuals within a specified time frame. From the time schools became organized environments, timetables have been the framework for all school activities. As a result, schools have devoted time, energy and human capital to the implementation of nearly optimal timetables, which must satisfy all required constraints as specified by the participating entities (Robertus, 2002).

The lecture timetabling problem is a typical scheduling problem that presents itself as a tedious job in every academic institute once or twice a year. The problem involves the scheduling of classes, students, teachers and rooms into a fixed number of time slots, subject to a certain number of constraints. An effective timetable is crucial for satisfying educational requirements and for the efficient utilization of human and space resources, which makes it an optimization problem. Traditionally, the problem is solved manually by trial and error, where a valid solution is not guaranteed. Even if a valid solution is found, it is likely that far better solutions are missed. These uncertainties have motivated the scientific study of the problem and the development of automated solution techniques for it. The problem has been studied for more than four decades, but a general solution technique is yet to be formulated (Datta et al., 2006).

The timetabling problem is one of the hardest problem areas, already proven to be NP-complete, and it is worthy of note that as educational institutions grow in number and complexity, their resources and events are becoming harder to schedule (Ossam Chohan, 2009).

2.1 REVIEW OF RELEVANT THEORIES AND TECHNOLOGIES

Solutions to timetabling problems have been proposed since the 1980s. Research in this area is still active, as there are several recent related papers in operational research and artificial intelligence journals. This indicates that there are many problems in timetabling that remain to be solved in view of the availability of more powerful computing facilities and the advancement of information technology (S.B. Deris et al., 1997).

The problem was first studied by Gotlieb (1962), who formulated a class-teacher timetabling problem by considering that each lecture contained one group of students, one teacher, and any number of time slots which could be chosen freely. Since then the problem has been continuously studied using different methods under different conditions. Initially it was mostly applied to schools (de Gans, 1981; Tripathy, 1984). Since the problem in schools is relatively simple because of their simple class structures, classical methods, such as linear or integer programming approaches (Lawrie, 1969; Tripathy, 1984), could be used easily. However, the gradual consideration of the cases of higher secondary schools and universities, which contain different types of complicated class structures, has increased the complexity of the problem. As a result, classical methods have been found inadequate to handle the problem, particularly its huge number of integer and/or real variables, discrete search space and multiple objective functions.

This inadequacy of classical methods has drawn the attention of researchers towards heuristic-based non-classical techniques. Notable non-classical techniques that have been applied to the problem are Genetic Algorithms (Alberto Colorni et al., 1992), Neural Networks (Looi C., 1992), and Tabu Search (Costa D., 1994). However, compared to other non-classical methods, the most widely used are genetic/evolutionary algorithms (GAs/EAs). The reason might be their successful implementation in a wider range of applications. Once the objectives and constraints are defined, EAs appear to offer the ultimate free-lunch scenario of good solutions evolving without a problem-solving strategy (Al-Attar A., 1994). A few EAs worth mentioning for the school timetabling problem are those of Abramson et al. (1992), Piola R. (1994), and Bufe et al. (2001). Similarly, EAs used for the university class timetabling problem include those of Carrasco et al. (2001), Srinivasan et al. (2002) and Datta et al.

Since 1995, a large amount of timetabling research has been presented in the series of international conferences on the Practice and Theory of Automated Timetabling (PATAT). Papers on this research have been published in conference proceedings, see e.g. (Burke & Carter, 1997) and (Burke & Erben, 2000), and in three volumes of selected papers in the Lecture Notes in Computer Science series, see (Burke & Ross, 1996), (Burke & Carter, 1998), and (Burke & Erben, 2001). Additionally, there is a EURO working group on automated timetabling (EURO-WATT) which meets once a year, regularly sends out a digest via e-mail, and maintains a website with relevant information on timetabling problems, e.g. a bibliography and several benchmarks.

Fang (1994), in his doctoral thesis, investigates the use of genetic algorithms to solve a group of timetabling problems. He presents a framework for the utilization of genetic algorithms in solving timetabling problems in the context of learning institutions. This framework rests on two points that give it considerable flexibility: a declaration of the specific constraints of the problem together with a function for evaluating solutions, and the recommendation of a genetic algorithm for the resolution itself, since the algorithm is independent of the problem.

Gröbner (1997) presents an approach to generalizing all timetabling problems, describing the basic structure of the problem. Gröbner proposes a generic language that can be used to describe timetabling problems and their constraints.

Chan (1997) discusses the implementation of two genetic algorithms used to solve the class-teacher timetabling problem for small schools.

Oliveira (Oliveira and Reis, 2000) presents a language for the representation of the timetabling problem, UniLang. UniLang intends to be a standard suitable as an input language for any timetabling system. It enables a clear and natural representation of data, constraints, quality measures and solutions for different timetabling (as well as related) problems, such as school timetabling, university timetabling and examination scheduling.

Fernandes (2002) classified the constraints of the class-teacher timetabling problem into strong and weak constraints. Violations of strong constraints (such as scheduling a teacher in two classes at the same time) result in an invalid timetable. Violations of weak constraints result in a valid timetable, but affect the quality of the solution (for example, the preference of teachers for certain hours). The proposed evolutionary algorithm was tested in a university comprising 109 teachers, 37 rooms, 1,131 time intervals of one hour each and 472 classes. The algorithm resolved the scheduling without violating the strong constraints in 30% of executions.
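Fernandes' strong/weak distinction maps naturally onto a weighted fitness function. The sketch below is a minimal Python illustration of that idea, assuming a (teacher, room, slot) representation of lessons; the data model, names and weight are illustrative assumptions, not Fernandes' actual formulation:

```python
def hard_violations(lessons):
    """Strong constraints: the same teacher or room booked twice in one slot."""
    teacher_slots, room_slots = set(), set()
    violations = 0
    for teacher, room, slot in lessons:
        if (teacher, slot) in teacher_slots:
            violations += 1
        if (room, slot) in room_slots:
            violations += 1
        teacher_slots.add((teacher, slot))
        room_slots.add((room, slot))
    return violations

def soft_penalty(lessons, preferred_slots):
    """Weak constraints: one penalty point per lesson outside a teacher's
    preferred hours (teachers with no stated preference incur none)."""
    return sum(1 for teacher, _room, slot in lessons
               if slot not in preferred_slots.get(teacher, {slot}))

def fitness(lessons, preferred_slots, hard_weight=1000):
    # The heavy weight makes any strong violation dominate the weak
    # penalties, so invalid timetables rank below valid ones.
    return -(hard_weight * hard_violations(lessons)
             + soft_penalty(lessons, preferred_slots))
```

Provided the total weak penalty stays below the hard weight, a search guided by this fitness can never prefer an invalid timetable to a valid one.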

Eley (2006), in PATAT'06, presents a solution to the exam timetabling problem, formulating it as a combinatorial optimization problem and using ant colony algorithms to solve it.

Analyzing the results obtained by the various published works, we can see what the automatic generation of schedules is capable of achieving. Some works show that, when compared with manual schedules in real learning institutions, the timetables obtained by algorithms solving the class-teacher timetabling problem are of better quality, since an evaluation function is used.

There are two main problems in timetabling. The first is related to the combinatorial nature of the problems, where it is difficult to find an optimal solution because it is impossible to enumerate all nodes in such a large search space. The second is related to the dynamic nature of the problems, where variables and constraints change in accordance with the development of an organization (S.B. Deris et al., 1997). Therefore, a timetabling system must be flexible, adaptable and portable; otherwise users will not use the system optimally, or even as a decision aid for storing, retrieving, and printing timetables when timetable planning decisions are made manually. In addition, most universities adopting a semester system give students the freedom to choose subjects provided that all pre-requisites are satisfied. This situation further complicates the construction of a timetable.

Various techniques have been proposed to solve timetabling problems. These include neural networks (Gianoglio P, 1990), heuristics (Wright M, 1996), graph coloring, integer programming, genetic algorithms (Burke E. et al., 1994; Paechter B. et al., 1994), knowledge-based systems, and constraint logic programming (Lajos, 1995). The models formulated by some of these techniques cannot be easily reformulated or customized to support changes, hence the selection of the genetic algorithm for the implementation of this project.

2.2 TIMETABLING AS A NP-COMPLETE PROBLEM

In computational complexity theory, the complexity class NP-complete

(abbreviated NP-C or NPC, NP standing for Nondeterministic Polynomial time)

is a class of problems having two properties:

 Any given solution to the problem can be verified quickly (in polynomial

time); the set of problems with this property is called NP.


 If the problem can be solved quickly (in polynomial time), then so can

every problem in NP.

Although any given solution to the timetabling problem can be verified quickly,

there is no known efficient way to locate a solution in the first place; indeed, the

most notable characteristic of NP-complete problems is that no fast solution to

them is known. That is, the time required to solve the problem using any currently

known algorithm increases very quickly as the size of the problem grows (Ossam Chohan, 2009).
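The asymmetry between checking and finding can be made concrete with a short sketch, assuming a simple (teacher, room, slot) representation of lessons (an illustrative model, not the system's actual one): verifying a candidate timetable is a single linear pass, while exhaustive search would have to visit an exponentially large space.

```python
def is_valid(lessons):
    """Verify a candidate in time linear in the number of lessons:
    no teacher and no room may be double-booked in any slot."""
    teacher_slots, room_slots = set(), set()
    for teacher, room, slot in lessons:
        if (teacher, slot) in teacher_slots or (room, slot) in room_slots:
            return False
        teacher_slots.add((teacher, slot))
        room_slots.add((room, slot))
    return True

def brute_force_assignments(n_lessons, n_slots, n_rooms):
    """Size of the space an exhaustive search would have to visit:
    every lesson can go into any (slot, room) pair."""
    return (n_slots * n_rooms) ** n_lessons
```

Even a modest instance of 10 lessons, 5 slots and 4 rooms already yields 20^10 assignments, which is why verification being cheap does not make the search cheap.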

When solving the timetabling problem, we are usually looking for some solution which will be the best among others. The space of all feasible solutions (a set of candidate solutions, some more desirable than others) is called the search space (also state space). Each point in the search space represents one feasible solution, which can be "marked" by its value or fitness for the problem. The solution is usually one point in the search space (Ossam Chohan, 2009).

As a result of comparative fact-finding and an exhaustive study of existing systems, genetic algorithms have been the most prominent technique used in generating near-optimal solutions to timetabling problems, hence their usage in the implementation of this project.

2.3 A BRIEF HISTORY OF GENETIC ALGORITHMS

The earliest instances of what might today be called genetic algorithms appeared

in the late 1950s and early 1960s, programmed on computers by evolutionary

biologists who were explicitly seeking to model aspects of natural evolution. It


did not occur to any of them that this strategy might be more generally applicable

to artificial problems, but that recognition was not long in coming: "Evolutionary

computation was definitely in the air in the formative days of the electronic

computer" (Mitchell Melanie, 1996). By 1962, researchers such as G.E.P. Box,

G.J. Friedman, W.W. Bledsoe and H.J. Bremermann had all independently

developed evolution-inspired algorithms for function optimization and machine

learning, but their work attracted little follow-up. A more successful development

in this area came in 1965, when Ingo Rechenberg, then of the Technical

University of Berlin, introduced a technique he called evolution strategy, though

it was more similar to hill-climbers than to genetic algorithms. In this technique,

there was no population or crossover; one parent was mutated to produce one

offspring, and the better of the two was kept and became the parent for the next

round of mutation (Haupt et al., 1998). Later versions introduced the idea of a

population. Evolution strategies are still employed today by engineers and

scientists, especially in Germany.

The next important development in the field came in 1966, when L.J. Fogel, A.J.

Owens and M.J. Walsh introduced in America a technique they called

evolutionary programming. In this method, candidate solutions to problems were

represented as simple finite-state machines; like Rechenberg's evolution strategy,

their algorithm worked by randomly mutating one of these simulated machines

and keeping the better of the two (Mitchell Melanie, 1996; Goldberg David,

1989). Also like evolution strategies, a broader formulation of the evolutionary


programming technique is still an area of ongoing research today. However, what

was still lacking in both these methodologies was recognition of the importance

of crossover.

As early as 1962, John Holland's work on adaptive systems laid the foundation

for later developments; most notably, Holland was also the first to explicitly

propose crossover and other recombination operators. However, the seminal work

in the field of genetic algorithms came in 1975, with the publication of the book

Adaptation in Natural and Artificial Systems. Building on earlier research and

papers both by Holland himself and by colleagues at the University of Michigan,

this book was the first to systematically and rigorously present the concept of

adaptive digital systems using mutation, selection and crossover, simulating

processes of biological evolution, as a problem-solving strategy. The book also

attempted to put genetic algorithms on a firm theoretical footing by introducing

the notion of schemata (Mitchell Melanie, 1996; Haupt et al., 1998). That same

year, Kenneth De Jong's important dissertation established the potential of GAs

by showing that they could perform well on a wide variety of test functions,

including noisy, discontinuous, and multimodal search landscapes (Goldberg

David, 1989).

These foundational works established more widespread interest in evolutionary

computation. By the early to mid-1980s, genetic algorithms were being applied

to a broad range of subjects, from abstract mathematical problems like bin-

packing and graph coloring to tangible engineering issues such as pipeline flow
control, pattern recognition and classification, and structural optimization

(Goldberg David, 1989).

At first, these applications were mainly theoretical. However, as research

continued to proliferate, genetic algorithms migrated into the commercial sector,

their rise fueled by the exponential growth of computing power and the

development of the Internet. Today, evolutionary computation is a thriving field,

and genetic algorithms are "solving problems of everyday interest" (Haupt et al.,

1998) in areas of study as diverse as stock market prediction and portfolio

planning, aerospace engineering, microchip design, biochemistry and molecular

biology, and scheduling at airports and assembly lines. The power of evolution

has touched virtually any field one cares to name, shaping the world around us

invisibly in countless ways, and new uses continue to be discovered as research

is ongoing. And at the heart of it all lies nothing more than Charles Darwin's

simple, powerful insight: that the random chance of variation, coupled with the

law of selection, is a problem-solving technique of immense power and nearly

unlimited application.

Genetic algorithms (GAs) are numerical optimization algorithms inspired by both natural selection and natural genetics. The method is general in nature and can be applied to a wider range of problems than most procedural approaches. Genetic algorithms help to solve practical problems on a daily basis. The algorithms are simple to understand, and the required computer code is easy to write. The GA technique has never attracted as much attention as artificial neural networks, hill climbing or simulated annealing, among many others, although it has a growing number of adherents. The reason for this is certainly not any inherent limitation or a lack of powerful metaphors. The phenomenon of evolution, as the process responsible for the bio-diversity we see around us today, is a powerful and inspiring paradigm for solving any complex problem. The use of GAs has been evident from the very beginning, characterized by examples of computer scientists with visions of systems that mimic or duplicate one or more of the attributes of life. The idea of using a population of solutions to solve practical engineering optimization problems was considered several times during the 1950s and 1960s. However, the concept of GAs was essentially invented by one man, John Holland, in the 1960s. His reason for developing such algorithms was to solve problems of a general nature. He presented this concept in his 1975 book, Adaptation in Natural and Artificial Systems (recently re-issued with additions), which is particularly worth reading for its visionary approach. Its application has proven it to be more than just a robust method for estimating a series of unknown parameters within a model of a physical system (David, 1999).

However, its robustness extends across many different practical optimization problems, especially those that concern us most, like the timetable problem in the context of this project.

2.4 BASIS FOR A GENETIC ALGORITHM

1. A number, or population, of guesses at the solution to the problem.

2. A way of evaluating the generated solutions, i.e. calculating how good or bad the individual solutions within the population are.

3. A method for mixing fragments of the better solutions to form new, on average even better, solutions.

4. A mutation operator to avoid permanent loss of diversity within the solutions.

Concisely stated, a genetic algorithm is a programming technique that mimics

biological evolution as a problem-solving strategy. Given a specific problem to

solve, the input to the GA is a set of potential solutions to that problem, encoded

in some fashion, and a metric called a fitness function that allows each candidate

to be quantitatively evaluated. These candidates may be solutions already known

to work, with the aim of the GA being to improve them, but more often they are

generated at random.

The GA then evaluates each candidate according to the fitness function. In a pool

of randomly generated candidates, of course, most will not work at all, and these

will be deleted. However, purely by chance, a few may hold promise - they may

show activity, even if only weak and imperfect activity, toward solving the

problem.

These promising candidates are kept and allowed to reproduce. Multiple copies

are made of them, but the copies are not perfect; random changes are introduced

during the copying process. These digital offspring then go on to the next

generation, forming a new pool of candidate solutions, and are subjected to a


second round of fitness evaluation. Those candidate solutions which were

worsened, or made no better, by the changes to their code are again deleted; but

again, purely by chance, the random variations introduced into the population

may have improved some individuals, making them into better, more complete

or more efficient solutions to the problem at hand. Again these winning

individuals are selected and copied over into the next generation with random

changes, and the process repeats. The expectation is that the average fitness of

the population will increase each round, and so by repeating this process for

hundreds or thousands of rounds, very good solutions to the problem can be

discovered.
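The cycle just described - evaluate, keep the promising candidates, copy them with random changes, and repeat - can be sketched in a few lines of Python. The bit-counting fitness function, truncation selection and all parameter values below are illustrative assumptions chosen to keep the sketch self-contained, not this project's actual implementation:

```python
import random

random.seed(42)  # fixed seed so the sketch is reproducible

def fitness(candidate):
    """Toy fitness: count of 1-bits (a stand-in for a real evaluation)."""
    return sum(candidate)

def mutate(candidate, rate=0.05):
    """Imperfect copying: each bit flips with a small probability."""
    return [bit ^ 1 if random.random() < rate else bit for bit in candidate]

def evolve(n_bits=20, pop_size=30, generations=60):
    # A pool of randomly generated candidates.
    population = [[random.randint(0, 1) for _ in range(n_bits)]
                  for _ in range(pop_size)]
    for _ in range(generations):
        # Evaluate each candidate and keep the better half.
        population.sort(key=fitness, reverse=True)
        survivors = population[: pop_size // 2]
        # Refill the pool with imperfect copies of the survivors.
        population = survivors + [mutate(random.choice(survivors))
                                  for _ in range(pop_size - len(survivors))]
    return max(population, key=fitness)

best = evolve()
```

Because the survivors are carried over unchanged, the best fitness in the pool never decreases, and over the generations it climbs towards the optimum.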

As astonishing and counterintuitive as it may seem to some, genetic algorithms

have proven to be an enormously powerful and successful problem-solving

strategy, dramatically demonstrating the power of evolutionary principles.

Genetic algorithms have been used in a wide variety of fields to evolve solutions

to problems as difficult as or more difficult than those faced by human designers.

Moreover, the solutions they come up with are often more efficient, more elegant,

or more complex than anything comparable a human engineer would produce. In

some cases, genetic algorithms have come up with solutions that baffle the

programmers who wrote the algorithms in the first place (Adam, 2004).

2.5 METHODS OF REPRESENTATION

 Before a genetic algorithm can be put to work on any problem, a method

is needed to encode potential solutions to that problem in a form that a


computer can process. One common approach is to encode solutions as

binary strings: sequences of 1's and 0's, where the digit at each position

represents the value of some aspect of the solution (Fleming et al., 2002).

 Another, similar approach is to encode solutions as arrays of integers or

decimal numbers, with each position again representing some particular

aspect of the solution. This approach allows for greater precision and

complexity than the comparatively restricted method of using binary

numbers only and often "is intuitively closer to the problem space"

(Fleming et al., 2002).

 A third approach is to represent individuals in a GA as strings of letters,

where each letter again stands for a specific aspect of the solution. One

example of this technique is Hiroaki Kitano's "grammatical encoding"

approach, where a GA was put to the task of evolving a simple set of rules

called a context-free grammar that was in turn used to generate neural

networks for a variety of problems (Mitchell, 1996).


The advantage of the three methods above is that they make it easy to

define operators that cause the random changes in the selected candidates:

flip a 0 to a 1 or vice versa, add or subtract from the value of a number by

a randomly chosen amount, or change one letter to another.

 Another strategy, developed principally by John Koza of Stanford

University and called genetic programming, represents programs as

branching data structures called trees (Koza et al., 2003). In this approach,

random changes can be brought about by changing the operator or altering

the value at a given node in the tree, or replacing one sub-tree with another.

Figure 2.5: Three simple program trees of the kind normally used in genetic

programming. The mathematical expression that each one represents is given

underneath it (Adapted from Adam Marczyk 2004).

It is important to note that evolutionary algorithms do not necessarily represent candidate solutions as data strings of fixed length. Some represent them this way, but others do not; e.g. Kitano's grammatical encoding discussed above can be efficiently scaled to create large and complex neural networks, and Koza's genetic programming trees can grow arbitrarily large as necessary to solve whatever problem they are applied to.


2.6 METHODS OF SELECTION

There are many different techniques which a genetic algorithm can use to select

the individuals to be copied over into the next generation, but listed below are

some of the most common methods. Some of these methods are mutually

exclusive, but others can be and often are used in combination.

 Elitist selection: The fittest members of each generation are guaranteed to be selected. (Most GAs don't use pure elitism, but instead use a modified form where the single best or a few of the best individuals from each generation are copied into the next generation, just in case nothing better turns up.)

 Fitness-proportionate selection: More fit individuals are more likely, but

not certain, to be selected.

 Roulette-wheel selection: A form of fitness-proportionate selection in which the chance of an individual's being selected is proportional to its fitness relative to that of its competitors. (Conceptually, this can be represented as a game of roulette - each individual gets a slice of the wheel, but more fit ones get larger slices than less fit ones. The wheel is then spun, and whichever individual "owns" the section on which it lands each time is chosen.)

 Scaling selection: As the average fitness of the population increases, the

strength of the selective pressure also increases and the fitness function

becomes more discriminating. This method can be helpful in making the


best selection later on when all individuals have relatively high fitness and

only small differences in fitness distinguish one from another.

 Tournament selection: Subgroups of individuals are chosen from the

larger population, and members of each subgroup compete against each

other. Only one individual from each subgroup is chosen to reproduce.

 Rank selection: Each individual in the population is assigned a numerical

rank based on fitness, and selection is based on this ranking rather than

absolute difference in fitness. The advantage of this method is that it can

prevent very fit individuals from gaining dominance early at the expense

of less fit ones, which would reduce the population's genetic diversity and

might hinder attempts to find an acceptable solution.

 Generational selection: The offspring of the individuals selected from

each generation become the entire next generation. No individuals are

retained between generations.

 Steady-state selection: The offspring of the individuals selected from each

generation go back into the pre-existing gene pool, replacing some of the

less fit members of the previous generation. Some individuals are retained

between generations.

 Hierarchical selection: Individuals go through multiple rounds of

selection each generation. Lower-level evaluations are faster and less


discriminating, while those that survive to higher levels are evaluated more

rigorously. The advantage of this method is that it reduces overall

computation time by using faster, less selective evaluation to weed out the

majority of individuals that show little or no promise, and only subjecting

those who survive this initial test to more rigorous and more

computationally expensive fitness evaluation.
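Two of the most common schemes above, roulette-wheel and tournament selection, can be sketched as follows. This is a minimal illustration assuming non-negative fitness values, not the selection scheme prescribed for this project:

```python
import random

random.seed(1)  # fixed seed so the sketch is reproducible

def roulette_select(population, fitness):
    """Roulette-wheel selection: each individual's slice of the wheel is
    proportional to its (non-negative) fitness; spin and take the owner."""
    scores = [fitness(ind) for ind in population]
    spin = random.uniform(0, sum(scores))
    running = 0.0
    for ind, score in zip(population, scores):
        running += score
        if spin <= running:
            return ind
    return population[-1]  # guard against floating-point rounding

def tournament_select(population, fitness, k=3):
    """Tournament selection: the fittest member of a random subgroup wins."""
    return max(random.sample(population, k), key=fitness)
```

Tournament selection needs no fitness scaling and its pressure is tuned simply by changing the subgroup size k, which is one reason it is often preferred in practice.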

2.7 METHODS OF CHANGE

 Once selection has chosen fit individuals, they must be randomly altered in hopes of improving their fitness for the next generation. There are two basic strategies to accomplish this. The first and simplest is called mutation. Just as mutation in living things changes one gene to another, so mutation in a genetic algorithm causes small alterations at single points in an individual's code. Refer to Figure 2.7a.

Figure 2.7a: Diagram showing the effect of mutation on an individual in a

population of 8-bit strings where mutation occurs at position 4, changing the 0 at

that position in its genome to a 1 (Adapted from Adam Marczyk 2004).


 The second method is called crossover, and entails choosing two

individuals to swap segments of their code, producing artificial "offspring"

that are combinations of their parents. This process is intended to simulate

the analogous process of recombination that occurs to chromosomes during

sexual reproduction (Adam, 2004). Common forms of crossover include

single-point crossover, in which a point of exchange is set at a random

location in the two individuals' genomes, and one individual contributes all

its code from before that point and the other contributes all its code from

after that point to produce an offspring, and uniform crossover, in which

the value at any given location in the offspring's genome is either the value

of one parent's genome at that location or the value of the other parent's

genome at that location, chosen with 50/50 probability. Refer to Figure

2.7b.

Figure 2.7b: Diagram showing two individuals in a population of 8-bit strings undergoing single-point crossover; the point of exchange is set between the fifth and sixth positions in the genome, producing a new individual that is a hybrid of its progenitors (Adapted from Adam Marczyk, 2004).
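The operators described above, applied to 8-bit strings like those of Figures 2.7a and 2.7b, can be sketched as follows (positions are 0-indexed in the code, an implementation convenience):

```python
import random

random.seed(3)  # fixed seed so the sketch is reproducible

def point_mutation(genome, position):
    """Mutation: a small alteration at a single point in the code -
    here, flipping one bit of a bit string."""
    flipped = '1' if genome[position] == '0' else '0'
    return genome[:position] + flipped + genome[position + 1:]

def single_point_crossover(parent_a, parent_b, point):
    """Single-point crossover: one parent contributes everything before
    the exchange point, the other everything after it."""
    return parent_a[:point] + parent_b[point:]

def uniform_crossover(parent_a, parent_b):
    """Uniform crossover: each position comes from either parent with
    50/50 probability."""
    return ''.join(random.choice(pair) for pair in zip(parent_a, parent_b))
```

Single-point crossover preserves long contiguous blocks of a parent's code, while uniform crossover mixes the parents position by position; which behaviour is preferable depends on how solution quality is distributed across the genome.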

2.8 STRENGTHS OF GENETIC ALGORITHMS


 The first and most important point is that genetic algorithms are intrinsically parallel. Most other algorithms are serial and can only explore the solution space of a problem in one direction at a time; if the solution they discover turns out to be suboptimal, there is nothing to do but abandon all work previously completed and start over. However, since GAs have multiple offspring, they can explore the solution space in multiple directions at once. If one path turns out to be a dead end, they can easily eliminate it and continue work on more promising avenues, giving them a greater chance each run of finding the optimal solution (Adam, 2004; John, 1992).

 However, the advantage of parallelism goes beyond this. Consider the

following: All the 8-digit binary strings (strings of 0's and 1's) form a

search space, which can be represented as ******** (where the * stands

for "either 0 or 1"). The string 01101010 is one member of this space.

However, it is also a member of the space 0*******, the space 01******,

the space 0******0, the space 0*1*1*1*, the space 01*01**0, and so on.

By evaluating the fitness of this one particular string, a genetic algorithm

would be sampling each of these many spaces to which it belongs. Over

many such evaluations, it would build up an increasingly accurate value

for the average fitness of each of these spaces, each of which has many

members. Therefore, a GA that explicitly evaluates a small number of

individuals is implicitly evaluating a much larger group of individuals -


just as a pollster who asks questions of a certain member of an ethnic,

religious or social group hopes to learn something about the opinions of all

members of that group, and therefore can reliably predict national opinion

while sampling only a small percentage of the population. In the same way,

the GA can "home in" on the space with the highest-fitness individuals and

find the overall best one from that group. In the context of evolutionary

algorithms, this is known as the Schema Theorem, and is the "central

advantage" of a GA over other problem-solving methods (John, 1992;

Mitchell, 1996; Goldberg, 1989).
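By way of illustration (this sketch is not from the cited works), the schemata that a single string participates in can be enumerated directly; every evaluation of that one string implicitly samples all of them:

```python
from itertools import combinations

def schemata_of(bitstring):
    """Enumerate every schema (a template over {0, 1, *}) that the given
    string matches: each schema fixes some subset of positions to the
    string's own bits and leaves the rest as the wildcard '*'."""
    n = len(bitstring)
    schemas = []
    for k in range(n + 1):
        for fixed in combinations(range(n), k):
            schemas.append(''.join(bitstring[i] if i in fixed else '*'
                                   for i in range(n)))
    return schemas

schemas = schemata_of("01101010")
print(len(schemas))            # 256: one evaluation samples 2^8 schemata
print("0*1*1*1*" in schemas)   # True: one of the spaces named in the text
```

For an 8-bit string there are 2^8 = 256 such schemata, which is why a GA evaluating a modest population is, in effect, gathering statistics on a far larger collection of subspaces.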

 Due to the parallelism that allows them to implicitly evaluate many

schemas at once, genetic algorithms are particularly well-suited to solving

problems where the space of all potential solutions is truly huge - too vast

to search exhaustively in any reasonable amount of time. Most problems

that fall into this category are known as "nonlinear". In a linear problem,

the fitness of each component is independent, so any improvement to any

one part will result in an improvement of the system as a whole. Needless

to say, few real-world problems are like this. Nonlinearity is the norm,

where changing one component may have ripple effects on the entire

system, and where multiple changes that individually are detrimental may

lead to much greater improvements in fitness when combined. Nonlinearity

results in a combinatorial explosion: the space of 1,000-digit binary strings

can be exhaustively searched by evaluating only 2,000 possibilities if the


problem is linear, whereas if it is nonlinear, an exhaustive search requires

evaluating 2^1000 possibilities - a number that would take over 300 digits

to write out in full (Adam, 2004).
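The arithmetic behind this comparison can be verified in a few lines (an illustrative sketch, not part of the cited source):

```python
# A linear problem lets each of the n bits be tuned independently, so two
# evaluations per bit suffice.  A nonlinear problem forces an exhaustive
# search over every combination of bits.
n = 1000
linear_evals = 2 * n
nonlinear_evals = 2 ** n

print(linear_evals)               # 2000
print(len(str(nonlinear_evals)))  # 302 digits, i.e. "over 300 digits"
```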

 Fortunately, the implicit parallelism of a GA allows it to surmount even

this enormous number of possibilities, successfully finding optimal or very

good results in a short period of time after directly sampling only small

regions of the vast fitness landscape (Forrest, 1993). For example, a genetic

algorithm developed jointly by engineers from General Electric and

Rensselaer Polytechnic Institute produced a high-performance jet engine

turbine design that was three times better than a human-designed

configuration and 50% better than a configuration designed by an expert

system by successfully navigating a solution space containing more than

10^387 possibilities. Conventional methods for designing such turbines are

a central part of engineering projects that can take up to five years and cost

over $2 billion; the genetic algorithm discovered this solution after two

days on a typical engineering desktop workstation (John, 1992).

 Another notable strength of genetic algorithms is that they perform well in

problems for which the fitness landscape is complex - ones where the

fitness function is discontinuous, noisy, changes over time, or has many

local optima. Most practical problems have a vast solution space,

impossible to search exhaustively; the challenge then becomes how to

avoid the local optima - solutions that are better than all the others that are
similar to them, but that are not as good as different ones elsewhere in the

solution space. Many search algorithms can become trapped by local

optima: if they reach the top of a hill on the fitness landscape, they will

discover that no better solutions exist nearby and conclude that they have

reached the best one, even though higher peaks exist elsewhere on the map

(Adam, 2004).

 Evolutionary algorithms, on the other hand, have proven to be effective at

escaping local optima and discovering the global optimum in even a very

rugged and complex fitness landscape. (It should be noted that, in reality,

there is usually no way to tell whether a given solution to a problem is the

one global optimum or just a very high local optimum. However, even if a

GA does not always deliver a provably perfect solution to a problem, it can

almost always deliver at least a very good solution.) All four of a GA's

major components - parallelism, selection, mutation, and crossover - work

together to accomplish this. In the beginning, the GA generates a diverse

initial population, casting a "net" over the fitness landscape. (Koza et al.

(2003) compare this to an army of parachutists dropping onto the landscape

of a problem's search space, each given orders to find the

highest peak.) Small mutations enable each individual to explore its

immediate neighborhood, while selection focuses progress, guiding the

algorithm's offspring uphill to more promising parts of the solution space

(John, 1992).
 However, crossover is the key element that distinguishes genetic

algorithms from other methods such as hill-climbers and simulated

annealing. Without crossover, each individual solution is on its own,

exploring the search space in its immediate vicinity without reference to

what other individuals may have discovered. However, with crossover in

place, there is a transfer of information between successful candidates -

individuals can benefit from what others have learned, and schemata can

be mixed and combined, with the potential to produce an offspring that has

the strengths of both its parents and the weaknesses of neither. This point

is illustrated in Koza et al. (1999), where the authors discuss a problem of

synthesizing a low pass filter using genetic programming. In one

generation, two parent circuits were selected to undergo crossover; one

parent had good topology (components such as inductors and capacitors in

the right places) but bad sizing (values of inductance and capacitance for

its components that were far too low). The other parent had bad topology,

but good sizing. The result of mating the two through crossover was an

offspring with the good topology of one parent and the good sizing of the

other, resulting in a substantial improvement in fitness over both its

parents.
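The mechanism behind this recombination of strengths can be sketched in a few lines; this is an illustrative single-point crossover on bit strings, not Koza's circuit-synthesis system, and the example strings are contrived so that each parent is strong in a different half:

```python
import random

def single_point_crossover(parent_a, parent_b, point=None):
    """Exchange the tails of two equal-length genomes at one crossover
    point, producing two offspring that each mix both parents."""
    assert len(parent_a) == len(parent_b)
    if point is None:
        point = random.randint(1, len(parent_a) - 1)
    child_a = parent_a[:point] + parent_b[point:]
    child_b = parent_b[:point] + parent_a[point:]
    return child_a, child_b

# One parent has a good "head", the other a good "tail"; crossing over at
# the midpoint yields a child with the strengths of both.
a, b = single_point_crossover("11110000", "00001111", point=4)
print(a)  # 11111111
print(b)  # 00000000
```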

 The problem of finding the global optimum in a space with many local

optima is also known as the dilemma of exploration vs. exploitation, "a

classic problem for all systems that can adapt and learn" (John, 1992). Once
an algorithm (or a human designer) has found a problem-solving strategy

that seems to work satisfactorily, should it concentrate on making the best

use of that strategy, or should it search for others? Abandoning a proven

strategy to look for new ones is almost guaranteed to involve losses and

degradation of performance, at least in the short term. But if one sticks with

a particular strategy to the exclusion of all others, one runs the risk of not

discovering better strategies that exist but have not yet been found. Again,

genetic algorithms have shown themselves to be very good at striking this

balance and discovering good solutions with a reasonable amount of time

and computational effort (Adam, 2004).

 Another area in which genetic algorithms excel is their ability to

manipulate many parameters simultaneously (Forrest, 1993). Many real-

world problems cannot be stated in terms of a single value to be minimized

or maximized, but must be expressed in terms of multiple objectives,

usually with tradeoffs involved: one can only be improved at the expense

of another. GAs are very good at solving such problems: in particular, their

use of parallelism enables them to produce multiple equally good solutions

to the same problem, possibly with one candidate solution optimizing one

parameter and another candidate optimizing a different one (Haupt et al.,

1998), and a human overseer can then select one of these candidates to use.

If a particular solution to a multi-objective problem optimizes one

parameter to a degree such that that parameter cannot be further improved


without causing a corresponding decrease in the quality of some other

parameter, that solution is called Pareto optimal or non-dominated (Coello,

2000).

 Finally, one of the qualities of genetic algorithms which might at first

appear to be a liability turns out to be one of their strengths: namely, GAs

know nothing about the problems they are deployed to solve. Instead of

using previously known domain-specific information to guide each step

and making changes with a specific eye towards improvement, as human

designers do, they are "blind watchmakers" (Dawkins, 1996); they make

random changes to their candidate solutions and then use the fitness

function to determine whether those changes produce an improvement.

 The virtue of this technique is that it allows genetic algorithms to start out

with an open mind, so to speak. Since its decisions are based on

randomness, all possible search pathways are theoretically open to a GA;

by contrast, any problem-solving strategy that relies on prior knowledge

must inevitably begin by ruling out many pathways a priori, therefore

missing any novel solutions that may exist there (Koza et al., 1999).

Lacking preconceptions based on established beliefs of "how things should

be done" or what "couldn't possibly work", GAs do not have this problem.

Similarly, any technique that relies on prior knowledge will break down

when such knowledge is not available, but again, GAs are not adversely

affected by ignorance (Goldberg, 1989). Through their components of


parallelism, crossover and mutation, they can range widely over the fitness

landscape, exploring regions which intelligently produced algorithms

might have overlooked, and potentially uncovering solutions of startling

and unexpected creativity that might never have occurred to human

designers. One vivid illustration of this is the rediscovery, by genetic

programming, of the concept of negative feedback - a principle crucial to

many important electronic components today, but one that, when it was

first discovered, was denied a patent for nine years because the concept

was so contrary to established beliefs (Koza et al., 2003). Evolutionary

algorithms, of course, are neither aware nor concerned whether a solution

runs counter to established beliefs - only whether it works.

2.9 LIMITATIONS OF GENETIC ALGORITHMS

Although genetic algorithms have proven to be an efficient and powerful

problem-solving strategy, they are not a panacea. GAs do have certain

limitations which are outlined below:

 The first, and most important, consideration in creating a genetic algorithm

is defining a representation for the problem. The language used to specify

candidate solutions must be robust; i.e., it must be able to tolerate random

changes such that fatal errors or nonsense do not consistently result.

There are two main ways of achieving this. The first, which is used by most

genetic algorithms, is to define individuals as lists of numbers - binary-

valued, integer-valued, or real-valued - where each number represents


some aspect of a candidate solution. If the individuals are binary strings, 0

or 1 could stand for the absence or presence of a given feature. If they are

lists of numbers, these numbers could represent many different things: the

weights of the links in a neural network, the order of the cities visited in a

given tour, the spatial placement of electronic components, the values fed

into a controller, the torsion angles of peptide bonds in a protein, and so

on. Mutation then entails changing these numbers, flipping bits or adding

or subtracting random values. In this case, the actual program code does

not change; the code is what manages the simulation and keeps track of the

individuals, evaluating their fitness and perhaps ensuring that only values

realistic and possible for the given problem result.
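As an illustrative sketch (not drawn from the cited sources), mutation on such list-of-numbers representations reduces to flipping bits or nudging values, and it can never produce a syntactically invalid individual:

```python
import random

def mutate_bits(genome, rate, rng=random):
    """Flip each bit independently with probability `rate`."""
    return [bit ^ 1 if rng.random() < rate else bit for bit in genome]

def mutate_reals(genome, rate, sigma=0.1, rng=random):
    """With probability `rate`, perturb each real-valued gene by small
    Gaussian noise; the code interpreting the genome never changes."""
    return [g + rng.gauss(0.0, sigma) if rng.random() < rate else g
            for g in genome]

print(mutate_bits([0, 1, 1, 0, 1, 0, 1, 0], rate=1.0))  # every bit flipped
```

Whatever values result, the genome remains a well-formed list of the same length, which is precisely the robustness property the text demands of a representation.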

In another method, genetic programming, the actual program code does

change. As discussed in the section Methods of representation, GP

represents individuals as executable trees of code that can be mutated by

changing or swapping sub-trees. Both of these methods produce

representations that are robust against mutation and can represent many

different kinds of problems, and both have had considerable success in

various examples on which they have been applied.

This issue of representing candidate solutions in a robust way does not arise

in nature, because the method of representation used by evolution, namely

the genetic code, is inherently robust: with only a very few exceptions, such

as a string of stop codons, there is no such thing as a sequence of DNA


bases that cannot be translated into a protein. Therefore, virtually any

change to an individual's genes will still produce an intelligible result, and

so mutations in evolution have a higher chance of producing an

improvement. This is in contrast to human-created languages such as

English, where the number of meaningful words is small compared to the

total number of ways one can combine letters of the alphabet, and therefore

random changes to an English sentence are likely to produce nonsense

(Adam, 2004).

 The problem of how to write the fitness function must be carefully

considered so that higher fitness is attainable and actually does equate to a

better solution for the given problem. If the fitness function is chosen

poorly or defined imprecisely, the genetic algorithm may be unable to find

a solution to the problem, or may end up solving the wrong problem. (This

latter situation is sometimes described as the tendency of a GA to "cheat",

although in reality all that is happening is that the GA is doing what it was

told to do, not what its creators intended it to do.) This is not a problem in

nature, however. In the laboratory of biological evolution there is only one

fitness function, which is the same for all living things - the drive to survive

and reproduce, no matter what adaptations make this possible. Those

organisms which reproduce more abundantly compared to their

competitors are fitter; those which fail to reproduce are unfit (Adam, 2004).
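A hypothetical illustration of this pitfall in the timetabling domain: a fitness function typically scores a candidate by penalising each violated constraint, and if the penalty weights allowed soft preferences to outweigh hard constraints, the GA would "cheat" by trading clashes for convenience. The function and its weights below are assumptions for illustration, not taken from any cited work:

```python
def timetable_fitness(hard_violations, soft_violations,
                      hard_penalty=1000, soft_penalty=1):
    """Higher is better.  The hard-constraint penalty must dominate, or the
    GA will do exactly what it is told rather than what was intended.
    (Both penalty weights are illustrative assumptions.)"""
    return -(hard_penalty * hard_violations + soft_penalty * soft_violations)

# A clash-free but inconvenient timetable must always outscore a clashing one:
print(timetable_fitness(0, 40) > timetable_fitness(1, 0))  # True
```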

 In addition to making a good choice of fitness function, the other


parameters of a GA - the size of the population, the rate of mutation and

crossover, the type and strength of selection - must be also chosen with

care. If the population size is too small, the genetic algorithm may not

explore enough of the solution space to consistently find good solutions. If

the rate of genetic change is too high or the selection scheme is chosen

poorly, beneficial schema may be disrupted and the population may enter

error catastrophe, changing too fast for selection to ever bring about

convergence (Adam, 2004).

Living things do face similar difficulties, and evolution has dealt with

them. It is true that if a population size falls too low, mutation rates are too

high, or the selection pressure is too strong (such a situation might be

caused by drastic environmental change), then the species may go extinct.

The solution has been "the evolution of evolvability" - adaptations that alter

a species' ability to adapt. For example, most living things have evolved

elaborate molecular machinery that checks for and corrects errors during

the process of DNA replication, keeping their mutation rate down to

acceptably low levels; conversely, in times of severe environmental stress,

some bacterial species enter a state of hypermutation where the rate of

DNA replication errors rises sharply, increasing the chance that a

compensating mutation will be discovered. Of course, not all catastrophes

can be evaded, but the enormous diversity and highly complex adaptations

of living things today show that, in general, evolution is a successful


strategy. Likewise, the diverse applications of and impressive results

produced by genetic algorithms show them to be a powerful and

worthwhile field of study (John, 1975).

 One type of problem that genetic algorithms have difficulty dealing with

are problems with "deceptive" fitness functions (Mitchell, 1996), those

where the locations of improved points give misleading information about

where the global optimum is likely to be found. For example, imagine a

problem where the search space consisted of all eight-character binary

strings, and the fitness of an individual was directly proportional to the

number of 1s in it - i.e., 00000001 would be less fit than 00000011, which

would be less fit than 00000111, and so on - with two exceptions: the string

11111111 turned out to have very low fitness, and the string 00000000

turned out to have very high fitness. In such a problem, a GA (as well as

most other algorithms) would be no more likely to find the global optimum

than random search.
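The deceptive landscape just described can be written down directly (the exact fitness values below are arbitrary choices for illustration; only their ordering matters):

```python
def deceptive_fitness(s):
    """The eight-bit landscape described above: fitness tracks the number
    of 1s, except that the all-1s string is the worst point and the all-0s
    string is the global optimum.  (The values 0 and 9 are illustrative.)"""
    if s == "11111111":
        return 0
    if s == "00000000":
        return 9
    return s.count("1")

print(deceptive_fitness("00000111"))  # 3: the gradient points toward more 1s
print(deceptive_fitness("11111111"))  # 0: ...but following it leads to a trap
print(deceptive_fitness("00000000"))  # 9: the true optimum lies the other way
```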

The resolution to this problem is the same for both genetic algorithms and

biological evolution: evolution is not a process that has to find the single

global optimum every time. It can do almost as well by reaching the top of

a high local optimum, and for most situations, this will suffice, even if the

global optimum cannot easily be reached from that point. Evolution is very

much a "satisficer" - an algorithm that delivers a "good enough" solution,

though not necessarily the best possible solution, given a reasonable


amount of time and effort invested in the search. The Evidence for Jury-

Rigged Design in Nature FAQ gives examples of this very outcome

appearing in nature. (It is also worth noting that few, if any, real-world

problems are as fully deceptive as the somewhat contrived example given

above. Usually, the location of local improvements gives at least some

information about the location of the global optimum.)

 One well-known problem that can occur with a GA is known as premature

convergence. If an individual that is more fit than most of its competitors

emerges early on in the course of the run, it may reproduce so abundantly

that it drives down the population's diversity too soon, leading the

algorithm to converge on the local optimum that that individual represents

rather than searching the fitness landscape thoroughly enough to find the

global optimum (Forrest, 1993; Mitchell, 1996). This is an especially

common problem in small populations, where even chance variations in

reproduction rate may cause one genotype to become dominant over

others.

The most common methods implemented by GA researchers to deal with

this problem all involve controlling the strength of selection, so as not to

give excessively fit individuals too great of an advantage. Rank, scaling

and tournament selection, discussed earlier, are three major means for

accomplishing this; some methods of scaling selection include sigma

scaling, in which reproduction is based on a statistical comparison to the


population's average fitness, and Boltzmann selection, in which the

strength of selection increases over the course of a run in a manner similar

to the "temperature" variable in simulated annealing (Mitchell, 1996).
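One common formulation of sigma scaling (after Mitchell, 1996) can be sketched as follows; the floor value used here is an assumption for illustration:

```python
import statistics

def sigma_scaled_weights(fitnesses, floor=0.1):
    """Sigma scaling: an individual's reproductive weight depends on how
    many standard deviations its fitness sits from the population mean,
    which damps the advantage of an early super-fit individual."""
    mean = statistics.mean(fitnesses)
    sd = statistics.pstdev(fitnesses)
    if sd == 0:
        return [1.0] * len(fitnesses)  # no variation: everyone weighted equally
    return [max(floor, 1 + (f - mean) / (2 * sd)) for f in fitnesses]

# A runaway individual (fitness 100 vs ~1) receives a bounded weight rather
# than swamping the population:
weights = sigma_scaled_weights([1, 1, 2, 100])
print(weights[-1] / weights[0] < 100)  # True: its advantage is compressed
```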

Premature convergence does occur in nature (where it is called genetic drift

by biologists). This should not be surprising; as discussed above, evolution

as a problem-solving strategy is under no obligation to find the single best

solution, merely one that is good enough. However, premature

convergence in nature is less common since most beneficial mutations in

living things produce only small, incremental fitness improvements;

mutations that produce such a large fitness gain as to give their possessors

dramatic reproductive advantage are rare.

 Finally, several researchers (John, 1992; Forrest, 1993; Haupt et al., 1998)

advise against using genetic algorithms on analytically solvable problems.

It is not that genetic algorithms cannot find good solutions to such

problems; it is merely that traditional analytic methods take much less time

and computational effort than GAs and, unlike GAs, are usually

mathematically guaranteed to deliver the one exact solution. Of course,

since there is no such thing as a mathematically perfect solution to any

problem of biological adaptation, this issue does not arise in nature.

2.10 APPLICATION OF GENETIC ALGORITHMS IN THIS RESEARCH

Having considered the basis for a genetic algorithm, the outline below highlights

the applications of the proposed system in generating timetables (with

Department of Computer Science, Akanu Ibiam Federal Polytechnic, Unwana as

case study).

The timetabling problem is a combinatorial optimization problem (COP). Because

no comprehensive mathematical framework exists that both describes it fully and

overcomes its NP-hardness, highly abstract heuristics such as genetic

algorithms are introduced. The basic property of the timetable problem is

the attempt of the genetic algorithm to optimize a function over a discrete

structure with many independent variables. The relation between the choices

made in the discrete domain and the effects on the objective function value are

usually complex and frequently not easy to trace. The unifying framework for

COP’s is the Constraint Satisfaction Problem (CSP) in conjunction with the

optimization of an objective function (Kostuch, 2003). It is worthy of note that

even though the timetabling problem is treated as an optimization problem, there

is actually no fixed objective function, the function that exists is used as an

arbitrary measure to check for optimized solutions and degree of constraints

satisfaction (Abramson, 1992). Once the objectives and constraints are specified,
genetic algorithms offer the ultimate scenarios of good timetable solutions

through evolution processes even though the complexity of assignment is totally

dependent on the number of instances and number of constraints.

Hence the algorithm considered for use in the proposed system is a scaled down

version of the Hybrid Genetic algorithm for the construction of examination


timetables developed for the University of Nottingham. The concept though

developed for examination timetabling, can be adapted to fit the construction of

course timetables. The genetic algorithm employed combines two heuristic

algorithms, the first finding a non-conflicting set of exams and the second

assigning the selected exam to rooms. The process is repeated until all exams

have been scheduled without conflicts (Weare et al., 1995).

Figure 2.10: Diagram depicting the Hybrid Genetic Algorithm used at the

University of Nottingham (Adapted from Weare et al., 1995).

Like other genetic algorithms, this algorithm can quickly produce large

populations of random feasible exam timetables. Uniquely, the process takes

each member of the course population and assigns it to the first period in which

the exam may be placed without conflict. The mutation and crossover

procedures are then applied to the population so that constraints associated

with each course in the assignment are satisfied. The timetables generated by
the algorithm with a starting population size of 200 had an average fitness of

0.986 (Weare et al., 1995).
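The first of the two heuristic stages described above, assigning each exam to the earliest period in which it conflicts with nothing already placed, can be sketched as follows. This is a simplified illustration rather than the actual Nottingham implementation, and the course codes are hypothetical:

```python
def greedy_schedule(exams, conflicts, n_periods):
    """Assign each exam to the first period where it clashes with nothing
    already scheduled there; exams that cannot be placed are marked None
    (and would be left for repair by mutation and crossover).

    `conflicts` is a set of frozensets naming pairs of exams that share
    students and therefore cannot share a period.
    """
    periods = [[] for _ in range(n_periods)]
    placement = {}
    for exam in exams:
        for p, scheduled in enumerate(periods):
            if all(frozenset((exam, other)) not in conflicts
                   for other in scheduled):
                scheduled.append(exam)
                placement[exam] = p
                break
        else:
            placement[exam] = None
    return placement

conflicts = {frozenset(("COM101", "COM102")), frozenset(("COM102", "MTH110"))}
print(greedy_schedule(["COM101", "COM102", "MTH110"], conflicts, n_periods=2))
# {'COM101': 0, 'COM102': 1, 'MTH110': 0}
```

Room assignment, the second heuristic, would then place each scheduled exam into a room of sufficient capacity within its period.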
