A Genetic Algorithm For Database Query Optimization: February 1970
3 authors:
Yannis Ioannidis
National and Kapodistrian University of Athens
Figure 3: Scaled Cost of Strategy at Convergence: (a) average of 5 runs and (b) best of 5 runs. [Plot omitted; curves: GA in L with CHUNK, GA in L with M2S, System-R in L, GA in A with M2S, GA in A with CHUNK.]

Another interesting comparison is that between the two crossovers. When the GA is applied to L, M2S is the preferred crossover, with CHUNK performing much worse. This is because, when the relations are the genes of the chromosome, applying CHUNK produces many offspring with Cartesian products. In that case, the algorithm spends much of its time on useless matings and thus fails to converge to a good strategy. M2S, on the other hand, produces far fewer strategies with Cartesian products and is the overall winner. Exactly the opposite happens when the GA is applied in A. CHUNK is the best performer, [...]

[...] problem, and the best solution among those found be chosen. With that in mind, we also compare the best [...]

[...]ing the optimum strategy. CHUNK is considerably improved as well, but still has inferior performance for the reasons explained above. Similar improvements are seen in the A space as well. Especially in the large joins, both crossovers find very good strategies. All these results indicate that multiple runs of the GA al- [...]

4.3 TIME

The average time results are presented in Figure 4, where the x-axis represents the number of joins in the query and the y-axis represents the processing time in seconds. For the 16-join queries on which System-R failed to finish, we use the time-to-failure in this figure. The results are as follows. System-R is faster for queries of size up to 14, but the GA in L is much [...]

[...] in the population resides on a processor and communication is carried out by message passing. The total communication overhead is thus minimal. Based on results on other optimization problems [1] where the evaluation of the fitness function dominates the processing time, as is the case with query optimization, we expect linear speedups in execution time. Since only limited parallelism can be incorporated into System-R, the time to execute the parallel GA should become much smaller than that of System-R.

5 CONCLUSIONS

We have presented a genetic algorithm for database query optimization. In doing so we have intro-
[Figure 4 plot omitted; y-axis: Time; curves: GA in L with CHUNK, GA in L with M2S, System-R in L, GA in A with M2S, GA in A with CHUNK.]

/ Combinatorial Optimization Conference, Waterloo, September 1990, Ontario, Canada, 1990. University of Waterloo Press.
[2] G.A. Cleveland and S.F. Smith. Using genetic algorithms to schedule flow shop releases. In Schaeffer [13], pages 160-169.
[3] E. F. Codd. A relational model of data for large shared data banks. CACM, 13(6):377-387, 1970.
[4] D.E. Goldberg. Genetic Algorithms in Search, Optimization and Machine Learning. Addison-