Project Guidelines & Format (Final)
A Project Report
on
A FRAMEWORK FOR
CLUSTERING CATEGORICAL DATA
Submitted in partial fulfillment of the requirements for the award of the degree of
BACHELOR OF TECHNOLOGY
IN
INFORMATION TECHNOLOGY
Submitted by
P. Anitha (16A51A1222)
P. Anitha (16A51A1222) P. Anitha (16A51A1222)
P. Anitha (16A51A1222) P. Anitha (16A51A1222)
2016-2020
Approved by AICTE, New Delhi
Affiliated to JNTU Kakinada
Accredited by NBA (UG: CSE, ECE, EEE, ME, CE & IT)
Accredited by NAAC (UGC) with A+ Grade
Recognised by UGC Under Section 2(f) & 12(B)
TEQIP Participated College
Recognised by Scientific & Industrial Research Organisation (SIRO)
CERTIFICATE
This is to certify that the work embodied in this project entitled “YOUR
Manikanth (15A5A1222) in partial fulfillment for the award of the degree of Bachelor
The results embodied in this project report have not been submitted to any other
We do hereby declare that the work embodied in this project entitled “YOUR
PROJECT TITLE” is the outcome of research work carried out by us under the direct
Information Technology during the period 2019-20. The work is original and has not
Project Associates
P. Anitha (16A51A1222)
P. Anitha (16A51A1222)
P. Anitha (16A51A1222)
P. Anitha (16A51A1222)
P. Anitha (16A51A1222)
ACKNOWLEDGEMENTS
This project would have been impossible without the people who supported us and
believed in us.
We would like to express our sincere, heartfelt gratitude to our honorable and esteemed
supervisor Dr. B. V. Ramana, Professor and Head, Department of Information
Technology (IT), Aditya Institute of Technology and Management (AITAM), Tekkali,
affiliated to JNTUK, Kakinada, for his kind and valuable guidance in the completion
of the project work. His consistent support and intellectual guidance inspired us to
develop new ideas. We are glad to have worked under his supervision.
We thank all our friends and classmates for their love and support. Last but not least,
we would like to thank our parents for supporting us in every way as we completed our
bachelor’s degree.
Project Associates
P. Anitha (16A51A1222)
P. Anitha (16A51A1222)
P. Anitha (16A51A1222)
P. Anitha (16A51A1222)
P. Anitha (16A51A1222)
Vision of the Department
Mission of the Department
M1: The Information Technology program dedicates itself to providing students with a set
of skills, knowledge and attitudes that will permit its graduates to succeed and thrive as
information technologists.
M2: Enhance overall personality development, which includes innovative thinking,
teamwork, entrepreneurial skills, communication skills, employability skills and ethical
conduct.
M3: Ensure an effective teaching-learning process to provide in-depth knowledge of
interdisciplinary areas.
M4: Provide industry interaction through consultancy and sponsored research for
societal needs.
PEO3: Possess leadership qualities, nourish ethical responsibilities and cherish with
communication skills.
PEO4: Enrich lifelong learning with technical savvy and promote the progressive
societal needs
PROGRAM OUTCOMES (POs)
PROJ.TOT: 2 2 2 3 3 1 1 1 3 3 3 1 1 1 1
1: Slight (Low)
2: Moderate (Medium)
3: Substantial (High)
ABSTRACT
Spotted Hyena Optimizer (SHO) is a recently developed swarm-based metaheuristic
algorithm for solving constrained and unconstrained engineering design problems. On
complicated non-linear real-world tasks, SHO sometimes performs poorly because of its
weak exploration ability. To strengthen exploration alongside exploitation of the search
region, an enhanced version of the classical SHO is proposed. The suggested
method is designated STS-SHO. In STS-SHO, a new evolutionary technique named
Space Transformation Search (STS) is incorporated into the original
SHO. The suggested method is assessed on the IEEE CEC 2017 benchmark
problems. Its efficacy is demonstrated using standard measures: the performance
metrics defined in CEC 2017, complexity analysis, convergence
analysis and statistical analysis. Further, as a real-world application, the
algorithm is applied to train a Pi-Sigma neural network (PSNN) on 13
benchmark datasets taken from the UCI repository. It can be
concluded that the suggested method, STS-SHO, is an effective and reliable
algorithm that is able to solve real-life optimization problems.
1 INTRODUCTION
Optimization is the process of examining the potential solutions within a fixed search
space and picking the best one, subject to given initial conditions
and input parameters. At present, optimization techniques are used across
engineering and science disciplines such as chemical engineering [1], engineering
design [2], information systems [3], and operations and supply chain management [4].
The problems in these fields are complex and hard to optimize, which has
motivated the development of diverse meta-heuristic algorithms for finding
optimal solutions. One popular family of techniques for generating an optimal solution
to a specific problem is the deterministic, or gradient-based,
methods. Such methods are commonly used in supervised training. Their limitations
include (i) difficulty in reaching the global optimum accurately, (ii) a slow convergence
rate, and (iii) a strong dependence on the input parameters. To deal
with these difficulties and with the gradually increasing
complexity of real-life problems, many nature-inspired meta-heuristic algorithms have
been developed. Algorithms of this type are also termed stochastic
optimization algorithms. Their advantages are that they can escape local optima
on the way to the global optimum and can cope with increased problem
complexity. Their widespread acceptance is due to their independence from
problem-specific details and to the randomness involved in choosing initial parameters
and in searching for solutions.
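
The contrast between a deterministic gradient-based method and a stochastic method can be illustrated with a minimal sketch. Everything in it is an assumption chosen for illustration and is not taken from the report: the one-dimensional objective f, the starting point, the learning rate, and the use of pure random search as the simplest possible stochastic method.

```python
import random


def f(x):
    """Illustrative objective (assumed): one local and one global minimum."""
    return x**4 - 4 * x**2 + x


def df(x):
    """Analytical derivative of f."""
    return 4 * x**3 - 8 * x + 1


def gradient_descent(x0, lr=0.01, steps=500):
    """Deterministic search: always follow the negative gradient from x0."""
    x = x0
    for _ in range(steps):
        x -= lr * df(x)
    return x


def random_search(lo=-3.0, hi=3.0, samples=2000, seed=0):
    """Stochastic search: sample uniformly in [lo, hi] and keep the best point."""
    rng = random.Random(seed)
    best = rng.uniform(lo, hi)
    for _ in range(samples - 1):
        cand = rng.uniform(lo, hi)
        if f(cand) < f(best):
            best = cand
    return best


if __name__ == "__main__":
    x_gd = gradient_descent(2.0)   # starts in the basin of the local minimum
    x_rs = random_search()
    print("gradient descent:", x_gd, f(x_gd))
    print("random search:   ", x_rs, f(x_rs))
```

Started at x = 2, gradient descent settles in the nearby local minimum on the positive side, while the random sampler is not tied to a starting basin and reaches the deeper minimum on the negative side. Practical meta-heuristics replace the blind sampling with guided moves, but the role of randomness is the same.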
Meta-heuristic algorithms are broadly classified into two categories: single-solution-based
algorithms and population-based algorithms. In the single-solution-based
approach, one candidate solution is chosen from the set of all possible solutions and
is refined over repeated iterations until the desired optimized
result is achieved. The merit of this approach is its faster execution, owing to the
lower complexity of the process. Its demerit
is that it may become trapped in a local optimum and so fail to
attain the global optimum. Popular methods in this category available in the
literature are Hill climbing [5], Tabu search [6], Iterated local search [7], Simulated
annealing [8] and Guided local search [9]. In contrast, the population-based
approach maintains a set of candidate solutions instead of a single candidate
solution. The objective function assesses the quality of
each candidate solution and guides the search towards the optimal result.
Population-based methods are further subdivided into two categories: evolutionary
algorithms and swarm-based algorithms. Evolutionary algorithms follow a
mechanism inspired by biological evolution, which
involves four operators: random selection, reproduction, recombination and
mutation. The evolution process is carried out using these operators, and the
objective function decides the fitness of each candidate solution. Some of
the important examples of the evolutionary approach available in the literature are Genetic
algorithm [10], Genetic programming [11], Differential evolution (DE) [12], Scatter
search [13], Memetic algorithm [14], Evolutionary programming [15] and Evolution
strategy [16]. The second category of population-based algorithms is the swarm-based
algorithms, which are inspired by the social and swarming
behavior of creatures such as birds, ants, crows, grey wolves, bees,
hyenas, whales and salps. This approach is popular among researchers because of its
wide range of applications and because it is comparatively easy to understand and
implement. It is widely accepted for its ability to solve many complex real-world
optimization problems with high efficacy. Popular swarm-based algorithms
in the literature are Particle swarm optimization (PSO) [17], Ant colony optimization
[18], Ant lion optimization [19], Whale optimization [20], Grey wolf optimization [21],
Spotted hyena optimization [22], Artificial bee colony [23] and Bat algorithm [24].
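
To make the swarm-based idea concrete, the following is a minimal sketch of a textbook global-best Particle swarm optimization loop. It is not the method of the report: the sphere test function, the swarm size, the bounds, and the coefficients w, c1 and c2 are assumed values chosen only for illustration.

```python
import random


def sphere(x):
    """Assumed test objective: sum of squares, minimum 0 at the origin."""
    return sum(v * v for v in x)


def pso(objective, dim=2, n_particles=20, iters=200, bounds=(-5.0, 5.0),
        w=0.7, c1=1.5, c2=1.5, seed=0):
    """Global-best PSO sketch; returns (best_position, best_value)."""
    rng = random.Random(seed)
    lo, hi = bounds
    # Population: every particle is one candidate solution.
    pos = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]                  # personal best positions
    pbest_val = [objective(p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]  # swarm-wide best

    for _ in range(iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Inertia + pull towards personal best + pull towards swarm best.
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                # Move and keep the particle inside the search bounds.
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))
            val = objective(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val


if __name__ == "__main__":
    best, value = pso(sphere)
    print("best position:", best)
    print("best value:   ", value)
```

The objective function is the only problem-specific part. The shared swarm-wide best is what distinguishes this family from a single-solution method, since each particle is steered by information gathered by the whole population.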
New algorithms are proposed continually, each motivated by specific real-life
problems, because no single algorithm gives good results for all problems. An
important result in this regard is the No Free Lunch (NFL) theorem [25]. The NFL
theorem is one of the most interesting developments for stochastic algorithms and is a
main reason why new algorithms of this kind appear each year. According to the NFL
theorem, all optimization algorithms perform equally when their performance is
averaged over all possible objective functions.
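
The NFL statement can be checked by brute force on a very small problem. The sketch below is an illustration under assumptions, not part of the report: the four-point search space, the three possible objective values, and the two search strategies (fixed_order and adaptive) are all hypothetical. It enumerates every objective function on the space and compares how often each strategy reaches each best value within a given evaluation budget.

```python
from collections import Counter
from itertools import product

POINTS = (0, 1, 2, 3)   # tiny search space (assumed)
VALUES = (0, 1, 2)      # possible objective values, lower is better (assumed)


def fixed_order(f, budget):
    """Non-adaptive search: evaluate points 0, 1, 2, ... in order."""
    return [f[x] for x in POINTS[:budget]]


def adaptive(f, budget):
    """Adaptive search: the next point depends on the last value seen."""
    visited, seen = [], []
    x = 3
    while len(seen) < budget:
        visited.append(x)
        seen.append(f[x])
        remaining = [p for p in POINTS if p not in visited]
        if not remaining:
            break
        # After the best possible value go to the smallest unvisited point,
        # otherwise go to the largest one. Points are never revisited.
        x = remaining[0] if seen[-1] == 0 else remaining[-1]
    return seen


def best_value_histogram(algorithm, budget):
    """Count, over ALL functions POINTS -> VALUES, the best value found."""
    hist = Counter()
    for values in product(VALUES, repeat=len(POINTS)):
        f = dict(zip(POINTS, values))
        hist[min(algorithm(f, budget))] += 1
    return hist


if __name__ == "__main__":
    for budget in (1, 2, 3, 4):
        print(budget,
              dict(best_value_histogram(fixed_order, budget)),
              dict(best_value_histogram(adaptive, budget)))
```

For every budget the two histograms are identical, even though the strategies visit different points on individual functions. An algorithm can therefore only outperform another on a restricted class of problems, which is the usual argument for designing new algorithms for specific applications.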
References
1. Panda, N. and Majhi, S.K., 2020. Improved spotted hyena optimizer with space
transformational search for training pi-sigma higher order neural network.
Computational Intelligence, 36(1), pp.320-350.
2. Panda, N. and Majhi, S.K., 2019. Improved Salp Swarm Algorithm with Space
Transformation Search for Training Neural Network. Arabian Journal for
Science and Engineering, pp.1-19.