Software Engineering Using Artificial Intelligence
All content following this page was uploaded by Mohamed Salah Hamdi on 22 October 2014.
1. Introduction

The software-intensive systems we develop today are becoming much more complex in terms of the number of functional and nonfunctional requirements they must support. Low quality can also have a catastrophic impact on the mission of these systems in many critical applications. Moreover, the cost of software development dominates the total cost of such systems.

Research in applying artificial intelligence (AI) techniques to software engineering has grown tremendously in the last two decades, producing a large number of projects and publications. A number of conferences and journals are dedicated to publishing the research in this field. AI techniques are proposed in order to reduce the time to market and enhance the quality of software systems. Yet many of these AI techniques remain largely confined to the research community, with little impact on the processes and tools used by the practicing software engineer.

The recent survey papers published in this field are mainly targeted at the research community. They are driven by the specific AI techniques used rather than by the software engineering activities supported. They also focus on a specific software engineering process, such as software design [28].

This survey paper attempts to close the gap between the research and practice of applying AI techniques to the software engineering processes. It also highlights open practical problems for the research community in applying such techniques by surveying the recently proposed work in this area.

2. Background

This section briefly introduces the primary processes of the IEEE 12207 standard for software life cycle processes. This standard is well documented, widely used by industry, and has been adopted by the major standards organizations.

2.1 The IEEE 12207 standard for Information Technology-software life cycle processes

This standard establishes a framework for software life cycle processes and provides well-defined terminology to be used by all the stakeholders. It defines processes, activities, and tasks that are to be applied during the acquisition of a system that contains software, a stand-alone software product, or a software service. The standard covers the tasks for all the life cycle phases during the supply, development, operation, and maintenance of software products.

We briefly describe the processes defined by the standard and their dependencies in a layered architecture. The top layer contains the five primary life cycle processes defined by the standard. These are the acquisition, supply, development, operation, and maintenance processes. The standard also defines four organizational processes that include the management process. Eight other supporting processes are defined in the middle layer to support the primary processes and the management process in the top layer. The dependencies between the processes are specified. For example, the acquisition and the supply
© ICCIT 2012
processes interact to establish the contract for the project and start the management process, which in turn manages the development, operation, and maintenance processes. The operation process depends on the maintenance process to correct errors found, and the latter in turn depends on the development process to redevelop components that require major changes.

3. The Development Process

In this section we focus on the major tasks of the development process based on the standard described above and then survey some of the AI techniques used in supporting the tasks of this process. In particular, we focus on the tasks related to requirements analysis, architecture design, coding, and testing.

The system and software architectures play a major role in driving the management activities during the development and maintenance of software systems. The standard provides a flow of the development process activities and associated documents. The system architecture design activity, which produces the software architecture and requirements allocation description (SARAD) document, establishes the top-level architecture of the system and defines the software and hardware items of the system. These items are then developed concurrently. The development of the software items starts with the software requirements analysis activity, which produces the software requirements specification (SRS) document. Then the software architecture design activity produces the software architecture description (SAD) and software interface design description (SIDD) documents. This is followed by the software coding and testing activities. The application of AI techniques in support of the tasks of these activities is described in the following subsections.

3.1 Software requirements analysis

Requirement Engineering (RE):
Requirements are first expressed in natural language within a set of documents. These documents usually represent “the unresolved views of a group of individuals and will, in most cases be fragmentary, inconsistent, contradictory, not prioritized and often be overstated, beyond actual needs” [31]. The main activities of this phase are requirements elicitation, gathering, and analysis, and their transformation into a less ambiguous representation [36]. Problems arising during this phase can be summarized as follows:
● Requirements are ambiguous [27]
● Requirements are incomplete, vague and imprecise [34], [35]
● Requirements are conflicting [35]
● Requirements are volatile [18]
● There are communication problems between the stakeholders [37]
● Requirements are difficult to manage [10]
In the following we explore the techniques used in the requirements engineering field.

Processing Natural Language Requirements (NLR):
The automatic transformation of NLR into specifications and designs began in the early 1980s. In [1], Abbott drew an analogy between the noun phrases used in NL descriptions and the data types used in programming languages. In those days requirements and modeling were not as distinct activities as they are now. In [30], [18], [26], the authors noted that verb phrases and, to some extent, adjectives describe relationships between these entities, operations, and functions.

In the following, we give examples of some of the systems that have attempted to produce formal specifications from NL requirements. In [30], the authors proposed a framework to translate specifications written in NL (English) into formal specifications (TELL). Their system was not implemented but set the foundations for future systems. In [5], the NL2ACTL system was introduced, which aims to translate NL sentences, written to express properties of a reactive system, into statements of an action-based temporal logic. In [18], the authors developed the FORSEN system, which aims to translate NL requirements into the formal specification language VDM. This system allowed the detection of ambiguities in the NL requirements.

In the following, we give examples of some of the systems that have attempted to produce object-oriented (OO) models from NL requirements. In [11], the authors defined a general framework for the automatic development of OO models from NL requirements using linguistics instruments. In [20], the Large-scale Object-based Linguistic Interactor Translator Analyser (LOLITA) NLP system was used to develop NL-OOPS, which aims to produce OO specifications from NL requirements. In [11], the researchers developed an approach that linked the linguistic world and the conceptual world through a set of linguistic patterns. In [7], the authors developed the Class-Model Builder (CM-Builder), an NL-based CASE tool that builds class diagrams specified in UML from NL requirements documents.

Knowledge Based Systems (KBS):
In [13], the authors stated that “The reuse of experts design knowledge can play a significant role in improving the quality and efficiency of the software development process”. KBS were used to store design families. Upon the development of the requirements, inputs, and outputs of the system’s functionality, the system searches the KB and proposes a design schema, which is refined by the user to fully satisfy the requirements. In [31], the READS tool supports both the front-end activities, such as requirement
discovery, analysis, and decomposition, and requirements traceability, allocation, testing, and documentation.

Ontologies:
Ontologies are developed by many organizations to reuse, integrate, and merge data and knowledge and to achieve interoperability and communication among their software systems. In [34], the authors use semantic web and ontological techniques to elicit, represent, model, analyze, and reason about knowledge and information involved in requirements engineering processes. In [4], the researchers developed the Ontology-based software Development Environment (ODE), which is based on a software process ontology.

Intelligence Computing for Requirements Engineering:
In this section, we discuss some of the systems developed using Computational Intelligence (CI) techniques to support requirements engineering. The SPECIFIER system [21] can best be viewed as a case-based system that takes as input an informal specification of an operation where the pre- and post-conditions are given as English sentences. In [35], the authors used fuzzy logic and possibility theory to develop an approximate reasoning schema for inferring the relative priority of requirements under uncertainty. The aim is to achieve an effective trade-off among conflicting requirements so that each conflicting requirement can be satisfied to some degree.

In [12], an approach is presented that uses computational linguistics to analyze textual scenarios, to identify where actors or whole actions are missing from the text, to fill in the missing information, and to generate a message sequence chart (MSC) including the information missing from the textual scenario. The requirements analyst then validates the generated MSC.

In Jose Del Salgado Martinez et al. (Chapter 6 in [19]), the authors constructed a Bayesian network to predict whether a requirements specification has enough quality to be considered as a baseline. In order to structure and quantify the final model of the Bayesian network, called “Requisites”, several information sources were used, such as standards, reports, and interaction with experts. This Bayesian network represents the knowledge needed when assessing a requirements specification. Requisites was demonstrated on some use cases. After the information collected about the certainty of a subset of variables is propagated over the network, the predicted value determines whether the requirements specification has to be revised or not.

In [3], the authors present a collaborative and situational tool called MUSTER that has been designed and developed for requirements elicitation workshops. The tool also offers an example of how a group support system, coupled with artificial intelligence, can be applied to very practical activities and situations within the software development process.

3.2 Software architecture design

One of the most important problems facing the software engineer is to develop a quality architecture from the requirements model. In this section we describe recent work on software architecture design using AI techniques. Developing the software architecture starts by defining a hierarchy of subsystems and components with allocated responsibilities from the information provided by the requirements and analysis models. AI techniques use quality attributes to define a goodness function over the space of possible architectures. Some of the most common quality attributes of architecture design used in developing the architecture are modularity, complexity, modifiability, understandability (or clarity), and reusability. Modularity is usually connected to the concepts of coupling and cohesion, where designers strive for a modular design by developing the architecture using loosely coupled and highly cohesive subsystems and components. In an earlier work on using AI techniques for software architecture development, Robyn Lutz [14] used Genetic Algorithms (GAs) to search the space of possible hierarchical decompositions of a system. She introduced a fitness function using an information-theoretic metric capturing the data coupling and control coupling between components. The quality attribute used for the fitness function is related to the complexity and modularity of the produced architecture. Later on, she focused in her further research on Product Line Architectures (PLAs) [15,16], where variation points are explicitly defined to enhance the reusability and modifiability of a reference architecture that can be used to instantiate a family of architectures. Other work on hierarchical decompositions of a system is summarized in [28].

A promising recent work on synthesizing architecture from requirements using GAs is presented in [29]. Figure 1 shows the process of architecture generation. In this work, the requirements model, based on use cases that capture the functional requirements, is used to develop a null architecture that gives the basic decomposition of the functionalities into components. The null architecture is represented by a UML class diagram that is generated from use-case sequence diagrams. The null architecture is used by the GA to first create an initial population of architectures. A fixed library of standard architectural solutions based on styles and patterns is used to produce new generations. The fitness function of the GA is defined by a weighted list of metrics of quality attributes. This function can also be optionally defined by scenarios capturing specific quality attributes. The work presented is restricted to modifiability scenarios because they can be easily formalized. Details of the GA technique used and the results of testing are given in the reference.
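To make the idea of a GA fitness function defined as a weighted list of quality-attribute metrics more concrete, the following is a minimal sketch under stated assumptions. It is not the method of [29] or [14]: instead of applying a library of styles and patterns, it uses a much simpler encoding in which an architecture is an assignment of classes to components. The toy classes, the dependency list, the three metrics, and the weights are all hypothetical and chosen only for illustration.

```python
import random
from collections import Counter

# Hypothetical toy system: classes and "uses" dependencies between them.
CLASSES = ["Order", "Invoice", "Payment", "Sensor", "Actuator", "Controller"]
DEPENDENCIES = [
    ("Order", "Invoice"), ("Invoice", "Payment"), ("Order", "Payment"),
    ("Sensor", "Controller"), ("Controller", "Actuator"),
    ("Order", "Controller"),
]
# Assumed weights for the quality-attribute metrics (not taken from [29]).
WEIGHTS = {"cohesion": 1.0, "coupling": 1.0, "complexity": 1.0}


def metrics(arch):
    """Return quality metrics for an architecture (dict: class -> component id)."""
    total = len(DEPENDENCIES)
    internal = sum(1 for a, b in DEPENDENCIES if arch[a] == arch[b])
    largest = max(Counter(arch.values()).values())
    return {
        "cohesion": internal / total,             # dependencies kept inside a component
        "coupling": (total - internal) / total,   # dependencies crossing components
        "complexity": largest / len(CLASSES),     # relative size of the largest component
    }


def fitness(arch):
    """Weighted sum of metrics: reward cohesion, penalize coupling and complexity."""
    m = metrics(arch)
    return (WEIGHTS["cohesion"] * m["cohesion"]
            - WEIGHTS["coupling"] * m["coupling"]
            - WEIGHTS["complexity"] * m["complexity"])


def evolve(generations=40, pop_size=20, n_components=3, seed=0):
    """Tiny elitist GA; returns the best architecture and the best fitness per generation."""
    rng = random.Random(seed)
    pop = [{c: rng.randrange(n_components) for c in CLASSES} for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        history.append(fitness(pop[0]))
        parents = pop[: pop_size // 2]  # truncation selection; the elite always survives
        children = []
        while len(parents) + len(children) < pop_size:
            mother, father = rng.sample(parents, 2)
            # Uniform crossover: each class inherits its component from one parent.
            child = {c: rng.choice((mother[c], father[c])) for c in CLASSES}
            # Mutation: move one randomly chosen class to a random component.
            child[rng.choice(CLASSES)] = rng.randrange(n_components)
            children.append(child)
        pop = parents + children
    return max(pop, key=fitness), history


if __name__ == "__main__":
    best, history = evolve()
    print("best architecture:", best)
    print("fitness:", round(fitness(best), 3), "metrics:", metrics(best))
```

Changing the entries in `WEIGHTS` shifts the trade-off the search makes, for example between a single large component (no coupling, but high complexity) and many small ones. Because the better half of each population is carried over unchanged, the best fitness never decreases from one generation to the next.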
3.3 Software coding and testing

Techniques learned from AI research make advanced programming much simpler, especially with regard to information flow and control, as a result of advances in knowledge representation. In the following we focus on the AI techniques used in supporting the tasks of coding and testing.

Figure 1. Evolutionary architecture generation. Adopted from [29].

a) Coding:
Software engineers can apply AI techniques to help automate or assist the programming process.

Use of AI to help assist the programming process:
The main idea here is to create an expert system to assist software engineers during software development [23], [24]. In [23], this proposal is called the Programmer's Apprentice Project. The Programmer's Apprentice should have the capability of interacting with the human programmers in exactly the same way as human assistants would, thereby hopefully increasing the productivity of the human programmers. At first, the Apprentice would only be able to handle "the simplest and most routine parts" of programming. As time progresses and research continues, the Apprentice should be able to deal with more complicated tasks. Human programmers will still be necessary to implement code of a 'tricky' nature (such as abstract reasoning, or to better cater to human preferences).

Use of AI to help automate the programming process:
The idea here is to have completely automated program synthesis. This is done by having human specialists write a complete and concise specification of the desired software, so that a system can generate "functions, data structures, or entire programs" directly from the specifications [8]. There are many possible AI technologies that could be applied.

Analogical reasoning can be used in software reuse. The idea is to find a system with similar requirements and modify it. Although this process looks feasible, it has not been demonstrated in software engineering to any great extent.

Closely related to analogical reasoning techniques is case-based reasoning (CBR). CBR is based upon the premise that similar problems are best solved with similar solutions. CBR is argued to offer a number of advantages over many other knowledge management techniques. For program synthesis, retrieval from component repositories and the reuse of successful past experience are important. As an example, one application of CBR technology was to support the reuse of software packages within Ada and C program libraries.

The idea of experience reuse, the most ambitious form of CBR-supported reuse, is closely aligned with what is called the Experience Factory. This field, also known as Organizational Learning, researches methods and techniques for the management, elicitation, and adaptation of reusable artifacts from software engineering projects. An Experience Factory is based upon a number of premises such as a feedback process, appropriate storage of experience, and support of reuse and retrieval [25].

Constraint programming is another AI technique that is applied in software engineering. Constraint programming has been used, for example, to design the PTIDEJ system (Pattern Trace Identification, Detection and Enhancement in Java). PTIDEJ is an automated system designed to identify micro-architectures that look like design patterns in object-oriented source code. A micro-architecture defines a subset of classes in an object-oriented program. The main interest of PTIDEJ is that it is able to provide explanations for its answers. This matters because coding and software engineering are often considered a form of art, and fully automated systems are not always appreciated by potential users (or programmers).

Search Based Software Engineering (SBSE) is an emerging research topic that focuses on representing aspects of software engineering as problems that may be solved using meta-heuristic search algorithms developed in AI. SBSE is the reformulation of software engineering tasks as optimization problems. Genetic algorithms are one of the optimization and search techniques that can be used. Genetic algorithms are used for automatic code generation by optimizing a population of trial solutions to a problem. The individuals in the population are computer programs.

b) Testing:
Software testing remains an expensive task in the development process, and one of the main challenges concerns its possible automation. AI techniques can play a vital role in this regard. Constraint solving is one of these techniques. Since the seminal work of Offutt and DeMillo in the context of mutation testing [40], much attention has been devoted to the use of constraint solving techniques in the automation of software testing (constraint-based testing). ATGen, for example, is a software test data generator based on
symbolic execution and constraint logic programming for Ada programs.

There are many other ways in which AI techniques can support the testing process [19]. One of the earliest studies to suggest the adoption of a knowledge-based system for testing was by Bering and Crawford [2], who describe a Prolog-based expert system that takes a Cobol program as input, parses the input to identify relevant conditions, and then aims to generate test data based on the conditions.

A more active area of research since the mid-1990s has been the use of AI planning for testing. An AI planner could generate test cases, consisting of a sequence of commands, by representing commands as operators, providing initial states, and setting the goal as testing for correct system behavior [9]. AI planning was also used for testing distributed systems [6] and for the generation of test cases for graphical user interfaces [17].

A study by Kobbacy et al. [38] has shown that the use of genetic algorithms for optimization has grown substantially since the 1980s. This trend is also present in their use in testing, with numerous studies aiming to take advantage of their properties in an attempt to generate optimal test cases. The authors in [33], for example, used genetic algorithms for testing object-oriented programs, where the main aim was to construct test cases consisting of a sequence of method calls.

Fuzzy logic is another AI technique that is applied in software testing to manage the uncertainty involved in this phase of software development [39].

4. Open Problems

Open problems in the requirements engineering phase that artificial intelligence can help address include the following [19]:
● Disambiguating natural language requirements
● Developing knowledge-based systems and ontologies to manage the requirements and model problem domains
● The use of computational intelligence to solve the problems of incompleteness and prioritization of requirements.

One of the most difficult problems is the problem of transforming requirements into architectures. Much research is needed in this area to address the ever-increasing complexity of functional and non-functional requirements. Recent important research problems are developing product line architectures and service-oriented architectures using AI techniques.

Test data generation is notoriously hard. Recent work (including that on search-based testing) has made progress towards the ultimate goal of fully automated test case design. However, the techniques that are being developed are often hampered by features of the programs under test.

One area that has received some attention is the use of automated algorithms with machine learning to make repair assignments. In any case, more studies with respect to the appropriate criteria for selecting assignment policy, reward mechanisms, and management goals need to be undertaken.

One open problem with Search-Based Software Testing techniques, and Search-Based Test Data Generation techniques in particular, is the lack of handling of the execution environment that the software under test lives within. The current state of the art in test data generation, for example, ignores or fails to handle interactions with the underlying operating system, the file system, network access, and databases on which the software may depend.

Another problem with Search-Based Software Testing techniques is that, because fitness functions are heuristics, there are cases in which they fail to give adequate guidance to the search.

Constraint-Based Testing (CBT) is the process of generating test cases from programs or models by using constraint programming technology. Scalability is the main challenge that CBT tools have to face. Dealing with hundreds of thousands of lines of code, with dynamic constructions such as huge dynamic data structures, and with non-linear numerical constraints extracted from complex statements are some of the problems we have to deal with.

5. Conclusions

In this paper, we surveyed promising research work on applying AI techniques to solve some of the most important problems facing the software engineer. We surveyed research in the development activities of requirements engineering, software architecture design, and coding and testing processes. We summarized the most important open problems in these active research areas.

6. Acknowledgements

This research work was funded by Qatar National Research Fund (QNRF) under the National Priorities Research Program (NPRP) Grant No.: 09-1205-2-470.

7. References

[1] Abbott, R. J. (1983). Program design by informal English descriptions. CACM, 26(11), 882–894.
[2] Bering, C. A., & Crawford, M. W. (1988). Using an expert system to test a logistics information system. In Proceedings of the IEEE National Aerospace and Electronics Conference (pp. 1363-1368), Dayton, OH. Washington DC: IEEE Computer Society.
[3] Coulin, C., Zowghi, D., & Sahraoui, A. (2010). MUSTER: A Situational Tool for Requirements Elicitation. In F. Meziane, & S. Vadera (Eds.), Artificial Intelligence Applications for Improved Software Engineering Development: New Prospects (pp. 146-165).
[4] Falbo, R. A., Guizzardi, G., Natali, A. C., Bertollo, G., Ruy, F. F., & Mian, P. G. (2002). Towards semantic software engineering environments. Proceedings of the 14th international Conference on Software Engineering and Knowledge Engineering (pp. 477-478).
[5] Fantechi, A., Gnesi, S., Ristori, G., Carenini, M., Vanocchi, M., & Moreschini, P. (1994). Assisting requirement formalization by means of natural language translation. Formal Methods in System Design, 4(3), 243–263.
[6] Gupta, M., Bastani, F., Khan, L., & Yen, I.-L. (2004). Automated test data generation using MEA-graph planning. In Proceedings of the Sixteenth IEEE Conference on Tools with Artificial Intelligence (pp. 174-182). Washington, DC: IEEE Computer Society.
[7] Harmain, H. M., & Gaizauskas, R. (2003). CM-Builder: A natural language-based CASE tool for object-oriented analysis. Automated Software Engineering Journal, 10(2), 157–181.
[8] Hewett, Micheal, and Rattikorn Hewett (1994). 1994 IEEE 10th Conference on Artificial Intelligence for Applications.
[9] Howe, A. E., von Mayrhauser, A., & Mraz, R. T. (1995). Test sequences as plans: an experiment in using an AI planner to generate system tests. In Proceedings of the Tenth Conference on Knowledge-Based Software Engineering (pp. 184-191).
[10] Hull, E., Jackson, K., & Dick, J. (2005). Requirements Engineering. Berlin: Springer.
[11] Juristo, N., Moreno, A. M., & López, M. (2000). How to use linguistics instruments for Object-Oriented Analysis. IEEE Software, (May/June): 80–89.
[12] Kof, L. (2010). From Textual Scenarios to Message Sequence Charts. In F. Meziane, & S. Vadera (Eds.), Artificial Intelligence Applications for Improved Software Engineering Development: New Prospects (pp. 83-105).
[13] Lubars, M. D., & Harandi, M. T. (1987). Knowledge-based software design using design schemas. In Proceedings of the 9th international Conference on Software Engineering (pp. 253-262).
[14] Lutz, R. “Evolving good hierarchical decompositions of complex systems,” Journal of Systems Architecture 47 (2001), 613–634.
[15] Lutz, R. “A Survey of Product-Line Verification and Validation Techniques,” JPL-NASA Technical Report, 2007. http://trs-new.jpl.nasa.gov/dspace/bitstream/2014/41221/1/07-2165.pdf
[16] Jing (Janet) J. Liu, Samik Basu and Robyn R. Lutz: Generating Variation Point Obligations for Compositional Model Checking of Software Product Lines. Journal of Automated Software Engineering, p. 29, vol. 18, 2011.
[17] Memon, A. M., Pollack, M. E., & Soffa, M. L. (1999). Using a Goal Driven Approach to Generate Test Cases for GUIs. In Proceedings of the Twenty-first International Conference on Software Engineering (pp. 257-266).
[18] Meziane, F. (1994). From English to Formal Specifications. PhD Thesis, University of Salford, UK.
[19] Meziane, F. and Vadera, S. (2010). Artificial Intelligence in Software Engineering Current Developments and Future Prospects. In "Artificial Intelligence Applications for Improved Software Engineering Development: New Prospects", IGI Global.
[20] Mich, L. (1996). NL-OOPS: from natural language to object oriented requirements using the natural language processing system LOLITA. Natural Language Engineering, 2(2), 161–187.
[21] Miriyala, K., & Harandi, M. T. (1991). Automatic derivation of formal software specifications from informal descriptions. IEEE Transactions on Software Engineering, 17(10), 1126–1142.
[22] Moreno, C. A., Juristo, N., & Van de Riet, R. P. (2000). Formal justification in object-oriented modelling: A linguistic approach. Data & Knowledge Engineering, 33, 25–47.
[23] Partridge, Derek, ed. (1991). Artificial Intelligence and Software Engineering. New Jersey: University of Exeter, 1991.
[24] Phil B. (1999). The Use of Artificial Intelligence for Program Development, http://www.philforhumanity.com/The_Use_of_Artificial_Intelligence_for_Program_Development.html
[25] Shepperd, M. J. (2009). Case-based reasoning and software engineering. Empirical Software Engineering. Springer. Retrieved from http://hdl.handle.net/2438/3049.
[26] Poo, D. C. C., & Lee, S. Y. (1995). Domain object identification through events and functions. Information and Software Technology, 37(11), 609–621.
[27] Presland, S. G. (1986). The analysis of natural language requirements documents. PhD Thesis, University of Liverpool, UK.
[28] Outi Räihä, “A survey on search-based software design,” Computer Science Review, 4 (2010) 203–249.
[29] Outi Räihä, Hadaytullah, Kai Koskimies and Erkki Mäkinen, “Synthesizing Architecture from Requirements: A Genetic Approach,” University of Tampere, Department of Computer Sciences, Series of Publications – Net Publications, August 2010.
[30] Saeki, M., Horai, H., & Enomoto, H. (1989). Software development process from natural language specification. In Proceedings of the 11th international Conference on Software Engineering (pp. 64-73), Pittsburgh, PA.
[31] Smith, T. J. (1993). READS: a requirements engineering tool. Proceedings of IEEE International Symposium on Requirements Engineering (pp. 94–97), San Diego.
SSBSE (2010). http://www.ssbse.org, checked 10.5.2011.
[32] Vadera, S., & Meziane, F. (1994). From English to Formal Specifications. The Computer Journal, 37(9), 753–763.
von Mayrhauser, A., France, R., Scheetz, M., & Dahlman, E. (2000). Generating test-cases from an object-oriented model with an artificial-intelligence planning system. IEEE Transactions on Reliability, 49(1), 26–36. doi:10.1109/24.855534
[33] Wappler, S., & Wegener, J. (2006). Evolutionary unit testing of object-oriented software using strongly-typed genetic programming. In Proceedings of the Eighth Annual Conference on Genetic and Evolutionary Computation (pp. 1925-1932), Seattle, WA. New York: ACM Press.
[34] Yang, Y., Xia, F., Zhang, W., Xiao, X., Li, Y., & Li, X. (2008). Towards Semantic Requirement Engineering. IEEE International Workshop on Semantic Computing and Systems (pp. 67-71).
[35] Yen, J., & Liu, F. X. (1995). A Formal Approach to the Analysis of Priorities of Imprecise Conflicting Requirements. In Proceedings of the 7th international Conference on Tools with Artificial intelligence. Herndon, VA, USA.
[36] Young, R. R. (2003). The requirements Engineering Handbook. Norwood, MA: Artech House Inc.
[37] Zave, P. (1997). Classification of Research Efforts in Requirements Engineering. ACM Computing Surveys, 29(4), 315–321.
[38] Kobbacy, K. A., Vadera, S., & Rasmy, M. H. (2007). AI and OR in management of operations: history and trends. The Journal of the Operational Research Society, 58, 10–28. doi:10.1057/palgrave.jors.2602132
[39] Nand, S., Kaur, A., Jain S. (2007). Use Of Fuzzy Logic In Software Development. Issues in Information Systems, Volume VIII, No. 2, pp. 238-244.
[40] DeMillo, R.A., Offutt, A.J. (1991). Constraint-based automatic test data generation. IEEE Transactions on Software Engineering 17 (9), 900–910.