SE Notes-1
UNIT I
An alternative definition of software engineering is: “An engineering approach to develop software”.
Software engineering principles have evolved over the last sixty years with contributions from
numerous researchers and software professionals. Over the years, it has emerged from a pure art to
a craft, and finally to an engineering discipline. The early programmers used an ad hoc programming
style. This style of program development is now variously being referred to as exploratory, build and
fix, and code and fix styles.
Software products
A variety of software is available off-the-shelf, such as Microsoft’s Windows and the Office suite, Oracle DBMS, and the software that accompanies a camcorder or a laser printer. Such software can be purchased off-the-shelf and is used by a diverse range of customers. These are called generic software products, since many users essentially use the same software.
Software services
Indian software companies have excelled in executing software services projects and have made a
name for themselves all over the world. Generic product development, in contrast, entails a certain
amount of business risk: a company needs to invest upfront, and there is substantial risk concerning
whether the investment will turn profitable.
Abstraction refers to the construction of a simpler version of a problem by ignoring the details. The
principle of constructing an abstraction is popularly known as modelling (or model construction).
In other words, abstraction is the simplification of a problem by focusing on only one aspect of the
problem while omitting all other aspects.
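As a minimal illustrative sketch (the class and attribute names are hypothetical, not taken from any particular system), a library member could be modelled for a book-issue problem by recording only the details relevant to issuing books, while ignoring every other aspect of a real member:

    class LibraryMember:
        # Abstraction: only the attributes needed for the book-issue problem are modelled.
        def __init__(self, member_id, name):
            self.member_id = member_id
            self.name = name
            self.books_issued = []      # titles currently issued to this member

        def can_issue_more(self, max_books=5):
            # The model answers only the question relevant to the problem at hand.
            return len(self.books_issued) < max_books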
Software engineering techniques have evolved over many years. This evolution is the result of a
series of innovations and the accumulation of experience about writing good programs.
Early Computer Programming: Early commercial computers were very slow and elementary compared
to today’s standards. Even simple processing tasks took considerable computation time on those
computers. No wonder that programs at that time were very small in size and lacked sophistication.
Those programs were usually written in assembly languages, and program lengths were typically
limited to a few hundred lines of monolithic assembly code.
High-level Language Programming: Computers became faster with the introduction of semiconductor
technology in the early 1960s, as faster semiconductor transistors replaced the prevalent vacuum
tube-based circuits. With the availability of more powerful computers, it became possible to solve
larger and more complex problems. At this time, high-level languages such as FORTRAN, ALGOL, and
COBOL were introduced.
Control Flow-based Design: As the size and complexity of programs kept increasing, the exploratory
programming style proved to be insufficient. Programmers found it increasingly difficult not only to
write cost-effective and correct programs, but also to understand and maintain programs written by
others. To cope with this problem, experienced programmers advised other programmers to pay
particular attention to the design of a program’s control flow structure. A program’s control flow
structure indicates the sequence in which the program’s instructions are executed. To help develop
programs having good control flow structures, the flow charting technique was developed.
Structured programming - a logical extension: The need to restrict the use of GO TO statements was
recognised by everybody. However, many programmers were still using assembly languages, in which
JUMP instructions are frequently used for program branching. Therefore, programmers with an
assembly language background considered the use of GO TO statements in programs inevitable. A
program is called structured when it is built using only three types of constructs: sequence,
selection, and iteration. An example of a sequence statement is an assignment statement of the
form a=b;.
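A minimal sketch of the three constructs in Python (the variables and values are illustrative); note that no GO TO style jump is needed:

    # Sequence: statements executed one after the other
    b = 5
    a = b                 # an assignment such as a = b is a sequence statement
    total = a + 10

    # Selection: one of two alternative paths is chosen
    if total > 10:
        size = "large"
    else:
        size = "small"

    # Iteration: a block is repeated while a condition holds
    count = 0
    while count < 3:
        print(size, count)
        count += 1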
Object-oriented Design Data flow-oriented techniques evolved into object-oriented design (OOD)
techniques in the late seventies. Object-oriented design technique is an intuitively appealing
approach, where the natural objects (such as employees, pay-roll-register, etc.) relevant to a
problem are first identified, and then the relationships among the objects, such as composition,
reference, and inheritance, are determined. Each object essentially acts as a data hiding (also known
as data abstraction) entity.
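A minimal sketch of these ideas in Python, assuming hypothetical attribute and method names: Employee hides its data behind methods (data hiding), Manager inherits from Employee (inheritance), and PayrollRegister is composed of Employee objects (composition):

    class Employee:
        def __init__(self, emp_id, name, basic_pay):
            self._emp_id = emp_id          # data hiding: internal state accessed only via methods
            self._name = name
            self._basic_pay = basic_pay

        def monthly_pay(self):
            return self._basic_pay

    class Manager(Employee):               # inheritance: a Manager is a kind of Employee
        def __init__(self, emp_id, name, basic_pay, allowance):
            super().__init__(emp_id, name, basic_pay)
            self._allowance = allowance

        def monthly_pay(self):
            return super().monthly_pay() + self._allowance

    class PayrollRegister:                 # composition: the register is made up of Employee objects
        def __init__(self):
            self._employees = []

        def add(self, employee):
            self._employees.append(employee)

        def total_payout(self):
            return sum(e.monthly_pay() for e in self._employees)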
The life cycle of a software represents the series of identifiable stages through which it evolves
during its life time.
A good SDLC, besides clearly identifying the different phases in the life cycle, should unambiguously
define the entry and exit criteria for each phase. The entry (or exit) criteria of a phase are usually
expressed as a set of conditions that need to be satisfied for the phase to start (or to complete).
As an example, the phase exit criteria for the software requirements specification phase can be that
the software requirements specification (SRS) document is ready, has been reviewed internally, and
has also been reviewed and approved by the customer.
A software life cycle model (also called process model) is a descriptive and diagrammatic
representation of the software life cycle. A life cycle model represents all the activities required to
make a software product transit through its life cycle phases. It also captures the order in which
these activities are to be undertaken. In other words, a life cycle model maps the different activities
performed on a software product from its inception to retirement.
Different life cycle models may map the basic development activities to phases in different ways.
Thus, no matter which life cycle model is followed, the basic activities are included in all life cycle
models though the activities may be carried out in different orders in different life cycle models.
During any life cycle phase, more than one activity may also be carried out.
The development team must identify a suitable life cycle model for the particular project and then
adhere to it. Without a particular life cycle model, the development of a software product would not
proceed in a systematic and disciplined manner. When a software product is being developed by a
team, there must be a clear understanding among the team members about when and what to do;
otherwise, it would lead to chaos and project failure.
The classical Waterfall Model was the first process model. It is also known as a linear-sequential life
cycle model and is very simple to understand and use.
The classical waterfall model is the earliest, best known, and most commonly used methodology. It is
a sequential life cycle model in which each phase has to be completed before the next phase starts,
which means that no overlapping of phases is allowed.
1. Feasibility Study
The main aim of the feasibility study is to determine whether it would be technically and financially
feasible to develop the product. The feasibility study activity involves analysis of the problem and
collection of all relevant information relating to the product. The collected data is analyzed to arrive
at the following:
An abstract problem definition: Only the important requirements of the customer are captured and
the details of the requirement are ignored.
Formulation of the different strategies for solving the problem: All the different ways in which the
problem can be solved are identified.
Evaluation of the different solution strategies: Different solution strategies are analyzed to examine
their benefits and shortcomings. This analysis usually requires making approximate estimates of the
resources required, cost of development, and development time required for each of the alternative
solutions.
Once the best solution is identified, all later phases of development are carried out as per this
solution. In other words, we can say that during the feasibility study, very high-level decisions
regarding the exact solution strategy to be adopted are made. At the feasibility study stage, it may
also be determined that none of the solutions is feasible due to high cost, resource constraints, or
some technical reasons.
2. Requirements Analysis and Specification
The aim of the requirements analysis and specification phase is to understand the exact
requirements of the customer and to document them properly. This phase consists of two distinct
activities, namely requirements gathering and analysis, and requirements specification as follows:
Requirements Gathering and Analysis: The goal of the requirements gathering activity is to collect all
relevant information from the customer regarding the product to be developed. Once the
requirements have been gathered, the analysis activity is taken up. The goal of requirements
analysis is to weed out the incompleteness and inconsistencies in these requirements.
Requirement Specification: The customer requirements identified during the requirements gathering
and analysis activity are organized into a software requirement specification (SRS) document. The
important components of this document are functional requirements, non-functional requirements,
and the goals of implementation.
3. Design
The goal of the design phase is to transform the requirements specified in the SRS document into a
structure that is suitable for implementation in some programming language. In technical terms,
during the design phase, the software architecture is derived from the SRS document. There are two
design approaches being used at present: traditional design approach and object-oriented
design approach.
4. Coding and Unit Testing
The purpose of the coding and unit testing phase of software development is to translate the software
design into source code. The coding phase is also sometimes called the implementation phase, since
the design is implemented into a workable solution in this phase. Each component of the design is
implemented as a program module. The end-product of this phase is a set of program modules that
have been individually tested. After coding is complete, each module is unit-tested to determine the
correct working of all the individual modules.
5. Integration and System Testing
During this phase, the different modules are integrated in a planned manner. The plan specifies the
order in which the modules are combined to realize the full system. Integration of the various modules
is normally carried out incrementally over a number of steps. After each integration step, the partially
integrated system is tested. Finally, after all the modules have been successfully integrated and tested,
system testing is carried out. The goal of system testing is to ensure that the developed system
conforms to the requirements laid out in the SRS document.
6. Maintenance
Maintenance involves monitoring and improving system performance, enhancing system services,
and upgrading to newer versions. Maintenance of a typical software product requires much more
effort than the effort necessary to develop the product itself. Past studies indicate that the ratio of
the development effort of a typical software product to its maintenance effort is roughly 40:60.
Advantages of Waterfall Model
In the waterfall model, only one phase is executed at a time, and phases cannot overlap.
The waterfall model works best for small projects where the requirements are clearly defined.
Since the phases are clearly demarcated, management can allocate adequate resources and experts to each phase.
Disadvantages of Waterfall Model
Idealistic model: The classical waterfall model is an idealistic one, since it assumes that no
development error is ever committed by the engineers during any of the life cycle phases. However,
in practical development environments, the engineers do commit a large number of errors in almost
every phase of the life cycle.
Difficult to accommodate change requests: The waterfall model assumes that all requirements are
defined correctly at the beginning of the project, and the development work starts on that basis.
However, that is rarely the case in real-life projects. The customer keeps changing the requirements
as development proceeds. Thus, it becomes difficult to accommodate later requirements change
requests made by the customer.
Phases are sequential: The classical waterfall model assumes that all the phases are sequential.
For example, for efficient utilization of manpower in a company, the members assigned the testing
work may start their work immediately after the requirements specification to design the system
test cases. Consequently, it is safe to say that in a practical software development scenario, the
different phases might overlap, rather than having a precise point in time at which one phase stops
and the other starts.
In practice, it is not possible to strictly follow the classical waterfall model for software development
work. In this context, we can view the iterative waterfall model as making necessary changes to the
classical waterfall model so that it becomes applicable to practical software development projects.
For example, if during testing a design error is identified, then the feedback path allows the design
to be reworked and the changes to be reflected in the design documents. However, observe that
there is no feedback path to the feasibility stage. This means that the feasibility study errors cannot
be corrected.
The iterative waterfall model is the most widely used software development model evolved so far.
When to use the Iterative Waterfall Model
The requirements are well defined and clearly understood.
New technology is being learned by the development team.
There are some high-risk features and goals which may change in the future.
Advantages of Iterative Waterfall Model
Feedback path: The iterative waterfall model provides a mechanism for error correction, because
there is a feedback path from each phase to its preceding phase, which the classical waterfall model
lacks.
Simple: The iterative waterfall model is simple to understand and use. It is the most widely used
software development model evolved so far.
Disadvantages of Iterative Waterfall Model
Difficult to include change requests: In the iterative waterfall model, all the requirements must be
clearly defined before the development phase starts, but customer requirements sometimes change,
and it is difficult to incorporate change requests that are made after the development phase has
started.
No support for intermediate delivery: The project has to be fully completed before it is delivered to
the customer.
No risk handling: The project is prone to many types of risks, but there is no risk handling mechanism.
The rapid application development (RAD) model was proposed in the early nineties in an attempt to
overcome the rigidity of the waterfall model (and its derivatives) that makes it difficult to
accommodate any change requests from the customer. It proposed a few radical extensions to the
waterfall model.
This model has the features of both the prototyping and the evolutionary models. It deploys an
evolutionary delivery model to obtain and incorporate customer feedback on incrementally delivered
versions. In this model, prototypes are constructed, and the features are developed and delivered to
the customer incrementally. But unlike the prototyping model, the prototypes are not thrown away;
they are enhanced and used in the software construction. The major goals of the RAD model are the
following:
● To decrease the time taken and the cost incurred to develop software systems.
● To limit the costs of accommodating change requests.
● To reduce the communication gap between the customer and the developers.
Main motivation
In the iterative waterfall model, the customer requirements need to be gathered, analysed,
documented, and signed off upfront, before any development can start. However, clients often do
not know exactly what they want until they see a working system. It has now become well accepted
among practitioners that the exact requirements can be brought out only through the process of
using and commenting on an installed application.
Working of RAD
In the RAD model, development takes place in a series of short cycles or iterations. At any time, the
development team focuses on the present iteration only, and therefore plans are made for one
increment at a time. The time planned for each iteration is called a time box.
The development team almost always includes a customer representative to clarify the
requirements. This is intended to make the system tuned to the exact customer requirements and
also to bridge the communication gap between the customer and the development team. The
development team usually consists of about five to six members, including a customer
representative.
The customers usually suggest changes to a specific feature only after they have used it. Since the
features are delivered in small increments, the customers are able to give their change requests
pertaining to a feature already delivered. Incorporation of such change requests just after the
delivery of an incremental feature saves cost as this is carried out before large investments have
been made in development and testing of a large number of features.
The decrease in development time and cost, and at the same time an increased flexibility to
incorporate changes are achieved in the RAD model in two main ways—minimal use of planning and
heavy reuse of any existing code through rapid prototyping. The lack of long-term and detailed
planning gives the flexibility to accommodate later requirements changes. Reuse of existing code has
been adopted as an important mechanism of reducing the development cost. RAD model
emphasises code reuse as an important means for completing a project faster. In fact, the adopters
of the RAD model were the earliest to embrace object-oriented languages and practices. Further,
RAD advocates use of specialised tools to facilitate fast creation of working prototypes.
These specialised tools usually support features such as a visual style of development.
The following are some of the characteristics of an application that indicate its suitability to RAD
style of development:
Customised software: As already pointed out, customised software is developed for one or two
customers only, by adapting existing software. In customised software development projects,
substantial reuse is usually made of code from pre-existing software. For example, a company might
have developed software for automating the data processing activities at one or more educational
institutes; a customised version for a new institute can then be developed largely by reusing this code.
Non-critical software: The RAD model suggests that quick and dirty software should first be developed
and later refined into the final software for delivery. Therefore, the developed product is usually far
from optimal in performance and reliability, and RAD is suitable only for non-critical software. In this
regard, for well-understood development projects where the scope of reuse is rather restricted, the
iterative waterfall model may provide a better solution.
Highly constrained project schedule: RAD aims to reduce development time at the expense of good
documentation, performance, and reliability. Naturally, for projects with very aggressive time
schedules, RAD model should be preferred.
Large software: Only for software supporting many features (large software) can incremental
development and delivery be meaningfully carried out.
The RAD style of development is not advisable if a development project has one or more of the
following characteristics:
Generic products (wide distribution): As we have already pointed out in Chapter 1, software
products are generic in nature and usually have wide distribution. For such systems, optimal
performance and reliability are imperative in a competitive market. As it has already been discussed,
the RAD model of development may not yield systems having optimal performance and reliability.
Requirement of optimal performance and/or reliability: For certain categories of products, optimal
performance or reliability is required. Examples of such systems include an operating system (high
reliability required) and a flight simulator software (high performance required). If such systems are
to be developed using the RAD model, the desired product performance and reliability may not be
realised.
Lack of similar products: If a company has not developed similar software, then it would hardly be
able to reuse much of the existing artifacts. In the absence of sufficient plug-in components, it
becomes difficult to develop rapid prototypes through reuse, and use of RAD model becomes
meaningless.
Monolithic entity: For certain software, especially small-sized software, it may be hard to divide the
required features into parts that can be incrementally developed and delivered. In this case, it
becomes difficult to develop a software incrementally.
RAD versus prototyping model : In the prototyping model, the developed prototype is primarily
used by the development team to gain insights into the problem, choose between alternatives, and
elicit customer feedback. The code developed during prototype construction is usually thrown away.
In contrast, in RAD it is the developed prototype that evolves into the deliverable software. RAD is
expected to lead to faster software development compared to the traditional models (such as the
prototyping model), though the quality and reliability of the product would be inferior.
In the iterative waterfall model, all the functionalities of a software are developed together. On the
other hand, in the RAD model the product functionalities are developed incrementally through
heavy code and design reuse. Further, in the RAD model customer feedback is obtained on the
developed prototype after each iteration and based on this the prototype is refined. Thus, it
becomes easy to accommodate any request for requirements changes. However, the iterative
waterfall model does not support any mechanism to accommodate any requirement change
requests. The iterative waterfall model does have some important advantages, including the
following: use of the iterative waterfall model leads to the production of good quality documentation,
which can help during software maintenance.
RAD versus evolutionary model: Incremental development is the hallmark of both the evolutionary and
RAD models. However, in RAD each increment results in essentially a quick and dirty prototype,
whereas in the evolutionary model each increment is systematically developed using the iterative
waterfall model. Also, in the RAD model the software is developed in much shorter increments
compared to the evolutionary model. In other words, the incremental functionalities that are
developed are of fairly larger granularity in the evolutionary model.
The agile software development model was proposed in the mid-1990s to overcome the serious
shortcomings of the waterfall model of development identified above. The agile model was primarily
designed to help a project adapt to change requests quickly. Thus, a major aim of the agile models is
to facilitate quick project completion.
Agility is achieved by fitting the process to the project, i.e., removing activities that may not be
necessary for a specific project. Also, anything that wastes time and effort is avoided. Please note
that the agile model is used as an umbrella term to refer to a group of development processes. These
processes share certain common characteristics, but do have certain subtle differences among
themselves. A few popular agile SDLC models are the following:
● Crystal
● Atern (formerly DSDM)
● Feature-driven development
● Scrum
● Extreme programming (XP)
● Lean development
● Unified process
In the agile model, the requirements are decomposed into many small parts that can be
incrementally developed. The agile model adopts an iterative approach. Each incremental part is
developed over an iteration. Each iteration is intended to be small and easily manageable and lasting
for a couple of weeks only. At a time, only one increment is planned, developed, and then deployed
at the customer site. No long-term plans are made. The time to complete an iteration is called a time
box. The implication of the term time box is that the end date for an iteration does not change. That
is, the delivery date is considered sacrosanct. The development team can, however, decide to
reduce the delivered functionality during a time box if necessary. A central principle of the agile
model is the delivery of an increment to the customer after each time box. A few other principles
that are central to the agile model are discussed below.
For establishing close contact with the customer during development and to gain a clear
understanding of the domain-specific issues, each agile project usually includes a customer
representative in the team. At the end of each iteration, stakeholders and the customer
representative review the progress made and re-evaluate the requirements. A distinguishing
characteristic of the agile models is frequent delivery of software increments to the customer. Agile
models emphasise face-to-face communication over written documents. It is recommended that the
development team size be deliberately kept small (5-9 people) to help the team members
meaningfully engage in face-to-face communication and have a collaborative work environment. It is
implicit, then, that the agile model is suited to the development of small projects. However, if a large
project is required to be developed using the agile model, it is likely that the collaborating teams will
work at different locations. In this case, the different teams need to maintain as much daily contact
as possible through video conferencing, telephone, e-mail, etc.
The agile methods derive much of their agility by relying on the tacit knowledge of the team members
about the development project and on informal communications to clarify issues, rather than
spending significant amounts of time on preparing and reviewing formal documents. Though this
eliminates some overhead, the lack of adequate documentation may lead to several types of
problems, which are as follows:
Lack of formal documents leaves scope for confusion and important decisions taken during different
phases can be misinterpreted at later points of time by different team members.
In the absence of any formal documents, it becomes difficult to get important project decisions such
as design decisions to be reviewed by external experts.
When the project completes and the developers disperse, maintenance can become a problem.
Some of the good practices that have been recognised in the extreme programming (XP) model, and
the suggested ways to maximise their use, are as follows:
Code review: Code review is good since it helps detect and correct problems most efficiently. XP
suggests pair programming as the way to achieve continuous review. In pair programming, coding is
carried out by pairs of programmers. The programmers take turns in writing programs, and while one
writes, the other reviews the code that is being written.
Testing: Testing code helps to remove bugs and improves its reliability. XP suggests test-driven
development (TDD) to continually write and execute test cases. In the TDD approach, test cases are
written even before any code is written.
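A minimal TDD-style sketch using Python's unittest module; the issue_book function and its rule are hypothetical. In TDD the test class would be written first and would fail until the simplest code that satisfies it is added:

    import unittest

    # Written second: the simplest code that makes the tests below pass.
    def issue_book(books_already_issued, max_books=5):
        return books_already_issued < max_books

    # Written first: the test cases exist before the function is implemented.
    class TestIssueBook(unittest.TestCase):
        def test_member_below_limit_can_be_issued_a_book(self):
            self.assertTrue(issue_book(books_already_issued=2))

        def test_member_at_limit_cannot_be_issued_a_book(self):
            self.assertFalse(issue_book(books_already_issued=5))

    if __name__ == "__main__":
        unittest.main()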
Simplicity: Simplicity makes it easier to develop good quality code, as well as to test and debug it.
Therefore, one should try to create the simplest code that makes the basic functionality being
written work. For creating the simplest code, one can ignore aspects such as efficiency, reliability,
maintainability, etc. Once the simplest thing works, the other aspects can be introduced through
refactoring.
Design: Since having a good quality design is important for developing a good quality solution,
everybody should design daily. This can be achieved through refactoring, whereby working code is
improved for efficiency and maintainability.
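A small illustrative refactoring in Python (the fine-calculation rule is hypothetical): the first version works but duplicates logic; the refactored version improves maintainability without changing behaviour:

    # Before refactoring: working code with duplicated logic
    def fine_for_student(days_late):
        if days_late <= 0:
            return 0
        return days_late * 2

    def fine_for_faculty(days_late):
        if days_late <= 0:
            return 0
        return days_late * 2

    # After refactoring: the duplication is factored out; the behaviour is unchanged
    def fine(days_late, rate_per_day=2):
        return max(days_late, 0) * rate_per_day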
Integration testing: It is important since it helps identify the bugs at the interfaces of different
functionalities. To this end, extreme programming suggests that the developers should achieve
continuous integration, by building and performing integration testing several times a day.
Scrum Model In the scrum model, a project is divided into small parts of work that can be
incrementally developed and delivered over time boxes that are called sprints. The software
therefore gets developed over a series of manageable chunks. Each sprint typically takes only a
couple of weeks to complete. At the end of each sprint, stakeholders and team members meet to
assess the progress made and the stakeholders suggest to the development team any changes
needed to features that have already been developed and any overall improvements that they might
feel necessary.
SPIRAL MODEL
Spiral model is one of the most important Software Development Life Cycle models, which provides
support for Risk Handling. In its diagrammatic representation, it looks like a spiral with many loops.
The exact number of loops of the spiral is unknown and can vary from project to project. Each loop
of the spiral is called a Phase of the software development process. The exact number of phases
needed to develop the product can be varied by the project manager depending upon the project
risks. Since the project manager dynamically determines the number of phases, the project manager
plays an important role in developing a product using the spiral model.
The radius of the spiral at any point represents the expenses (cost) of the project so far, and the
angular dimension represents the progress made so far in the current phase.
Each phase of the Spiral Model is divided into four quadrants. The functions of these four quadrants
are discussed below:
Objectives determination and identification of alternative solutions: Requirements are gathered from the
customers and the objectives are identified, elaborated, and analyzed at the start of every phase.
Then alternative solutions possible for the phase are proposed in this quadrant.
Identify and resolve Risks: During the second quadrant, all the possible solutions are evaluated to
select the best possible solution. Then the risks associated with that solution are identified and the
risks are resolved using the best possible strategy. At the end of this quadrant, the Prototype is built
for the best possible solution.
Develop next version of the Product: During the third quadrant, the identified features are
developed and verified through testing. At the end of the third quadrant, the next version of the
software is available.
Review and plan for the next phase: In the fourth quadrant, the customers evaluate the version of the
software developed so far. In the end, planning for the next phase is started.
The Prototyping Model also supports risk handling, but the risks must be identified completely before
the start of the development work of the project. In real-life projects, however, risks may surface
after the development work starts; in that case, the Prototyping Model cannot be used. In each phase
of the Spiral Model, the features of the product are elaborated and analyzed, and the risks at that
point in time are identified and resolved through prototyping. Thus, this model is much more flexible
compared to other SDLC models.
The Spiral model is called a Meta-Model because it subsumes all the other SDLC models. For
example, a single loop spiral actually represents the Iterative Waterfall Model. The spiral model
incorporates the stepwise approach of the Classical Waterfall Model. The spiral model uses the
approach of the Prototyping Model by building a prototype at the start of each phase as a risk-
handling technique. Also, the spiral model can be considered as supporting the Evolutionary model –
the iterations along the spiral can be considered as evolutionary levels through which the complete
system is built.
Advantages of Spiral Model
Good for large projects: It is recommended to use the Spiral Model in large and complex projects.
Customer satisfaction: Customers can see the development of the product at an early phase of the
software development and thus become habituated to the system by using it before completion of
the total product.
Disadvantages of Spiral Model
Too much dependence on risk analysis: The successful completion of the project is very much
dependent on risk analysis. Without highly experienced experts, developing a project using this
model is likely to fail.
Difficulty in time management: As the number of phases is unknown at the start of the project, time
estimation is very difficult.
UNIT II
Experienced developers take considerable time to understand the exact requirements of the
customer and to meticulously document those. They know that without a clear understanding of the
problem and proper documentation of the same, it is impossible to develop a satisfactory solution.
For any type of software development project, availability of a good quality requirements document
has been acknowledged to be a key factor in the successful completion of the project. A good
requirements document not only helps to form a clear understanding of various features required
from the software, but also serves as the basis for various activities carried out during later life cycle
phases.
An overview of requirements analysis and specification phase The requirements analysis and
specification phase starts after the feasibility study stage is complete and the project has been found
to be financially viable and technically feasible. The requirements analysis and specification phase
ends when the requirements specification document has been developed and reviewed.
The goal of the requirements analysis and specification phase is to clearly understand the
customer requirements and to systematically organise the requirements into a document called
the Software Requirements Specification (SRS) document.
Requirements analysis and specification activity is usually carried out by a few experienced members
of the development team and it normally requires them to spend some time at the customer site.
The engineers who gather and analyse customer requirements and then write the requirements
specification document are known as system analysts in the software industry parlance. System
analysts collect data pertaining to the product to be developed and analyse the collected data to
conceptualise what exactly needs to be done. After understanding the precise user requirements,
the analysts analyse the requirements to weed out inconsistencies, anomalies and incompleteness.
They then proceed to write the software requirements specification (SRS) document. The SRS
document is the final outcome of the requirements analysis and specification phase.
Once the SRS document is ready, it is first reviewed internally by the project team to ensure that it
accurately captures all the user requirements, and that it is understandable, consistent,
unambiguous, and complete. The SRS document is then given to the customer for review. After the
customer has reviewed the SRS document and agreed to it, it forms the basis for all future
development activities and also serves as a contract document between the customer and the
development organisation.
Main activities carried out during requirements analysis and specification phase
Requirements analysis and specification phase mainly involves carrying out the following two
important activities:
• Requirements gathering and analysis
• Requirements specification
Requirements Gathering and Analysis
The complete set of requirements is almost never available in the form of a single document from
the customer. In fact, it would be unrealistic to expect the customers to produce a comprehensive
document containing a precise description of what he wants. Further, the complete requirements
are rarely obtainable from any single customer representative. Therefore, the requirements have to
be gathered by the analyst from several sources in bits and pieces. These gathered requirements
need to be analysed to remove several types of problems that frequently occur in the requirements
that have been gathered piecemeal from different sources.
We can conceptually divide the requirements gathering and analysis activity into two separate tasks:
• Requirements gathering
• Requirements analysis
Requirements Gathering
Requirements gathering is also popularly known as requirements elicitation. The primary objective
of the requirements gathering task is to collect the requirements from the stakeholders. A
stakeholder is a source of the requirements and is usually a person, or a group of persons who either
directly or indirectly are concerned with the software. Requirements gathering may sound like a
simple task. However, in practice it is very difficult to gather all the necessary information from a
large number of stakeholders and from information scattered across several documents.
Gathering requirements turns out to be especially challenging if there is no working model of the
software being developed.
Suppose a customer wants to automate some activity in his organisation that is currently being
carried out manually. In this case, a working model of the system (that is, the manual system) exists.
Availability of a working model is usually of great help in requirements gathering. For example, if the
project involves automating the existing accounting activities of an organisation, then the task of the
system analyst becomes a lot easier as he can immediately obtain the input and output forms and
the details of the operational procedures.
In this context, consider that it is required to develop a software to automate the book-keeping
activities involved in the operation of a certain office. In this case, the analyst would have to study
the input and output forms and then understand how the outputs are produced from the input data.
However, if a project involves developing something new for which no working model exists, then
the requirements gathering activity becomes all the more difficult. In the absence of a working
system, much more imagination and creativity is required on the part of the system analyst.
Typically, even before visiting the customer site, the requirements gathering activity is started by
studying the existing documents to collect all possible information about the system to be
developed. During visits to the customer site, the analysts normally interview the end-users and
customer representatives, and carry out requirements gathering activities such as questionnaire
surveys, task analysis, scenario analysis, and form analysis.
Task analysis helps the analyst to understand the nitty-gritty of various user tasks and to
represent each task as a hierarchy of subtasks.
Scenario analysis: A task can have many scenarios of operation. The different scenarios of a
task may take place when the task is invoked under different situations. For different
scenarios of a task, the behaviour of the software can be different. For example, the possible
scenarios for the book issue task of a library automation software may be:
• The book is issued successfully to the member and the book issue slip is printed.
• The book is reserved, and hence cannot be issued to the member.
• The maximum number of books that can be issued to the member has already been
reached, and no more books can be issued to the member.
For the various identified tasks, the possible scenarios of execution are identified, and the
details of each scenario are identified in consultation with the users. For each of the identified
scenarios, details regarding the system response, the exact conditions under which the scenario
occurs, etc. are determined in consultation with the user.
Form analysis: Form analysis is an important and effective requirements gathering activity
that is undertaken by the analyst when the project involves automating an existing manual
system. During the operation of a manual system, normally several forms are required to be
filled up by the stakeholders, and in turn they receive several notifications (usually manually
filled forms). In form analysis, the existing forms and the formats of the notifications produced
are analysed to determine the data input to the system and the data that are output from
the system. For the different sets of data input to the system, it is determined from the users
how these input data would be used by the system to produce the corresponding output data.
Requirements Analysis
After requirements gathering is complete, the analyst analyses the gathered requirements to
form a clear understanding of the exact customer requirements and to weed out any
problems in the gathered requirements. It is natural to expect the data collected from
various stakeholders to contain several contradictions, ambiguities, and instances of
incompleteness, since each stakeholder typically has only a partial and incomplete view of
the software.
Therefore, it is necessary to identify all the problems in the requirements and resolve them
through further discussions with the customer.
The main purpose of the requirements analysis activity is to analyse the gathered
requirements to remove all ambiguities, incompleteness, and inconsistencies from the
gathered customer requirements and to obtain a clear understanding of the software to
be developed.
For carrying out requirements analysis effectively, the analyst first needs to develop a clear
grasp of the problem. The following basic questions pertaining to the project should be
clearly understood by the analyst before carrying out the analysis:
What is the problem?
Why is it important to solve the problem?
What exactly are the data input to the system and what exactly are the data output by the
system?
What are the possible procedures that need to be followed to solve the problem?
What are the likely complexities that might arise while solving the problem?
If there are external software or hardware with which the developed software has to
interface, then what should be the data interchange formats with the external systems?
After the analyst has understood the exact customer requirements, he proceeds to identify
and resolve the various problems that he detects in the gathered requirements.
During requirements analysis,the analyst needs to identify and resolve three main types of
problems in the requirements:
• Anomaly: An anomaly is an ambiguity in a requirement. When a requirement is
anomalous, several interpretations of that requirement are possible. Any anomaly in any of
the requirements can lead to the development of an incorrect system, since an anomalous
requirement can be interpreted in several ways during development.
• Inconsistency : Two requirements are said to be inconsistent, if one of the requirements
contradicts the other.
• Incompleteness: An incomplete set of requirements is one in which some requirements
have been overlooked. The lack of these features would be felt by the customer much later,
possibly while using the software. Often, incompleteness is caused by the inability of the
customer to visualise the system that is to be developed and to anticipate all the features
that would be required. An experienced analyst can detect most of these missing features
and suggest them to the customer for his consideration and approval for incorporation in
the requirements.
A well-formulated SRS document finds a variety of uses other than its primary intended
use as a basis for starting the software development work. In the following subsection, we
identify the important uses of a well-formulated SRS document:
Forms an agreement between the customers and the developers: A good SRS document
sets the stage for the customers to form their expectation about the software and the
developers about what is expected from the software.
Reduces future reworks: The process of preparation of the SRS document forces the
stakeholders to rigorously think about all of the requirements before design and
development get underway. This reduces later redesign, recoding, and retesting. Careful
review of the SRS document can reveal omissions, misunderstandings, and inconsistencies
early in the development cycle.
Provides a basis for estimating costs and schedules: Project managers usually estimate the
size of the software from an analysis of the SRS document. Based on this estimate they make
other estimations, such as the effort required to develop the software and the total cost of
development. The SRS document also serves as a basis for price negotiations with the
customer. The project manager also uses the SRS document for work scheduling.
Provides a baseline for validation and verification: The SRS document provides a baseline
against which compliance of the developed software can be checked. It is also used by the
test engineers to create the test plan.
Facilitates future extensions: The SRS document usually serves as a basis for planning
future enhancements.
The skill of writing a good SRS document usually comes from the experience gained from
writing SRS documents for many projects. However, the analyst should be aware of the
desirable qualities that every good SRS document should possess. IEEE Recommended
Practice for Software Requirements Specifications describes the content and qualities of a
good software requirements specification (SRS). Some of the identified desirable qualities of
an SRS document are the following:
Concise: The SRS document should be concise and at the same time unambiguous,
consistent, and complete. Verbose and irrelevant descriptions reduce readability and also
increase the possibilities of errors in the document.
Black-box view: The SRS document should describe the system to be developed as a black
box, and should specify only the externally visible behaviour of the system. For this reason,
the SRS document is also called the black-box specification of the software being developed.
Traceable: It should be possible to trace a specific requirement to the design elements that
implement it and vice versa. Similarly, it should be possible to trace a requirement to the
code segments that implement it and the test cases that test this requirement and vice
versa. Traceability is also important to verify the results of a phase with respect to the
previous phase and to analyse the impact of changing a requirement on the design elements
and the code.
Verifiable: All requirements of the system, as documented in the SRS document, should be
verifiable. This means that it should be possible to design test cases based on the description
of the functionality, so as to check whether or not the requirements have been met in an
implementation. A requirement such as “the system should be user friendly” is not verifiable.
On the other hand, the requirement “When the name of a book is entered, the software
should display whether the book is available for issue or it has been loaned out” is verifiable.
Any feature of the required system that is not verifiable should be listed separately in the
goals of the implementation section of the SRS document.
The most damaging problems are incompleteness, ambiguity, and contradictions. There are
many other types of problems that a specification document might suffer from. By knowing
these problems, one can try to avoid them while writing an SRS document. Some of the
important categories of problems that many SRS documents suffer from are as follows:
Over-specification: It occurs when the analyst tries to address the “how to” aspects in the
SRS document. For example, in the library automation problem, one should not specify
whether the library membership records need to be stored indexed on the member’s first
name or on the library member’s identification (ID) number. Over-specification restricts the
freedom of the designers in arriving at a good design solution.
Forward references: One should not refer to aspects that are discussed much later in the
SRS document. Forward referencing seriously reduces readability of the specification.
Wishful thinking: This type of problem concerns the description of aspects which would be
difficult to implement.
Noise: The term noise refers to presence of material not directly relevant to the software
development process. For example, in the register customer function, suppose the analyst
writes that customer registration department is manned by clerks who report for work
between 8am and 5pm, 7 days a week. This information can be called noise as it would
hardly be of any use to the software developers and would unnecessarily clutter the SRS
document, diverting the attention from the crucial points. Several other “sins” of SRS
documents can be listed and used to guard against writing a bad SRS document and is also
used as a checklist to review an SRS document.
A good SRS document should properly categorize and organise the requirements into
different sections [IEEE830]. As per the IEEE 830 guidelines, an SRS document should clearly
document the following categories of requirements of a software:
• Functional requirements
• Non-functional requirements
• Goals of implementation
The different categories of requirements are discussed below.
Functional requirements
The functional requirements capture the functionalities required by the users from the
system. It is useful to consider a software product as offering a set of functions {fi} to the
user. Each of these functions can be considered similar to a mathematical function f : I → O,
meaning that a function transforms an element (ii) of the input domain (I) to a value (oi) in
the output domain (O).
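For instance, the check-availability function of a library system can be viewed as a mapping from the set of book titles (the input domain I) to the set {available, loaned out} (the output domain O). A minimal sketch with a hypothetical catalogue:

    # Hypothetical catalogue mapping each title to its current status
    catalogue = {
        "Software Engineering": "available",
        "Compiler Design": "loaned out",
    }

    def check_availability(title):
        # f : I -> O, where I is the set of titles and O = {"available", "loaned out"}
        return catalogue.get(title, "not in catalogue")

    print(check_availability("Software Engineering"))   # prints: available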
External interfaces required: Examples of external interfaces are— hardware, software and
communication interfaces, user interfaces, report formats, etc. To specify the user
interfaces, each interface between the software and the users must be described.
Goals of implementation The ‘goals of implementation’ part of the SRS document offers
some general suggestions regarding the software to be developed. These are not binding on
the developers, and they may take these suggestions into account if possible. For example,
the developers may use these suggestions while choosing among different design solutions.
A goal, in contrast to the functional and non-functional requirements, is not checked by the
customer for conformance at the time of acceptance testing.
Functional Requirements:
For example, how do we decide whether a piece of work performed by the system is
significant enough to be called ‘a useful piece of work’? Can the printing of the statement of
an ATM transaction during withdrawal of money from an ATM be called a useful piece of
work? Printing of the ATM transaction statement should not be considered a high-level
requirement, because the user does not specifically request this activity. The receipt gets
printed automatically as part of the withdraw money function.
Usually, the user invokes (requests) the services of each high-level requirement. It may
therefore be more appropriate to treat print receipt as part of the withdraw money function
rather than as a high-level function in its own right. For some of the functions, we might
therefore have to debate whether to consider them as high-level functions or not. However,
it becomes possible to identify most of the high-level functions without much difficulty after
practising the solution to a few exercise problems.
The high-level functional requirements often need to be identified either from an informal
problem description document or from a conceptual understanding of the problem. Each
high-level requirement characterises a way of system usage (service invocation) by some
user to perform some meaningful piece of work. Remember that there can be many types of
users of a system and their requirements from the system may be very different. So, it is
often useful to first identify the different types of users who might use the system and then
try to identify the different services expected from the software by different types of users.
The decision regarding which functionality of the system can be taken to be a high-level
functional requirement and the one that can be considered as part of another function (that
is, a subfunction) leaves scope for some subjectivity. For example, consider the issue-book
function in a Library Automation System.
Once all the high-level functional requirements have been identified and the requirements
problems have been eliminated, these are documented. A function can be documented by
identifying the state at which the data is to be input to the system, its input data domain,
the output data domain, and the type of processing to be carried out on the input data to
obtain the output data. We now illustrate the specification of the functional requirements
through two examples. Let us first try to document the withdraw-cash function of an
automated teller machine (ATM) system. The withdraw-cash function is a high-level
requirement. It has several sub-requirements corresponding to the different user
interactions. These user interaction sequences may vary from one invocation to another
depending on certain conditions. These different interaction sequences capture the different
scenarios. To accurately describe a functional requirement, we must document all the
different scenarios that may occur.
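A minimal sketch of how the withdraw-cash requirement might be documented; the numbering, sub-requirements, and messages are illustrative and not taken from any specific system:

    R.1: Withdraw cash
    Description: The withdraw-cash function first determines the type of account and the
    amount requested, and then dispenses the cash if the account balance is sufficient.
    R.1.1: Select the withdraw-cash option
      Input: "withdraw cash" option selected by the user
      Output: prompt asking the user for the account type
    R.1.2: Select account type
      Input: account type (savings/current) entered by the user
      Output: prompt asking the user for the amount
    R.1.3: Get the required amount
      Input: amount entered by the user
      Output: the requested cash and a printed transaction statement if the balance is
      sufficient; an error message otherwise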
Specification of large software: If there are a large number of functional requirements (much
larger than in the examples seen so far), should they just be written as one long numbered
list of requirements? A better way to organise the functional requirements in this case is to
split the requirements into sections of related requirements. For example, the functional
requirements of an academic institute automation software can be split into sections such as
accounts, academics, inventory, publications, etc. When there are too many functional
requirements, these should be properly arranged into sections.
For example the following can be sections in the trade house automation software:
• Customer management
• Account management
• Purchase management
• Vendor management
• Inventory management
Traceability:
Traceability means that it would be possible to identify (trace) the specific design
component which implements a given requirement, the code part that corresponds to a
given design component, and test cases that test a given requirement. Thus, any given code
component can be traced to the corresponding design component, and a design component
can be traced to a specific requirement that it implements and vice versa.
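As a small illustrative sketch of a traceability mapping (all identifiers are hypothetical), a single requirement might be traced forwards and backwards as follows:

    Requirement R.1 (withdraw cash) -> Design component: CashWithdrawal module
      -> Code: withdraw_cash() -> Test cases: T.1.1 to T.1.3

In the reverse direction, a defect found while running test case T.1.2 can be traced back through withdraw_cash() and the CashWithdrawal module to requirement R.1, so that the impact of changing R.1 on the design and the code can also be analysed.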
Depending on the type of project being handled, some sections can be omitted, introduced,
or interchanged as may be considered prudent by the analyst. However, organisation of the
SRS document to a large extent depends on the preferences of the system analyst himself,
and he is often guided in this by the policies and standards being followed by the
development company. Also, the organisation of the document and the issues discussed in it
to a large extent depend on the type of the product being developed. However, irrespective
of the company’s principles and product type, the three basic issues that any SRS document
should discuss are—functional requirements, non-functional requirements, and guidelines
for system implementation.
Introduction
Purpose: This section should describe where the software would be deployed and how the
software would be used.
Project scope: This section should briefly describe the overall context within which the
software is being developed. For example, the parts of a problem that are being automated
and the parts that would need to be automated during future evolution of the software.
Environmental characteristics: This section should briefly outline the environment
(hardware and other software) with which the software will interact
Product perspective: This section needs to briefly state whether the software is intended to
be a replacement for certain existing systems, or whether it is new software. If the software
being developed would be used as a component of a larger system, a simple schematic
diagram showing the major components of the overall system, the subsystem
interconnections, and the external interfaces can be helpful.
Product features: This section should summarize the major ways in which the software
would be used. Details should be provided in Section 3 of the document. So, only a brief
summary should be presented here.
User classes: Various user classes that are expected to use this software are identified and
described here. The different classes of users are identified by the types of functionalities
that they are expected to invoke, or their levels of expertise in using computers.
Operating environment: This section should discuss in some detail the hardware platform
on which the software would run, the operating system, and other application software with
which the developed software would interact.
Design and implementation constraints: In this section, the different constraints on the
design and implementation are discussed. These might include—corporate or regulatory
policies; hardware limitations (timing requirements, memory requirements); interfaces to
other applications; specific technologies, tools, and databases to be used; specific
programming language to be used; specific communication protocols to be used; security
considerations; design conventions or programming standards.
User documentation: This section should list out the types of user documentation, such as
user manuals, on-line help, and trouble-shooting manuals that will be delivered to the
customer along with the software.
User interfaces:
This section should give a high-level description of the various interfaces and the principles to be followed. The user interface description may include sample screen images,
any GUI standards or style guides that are to be followed, screen layout constraints,
standard push buttons (e.g., help) that will appear on every screen, keyboard shortcuts,
error message display standards, etc. The details of the user interface design should be
documented in a separate user interface specification document.
Hardware interfaces: This section should describe the interface between the software and
the hardware components of the system. This section may include the description of the
supported device types, the nature of the data and control interactions between the
software and the hardware, and the communication protocols to be used.
Software interfaces: This section should describe the connections between this software
and other specific software components, including databases, operating systems, tools,
libraries, and integrated commercial components, etc. The data items that would be input to the software and the data that would be output should be identified, and the purpose of each should be described.
Communications interfaces: This section should describe the requirements associated with
any type of communications required by the software, such as e-mail, web access, network
server communications protocols, etc. This section should define any pertinent message
formatting to be used. It should also identify any communication standards that will be used,
such as TCP sockets, FTP, HTTP, or SHTTP. Specify any communication security or encryption issues that may be relevant, and also the data transfer rates and synchronisation mechanisms.
This section should describe the non-functional requirements other than the design and
implementation constraints and the external interface requirements.
Safety requirements: Those requirements that are concerned with possible loss or damage
that could result from the use of the software are specified here. For example, recovery after
power failure, handling software and hardware failures, etc. may be documented here.
Security requirements: This section should specify any requirements regarding security or
privacy requirements on data used or created by the software. Any user identity
authentication requirements should be described here. It should also refer to any external
policies or regulations concerning the security issues. Define any security or privacy
certifications that must be satisfied. For software that has distinct modes of operation, the different modes of operation can be listed in the functional requirements section, and the specific functionalities available for invocation in each mode can be organised as follows.
Functional requirements
1. Operation mode 1
(a) Functional requirement 1.1
(b) Functional requirement 1.2
2. Operation mode 2
Specification of the behaviour may not be necessary for all systems. It is usually necessary
for those systems in which the system behaviour depends on the state in which the system
is, and the system transits among a set of states depending on some prespecified conditions
and events. The behaviour of a system can be specified using either the finite state machine (FSM) formalism or some other alternative formalism. FSMs can be used to specify the possible states (modes) of the system and the transitions among these states due to the occurrence of events.
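As an illustration, the following is a minimal sketch of an FSM written in Java. The states (IDLE, OPERATIONAL, MAINTENANCE) and the events are hypothetical, chosen only to show how the modes of a system and the transitions among them can be written down; they are not taken from any particular SRS.

```java
// A minimal FSM sketch. States and events are hypothetical and serve only to
// illustrate specifying system modes and the transitions among them.
public class ModeFsm {
    enum State { IDLE, OPERATIONAL, MAINTENANCE }
    enum Event { START, STOP, FAULT, REPAIRED }

    private State current = State.IDLE;

    // Returns the next state for a given event; unspecified combinations keep the current state.
    public State onEvent(Event e) {
        switch (current) {
            case IDLE:
                if (e == Event.START) current = State.OPERATIONAL;
                break;
            case OPERATIONAL:
                if (e == Event.STOP) current = State.IDLE;
                else if (e == Event.FAULT) current = State.MAINTENANCE;
                break;
            case MAINTENANCE:
                if (e == Event.REPAIRED) current = State.IDLE;
                break;
        }
        return current;
    }

    public static void main(String[] args) {
        ModeFsm fsm = new ModeFsm();
        System.out.println(fsm.onEvent(Event.START));    // OPERATIONAL
        System.out.println(fsm.onEvent(Event.FAULT));    // MAINTENANCE
        System.out.println(fsm.onEvent(Event.REPAIRED)); // IDLE
    }
}
```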
Non-functional requirements
N.1: Database: A database management system that is available free of cost in the public domain should be used.
N.2: Platform: Both Windows and Unix versions of the software need to be developed.
N.3: Web-support: It should be possible to invoke the query-book functionality from any place by using a web browser.
Observation: Since there are many functional requirements, the requirements have been organised into four sections: manage own books, manage friends, manage borrowed books, and manage statistics. Now each section has fewer than seven functional requirements. This would not only enhance the readability of the document, but would also help in design.
A good SRS document should properly characterise the conditions under which different
scenarios of interaction occur (see Section 4.2.5). That is, a high-level function might involve
different steps to be undertaken as a consequence of some decisions made after each step.
Sometimes the conditions can be complex and numerous and several alternative interaction
and processing sequences may exist depending on the outcome of the corresponding
condition checking. A simple text description in such cases can be difficult to comprehend
and analyse. In such situations, a decision tree or a decision table can be used to represent
the logic and the processing involved. Also, when the decision making in a functional
requirement has been represented as a decision table, it becomes easy to automatically, or at least manually, design test cases for it.
There are two main techniques available to analyse and represent complex processing logic
—decision trees and decision tables. Once the decision making logic is captured in the form
of trees or tables, the test cases to validate this logic can be automatically obtained. It
should, however, be noted that decision trees and decision tables have much broader
applicability than just specifying complex processing logic in an SRS document. For instance,
decision trees and decision tables find applications in information theory and switching
theory.
Decision tree
A decision tree gives a graphic view of the processing logic involved in decision making and
the corresponding actions taken. A decision tree specifies which variables are to be tested, what actions are to be taken depending upon the outcome of the decision making logic, and the order in which the decision making is performed. The edges of a decision
tree represent conditions and the leaf nodes represent the actions to be performed
depending on the outcome of testing the conditions. Instead of discussing how to draw a
decision tree for a given processing logic, we shall explain through a simple example how to
represent the processing logic in the form of a decision tree.
Decision table A decision table shows the decision making logic and the corresponding
actions taken in a tabular or a matrix form. The upper rows of the table specify the variables
or conditions to be evaluated and the lower rows specify the actions to be taken when an
evaluation test is satisfied. A column in the table is called a rule. A rule implies that if a
certain condition combination is true, then the corresponding action is executed.
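To make the idea concrete, the sketch below encodes a small decision table in Java. The policy itself is hypothetical (a library fine depending on whether the member is a faculty member and whether the book is overdue); each entry of the CONDITIONS array plays the role of one rule (one column of the table), and ACTIONS gives the action selected when that rule is satisfied.

```java
// A small decision-table sketch. The conditions and the fine amounts are
// hypothetical; only the rule structure of a decision table is being illustrated.
public class FineDecisionTable {

    // Each row is one rule: the condition outcomes {isFaculty, isOverdue}.
    private static final boolean[][] CONDITIONS = {
        {true,  true },   // rule 1
        {true,  false},   // rule 2
        {false, true },   // rule 3
        {false, false}    // rule 4
    };

    // Action (fine amount) taken when the corresponding rule is satisfied.
    private static final int[] ACTIONS = {0, 0, 50, 0};

    public static int fine(boolean isFaculty, boolean isOverdue) {
        for (int rule = 0; rule < CONDITIONS.length; rule++) {
            if (CONDITIONS[rule][0] == isFaculty && CONDITIONS[rule][1] == isOverdue) {
                return ACTIONS[rule];
            }
        }
        throw new IllegalArgumentException("No rule covers this condition combination");
    }

    public static void main(String[] args) {
        System.out.println(fine(false, true));  // 50: non-faculty member, overdue book
        System.out.println(fine(true, true));   // 0
    }
}
```

Because every condition combination appears explicitly, each rule also suggests one test case, which is why decision tables make test design straightforward.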
Even though both decision tables and decision trees can be used to represent complex program logic, they can be distinguished based on the following three considerations:
Readability: Decision trees are easier to read and understand when the number of conditions is small. On the other hand, a decision table forces the analyst to look at every possible combination of conditions, which he might otherwise omit.
Explicit representation of the order of decision making: In contrast to the decision trees,
the order of decision making is abstracted out in decision tables. A situation where decision
tree is more useful is when multilevel decision making is required. Decision trees can more
intuitively represent multilevel decision making hierarchically, whereas decision tables can
only represent a single decision to select the appropriate action for execution.
Representing complex decision logic: Decision trees become very difficult to understand when the number of conditions and actions increases. It may even become impossible to draw the tree on a single page. When a very large number of decisions are involved, the decision table representation may be preferred.
FORMAL TECHNIQUE
The different stages in this system development activity are requirements specification,
functional design, architectural design, detailed design, coding, implementation, etc.
Semantic domains: Formal techniques can have considerably different semantic domains.
Abstract data type specification languages are used to specify algebras, theories, and
programs. Programming languages are used to specify functions from input to output values.
Maximally parallel semantics: In this approach, all the concurrent actions enabled at any
state are assumed to be taken together. This is again not a natural model of concurrency
since it implicitly assumes the availability of all the required computational resources.
Partial order semantics: Under this view, the semantics ascribed to a system is a structure of states satisfying a partial order relation among the states (events). The partial order represents a precedence ordering among events.
● Formal specifications encourage rigour. It is often the case that the very process of
construction of a rigorous specification is more important than the formal
specification itself. The construction of a rigorous specification clarifies several
aspects of system behaviour that are not obvious in an informal specification. It is
widely acknowledged that it is cost-effective to spend more efforts at the
specification stage.
● Formal methods usually have a well-founded mathematical basis. Thus, formal
specifications are not only more precise, but also mathematically sound.
● The mathematical basis of the formal methods makes it possible for automating the
analysis of specifications.
● Formal specifications can be executed to obtain immediate feedback on the
features of the specified system. This concept of executable specifications is related
to rapid prototyping.
UNIT III
SOFTWARE DESIGN
The activities carried out during the design phase (collectively called the design process) transform the SRS document into the design document.
The design process starts using the SRS document and completes with the production of the design
document. The design document produced at the end of the design phase should be implementable
using a programming language in the subsequent (coding) phase
Different modules required: The different modules in the solution should be clearly identified. Each
module is a collection of functions and the data shared by the functions of the module. Each module
should accomplish some well-defined task out of the overall responsibility of the software. Each
module should be named according to the task it performs. For example, in an academic automation
software, the module consisting of the functions and data necessary to accomplish the task of
registration of the students should be named handle student registration.
Control relationships among modules: A control relationship between two modules essentially
arises due to function calls across the two modules. The control relationships existing among various
modules should be identified in the design document.
Interfaces among different modules: The interfaces between two modules identifies the exact data
items that are exchanged between the two modules when one module invokes a function of the
other module.
Data structures of the individual modules: Each module normally stores some data that the
functions of the module need to share to accomplish the overall responsibility of the module.
Suitable data structures for storing and managing the data of a module need to be properly designed
and documented.
Algorithms required to implement the individual modules: Each function in a module usually
performs some processing activity. The algorithms required to accomplish the processing activities
of various modules need to be carefully designed and documented with due consideration given to the accuracy of the results, and the space and time complexities.
A design solution is said to be highly modular, if the different modules in the solution have high
cohesion and their inter-module couplings are low. A software design with high cohesion and low coupling among modules represents an effective problem decomposition.
Based on this classification, we would be able to easily judge the cohesion and coupling existing in a
design solution. From a knowledge of the cohesion and coupling in a design, we can form our own
opinion about the modularity of the design solution.
Layered design
A layered design is one in which when the call relations among different modules are represented
graphically, it would result in a tree-like diagram with clear layering. In a layered design solution, the
modules are arranged in a hierarchy of layers. A module can only invoke functions of the modules in
the layer immediately below it. The higher layer modules can be considered to be similar to
managers that invoke (order) the lower layer modules to get certain tasks done. A layered design
can be considered to be implementing control abstraction, since a module at a lower layer is
unaware of (about how to call) the higher layer modules. A layered design can make the design
solution easily understandable, since to understand the working of a module, one would at best
have to understand how the immediately lower layer modules work without having to worry about
the functioning of the upper layer modules. When a failure is detected while executing a module, it is obvious that the modules below it can possibly be the source of the
error. This greatly simplifies debugging since one would need to concentrate only on a few modules
to detect the error.
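The sketch below shows this layering idea in Java under simplifying assumptions. The layer names (UiLayer, BusinessLogicLayer, DataAccessLayer) and their operations are hypothetical; the point is only that each module invokes functions of the layer immediately below it and knows nothing about the layers above it.

```java
// A layered-design sketch: each class stands for a module in one layer and
// calls only the layer immediately below it. Module names are hypothetical.
class DataAccessLayer {                        // lowest layer
    String fetchRecord(int id) { return "record-" + id; }
}

class BusinessLogicLayer {                     // middle layer: knows only the layer below
    private final DataAccessLayer dao = new DataAccessLayer();
    String process(int id) { return dao.fetchRecord(id).toUpperCase(); }
}

public class UiLayer {                         // top layer: knows only the layer below
    private final BusinessLogicLayer logic = new BusinessLogicLayer();
    void show(int id) { System.out.println(logic.process(id)); }

    public static void main(String[] args) {
        new UiLayer().show(42);                // prints RECORD-42
    }
}
```

If show() produced a wrong result, only BusinessLogicLayer and DataAccessLayer would need to be examined, which is the debugging benefit described above.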
COHESION AND COUPLING
Good module decomposition is indicated through high cohesion of the individual modules and low
coupling of the modules with each other.
Cohesion is a measure of the functional strength of a module, whereas the coupling between two
modules is a measure of the degree of interaction (or interdependence) between the two modules.
When the functions of the module co-operate with each other for performing a single objective,
then the module has good cohesion. If the functions of the module do very different things and do
not co-operate with each other to perform a single piece of work, then the module has very poor
cohesion.
Coupling: Two modules are said to be highly coupled, if either of the following two situations arise:
If the function calls between two modules involve passing large chunks of shared data, the modules
are tightly coupled.
If the interactions occur through some shared data, then also we say that they are highly coupled.
If two modules either do not interact with each other at all or at best interact by passing no data or
only a few primitive data items, they are said to have low coupling.
Functional independence By the term functional independence, we mean that a module performs a
single task and needs very little interaction with other modules. A module that is highly cohesive and
also has low coupling with other modules is said to be functionally independent of the other
modules. Functional independence is a key to any good design primarily due to the following
advantages it offers:
Error isolation: Whenever an error exists in a module, functional independence reduces the chances
of the error propagating to the other modules. The reason behind this is that if a module is
functionally independent, its interaction with other modules is low. Therefore, an error existing in
the module is very unlikely to affect the functioning of other modules.
Scope of reuse: Reuse of a module for the development of other applications becomes easier. The reason for this is as follows.
Functionally independent module performs some well-defined and precise task and the interfaces of
the module with other modules are very few and simple. A functionally independent module can
therefore be easily taken out and reused in a different program. On the other hand, if a module
interacts with several other modules or the functions of a module perform very different tasks, then
it would be difficult to reuse it. This is especially so, if the module accesses the data (or code)
internal to other modules.
Understandability: When modules are functionally independent, complexity of the design is greatly
reduced. This is because of the fact that different modules can be understood in isolation, since the
modules are independent of each other.
Classification of Cohesiveness
Cohesiveness of a module is the degree to which the different functions of the module co-operate to work towards a single objective. The different modules of a design can possess different degrees of cohesiveness. The cohesiveness increases from coincidental to functional cohesion. That is, coincidental is the worst type of cohesion and functional is the best cohesion possible.
Coincidental cohesion: A module is said to have coincidental cohesion, if it performs a set of tasks that relate to each other very loosely. The designs made by novice programmers often possess this category of cohesion, since they often bundle functions into modules rather arbitrarily.
Logical cohesion: A module is said to be logically cohesive, if all elements of the module perform
similar operations, such as error handling, data input, data output, etc. As an example of logical
cohesion, consider a module that contains a set of print functions to generate various types of
output reports such as grade sheets, salary slips, annual reports, etc.
Temporal cohesion: When a module contains functions that are related by the fact that these
functions are executed in the same time span, then the module is said to possess temporal cohesion.
As an example, consider the following situation. When a computer is booted, several functions need
to be performed. These include initialisation of memory and devices, loading the operating system,
etc. When a single module performs all these tasks, then the module can be said to exhibit temporal
cohesion. Other examples of modules having temporal cohesion are the following. Similarly, a
module would exhibit temporal cohesion, if it comprises functions for performing initialisation, or
start-up, or shut-down of some process.
Procedural cohesion: A module is said to possess procedural cohesion, if the set of functions of the
module are executed one after the other, though these functions may work towards entirely
different purposes and operate on very different data. Consider the activities associated with order
processing in a trading house. The functions login(), place-order(), check-order(), printbill(), place-order-on-vendor(), update-inventory(), and logout() all do different things and operate on different data, yet they are executed one after the other, so a module containing them would exhibit procedural cohesion.
Communicational cohesion: A module is said to have communicational cohesion, if all functions of the module refer to or update the same data structure. As an example of communicational cohesion, consider a module named student in which the different functions in the module such as admitStudent, enterMarks, printGradeSheet, etc. access and manipulate data stored in an array named studentRecords defined within the module.
Sequential cohesion: A module is said to possess sequential cohesion, if the different functions of
the module execute in a sequence, and the output from one function is input to the next in the
sequence. As an example, consider the following situation: in an on-line store, after a customer requests some item, it is first determined if the item is in stock.
Functional cohesion: A module is said to possess functional cohesion, if different functions of the
module co-operate to complete a single task. For example, a module containing all the functions
required to manage employees’ pay-roll displays functional cohesion. In this case, all the functions of
the module (e.g., computeOvertime(), computeWorkHours(), computeDeductions(), etc.) work
together to generate the payslips of the employees.
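A minimal sketch of such a functionally cohesive module is given below. The method names follow the pay-roll example above, but the pay formulas are hypothetical placeholders; the point is that every method co-operates towards the single task of producing a payslip.

```java
// A functionally cohesive module sketch: all methods co-operate towards one task,
// producing an employee's payslip. The pay formulas are hypothetical placeholders.
public class Payroll {

    double computeWorkHours(int daysWorked)   { return daysWorked * 8.0; }
    double computeOvertime(double hours)      { return Math.max(0, hours - 160) * 1.5; }
    double computeDeductions(double grossPay) { return grossPay * 0.1; }

    // The single objective of the module: all other methods serve this one task.
    double generatePayslip(int daysWorked, double hourlyRate) {
        double hours    = computeWorkHours(daysWorked);
        double grossPay = (hours + computeOvertime(hours)) * hourlyRate;
        return grossPay - computeDeductions(grossPay);
    }

    public static void main(String[] args) {
        System.out.println(new Payroll().generatePayslip(22, 500.0));
    }
}
```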
Classification of Coupling The coupling between two modules indicates the degree of
interdependence between them. Intuitively, if two modules interchange large amounts of data, then
they are highly interdependent or coupled. The degree of coupling between two modules depends
on their interface complexity. The interface complexity is determined based on the number of
parameters and the complexity of the parameters that are interchanged while one module invokes
the functions of the other module.
Data coupling: Two modules are data coupled, if they communicate using an elementary data item
that is passed as a parameter between the two, e.g. an integer, a float, a character, etc. This data
item should be problem related and not used for control purposes.
Stamp coupling: Two modules are stamp coupled, if they communicate using a composite data item
such as a record in PASCAL or a structure in C.
Control coupling: Control coupling exists between two modules, if data from one module is used to
direct the order of instruction execution in another. An example of control coupling is a flag set in
one module and tested in another module.
Common coupling: Two modules are common coupled, if they share some global data items.
Content coupling: Content coupling exists between two modules, if they share code. That is, a jump
from one module into the code of another module can occur. Modern high-level programming
languages such as C do not support such jumps across modules.
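The short sketches below illustrate the first three kinds of coupling. The module names, parameters, and data items are hypothetical; each method only shows the shape of the interface that gives rise to data, stamp, or control coupling respectively.

```java
// Sketches of three coupling types; the modules and parameters are hypothetical.
class Address { String city; String pin; }      // composite data item used for stamp coupling

public class CouplingDemo {

    // Data coupling: only an elementary, problem-related data item is passed.
    static double computeInterest(double principal) { return principal * 0.07; }

    // Stamp coupling: a composite data item (a whole structure) is passed.
    static String formatAddress(Address a) { return a.city + " - " + a.pin; }

    // Control coupling: a flag from the caller directs which path the callee takes.
    static void printReport(boolean detailed) {
        if (detailed) System.out.println("...full report...");
        else          System.out.println("...summary...");
    }

    public static void main(String[] args) {
        System.out.println(computeInterest(1000.0));
        Address a = new Address(); a.city = "Pune"; a.pin = "411001";
        System.out.println(formatAddress(a));
        printReport(false);
    }
}
```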
STRUCTURED ANALYSIS
During structured analysis, the major processing tasks (high-level functions) of the system are analysed, and the data flow among these processing tasks is represented graphically. The structured analysis technique is based on the following underlying principles:
• Top-down decomposition approach.
• Application of the divide and conquer principle. Through this, each high-level function is independently decomposed into detailed functions.
• Graphical representation of the analysis results using data flow diagrams.
DFD representation of a problem, as we shall see shortly, is very easy to construct. Though
extremely simple, it is a very powerful tool to tackle the complexity of industry standard problems. A
DFD is a hierarchical graphical model of a system that shows the different processing activities or
functions that the system performs and the data interchange among those functions. DFD is an
elegant modelling technique that can be used not only to represent the results of structured analysis
of a software problem, but also useful for several other applications such as showing the flow of
documents or items in an organisation.
Data Flow Diagrams (DFDs)
The DFD (also known as the bubble chart) is a simple graphical formalism that can be used to
represent a system in terms of the input data to the system, various processing carried out on those
data, and the output data generated by the system. The main reason why the DFD technique is so
popular is probably because of the fact that DFD is a very simple formalism— it is simple to
understand and use. A DFD model uses a very limited number of primitive symbols (shown in Figure
6.2) to represent the functions performed by a system and the data flow among these functions.
Starting with a set of high-level functions that a system performs, a DFD model represents the
subfunctions performed by the functions using a hierarchy of diagrams.
Primitive symbols used for constructing DFDs
There are essentially five different types of symbols used for constructing DFDs
Function symbol: A function is represented using a circle. This symbol is called a process or a bubble.
Bubbles are annotated with the names of the corresponding functions.
External entity symbol: An external entity such as a librarian, a library member, etc. is represented
by a rectangle. The external entities are essentially those physical entities external to the software
system which interact with the system by inputting data to the system or by consuming the data
produced by the system. In addition to the human users, the external entity symbols can be used to
represent external hardware and software such as another application software that would interact
with the software being modelled
.
Data flow symbol: A directed arc (or an arrow) is used as a data flow symbol. A data flow symbol
represents the data flow occurring between two processes or between an external entity and a
process in the direction of the data flow arrow. Data flow symbols are usually annotated with the
corresponding data names.
Data store symbol: A data store is represented using two parallel lines. It represents a logical file.
That is, a data store symbol can represent either a data structure or a physical file on disk. Each data
store is connected to a process by means of a data flow symbol. The direction of the data flow arrow
shows whether data is being read from or written into a data store. An arrow flowing in or out of a
data store implicitly represents the entire data of the data store and hence arrows connecting to a data store need not be annotated with the name of the corresponding data items.
Output symbol: The output symbol is used when a hard copy is produced.
Synchronous and asynchronous operations: If two bubbles are directly connected by a data flow arrow, then they are synchronous. This means that they operate at the same speed. For example, the validate-number bubble can start processing only after the read-number bubble has supplied data to it; and the read-number bubble has to wait until the validate-number bubble has consumed its data.
Data dictionary Every DFD model of a system must be accompanied by a data dictionary. A data
dictionary lists all data items that appear in a DFD model. The data items listed include all data flows
and the contents of all data stores appearing on all the DFDs in a DFD model. Please remember that
the DFD model of a system typically consists of several DFDs, viz., level 0 DFD, level 1 DFD, level 2
DFDs, etc. However, a single data dictionary should capture all the data appearing in all the DFDs
constituting the DFD model of a system. A data dictionary lists the purpose of all data items and the definition of all composite data items in terms of their component data items.
For example, a data dictionary entry may represent that the data grossPay consists of the components regularPay and overtimePay:
grossPay = regularPay + overtimePay
DEVELOPING THE DFD MODEL OF A SYSTEM
A DFD model of a system graphically represents how each input data is transformed to its
corresponding output data through a hierarchy of DFDs. The DFD model of a problem consists of many DFDs and a single data dictionary.
The DFD model of a system is constructed by using a hierarchy of DFDs. The top level DFD is called the level 0 DFD or the context diagram. This is the most abstract (simplest) representation of the system (highest level). It is the easiest to draw and understand. At each successively lower level DFD, more and more details are gradually introduced. To develop the lower-level DFDs, processes are decomposed into their subprocesses and the data flow among these subprocesses is identified.
To develop the data flow model of a system, first the most abstract representation (highest level) of
the problem is to be worked out.
Context Diagram The context diagram is the most abstract (highest level) data flow representation
of a system. It represents the entire system as a single bubble. The bubble in the context diagram is
annotated with the name of the software system being developed (usually a noun). This is the only
bubble in a DFD model, where a noun is used for naming the bubble. The bubbles at all other levels
are annotated with verbs according to the main function performed by the bubble. This is expected
since the purpose of the context diagram is to capture the context of the system rather than its
functionality.
Decomposition Each bubble in the DFD represents a function performed by the system. The bubbles
are decomposed into subfunctions at the successive levels of the DFD model. Decomposition of a
bubble is also known as factoring or exploding a bubble. Each bubble at any level of DFD is usually
decomposed to anything three to seven bubbles. A few bubbles at any level m a k e that level
superfluous. For example, if a bubble is decomposed to just one bubble or two bubbles, then this
decomposition becomes trivial and redundant. On the other hand, too many bubbles (i.e. more than
seven bubbles) at any level o f a DFD makes the DFD model hard to understand. Decomposition of a
bubble should be carried on until a level is reached at which the function of the bubble can be
described using a simple algorithm
STRUCTURED DESIGN
The aim of structured design is to transform the results of the structured analysis (that is, the DFD model) into a structure chart. A structure chart represents the software architecture: the various modules making up the system, the module dependency (i.e. which module calls which other modules), and the parameters that are passed among the different modules. The structure chart representation can be easily implemented using some programming language. Since the main focus
in a structure chart representation is on module structure of a software and the interaction among
the different modules, the procedural aspects (e.g. how a particular functionality is achieved) are not
represented.
The basic building blocks using which structure charts are designed are as follows:
Rectangular boxes: A rectangular box represents a module. Usually, every rectangular box is annotated with the name of the module it represents.
Module invocation arrows: An arrow connecting two modules implies that during program execution
control is passed from one module to the other in the direction of the connecting arrow. However,
just by looking at the structure chart, we cannot say whether a module calls another module just
once or many times. Also, just by looking at the structure chart, we cannot tell the order in which the
different modules are invoked.
Data flow arrows: These are small arrows appearing alongside the module invocation arrows. The
data flow arrows are annotated with the corresponding data name. Data flow arrows represent the
fact that the named data passes from one module to the other in the direction of the arrow.
Library modules: A library module is usually represented by a rectangle with double edges. Libraries
comprise the frequently called modules. Usually, when a module is invoked by many other modules,
it is made into a library module.
Selection: The diamond symbol represents the fact that one of the several modules connected with the diamond symbol is invoked depending on the outcome of the condition attached with the diamond symbol.
Repetition: A loop around the control flow arrows denotes that the respective modules are invoked
repeatedly. In any structure chart, there should be one and only one module at the top, called the
root. There should be at most one control relationship between any two modules in the structure
chart. This means that if module A invokes module B, module B cannot invoke module A. The main
reason behind this restriction is that we can consider the different modules of a structure chart to be
arranged in layers or levels. The principle of abstraction does not allow lower-level modules to be
aware of the existence of the high-level modules. However, it is possible for two higher-level modules to invoke the same lower-level module.
Flow chart versus structure chart
We are all familiar with the flow chart representation of a program. A flow chart is a convenient technique to represent the flow of control in a program.
A structure chart differs from a flow chart in three principal ways:
• It is usually difficult to identify the different modules of a program from its flow chart representation.
• Data interchange among different modules is not represented in a flow chart.
• Sequential ordering of tasks that is inherent to a flow chart is suppressed in a structure chart.
DETAILED DESIGN
During detailed design the pseudo code description of the processing and the different data
structures are designed for the different modules of the structure chart. These are usually described
in the form of module specifications (MSPEC). MSPEC is usually written using structured English. The
MSPEC for the non-leaf modules describes the different conditions under which the responsibilities are delegated to the lower-level modules. The MSPEC for the leaf-level modules should describe in
algorithmic form how the primitive processing steps are carried out. To develop the MSPEC of a
module, it is usually necessary to refer to the DFD model and the SRS document to determine the
functionality of the module.
DESIGN REVIEW
After a design is complete, the design is required to be reviewed. The review team usually consists of
members with design, implementation, testing, and maintenance perspectives, who may or may not
be the members of the development team. Normally, members of the team who would code the
design, and test the code, the analysts, and the maintainers attend the review meeting.
The review team checks the design documents especially for the following aspects:
Traceability: Whether each bubble of the DFD can be traced to some module in the structure chart
and vice versa. They check whether each functional requirement in the SRS document can be traced
to some bubble in the DFD model and vice versa.
Correctness: Whether all the algorithms and data structures of the detailed design are correct.
Maintainability: Whether the design can be easily maintained in future.
Implementation: Whether the design can be easily and efficiently implemented.
After the points raised by the reviewers are addressed by the designers, the design document becomes ready for implementation.
UNIT IV
OBJECT MODELLING USING UML
A model is constructed by focusing only on a few aspects of the problem and ignoring the rest. The
model of a problem is called an analysis model. On the other hand, the model of the solution (code)
is called the design model. The design model is usually obtained by carrying out iterative
refinements to the analysis model using a design methodology. Any design is a model of the
solution, whereas any model of the problem is an analysis model.
Modelling language: A modelling language consists of a set of notations using which design and
analysis models are documented.
Design process consists of a step by step procedure (or recipe) using which a problem description
can be converted into a design solution. A design process is, at times, also referred to as a design
methodology. A model can be documented using a modelling language such as unified modelling
language (UML).
BASIC OBJECT-ORIENTATION CONCEPTS
An ‘object’ would mean a single entity. Each object in an object-oriented program usually represents
a tangible real-world entity such as a library member, a book, an issue register, etc.
When the system is analysed, developed, and implemented in terms of objects, it becomes easy to
understand the design and the implementation of the system, since objects provide an excellent
decomposition of a large problem into small parts. Each object essentially consists of some data that
is private to the object and a set of functions (termed as operations or methods ) that operate on
those data. The data of the object can only be accessed by the methods of the object.
This mechanism of hiding data from other objects is popularly known as the principle of data hiding
or data abstraction. Data hiding promotes high cohesion and low coupling among objects, and
therefore is considered to be an important principle that can help one to arrive at a reasonably good
design. Each object stores some data and supports certain operations on the stored data. As an
example, consider the libraryMember object of a library automation application. The private data of
a libraryMember object can be the following:
• name of the member
• membership number
• address
• phone number
• e-mail address
• date when admitted as a member
• membership expiry date
• books outstanding.
The operations supported by a libraryMember object can be the following:
• issue-book
• find-books-outstanding
• find-books-overdue
• return-book
• find-membership-detail
The data stored internally in an object are called its attributes, and the operations supported by an
object are called its methods.
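The sketch below puts the libraryMember example into Java. The attribute types and method bodies are simplifying assumptions (only a few of the attributes and operations listed above are shown); what matters is that the data is private and accessible only through the object's methods.

```java
import java.util.ArrayList;
import java.util.List;

// A sketch of the libraryMember object described above. Attribute types and
// method bodies are simplifying assumptions; the structure illustrates data hiding.
public class LibraryMember {
    // Private data (attributes): accessible only through the methods below.
    private String name;
    private int membershipNumber;
    private List<String> booksOutstanding = new ArrayList<>();

    public LibraryMember(String name, int membershipNumber) {
        this.name = name;
        this.membershipNumber = membershipNumber;
    }

    // Operations (methods) supported by the object.
    public void issueBook(String title)        { booksOutstanding.add(title); }
    public void returnBook(String title)       { booksOutstanding.remove(title); }
    public List<String> findBooksOutstanding() { return new ArrayList<>(booksOutstanding); }
    public String findMembershipDetail()       { return membershipNumber + ": " + name; }

    public static void main(String[] args) {
        LibraryMember m = new LibraryMember("Asha", 101);
        m.issueBook("Software Engineering");
        System.out.println(m.findMembershipDetail() + " has " + m.findBooksOutstanding());
    }
}
```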
CLASS
Similar objects constitute a class. That is, all the objects constituting a class possess similar
attributes and methods. For example, the set of all library members would constitute the class
LibraryMember in a library automation application. In this case, each library member object has
attributes such as member name, membership number, member address, etc. and also has methods
such as issue-book, returnbook, etc. Once we define a class, it can be used as a template for object
creation.
An ADT is a type where the data contained in an instantiated entity is abstracted (hidden) from other entities. Let us now examine whether a class supports the two mechanisms of an ADT.
Abstract data: The data of an object can be accessed only through its methods. In other words, the
exact way data is stored internally (stack, array, queue, etc.) in the object is abstracted out (not
known to the other objects).
Data type: In programming language terminology, a data type defines a collection of data values
and a set of predefined operations on those values. Thus, a data type can be instantiated to create a
variable of that type. We can instantiate a class into objects. Therefore, a class is a data type.
Methods: The operations (such as create, issue, return, etc.) supported by an object are implemented in the form of methods. Notice that we are distinguishing between the terms operation and method, though the two are sometimes used interchangeably.
Method overloading: The implementation of a responsibility of a class through multiple
methods with the same method name is called method overloading.
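A hypothetical sketch of method overloading is shown below: the single responsibility "create a member record" is implemented through several methods that share a name but differ in their parameter lists. The class and method names are assumptions made only for illustration.

```java
// Method overloading sketch: one responsibility implemented through several
// methods with the same name but different parameter lists. Hypothetical example.
public class MemberRegistry {

    String createMember(String name) {
        return "Created member: " + name;
    }

    String createMember(String name, String email) {   // overloaded: extra parameter
        return "Created member: " + name + " <" + email + ">";
    }

    String createMember(int importedId) {               // overloaded: different parameter type
        return "Created member from imported record " + importedId;
    }

    public static void main(String[] args) {
        MemberRegistry r = new MemberRegistry();
        System.out.println(r.createMember("Asha"));
        System.out.println(r.createMember("Asha", "asha@example.com"));
        System.out.println(r.createMember(1024));
    }
}
```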
Class Relationships: Classes in a programming solution can be related to each other in the following four ways:
• Inheritance
• Association and link
• Aggregation and composition
• Dependency
In the following subsections, we discuss these different types of relationships that can exist among classes.
Inheritance
The inheritance feature allows one to define a new class by incrementally extending the features of an existing class. The original class is called the base class (also called the superclass or parent class) and the new class obtained through inheritance is called the derived class (also called a subclass or a child class). The derived class is said to inherit the features of the base class.
As an example of inheritance, the classes Faculty, Students, and Staff can be derived from the base class LibraryMember through an inheritance relationship (in a class diagram this relation is drawn using a special type of arrow). The inheritance relation between library member and faculty can alternatively be expressed as follows—a faculty is a type of library member. So, the inheritance relationship is also at times called the is a relation.
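The sketch below expresses this example in Java under simplifying assumptions: the attributes shown for LibraryMember and the extra feature added in Faculty are hypothetical, chosen only to show how a derived class incrementally extends the base class.

```java
// Inheritance sketch for the example above: Faculty and Student are derived from
// LibraryMember. Attributes are minimal assumptions made only for illustration.
class LibraryMember {
    protected String name;                                // feature common to all members
    LibraryMember(String name) { this.name = name; }
    String membershipDetail() { return "Member: " + name; }
}

class Faculty extends LibraryMember {                     // "a Faculty IS A LibraryMember"
    private String department;                            // feature added incrementally
    Faculty(String name, String department) { super(name); this.department = department; }
    String facultyDetail() { return membershipDetail() + ", Dept: " + department; }
}

class Student extends LibraryMember {
    Student(String name) { super(name); }
}

public class InheritanceDemo {
    public static void main(String[] args) {
        Faculty f = new Faculty("Dr. Rao", "CSE");
        System.out.println(f.facultyDetail());            // inherited feature plus the new one
    }
}
```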
Multiple inheritance Construction of the class relationships for a given problem consists of
identifying and representing four types of relations—inheritance, composition/aggregation,
association, and dependency. However, at times it may so happen that some features of a class are
similar to one class and a few other features of the class are similar to those of another class. In this
case, it would be useful if the class could be allowed to inherit features from both the classes. Using
the multiple inheritance feature, a class can inherit features from multiple base classes.
Multiple inheritance is a mechanism by which a subclass can inherit attributes and methods from
more than one base class.
Association and link
Association is a common type of relation among classes. When two classes are associated, they can take each other's help (i.e. invoke each other's methods) to serve user requests. More technically, we can say that if one class is associated with another bidirectionally, then the corresponding objects of the two classes know each other's ids (identities). As a result, it becomes possible for the object of one class to invoke the methods of the corresponding object of the other class.
n-ary association Binary association between classes is very commonly encountered in design
problems. However, there can be situations where three or more different classes can be involved in
an association.
A class can have an association relationship with itself. This is called recursive association or unary
association. As an example, consider the following—two students may be friends. Here, an
association named friendship exists among pairs of objects of the Student class. In unary association,
two (or more) different objects of the same class are linked by the association relationship.
An association describes a group of similar links. Alternatively, we can say that a link can be
considered as an instance of an association relation.
If two classes are associated, then the association relationship exists at all points of time. In contrast,
links between objects are dynamic in nature. Links between the objects of the associated classes can
get formed and dissolved as the program executes. An association between two classes simply
means that zero or more links may be present among the objects of the associated classes at any
time during execution
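The friendship example above can be sketched in Java as follows. The association is declared once by the reference each Student keeps to other Student objects, while the individual links are created and dissolved at run time. The method names (makeFriend, unfriend) are hypothetical.

```java
import java.util.HashSet;
import java.util.Set;

// Unary (recursive) association sketch based on the friendship example:
// the association is the declared reference; links come and go at run time.
public class Student {
    private final String name;
    private final Set<Student> friends = new HashSet<>();  // association with the same class

    Student(String name) { this.name = name; }

    void makeFriend(Student other) { friends.add(other); other.friends.add(this); }       // link formed
    void unfriend(Student other)   { friends.remove(other); other.friends.remove(this); } // link dissolved
    boolean isFriendOf(Student other) { return friends.contains(other); }

    public static void main(String[] args) {
        Student a = new Student("Asha"), b = new Student("Ravi");
        a.makeFriend(b);
        System.out.println(a.isFriendOf(b));   // true: a link currently exists
        a.unfriend(b);
        System.out.println(a.isFriendOf(b));   // false: the association remains, the link is gone
    }
}
```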
Composition and aggregation :
Composition and aggregation represent part/whole relationships among objects. Objects which
contain other objects are called composite objects. As an example, consider the following—a Book object can have up to ten Chapters.
Aggregation/composition can occur in a hierarchy of levels. That is, an object contained in another
object may itself contain some other object. Composition and aggregation relationships cannot be
reflexive. That is, an object cannot contain an object of the same type as itself.
Dependency A class is said to be dependent on another class, if any changes to the latter class
necessitates a change to be made to the dependent class. A dependency relation between two
classes shows that any change made to the independent class would require the corresponding
change to be made to the dependent class.
Dependencies among classes may arise due to various causes. Two important reasons for
dependency to exist between two classes are the following: A method of a class takes an object of
another class as an argument. A class implements an interface class.
Abstract class Classes that are not intended to produce instances of themselves are called
abstract classes. In other words, an abstract class cannot be instantiated. Abstract classes merely
exist so that behaviour common to a variety of classes can be factored into one common location, where it can be defined once. Definition of an abstract class helps to push reusable code up in the
class hierarchy, thereby enhancing code reuse. By using abstract classes, code reuse can be
enhanced and the effort required to develop software brought down. Abstract classes usually
support generic methods. These methods help to standardise the method names and input and
output parameters in the derived classes. The subclasses of the abstract classes are expected to
provide the concrete implementations for these methods.
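A minimal sketch of this idea is shown below. The class names (Report, GradeSheet, SalarySlip) are hypothetical; the abstract class only standardises the method that every concrete subclass must implement, and factors the reusable print() behaviour into one place.

```java
// Abstract class sketch: the base class cannot be instantiated; it standardises
// the generic method that concrete subclasses provide. Names are hypothetical.
abstract class Report {
    abstract String render();                   // generic method: name and signature fixed here

    // Reusable behaviour factored into one common location.
    void print() { System.out.println(render()); }
}

class GradeSheet extends Report {
    @Override String render() { return "Grade sheet contents..."; }
}

class SalarySlip extends Report {
    @Override String render() { return "Salary slip contents..."; }
}

public class AbstractClassDemo {
    public static void main(String[] args) {
        // Report r = new Report();             // not allowed: abstract classes cannot be instantiated
        Report r = new GradeSheet();
        r.print();
    }
}
```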
Abstraction
The abstraction mechanism allows us to represent a problem in a simpler way by considering only
those aspects that are relevant to some purpose and omitting all other details that are irrelevant.
Abstraction is supported in two different ways in object-oriented designs (OODs). These are the
following:
Feature abstraction: A class hierarchy can be viewed as defining several levels (hierarchy) of
abstraction, where each class is an abstraction of its subclasses. That is, every class is a simplified
(abstract) representation of its derived classes and retains only those features that are common to
all its children classes and ignores the rest of the features. Thus, the inheritance mechanism can be
thought of as providing feature abstraction
Data abstraction: An object itself can be considered as a data abstraction entity, because it
abstracts out the exact way in which it stores its various private data items and it merely provides a
set of methods to other objects to access and manipulate these data items. In other words, we can
say that data abstraction implies that each object hides (abstracts away) from other objects the
exact way in which it stores its internal information. This helps in developing good quality programs,
as it causes objects to have low coupling with each other, since they do not directly access any data
belonging to each other.
Abstraction is a powerful mechanism for reducing the perceived complexity of software designs.
Analysis of the data collected from several software development projects shows that software
productivity is inversely proportional to the perceived software complexity. Therefore, implicit use of
abstraction, as it takes place in object-oriented development, is a promising way of increasing
productivity of the software developers.
Encapsulation: The data of an object is encapsulated within its methods. To access the data internal to an object, other objects have to invoke its methods, and cannot directly access the data. Encapsulation offers the following three important advantages:
Protection from unauthorised data access: The encapsulation feature protects an object’s variables
from being accidentally corrupted by other objects. This protection includes protection from
unauthorised access and also protection from the problems that arise from concurrent access to
data such as deadlock and inconsistent values.
Data hiding: Encapsulation implies that the internal data of an object are hidden, so that
all interactions with the object are simple and standardised. This facilitates reuse of a class across
different projects. Furthermore, if the internal data or the method body of a class are modified,
other classes are not affected. This leads to easier maintenance and bug correction.
Weak coupling: Since objects do not directly change each other's internal data, they are weakly
coupled. Weak coupling among objects enhances understandability of the design since each object
can be studied and understood in isolation from other objects.
Polymorphism: Polymorphism literally means poly (many) morphism (forms). Remember that in Chemistry, diamond, graphite, and coal are called polymorphic forms of carbon. There are two main types of polymorphism in object-orientation:
Static polymorphism: Static polymorphism occurs when multiple methods implement the same operation. In this type of polymorphism, when a method is called (same method name but different parameter types), different behaviour (actions) would be observed. This type of polymorphism is also referred to as static binding, because the exact method to be bound on a method call is determined at compile time (statically).
Dynamic polymorphism: Dynamic polymorphism is also called dynamic binding. In dynamic
binding,the exact method that would be invoked (bound) on a method call can only be known at the
run time (dynamically) and cannot be determined at compile time. That is, the exact behaviour that
would be produced on a method call cannot be predicted at compile time and can only be observed
at run time.
Dynamic binding is based on two important concepts:
• Assignment of an object to another compatible object.
• Method overriding in a class hierarchy.
Assignment of compatible objects: In object-orientation, objects of the derived classes are compatible with the objects of the base class. That is, an object of the derived class can be assigned to an object of the base class, but not vice versa.
Method overriding: The principal advantage of dynamic binding is that it leads to elegant programming and facilitates code reuse and maintenance. Even when the method of an object of the base class is invoked, an appropriate overridden method of a derived class would be invoked, depending on the exact object that may have been assigned at run time to the object of the base class.
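The sketch below shows both concepts together in Java. The class names (Shape, Circle, Square) are hypothetical; derived-class objects are assigned to base-class references, and the overridden method that actually runs is decided only at run time.

```java
// Dynamic binding sketch: derived objects are assigned to base-class references,
// and the overridden method invoked is decided at run time. Names are hypothetical.
class Shape {
    double area() { return 0.0; }                        // method to be overridden
}

class Circle extends Shape {
    private final double r;
    Circle(double r) { this.r = r; }
    @Override double area() { return Math.PI * r * r; }  // overriding method
}

class Square extends Shape {
    private final double side;
    Square(double side) { this.side = side; }
    @Override double area() { return side * side; }
}

public class DynamicBindingDemo {
    public static void main(String[] args) {
        Shape[] shapes = { new Circle(1.0), new Square(2.0) };  // derived objects, base references
        for (Shape s : shapes) {
            System.out.println(s.area());                // bound to Circle.area() or Square.area() at run time
        }
    }
}
```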
Genericity :
Genericity is the ability to parameterise class definitions. For example, while defining a class stack of
different types of elements such as integer stack, character stack, floating-point stack, etc.;
genericity permits us to define a generic class of type stack and later instantiate it either as an
integer stack, a character stack, or a floating-point stack as may be required. This can be achieved by
assigning a suitable value to a parameter used in the generic class definition.
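The stack example above maps directly onto Java generics, as sketched below. The class is written once with a type parameter and is later instantiated as an integer stack, a character stack, or a floating-point stack; the internal use of an ArrayList is a simplifying assumption.

```java
import java.util.ArrayList;
import java.util.List;

// Genericity sketch: one parameterised Stack definition instantiated later for
// different element types, as described above.
public class Stack<T> {
    private final List<T> items = new ArrayList<>();

    public void push(T item) { items.add(item); }
    public T pop()           { return items.remove(items.size() - 1); }
    public boolean isEmpty() { return items.isEmpty(); }

    public static void main(String[] args) {
        Stack<Integer>   intStack   = new Stack<>();   // an integer stack
        Stack<Character> charStack  = new Stack<>();   // a character stack
        Stack<Double>    floatStack = new Stack<>();   // a floating-point stack

        intStack.push(10);
        charStack.push('a');
        floatStack.push(3.14);
        System.out.println(intStack.pop() + " " + charStack.pop() + " " + floatStack.pop());
    }
}
```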
Persistence Objects usually get destroyed once a program completes its execution. Persistent
objects are stored permanently. That is, they live across different executions. An object can be made
persistent by maintaining copies of the object in a secondary storage or in a database.
Agents A passive object is one that performs some action only when requested through invocation
of some of its methods. An agent (also called an active object), on the other hand, monitors events
occurring in the application and takes actions autonomously. Agents are used in applications such as
monitoring exceptions. For example, in a database application such as accounting, an agent may
monitor the balance sheet and would alert the user whenever inconsistencies arise in a balance
sheet due to some improper transaction taking place.
Widget The term widget stands for window object. A widget is a primitive object used for graphical
user interface (GUI) design. More complex graphical user interface design primitives (widgets) can
be derived from the basic widget using the inheritance mechanism. A widget maintains internal data
such as the geometry of the window, background and foreground colors of the window, cursor
shape and size, etc. The methods supported by a widget manipulate the stored data and carry out
operations such as resize window, iconify window, destroy window, etc. Widgets are becoming the
standard components of GUI design. This has given rise to the technique of component-based user
interface development.
Advantages of OOD:
In the last couple of decades since OOD has come into existence, it has found widespread
acceptance in industry as well as in academic circles.
The main reason for the popularity of OOD is that it holds out the following promises:
● Code and design reuse
● Increased productivity
● Ease of testing and maintenance
● Better code and design understandability enabling development of large programs
Out of all the above mentioned advantages, it is usually agreed that the chief advantage of OOD is improved productivity—which comes about due to a variety of factors, such as the following:
● Code reuse by the use of predeveloped class libraries
● Code reuse due to inheritance
● Simpler and more intuitive abstraction, i.e., better management of inherent problem and code complexity
● Better problem decomposition
Disadvantages of OOD
The following are some of the prominent disadvantages inherent to the object paradigm:
1. The principles of abstraction, data hiding, inheritance, etc. do incur run time overhead due
to the additional code that gets generated on account of these features. This causes an object-oriented program to run a little slower than an equivalent procedural program.
2. An important consequence of object-orientation is that the data that is centralised in a
procedural implementation, gets scattered across various objects in an object-oriented
implementation. Therefore, the spatial locality of data becomes weak and this leads to
higher cache miss ratios and consequently to larger memory access times. This finally shows
up as increased program run time.
UNIFIED MODELLING LANGUAGE (UML)
UML is a language for documenting models. As is the case with any other language, UML has its
syntax (a set of basic symbols and sentence formation rules) and semantics (meanings of basic symbols and sentences). It provides a set of basic graphical notations (e.g. rectangles, lines, ellipses, etc.) that can be combined in certain ways to document the design and analysis results. It is important to remember that UML is neither a system design nor a development methodology by itself, nor is it tied to any specific methodology. UML is merely a language for documenting models. Before
the advent of UML, every design methodology not only prescribed entirely different design steps,
but each was tied to some specific design modelling language.
In general, reuse of design solutions across different methodologies was hard. UML was intended to
address this problem that was inherent to the modelling techniques that existed. UML can be used
to document object-oriented analysis and design results that have been obtained using any
methodology. UML was developed to standardise the large number of object-oriented modelling
notations that existed in the early nineties. The principal ones in use those days include the
following: OMT [Rumbaugh 1991] Booch’s methodology [Booch 1991] OOSE [Jacobson 1992] Odell’s
methodology [Odell 1992], Shlaer and Mellor methodology[Shlaer 1992]
MODEL:A model is an abstraction of a real problem (or situation), and is constructed by leaving out
unnecessary details. This reduces the problem complexity and makes it easy to understand the
problem (or situation).
A model is a simplified version of a real system. It is useful to think of a model as capturing aspects
important for some application while omitting (or abstracting out) the rest. As the size of a problem
increases, the perceived complexity increases exponentially due to human cognitive limitations.
Therefore, to develop a good understanding of any problem, it is necessary to construct a model of
the problem. Modelling has turned out to be a very essential tool in software design and helps to
effectively handle the complexity in a problem. These models that are first constructed are the
models of the problem. A design methodology essentially transforms these analysis models into a design model through iterative refinements.
A model in the context of software development can be graphical, textual, mathematical, or
program code-based. Graphical models are very popular because they are easy to understand and
construct. UML is primarily a graphical modelling tool.
Why construct a model? An important reason behind constructing a model is that it helps to
manage the complexity in a problem and facilitates arriving at good solutions and at the same time
helps to reduce the design costs. The initial model of a problem is called an analysis model. The
analysis model of a problem can be refined into a design model using a design methodology. Once
models of a system have been constructed, these can be used for a variety of purposes during
software development, including the following:
● Analysis
● Specification
● Design
● Coding
● Visualisation and understanding of an implementation.
● Testing, etc.
Since a model can be used for a variety of purposes, it is reasonable to expect that the models would
vary in detail depending on the purpose for which these are being constructed. For example, a
model developed for initial analysis and specification should be very different from the one used for
design. A model that is constructed for analysis and specification would not show any of the design
decisions that would be made later on during the design stage. On the other hand, a model
constructed for design purposes should capture all the design decisions. Therefore, it is a good idea
to explicitly mention the purpose for which a model has been developed.
UML DIAGRAMS
If a single model is made to capture all the required perspectives, then it would be as complex as the
original problem, and would be of little use.
Once a system has been modelled from all the required perspectives, the constructed models can be
refined to get the actual implementation of the system. UML diagrams can capture the following
views (models) of a system:
● User’s view
● Structural view
● Behavioural view
● Implementation view
● Environmental view
USER’S VIEW
This view defines the functionalities made available by the system to its users. The users’ view
captures the view of the system in terms of the functionalities offered by the system to its users. The
users’ view is a black-box view of the system where the internal structure, the dynamic behaviour of
different system components, the implementation etc. are not captured. The users’ view is very
different from all other views in the sense that it is a functional model compared to all other views
that are essentially object models. The users’ view can be considered as the central view and all
other views are required to conform to this view. This thinking is in fact the crux of any user-centric
development style.
Structural view
The structural view defines the structure of the problem (or the solution) in terms of the kinds of
objects (classes) important to the understanding of the working of a system and to its
implementation. It also captures the relationships among the classes (objects). The structural model
is also called the static model, since the structure of a system does not change with time.
Behavioural view
The behavioural view captures how objects interact with each other in time to realise the system
behaviour. The system behaviour captures the time-dependent (dynamic) behaviour of the system.
It therefore constitutes the dynamic model of the system.
Implementation view
This view captures the important components of the system and their interdependencies. For
example, the implementation view might show the GUI part, the middleware, and the database part
as the different parts and also would capture their interdependencies.
Environmental view
This view models how the different components are implemented on different pieces of hardware.
USE CASE MODEL :
The use case model for any system consists of a set of use cases. Intuitively, the use cases represent
the different ways in which a system can be used by the users. A simple way to find all the use cases
of a system is to ask the question —“What all can the different categories of users do by using the
system?” Thus, for the library information system (LIS), the use cases could be: • issue-book • query-
book • return-book • create-member • add-book, etc
The purpose of a use case is to define a piece of coherent behaviour without revealing the internal
structure of the system. The use cases do not mention any specific algorithm to be used, nor the
internal data representation or internal structure of the software.
A use case typically involves a sequence of interactions between the user and the system. Even for
the same use case, there can be several different sequences of interactions. A use case consists of
one main line sequence and several alternate sequences. The main line sequence represents the
interactions between a user and the system that normally take place. The mainline sequence is the
most frequently occurring sequence of interaction. For example, the mainline sequence of the
withdraw-cash use case supported by a bank ATM would be: the user inserts the ATM card, enters
the password, selects the amount-withdraw option, enters the amount to be withdrawn, completes
the transaction, and collects the amount. Several variations to the mainline sequence (called
alternate sequences) may also exist.
Typically, a variation from the mainline sequence occurs when some specific conditions hold. For the
bank ATM example, consider the following variations or alternate sequences:
• Password is invalid.
• The amount to be withdrawn exceeds the account balance. The mainline sequence and each of the
alternate sequences corresponding to the invocation of a use case is called a scenario of the use
case.
A use case can be viewed as a set of related scenarios tied together by a common goal. The main
line sequence and each of the variations are called scenarios or instances of the use case. Each
scenario is a single path of user events and system activity.
Normally, each use case is independent of the other use cases. However, implicit dependencies
among use cases may exist because of dependencies that may exist among use cases at the
implementation level due to factors such as shared resources, objects, or functions. For example, in
the Library Automation System example, renew-book and reserve-book are two independent use
cases. But, in the actual implementation of renew-book, a check is to be made to see if any book has
been reserved by a previous execution of the reserve-book use case. Another example of
dependence among use cases is the following. In the Bookshop Automation Software,
update-inventory and sale-book are two independent use cases. But, during execution of sale-book
there is an implicit dependency on update-inventory: when sufficient quantity is unavailable
in the inventory, sale-book cannot operate until the inventory is replenished using update-inventory.
REPRESENTATION OF USE CASES:
A use case model can be documented by drawing a use case diagram and writing an accompanying
text elaborating the drawing. In the use case diagram, each use case is represented by an ellipse
with the name of the use case written inside the ellipse. All the ellipses (i.e. use cases) of a system
are enclosed within a rectangle which represents the system boundary. The name of the system
being modeled (e.g., library information system ) appears inside the rectangle. The different users of
the system are represented by using stick person icons. Each stick person icon is referred to as an
actor. An actor is a role played by a user with respect to the system use. It is possible that the same
user may play the role of multiple actors. An actor can participate in one or more use cases. The line
connecting an actor and the use case is called the communication relationship. It indicates that an
actor makes use of the functionality provided by the use case. Both human users and external
systems can be represented by stick person icons. When a stick person icon represents an external
system, it is annotated by the stereotype <<external system>>. At this point, it is necessary to explain
the concept of a stereotype in UML. One of the main objectives of the creators of UML was to restrict
the number of primitive symbols in the language. It was clear to them that when a language has a
large number of primitive symbols, it becomes very difficult to learn and use.
EXAMPLE: The use case diagram of the Supermarket Prize Scheme
Text description
U1: register-customer: Using this use case, the customer can register himself by providing the
necessary details.
Scenario 1: Mainline sequence
1. Customer: selects the register-customer option.
2. System: displays a prompt to enter the name, address, and telephone number.
3. Customer: enters the necessary values.
4. System: displays the generated id and the message that the customer has been successfully registered.
Scenario 2: At step 4 of the mainline sequence
4. System: displays the message that the customer has already registered.
Scenario 3: At step 4 of the mainline sequence
4. System: displays the message that some input information has not been entered. The system
displays a prompt to enter the missing values.
U2: register-sales: Using this use case, the clerk can register the details of the purchase made by a
customer.
Scenario 1: Mainline sequence
1. Clerk: selects the register sales option.
2. System: displays prompt to enter the purchase details and the id of the customer.
3. Clerk: enters the required details.
4. System: displays a message of having successfully registered the sale.
U3: select-winners: Using this use case, the manager can generate the winner list.
Scenario 1: Mainline sequence
1. Manager: selects the select-winner option.
2. System: displays the gold coin and the surprise gift winner list.
Why Develop the Use Case Diagram?
If you examine a use case diagram, the utility of the use cases represented by the ellipses would
become obvious. They along with the accompanying text description serve as a type of requirements
specification of the system and the model based on which all other models are developed. In other
words, the use case model forms the core model to which all other models must conform.
One possible use of identifying the different types of users (actors) is in implementing a security
mechanism through a login system, so that each actor can invoke only those functionalities to which
he is entitled to. Another important use is in designing the user interface in the implementation of
the use case targeted for each specific category of users who would use the use case. Another
possible use is in preparing the documentation (e.g. users’ manual) targeted at each category of
user. Further, actors help in identifying the use cases and understanding the exact functioning of the
system.
How to Identify the Use Cases of a System?
Identification of the use cases involves brainstorming and reviewing the SRS document. Typically,
the high-level requirements specified in the SRS document correspond to the use cases. In the
absence of a well-formulated SRS document, a popular method of identifying the use cases is actor-
based. This involves first identifying the different types of actors and their usage of the system.
Subsequently, for each actor the different functions that they might initiate or participate are
identified. For example, for a Library Automation System, the categories of users can be members,
librarian, and the accountant. Each user typically focuses on a set of functionalities.
For example, the member typically concerns himself with book issue, return, and renewal aspects.
The librarian concerns himself with creation and deletion of the member and book records. The
accountant concerns himself with the amount collected from membership fees and the expenses
aspects.
Generalisation:
Use case generalisation can be used when you have one use case that is similar to another, but does
something slightly differently or something more. Generalisation works the same way with use cases
as it does with classes. The child use case inherits the behaviour and meaning of the parent use
case. The notation is the same too. It is important to remember that the base and the derived use
cases are separate use cases and should have separate text descriptions.
Includes The includes relationship in the older versions of UML (prior to UML 1.1) was known as the
uses relationship. The includes relationship implies one use case includes the behaviour of another
use case in its sequence of events and actions. The includes relationship is appropriate when you
have a chunk of behaviour that is similar across a number of use cases.
The factoring of such behaviour will help in not repeating the specification and implementation
across different use cases. Thus, the includes relationship explores the issue of reuse by factoring out
the commonality across use cases. The includes relationship is represented using the predefined
stereotype <<include>>. In the includes relationship, a base use case compulsorily and automatically
includes the behaviour of the common use case. For example, the use cases issue-book and
renew-book both include the check-reservation use case. A base use case may include several use
cases; in such cases, it may interleave their associated common use cases together. The common use
case becomes a separate use case, and an independent text description should be provided for it.
Extends: The main idea behind the extends relationship among use cases is that it allows you to show
optional system behaviour. An optional system behaviour is executed only if certain conditions hold;
otherwise the optional behaviour is not executed. The extends relationship is similar to
generalisation, but unlike generalisation, the extending use case can add additional behaviour only
at an extension point, and only when certain conditions are satisfied. The extension points are points
within the use case where variation to the mainline (normal) action sequence may occur. The
extends relationship is normally used to capture alternate paths or scenarios.
Organisation: When the use cases are factored, they are organised hierarchically. The high-level use
cases are refined into a set of smaller and more refined use cases. Top-level use cases are super-
ordinate to the refined use cases. The refined use cases are sub-ordinate to the top-level use cases.
Note that only the complex use cases should be decomposed and organised in a hierarchy. It is not
necessary to decompose the simple use case.
PACKAGING:
Packaging is the mechanism provided by UML to handle complexity. When we have too many use
cases in the top-level diagram, we can package the related use cases so that at most 6 or 7 packages
are present at the top level diagram. Any modeling element that becomes large and complex can be
broken up into packages. Please note that you can put any element of UML (including another
package) in a package diagram. The symbol for a package is a folder. Just as you organise a large
collection of documents in a folder, you organise UML elements into packages.
CLASS DIAGRAMS
A class diagram describes the static structure of a system. It shows how a system is structured rather
than how it behaves. The static structure of a system comprises a number of class diagrams and their
dependencies. The main constituents of a class diagram are classes and their relationships—
generalisation, aggregation, association, and various kinds of dependencies.
Classes
The classes represent entities with common features, i.e., attributes and operations. Classes are
represented as solid outline rectangles with compartments. Classes have a mandatory name
compartment where the name is written centered in boldface. The class name is usually written
using the mixed case convention and begins with an uppercase letter (e.g. LibraryMember). Object
names, on the other hand, are also written using the mixed case convention, but start with a
lowercase letter (e.g., studentMember). Class names are usually chosen to be singular nouns.
Classes have optional attributes and operations compartments. A class may appear on several
diagrams. Its attributes and operations are suppressed on all but one diagram.
Attributes An attribute is a named property of a class. It represents the kind of data that an object
might contain. Attributes are listed with their names, and may optionally contain specification of
their type (that is, their class, e.g., Int, Book, Employee, etc.), an initial value, and constraints.
Attribute names are written left-justified using plain type letters, and the names should begin with a
lower case letter. Attribute names may be followed by square brackets containing a multiplicity
expression, e.g. sensorStatus.
Operation: The operation names are typically left justified, in plain type, and always begin with a
lower case letter. Abstract operations are written in italics. (Remember that abstract operations are
those for which the implementation is not provided during the class definition.) The parameters of a
function may have a kind specified. The kind may be “in” indicating that the parameter is passed into
the operation; or “out” indicating that the parameter is only returned from the operation; or “inout”
indicating that the parameter is used for passing data into the operation and getting result from the
operation.
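As a rough illustration of how a UML class maps to code, the following sketch shows a hypothetical LibraryMember class written in Java; the attribute and operation names are assumed purely for illustration and are not prescribed by UML.

    // Hypothetical Java class corresponding to a UML class named LibraryMember.
    // The class name uses mixed case beginning with an upper-case letter, while
    // attribute and operation names begin with a lower-case letter.
    public class LibraryMember {
        // Attributes, with their types and an initial value
        private String memberName;
        private int booksIssued = 0;

        // An operation; the parameter acts as an "in" parameter, since it only
        // passes data into the operation.
        public void issueBook(String isbn) {
            booksIssued = booksIssued + 1;
        }
    }

An abstract operation would simply be declared without a body (and the class marked abstract), which corresponds to writing the operation name in italics in the class diagram.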
Association
Association between two classes is represented by drawing a straight line between the concerned
classes. The name of the association is written alongside the association line. An arrowhead may be
placed on the association line to indicate the reading direction of the association. The arrowhead
should not be misunderstood to be indicating the direction of a pointer implementing an
association. On each side of the association relation, the multiplicity is noted as an individual
number or as a value range. The multiplicity indicates how many instances of one class are
associated with the other. Value ranges of multiplicity are noted by specifying the minimum and
maximum value, separated by two dots, An asterisk is used as a wild card and means many (zero or
more). Associations are usually realised by assigning appropriate reference attributes to the classes
involved. Thus, associations can be implemented using pointers from one object class to another.
Links and associations can also be implemented by using a separate class that stores which objects
of a class are linked to which objects of another class. Some CASE tools use the role names of the
association relation for the corresponding automatically generated attribute.
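As a minimal sketch of this idea, assume a borrows association between LibraryMember and Book, with multiplicity 1 on the member side and * on the book side; the reference attribute names below (issuedTo, borrowedBooks) are invented for illustration.

    import java.util.ArrayList;
    import java.util.List;

    class Book {
        // Reference attribute realising the association from the Book side (multiplicity 1)
        LibraryMember issuedTo;
    }

    class LibraryMember {
        // Reference attribute realising the association from the member side (multiplicity *)
        List<Book> borrowedBooks = new ArrayList<>();
    }

A separate association class recording which member is linked to which books would be the alternative implementation mentioned above.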
Aggregation
Aggregation is a special type of association relation where the involved classes are not only
associated to each other, but a whole-part relationship exists between them. That is, the aggregate
object not only knows the addresses of its parts and can therefore invoke the methods of its parts,
but also takes the responsibility of creating and destroying its parts. As an example of aggregation, a
book register is an aggregation of book objects. Books can be added to the register and deleted as
and when required. Aggregation is represented by an empty diamond symbol at the aggregate end of a
relationship
Composition
Composition is a stricter form of aggregation, in which the parts are existence-dependent on the
whole. This means that the parts cannot exist outside the whole. In other words, the
lifetimes of the whole and the parts are identical. When the whole is created, the parts are created, and
when the whole is destroyed, the parts are destroyed.
Aggregation versus Composition: Both aggregation and composition represent part/whole
relationships. When the components can dynamically be added and removed from the aggregate,
then the relationship is aggregation. If the components cannot be dynamically added/deleted, then
the components have the same lifetime as the composite, and the relationship is composition.
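The difference can be seen in a small Java sketch; the BookRegister/Book and Order/OrderLine classes below are assumed purely to illustrate the two relationships.

    import java.util.ArrayList;
    import java.util.List;

    class Book { }

    // Aggregation: the register knows its parts, but Book objects can be added
    // and removed dynamically and can outlive the register.
    class BookRegister {
        private List<Book> books = new ArrayList<>();
        void add(Book b) { books.add(b); }
        void remove(Book b) { books.remove(b); }
    }

    class OrderLine { }

    // Composition: the parts are created when the whole is created and become
    // unreachable when the whole is destroyed, so their lifetimes are identical.
    class Order {
        private List<OrderLine> lines = new ArrayList<>();
        Order(int noOfLines) {
            for (int i = 0; i < noOfLines; i++) {
                lines.add(new OrderLine());
            }
        }
    }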
Inheritance
The inheritance relationship is represented by means of an empty arrow pointing from the subclass
to the superclass. The arrow may be directly drawn from the subclass to the superclass.
Alternatively, when there are many subclasses of a base class, the inheritance arrow from the
subclasses may be combined to a single line and is labelled with the aspect of the class that is
abstracted.
Dependency A dependency relationship is shown as a dotted arrow that is drawn from the
dependent class to the independent class.
Constraints
A constraint describes a condition or an integrity rule. Constraints are typically used to describe the
permissible set of values of an attribute, to specify the pre- and post-conditions for operations, to
define certain ordering of items, etc.
Object diagrams
An object diagram shows a snapshot of the objects in a system at a point in time. Since it shows
instances of classes, rather than the classes themselves, it is often called an instance diagram. The
objects are drawn using rounded rectangles. An object diagram may undergo continuous change as
execution proceeds.
INTERACTION DIAGRAMS
When a user invokes one of the functions supported by a system, the required behaviour is realised
through the interaction of several objects in the system. Interaction diagrams, as their name itself
implies, are models that describe how groups of objects interact among themselves through
message passing to realise some behaviour.
Typically, each interaction diagram realises the behaviour of a single use case. Sometimes, especially
for complex use cases, more than one interaction diagram may be necessary to capture the
behaviour. An interaction diagram shows a number of example objects and the messages that are
passed between the objects within the use case.
There are two kinds of interaction diagrams—
sequence diagrams and collaboration diagrams. These two diagrams are equivalent in the sense that
any one diagram can be derived automatically from the other.
Sequence diagram A sequence diagram shows interaction among objects as a two dimensional
chart. The chart is read from top to bottom. The objects participating in the interaction are shown at
the top of the chart as boxes attached to a vertical dashed line. Inside the box the name of the
object is written with a colon separating it from the name of the class and both the name of the
object and the class are underlined. This signifies that we are referring to any arbitrary instance of the
class; for example, :Book represents any arbitrary instance of the Book class.
Each message is labelled with the message name. Some control information can also be included.
Two important types of control information are:
● A condition (e.g., [invalid]) indicates that a message is sent only if the condition is true.
● An iteration marker shows that the message is sent many times to multiple receiver objects, as
would happen when you are iterating over a collection or the elements of an array. You can also
indicate the basis of the iteration, e.g., [for every book object].
Consider the sequence diagram for the book renewal use case of the Library Automation Software.
Observe that the exact objects which participate to realise the renew-book behaviour, and the order
in which they interact, can be clearly inferred from the sequence diagram.
Collaboration diagram
A collaboration diagram shows both structural and behavioural aspects explicitly. This is unlike a
sequence diagram which shows only the behavioural aspects. The structural aspect of a
collaboration diagram consists of objects and links among them indicating association. In this
diagram, each object is also called a collaborator. The behavioural aspect is described by the set of
messages exchanged among the different collaborators. The link between objects is shown as a solid
line and can be used to send messages between two objects. The message is shown as a labelled
arrow placed near the link. Messages are prefixed with sequence numbers because this is the only
way to describe the relative sequencing of the messages in this diagram.
ACTIVITY DIAGRAM
The activity diagram is possibly one modelling element which was not present in any of the
predecessors of UML. No such diagrams were present either in the works of Booch, Jacobson, or
Rumbaugh. It has possibly been based on the event diagram of Odell [1992] though the notation is
very different from that used by Odell. The activity diagram focuses on representing various activities
or chunks of processing and their sequence of activation.
Activity diagrams can be very useful to understand complex processing activities involving the roles
played by many components. Besides helping the developer to understand the complex processing
activities, these diagrams can also be used to develop interaction diagrams which help to allocate
activities (responsibilities) to classes.
STATE CHART DIAGRAM
A state chart diagram is normally used to model how the state of an object changes in its life time.
State chart diagrams are good at describing how the behaviour of an object changes across several
use case executions. However, if we are interested in modelling some behaviour that involves
several objects collaborating with each other, state chart diagram is not appropriate. We have
already seen that such behaviour is better modelled using sequence or collaboration diagrams. State
chart diagrams are based on the finite state machine (FSM) formalism. An FSM consists of a finite
number of states corresponding to those of the object being modelled. The object undergoes state
changes when specific events occur.
A major disadvantage of the FSM formalism is the state explosion problem. The number of states
becomes too many and the model too complex when used to model practical systems. This problem
is overcome in UML by using state charts.
The basic elements of the state chart diagram are as follows:
● Initial state: This is represented as a filled circle.
● Final state: This is represented by a filled circle inside a larger circle.
● State: These are represented by rectangles with rounded corners.
● Transition: A transition is shown as an arrow between two states. Normally, the name of the event
which causes the transition is placed alongside the arrow. You can also assign a guard to the
transition. A guard is a Boolean logic condition, and the transition can take place only if the guard
evaluates to true. The label of a transition has three parts: event [guard] / action.
For example, from the Rejected order state there is an automatic and implicit transition to the end
state. Such transitions are called pseudo transitions.
POSTSCRIPT
UML has gained rapid acceptance among practitioners and academicians over a short time and has
proved its utility in arriving at good design solutions to software development problems.
Package diagram
A package is a grouping of several classes. In fact, a package diagram can be used to group any UML
artifacts. Packages are a popular way of organising source code files. Java packages are a good
example which can be modelled using a package diagram. Such package diagrams show the different
class groups (packages) and their interdependencies. These are very useful to document the
organisation of source files for large projects that have a large number of program files.
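For instance, the classes of a hypothetical library application might be grouped into ui, domain, and persistence packages; the package and class names below are assumed for illustration, and a package diagram would show these three packages and the dependencies among them.

    // File: library/ui/IssueBookScreen.java
    package library.ui;
    public class IssueBookScreen { }

    // File: library/domain/LibraryMember.java
    package library.domain;
    public class LibraryMember { }

    // File: library/persistence/MemberRepository.java
    package library.persistence;
    public class MemberRepository { }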
Component diagram
A component represents a piece of software that can be independently purchased, upgraded, and
integrated into an existing software. A component diagram can be used to represent the physical
structure of an implementation in terms of the various components of the system. A component
diagram is typically used to achieve the following purposes:
• Organise source code to be able to construct executable releases.
• Specify dependencies among different components. A package diagram can be used to provide a
high-level view of each component in terms of the different classes it contains.
Deployment diagram
The deployment diagram shows the environmental view of a system. That is, it captures the
environment in which the software solution is implemented. In other words, a deployment diagram
shows how a software system will be physically deployed in the hardware environment. That is,
which component will execute on which hardware component and how they will communicate
with each other. Since the diagram models the run time architecture of an application, this diagram
can be very useful to the system’s operation staff. The environmental view provided by the
deployment diagram is important for complex and large software solutions that run on hardware
systems comprising multiple components.
Certain changes were required to support interoperability among UML-based CASE tools using XML
metadata interchange (XMI). UML 2.0 defines thirteen types of diagrams, divided into three
categories as follows:
Structure diagrams: These include the class diagram, object diagram, component diagram,
composite structure diagram, package diagram, and deployment diagram.
Behaviour diagrams: These diagrams include the use case diagram, activity diagram, and state
machine diagrams
Interaction diagrams: These diagrams include the sequence diagram, communication diagram,
timing diagram, and interaction overview diagram. The collaboration diagram of UML 1.X has been
renamed in UML 2.0 as the communication diagram. This renaming was necessary as the earlier name
was somewhat misleading: the diagram shows the communications among the classes during the
execution of a use case rather than showing collaborative problem solving.
Though a large number of new features have been introduced in UML 2.0 as compared to 1.X, in the
following subsections, we discuss only two of the enhancements in UML 2.0: combined
fragments and the composite structure diagram.
Fragment: A fragment in a sequence diagram is represented by a box, and encloses a portion of the
interaction within a sequence diagram. Each fragment is also known as an interaction operand. An
interaction operand may contain an optional guard condition, which is also called an interaction
constraint. The behaviour specified in an interaction operand is executed only if its guard condition
evaluates to true.
Operator: A combined fragment is associated with one operator called interaction operator that is
shown at the top left corner of the fragment. The operator indicates the type of fragment. The type
of logic operator along with the guards in the fragment defines the behaviour of the combined
fragment. A combined fragment can also contain nested combined fragments or interaction uses
containing additional conditional structures that represent more complex structures that affect the
flow of messages. Some of the important operators of a combined fragment are the following: alt:
This operator indicates that among multiple fragments, only the one whose guard is true will
execute.
opt: An optional fragment that will execute only if the guard is true.
par: This operator indicates that various fragments can execute at the same time.
loop: A loop operator indicates that the various fragments may execute multiple times and the
guard indicates the basis of iteration, meaning that the execution would continue until the guard
turns false.
region: It defines a critical region in which only one thread can execute.
Composite structure diagram The composite structure diagram lets you define how a class is
defined by a further structure of classes and the communication paths between these parts. Some
new core constructs such as parts, ports and connectors are introduced.
Part: The concept of parts makes possible the description of the internal structure of a class.
Port: The concept of a port makes it possible to describe connection points formally. These are
addressable, which means that signals can be sent to them.
Connector: Connectors can be used to specify the communication links between two or more parts.
UNIT V
Good software development organisations require their programmers to adhere to some well-
defined and standard style of coding which is called their coding standard. These software
development organisations formulate their own coding standards that suit them the most, and
require their developers to follow the standards rigorously because of the significant business
advantages it offers. The main advantages of adhering to a standard style of coding are the
following:
A coding standard gives a uniform appearance to the codes written by different engineers.
It facilitates code understanding and code reuse.
It promotes good programming practices.
A coding standard lists several rules to be followed during coding, such as the way variables are to be
named, the way the code is to be laid out, the error return conventions, etc.
Besides the coding standards, several coding guidelines are also prescribed by software companies.
After a module has been coded, usually code review is carried out to ensure that the coding
standards are followed and also to detect as many errors as possible before testing. It is important
to detect as many errors as possible during code reviews, because reviews are an efficient way of
removing errors from code as compared to defect elimination using testing.
Coding Standards and Guidelines Good software development organisations usually develop their
own coding standards and guidelines depending on what suits their organisation best and based on
the specific types of software they develop. To give an idea about the types of coding standards that
are being used, we shall only list some general coding standards and guidelines that are commonly
adopted by many software development organisations, rather than trying to provide an exhaustive
list.
Representative coding standards Rules for limiting the use of globals: These rules list what types of
data can be declared global and what cannot, with a view to limit the data that needs to be defined
with global scope. Standard headers for different modules: The header of different modules should
have a standard format and information for ease of understanding and maintenance.
The following is an example of header format that is being used in some companies:
● Name of the module.
● Date on which the module was created.
● Author’s name.
● Modification history.
● Synopsis of the module. This is a small writeup about what the module does.
● Different functions supported in the module, along with their input/output parameters.
● Global variables accessed/modified by the module.
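A hypothetical module header following the above format might look like the following; every detail in it (names, dates, functions) is invented for illustration.

    /*
     * Module name          : memberManager
     * Creation date        : 12 March 2015
     * Author               : A. Programmer
     * Modification history : 20 April 2015 - added membership renewal support
     * Synopsis             : Maintains the member records of the library.
     * Functions            : addMember(name) returns memberId
     *                        deleteMember(memberId) returns status
     * Global variables     : MemberCount (modified)
     */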
Naming conventions for global variables, local variables, and constant identifiers: A popular
naming convention is that variables are named using mixed case lettering. Global variable names
would always start with a capital letter (e.g., GlobalData) and local variable names start with small
letters (e.g., localData). Constant names should be formed using capital letters only
(e.g., CONSDATA).
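Under such a convention, declarations in a Java module might look like the following minimal sketch; the variable names themselves are made up.

    class NamingConventionExample {
        static int GlobalData;               // global variable: starts with a capital letter
        static final int CONSDATA = 100;     // constant identifier: capital letters only

        void computeTotal() {
            int localData = GlobalData + CONSDATA;   // local variable: starts with a small letter
        }
    }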
Conventions regarding error return values and exception handling mechanisms: The way error
conditions are reported by different functions in a program should be standard within an
organisation. For example, all functions while encountering an error condition should either return a
0 or 1 consistently, independent of which programmer has written the code. This facilitates reuse
and debugging.
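For example, if an organisation standardises on returning 0 on success and 1 on error from every function, the convention might be applied as in the sketch below (the class and function names are assumed).

    class AccountService {
        // Organisation-wide convention: every function returns 0 on success and 1 on error.
        int openAccount(String name) {
            if (name == null || name.isEmpty()) {
                return 1;    // error: invalid input
            }
            // create the account record here
            return 0;        // success
        }
    }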
Representative coding guidelines: The following are some representative coding guidelines that are
recommended by many software development organisations. Wherever necessary, the rationale
behind these guidelines is also mentioned.
Do not use a coding style that is too clever or too difficult to understand: Code should be easy to
understand. Many inexperienced engineers actually take pride in writing cryptic and
incomprehensible code. Clever coding can obscure the meaning of the code and reduce code
understandability, thereby making maintenance and debugging difficult and expensive.
Avoid obscure side effects: The side effects of a function call include modifications to the
parameters passed by reference, modification of global variables, and I/O operations. An obscure
side effect is one that is not obvious from a casual examination of the code. Obscure side effects
make it difficult to understand a piece of code. For example, suppose the value of a global variable is
changed or some file I/O is performed obscurely in a called module, such that it is difficult to infer
from the function’s name and header information. Then, it would be really hard to understand the
code.
Do not use an identifier for multiple purposes: Programmers often use the same identifier to
denote several temporary entities. For example, some programmers make use of a temporary loop
variable for also computing and storing the final result. The rationale that they give for such multiple
use of variables is memory efficiency, e.g., three variables use up three memory locations, whereas
when the same variable is used for three different purposes, only one memory location is used.
Some of the problems caused by the use of a variable for multiple purposes are as follows:
● Each variable should be given a descriptive name indicating its purpose. This is not possible if an
identifier is used for multiple purposes. Use of a variable for multiple purposes can lead to confusion
and make it difficult for somebody trying to read and understand the code.
● Use of variables for multiple purposes usually makes future enhancements more difficult. For
example, while changing the final computed result from integer to float type, the programmer might
subsequently notice that the variable has also been used as a temporary loop variable that cannot be
a float type.
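The following sketch contrasts the two styles; the computation is invented purely to illustrate the guideline.

    class IdentifierReuseExample {
        int discouraged(int[] values, int unitPrice) {
            // Discouraged: i serves both as the loop counter and as the final result,
            // so neither use can be given a descriptive name.
            int i;
            for (i = 0; i < values.length; i++) {
                // some per-item processing
            }
            i = i * unitPrice;     // i silently changes its meaning to "total price"
            return i;
        }

        int preferred(int[] values, int unitPrice) {
            // Preferred: one descriptive identifier per purpose.
            int itemCount = values.length;
            int totalPrice = itemCount * unitPrice;
            return totalPrice;
        }
    }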
Code should be well-documented: As a rule of thumb, there should be at least one comment line on
the average for every three source lines of code.
Length of any function should not exceed 10 source lines: A lengthy function is usually very difficult
to understand as it probably has a large number of variables and carries out many different types of
computations. For the same reason, lengthy functions are likely to have disproportionately larger
number of bugs.
Do not use GO TO statements: Use of GO TO statements makes a program unstructured. This makes
the program very difficult to understand, debug, and maintain.
CODE REVIEW
Testing is an effective defect removal mechanism. However, testing is applicable to only executable
code. Review is a very effective technique to remove defects from source code.
Code review for a module is undertaken after the module successfully compiles. That is, all the
syntax errors have been eliminated from the module. Obviously, code review does not target the
detection of syntax errors in a program, but is designed to detect logical, algorithmic, and programming
errors. Code review has been recognised as an extremely cost-effective strategy for eliminating
coding errors and for producing high quality code. The reason why code review is a much
more cost-effective strategy to eliminate errors from code compared to testing is that reviews
directly detect errors. On the other hand, testing only helps detect failures and significant effort is
needed to locate the error during debugging
The rationale behind the above statement is explained as follows. Eliminating an error from code
involves three main activities—testing, debugging, and then correcting the errors. Testing is carried
out to detect if the system fails to work satisfactorily for certain types of inputs and under certain
circumstances. Once a failure is detected, debugging is carried out to locate the error that is causing
the failure and to remove it. Of these three activities, debugging is possibly the most laborious
and time consuming activity. In code inspection, errors are directly detected, thereby saving the
significant effort that would have been required to locate the error. Normally, the following two
types of reviews are carried out on the code of a module:
Code inspection.
Code walkthrough.
Code walkthrough is an informal code analysis technique. In this technique, a module is taken up for
review after the module has been coded, successfully compiled, and all syntax errors have been
eliminated. A few members of the development team are given the code a couple of days before the
walkthrough meeting. Each member selects some test cases and simulates execution of the code by
hand (i.e., traces the execution through different statements and functions of the code). The main
objective of code walkthrough is to discover the algorithmic and logical errors in the code. The
members note down their findings of their walkthrough and discuss those in a walkthrough meeting
where the coder of the module is present. Even though code walkthrough is an informal analysis
technique, several guidelines have evolved over the years for making this naive but useful analysis
technique more effective. These guidelines are based on personal experience, common sense,
and several other subjective factors. Therefore, these guidelines should be considered as examples
rather than as accepted rules to be applied dogmatically.
Some of these guidelines are as follows:
● The team performing code walkthrough should neither be too big nor too small. Ideally, it
should consist of three to seven members.
● Discussions should focus on discovery of errors and avoid deliberations on how to fix the
discovered errors.
● In order to foster co-operation and to avoid the feeling among the engineers that they are
being watched and evaluated in the code walkthrough meetings, managers should not
attend the walkthrough meetings.
Code Inspection During code inspection, the code is examined for the presence of some
common programming errors. This is in contrast to the hand simulation of code execution
carried out during code walkthroughs. We can state the principal aim of the code inspection to
be the following:
The principal aim of code inspection is to check for the presence of some common types of
errors that usually creep into code due to programmer mistakes and oversights and to check
whether coding standards have been adhered to.
The inspection process has several beneficial side effects, other than finding errors. The
programmer usually receives feedback on programming style, choice of algorithm, and
programming techniques. The other participants gain by being exposed to another
programmer’s errors.
As an example of the type of errors detected during code inspection, consider the classic error of
writing a procedure that modifies a formal parameter and then calls it with a constant actual
parameter. It is more likely that such an error can be discovered by specifically looking for this
kind of mistake in the code, rather than by simply hand simulating execution of the code. In
addition to the commonly made errors, adherence to coding standards is also checked during
code inspection
Good software development companies collect statistics regarding different types of errors that
are commonly committed by their engineers and identify the types of errors most frequently
committed. Such a list of commonly committed errors can be used as a checklist during code
inspection to look out for possible errors. Following is a list of some classical programming errors
which can be checked during code inspection:
● Use of uninitialised variables.
● Jumps into loops.
● Non-terminating loops.
● Incompatible assignments.
● Array indices out of bounds.
● Improper storage allocation and deallocation.
● Mismatch between actual and formal parameters in procedure calls.
● Use of incorrect logical operators or incorrect precedence among operators.
● Improper modification of loop variables.
● Comparison of equality of floating point values.
● Dangling reference caused when the referenced memory has not been allocated.
Clean Room Testing
Clean room testing was pioneered at IBM. This type of testing relies heavily on walkthroughs,
inspection, and formal verification. The programmers are not allowed to test any of their code by
executing the code other than doing some syntax testing using a compiler. It is interesting to note
that the term cleanroom was first coined at IBM by drawing analogy to the semiconductor
fabrication units where defects are avoided by manufacturing in an ultra-clean atmosphere.
This technique reportedly produces documentation and code that is more reliable and maintainable
than other development methods relying heavily on code execution-based testing.
The main problem with this approach is that the testing effort is increased, as walkthroughs,
inspection, and verification are time consuming even for detecting simple errors. Also, testing-based
error detection is efficient for detecting certain errors that escape manual inspection.
SOFTWARE DOCUMENTATION
When a software is developed, in addition to the executable files and the source code, several kinds
of documents such as users’ manual, software requirements specification (SRS) document, design
document, test document, installation manual, etc., are developed as part of the software
engineering process. All these documents are considered a vital part of any good software
development practice. Good documents are helpful in the following ways:
● Good documents help enhance understandability of code. As a result, the availability of
good documents help to reduce the effort and time required for maintenance.
● Documents help the users to understand and effectively use the system.
● Good documents help to effectively tackle the manpower turnover problem. Even when an
engineer leaves the organisation and a new engineer comes in, he can build up the required
knowledge easily by referring to the documents.
● Production of good documents helps the manager to effectively track the progress of the
project. The project manager would know that some measurable progress has been
achieved, if the results of some pieces of work have been documented and the same have been
reviewed.
Different types of software documents can broadly be classified into the following:
Internal documentation: These are provided in the source code itself.
External documentation: These are the supporting documents such as SRS document, installation
document, user manual, design document, and test document.
Internal Documentation Internal documentation is the code comprehension features provided in
the source code itself. Internal documentation can be provided in the code in several forms. The
important types of internal documentation are the following:
● Comments embedded in the source code.
● Use of meaningful variable names.
● Module and function headers.
● Code indentation.
● Code structuring (i.e., code decomposed into modules and functions).
● Use of enumerated types.
● Use of constant identifiers.
● Use of user-defined data types.
Careful experiments suggest that out of all types of internal documentation, meaningful variable
names are the most useful while trying to understand a piece of code. The above assertion, of course,
is in contrast to the common expectation that code commenting would be the most useful. The
research finding is obviously true when comments are written without much thought.
For example, the following style of code commenting is not much of a help in understanding the
code. a=10; /* a made 10 */
A good style of code commenting is to write comments that clarify certain non-obvious aspects of the
working of the code, rather than cluttering the code with trivial comments. Good software
development organisations usually ensure good internal documentation by appropriately formulating
their coding standards and coding guidelines. Even when a piece of code is carefully commented,
meaningful variable names have been found to be the most helpful in understanding the code.
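In contrast to the trivial comment shown above, a useful comment records a non-obvious decision, as in this assumed fragment.

    import java.util.Collections;
    import java.util.List;

    class MemberLookup {
        /* The member id list is kept sorted and lookups greatly outnumber insertions
           in this module, so binary search is used instead of a linear scan. */
        int findMember(List<Integer> memberIds, int searchId) {
            return Collections.binarySearch(memberIds, searchId);
        }
    }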
External Documentation External documentation is provided through various types of supporting
documents such as users’ manual, software requirements specification document, design document,
test document, etc.
A systematic software development style ensures that all these documents are of good quality and
are produced in an orderly fashion.
An important feature that is required of any good external documentation is consistency with the
code. If the different documents are not consistent, a lot of confusion is created for somebody trying
to understand the software. In other words, all the documents developed for a product should be
up-to-date and every change made to the code should be reflected in the relevant external
documents. Even if only a few documents are not up-to-date, they create inconsistency and lead to
confusion. Another important feature required for external documents is proper understandability
by the category of users for whom the document is designed.
Gunning’s fog index
Gunning’s fog index (developed by Robert Gunning in 1952) is a metric that has been designed to
measure the readability of a document. The computed metric value (fog index) of a document
indicates the number of years of formal education that a person should have, in order to be able to
comfortably understand that document. That is, if a certain document has a fog index of 12, anyone
who has completed his 12th class would not have much difficulty in understanding that document.
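The fog index is commonly stated as 0.4 × (average sentence length + percentage of words having three or more syllables). The sketch below computes it from counts supplied by the caller; how the complex (three-or-more syllable) words are identified is left out, so treat this only as an illustration of the formula.

    class FogIndex {
        // Gunning's fog index: 0.4 * (average words per sentence +
        // percentage of words having three or more syllables).
        static double fogIndex(int totalWords, int totalSentences, int complexWords) {
            double avgSentenceLength = (double) totalWords / totalSentences;
            double percentComplex = 100.0 * complexWords / totalWords;
            return 0.4 * (avgSentenceLength + percentComplex);
        }
    }

For instance, a document with 1000 words, 50 sentences, and 100 complex words would get a fog index of 0.4 × (20 + 10) = 12, i.e., it should be comfortably readable after about 12 years of formal education.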
TESTING
The aim of program testing is to help identify all defects in a program. However, in practice,
even after satisfactory completion of the testing phase, it is not possible to guarantee that a program
is error free. This is because the input data domain of most programs is very large, and it is not
practical to test the program exhaustively with respect to each value that the input can assume.
Consider a function taking a floating point number as its argument. If a tester takes 1 second to type in
a value, then even a million testers would not be able to exhaustively test it even after trying for a
million years. Even with this obvious limitation of the testing process, we should not
underestimate the importance of testing. We must remember that careful testing can expose a large
percentage of the defects existing in a program, and therefore provides a practical way of reducing
defects in a system.
How to test a program?
Testing a program involves executing the program with a set of test inputs and observing if the
program behaves as expected. If the program fails to behave as expected, then the input data and
the conditions under which it fails are noted for later debugging and error correction. A highly
simplified view of program testing is schematically shown in Figure 10.1. The tester has been shown
as a stick icon, who inputs several test data to the system and observes the outputs produced by it
to check if the system fails on some specific inputs. Unless the conditions under which a software
fails are noted down, it becomes difficult for the developers to reproduce a failure observed by the
testers. For example, a software might fail for a test case only when a network connection is
enabled.
A mistake is essentially any programmer action that later shows up as an incorrect result during
program execution. A programmer may commit a mistake in almost any development activity. For
example, during coding a programmer might commit the mistake of not initializing a certain variable,
or might overlook the errors that might arise in some exceptional situations such as division by zero
in an arithmetic operation. Both these mistakes can lead to an incorrect result.
An error is the result of a mistake committed by a developer in any of the development activities.
Among the extremely large variety of errors that can exist in a program, one example of an error is a
call made to a wrong function.
A failure of a program essentially denotes an incorrect behaviour exhibited by the program during
its execution. An incorrect behaviour is observed either as an incorrect result produced or as an
inappropriate activity carried out by the program. Every failure is caused by some bugs present in
the program
The number of possible ways in which a program can fail is extremely large. Out of the large number
of ways in which a program can fail, in the following we give three randomly selected examples:
– The result computed by a program is 0, when the correct result is 10.
– A program crashes on an input.
– A robot fails to avoid an obstacle and collides with it.
A test scenario is an abstract test case in the sense that it only identifies the aspects of the program
that are to be tested without identifying the input, state, or output. A test case can be said to be an
implementation of a test scenario. In the test case, the input, output, and the state at which the
input would be applied is designed such that the scenario can be executed. An important automatic
test case design strategy is to first design test scenarios through an analysis of some program
abstraction (model) and then implement the test scenarios as test cases.
A test script is an encoding of a test case as a short program. Test scripts are developed for
automated execution of the test cases. A test case is said to be a positive test case if it is designed to
test whether the software correctly performs a required functionality.
A test case is said to be a negative test case if it is designed to test whether the software carries out
something that is not required of the system. As one example each of a positive test case and a
negative test case, consider a program to manage user login. A positive test case can be designed to
check if a login system validates a user with the correct user name and password. A negative test
case in this instance can be a test case that checks whether the login functionality validates and
admits a user with a wrong or bogus user name or password.
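Expressed as simple test scripts, the two cases might look like the sketch below; the LoginSystem class and its validate method are assumed for illustration and do not refer to any particular library.

    class LoginSystem {
        boolean validate(String user, String password) {
            // stand-in logic for the system under test
            return "alice".equals(user) && "secret".equals(password);
        }
    }

    class LoginTests {
        public static void main(String[] args) {
            LoginSystem login = new LoginSystem();

            // Positive test case: correct credentials must be accepted.
            boolean positivePassed = login.validate("alice", "secret");

            // Negative test case: bogus credentials must be rejected.
            boolean negativePassed = !login.validate("alice", "wrong-password");

            System.out.println("positive: " + positivePassed + ", negative: " + negativePassed);
        }
    }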
A test suite is the set of all test cases that have been designed by a tester to test a given program.
Testability of a requirement denotes the extent to which it is possible to determine whether an
implementation of the requirement conforms to it in both functionality and performance. In other
words, the testability of a requirement is the degree to which an implementation of it can be
adequately tested to determine its conformance to the requirement.
Verification versus validation
The objectives of both verification and validation techniques are very similar since both these
techniques are designed to help remove errors in a software. In spite of the apparent similarity
between their objectives, the underlying principles of these two bug detection techniques and their
applicability are very different. We summarise the main differences between these two techniques
in the following: Verification is the process of determining whether the output of one phase of
software development conforms to that of its previous phase; whereas validation is the process of
determining whether a fully developed software conforms to its requirements specification. Thus,
the objective of verification is to check if the work products produced after a phase conform to that
which was input to the phase. For example, a verification step can be to check if the design
documents produced after the design step conform to the requirements specification.
On the other hand, validation is applied to the fully developed and integrated software to check if it
satisfies the customer’s requirements.
● The primary techniques used for verification include review, simulation, formal verification,
and testing. Review, simulation, and testing are usually considered as informal verification
techniques. Formal verification usually involves use of theorem proving techniques or use of
automated tools such as a model checker. On the other hand, validation techniques are
primarily based on product testing. Note that we have categorised testing both under
program verification and validation. The reason being that unit and integration testing can
be considered as verification steps where it is verified whether the code is as per the
module and module interface specifications. On the other hand, system testing can be
considered as a validation step where it is determined whether the fully developed code is
as per its requirements specification.
● Verification does not require execution of the software, whereas validation requires
execution of the software
● Verification is carried out during the development process to check if the development
activities are proceeding alright, whereas validation is carried out to check if the right software, as
required by the customer, has been developed.
● Verification techniques can be viewed as an attempt to achieve phase containment of errors.
Phase containment of errors has been acknowledged to be a cost-effective way to eliminate
program bugs, and is an important software engineering principle. The principle of detecting
errors as close to their points of commitment as possible is known as phase containment of
errors. Phase containment of errors can reduce the effort required for correcting bugs. For
example, if a design problem is detected in the design phase itself, then the problem can be
taken care of much more easily than if the error is identified, say, at the end of the testing
phase. In the latter case, it would be necessary not only to rework the design, but also to
appropriately redo the relevant coding as well as the system testing activities, thereby
incurring higher cost.
● While verification is concerned with phase containment of errors, the aim of validation is to
check whether the deliverable software is error free.
● Error detection techniques = Verification techniques + Validation techniques
Testing involves performing the following main activities:
Test suite design: The set of test cases using which a program is to be tested is designed possibly
using several test case design techniques. We discuss a few important test case design
techniques later in this Chapter.
Running test cases and checking the results to detect failures: Each test case is run and the
results are compared with the expected results. A mismatch between the actual result and
expected results indicates a failure. The test cases for which the system fails are noted down for
later debugging.
Locate error: In this activity, the failure symptoms are analysed to locate the errors. For each
failure observed during the previous activity, the statements that are in error are identified.
Error correction: After the error is located during debugging, the code is appropriately changed
to correct the error.
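As a minimal sketch of the "run test cases and check results" activity (the function under test and the discount rule below are assumptions made up for illustration):

# Hypothetical unit under test.
def compute_discount(amount):
    # Assumed rule: 10% discount (in whole rupees) on purchases of 1000 or more.
    return amount // 10 if amount >= 1000 else 0

# Each test case pairs an input with its expected result.
test_suite = [
    {"input": 500,  "expected": 0},
    {"input": 1000, "expected": 100},
    {"input": 2000, "expected": 200},
]

failures = []
for case in test_suite:
    actual = compute_discount(case["input"])
    if actual != case["expected"]:
        # A mismatch between the actual and expected result indicates a failure;
        # such cases are noted down for later debugging.
        failures.append((case, actual))

print("failed test cases:", failures)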
When test cases are designed based on random input data, many of the test cases do not
contribute to the significance of the test suite. That is, they do not help detect any additional
defects not already being detected by other test cases in the suite. In contrast, a minimal test
suite is a carefully designed set of test cases such that each test case helps detect different
errors. This is in contrast to testing using some random input values.
Testing a software using a large collection of randomly selected test cases does not guarantee
that all (or even most) of the errors in the system will be uncovered. Let us try to understand
why the number of random test cases in a test suite is, in general, not an indicator of the
effectiveness of testing.
There are two types of testing, namely:
Black-box approach
White-box (or glass-box) approach
In the black-box approach, test cases are designed using only the functional specification of the
software. That is, test cases are designed solely based on an analysis of the input/output behaviour
(that is, the functional behaviour) and do not require any knowledge of the internal structure of a
program. For this reason, black-box testing is also known as functional testing. On the other
hand, designing white-box test cases requires a thorough knowledge of the internal structure of
a program, and therefore white-box testing is also called structural testing. Black-box test cases
are designed solely based on the input-output behaviour of a program.
In contrast, white-box test cases are based on an analysis of the code. These two approaches to
test case design are complementary. That is, a program has to be tested using test cases
designed by both the approaches; testing using one approach does not substitute for
testing using the other.
A software product is normally tested in three levels or stages:
Unit testing
Integration testing
System testing
During unit testing, the individual functions (or units) of a program are tested. Unit testing is
referred to as testing in the small, whereas integration and system testing are referred to as
testing in the large. After testing all the units individually, the units are integrated incrementally,
and the partially integrated system is tested after each step of integration (integration testing).
Finally, the fully integrated system is tested (system testing).
Unit testing is undertaken after a module has been coded and reviewed. This activity is typically
undertaken by the coder of the module himself in the coding phase. Before carrying out unit
testing, the unit test cases have to be designed and the test environment for the unit under test
has to be developed.
Driver and stub modules In order to test a single module, we need a complete environment to
provide all relevant code that is necessary for execution of the module. That is, besides the
module under test, the following are needed to test the module:
The procedures belonging to other modules that the module under test calls.
Non-local data structures that the module accesses.
A procedure to call the functions of the module under test with appropriate parameters.
Modules required to provide the necessary environment (which either call or are called by the
module under test) are usually not available until they too have been unit tested.
A stub procedure is a dummy procedure that has the same I/O parameters as the function called
by the unit under test but has a highly simplified behaviour. For example, a stub may simply
return a fixed value or print an appropriate message instead of performing the actual computation.
Driver: A driver module should contain the non-local data structures accessed by the module
under test. Additionally, it should also have the code to call the different functions of the unit
under test with appropriate parameter values for testing
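A minimal sketch of a stub and a driver (the module and function names here are assumptions made up for illustration, not part of any particular system):

# Hypothetical unit under test: computes the payable amount for an order.
def compute_payable(order_id, lookup_price):
    price = lookup_price(order_id)   # normally supplied by another module
    return price * 1.18              # assumed rule: add 18% tax

# Stub: same I/O interface as the real price-lookup procedure, but with a
# highly simplified behaviour -- it always returns a fixed value.
def lookup_price_stub(order_id):
    return 100.0

# Driver: holds the test data and calls the unit under test with
# appropriate parameter values, checking the result.
def driver():
    result = compute_payable(order_id=42, lookup_price=lookup_price_stub)
    assert abs(result - 118.0) < 1e-6
    print("unit test of compute_payable passed")

driver()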
BLACK-BOX TESTING In black-box testing, test cases are designed from an examination of the
input/output values only and no knowledge of design or code is required.
The following are the two main approaches available to design black box test cases:
● Equivalence class partitioning
● Boundary value analysis
Equivalence Class Partitioning In the equivalence class partitioning approach, the domain of input
values to the program under test is partitioned into a set of equivalence classes. The partitioning is
done such that for every input data belonging to the same equivalence class, the program behaves
similarly. The main idea behind defining equivalence classes of input data is that testing the code
with any one value belonging to an equivalence class is as good as testing the code with any other
value belonging to the same equivalence class.
Boundary Value Analysis A type of programming error that is frequently committed by
programmers is missing out on the special consideration that should be given to the values at the
boundaries of different equivalence classes of inputs. The reason behind programmers committing
such errors might purely be due to psychological factors. Programmers often fail to properly address
the special processing required by the input values that lie at the boundary of the different
equivalence classes.
The important steps in the black-box test suite design approach are as follows:
● Examine the input and output values of the program and identify the equivalence classes.
● Design the equivalence class test cases by picking one representative value from each
equivalence class.
● Design the boundary value test cases: examine if any equivalence class is a range of values,
and include the values at the boundaries of such equivalence classes in the test suite.
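As a small illustration of these steps (the mark-grading function and its ranges below are assumptions made up for this sketch):

# Hypothetical function under test: accepts an integer mark between 0 and 100.
def grade(mark):
    if mark < 0 or mark > 100:
        raise ValueError("mark out of range")
    return "pass" if mark >= 40 else "fail"

# Steps 1-2: equivalence classes of the input domain and one representative each.
#   invalid (below range): mark < 0          -> representative -5
#   fail:                  0 <= mark <= 39   -> representative 20
#   pass:                  40 <= mark <= 100 -> representative 70
#   invalid (above range): mark > 100        -> representative 150
equivalence_class_test_cases = [-5, 20, 70, 150]

# Step 3: boundary value test cases -- values at the edges of the range-type
# equivalence classes, together with the values just outside them.
boundary_value_test_cases = [-1, 0, 39, 40, 100, 101]

black_box_test_suite = equivalence_class_test_cases + boundary_value_test_cases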
The strategy for black-box testing is intuitive and simple. For black-box testing, the most important
step is the identification of the equivalence classes. Often, the identification of the equivalence
classes is not straightforward. However, with little practice one would be able to identify all
equivalence classes in the input data domain. Without practice, one may overlook many equivalence
classes in the input data set. Once the equivalence classes are identified, the equivalence class and
boundary value test cases can be selected almost mechanically.
WHITE BOX TESTING
White-box testing is an important type of unit testing. A large number of white-box testing strategies
exist. Each testing strategy essentially designs test cases based on analysis of some aspect of source
code and is based on some heuristic.
A white-box testing strategy can either be coverage-based or fault-based.
Fault-based testing: A fault-based testing strategy targets to detect certain types of faults. These
faults that a test strategy focuses on constitute the fault model of the strategy.
Coverage-based testing A coverage-based testing strategy attempts to execute (or cover) certain
elements of a program. Popular examples of coverage-based testing strategies are statement
coverage, branch coverage, multiple condition coverage, and path coverage-based testing.
Testing criterion for coverage-based testing A coverage-based testing strategy typically targets to
execute (i.e., cover) certain program elements for discovering failures. The set of specific program
elements that a testing strategy targets to execute is called the testing criterion of the strategy. For
example, if a testing strategy requires all the statements of a program to be executed at least once,
then we say that the testing criterion of the strategy is statement coverage. We say that a test suite
is adequate with respect to a criterion, if it covers all elements of the domain defined by that
criterion.
A white-box testing strategy is said to be stronger than another strategy, if the stronger
testing strategy covers all program elements covered by the weaker testing strategy, and the
stronger strategy additionally covers at least one program element that is not covered by the weaker
strategy.
Statement Coverage The statement coverage strategy aims to design test cases so as to
execute every statement in a program at least once. The principal idea governing the statement
coverage strategy is that unless a statement is executed, there is no way to determine whether an
error exists in that statement.
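As a small assumed example, a single test case can be adequate with respect to the statement coverage criterion and yet leave one outcome of the condition unexercised:

def absolute(x):
    result = x
    if x < 0:
        result = -x
    return result

# The single test case x = -3 executes every statement of absolute() at least
# once, so the one-element suite {-3} is adequate with respect to the statement
# coverage criterion. It never exercises the case where the condition is false,
# which branch coverage (discussed next) would additionally require.
print(absolute(-3))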
Branch Coverage A test suite satisfies branch coverage, if it makes each branch condition in
the program to assume true and false values in turn. In other words, for branch coverage each
branch in the CFG representation of the program must be taken at least once, when the test suite is
executed. Branch testing is also known as edge testing, since in this testing scheme, each edge of a
program’s control flow graph is traversed at least once.
Multiple Condition Coverage In the multiple condition (MC) coverage-based testing, test
cases are designed to make each component of a composite conditional expression to assume both
true and false values. For example, consider the composite conditional expression
((c1 .and. c2) .or. c3). A test suite would achieve MC coverage if all the component conditions c1, c2
and c3 are each made to assume both true and false values. Branch testing can be considered to be
a simplistic condition testing strategy where only the compound conditions appearing in the
different branch statements are made to assume the true and false values. It is easy to prove that
condition testing is a stronger testing strategy than branch testing. For a composite conditional
expression of n components, 2^n test cases are required for multiple condition coverage. Thus, for
multiple condition coverage, the number of test cases increases exponentially with the number of
component conditions. Therefore, multiple condition coverage-based testing technique is practical
only if n (the number of conditions) is small.
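A minimal sketch that enumerates the 2^n truth-value combinations for the composite condition quoted above (written here in ordinary Boolean notation):

from itertools import product

# All 2**n truth-value combinations of the component conditions c1, c2, c3
# of the composite condition ((c1 and c2) or c3); here n = 3, so 8 combinations.
for c1, c2, c3 in product([True, False], repeat=3):
    outcome = (c1 and c2) or c3
    print(f"c1={c1!s:<5} c2={c2!s:<5} c3={c3!s:<5} -> {outcome}")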
Path Coverage A test suite achieves path coverage if it executes each linearly independent
path (or basis path) at least once. A linearly independent path can be defined in terms of the
control flow graph (CFG) of a program.
Control flow graph (CFG) A control flow graph describes how the control flows through the
program. We can define a control flow graph as follows: a control flow graph describes the
sequence in which the different instructions of a program get executed. In order to draw the control
flow graph of a program, we need to first number all the statements of a program. The different
numbered statements serve as nodes of the control flow graph. There exists an edge from one node
to another, if the execution of the statement representing the first node can result in the transfer of
control to the other node.
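As a small assumed example, a numbered program fragment and its control flow graph, written as an adjacency list, might look as follows:

# Hypothetical program fragment with numbered statements:
#   1: read a, b
#   2: if a > b:
#   3:     max = a
#   4: else: max = b
#   5: print max
#
# Control flow graph as an adjacency list: an edge (i, j) means that executing
# statement i can result in the transfer of control to statement j.
cfg_edges = {
    1: [2],
    2: [3, 4],   # the branch at statement 2 can go either way
    3: [5],
    4: [5],
    5: [],       # terminal node
}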
Path A path through a program is any node and edge sequence from the start node to a
terminal node of the control flow graph of a program. Please note that a program can have more
than one terminal node when it contains multiple exit or return type statements. Writing test
cases to cover all paths of a typical program is impractical since there can be an infinite number of
paths through a program in presence of loops.
Linearly independent set of paths (or basis path set) A set of paths for a given program is
called linearly independent set of paths (or the set of basis paths or simply the basis set), if each
path in the set introduces at least one new edge that is not included in any other path in the set.
If a set of paths is linearly independent of each other, then no path in the set can be
obtained through any linear operations (i.e., additions or subtractions) on the other paths in the set.
McCabe’s Cyclomatic Complexity Metric McCabe obtained his results by applying graph-
theoretic techniques to the control flow graph of a program. McCabe’s cyclomatic complexity defines
an upper bound on the number of independent paths in a program. We discuss three different ways
to compute the cyclomatic complexity. For structured programs, the results computed by all the
three methods are guaranteed to agree.
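One standard way is the formula V(G) = E - N + 2, where E and N are the numbers of edges and nodes of the control flow graph; the following minimal sketch applies it to the small CFG from the previous example (restated here so the snippet stands alone):

# Cyclomatic complexity by V(G) = E - N + 2 for the small CFG sketched above.
cfg_edges = {1: [2], 2: [3, 4], 3: [5], 4: [5], 5: []}

N = len(cfg_edges)                                    # number of nodes = 5
E = sum(len(succ) for succ in cfg_edges.values())     # number of edges = 5

print("V(G) =", E - N + 2)   # 5 - 5 + 2 = 2 (one decision construct + 1)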
Uses of McCabe’s cyclomatic complexity metric Besides its use in path testing, cyclomatic
complexity of programs has many other interesting applications such as the following: Estimation of
structural complexity of code: McCabe’s cyclomatic complexity is a measure of the structural
complexity of a program. The reason for this is that it is computed based on the code structure
(number of decision and iteration constructs used). Intuitively, McCabe’s complexity metric
correlates with the difficulty level of understanding a program, since one understands a program by
understanding the computations carried out along all independent paths of the program. Cyclomatic
complexity of a program is a measure of the psychological complexity or the level of difficulty in
understanding the program.
Estimation of testing effort: Cyclomatic complexity is a measure of the maximum number of
basis paths. Thus, it indicates the minimum number of test cases required to achieve path coverage.
Therefore, the testing effort and the time required to test a piece of code satisfactorily is
proportional to the cyclomatic complexity of the code. To reduce testing effort, it is necessary to
restrict the cyclomatic complexity of every function to seven.
Estimation of program reliability: Experimental studies indicate there exists a clear
relationship between the McCabe’s metric and the number of errors latent in the code after testing.
This relationship exists possibly due to the correlation of cyclomatic complexity with the structural
complexity of code. Usually the larger is the structural complexity, the more difficult it is to test and
debug the code.
Data Flow-based Testing The data flow-based testing method selects test paths of a program
according to the definitions and uses of different variables in the program. Consider a program P. For a
statement numbered S of P, let DEF(S) = {X | statement S contains a definition of X} and
USES(S) = {X | statement S contains a use of X}.
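For example (a small assumed fragment), if statement 1 is x = a + b, then DEF(1) = {x} and USES(1) = {a, b}; the sets for a three-statement fragment can be tabulated as follows:

# DEF and USES sets for each numbered statement of a small assumed fragment:
#   1: x = a + b
#   2: y = x * 2
#   3: print(y)
def_uses = {
    1: {"DEF": {"x"}, "USES": {"a", "b"}},
    2: {"DEF": {"y"}, "USES": {"x"}},
    3: {"DEF": set(), "USES": {"y"}},
}
# Data flow based testing selects paths that exercise such definition-use pairs,
# e.g. the definition of x at statement 1 and its use at statement 2.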
Mutation testing is a fault-based testing technique in the sense that mutation test cases are
designed to help detect specific types of faults in a program. In mutation testing, a program is first
tested by using an initial test suite designed by using various white box testing strategies that we
have discussed. After the initial testing is complete, mutation testing can be taken up. An important
advantage of mutation testing is that it can be automated to a great extent. The process of
generation of mutants can be automated by predefining a set of primitive changes that can be
applied to the program.
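A minimal sketch of the idea of automating mutant generation through a predefined set of primitive changes (this toy snippet works directly on the program text and is only illustrative; real mutation testing tools operate far more systematically):

# Toy mutant generator: applies predefined primitive changes to the program text.
PRIMITIVE_CHANGES = [("+", "-"), (">", ">="), ("and", "or")]

original_program = "def is_adult(age):\n    return age > 18"

mutants = []
for old, new in PRIMITIVE_CHANGES:
    if old in original_program:
        mutants.append(original_program.replace(old, new, 1))

# Each mutant is then run against the existing test suite; a mutant that passes
# all test cases ("survives") points at a weakness of the test suite.
for m in mutants:
    print(m, end="\n---\n")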
DEBUGGING After a failure has been detected, it is necessary to first identify the program
statement(s) that are in error and are responsible for the failure; the error can then be fixed.
Debugging Approaches The following are some of the approaches that are popularly
adopted by the programmers for debugging: Brute force method This is the most common method
of debugging but is the least efficient method. In this approach, print statements are inserted
throughout the program to print the intermediate values with the hope that some of the printed
values will help to identify the statement in error. This approach becomes more systematic with the
use of a symbolic debugger (also called a source code debugger ), because values of different
variables can be easily checked and break points and watch points can be easily set to test the values
of variables effortlessly. Single stepping using a symbolic debugger is another form of this approach,
where the developer mentally computes the expected result after every source instruction and
checks whether the same is computed by single stepping through the program.
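As a minimal sketch, assuming Python and its built-in pdb as the symbolic debugger, a breakpoint makes the intermediate values inspectable without scattering print statements:

# A breakpoint drops into the symbolic debugger (pdb), where the values of
# `total` and `values` can be inspected and the remaining statements
# single-stepped.
def average(values):
    total = 0
    for v in values:
        total += v
    breakpoint()   # pauses execution here under pdb
    return total / len(values)

average([3, 5, 10])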
Backtracking This is also a fairly common approach. In this approach, starting from the
statement at which an error symptom has been observed, the source code is traced backwards until
the error is discovered. Unfortunately, as the number of source lines to be traced back increases, the
number of potential backward paths increases and may become unmanageably large for complex
programs, limiting the use of this approach
Program slicing This technique is similar to backtracking. In the backtracking approach, one
often has to examine a large number of statements. However, the search space is reduced by
defining slices. A slice of a program for a particular variable and at a particular statement is the set of
source lines preceding this statement that can influence the value of that variable [Mund2002].
Program slicing makes use of the fact that an error in the value of a variable can be caused by the
statements on which it is data dependent.
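As a small assumed example, the slice for the variable total at the final print statement consists only of the preceding lines that can influence the value of total:

# Small assumed program; the slice for `total` at the print statement is the
# set of earlier lines that can influence its value.
n = 4                        # line 1: in the slice (controls the loop bound)
total = 0                    # line 2: in the slice (defines total)
product = 1                  # line 3: NOT in the slice
for i in range(1, n + 1):    # line 4: in the slice (controls the loop)
    product = product * i    # line 5: NOT in the slice
    total = total + i        # line 6: in the slice (defines total)
print(total)                 # line 7: the statement of interest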
Debugging Guidelines Debugging is often carried out by programmers based on their
ingenuity and experience. The following are some general guidelines for effective debugging: Many
times debugging requires a thorough understanding of the program design. Trying to debug based
on a partial understanding of the program design may require an inordinate amount of effort to be
put into debugging even for simple problems. Debugging may sometimes even require full redesign
of the system. In such cases, a common mistake that novice programmers often make is attempting
to fix not the error but only its symptoms.
One must beware of the possibility that an error correction may introduce new errors.
Therefore after every round of error-fixing, regression testing (see Section 10.13) must be carried
out
PROGRAM ANALYSIS TOOLS A program analysis tool usually is an automated tool that takes
either the source code or the executable code of a program as input and produces reports regarding
several important characteristics of the program, such as its size, complexity, adequacy of
commenting, adherence to programming standards, adequacy of testing, etc. We can classify various
program analysis tools into the following two broad categories: static analysis tools and dynamic
analysis tools.
Static Analysis Tools: Static program analysis tools assess and compute various
characteristics of a program without executing it. Typically, static analysis tools analyse the source
code to compute certain metrics characterising the source code (such as size, cyclomatic complexity,
etc.) and also report certain analytical conclusions. These also check the conformance of the code
with the prescribed coding standards. In this context, a static analysis tool typically reports the
extent to which the coding standards have been adhered to, and whether certain programming
errors exist, such as uninitialised variables, mismatches between actual and formal parameters,
variables that are declared but never used, etc. A list of all such errors is displayed.
A major practical limitation of the static analysis tools lies in their inability to analyse run-
time information such as dynamic memory references using pointer variables and pointer
arithmetic, etc. In high-level programming languages, pointer variables and dynamic memory
allocation provide the capability for dynamic memory references. However, dynamic memory
referencing is a major source of programming errors in a program. Static analysis tools often
summarise the results of analysis of every function in a polar chart known as Kiviat Chart. A Kiviat
Chart typically shows the analysed values for cyclomatic complexity, number of source lines,
percentage of comment lines, Halstead’s metrics, etc.
Dynamic Analysis Tools Dynamic program analysis tools can be used to evaluate several
program characteristics based on an analysis of the
run time behaviour of a program. These tools usually record and analyse the actual behaviour of a
program while it is being executed. A dynamic program analysis tool (also called a dynamic analyser )
usually collects execution trace information by instrumenting the code. Code instrumentation is
usually achieved by inserting additional statements to print the values of certain variables into a file
to collect the execution trace of the program. The instrumented code, when executed, records the
behaviour of the software for different test cases. An important characteristic of a test suite that is
computed by a dynamic analysis tool is the extent of coverage achieved by the test suite. After a
software has been tested with its full test suite and its behaviour recorded, the dynamic analysis tool
carries out a post execution analysis and produces reports which describe the coverage that has
been achieved by the complete test suite for the program. For example, the dynamic analysis tool
can report the statement, branch, and path coverage achieved by a test suite. If the coverage
achieved is not satisfactory, more test cases can be designed, added to the test suite, and run.
Further, dynamic analysis results can help eliminate redundant test cases from a test suite. Normally
the dynamic analysis results are reported in the form of a histogram or pie chart to describe the
structural coverage achieved for different modules of the program. The output of a dynamic analysis
tool can be stored and printed easily to provide evidence that thorough testing has been carried out.
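A minimal sketch of code instrumentation for collecting an execution trace, using Python's standard sys.settrace hook (a real dynamic analyser would instrument the code far more systematically and compute statement, branch, and path coverage from the recorded trace):

import sys

executed_lines = set()

def trace(frame, event, arg):
    # Record the line number of every executed source line of the unit under test.
    if event == "line" and frame.f_code.co_name == "classify":
        executed_lines.add(frame.f_lineno)
    return trace

# Hypothetical unit under test.
def classify(x):
    if x >= 0:
        return "non-negative"
    return "negative"

sys.settrace(trace)
classify(5)            # run one test case while tracing is active
sys.settrace(None)

# Post-execution analysis: report which lines of classify() were covered.
print("executed line numbers:", sorted(executed_lines))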
INTEGRATION TESTING Integration testing is carried out after all (or at least some of ) the
modules have been unit tested. Successful completion of unit testing, to a large extent, ensures that
the unit (or module) as a whole works satisfactorily. In this context, the objective of integration
testing is to detect the errors at the module interfaces (call parameters). For example, it is checked
that no parameter mismatch occurs when one module invokes the functionality of another module.
Thus, the primary objective of integration testing is to test the module interfaces, i.e., to ensure
that there are no errors in parameter passing when one module invokes the functionality of
another. In other words, the objective of integration testing is to check whether the different
modules of a program interface with each other properly.
During integration testing, different modules of a system are integrated in a planned manner
using an integration plan. The integration plan specifies the steps and the order in which modules
are combined to realise the full system. After each integration step, the partially integrated system is
tested. An important factor that guides the integration plan is the module dependency graph, also
called the structure chart, which shows the order in which the different modules call each other.
Thus, by examining the structure chart, the integration plan can be developed. Any one (or a mixture) of
the following approaches can be used to develop the test plan:
Big-bang approach to integration testing
Top-down approach to integration testing
Bottom-up approach to integration testing
Mixed (also called sandwiched ) approach to integration testing
In the following subsections, we provide an overview of these approaches to integration testing
Big-bang approach to integration testing
Big-bang testing is the most obvious approach to integration testing. In this approach, all the
modules making up a system are integrated in a single step. In simple words, all the unit tested
modules of the system are simply linked together and tested. However, this technique can
meaningfully be used only for very small systems. The main problem with this approach is that once
a failure has been detected during integration testing, it is very difficult to localise the error as the
error may potentially lie in any of the modules. Therefore, debugging errors reported during big-
bang integration testing are very expensive to fix. As a result, big-bang integration testing is almost
never used for large programs.
Bottom-up approach to integration testing
Large software products are often made up of several subsystems. A subsystem might consist of
many modules which communicate among each other through well-defined interfaces. In bottom-up
integration testing, first the modules of each subsystem are integrated. Thus, the subsystems
can be integrated separately and independently.
The primary purpose of carrying out the integration testing of a subsystem is to test whether
the interfaces among various modules making up the subsystem work satisfactorily. The test cases
must be carefully chosen to exercise the interfaces in all possible manners. In a pure bottom-up
testing no stubs are required, and only test-drivers are required. Large software systems normally
require several levels of subsystem testing; lower-level subsystems are successively combined to
form higher-level subsystems. The principal advantage of bottom-up integration testing is that
several disjoint subsystems can be tested simultaneously. Another advantage of bottom-up testing is
that the low-level modules get tested thoroughly, since they are exercised in each integration step.
Since the low-level modules do I/O and other critical functions, testing the low-level modules
thoroughly increases the reliability of the system. A disadvantage of bottom-up testing is the
complexity that occurs when the system is made up of a large number of small subsystems that are
at the same level. This extreme case corresponds to the big-bang approach.