

Software Development: Core Techniques and Principles


4th Edition

A text for VCE Software Development Units 3&4


Adrian Janson
The software development industry is an exciting and challenging one. Software developers regularly
deal with complex issues such as code optimization, effective user interface design, working with the
myriad of devices on the market and ensuring that the products of their efforts are error free and
work to specification.

Many perceive the role of the software developer as one that is both solitary and mechanical. This
could not be further from the truth. Software development is an art. It is a highly creative field in which
programmers strive for innovative solutions to problems that range in complexity, scope and
application. It is a rare project that is undertaken by a single software developer as many projects are
large enough in scale to require teams of programmers working on aspects that need to be integrated
at a later stage. There are of course a number of stakeholders involved in the software development
process – the key one being the client. A good software developer needs to have communication skills
that will enable them to marry the client’s needs to the product and a mechanism by which the
solution can be refined.

This text has been written to support the VCE Software Development Units 3&4 course from 2020 to
2024. The text follows the study design closely and explains the process of undertaking the
Problem-Solving Methodology, which can be used to develop a solution to a specified problem. This text
has been structured to follow the Areas of Study. At the beginning of each chapter, you will find a list
of the Key Knowledge dot points that are covered as well as a diagram showing where the chapter fits
into the course as a whole. This text contains questions to test your knowledge as well as practice
exam questions from each of the Areas of Study and is designed to be the comprehensive resource
for all students undertaking VCE Software Development.

Published in 2020 by

Adrian Janson Publishing Pty Ltd


ATF Brotto Janson Family Trust
ABN 81507629548
PO Box 2098
Mount Waverley, Vic. 3149
Australia
E-mail: [email protected]
© Adrian Janson 2020
ISBN 978-0-9873914-3-8 (print)
ISBN 978-0-9873914-4-5 (eBook)

Copying for educational purposes


The Australian Copyright Act 1968 (the Act) allows a maximum of one chapter or 10% of this book, whichever is
the greater, to be copied by any educational institution for its educational purposes PROVIDED THAT THE
EDUCATIONAL INSTITUTION (OR THE BODY THAT ADMINISTERS IT) HAS GIVEN A REMUNERATION NOTICE TO
COPYRIGHT AGENCY LIMITED (CAL) UNDER THE ACT.

For details of the CAL licence for educational institutions contact:


Copyright Agency Limited
Level 19, 157 Liverpool Street
Sydney, NSW 2000
Telephone: (02) 9394 7600
Facsimile: (02) 9394 7601
E-mail: [email protected]

Copying for other purposes


Except as permitted under the Act (for example a fair dealing for the purposes of study, research, criticism or
review) no part of this book may be reproduced, stored in a retrieval system, or transmitted in any form or by
any means without prior written permission. All inquiries should be made to the publisher at the email address
or phone number shown above.

Screen shot(s) reprinted with permission of Microsoft Corporation.

CONTENTS
Software Development: Core Techniques and Principles 4th Edition

1. Approaches to problem solving
2. The structure of a programming language
3. Data structures and data manipulation
4. From planning to analysis
5. Analysis tools
6. The art of design
7. Of input and output
8. Testing and evaluating
9. The law in a software development context
10. Cybersecurity

For Dad.

Chapter 1
Approaches to Problem Solving

This chapter covers Unit 3: Area of Study 1 key knowledge:


• The stages of the problem-solving methodology
• Approaches to problem-solving
o KK1.3 Methods of documenting a problem, need or opportunity
o KK1.4 Methods of determining solution requirements, constraints and scope

Key terms: information problem, Problem Solving Methodology, analysis, design, development,
evaluation, solution requirements, constraints, scope, coding, validation, testing, documentation,
efficiency, effectiveness, functional and non-functional components.

Study roadmap

[Table: Chapters 1 to 10 mapped against Unit 3 Area of Study 1 (Programming), Unit 3 Area of Study 2
(Analysis and Design), Unit 4 Area of Study 1 (Development and Evaluation) and Unit 4 Area of Study 2
(Cybersecurity: software security).]

Solving information problems

There is an art to solving information problems. All of you will have an appreciation of this, having
used software that addresses particular problems in different ways – with varying success. There is a
perception amongst non-programmers that programming is a very mechanical process. The truth is the
exact opposite. The process of creating a solution to an information problem is one that involves
high levels of creativity and often imagination. The methodology known as the Problem-solving
Methodology gives a framework to this process.

The Problem-solving Methodology (or PSM) is central to the study of VCE Software Development.
Within the discipline of Software Development, many similar methodologies exist. For the purposes
of this course, the methodology that we are going to study is the PSM as defined in this chapter and
in the VCAA VCE Applied Computing Study Design.

An important consideration when beginning to examine the PSM as stated in the Study Design, is that
of context. The PSM has been created with a view of supporting all of the VCE studies. With this in
mind, its activities are stated in a general sense, and in some cases, may contain examples that are
specific to one VCE unit and not another. Our discussion will place the PSM into a VCE Software
Development context.

The Problem-solving Methodology (PSM)

The Problem-Solving Methodology is a process by which a solution can be found to an information
problem that exists within an information system. A software developer will follow this process to
ensure that a design addresses the problems that exist, performs within specifications, integrates if
needed with the existing system and can be achieved within budget and on time. We will define four
stages, each with a number of activities within them.

The four stages we define are:
1. Analysis
2. Design
3. Development
4. Evaluation

Figure 1: The Problem-solving Methodology (PSM)

It is important to note that the PSM is not necessarily a linear process. In reading the stages and
sub-stages in Figure 1, the order is fairly clear. While a project would start out executing the
stages in this order, and would aim to proceed from one to the next without any deviation, this does
not normally occur.

The way that the PSM is implemented is known as the ‘development model’ and the different choices
that are available will be discussed in detail in Chapter 4. For now, let’s just become familiar with
the stages and activities associated with each stage.

The first stage of the PSM is analysis.

1. Analysis

The analysis stage is where the information system is examined to determine what problems exist or
how new elements can be added. There are a number of tools that can be used to analyse a system, but
it is important to remember that an information problem involves users, processes, equipment and
data and all of these must be considered. Analysis is often about asking targeted questions;
together the answers help to build a picture of what is required. Analysis involves three main
activities in preparing a solution: determining the solution requirements, identifying the solution
constraints and determining the solution scope, all of which are influenced by the needs of the
stakeholders (any parties that have a valid link or interest in the system or information problem).
These activities are often documented in the form of a Software Requirements Specification (SRS)
which is a type of report that is used by software developers to document the analysis of an
information problem and to enable the design stage to begin.

Systems thinking: A holistic approach to the identification and solving of problems. Systems
thinking involves analysing the interactions and interrelationships between components of
individual information systems (data, processes, people and digital systems), to identify how they
are influencing the functioning of the whole system. This approach enables students to understand
whole systems and work with complexity, uncertainty and risk.
VCAA VCE IT Study Design 2020-2023 Glossary

Determining the solution requirements

In determining the solution requirements, the key question that needs to be asked is: What does the
solution need to provide?

Any given information problem will contain aspects that will be well understood and others that
will not. So what is the best way to gain a thorough understanding of the problem in order to work
out what is required of the solution?

A number of tools can be used to represent the information problem and come to a better
understanding of its particular aspects. Tools such as context diagrams, data flow diagrams (DFDs)
and use case diagrams (UCD) can be used in this way, and often all three together will be used to
build a complete picture of what is occurring. Collectively, these tools clearly identify the data
that is currently being gathered, how it is being used and by whom, and what information is being
produced. By examining the information problem in this way, it can be easier to find out what
additional data will be needed to produce the software solution that is required and what functions
the solution needs to provide.

The requirements of a software solution can be classified as either functional or non-functional.
Functional requirements are directly related to what the software solution is required to do.
Non-functional requirements are related to the attributes of the software solution, such as
user-friendliness, response rates, reliability, portability, robustness and maintainability. For
example, the digital blood pressure monitor shown in Figure 2 performs two very specific functions
(taking a person’s blood pressure and their pulse). These are the functional requirements of the
software that is running on the device. However, the software has a number of non-functional
requirements relating to the way in which the information needs to be presented and the way the
device responds. The device displays the pulse as a flashing heart icon and the blood pressure is
displayed in a timely manner (usually less than a minute). The monitor is very easy to use with a
start and stop function (user friendly) as well as a recall function that immediately displays the
last blood pressure reading (response rate).

Figure 2: Digital blood pressure monitor
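One way to see the distinction between the two kinds of requirement is that a functional requirement is checked by what the software does, while a non-functional requirement is checked by measuring an attribute of how it does it. The following is a minimal sketch only, written in Python for illustration. The class, the function names and the one-second limit are hypothetical and are not taken from the monitor in Figure 2 or from the study design.

```python
import time


class BloodPressureMonitor:
    """Hypothetical stand-in for the software running on a blood pressure monitor."""

    def __init__(self):
        self._readings = []

    def record(self, systolic, diastolic, pulse):
        # Functional requirement: store a blood pressure and pulse reading.
        self._readings.append((systolic, diastolic, pulse))

    def recall(self):
        # Functional requirement: return the last reading taken (None if there is none).
        return self._readings[-1] if self._readings else None


def recall_time_seconds(monitor):
    # Non-functional requirement (response rate): measure how long a recall takes.
    start = time.perf_counter()
    monitor.recall()
    return time.perf_counter() - start


if __name__ == "__main__":
    monitor = BloodPressureMonitor()
    monitor.record(120, 80, 72)
    # Functional check: does the software do what it is required to do?
    print("Last reading:", monitor.recall())
    # Non-functional check: is the response fast enough? (1.0 s is an assumed limit)
    print("Recall within limit:", recall_time_seconds(monitor) < 1.0)
```

Note that the functional check compares an output with an expected value, whereas the non-functional check compares a measurement with a limit that would have to be agreed with the client.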


Identifying the constraints on the solution


Key to the success of any software solution will be an understanding of the constraints placed upon
it. Probably the most obvious constraint is cost and this will vary widely based on the size of the
organisation and the scale of the project. Other constraints that need to be taken into account are:
speed of processing required (or available), the requirements that users have of the solution, legal
requirements, security required (or imposed), compatibility with existing hardware and software
within the system, the level of expertise of the users, technical support staff and the software
developers themselves, the capacity of the existing system and the availability of equipment. This
is by no means an exhaustive list and sometimes the constraints that are placed on a proposed
software solution are hard to predict and particular to the organisation or environment.

Determining the scope of the solution

Similar to constraints is the topic of scope. The scope defines what the boundaries of the software
solution will be. It also identifies what the solution will do, what it won’t do and what particular
benefits there will be to users. Benefits are often stated in terms of efficiency and effectiveness.

What is the difference between efficiency and effectiveness?

Two terms that are important to understand are efficiency and effectiveness. Efficiency can be
measured by examining factors such as the time it takes to complete common tasks, the cost of
maintaining the system and the effort required to produce the required information. Effectiveness
can be measured by examining whether the goals of the system have been met, that is, how accurate
the solution is.

Effectiveness: A measure of how well a solution, information management strategy or a network
functions and whether each achieves its intended results. Measures of effectiveness in a solution
include accessibility, accuracy, attractiveness, clarity, communication of message, completeness,
readability, relevance, timeliness, and usability. Measures of effectiveness of an information
management strategy include currency of files, ease of retrieval, integrity of data and security.
Measures of effective networks include maintainability, reliability and the security of data during
storage and transmission.
VCAA VCE Computing Study Design 2020-2023 Glossary

Efficiency: A measure of how much time, cost and effort is applied to achieve intended results.
Measures of efficiency in a solution could include the cost of file manipulation, its functionality
and the speed of processing. Measures of efficiency in a network include its productivity,
processing time, operational costs and level of automation.
VCAA VCE Computing Study Design 2020-2023 Glossary

2. Design

Once the analysis is complete, the design of the software solution can begin. If a software
developer is new to a project at this stage in the PSM, they will be provided with an SRS which
outlines all of the important aspects from the analysis stage. The format and content of an SRS will
be discussed in detail in Chapter 5.

The design stage consists of two main activities: planning how the solution will function given the
requirements (the solution design) and determining the criteria that will be used to evaluate the
solution.


Planning solution functionality and appearance (the solution design)

Designing how the solution will function is a complex activity for a software developer. It involves
working out how the data that is required will be named, structured, validated, manipulated and
stored. A number of tools can help to accomplish this task. Data dictionaries, data structure
diagrams, pseudocode and object descriptions can all be used to do this. This activity also involves
showing how the various components of the solution relate to one another. Tools that can help to
accomplish this are data flow diagrams, context diagrams and use case diagrams. Lastly, it is
important to be able to design how information will be presented in the software solution. This can
be represented by using tools such as annotated diagrams and mock ups. Each of these tools will be
explained in the following chapters.

Determining the evaluation criteria

It may seem out of place in the design stage to be considering how the software solution will be
evaluated; however, this is vitally important to the success of the project. By translating the
requirements of the software solution as set out in the SRS to a number of evaluation criteria,
software developers have a better sense of how the success of the solution will be ultimately
judged.

3. Development

The development stage of the PSM is the stage during which the software solution is built. It
consists of four main activities: coding (manipulation), validation, testing and documenting. These
activities do not necessarily occur in this order, as will be explained.

Coding (manipulation)

This activity in the PSM is titled ‘manipulation’ but for our purposes, the development stage begins
with coding.

Coding a software solution is probably seen as the largest activity from the perspective of a
software developer. In a sense this is true, but it is a task that is made much easier if due
diligence has been given to the analysis and design stages. While coding a software solution, it is
important that a software developer follows good coding conventions for the naming and structure of
the code and includes ample internal documentation. While following these sorts of conventions may
not affect the final product (and will be invisible to all but those involved in the development now
and in the future), not doing these things leads to sloppy and hard-to-read code (and is seen as
quite unprofessional). The coder writes code in an appropriate programming language in accordance
with the plan, then tests, debugs and modifies the code as required.

Validating

Validating data is the process of determining the reasonableness of the data. The amount of
validation that is included and the way in which it operates can have a profound impact on the
effectiveness of the entire software solution. As well as preventing errors from occurring, the
process of ‘trapping errors’ can also be used to determine if data is reasonable.

Validation really begins in the design stage. It is in the design stage that important decisions
about how data will be collected, processed and output will be made. Validation is also strongly
influenced by user interface design. The actual validation coding takes place in the development
stage.

Testing

Testing is often an ongoing activity during the development of a software solution, and as a
programmer adds elements to the program, they will test them to see that they are working and modify
or fix them as needed. However, the formal activity of testing a software solution is usually
conducted at the conclusion of the development of the software and is done using an exhaustive grid
which covers both the valid and invalid possibilities of the software’s use.

When undertaking this task, the first step is to list all of the tests that will be undertaken. This
list can be quite long as it will be designed to cover all of the combinations of valid and invalid
input as well as use of the software. Test data will be constructed to perform each of these tests
and often the expected behaviour or output from each will be documented. The tests will then be
carried out and the behaviour of the software solution compared to the expected result in each case.

There is some argument that the best time to compile a list of tests is during the design stage.
Doing so gives those coding the solution a valuable insight into the exact parameters of both the
input and output.

The final step of the testing process is correcting those errors that have been detected, after
which the test data that initially triggered the incorrect result is tested again to ensure that the
software solution has been fixed.

Documentation

Documentation is then written to support the variety of users of the software solution.
Documentation can take a number of forms. Internal documentation is placed inside the program code
and assists future programmers who wish to modify the software solution. System support
documentation can be in electronic or hardcopy form. Different types of system support documentation
can be produced for different groups from those using the system to those maintaining it.

4. Evaluation

The evaluation of a software solution is an important activity that will usually take place after
the solution has been in full operation for a while. This time period varies, but it needs to be
long enough so that users of the system are comfortable with it and hopefully not resentful of it
(as can happen when new software solutions are introduced). An evaluation can (and should) contain
many different elements. There are two key activities involved in this stage: evaluating the
software solution and determining a strategy that will be used to find out the extent that the
solution meets the required needs.

Evaluating the solution

In the design stage, a set of evaluation criteria was created that can now be drawn upon to evaluate
how well the solution has met requirements, needs or opportunities. In framing the evaluation of
these criteria, it is important to consider the overall efficiency and effectiveness of the
solution. For that reason, criteria that have been written in a quantifiable way containing
efficiency and effectiveness measures will enable a software developer to quickly determine the
extent to which a software solution has been a success or what its deficits are.

Determining a strategy

What are the best ways to find out if the software solution has met the required needs? A strategy
to determine this will include a timeline for the evaluation, what data will be collected using what
methods and how the data relates to the evaluation criteria set out in the design stage. Note that
this activity is more complex than simply asking a series of questions of the users.
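The validating and testing activities of the development stage can be made concrete with a short example. The following is a minimal sketch only, written in Python for illustration. The function names, the messages and the assumed reasonable range of 50 to 250 for a systolic blood pressure reading are hypothetical. It shows a validation routine that traps errors and checks reasonableness, followed by a small test table that pairs valid and invalid test data with the expected result and compares each with the actual result.

```python
def validate_systolic(raw_value):
    """Return (is_valid, message) for a systolic reading entered as text."""
    text = raw_value.strip()
    if text == "":
        # Has anything been entered at all?
        return False, "A value is required"
    try:
        # Trap the error that a non-numeric entry would otherwise cause.
        reading = int(text)
    except ValueError:
        return False, "Value must be a whole number"
    if not 50 <= reading <= 250:
        # Reasonableness check against the assumed limits.
        return False, "Value is outside the reasonable range"
    return True, "OK"


# Test table: (description, test data, expected validity).
TEST_TABLE = [
    ("typical valid value", "120", True),
    ("lower boundary", "50", True),
    ("upper boundary", "250", True),
    ("just below range", "49", False),
    ("just above range", "251", False),
    ("not a number", "abc", False),
    ("nothing entered", "", False),
]


def run_tests():
    """Carry out every test and compare the actual result with the expected one."""
    results = []
    for description, data, expected in TEST_TABLE:
        actual, _message = validate_systolic(data)
        results.append((description, data, expected, actual, actual == expected))
    return results


if __name__ == "__main__":
    for description, data, expected, actual, passed in run_tests():
        outcome = "PASS" if passed else "FAIL"
        print(f"{description:22} {data!r:8} expected={expected} actual={actual} {outcome}")
```

If any row reports FAIL, the error is corrected and the same test data is run again, which mirrors the final step of the testing process described in this chapter.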


Context Questions

1. What is the purpose of the problem-solving methodology?
2. What possible consequences are there in evaluating a software solution too early?
3. What is meant by the term ‘due diligence’?
4. Is the PSM always implemented in a linear fashion? Explain what this means.
5. What is validation and in which stage does it take place?
6. When considering how to test a software solution, why is it useful to create a list of all of the
tests that will be conducted?
7. What is scope and how is it different to constraints?
8. When creating a strategy for evaluating a software solution, what elements need to be
included?

Applying the Concepts

• Imagine you are beginning the process of writing this text from scratch, using the PSM as the
method that will guide you. Write down what you envision you would be completing at each
stage and each activity within the PSM. Compare this to what others have and discuss any
different approaches that become evident.

Key Skills Checklist

At the conclusion of this chapter, you should be able to address the following key skills. Mark each off
as you can achieve them.

Understand the stages and activities within the PSM

Explain the difference between functional and non-functional requirements

Understand what it means for a solution to be effective

List and describe measures of effectiveness

Understand what it means for a solution to be efficient

List and describe measures of efficiency


Sample Examination Questions

The following sample examination questions can be attempted to test your knowledge of the content
of this chapter.

Question 1
The purpose of the Problem Solving Methodology is:
A. To give form to what goes on in completing a task
B. To document a process within an organisation
C. To create a solution to a problem
D. To provide a structure by which a solution can be found to an information problem

Question 2
Which of the following lists contains measures of efficiency?
A. readability, cost, attractiveness
B. speed of processing, cost, level of automation
C. speed of processing, usability, accuracy
D. cost, productivity, timeliness

Question 3
Which of the following is a functional requirement of a solution?
A. User friendly
B. Robust
C. Calculates the tax payable for the financial year
D. Can be easily maintained

Question 4
The four stages of the Problem Solving Methodology are:
A. Analysis, Design, Implementation, Feedback
B. Design, Implementation, Feedback, Packaging
C. Analysis, Development, Evaluation, Construction
D. Analysis, Design, Development, Evaluation

Question 5
With reference to the stages of the Problem Solving Methodology, the stage that involves writing
evaluation criteria is called the ___________________ stage.
1 mark
Question 6
With reference to the stages of the Problem Solving Methodology, validation takes place within the
___________________ stage.
1 mark


Question 7
What is the difference between efficiency and effectiveness?

___________________________________________________________________________

___________________________________________________________________________

___________________________________________________________________________

___________________________________________________________________________
2 marks

Question 8
Paul is beginning the process of developing an App that will be used for podcasting. He starts by
gathering some data from both the presenters and listeners about what they would like to see
included. For each of these groups, list one functional requirement that may be put forward as well as
one non-functional requirement.

Group        Requirements
Presenters   Functional:

             Non-functional:

Listeners    Functional:

             Non-functional:

4 marks


Sample Examination Answers

Question 1
Answer: D

Question 2
Answer: B

Efficiency vs effectiveness. Straight from the glossary definition.

Question 3
Answer: C

A functional requirement describes something that the solution is required to do.

Question 4
Answer: D

Question 5
Answer: Design

Easy mark here – definition directly from the PSM.

Question 6
Answer: Development

Question 7
Efficiency is a measure of how much time, cost and effort is applied to achieve the intended results,
while effectiveness is a measure of how well the solution achieves those intended results.

Efficiency and effectiveness are mentioned frequently in the study design and feature in the exam.
Often it will not be in a question as simple as this one. It is important to use these terms in your answers
but at the same time ensure that you use them in context and explain why. For example, stating that
a new system is efficient will not earn you any marks, but stating that a new system will increase
efficiency by reducing processing time, will.
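A claim such as ‘reducing processing time’ is easier to justify when it is backed by a measurement. The following is a minimal sketch only, written in Python for illustration. The two search routines and the choice to count comparisons as a stand-in for processing effort are assumptions made for this example and are not drawn from the study design. Both routines are equally effective because they return the same correct position, but they differ in efficiency.

```python
def linear_search(items, target):
    """Check each item in turn; return (position, comparisons made)."""
    comparisons = 0
    for index, item in enumerate(items):
        comparisons += 1
        if item == target:
            return index, comparisons
    return -1, comparisons


def binary_search(sorted_items, target):
    """Repeatedly halve a sorted list; return (position, comparisons made)."""
    comparisons = 0
    low, high = 0, len(sorted_items) - 1
    while low <= high:
        middle = (low + high) // 2
        comparisons += 1  # one item examined per pass through the loop
        if sorted_items[middle] == target:
            return middle, comparisons
        if sorted_items[middle] < target:
            low = middle + 1
        else:
            high = middle - 1
    return -1, comparisons


if __name__ == "__main__":
    data = list(range(1, 1001))  # the sorted numbers 1 to 1000
    print("Linear search:", linear_search(data, 1000))
    print("Binary search:", binary_search(data, 1000))
```

A strong exam answer makes the same move: it names the measure (here, the number of items examined) and states how the new approach improves it.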

Question 8
Group        Requirements
Presenters   Functional: Sound quality is high.
             Non-functional: Easy for listeners to find their show and listen to it
Listeners    Functional: Easy to navigate and select shows for their play-list
             Non-functional: Reliable / doesn’t crash or lose where they are up to in their play-list

Functional and non-functional requirements are a common exam question and it is important to not
only know the difference between them, but to be able to list things that come under each category.

Chapter 2
The Structure of a programming language

This chapter covers Unit 3: Area of Study 1 key knowledge:


• Data and information
o KK1.1 Characteristics of data types
• Approaches to problem solving
o KK1.5 Methods of representing designs, including data dictionaries, mock-ups, object
descriptions and pseudo-code
o KK1.7 A programming language as a method of developing working modules that meet
specific needs
o KK1.8 Naming conventions of solution elements
o KK1.9 Processing features of a programming language, including classes, control
structures, functions, instructions and methods
o KK1.14 Purposes and characteristics of internal documentation, including meaningful
comments and syntax.

Key terms: binary, kilobyte, data types, boolean, floating point, integer, string, date, control structure,
sequence, selection, repetition (iteration), subroutine, function, method, event, class, internal
documentation, naming convention, mock-ups, data dictionary, object descriptions, algorithm, top-
down design.

Study roadmap

[Table: Chapters 1 to 10 mapped against Unit 3 Area of Study 1 (Programming), Unit 3 Area of Study 2
(Analysis and Design), Unit 4 Area of Study 1 (Development and Evaluation) and Unit 4 Area of Study 2
(Cybersecurity: software security).]

What is binary?

Within any computer, the basis of data representation is essentially the presence of particular
voltages on a series of wires. If there is a voltage on a single wire, then it is said to be carrying
the value ‘1’. If there is no voltage, then it is said to be ‘0’. This is the basis of binary and all
signals within a computer are represented in this way.

Within the first computers that were built, these signals were moved around the inside of the
computer (on the motherboard) through the use of parallel wires known as a bus. These first buses
consisted of eight wires. As binary is a base 2 number system, the total number of combinations that
can be represented using eight wires (that is from ‘00000000’ all the way up to ‘11111111’) is 256.
As the bus on a computer’s motherboard serves the purpose of moving data from one section of the
motherboard to another and a great deal of what is moved around inside a computer is actually text,
a code was needed to represent this.

ASCII (the American Standard Code for Information Interchange) was created in 1963 to serve as one
of the standard systems for representing text in computer systems. The ASCII table translates 7-bit
binary codes to the letters of the keyboard and other special characters and is still used today
(although its form has changed).

A single binary digit is known as a ‘bit’ and 8 bits are known as a ‘byte’.

Larger chunks of binary

Soon the quantity of binary data that we were storing grew larger, as did the width of the computer
bus on the motherboard. By increasing the size of the bus, computers were able to move more data
around more quickly. Larger units were needed to represent these greater quantities.

A kilobyte represents 1000 bytes (‘kilo’ means 1000). Technically speaking, as binary is a base 2
number system, a kilobyte is not exactly 1000 bytes, but is equal to 1024 bytes (which is 2 to the
power of 10). ‘Kilo’ was adopted as the prefix as the number was so close to ‘1000’ that it was the
most convenient measure. Each increase of scale (by a factor of roughly 1000) has seen a different
unit defined. Figure 3 summarises the ones that are used commonly today (or soon will be).

Units of storage
Name       Abbreviation   Size                              Power of 2 (bytes)
Byte       None           8 bits                            0
Kilobyte   KB             1,024 bytes                       10
Megabyte   MB             1,048,576 bytes                   20
Gigabyte   GB             1,073,741,824 bytes               30
Terabyte   TB             1,099,511,627,776 bytes           40
Petabyte   PB             1,125,899,906,842,624 bytes       50
Exabyte    EB             1,152,921,504,606,846,976 bytes   60

Figure 3: Units of storage

Characteristics of data types

Data types: The forms that an item of data can take, including binary (as represented in images and
sound), Boolean, character and numeric, characterised by the kind of operations that can be
performed on it. Depending on the software being used, these data types can be divided into more
specific types, for example integer and floating point, which are numeric types. More sophisticated
types can be derived from them, for example a string of characters or a date type, and their names
may vary, such as text data type versus string data type.
VCAA VCE IT Study Design 2020-2023 Glossary

An understanding of the data types that can be utilised in a programming language is paramount to
the creation of an efficient software solution. Many would argue that as memory capacities and CPU
clock speeds have increased, the need for the careful selection of data types has become less of an
issue. However, the choice of variables used in a software solution has direct implications for
storage sizes, processing times and accuracy. The larger the program being developed, the more
influence these choices have on the efficiency (running time and memory used) of the finished
solution.

Figure 4 lists some of the different variable types and their typical ranges. Different names can be
used for these types in each language and their storage size and value range can vary. There are
also many other data types that are available, but our focus in this chapter is on the standard
types listed in this table.

Some of the ranges in Figure 4 may appear quite strange. They are, however, not arbitrary values, but
are determined by calculating the maximum number that can be stored using that many binary digits.

The structure of a programming language

Programming languages differ from each other in syntax and style, but there are elements that are
common to many of them. Features and elements within programming languages have changed over the
years with the advent of new generations of languages (referred to as ‘GLs’, for example, 3GL or
4GL). Many of the programming languages that are in use today are 3 and 4GLs, but 5GLs are in
development.

Languages are often classified as being either high or low level languages. The distinction between
these is based on the amount of translation that is needed to take the language from its entered
form to binary code (known as compiling). The more translation required, the “higher” the level of
language. It used to be the case that if a software developer wanted to achieve higher levels of
efficiency, they would code in a lower level language so that they could better control what was
happening at machine level. This is still true, but many of the modern programming languages strike
a good balance between ease of use and allowing the user to access machine level control if they
wish.

Instructions and syntax

Each programming language has a syntax that needs to be followed for the structure of the code as
well as the syntax of the commands. Programmers will often make use of language reference material
(which is a form of documentation) which will describe the syntax of all of the available commands.

Variable types and their typical ranges

Variable Type            Storage Size       Value Range
Boolean                  1 byte             True or False
Character (or Char)      2 bytes            0 – 65,535
Floating Point (or       12 bytes           +/- 79,228,162,514,264,337,593,543,950,335
Decimal or Real)                            with no decimal places
                                            +/- 7.9228162514264337593543950335
                                            with 28 decimal places (maximum)
                                            Smallest non zero number:
                                            +/- 0.0000000000000000000000000001
                                            or (+/-1E-28)
Integer                  4 bytes            -2,147,483,648 – 2,147,483,647
String                   Depends on the     Depends on the string length.
                         string length.

Figure 4: Variable types and their typical ranges
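
The link between storage size and value range can be checked with a few lines of code. The
following is a minimal Python sketch, not part of the study design. The function names
unsigned_range and signed_range are made up for this illustration, and the signed calculation
assumes the common two's complement layout, in which half of the bit patterns are used for
negative numbers.

```python
def unsigned_range(bits):
    """Smallest and largest whole numbers that fit in `bits` binary digits (no sign)."""
    return 0, 2 ** bits - 1


def signed_range(bits):
    """Range when half of the bit patterns are negative (two's complement, assumed)."""
    return -(2 ** (bits - 1)), 2 ** (bits - 1) - 1


# A 2-byte character uses 16 bits; a 4-byte integer uses 32 bits.
char_range = unsigned_range(2 * 8)
integer_range = signed_range(4 * 8)

print(f"Character: {char_range[0]:,} to {char_range[1]:,}")
print(f"Integer:   {integer_range[0]:,} to {integer_range[1]:,}")

# The storage units in Figure 3 are powers of two as well.
KILOBYTE = 2 ** 10
MEGABYTE = 2 ** 20
print(f"Kilobyte: {KILOBYTE:,} bytes, Megabyte: {MEGABYTE:,} bytes")
```

The two ranges printed match the Character and Integer rows of Figure 4, which is why those
values are not arbitrary.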


Control structures

There are three main control structures that every programming language contains. These are
sequence, selection and repetition (or iteration) structures. These control structures are the
core of any language and, without them, the language would not be able to function effectively.
A key part of learning to use any programming language is learning how to code these control
structures. Examples of how these are structured will be shown shortly.

Subroutines (or procedures)

The simplest sort of structure within a programming language is often a subroutine, also known
as a procedure, function or method. Subroutines have been part of programming languages since
the third generation. A subroutine is a contained section of code that can be called from within
any other part of the program (even from within the subroutine itself). Subroutines primarily
prevent programs becoming large with big sections of repeated code. Code that is useful and is
being used frequently can be placed into a subroutine and called as needed.

Figure 5: Overlaid diagrams showing how libraries, subroutines and methods are often represented

To aid in flexibility, subroutines can also accept parameters to vary the way they operate.
These parameters can be passed by value or by reference, which basically means the difference
between passing a copy of the data to the subroutine, and passing a pointer to the variables
themselves (which would then mean that the subroutine could alter them).

Subroutines are very useful in dividing a project up amongst a programming team. As they can be
written to accept parameters and change the values of these parameters by reference, they can be
coded almost like independent programs. Subroutines that are very useful can also be placed into
program libraries and used again and again across a variety of projects.

Functions

A function is very similar to a subroutine and in many ways can be considered a type of
subroutine. A function performs a task that is frequent or useful to the programmer and often is
used in a way that passes data back to the calling code. Common examples of these are
mathematical functions such as SIN(X) and COS(X). Functions like these ones accept a parameter
(in this case ‘X’) and pass data back via the name of the function.

Both procedures and functions can be considered to be subroutines and their syntax and use is
defined by the language in which they are being used. A common distinction between procedures
and functions has been that procedures do not return data and functions do. Although this is
commonly true, some languages do not define procedures and functions in this way and indeed,
even languages that do, allow them to be used in ways that are contrary to this.

Methods and events

Object oriented (OO) programming languages (4GL) work in a very non-sequential way, as opposed
to many of the 3GL languages. Although languages such as Visual Basic.Net have their origins in
the third generation, they work in a fundamentally different way, triggering instead on events
and utilising methods within the code.
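
The subroutine, parameter and function ideas described in this section can be sketched in
Python. This is an illustration only: Python has no by-value or by-reference keywords, so the
sketch imitates the two behaviours by either copying a list before changing it or changing the
caller's list directly. All of the names used (add_bonus_to_copy, add_bonus_in_place, hypotenuse)
are made up for this example.

```python
import math


def add_bonus_to_copy(scores):
    """Works on its own copy, so the caller's list is untouched (like passing by value)."""
    scores = list(scores)           # make a private copy of the data
    for index in range(len(scores)):
        scores[index] += 5
    return scores


def add_bonus_in_place(scores):
    """Changes the caller's list directly (like passing by reference). Returns nothing."""
    for index in range(len(scores)):
        scores[index] += 5


def hypotenuse(a, b):
    """A function: accepts parameters and passes a value back to the calling code."""
    return math.sqrt(a * a + b * b)


original = [50, 60, 70]
boosted = add_bonus_to_copy(original)
print(original, boosted)    # the original list has not changed

add_bonus_in_place(original)
print(original)             # the original list has now been altered

print(hypotenuse(3, 4))
```

The first subroutine behaves like a procedure that received a copy of the data, the second
alters the caller's data, and the third passes a result back in the way SIN(X) and COS(X) do.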


Methods and events are similar in some ways and, in this sense, hard to separate. Events usually
refer to those things that happen while the program is running that are triggered by a user’s
interaction or by another object. When an event is triggered, the program executes the code that
is contained under the heading of the object and event in question. In this sense, events are
usually the equivalent to subroutines in 3GLs. For example, a button object may contain code in
several different event procedures, one of which could be a ‘click’ event. When the user of the
program clicks on the button, the code inside the click event procedure would be executed.

Methods are commands that can be used to interact directly with objects to change their
behaviour or have them perform a particular function. Every object in an OO programming language
will have certain behaviours associated with it that can be accessed via a method. As the type
of object can be quite varied, these methods are usually quite specific to the object type in
question. The important aspect of this to understand is that methods belong to objects. All
objects have specific methods that belong to them, and in some cases, these methods may be
unique.

Classes

Bringing these concepts together is the object known as a class. A class is a definition of an
object that has a number of methods and events associated with it. It can be duplicated and used
independently of the other instances of itself. Just as a variable can be declared to be of a
particular type, an object can be created that is of a particular class and named in a unique
fashion that allows it to be referenced on its own.

To use an analogy to explain this, imagine you have at your disposal a small robot dog. You are
able to interact with the dog in a finite number of ways. You can press a button on the dog’s
back a number of times to give it commands, pat the dog, ‘give’ the dog a toy bone or pick up
the dog. All of these interactions are recognised by the robot dog and can be considered to be
‘events’. The robot dog is also able to perform a number of specific actions. The dog can walk,
run, heel, beg and ‘bark’. All of these actions can be considered to be the ‘methods’ that
belong to the dog (that is, what the robot dog can do). All of these things can be considered to
form the class of robot dog.

If another robot dog were to be created using this class of robot dog, it too would be able to
be interacted with in the same ways and would be able to perform the same actions. It would
however act independently from the first robot dog and any other robot dogs that were obtained
and placed in the same environment. Most 3GLs and 4GLs have the ability to define classes and
doing so is a powerful way of building sophisticated software solutions drawing on previous code
and structures rather than writing everything from scratch.

The role of internal documentation

As well as writing documentation to enable the users to use the program effectively, it is
equally important to write internal documentation in the program’s code. Although there may not
be many circumstances in which programmers other than those that coded a program initially will
want to examine the code, internal documentation is a vital element in the software development
process. If the program has been created by a programmer while employed by a company, the
company owns the program and may wish to make changes long after the original programmer has
left the company. Similarly, some aspects of a program can be quite complex, and even the
original programmer may have trouble interpreting uncommented code if they need to make changes
to it in the future.

Figure 6: Internal documentation inside a Python program

Internal documentation should describe the function of key variables and procedures and should
also include an explanation of the naming and coding conventions that have been used. It is also
usual to include any references to code that has been sourced from places other than the author
or the company itself as well as include revision information describing who has contributed to
the software development and when.

Internal documentation does not have any effect on the efficiency of a software solution as the
compilation process ignores all comment lines and does not convert them to machine language.

Good internal documentation practices

Header comments

At the beginning of a module or software solution, a number of lines can be used to state the
name (or names) of those that have written the code, the date that the code was last updated,
the version number of the code and a general description of its purpose. Doing this gives
another developer some initial information about the code and may aid their understanding.

Use of white space

Blank lines should be used to separate groups of statements that have a different purpose. The
use of blank lines makes the code look less dense and aids readability.

Communicate the intended purpose

Make sure that when you are writing internal documentation, you are explaining what the code
should be doing as opposed to what function individual statements have. These two things may
seem to be the same, but to another developer trying to understand what the code is doing, it
will be easier for them if they know what the purpose of the statements is as opposed to simply
what they do.

Write your comments for someone else

In many cases, internal documentation will be used by the software developers that wrote the
original code. However, there will be many times when developers are called upon to write
modifications to existing code within an organisation. With this in mind, internal documentation
should always be written clearly and professionally.

Naming conventions for solution elements

Whether writing an algorithm or writing code, it is very important to decide on a naming
convention that will be used for all the variables and subroutines within the program. This
convention should then be applied consistently throughout and documented in some fashion,
possibly through internal documentation and certainly in some hard copy format.

The use of a naming convention is preferred for a number of reasons. In the long run, a naming
convention aids the programmer by reminding them of the function of variables and subroutines
that they may not have used or accessed for a while in the development of the software. In
addition to this, naming conventions aid those who are reading the program with a view to
understanding or modifying it later on.
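
The documentation practices above can be seen together in a short piece of code. The following
is a minimal Python sketch built on the robot dog analogy from earlier in this chapter. The
file name, author and every identifier in it (RobotDog, on_pat, action_log and so on) are
placeholders invented for this illustration, and the event is simulated by calling on_pat
directly rather than by a real sensor.

```python
# ---------------------------------------------------------------
# Module:   robot_dog.py  (hypothetical file name)
# Author:   A. Student    (placeholder)
# Version:  1.0
# Purpose:  Models the robot dog analogy: a class whose objects
#           respond to events and can perform methods.
# ---------------------------------------------------------------


class RobotDog:
    """A class is a definition; each dog created from it acts independently."""

    def __init__(self, dog_name):
        self.dog_name = dog_name
        self.is_happy = False
        self.action_log = []        # records what this particular dog has done

    # --- Methods: actions the dog can perform ---

    def bark(self):
        self.action_log.append("bark")
        return f"{self.dog_name} says woof"

    def walk(self, number_of_steps):
        self.action_log.append(f"walk {number_of_steps}")

    # --- Event procedures: code run when something happens to the dog ---

    def on_pat(self):
        # Purpose: a patted dog should show that it is pleased,
        # so it becomes happy and barks in response.
        self.is_happy = True
        return self.bark()


first_dog = RobotDog("Rex")
second_dog = RobotDog("Fido")

print(first_dog.on_pat())
print(first_dog.is_happy, second_dog.is_happy)
```

Note the header comment, the blank lines separating groups of statements, the comment in on_pat
that explains purpose rather than restating the code, and the consistent naming. Patting the
first dog has no effect on the second, because each object created from the class is
independent.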


‘Hungarian notation’

An example of a naming convention is the ‘Hungarian notation’ convention. ‘Hungarian notation’
is a popular naming convention for programmers in all programming languages, particularly 3GL.
It has a number of rules related to the naming of subroutines, variables and objects. When a
subroutine, variable or object is placed into a program, it should be named in the following
way: the first few letters (all lower case) should indicate the type of the particular
subroutine, variable or object. It doesn’t really matter what letters are used, but it is
important to be consistent. A button could use the letters ‘btn’ and a label could use the
letters ‘lbl’; similarly, an integer may begin with the single letter ‘i’, but an integer
tracking the size of a list or array may begin with the characters ‘sz’. After these
type-distinguishing letters, a word or number of words should be added that describe the purpose
of the subroutine, variable or object. They should be written in ‘camel case’ – where each word
begins with a capital letter.

For example, a button that is being used to calculate a tax rate could be named ‘btnTaxRate’. A
text box that is to be used for the entry of a person’s name could be called ‘txtUserName’.

The origins of Hungarian notation

In the early days of programming language development, the Chief Architect at Microsoft, Dr.
Charles Simonyi, introduced an identifier naming convention that added a prefix to the
identifier name in an effort to label the function or type of the identifier. The naming
convention proved to be very popular as it made the code easier to read by others and many of
Dr. Simonyi’s colleagues began using the convention for themselves. As it had no formal
definition, and in part because Dr. Simonyi was Hungarian, it became known as the ‘Hungarian
notation’.

Methods of representing designs

When considering the process of moving from the definition of a problem and its scope, to the
design of potential solutions, a number of tools will be used.

Annotated diagrams or mock-ups

By creating annotated diagrams (or mock-ups) that model the layout of the proposed software
solution, a software developer can easily convey the user interface and the way that the
solution will operate. A client can examine the annotated diagrams and give feedback based on
the way the designs address the functional requirements of the solution. They can also give
feedback on the way the user interface looks in terms of some of the attributes of user
interfaces discussed later in this chapter.

Figure 7: Annotated diagram for a currency converter program

Data dictionaries

A data dictionary is a table that lists and describes all of the variables that are (or will
be) used in a program or suite of programs. It essentially lists the meta-data for the variables
that a software solution will be using.


Usually, it lists the names of the variables, their type, size (in characters), scope and a
description of their function. Data dictionaries can be created that contain more information
than this depending on personal preference. An example is shown in Figure 8.

Variable Name       Type      Size   Scope    Description
ID_Number           Integer   6      Global   ID number of the product
strName             String    10     Global   Name of the product
strType             String    10     Global   The type or category of the product
boolGSTException    Boolean   1      Global   Is the product GST free or not
… etc.

Figure 8: An example of a data dictionary

Object descriptions

Object descriptions assist with the planning of the structure and content of objects that are to
be incorporated into the design of a software solution. There are many different ways of
representing object descriptions. The programming language being used will have an effect on the
content of an object description. Object descriptions are similar in style to data dictionaries
and can contain information such as the methods and events associated with an object. Figure 9
is a typical example of an object description. Note that an explanation of methods and events
can be found in the next chapter.

Both data dictionaries and object descriptions can be represented in a wide variety of ways.
There is not one standard for representing these types of design tool, but the important thing
is that you understand the purpose and function of both.

Algorithms

Once the requirements of a program are known, an algorithm can be created. An algorithm is a set
of ordered steps or instructions that accomplish some task or goal. A program will be written in
a specific computer language but an algorithm is not in any particular language. It defines, in
English, the basic steps needed to solve the problem, similar to a recipe. Just like a recipe,
algorithms also have specific symbols and formatting dependent on the type of algorithm being
used.

Programmers can sometimes be resistant to writing algorithms, but doing so has some significant
benefits. The main benefit is that problems can be avoided early - especially those that are
logic based. They also allow the scope of a task to be determined, which can be helpful when a
team of programmers are setting themselves up to begin a large programming task.

Object Name: btnCalculate


Name Type Description
btnCalculate_Click Event When the button is clicked, it will calculate the result of the
current transaction and display the result on the screen.
btnCalculate_DblClick Event When the button is double clicked, it will lock the calculate
function so that no other buttons can be clicked.
btnCalculate.Enabled Method This method will be used to disable or re-enable the button while
the program is running. It can be set to ‘true’ or ‘false’.
btnCalculate.Visible Method This method will be used to make the button visible or invisible
based on the user’s access level. It can be set to ‘true’ or ‘false’.
btnCalculate.Focus Method This method can be used to place the cursor on the button so the
user can only select this. It will be used at appropriate times to
guide the user through the ordering process.
Figure 9: An example of an object description
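
To show how an object description such as Figure 9 might translate into code, here is a minimal
Python sketch. It does not use any real GUI toolkit: the Button class, its attributes and the
btn_calculate_click procedure are all hypothetical stand-ins written for this example, and the
transaction values are made up.

```python
class Button:
    """Hypothetical stand-in for a GUI button; not taken from any real toolkit."""

    def __init__(self, name):
        self.name = name
        self.enabled = True
        self.visible = True
        self.has_focus = False
        self.click_handler = None       # event procedure to run on a click

    def focus(self):
        """Method: place the cursor on this button."""
        self.has_focus = True

    def click(self):
        """Simulates the user clicking; triggers the click event if allowed."""
        if self.enabled and self.visible and self.click_handler is not None:
            return self.click_handler()
        return None


def btn_calculate_click():
    """Event procedure: calculate the result of the current transaction."""
    price, quantity = 4.50, 3           # made-up transaction values
    return price * quantity


btn_calculate = Button("btnCalculate")
btn_calculate.click_handler = btn_calculate_click

print(btn_calculate.click())            # handler runs while the button is enabled

btn_calculate.enabled = False
print(btn_calculate.click())            # a disabled button ignores the click
```

The event procedure holds the code that runs when the user acts on the object, while the
program itself uses Enabled, Visible and Focus to control the object.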


Top-down design

Just as writing a program from scratch is a daunting task, writing an algorithm from scratch is
equally daunting. Software developers often use a method for breaking the problem down into
small parts, known as “top-down design”.

The first step in writing any algorithm is defining what is to be achieved. That is, what is the
goal of the program? This may be outputting the result of a calculation, displaying a list of
various items or communicating across a network.

The next step is determining what information is required in order to achieve this goal. What
information will need to be gathered or input? What processing needs to occur?

When considering these things, a rough order of tasks that need to be performed should be
constructed. These tasks may be quite large at this stage of the process, but that’s the way
that top-down design works. Each task can now be broken down into smaller, more manageable
parts. If these smaller parts are still too large, they may be further broken down until the
products of these divisions are very simple programming tasks to achieve.

Algorithmic representations

There are three different control structures that an algorithm can employ. These are sequence,
selection and repetition (or iteration). By using these three structures in any combination, any
logic problem can be solved. A small definition of each follows below:

Sequence

Tasks that are performed in sequence lead from one to the other in that order. Any number of
tasks may be executed in sequence. Each task must be performed once – there is no possibility of
skipping a task or branching off to perform a different task.

Selection

In a selection structure, a condition is tested (or a question is asked), and depending on the
answer, one of a number of courses of action (usually two) is taken. After this has been done,
the program moves on to the next task in sequence. A selection structure is sometimes referred
to as an ‘if-then-else’ statement because of the way in which it instructs the program.

Repetition (or iteration)

In a repetition structure, a number of tasks are performed a set number of times or until a
condition is met. Three types of repetition are available: fixed, test at end and test at
beginning. Examples of each of these structures can be found in the following sections.

A common method of representing an algorithm is using a type of algorithmic language known as
pseudocode. Pseudocode was originally designed to be easy to write and understand and it is
preferred by many software developers for these reasons.

Pseudocode

Pseudocode algorithms are written in plain English. As with any language, there are rules that
need to be followed, although there are not many of these to remember.

The following conventions should be used when writing pseudocode algorithms:


1   Start … Stop
    Or
    Begin … End

    These should be used to indicate the beginning and end of any program or subprogram. It
    does not matter which pair is used, as long as there is consistency throughout the
    algorithm.

2   Action 1
    Action 2

    Sequence is represented by writing statements underneath each other as shown.

3   If Condition
    Then
        Action 1
        Action 2
    End If

    Selection is represented like this. If the condition is true, then the statement(s)
    following the ‘then’ are carried out.

4   If Condition
    Then
        Action 1
        Action 2
    Else
        Action 3
        Action 4
    End If

    Expanded version of the above convention. If the condition is true, then the statement(s)
    following the ‘then’ are carried out. Otherwise, if the condition is false, a set of
    different actions are taken.

5   Case of variable
        value1: Action 1
        value2 to value3: Action 2
        value4+: Action 4
    End Case

    A ‘Case’ statement can be used to simplify conditions where there are many different values
    that are being checked. An action can be performed if the variable is a specific value,
    within a range or greater than a certain value.

6   Count ← number

    Assigning values to variables is done with an arrow, which serves to show the flow of data.
    Note that conditions should still use the ‘=’ sign when two objects are compared to each
    other.

7   For Count ← first to last
        Action 1
        Action 2
    Next Count

    Repetition can be performed in a variety of ways, and pseudocode has a different way of
    representing each. This is a fixed loop that repeats a set number of times.

8   Repeat
        Action 1
        Action 2
    Until Condition

    This is used to continually repeat a loop until some condition becomes true. This is an
    example of the ‘test at end’ type of repetition discussed earlier.

9   While Condition Do
        Action 1
        Action 2
    End While

    This loop is used when a condition must be true before starting the loop. This is an
    example of the ‘test at beginning’ type of repetition also discussed earlier.

10  Process, Display, Increment
    Add, Multiply, Divide, Subtract
    Calculate, Sum
    Input, Output
    <>, >, <, <=, >=

    These are some examples of the more commonly used pseudocode terms. It is not important to
    use all of these, but consistency is important. ‘<>’ is used to indicate ‘not equal to’.

Figure 10: Pseudocode conventions
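
As a rough guide to how the selection and repetition conventions in Figure 10 carry over to a
real language, the following Python sketch shows one possible equivalent of each. The function
names and the pass mark of 50 are invented for this example. Python has no built-in
Repeat … Until statement, so the ‘test at end’ loop is imitated here with a loop that only
checks its condition at the bottom.

```python
def describe(mark):
    """Selection: one of two courses of action is taken."""
    if mark >= 50:
        return "pass"
    else:
        return "fail"


def fixed_loop(first, last):
    """Fixed repetition, like: For Count <- first to last."""
    visited = []
    for count in range(first, last + 1):
        visited.append(count)
    return visited


def while_loop_passes(value, limit):
    """While ... Do: the condition is checked before each pass."""
    passes = 0
    while value < limit:
        value += 1
        passes += 1
    return passes


def repeat_until_passes(value, limit):
    """Repeat ... Until: the body runs once before the condition is checked."""
    passes = 0
    while True:
        value += 1
        passes += 1
        if value >= limit:          # the 'Until' condition
            break
    return passes


print(describe(72))
print(fixed_loop(1, 5))

# Starting with the end condition already met shows the difference:
print(while_loop_passes(10, 10))    # body never runs
print(repeat_until_passes(10, 10))  # body still runs once
```

When the end condition is already met at the start, the ‘test at beginning’ loop performs no
passes while the ‘test at end’ loop performs exactly one.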

Note that the main difference between the ‘test at end’ and ‘test at beginning’ repetition
structures is that if the condition is such that the loop will end, the ‘test at end’ structure
will still execute what is inside it once while the ‘test at beginning’ will not execute what is
inside it at all.

The example in Figure 11 is a typical algorithm which starts by setting a variable ‘Pass’ to
‘unlucky’. The user is then prompted to enter a password and this is placed into the variable
‘Password’. A ‘test-at-beginning’ loop compares the password that has been entered to the one
held in the variable ‘Pass’. If they are not the same, the user is prompted to re-enter the
password and it is checked again in a continuous loop. If the user correctly enters the
password, the loop finishes and an ‘Access granted’ message is displayed on the screen.

Pseudocode example 1

Start
    Pass ← “unlucky”
    Display “Enter the password”
    Input Password
    While Password <> Pass Do
        Display “Wrong password – try again”
        Input Password
    End While
    Display “Access granted”
Stop

Figure 11: Pseudocode example for a simple password program

You will see in the example in Figure 12 that the actions are described in English and the
structure of the problem is well laid out and easy to read. Some of the structures described
above have been used and the algorithm has been indented where appropriate. This example shows
how a pseudocode algorithm can be used to describe a real-life task – in this case, washing the
dishes.

Pseudocode example 2

Start
    {Washing the dishes}
    Put plug in sink
    Fill sink with hot water
    Add detergent
    While dishes left to wash Do
        Take dish from bench
        Put dish in water
        Repeat
            If dish is a pot then
                Scrub pot with steel pad
            Else
                Wipe dish with sponge
            End If
        Until dish is clean
        Put dish in drying rack
    End While
Stop

Figure 12: Pseudocode example for washing the dishes
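
The password algorithm in Figure 11 can be turned into code almost line for line. The following
is a minimal Python sketch, under the assumption that the typed passwords are supplied as a list
so the example can be run without a keyboard. The function name run_password_check and its
parameters are made up for this illustration. In an interactive program, each call to
next(typed) would be replaced by a call to input().

```python
def run_password_check(attempts, stored_password="unlucky"):
    """Follows Figure 11; `attempts` stands in for what the user types."""
    messages = []
    typed = iter(attempts)      # raises StopIteration if the attempts run out

    messages.append("Enter the password")
    password = next(typed)

    # Test-at-beginning loop: skipped entirely if the first attempt is right.
    while password != stored_password:
        messages.append("Wrong password - try again")
        password = next(typed)

    messages.append("Access granted")
    return messages


for line in run_password_check(["guess", "letmein", "unlucky"]):
    print(line)
```

Because the loop tests its condition at the beginning, a correct first attempt produces no
‘Wrong password’ message at all.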


Context Questions

1. Why is a kilobyte equal to 1024 bytes and not 1000 bytes?


2. What is the difference between a high level and a low level language?
3. Why is it important to understand the syntax of a programming language?
4. What benefits does using subroutines have over not using them at all?
5. What is the difference between a subroutine and a function?
6. What is the difference between a method and an event?
7. Explain why the inclusion of internal documentation does not have an adverse effect on the
efficiency of a program.
8. Give reasons why the use of a naming convention is beneficial.
9. What advantages are there in using coding conventions for the naming of program elements?
10. In what ways can a data dictionary assist a software developer in the design process?
11. Describe the process known as ‘top-down design’.
12. List the three different control structures that an algorithm can employ.
13. What does a ‘Case’ statement do that makes it useful to a software developer?
14. What implications does the choice of variable type have on the efficiency of a software
solution?

Applying the Concepts

• Write an algorithm to describe a household chore that you currently perform.

Key Skills Checklist

At the conclusion of this chapter, you should be able to address the following key skills. Mark each off
as you can achieve them.

Explain how data is represented inside the computer

List and explain the differences between the main data types

Create pseudo-code examples of the sequence, selection and repetition control structures

Explain the difference between subroutines, functions, events and methods

Explain what classes consist of and their benefits

Discuss the benefits of internal documentation and naming conventions

Create an annotated mock up to represent the design of an interface

Understand how data dictionaries and object descriptions are used

Write simple pseudo-code algorithms


Sample Examination Questions

The following sample examination questions can be attempted to test your knowledge of the content
of this chapter.

Question 1
The process of breaking a problem down into smaller parts is known as:
A. an algorithm
B. top down design
C. bottom up design
D. sideways refinement

Question 2
A character data type stores data that is:
A. A single ASCII character
B. A prime number
C. A positive integer
D. True or false

Question 3
Which of the following is not a benefit of writing an algorithm?
A. Find errors in logic before coding
B. Test features of the user interface
C. Makes the processing more efficient
D. Makes it easy to divide the project amongst members of a team

Question 4
Which of the following is not a structure available in an algorithm?
A. Manipulation
B. Sequence
C. Iteration
D. Selection

Question 5
Internal documentation is:
A. A user guide which is included inside the code of a software solution
B. Documentation that is only intended for those internal to the organisation
C. Documentation intended for those that might make changes to the code
D. Documentation that is opened by typing an administrator password

The following information is required for Questions 6 and 7.

btnProcess
Name Type Description
btnProcess_Click Event Calculate the final result

btnProcess_DblClick Event Lock the button

btnProcess.Enabled Method Lock or enable the button


Question 6
The tool displayed above can best be described as a:
A. Context Table
B. Data Flow Diagram
C. Object Description
D. Data Dictionary

Question 7
The person coding this object wants to add a tool tip that will appear when the mouse hovers over
the object. This would be classified as:
A. A method
B. A variable
C. A boolean value
D. An event

The following information is required for Questions 8, 9 and 10.


Variable Name Type Size (bytes)
ID Integer 1
Title String 10
Classification String 10
Borrowed Boolean 1

Question 8
The tool displayed above can best be described as a:
A. Context Table
B. Data Flow Diagram
C. Variable List
D. Data Dictionary

Question 9
The maximum number of unique ‘ID’ numbers that could be stored would be:
A. 16
B. 256
C. 255
D. 2056

Question 10
A Boolean data type stores data that is:
A. Encrypted
B. A prime number
C. A positive integer
D. True or false

Question 11
It is always good programming practice to use a naming convention.
a. Describe one naming convention that can be used when naming variables.

__________________________________________________________________________________

__________________________________________________________________________________
1 mark


b. Give two advantages of using a naming convention.

Advantage 1: _______________________________________________________________________

__________________________________________________________________________________

Advantage 2: _______________________________________________________________________

__________________________________________________________________________________
2 marks

c. Explain how not using a naming convention can lead to increased development costs in the
future.

__________________________________________________________________________________

__________________________________________________________________________________
1 mark

Question 12
a. What is internal documentation?

__________________________________________________________________________________
1 mark

b. Describe a situation in which not having any internal documentation would be a problem.

__________________________________________________________________________________

__________________________________________________________________________________
1 mark

c. Martin and Julee are having a debate about internal documentation. Martin believes that
every line of code should be accompanied by at least one line of internal documentation. Julee
argues that a few lines of documentation per event procedure is enough.

Discuss the pros and cons of each point of view.

__________________________________________________________________________________

__________________________________________________________________________________

__________________________________________________________________________________

__________________________________________________________________________________

__________________________________________________________________________________
4 marks


Question 13
A variable type that can only have two values: ‘true’ or ‘false’ is called ___________________.
1 mark

Question 14
Consider the diagram shown below:

a. What is the name for a diagram of this type?

__________________________________________________________________________________
1 mark

b. Who has this diagram been produced for?

__________________________________________________________________________________
1 mark

c. At what stage of the Problem Solving Methodology would a diagram such as this be made?

__________________________________________________________________________________
1 mark

d. List two benefits of producing a diagram of this type (as opposed to not producing one at
all).

Benefit 1: _________________________________________________________________________

Benefit 2: _________________________________________________________________________
2 marks

Question 15
What is the difference between a method and a function?

__________________________________________________________________________________

__________________________________________________________________________________
2 marks


Sample Examination Answers

Question 1
Answer: B

Question 2
Answer: A

Though programming languages used across schools vary, the standard data types listed in the study
design are the ones that you need to be familiar with.

Question 3
Answer: B

The design of a user interface, while it also takes place in the design stage, is a different process to
writing an algorithm.

Question 4
Answer: A

Manipulation is something that is done within an algorithm, but it is not a building block in its own
right; it is spelled out using sequence, selection and iteration.

Question 5
Answer: C

Question 6
Answer: C

Question 7
Answer: D

This question is aimed at the difference between methods and events. Events are (generally) triggered
when objects are interacted with, though events can also be triggered by other objects or the object
itself. A method is contained within the definition of the object and can be used to change its function
or appearance.

Question 8
Answer: D

Question 9
Answer: B

One byte (8 bits) can hold 2^8 = 256 different values: the numbers 0 to 255.

Question 10
Answer: D


Question 11
a. Hungarian notation, in which a prefix on each variable name indicates its type (for example,
intTotal).
b. i – Makes it easier to debug the program, ii – Makes it easier to see the function of variables
(their type, scope and purpose)
c. Anyone doing further development on the program will need to decipher what the variables
do and this will take time – leading to an increase in costs.

Question 12

a. Internal documentation relates to comments (documentation) placed within the code to
make it easier for programmers to understand and modify the code at a later date.
b. Not having any internal documentation makes modifying the code later on difficult –
especially if it is being done by someone other than the person that wrote the code.
c. Martin: documenting each line of code will be very time consuming. It will, however, mean
that the code is very easy to understand at a later date.
Julee: documenting each event procedure with a few lines will be much more practical and
less time consuming. However, there will be event procedures that are long and complex and
if Julee limits her comments to just a few lines, this may not be enough to fully explain the
function of the code.

When a question uses the stem ‘discuss’, this means that you need to discuss the pros and cons of two
alternatives. This is also reflected in the marks for the question which are set at 4 – meaning that the
examiners will be expecting you to make 4 points.

Question 13
Answer: Boolean

Definition of a variable type.

Question 14
a. annotated diagram or a mock up
b. software developer
c. design
d. 1: have a clear idea of what the solution will look like and how it will be laid out
2: can be used to show the client and gain feedback on the design

Question 15
A function is a subroutine that returns a value. A method is a subroutine that is associated with a
specific object.
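
The distinction can be seen in a short Python sketch. The names used here (area_of_rectangle, Counter,
increment) are hypothetical and chosen only for illustration.

```python
def area_of_rectangle(width, height):
    """A function: a subroutine that returns a value."""
    return width * height


class Counter:
    """A hypothetical object with one method."""

    def __init__(self):
        self.count = 0

    def increment(self):
        """A method: a subroutine that belongs to a Counter object."""
        self.count = self.count + 1


area = area_of_rectangle(3, 4)   # the function call returns a value

clicks = Counter()
clicks.increment()               # the method call acts on the 'clicks' object
clicks.increment()
print(area, clicks.count)
```

The function is called on its own and hands back a result, while the method is called through an
object and changes that object.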

Chapter 3
Data structures and data manipulation

This chapter covers Unit 3: Area of Study 1 key knowledge:

• Data and information
o KK1.2 Types of data structures, including associative arrays (or dictionaries or hash tables),
one-dimensional arrays (single data type, integer index) and records (varying data types,
field index)
• Approaches to problem solving
o KK1.6 Formatting and structural characteristics of files, including delimited (CSV), plain text
(TXT) and XML file formats
o KK1.10 Algorithms for sorting, including selection sort and quick sort
o KK1.11 Algorithms for binary and linear searching
o KK1.12 Validation techniques, including existence checking, range checking and type
checking
o KK1.13 Techniques for checking that modules meet design specifications, including trace
tables and construction of test data.

Key terms: data structure, one-dimensional array, record, file, associative array, hash table, linear
search, binary search, selection sort, quick sort, CSV, TXT, XML, validation, existence check, range
check, type check, testing, syntax error, logic error, run-time error, debug, trace table, testing table,
test data.

Study roadmap

[Table: chapters 1 to 10 set against the four areas of study. Unit 3: Area of study 1 (Programming)
and Area of study 2 (Analysis and Design). Unit 4: Area of study 1 (Development and Evaluation) and
Area of study 2 (Cybersecurity: software security).]

Types of data structures

A data structure is a way of storing or organising data so that it is easier to access or more
efficient to use. Let’s first discuss the simple data structures that can be used, and later we will
examine some of the more complex ones.

One-dimensional arrays

The implementation of an array can vary among programming languages. Stated simply, a
one-dimensional (or 1D) array is a data structure in which variables are grouped together under the
same name and accessed via an ‘index’. The ‘index’ is most commonly an integer value which starts at
the value ‘0’. Although an array typically contains only one data type, it can consist of multiple
data types (as is possible in some languages).

For example, let’s say that we had four numbers that needed to be stored. We could store the four
numbers using integer variables called ‘Number1’, ‘Number2’, etc. In memory, this would look like the
diagram shown in Figure 13.

Number1   Number2
Number3   Number4

Figure 13: Storing numbers in single variables versus an array

There is nothing wrong with declaring these four variables and using them in a program. The problem
with this approach arises in the way in which the variables will be used. If the programmer wishes to
ask the user to enter a number into each of these variables and then display a total, it would
probably be done as shown in Figure 14 below.

Adding four numbers together

Begin
    Total ← 0
    Input Number1
    Input Number2
    Input Number3
    Input Number4
    Total ← Number1 + Number2 + Number3 + Number4
    Display Total
End

Figure 14: Adding four numbers together

A 1D array would group all of the number variables together in a way that would allow them to be
accessed via an ‘index’ which would point to the one we wanted. This could be represented like the
diagram shown in Figure 15 below.

The declaration of an array of integers called ‘Number’ to hold 4 items would look like this:

Declare Number(4) of integer

If we were to set the value of:

Number(0) ← 20

then the array would look like this:

Number
Index   0    1    2    3
Value   20

If we were to execute the lines shown below:

Number(1) ← 72
Number(2) ← 37
Number(3) ← 30

Then the array would look like this:

Number
Index   0    1    2    3
Value   20   72   37   30

Figure 15: Storing numbers in single variables versus an array

Note that the index of an array begins at ‘0’ (as it does in most programming languages). For example,
if an array is declared with 100 elements, they will be numbered from ‘0’ to ‘99’ (which is 100
elements in total). This is because the index is used as an ‘offset’ to reference the memory
locations the array is using, which are stored sequentially in memory. That is, the first element of
the array has an index of ‘0’ because it is located at the memory location pointed to by the variable
name of the array. The second element of the array has an index of ‘1’ as it is one memory location
in advance of the first one (and so on).

The same program to input the four numbers and then calculate and display the total could look like
Figure 16 on the next page.
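
As a rough illustration, the two approaches (separate variables versus a 1D array) might be written
in Python as below. This is a sketch only: a Python list stands in for the array, the function names
are hypothetical, and the four values are assigned directly rather than typed in by the user so that
the example is easy to run.

```python
def total_with_variables(number1, number2, number3, number4):
    """Non-array version: one named variable per value."""
    return number1 + number2 + number3 + number4


def total_with_array(number):
    """Array version: visit each element by its index, starting at 0."""
    total = 0
    for index in range(len(number)):   # 0, 1, 2, 3 for four elements
        total = total + number[index]
    return total


number = [0] * 4      # plays the role of 'Declare Number(4) of integer'
number[0] = 20
number[1] = 72
number[2] = 37
number[3] = 30

print(total_with_variables(20, 72, 37, 30))
print(total_with_array(number))
```

Both calls print 159. Handling more values only means making the list longer; the loop itself does
not change.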


Adding four numbers using an array

Begin
    Total ← 0
    For Loop ← 0 to 3
        Input Number(Loop)
        Total ← Total + Number(Loop)
    Next Loop
    Display Total
End

Figure 16: Adding four numbers using an array

At this point in the comparison, you may be thinking that the non-array version essentially consists
of seven lines of code while the 1D array version consists of six lines of code to do the same thing.
This is not a huge improvement, and you could argue that the 1D version is more complicated to code.
However, this is not a good example to demonstrate the real strength of 1D arrays. If we were dealing
instead with 500 numbers, the non-array version of the program would need to ask the user to input
each of the 500 numbers (using 500 separate lines of code). That would increase the size of the
program to 503 lines, and would mean implementing a new version of the program if the program had
already been released. The 1D array version would only need a small change to the size of the loop
and would remain a program of only six lines.

Records

A record is a structure that can be used to group together variables for a particular purpose.
Records are similar to arrays except that where an array usually contains elements all of the same
type, the variables within a record may be of different types and sizes. The elements of a record are
usually accessed via an identifier (a field name) which is declared at the same time as the record.

For example, it might be necessary to store contact information for customers, like their name,
address, suburb and telephone number. You could certainly store this information in a number of
arrays (using the index of the array to indicate which customer’s details you were accessing). Some
difficulties arise when data is represented like this. One problem is that each separate array would
not make a lot of sense on its own. Examining the sixth phone number would not tell you whose phone
number it was. You would need to also access the sixth element of the name or surname arrays to
discover this. The main difficulty with such an arrangement comes when the information needs to be
sorted in some way. Sorting a number of parallel arrays (and keeping them in sync with each other) is
a difficult task. A record designed to store customer information would group all of the different
variables together as shown in Figure 17.

Customer_Details
Name   Surname   Address   Suburb   State   Telephone

In an algorithm, values could be placed into the different parts of this record like this:

Customer_Details.Name ← “Phillip”
Customer_Details.Surname ← “Rivers”
Customer_Details.Address ← “217 Williams Rd.”
Customer_Details.Suburb ← “South Yarra”
Customer_Details.State ← “Vic.”
Customer_Details.Telephone ← “0414297564”

Figure 17: An example record
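
The record in Figure 17 might be sketched in Python using a dataclass, as below. The class name and
the second customer are invented for illustration. The sort at the end shows why records are easier
to handle than parallel arrays: each record moves as a single unit.

```python
from dataclasses import dataclass


@dataclass
class CustomerDetails:
    """One record: named fields grouped under a single identifier."""
    name: str
    surname: str
    address: str
    suburb: str
    state: str
    telephone: str


customer = CustomerDetails(
    name="Phillip",
    surname="Rivers",
    address="217 Williams Rd.",
    suburb="South Yarra",
    state="Vic.",
    telephone="0414297564",
)

# A second, made-up customer so that there is something to sort.
other = CustomerDetails("Anna", "Nguyen", "5 High St.", "Kew", "Vic.", "0400000000")

# Sorting by surname keeps every customer's fields together.
customers = sorted([customer, other], key=lambda c: c.surname)

print(customers[0].surname, customers[0].telephone)
```

After sorting, the first record is the made-up customer (Nguyen) and her telephone number is still
attached to her, with no extra work needed to keep the fields in sync.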


A record like this is quite useful, but as it stands, it is storing one set of information. In
practice, records are placed into arrays so that large sets of information can be stored (which is
effectively the same as a database table).

Using data structures to organise data

Associative arrays (or dictionaries)

An associative array is essentially a way of connecting two pieces of information together. It will
not be the main repository for the data within a system, but will instead aid in the organisation of
that data by associating a transaction or event to it. The way that it does this is by using the main
key value used to index the data and then connecting it to another value to form a (key, value) pair.
Not every possible key value has to be represented in the associative array but, by definition, the
keys should be unique identifiers.

Let’s say that we need to record the names of people that are staying in a hotel (by the room
number). If a person is staying in a room, then we record their name against the room number,
otherwise we don’t record the room number in our array at all. An associative array to represent this
might look like the one in Figure 18.

“Room101” : “Jameson, Elaine”
“Room105” : “Phillips, Mike”
“Room106” : “Canter, Judy”
“Room204” : “Jameson, Elaine”
“Room209” : “Le, Jenny”
“Room210” : “Teo, Manti”

Figure 18: Storing information in an associative array

There are a few things to note about associative arrays that are illustrated by this example.

Firstly, an association consists of a binding and this is often shown by using a ‘:’. Some
programming languages (Python and JavaScript, for example) actually make use of the ‘:’ character in
their implementation of associative arrays.

The second thing to note is that not all of the rooms in the hotel are represented. In fact, it is
not clear how many rooms are present in the hotel.

The third thing to note is that a person can effectively book more than one hotel room under their
name. The ‘key’ cannot be repeated but the ‘value’ can be.

A hash table is a type of implementation of an associative array.

Hash tables

A hash table is essentially a way of implementing an associative array in an array of a smaller size
than the known set of key values. If an associative array is able to be used in an array that is the
same size as the total number of key values, then using a hash table is not really required. Let me
illustrate this with an example.

The hotel room associative array would not be a good candidate for a hash table as the number of
rooms in the hotel is a known quantity and is also relatively small. The overhead in storage and
processing time is minuscule in a situation like this. However, let’s instead imagine that we want to
record the occurrence of a word in an essay (not how many times a word has appeared but just that the
word has been used). By some estimates, the number of words in the English language is now just over
1,025,000. We could make an array of this size so that we could represent a strict key:value binding
for every single word. While that would work, there are some problems with doing this. Firstly, the
number of words in the English language is increasing, so our data structure would require frequent
modification to stay in line with this. In addition, it is highly unlikely that an essay would use
every single word in the English language. A typical essay may only have a few hundred unique words,
so the use of an array of over one million elements will mean there is a lot of wasted storage space
and a processing overhead to examine all of these items.

Instead a better solution is to use a smaller array (as a hash table) and place values into it using
a hash function.
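
Both ideas can be sketched in Python. The dict below mirrors part of the hotel example, and the small
hand-written table shows one way of placing words into a smaller array using a hash function. The
table size of 26 x 50 and the hash function itself (first letter, plus a sum of the remaining
character codes) are assumptions chosen for illustration, not a recommended design.

```python
# Associative array: Python's built-in dict binds each unique key to a value.
rooms = {
    "Room101": "Jameson, Elaine",
    "Room105": "Phillips, Mike",
    "Room204": "Jameson, Elaine",   # a value may repeat; a key may not
}
occupant = rooms.get("Room102")     # None: an empty room is simply absent

# Hand-written hash table that records which words occur in an essay.
TABLE_SIZE = 26 * 50                # assumed size: 50 slots per first letter


def hash_word(word):
    """Hypothetical hash function: the first letter chooses a block of
    50 slots and the remaining letters choose a slot inside that block."""
    first = (ord(word[0]) - ord("a")) % 26
    rest = sum(ord(ch) for ch in word[1:]) % 50
    return first * 50 + rest


table = [[] for _ in range(TABLE_SIZE)]   # every slot holds a small bucket


def add_word(word):
    word = word.lower()
    bucket = table[hash_word(word)]
    if word not in bucket:                # record each word only once
        bucket.append(word)


def contains(word):
    word = word.lower()
    return word in table[hash_word(word)]


for w in "The cat sat on the mat".split():
    add_word(w)

print(contains("cat"), contains("dog"), occupant)
```

Looking up a word only examines the one bucket that the hash function points to, rather than the
whole table. Python's own dict is itself implemented as a hash table, so in practice the built-in
type would normally be used.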

Instead of using a very large array, perhaps we implement an array of size 1,300. This may seem like
an arbitrary number to have chosen, but it has been chosen as it is 26 x 50 and there are 26 letters
in the alphabet. If we use the first letters of the word as the hash function, we can place the words
into the array based on this. We could have made the array only have 26 elements, but then some
elements of the array would become full of many words and the processing would become difficult.

The main aim in implementing a hash table is to have the elements within the table evenly spread out
and easy to locate. While it is possible to have more than one item in an array element, doing this
adds to the processing time. Ideally, in designing the hash table and the hash function, a software
developer is trying to strike a good balance between efficiency and storage space.

Figure 19: Organising building bricks by colour (hash function)

In the example shown in Figure 19, there are only four ‘buckets’ for the building bricks. The hash
function being used is clearly the colour of the brick. You can see that in each ‘bucket’ there are a
number of bricks, so finding a brick of a certain size would not be as easy as it would be if there
were instead three ‘buckets’ for each colour (for example).

Of course the reason for using a hash table is not really to store information, but to organise it in
such a way that retrieving it is easy. Typically the way that you would use a hash table would be to
put the ‘key’ into the hash function to find the location that it should be in, and then check that
location to see if it is present.

Structuring input and output

Software developers need to consider how the software they are writing will handle input and output.
There are many ways in which a software solution can receive input. The most common form of input may
be via the keyboard and the mouse, but increasingly, input is gathered through a variety of devices
and methods. Devices for input include tablets, smart phones, styluses, joysticks and game
controllers. There are also a variety of methods of inputting data such as through touch interfaces
and the movement of devices via accelerometers and GPS coordinates.

The quality of data is of critical importance, as once data is gathered, the value of any information
that is produced will be contingent on this. In addition to this, information that is output from a
software solution may in turn provide input to other software products. Therefore it is not only
important to ensure that data is of the highest quality, but it is important to have a good
understanding of the characteristics (or limitations) of that data.

Quality of data

There is a well known acronym in software development: ‘GIGO’ (or Garbage In, Garbage Out). ‘GIGO’
describes what happens when invalid (or nonsense) data is entered into a software solution. The code
tries to interpret the data as valid data and so what is produced is an unknown. It may result in the
program crashing or it may produce unexpected results (or ‘garbage’). The process by which input data
is filtered to ensure that it is valid and ‘GIGO’ does not occur is known as validation.
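
Validation can be sketched in Python as below, using the three techniques listed in this chapter's
key knowledge (existence, type and range checking) applied in that order. The function name, the
limits of 1 to 100 and the wording of the messages are all assumptions made for illustration.

```python
def validate_quantity(raw_text, low=1, high=100):
    """Return (True, value) if raw_text passes all three checks,
    otherwise (False, message). low and high are assumed limits."""
    # 1. Existence check: was anything entered at all?
    if raw_text is None or raw_text.strip() == "":
        return False, "Please enter a value."

    # 2. Type check: can the text be read as a whole number?
    try:
        value = int(raw_text.strip())
    except ValueError:
        return False, "Please enter a whole number."

    # 3. Range check: is the number within the accepted limits?
    if not (low <= value <= high):
        return False, f"Please enter a number from {low} to {high}."

    return True, value


for sample in ["", "abc", "250", "42"]:
    print(repr(sample), validate_quantity(sample))
```

Only the last sample passes all three checks. Each check runs only if the one before it succeeded, so
the range test never has to deal with missing or non-numeric input.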


Validation techniques

There are a large number of techniques that can be used to validate data. It has to be said that the
best technique is not to require any validation at all! If a user interface is designed so that the
choices that users can make are limited to valid types and values, then validation (as a whole)
becomes a lot easier. This is not always possible and so techniques must be employed to check that
the data that has been entered is as valid as possible.

Existence checking

One of the simplest ways to validate an item of input data is to ensure that the user has entered it.
This is known as existence checking. In some programming languages, a blank input field is
interpreted as being equal to ‘0’, which may be a valid input into the program or may cause it to
crash. An existence check will check to see if data has been entered. If it hasn’t, a prompt will be
displayed to the user to enter the data before proceeding. One way to prevent most occurrences of
this problem is to insert default input values into the program and allow the user to change these
values. However, a malicious user could still delete the default value and try to proceed to the next
step in the program with the input box empty. For this reason, existence checks are almost always
necessary.

Type checking

In the same way that the lack of input can cause a problem, input of the incorrect type can cause
unexpected results. Entering data of a different type into an area of the program that is expecting
something else could cause the program to crash or produce the ‘GIGO’ effect.

Often input boxes that are expecting text to be entered can receive input of any type: numbers,
special characters, upper and lower case letters, etc. This might not be a serious problem, but it
may compromise the information that is produced. More serious is when text is entered as an input
that is meant to be a number. In some cases, the value of the number might be interpreted as zero,
but in many programming languages the program will crash when this happens.

Type checking also applies when a user enters a decimal number into an input that requires a whole
number value. In a case such as this, a software developer may decide that it is fine to round the
user’s input to the nearest whole number, rather than require them to re-enter it.

Range checking

Range checking is the process of determining if an inputted value is within the range of acceptable
values for the program. This is usually most applicable to number values, but can also be applied to
text. For example, you may wish to perform a range check for valid Australian postcodes.

While the usual format of a range check may be to test to see if a value falls between a lower limit
and an upper limit, a range check may also test to see if an inputted value is one of a number of
acceptable values (perhaps stored in an array or similar data structure).

Existence, type and range checking – a logical sequence

It is highly possible that these three types of validation could be used sequentially. If a number
were required to be within a certain range, it would first need to be checked for existence, then
checked to see that it was a number type before the number was tested to see if it was within the
correct range.

Testing

Testing is a vital step in any software development process. This not only includes making sure that
the software solution does what it is intended to do, but ensures that it is error-free. Testing in
this context becomes a serious legal and/or contractual issue. As it can be such a time consuming
process, testing is sometimes done by people employed specifically for this task, but this is a
luxury not


available to all companies as it incurs extra cost.

Automated testing can help to limit the time involved in formally testing software solutions.
Automated testing software is a category of software that can run a series of pre-determined tests on
software solutions using test data and modelling keystrokes and mouse movements. Packages can be
expensive, but make the testing process cheaper in the long term.

The costs of improper testing

Despite the fact that IT is a relatively young field in terms of world history, there have been some
infamous examples of the effect that improper testing can have. The Therac-25 was a medical computer
used to control radiation therapy. A bug in its code caused at least two deaths between 1985 and 1987
due to massive exposures to radiation (over 100 times the intended dose). A bug in the guidance
system of the European rocket Ariane-5 caused the rocket to explode less than a minute after
take-off, destroying a rocket and payload worth hundreds of millions of US dollars. The software
within the guidance system attempted to convert a 64-bit floating-point number into a 16-bit integer,
which was too small to hold it.

Types of errors

Before discussing how we can test a module or coded solution for errors, it is useful to have an
understanding of the different types of errors that can occur. Generally speaking, errors are
classified as syntax errors, logic errors or run-time errors.

Syntax errors

Syntax errors are errors with the spelling or format of specific commands. As such, they do not occur
very frequently in finished programs as most good programming language interfaces will detect them in
the development stage. Compiled languages check the syntax of the entire code before they execute the
program.

Logic errors

Logic errors are much harder to detect than syntax errors as they may not occur every time the
program is run and they do not cause the program to crash. They occur when data causes the program to
operate in a way that was not intended. For example, the program may give an incorrect answer to a
calculation or retrieve the wrong record from a database.

Run-time errors

Run-time errors (or exceptions) occur when something happens that was not planned for, causing the
program to crash. Common examples of this sort of error are when an index is used that is outside the
bounds of an array or a number is placed into a variable that has not been allocated enough space to
hold it. What follows is a brief description of some of the most commonly occurring errors. Most of
these errors are run-time errors, but some are logic errors.

Arithmetic overflow and underflow

Arithmetic overflow happens when the result of a calculation or process causes a variable to become
too large for the location in which it is stored. Underflow can occur when a number is too close to
zero to be represented in the variable type being used. This may be hard to visualise, but as very
small numbers are stored using an exponent, a very small number has an exponent which is a large
negative number.

Memory leak

As programs create variables and allocate memory to resources, they consume more of the memory of the
computer on which they are running. If these resources are not ‘freed’ up when they are no longer
needed by the program, there is the potential for the program to use more and more memory. This may
lead to reduced performance for all the programs being used at


the time and at its most extreme will cause the computer to crash. Many programming languages now
include ‘garbage collectors’ whose sole function is to free up those areas of memory that are no
longer needed by the program.

Handle leak

A similar type of error to the memory leak, a handle leak occurs when a program allocates a ‘handle’
to a resource (such as the screen, printer or other peripheral) and does not ‘free’ up this handle on
completing the task. The program may then assign multiple handles to the same device, which will slow
the computer down and may eventually cause a crash.

Buffer overflow

A buffer is a temporary memory location that is used to store data that has either been input or
output before it is processed. Buffer overflows can occur when the data input into a program is not
checked for size and is too big for the buffer to handle.

Stack overflow

A stack data structure is used by the CPU to keep track of the current line within the program. As
subroutines are executed, the line number of the line calling the subroutine is placed in the data
structure known as a stack, so that once it is finished, the CPU can move back to this point and
execute the next line. If a program contains a circular reference to a subroutine (that is, a
subroutine that in turn calls another that contains a call to the original subroutine), then the
stack data structure can quickly fill up with line number references and overflow, causing the
program to crash.

Deadlock

A deadlock can occur when two or more processes are waiting for each other to finish and so neither
one does. Deadlocks can be very difficult to predict as they generally happen in multitasking or
networked environments when different programs are sharing resources.

Off-by-one error

This type of error is a very common one in which a loop within a program either loops one too few or
one too many times. This may or may not cause the program to crash, instead possibly causing an
output or function to be incorrect. Often the cause for this problem is a programmer failing to take
into account that an array begins at ‘0’ as opposed to ‘1’. This error can also be caused by a
comparison that is out by one (for example, ‘less than’ instead of ‘less than or equal to’).

Race hazard

A race hazard can occur when one process that is dependent upon the output from another process
executes before that data is received.

Type conversion

Type conversion errors can occur when data is changed from one data type to another. The conversion
may be one which is not allowed, which will cause the program to crash. Alternatively, the data may
lose accuracy and effectively become a different number.

Preventing or handling bugs

Bugs can be prevented or handled in a variety of ways. During design time, there are ways in which
programs can be tested or checked for bugs. In addition to this, many programming language interfaces
contain features that are designed to minimise the instance of errors. Some programming languages
will also have syntax designed not to let programmers perform tasks that could potentially lead to
problems. Lastly, all programs should contain some sort of error trapping routine, so that an
unexpected error does not crash the program, but instead reports some information about why it has
occurred.

Although the term ‘bug’ has its origins in engineering, the first computer ‘bug’ was found in a relay
computer at Harvard University in 1947. Technicians opened up the computer to try to determine what
was going


wrong, only to find a dead moth trapped in one of the relays! They affixed the bug to the error log
and wrote underneath “First actual case of bug being found”. The original log (complete with the moth
taped to it) is now held by the Smithsonian's National Museum of American History in Washington,
D.C., having previously been kept at the Naval Surface Warfare Center Computer Museum at Dahlgren,
Virginia, USA. The technicians were reputedly the first to coin the term ‘debugged’.

Figure 20: The first computer bug

As we are examining the stages of software development, we will focus on the methods which we can use
to detect errors before the program is implemented.

Methods of testing

A trace table is a relatively easy tool to implement for a small algorithm or module (ideally before
it is coded). While it is a tool that can also be used to test out the function of a coded solution,
its focus is on specific logic and so it can be a little too fiddly to use on a larger scale.

To create a trace table, you must first look through the algorithm or coded solution and identify all
of the main variables. These should be placed in a table like the one in Figure 21. Once this has
been done, go through the algorithm line by line, and as the variables change, you should write their
new values under the relevant column. You should also leave a space at the bottom of the table for
anything that is output to the screen.

Note that columns are only modified when the value of a variable is changed. This way, when reading
the trace table, the current state of the module or coded solution can be found by examining the
bottom row. If a variable does not have a value entered in this row, the current stored value of the
variable is the bottom most one.

Trace table
ID    Surname    Tax_Rate   Done    Discount   Temp
001   Phillips   35         False   0          220
                            True               300
002   Jones      42         False   32         15
                                    16         200
                            True               220

Output (screen):
Customer: Phillips. Total Amount: $534.20
Customer: Jones. Total Amount: $320.50

Figure 21: Trace table example

A similar technique to creating a trace table is the use of ‘watches’ on key variables.

Watches and breakpoints

Most programming language interfaces allow you to set breakpoints in the code at which point the
execution of the code will pause and allow you to examine it. Once the program has been paused in
this way, it is usually possible to ‘step’ through the code a line at a time so that you can observe
what is happening. ‘Watches’ can also be set up so that the value of certain variables is displayed
on the screen while the program is executed.

Techniques like these are best when testing a module to ensure that it works within design

37 | P a g e
Software Development: Core Techniques and Principles 4th edition

specifications. Once focus moves to the testing collections of data (such variable defaults or
of the coded solution as a whole, the use of program statistics) into a TXT file. In doing so,
trace tables and watch lists becomes hard as the software developer will be ensuring that
the number of variables can be extremely they write information and read information in
large. the same order and insert any separators that
they require. This lack of structure often means
Testing will be further discussed in Chapter 8, that software developers will instead opt for
where we will focus our attention on the ways one of the two other formats listed next.
in which solutions are tested and evaluated.
CSV file format
Procedures and techniques for One of the most common file formats is a CSV
handling and managing files or Comma Separated Value format. As the
name suggests, a CSV file is a text file in which
A number of procedures and techniques are the items of data are separated by commas.
often used when managing files. As files are Many applications are able to produce CSV files
easily corrupted, it is often best to leave them as their output and accept CSV files for input. A
‘open’ for the shortest time possible. From a CSV file can fairly easily replicate what is held
programming perspective the best way to do in a spreadsheet as its format and structure is
this is to ‘open’ the file, read in its contents and very similar. Though it is more common to
then ‘close’ the file. If the file is ‘open’ and the encounter a comma delimited file, the
software solution experiences an unexpected character that is placed between the different
crash, it is quite likely that the file will be data items can be varied. If a software solution
damaged. Opening files for prolonged periods is using the file exclusively, then the choice of
can also mean that other users cannot access character (known as the delimiter) does not
the file. matter. However, if the data is going to be
passed between software packages, the choice
Input files should be located in a folder that is of delimiter is obviously very important.
specific to the software solution so that they
are easy to locate and are grouped with the XML file format
solution when files are backed up. This can also
be achieved by including an import feature in XML or Extensible Markup Language is a
the software that copies a file from any method of encoding data in a file that is both
location to a secure folder within the solution. well organised and accessible for software
solutions as well as easy to read. It is standards
Naming of files is also important. A specific file based and used extensively by software
extension may be chosen so that files are not developers, especially when sharing of data
misinterpreted as belonging to another over the Internet is required. It shares a similar
software package. Files should be named in structure to HTML, which also uses ‘markups’
ways that identify them and distinguish them in the form of tags in its structure. One of the
from previous versions. most useful aspects of representing data in
XML file format, is that the format itself defines
Common file formats the data and how it is structured. This is in
contrast to a CSV file where the exact number
TXT file format and data types would need to be known in
advance for a software package to read them.
While all files are technically saved in a plain
text format, a TXT file has no special structure An example of an XML file is shown on the next
to it (other than one that the software page.
developer imposes via their code). It is very
common for software developers to save small
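As a minimal sketch of the contrast between CSV and XML described above, the Python below reads the
same two movie records from each format. The sample data, the function names `read_csv_movies` and
`read_xml_movies` and the CSV column order are assumptions made for illustration; only the standard
`csv`, `io` and `xml.etree.ElementTree` modules are used.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical sample data: the same two records in each format.
CSV_TEXT = "Star Wars,George Lucas,1977\nThe Shawshank Redemption,Frank Darabont,1994\n"
XML_TEXT = """<catalog>
  <movie><title>Star Wars</title><director>George Lucas</director><year>1977</year></movie>
  <movie><title>The Shawshank Redemption</title><director>Frank Darabont</director><year>1994</year></movie>
</catalog>"""


def read_csv_movies(text, delimiter=","):
    # The reader must already know the column order: title, director, year.
    movies = []
    for title, director, year in csv.reader(io.StringIO(text), delimiter=delimiter):
        movies.append({"title": title, "director": director, "year": int(year)})
    return movies


def read_xml_movies(text):
    # The tags name each field, so the order inside <movie> does not matter.
    movies = []
    for movie in ET.fromstring(text).findall("movie"):
        movies.append({
            "title": movie.findtext("title"),
            "director": movie.findtext("director"),
            "year": int(movie.findtext("year")),
        })
    return movies


if __name__ == "__main__":
    print(read_csv_movies(CSV_TEXT))
    print(read_xml_movies(XML_TEXT))
```

When the data comes from a file on disk rather than a string, a `with open(...)` block closes the file
as soon as its contents have been read, which fits the advice to keep files ‘open’ for the shortest
time possible.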

Chapter 3: Data structures and data manipulation

<?xml version="1.0" encoding="UTF-8"?>
<catalog>
    <movie>
        <title>Star Wars</title>
        <director>George Lucas</director>
        <writer>George Lucas</writer>
        <year>1977</year>
    </movie>
    <movie>
        <title>The Shawshank Redemption</title>
        <director>Frank Darabont</director>
        <writer>Stephen King</writer>
        <year>1994</year>
    </movie>
</catalog>

Figure 22: XML file example

As you can see in Figure 22, the structure of an XML file is similar to that of a flat file database.
The fields are also well defined, unlike in a CSV file. Also of importance when discussing the
handling and management of files are issues of security, archiving, backing up and disposal.

Searching for data in an array

A common programming task is searching for data in an array. The simplest way to do this is to
examine each item in the list in turn, until the required item is found. This is known as a linear
search. Although it is not very efficient, it is very easy to code and works well for small lists. For
a list of 1000 items, a linear search could take up to 1000 comparisons to find the required item. On
average, it will take 500 comparisons. Figure 23 shows an example of a linear search.

Linear search algorithm

Variables
Integer array: List[ ] – will hold the items in the array.
Integer: Count – will hold the index of the array element we are examining.
Integer: Max – will hold the size of the array.
Integer: Item – will hold the item that we are searching for.
Boolean: Found – indicates if the item has been found or not.

Subroutine subFind(Item, List, Max)
Begin
    subFind ← -1
    Count ← 0
    Found ← False
    While (Count < Max and not Found)
        If List[Count] = Item Then
            subFind ← Count
            Found ← True
        Else
            Count ← Count + 1
        End If
    End While
End

Figure 23: Linear search algorithm

In the linear search algorithm shown, ‘subFind’ is set to ‘-1’ before the loop starts. As the value of
‘subFind’ is only changed to the index of the item when that item is found, ‘subFind’ will still be
equal to ‘-1’ at the end of the search if the item is not found. This can be used to indicate that the
item is not in the list. Because the array indexes run from ‘0’ to ‘Max’ – 1, the loop continues only
while ‘Count’ is less than ‘Max’. Note also that the ‘Found’ flag is used to end the loop if the item
is found, rather than allow the loop to check all of the remaining array elements.

A better search method is called a binary search.

Binary search

A binary search is a search that can be carried out on a sorted list. It works by dividing a list in
half each time a comparison is made. As a result, it is extremely fast, although if the list is not in
sorted order already, sorting it first adds an overhead. If you plan to utilise a binary search in your
program, the best policy is to keep your data in sorted order at all times, which reduces this
overhead considerably.

Let’s examine how a binary search would locate some items contained in a list.
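Before moving on to the binary search example, the linear search of Figure 23 can be sketched in
Python. This is only an illustration: the name `linear_search` is an assumption, and an early
`return` takes the place of the ‘Found’ flag used in the pseudocode.

```python
def linear_search(items, target):
    """Return the index of target in items, or -1 if it is not present."""
    count = 0
    while count < len(items):       # indexes run from 0 to len(items) - 1
        if items[count] == target:
            return count            # stop as soon as the item is found
        count += 1
    return -1                       # the item is not in the list


if __name__ == "__main__":
    data = [35, 20, 15, 30, 28, 45, 17, 3, 49, 32]
    print(linear_search(data, 45))
    print(linear_search(data, 99))
```

Returning as soon as the item is found has the same effect as the ‘Found’ flag: the remaining
elements are never examined.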


Binary search example

Let’s examine how a binary search would locate some items contained in the list below:

Index 0 1 2 3 4 5 6 7 8 9 10 11 12
List(Index) 3 5 8 9 12 17 27 28 30 42 44 47 50

Note that the array shown above has already been sorted.

We will start by searching for the number ‘12’. Two variables - ‘Low’ and ‘High’ will be used to keep
track of the two ends of the list we are examining. Another variable – ‘Middle’ will be used to point to
the item we are comparing to our search item.

At the beginning of the search, ‘Low’ is set to the index of the start of the list (in this case ‘0’) and ‘High’
is set to the index of the end of the list (in this case ‘12’). ‘Middle’ is set to the middle index between
‘Low’ and ‘High’ (rounded down). If the item at ‘Middle’ is the item we are looking for, the search is
finished. In this case, it is not, so we continue to the next comparison. The variables for the first
comparison are shown below:

Index           0   1   2   3   4   5   6   7   8   9  10  11  12
List(Index)     3   5   8   9  12  17  27  28  30  42  44  47  50
                L                       M                       H

If the item we are looking for is smaller than the item at position ‘Middle’ then we set ‘High’ to be equal
to ‘Middle’ minus one. If it is larger, then ‘Low’ is set to be equal to ‘Middle’ plus one. This way, we
have halved the list. ‘Middle’ is reset to the mid-point between ‘Low’ and ‘High’. The variables for the
second comparison are shown below:

Index           0   1   2   3   4   5   6   7   8   9  10  11  12
List(Index)     3   5   8   9  12  17  27  28  30  42  44  47  50
                L       M           H

Note that ‘Middle’ has been set to ‘2’ as it is the value of ‘Low’ + ‘High’ (0 + 5) divided by 2 and rounded down.

Once again, the item at ‘Middle’ is not the item we are looking for. As our search item is higher, we set
‘Low’ to be equal to ‘Middle’ plus one. We also reset ‘Middle’ to the mid-point between ‘Low’ and
‘High’. The variables for the third comparison are shown below:

Index           0   1   2   3   4   5   6   7   8   9  10  11  12
List(Index)     3   5   8   9  12  17  27  28  30  42  44  47  50
                            L   M   H

‘Middle’ is now pointing to the item we are searching for, so we are now finished. It has taken three
comparisons in this case. Using a linear search method, on average it would take 6.5 comparisons. A
binary search would take a maximum of 10 comparisons to find an item in a list of 1,000 items, and a
maximum of 20 comparisons to find an item in a list of 1,000,000! For those that are mathematically
minded, the maximum number of comparisons is log n (base 2), rounded down, plus 1 (where n is the
number of items in the array).

Figure 24: Binary search example
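The formula quoted in the example above can be checked with a few lines of Python. This is a small
sketch only; the function name `max_binary_comparisons` is an assumption, and one ‘comparison’ is
taken to mean one item examined at ‘Middle’.

```python
import math


def max_binary_comparisons(n):
    """Worst-case number of items examined by a binary search of n sorted items (n >= 1)."""
    # log n (base 2), rounded down, plus 1
    return math.floor(math.log2(n)) + 1


if __name__ == "__main__":
    for n in (13, 1_000, 1_000_000):
        print(n, "items:", max_binary_comparisons(n), "comparisons at most")
```

For the 13 item list in Figure 24 this gives 4, and for 1,000 and 1,000,000 items it gives the 10 and
20 comparisons quoted in the text.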


Binary search algorithm

Variables

Integer array: List[ ] – will hold the items in the array.
Integer: Low – will hold the index of the start of the list we are examining.
Integer: Max – will hold the size of the array.
Integer: High – will hold the index of the end of the list we are examining (initially Max – 1).
Integer: Middle – will hold the index of the middle of the list we are examining (half way between
‘Low’ and ‘High’).
Integer: Item – will hold the item that we are searching for.
Boolean: Found – indicates whether the item has been found or not.

Subroutine subFind(Item, List, Max)
Begin
    Low ← 0
    High ← Max – 1
    subFind ← -1
    Found ← False
    While (Low <= High and not Found)
        Middle ← (Low + High) Div 2
        If List[Middle] = Item Then
            subFind ← Middle
            Found ← True
        Else If List[Middle] > Item Then
            High ← Middle – 1
        Else
            Low ← Middle + 1
        End If
    End While
End

Note: The ‘Div’ function gives the whole number result of one number divided by another.

Figure 25: Binary search algorithm
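The algorithm in Figure 25 translates fairly directly into Python. The sketch below is one possible
rendering rather than a definitive implementation: the name `binary_search` is an assumption, `//`
plays the role of ‘Div’, and an early `return` replaces the ‘Found’ flag.

```python
def binary_search(items, target):
    """Return the index of target in the sorted list items, or -1 if it is absent."""
    low = 0
    high = len(items) - 1
    while low <= high:
        middle = (low + high) // 2      # whole number division, like 'Div'
        if items[middle] == target:
            return middle
        if items[middle] > target:
            high = middle - 1           # continue in the lower half
        else:
            low = middle + 1            # continue in the upper half
    return -1                           # 'low' has passed 'high': not in the list


if __name__ == "__main__":
    data = [3, 5, 8, 9, 12, 17, 27, 28, 30, 42, 44, 47, 50]
    print(binary_search(data, 12))
    print(binary_search(data, 4))
```

The list must already be sorted; on unsorted data the function may report that an item is missing
even though it is present.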

In the binary search algorithm shown above, ‘subFind’ is set to ‘-1’ before the loop starts. As the
value of ‘subFind’ is only changed to the index of the item when that item is found, ‘subFind’ will
still be equal to ‘-1’ at the end of the search if the item is not found. This can be used to indicate
that the item is not in the list, although the Boolean variable ‘Found’ is also used for this purpose.
The way that the algorithm ‘knows’ that it has reached the end of the search is when ‘Low’ becomes
greater than ‘High’ – which will happen if ‘Low’ and ‘High’ are the same value and the item that is
being examined is not the item that is being searched for.

Sorting data in an array

It is a common programming task to incorporate a sorting routine into a program. Many programming
environments have built-in sorting routines and these can be sufficient for the types of solutions we
are building. There can be disadvantages to using these, however. When making use of inbuilt sorting
routines, we may not know how efficient they are or how they have been coded. An understanding of how
sorting routines work gives the software developer the ability to code their own, especially in cases
when the data set being sorted is particularly large and efficiency gains are required.

There are a large number of different sorting routines. A quick search of the Internet will uncover
many of these – and will probably also raise the question: Why so many? Each sorting routine that has
been developed has resulted in gains in the number of comparisons within the routine and/or the
number of times array elements need to be swapped around – ultimately leading to a quicker result.
Most sorting routines are designed to move the array elements around rather than copy the array to
another place in memory. This can be important when the size of the array to be sorted is very large.
Sorting algorithms will almost always utilise a small number of temporary variables so that swaps can
be made.

For the purposes of this study, we will examine two sorting routines. The array of data shown in
Figure 26 below will be used to compare each of them. Note that the data set starts at array index ‘0’.

Array of data for sort comparison

Index 0 1 2 3 4 5 6 7 8 9
List(Index) 35 20 15 30 28 45 17 3 49 32

Figure 26: Array of data for sort comparison

Each time the array is cycled through, we call this a ‘pass’. Each time two array elements are
compared to each other, we call this a ‘comparison’. Each time two array elements are swapped, we
call this a ‘swap’. Each of these is an important measure in determining if a sorting routine is
performing well.

Let’s look at our first sorting routine (also one of the simplest ones) known as a selection sort.

Selection sort

The selection sort was one of the first sorting routines and it attempts to sort the data in a very
‘human’ way. A selection sort starts by examining each item in the array and finding the smallest one
(in the case of sorting from lowest to highest). Once found, this item is swapped with the item at the
start of the array. The sort then examines the items from the next position to the end of the array,
looking again for the smallest item. Once found, it is swapped with the item at the second position.
Each subsequent pass gets smaller by one – as the sorted portion of the array gets larger.

A selection sort is an easy sort to write and understand. Despite its relative simplicity, it is not a
bad sorting routine and can perform quite adequately if the situation is right.

Selection sorts do a large number of comparisons (of the order of n² where n is the number of array
elements being sorted). One of the things that is very consistent about a selection sort is the number
of swaps and passes it will do. It will always perform n-1 swaps as well as n-1 passes.

Some sorts have what is known as an ‘exit clause’. Some sorting algorithms are able to detect whether
an array has become sorted and then exit the sort earlier. A selection sort is not able to do this. In
fact, an array that is already sorted will take just as long to ‘sort’ using a selection sort as one
that is in reverse order (for example). Many of the more sophisticated sorting routines move the array
elements towards a sorted state in the process of doing simple comparisons – which a selection sort
does not do.

Let’s look at our array of data and use a selection sort to arrange it in ascending order.


Selection sort example part 1

Below is the array of data that we will be using as a starting point for our sorting comparisons.

Index 0 1 2 3 4 5 6 7 8 9
List(Index) 35 20 15 30 28 45 17 3 49 32

On our first pass through the array, we look for the smallest number – which is ‘3’.

Index 0 1 2 3 4 5 6 7 8 9
List(Index) 35 20 15 30 28 45 17 3 49 32

We have now identified which number needs to go at the beginning of the array. So all we need to do
is swap it with the number that is there (‘35’) and our first pass is complete.

End of Pass 1
Index 0 1 2 3 4 5 6 7 8 9
List(Index) 3 20 15 30 28 45 17 35 49 32

Now we can start the second pass through the array. As we now have the first element in place, we can
start our loop from position ‘1’ instead of ‘0’. Again we are looking for the smallest element – which
this time is ‘15’.

Index 0 1 2 3 4 5 6 7 8 9
List(Index) 3 20 15 30 28 45 17 35 49 32

To end the second pass, we swap this element into its correct place in element ‘1’.

End of Pass 2
Index 0 1 2 3 4 5 6 7 8 9
List(Index) 3 15 20 30 28 45 17 35 49 32

Let’s do the rest of the passes in quick succession, now that you can see how each pass is done.

End of Pass 3
Index 0 1 2 3 4 5 6 7 8 9
List(Index) 3 15 17 30 28 45 20 35 49 32

End of Pass 4
Index 0 1 2 3 4 5 6 7 8 9
List(Index) 3 15 17 20 28 45 30 35 49 32

End of Pass 5
Index 0 1 2 3 4 5 6 7 8 9
List(Index) 3 15 17 20 28 45 30 35 49 32

Figure 27a: Selection sort example part 1

Selection sort example part 2

End of Pass 6
Index 0 1 2 3 4 5 6 7 8 9
List(Index) 3 15 17 20 28 30 45 35 49 32

End of Pass 7
Index 0 1 2 3 4 5 6 7 8 9
List(Index) 3 15 17 20 28 30 32 35 49 45

End of Pass 8
Index 0 1 2 3 4 5 6 7 8 9
List(Index) 3 15 17 20 28 30 32 35 49 45

In the final pass of a selection sort, the last two array elements are compared and swapped (if
required). This then completes the sort.

End of Pass 9
Index 0 1 2 3 4 5 6 7 8 9
List(Index) 3 15 17 20 28 30 32 35 45 49

For an array of 10 elements, a selection sort will execute 9 passes. In general terms, for an array of size
‘n’, a selection sort will execute ‘n-1’ passes.

Figure 27b: Selection sort example part 2

Let’s discuss a typical algorithm for a selection sort

The algorithm shown in Figure 29 works by using two nested loops. The first loop ‘PassNumber’ will
perform the passes (1 less than the total number of array elements) and the second loop ‘Position’
will cycle through the array elements. As the second loop is executed, it compares the current array
element to the smallest value for this pass. If it is smaller than the smallest value found so far,
both the value and its array position are recorded. The second loop is designed to get progressively
smaller, as each pass through the list will mean that an element is placed in its correct position at
the beginning of the array.

The algorithm also utilises a simple routine to swap two elements, as shown in Figure 28.

The ‘swap’ subroutine

Variables
Integer: Temp – will hold one of the list items while it is being swapped.
Integer: Index1 – the index of the first list item to be swapped.
Integer: Index2 – the index of the second item.

Subroutine subSwap(Index1, Index2)
Begin
    Temp ← List(Index1)
    List(Index1) ← List(Index2)
    List(Index2) ← Temp
End

Figure 28: ‘swap’ subroutine


Selection sort algorithm

Variables

Integer array: List[ ] – will hold the items in the array.
Integer: ListSize – the number of items in the array.
Integer: PassNumber – the counter for the first loop.
Integer: Position – the counter for the second loop.
Integer: Smallest – will hold the value of the smallest array element found in the current pass.
Integer: SmallestPos – will hold the index of the smallest array element found.

Subroutine subSelectionSort(List, ListSize)
Begin
    Loop PassNumber from 1 to ListSize – 1
        Smallest ← List(PassNumber – 1)
        SmallestPos ← PassNumber – 1
        Loop Position from PassNumber – 1 to ListSize – 2
            If List(Position + 1) < Smallest Then
                Smallest ← List(Position + 1)
                SmallestPos ← Position + 1
            End If
        Next Position Loop
        subSwap(PassNumber – 1, SmallestPos)
    Next PassNumber Loop
End

Figure 29: Selection sort algorithm
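As a rough illustration, a selection sort along the lines of Figure 29 might look like this in Python.
The name `selection_sort` and the idea of returning a snapshot of the list after each pass are
assumptions added for this sketch; Python's tuple assignment does the job of the ‘swap’ subroutine and
its ‘Temp’ variable.

```python
def selection_sort(items):
    """Sort items in place (ascending) and return a snapshot of the list after each pass."""
    snapshots = []
    size = len(items)
    for pass_index in range(size - 1):                  # n - 1 passes
        smallest_pos = pass_index
        for position in range(pass_index + 1, size):    # unsorted part only
            if items[position] < items[smallest_pos]:
                smallest_pos = position
        # Swap the smallest remaining item into place (one swap per pass).
        items[pass_index], items[smallest_pos] = items[smallest_pos], items[pass_index]
        snapshots.append(list(items))
    return snapshots


if __name__ == "__main__":
    data = [35, 20, 15, 30, 28, 45, 17, 3, 49, 32]
    for number, state in enumerate(selection_sort(data), start=1):
        print("End of Pass", number, state)
```

Run on the ten element array of Figure 26, the function performs nine passes, one fewer than the
number of elements.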

Quick sort

A quick sort is a popular sorting algorithm as it is both efficient and fast. The most popular
versions of the quick sort algorithm make use of a programming method known as recursion.

Let’s firstly look at how a quick sort works. Note that there are also versions of this sort that use
the ‘pivot’ value in different ways. The version detailed in Figure 31 uses a pivot value at the start
of the array (or sub-list).

A quick sort starts by setting a ‘pivot’ value which is the item at the start of the list. As this
routine works by examining smaller and smaller lists, we will define variables to keep track of the
current ‘start’ and ‘end’ of the list we are examining.

On the first pass through the array, items in the list are compared to the ‘pivot’ value. We will keep
track of what items we are comparing by using two variables ‘up’ and ‘down’. ‘Up’ will begin at the
‘start’ of the list and work up and ‘down’ will start at the ‘end’ of the list and work down. If the
item that ‘up’ is pointing to is larger than the ‘pivot’, the idea is to swap it with an item that is
smaller than the ‘pivot’ being pointed to by ‘down’. The first pass is complete when ‘down’ becomes
smaller than ‘up’. At this point, the ‘pivot’ value is swapped with the item at ‘down’ and should then
be in the correct place in the array.

Let’s see how this works in practice.


Quick sort example part 1

Here is our array of data that we used in the selection sort example.

Index 0 1 2 3 4 5 6 7 8 9
List(Index) 35 20 15 30 28 45 17 3 49 32

For the array above, ‘start’ and ‘up’ are set to ‘0’ and ‘end’ and ‘down’ are set to ‘9’. This would make
the ‘pivot’ value equal to ‘35’, which is the value at ‘start’. As we move through the array, we will be
comparing items to ‘35’.

Let’s begin. The value at ‘up’ is compared to the ‘pivot’ value. In this case ‘35’ is equal to ‘35’ and so
we will leave it in place. ‘Up’ is incremented and the next value is compared to ‘35’. ‘20’ is less than
‘35’ so leave it in place. ‘Up’ is incremented again. ‘15’ is also less than ‘35’. We keep incrementing ‘up’
until we find a value that is greater than ‘35’. This happens when ‘up’ is pointing to index ‘5’ which is
‘45’. We have now found our first value to swap. ‘Down’ is now moved down the list. ‘32’ is less than
‘35’ which means we have our second value to swap. ‘32’ and ‘45’ are now swapped.

Index 0 1 2 3 4 5 6 7 8 9
List(Index) 35 20 15 30 28 32 17 3 49 45

‘Up’ now continues to move up the list till it finds a value that is greater than the pivot. ‘49’ is greater
than the pivot, so ‘up’ stops at index ‘8’. ‘Down’ is moved down the list until it finds a value that is less
than the pivot, or it becomes smaller than ‘up’. ‘Down’ does become smaller than ‘up’ when it has a
value of ‘7’. When this happens, this signals that the pass is over. The last thing to do is to swap the
‘pivot’ value with ‘down’. ‘35’ is swapped with ‘3’ and the array looks like the one below.

End of Pass 1
Index 0 1 2 3 4 5 6 7 8 9
List(Index) 3 20 15 30 28 32 17 35 49 45

The ‘pivot’ is now in its correct place in the array. You will notice that all the values below the ‘pivot’
are smaller and all the values above the ‘pivot’ are larger.

We now divide the list into two sub-lists, each of which we sort using the same routine. The sub-lists
are defined as being either side of the ‘pivot’ value (pointed to by ‘down’). If a sub-list consists of one
or no elements, we consider it sorted. In this case, we would have two sub-lists. The first list would
start at ‘0’ and go to ‘down’ – 1 (which in this case is ‘6’). The second sub-list would start at ‘down’ + 1
(which is ‘8’) and go to the ‘end’ of the list.

As the quick sort routine works by sorting smaller and smaller sub-lists, we will consider the completion
of each sub-list to be a pass. Let’s sort the second sub-list first. The ‘start’ of this list would be ‘8’ and
the ‘end’ would be ‘9’. As this list has only two elements, we only need to sort these. In this case,
swapping ‘49’ and ‘45’ completes the sorting of this sub-list.

End of Pass 2
Index 0 1 2 3 4 5 6 7 8 9
List(Index) 3 20 15 30 28 32 17 35 45 49

Figure 30a: Quick sort example part 1


Quick sort example part 2

Let’s now look at the first sub-list. The ‘start’ of this sub-list is ‘0’ (which makes the ‘pivot’ equal to ‘3’)
and the ‘end’ is ‘6’. ‘Up’ is set to ‘0’ and ‘down’ is set to ‘6’. ‘Up’ is now moved up the list until it finds
a value that is greater than the ‘pivot’ – which it does at index ‘1’. ‘Down’ is now moved down the list
till it finds a value that is less than the ‘pivot’ – which it does not do. ‘Down’ now has a value of ‘0’ and
as it is less than ‘up’, the ‘pivot’ is swapped with ‘down’ (which is not really a swap at all).

End of Pass 3
Index 0 1 2 3 4 5 6 7 8 9
List(Index) 3 20 15 30 28 32 17 35 45 49

As before, we try to split what is either side of the ‘pivot’ into two sub-lists. The first sub-list will be
from ‘start’ to ‘down’ – 1. This is a list of zero length, so we can ignore it. The second sub-list will be
from ‘down’ + 1 (which is ‘1’) to ‘end’ (which is ‘6’).

Once again, ‘start’ is set to the start of this list (‘1’) and ‘end’ is set to the end of this list (‘6’). The ‘pivot’
value in this case is ‘20’. ‘Up’ is set to ‘1’ and ‘down’ is set to ‘6’. ‘Up’ begins to move up the list again
till it finds a value that is greater than the ‘pivot’. This happens when ‘up’ is ‘3’ (which contains the
value ‘30’). ‘Down’ is now moved down the list till it finds a value that is less than the ‘pivot’, which
happens immediately. ‘30’ and ‘17’ are swapped. ‘Up’ continues to move up the list, this time stopping
at ‘28’. ‘Down’ continues to move down and passes ‘up’. At this stage, ‘up’ has a value of ‘4’ and ‘down’
has a value of ‘3’. As ‘down’ is now less than ‘up’, the ‘pivot’ and ‘down’ are swapped which places the
‘pivot’ in the correct place and ends the current pass.

End of Pass 4
Index 0 1 2 3 4 5 6 7 8 9
List(Index) 3 17 15 20 28 32 30 35 45 49

Hopefully now this process is clearer to you. To speed the process up, the remaining passes will be
shown to you – but I would encourage you to work them out for yourself first to check your
understanding.

End of Pass 5
Index 0 1 2 3 4 5 6 7 8 9
List(Index) 3 17 15 20 28 32 30 35 45 49

End of Pass 6
Index 0 1 2 3 4 5 6 7 8 9
List(Index) 3 17 15 20 28 30 32 35 45 49

End of Pass 7
Index 0 1 2 3 4 5 6 7 8 9
List(Index) 3 15 17 20 28 30 32 35 45 49

The list is now sorted. With a small list such as this, the size of the sub-lists was sometimes very small
(1 or 2 array elements). With larger lists, the sub-lists are divided and further divided and the sort works
backwards once it reaches these finishing points.

Figure 30b: Quick sort example part 2
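The walk-through in Figure 30 can be reproduced with a short Python sketch. It follows the version
described in the example (pivot at the start of the sub-list, ‘up’ and ‘down’ markers, pivot swapped
with ‘down’ at the end of each pass) and, like the example, sorts the upper sub-list before the lower
one. The names `partition` and `quick_sort` and the `passes` list are assumptions made for this
illustration.

```python
def partition(items, start, end):
    """Place the pivot (items[start]) in its final position and return that index."""
    pivot = items[start]
    up, down = start, end
    while True:
        # Move 'up' until it points to a value greater than the pivot.
        while up <= end and items[up] <= pivot:
            up += 1
        # Move 'down' until it points to a value that is not greater than the pivot.
        while items[down] > pivot:
            down -= 1
        if up < down:
            items[up], items[down] = items[down], items[up]
        else:
            break
    # 'down' has passed 'up': swap the pivot into its correct place.
    items[start], items[down] = items[down], items[start]
    return down


def quick_sort(items, start, end, passes):
    """Sort items[start..end] in place, recording the list after each pass."""
    if start >= end:                  # one or no elements: already sorted
        return
    pivot_pos = partition(items, start, end)
    passes.append(list(items))
    quick_sort(items, pivot_pos + 1, end, passes)     # upper sub-list first
    quick_sort(items, start, pivot_pos - 1, passes)   # then the lower sub-list


if __name__ == "__main__":
    data = [35, 20, 15, 30, 28, 45, 17, 3, 49, 32]
    passes = []
    quick_sort(data, 0, len(data) - 1, passes)
    for number, state in enumerate(passes, start=1):
        print("End of Pass", number, state)
```

For the array of Figure 26 this records seven passes, and the list after each pass matches the
corresponding ‘End of Pass’ row in Figures 30a and 30b.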


Figure 31 below shows an algorithm for a quick sort (using recursion). Before we discuss how it
works, let’s touch on the topic of recursion.

Recursion is the process by which a function or procedure calls itself. This can be a very useful
tool as it can make a complex process simpler and easier to code. A recursive procedure should always
have an end point (often called a base case), after which time it will begin to work its way back
through the many procedure calls that have been made. Now that you know how a quick sort works, you
can see that it divides the array into sub-lists, each of which is sorted using the quick sort
routine.
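A very small example may help make the idea of recursion concrete. The function below is not part of
the quick sort; it is a hypothetical `recursive_sum` that adds up a list by calling itself on a
shorter and shorter list until it reaches its end point.

```python
def recursive_sum(numbers):
    """Add up a list by calling this function on a smaller and smaller list."""
    if not numbers:                 # end point: an empty list sums to zero
        return 0
    # Recursive call on the rest of the list; the additions happen as the calls return.
    return numbers[0] + recursive_sum(numbers[1:])


if __name__ == "__main__":
    print(recursive_sum([3, 5, 8]))
```

Without the end point the function would keep calling itself until Python raised a `RecursionError`.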

Quick sort algorithm

Variables

Integer array: List[ ] – will hold the items in the array.
Integer: Start – starts at the index of the left most element in the sub-list that is being examined
and moves right as the sub-list is scanned.
Integer: End – starts at the index of the right most element in the sub-list that is being examined
and moves left as the sub-list is scanned.
Integer: Pivot – the pivot value for the list or sub-list.

Function Quicksort(List)
Begin
    If length(List) > 1 Then
        Pivot ← first element of the array List
        Start ← first index of List
        End ← last index of List
        While Start <= End
            While List(Start) < Pivot
                Start ← Start + 1
            End While
            While List(End) > Pivot
                End ← End – 1
            End While
            If Start <= End Then
                Swap List(Start) with List(End)
                Start ← Start + 1
                End ← End – 1
            End If
        End While
        Quicksort(List from first index to End)
        Quicksort(List from Start to last index)
    End If
End

Figure 31: Quick sort algorithm
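One way to try out the algorithm in Figure 31 is to code it. The Python below is a sketch of that
algorithm, not a definitive implementation: the name `quicksort` and the `first` and `last` index
parameters (used in place of ‘List from ... to ...’) are assumptions, and the list is sorted in place.

```python
import random


def quicksort(items, first=0, last=None):
    """Sort items[first..last] in place, following the scheme of Figure 31."""
    if last is None:
        last = len(items) - 1
    if first >= last:                   # fewer than two elements: nothing to do
        return
    pivot = items[first]                # pivot is the first element of the sub-list
    start, end = first, last
    while start <= end:
        while items[start] < pivot:
            start += 1
        while items[end] > pivot:
            end -= 1
        if start <= end:
            items[start], items[end] = items[end], items[start]
            start += 1
            end -= 1
    quicksort(items, first, end)        # sub-list from the first index to 'end'
    quicksort(items, start, last)       # sub-list from 'start' to the last index


if __name__ == "__main__":
    random.seed(1)
    data = [random.randint(0, 9999) for _ in range(1000)]
    quicksort(data)
    print(data == sorted(data))
```

Because the pivot is always the first element, a list that is already sorted is the worst case for
this version: every call splits off only one element, so very long sorted lists can exceed Python's
recursion limit.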

This algorithm is quite complex and hard to understand by simply reading through it. To come to grips
with the way that it works, perhaps put some test data together and perform a desk check of the
algorithm. Alternatively, code the algorithm and test it out with some large unsorted lists to see
what happens and how quickly the list is able to be sorted.


Context Questions

1. What condition must be met for a binary search algorithm to be used successfully on a
collection of data?
2. Define the terms ‘pass’, ‘comparison’ and ‘swap’.
3. Of the terms above, which do you think would have the greatest negative effect on the
efficiency of a sorting algorithm?
4. If the program shown in Figure 15 were redesigned to read one million numbers into a suitably
sized array, how many additional lines of code would be required?
5. What is a data structure?
6. Why does the index of a 1D array typically begin at zero?
7. An array has been declared as having a maximum index of 10. How many elements does it
consist of?
8. What advantages does the use of arrays have when designing programs that deal with large
quantities of data?
9. What is a record?
10. Explain the term ‘Garbage In, Garbage Out’.
11. Describe ways in which a well designed user interface can minimise the amount of validation
code that is required.
12. Explain why it is logical to perform an existence check before a type check.
13. Suggest a suitable range check that could be used on a person’s birth date.

Applying the Concepts

• Create a program of your own that will generate a list of 10 thousand random numbers and
then sort them using a selection sort and a quick sort. It should display the time and number
of passes once sorted.

Key Skills Checklist

At the conclusion of this chapter, you should be able to address the following key skills. Mark each off
as you can achieve them.

Understand how one-dimensional arrays can be used to store data

Explain the differences between associative arrays and records

Compare the differences between TXT, CSV and XML file formats

Explain the difference between subroutines, functions, events and methods

Describe the process of undertaking a selection or quick sort

Explain how a linear search is conducted and what must be done to use a binary search

Describe how data can be validated using existence checking, range checking and type checking

Explain how test data can be constructed to test that modules work correctly


Sample Examination Questions

The following sample examination questions can be attempted to test your knowledge of the content
of this chapter.

Question 1
A hash function can be used to:
A. sort data
B. make data in a list easier to find
C. remove data that is no longer current
D. create a copy of a set of data

Question 2
The validation process known as type checking is:
A. Used to prevent data of the incorrect variable type being accepted as input
B. Only effective if used as a method after a range check
C. Is used to check to see if a number has been entered with the correct number of decimals
D. Is only applicable when considering postcodes

Question 3
A difference between CSV and XML file formats is:
A. XML files are much harder to visually read than CSV files
B. CSV files do not contain any delimiters
C. CSV files are much larger than XML files for the same data set
D. XML contains tags which define field names and structure

The following information is required for Questions 4, 5, 6 and 7.

Consider the following array of data:


Index 0 1 2 3 4 5 6 7 8 9 10 11 12
List(Index) 33 29 76 32 3 17 91 28 30 85 19 64 52

Question 4
If the array above were to be searched using a linear search method, the maximum number of
comparisons to find an item would be:
A. 5
B. 12
C. 13
D. 11

Question 5
For the array above to be searched using a binary search, it must first be:
A. Copied to another array
B. Sorted
C. Placed into a hash table
D. Analysed for duplicates

Question 6
If the array of data was prepared correctly for a binary search, the maximum number of comparisons
to find an item would be:
A. 1
B. 3
C. 4
D. 5

Question 7
If the array above was prepared correctly for a binary search, the maximum number of comparisons
to determine if an item was not present would be:
A. 1
B. 3
C. 4
D. 5

Question 8
After being alerted to an issue by a customer, Emily examines some code in a software solution that
has been published by one of her employees. Emily is very surprised to find a number of syntax errors
as well as a large logic error in a financial year tax calculation.

The logic error would most likely have meant that when the software was run without syntax errors:
A. It would have crashed
B. It would not have performed the calculation at all
C. It would have run a number of times and then stopped
D. It would have given the incorrect results to the tax calculation

The following information is required for Questions 9, 10 and 11.

Question 9
The data shown above is in which file format?
A. CSV
B. HTML
C. JavaScript
D. XML

Question 10
This is not a complete file. It is easy to tell this because:
A. There is no closing ‘catalog’ tag
B. The last item listed begins with ‘G’ and this is not close to the end of the alphabet
C. The file states that there are 7 items in total
D. The arrow at the side of the main tag has not been expanded

Question 11
The <YEAR> field would most likely be of type:
A. Character
B. Text
C. Number
D. Date

Question 12
Client.Surname could be an example of a piece of code you might see when dealing with:
A. a one dimensional array
B. a hash table
C. a data dictionary
D. a record

Question 13
When coding a search for data in an array, Joey decides to start at the beginning and check each array
element in turn. This is called a:
A. linear search
B. brute force
C. binary search
D. hash function

Question 14
What is a trace table and what is its main purpose?

__________________________________________________________________________________

__________________________________________________________________________________
2 marks

Question 15
Consider the following array of numbers that is to be sorted in ascending order using a selection sort.
The sort is programmed to move through the array from left to right.
Index         0   1   2   3   4
List(Index)  65  23  45  62  27

a. On the first pass through the array, what number would be selected?

Value: ______________________________________________________________________
1 mark
b. How many passes will it take to sort this array?

Value: ______________________________________________________________________
1 mark

c. If this array had 1000 elements, how many passes would it take to sort this array?

Value: ______________________________________________________________________
1 mark
d. If this array had 1000 elements, how many comparisons would it take to sort this array?

Value: ______________________________________________________________________
1 mark

Sample Examination Answers

Question 1
Answer: B

A hash function on its own does not sort data. Best response in this case is option B.
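
As a rough illustration of why option B is the best response, the sketch below uses a deliberately
simple hash function (the sum of the character codes modulo the table size) to decide which bucket a
key is stored in. This hash is an assumption chosen for readability, not a recommended one, and the
names `simple_hash`, `build_table` and `find` and the sample surnames are hypothetical.

```python
def simple_hash(key, size):
    """Map a string key to a bucket index (illustrative only)."""
    return sum(ord(ch) for ch in key) % size


def build_table(keys, size=7):
    """Place each key in the bucket chosen by the hash function."""
    table = [[] for _ in range(size)]
    for key in keys:
        table[simple_hash(key, size)].append(key)
    return table


def find(table, key):
    """Look only in the one bucket the key could be in."""
    bucket = table[simple_hash(key, len(table))]
    return key in bucket


table = build_table(["Smith", "Nguyen", "Patel", "Jones"])
print(find(table, "Patel"))   # True
print(find(table, "Brown"))   # False
```

The search examines a single bucket rather than the whole list, which is what makes the data easier
to find. Nothing in the table ends up in sorted order.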

Question 2
Answer: A
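
The three validation techniques named in this chapter can be sketched together. The example below is
a minimal illustration, assuming a hypothetical `validate_age` function and an allowed range of 0 to
120. It applies an existence check, then a type check, then a range check, and reports a message
instead of crashing on bad input.

```python
def validate_age(raw, low=0, high=120):
    """Return (is_valid, message) for a raw text input."""
    # Existence check: something must have been entered.
    if raw is None or raw.strip() == "":
        return False, "No value entered"
    # Type check: the text must represent a whole number.
    try:
        value = int(raw)
    except ValueError:
        return False, "Value is not a whole number"
    # Range check: the number must fall inside the allowed limits.
    if not low <= value <= high:
        return False, "Value is out of range"
    return True, "OK"


print(validate_age("17"))    # (True, 'OK')
print(validate_age("abc"))   # (False, 'Value is not a whole number')
```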

Question 3
Answer: D
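
To make the difference concrete, the sketch below stores one made-up record in both formats and reads
each with Python's standard library. The field names and values are invented for illustration. In the
CSV text the field names appear once in a header row, while in the XML text every value is wrapped in
tags that name the field and show the structure.

```python
import csv
import io
import xml.etree.ElementTree as ET

# The same hypothetical record in both formats.
csv_text = "title,artist,year\nSample Album,Sample Artist,2001\n"
xml_text = (
    "<catalog>"
    "<cd><title>Sample Album</title><artist>Sample Artist</artist><year>2001</year></cd>"
    "</catalog>"
)

# CSV: the first row supplies the field names for every row after it.
csv_rows = list(csv.DictReader(io.StringIO(csv_text)))

# XML: each <cd> element carries its own field names as tags.
root = ET.fromstring(xml_text)
xml_rows = [{field.tag: field.text for field in cd} for cd in root.findall("cd")]

print(csv_rows[0]["year"])             # 2001
print(xml_rows[0]["year"])             # 2001
print(len(csv_text) < len(xml_text))   # True
```

Both versions hold the same data, but the XML version is longer because the tags are repeated for
every record.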

Question 4
Answer: C

All 13 items must be compared if the item is the last element or is not contained in the array at all.

Question 5
Answer: B

Question 6
Answer: C

Question 7
Answer: C

The answers to Q6 and Q7 are the same. In the worst case, a binary search of 13 sorted items takes the
same number of comparisons (4) whether or not the item is present.
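
The answers to Questions 4, 6 and 7 can be checked with a short sketch. The functions below are
hypothetical helpers written for this illustration. Each returns the position found (or `None`)
together with a count of how many array items were examined, which is the sense in which
'comparisons' is used in these questions.

```python
def linear_search(data, target):
    """Return (index or None, number of items examined)."""
    examined = 0
    for index, value in enumerate(data):
        examined += 1
        if value == target:
            return index, examined
    return None, examined


def binary_search(sorted_data, target):
    """Return (index or None, number of items examined)."""
    low, high = 0, len(sorted_data) - 1
    examined = 0
    while low <= high:
        mid = (low + high) // 2
        examined += 1              # one array item is examined per loop
        if sorted_data[mid] == target:
            return mid, examined
        if sorted_data[mid] < target:
            low = mid + 1
        else:
            high = mid - 1
    return None, examined


data = [33, 29, 76, 32, 3, 17, 91, 28, 30, 85, 19, 64, 52]
ordered = sorted(data)             # a binary search needs sorted data

print(linear_search(data, 52)[1])                          # 13 (last item)
print(linear_search(data, 100)[1])                         # 13 (not present)
print(max(binary_search(ordered, x)[1] for x in ordered))  # 4
print(binary_search(ordered, 100)[1])                      # 4 (not present)
```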

Question 8
Answer: D

Question 9
Answer: D

Question 10
Answer: A

Question 11
Answer: C

Question 12
Answer: D
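
In Python, one way to sketch a record is with a data class, where each field is reached through dot
notation in the same style as Client.Surname. The field names other than the surname, and the sample
values, are assumptions made for this illustration.

```python
from dataclasses import dataclass


@dataclass
class Client:
    """A record: a fixed set of named fields, each with its own type."""
    surname: str
    first_name: str
    postcode: str


client = Client(surname="Nguyen", first_name="Kim", postcode="3000")
print(client.surname)   # Nguyen
```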

Question 13
Answer: A

Question 14
A trace table is a way of recording the current values of variables in an algorithm as it is tested.
The main purpose of a trace table is in testing – the benefit being that a software developer
can see all of the values and how they change line by line.
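
A trace table is normally filled in by hand, but the idea can be sketched in code. The hypothetical
function below adds up a list and records the value of each variable after every pass through the
loop, producing the rows that a trace table would contain.

```python
def trace_running_total(numbers):
    """Sum a list while recording each step as a trace table row."""
    trace = []                      # one row per loop iteration
    total = 0
    for index, value in enumerate(numbers):
        total += value
        trace.append({"index": index, "value": value, "total": total})
    return total, trace


total, trace = trace_running_total([4, 7, 2])
print("index value total")
for row in trace:
    print(row["index"], row["value"], row["total"])
# prints:
# index value total
# 0 4 4
# 1 7 11
# 2 2 13
```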

Question 15
a. 23
b. 4
c. 999
d. 499,500
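
These values follow from how a selection sort works. An array of n elements needs n - 1 passes, and
the passes make (n - 1) + (n - 2) + ... + 1 = n(n - 1)/2 comparisons in total. For n = 1000 that is
1000 × 999 / 2 = 499,500. The sketch below is one straightforward way of writing the sort, with
counters added so that the answers can be checked. The function name and return format are
assumptions made for this illustration.

```python
def selection_sort(values):
    """Sort a copy of values ascending; return (sorted list, passes, comparisons)."""
    items = list(values)
    passes = 0
    comparisons = 0
    for start in range(len(items) - 1):
        passes += 1
        smallest = start
        # Scan the unsorted part, left to right, for the smallest value.
        for position in range(start + 1, len(items)):
            comparisons += 1
            if items[position] < items[smallest]:
                smallest = position
        # Swap the selected value into its final place.
        items[start], items[smallest] = items[smallest], items[start]
    return items, passes, comparisons


print(selection_sort([65, 23, 45, 62, 27]))
# ([23, 27, 45, 62, 65], 4, 10)

_, passes, comparisons = selection_sort(range(1000, 0, -1))
print(passes, comparisons)   # 999 499500
```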

Chapter 4
From planning to analysis

This chapter covers Unit 3: Area of Study 2 key knowledge:


• Data and information
o KK2.2 Techniques for collecting data to determine the needs and requirements, including
interviews, observation, reports and surveys
o KK2.3 Functional and non-functional requirements
o KK2.4 Constraints that influence solutions, including economic, legal, social, technical and
usability
o KK2.5 Factors that determine the scope of solutions
o KK2.14 Development model approaches, including agile, spiral and waterfall
o KK2.15 Features of project management using Gantt charts, including the identification
and sequencing of tasks, time allocation, dependencies, milestones and critical path
• Interactions and impact
o KK2.16 Goals and objectives of organisations and information systems
o KK2.17 Key legal requirements relating to the ownership and privacy of data and
information.

Key terms: design brief, mission statement, goals, objectives, information system, quantitative,
qualitative, solution requirements, functional requirements, non-functional requirements, constraints,
scope, project management, tasks, resources, critical path, developmental models, waterfall, agile,
spiral.

Study roadmap

[Table: chapters 1 to 10 mapped against Unit 3 (Area of study 1: Programming; Area of study 2:
Analysis and Design) and Unit 4 (Area of study 1: Development and Evaluation; Area of study 2:
Cybersecurity: software security)]

Planning a new software solution

One of the first steps in moving forward with the creation of a new software solution is to plan the
various stages and timelines so that the scope of the project can be better understood. The process
of doing this is often placed under the heading of ‘project management’. Note that it is only really
possible to undertake the planning of a project if there is already an understanding of the aims of
the proposed software solution. If this has not yet been determined, it may first be necessary to
determine what the problems are within the information system and what may need to be done to solve
them.

A design brief

A design brief is a simple document (or statement) that outlines what the problems are and anything
about the nature of these problems that is currently understood. Note that a design brief serves to
start the analysis process and is not an end product of analysis. It may not contain much detail at
all.

One of the first things that a software developer needs to come to an understanding about is the
goals and objectives of the organisation. It is also important to gain insight into the existing
systems that are in place and how the proposed development outlined in the design brief fits into the
overall picture.

Organisations and goals

An organisation is an entity that has a collective goal (often stated broadly in a ‘mission
statement’). While organisations differ in size (from sole traders or individuals up to large
multi-nationals), they can also differ in many other aspects such as:

• government / private (commercial)
• not for profit
• charities
• clubs and groups
• political parties.

While there is a lot that can be said about organisations and their structure, we are not going to
explore this in depth. Instead our focus is on the goals and objectives of an organisation. As a
software developer (and indeed as an employee of an organisation), it is vital that you have an
understanding of the goals that the organisation has, how these translate into specific objectives
and what information systems exist within the organisation to help them achieve these goals and
objectives.

Mission statements

A mission statement is essentially a statement of the ultimate goals of the organisation. It is a
statement of why the organisation exists. The mission statement guides the actions of the
organisation and provides the framework through which all other strategy or decisions are formulated.
Often the directors of organisations or the board of management will find themselves being drawn back
to the mission statement when discussing new business decisions.

While there is a lot more to organisational goals and objectives than simply a mission statement, it
is the starting point. A mission statement is often a public statement that can be found on an
organisation’s website.

For example, the mission statement for Microsoft is to create a family of devices and services for
individuals and businesses that empower people around the globe at home, at work and on the go, for
the activities they value most. While this is a very broad statement, it is the starting point for
many other goals and objectives and is the core phrase that the organisation refers back to when
making decisions.

Goals and objectives

Goals and objectives provide a framework for organisations to assist them to make strategic
decisions. Whenever a new proposal or course of action is before the board of management for
consideration, they will often refer back to the core goals and objectives for guidance. It is when
these goals and objectives are ignored or forgotten about that organisations risk making decisions
that not only put the business at risk, but can potentially change the way people perceive the
company as a whole.

Goals are statements that describe a (potentially) future state or principle that the organisation
strives to uphold or achieve. Objectives support the goals by providing some targets with measurable
results.

Goals serve four basic purposes. They provide guidance and direction, assist planning for the future,
inspire those within the organisation and support the evaluation process.

Typical examples of goals for organisations include the following things:

• profit (for commercial organisations only)
• achieving a specified percentage of market share
• service to a cause (for non-profit organisations)
• quality products
• excellent customer service (for non-profits this may well be their main goal)
• excellent reputation
• the price and reliability of products
• the care and development of staff.

As goals are very broad statements (generally), it can be difficult for an organisation to know the
best way to act on them. In addition to this, it is important to be able to reflect on and assess the
success of these goals – that’s where organisational objectives come into play. Organisational
objectives are often quantifiable statements that expand upon the goals of an organisation in a way
that allows them to be assessed.

Some examples of organisational objectives might be:

• increase sales in the next financial year by 10%
• reduce the number of customer complaints by 50%
• ensure that 95% of products ordered online are packaged and delivered within 5 working days
• maintain an employee retention rate of 95%
• engage 3 new clients per quarter.

Information systems

The term ‘information system’ encompasses the combination of people, procedures, equipment and data
that process data and information. It is an important definition as it is easy to think of an
‘information system’ as simply consisting of hardware and software. The processes that are used by
the organisation, the data within the system and the roles that people play are just as important (if
not more so) and have a significant impact on the creation of a software solution.

Information system: the combination of digital hardware, software and network components (digital
systems), data, processes and people that interact to create, control and communicate ideas and data
in digital solutions.
VCAA VCE Computing Study Design 2020-2023 Glossary

Information systems within organisations support the goals and objectives of the organisation. If an
information system is not doing this or not doing it well, then its function needs to be
re-evaluated.

Figure 32: An information system in action

Goals and objectives of information systems

In order to support the goals and objectives within an organisation, information systems have their
own goals and objectives. These goals and objectives are expressed in a very similar style to those
of the organisation as a whole.

Some examples of information system goals are shown below.

• Keep track of all of the employee payroll information and issue payments on time.
• Record customer transactions, manage stock and issue receipts.
• Record attendance and manage class rolls.
• Produce quarterly reports of sales.

You will notice that these goals are similar in style to the overall goals an organisation will have.
The key difference between organisational and information system goals is that organisational goals
describe the overall goals that the organisation has and information system goals describe what each
system within the organisation is trying to achieve.

Information system objectives perform the same purpose as organisational objectives. Some examples of
information system objectives are shown below:

• Ensure that all data is securely archived and that no data is lost.
• Provide a response time to user queries that is acceptable to all users.
• Perform in a stable and reliable fashion for 99.5% of the time (up-time).
• Process up to 500 reports per month.
• Store up to 10,000 transactions per week.
• Reduce the number of user generated validation errors by 50%.

Types of interactions (inputs and outputs) generated by information systems

The inputs and outputs from an information system will be determined by the type of information
system concerned. Information systems are categorised by their purpose and function. The study of the
different types of information system can be a detailed topic on its own and for that reason, we will
simply examine how they differ functionally.

The information systems that deal primarily with the processing of transactions are grouped together
as their function is similar. These systems are categorised by their ability to process transactions
as they occur continuously. An example of such a system might be a ticketing system for a concert
hall. The types of inputs and outputs processed by this system are:

• list of performances being offered (output)
• list of available dates for each performance (output)
• performance name (input)
• performance date (input)
• availability of tickets (output)
• categories of ticket available (output)
• price of tickets for each category (output)
• requested tickets (input)
• customer details (input)
• billing details (input)
• ticket confirmation (output).

The transactions that occur in such a system would be roughly in this order. You will also notice
that the billing of the customer would possibly be handled by another system that just handles these
processes (verifying credit cards, etc.) passing information to and from the ticketing system.

Another type of information system is used by the management of organisations to help them make
decisions. Examples of this sort of system are those that assist with decisions such as whether a
stock item is worth keeping in a store or how classes of students are performing in a subject from
year to year.

Let’s look at an example of an information system that is comparing the results of students in a
range of subjects. A system such as this (or more correctly, many such systems), is used by VCAA to
analyse student data and respond to the needs of students.

The types of inputs and outputs processed by this system are:

• student enrolments for the year (input)
• GAT results (input)
• SAC results (input)
• moderated scores (output)
• results of previous years (input)
• analysis of the year’s cohort compared to that of previous years (output).

This is a simplification of the overall processes involved, but hopefully this illustrates how an
information system takes in data of many different kinds, processes it and then produces information
tailored to the needs of the organisation.

Determining the solution requirements

One of the first steps in undertaking the creation of a software solution may well be the
determination of what the actual problems are. It will be difficult to begin the planning of the
solution if the scope and context of the current problems are not fully understood. What does the
solution need to provide? There are a variety of ways in which this can be determined and they all
come under the heading of ‘data collection’.

Determining the solution requirements is the first activity listed under analysis in the PSM.

Data collection

Organisations are typically motivated to change their information systems in response to problems
that may be present. One of the challenges is defining these problems in ways that are quite
specific. For example, employees may complain that the system does not perform well during peak times
and management may not be able to produce reports in the desired format. The PSM can be used to
specifically identify the nature of information problems.

Data collection is an essential part of the analysis phase of the PSM. This is where a systems
analyst will collect all of the data on the existing system in order to document its operation and
come to a better understanding of the problems that are being encountered.

There are a number of ways in which this can be done. Data can be gathered through the use of
interviews, by observing the operation of the information system and by conducting surveys or issuing
questionnaires.

Before discussing each of these in more detail, it is useful to understand the distinction between
quantitative and qualitative data.

Quantitative and qualitative data

All data or information can be classified as being either quantitative or qualitative. Quantitative
data is data that can be easily processed in a statistical manner. Usually this means that the data
is composed of definite numbers. Qualitative data is data that consists of descriptive details,
usually the type gathered in surveys or interviews.

Surveys and questionnaires

Surveys and questionnaires are useful to gather data from employees, clients, management, the
community and other stakeholders. This data will often include satisfaction levels with the current
system and opinions about how to solve the present difficulties. This form of analysis could also be
presented as a test or some form of assessment, if this were appropriate.

Surveys and questionnaires can take a number of different forms. Structured forms such as multiple
choice or scaled responses are commonly used as these allow data to be processed easily, but
unstructured forms in which opinions can be freely expressed also have their place.

Collecting data in this way can have a number of advantages. As surveys and questionnaires are easy
to administer and not expensive, the overall satisfaction with the system can be easily gauged.
Despite this, processing responses can be time consuming unless an automated method is used. In
addition to this, retrieving completed responses can sometimes be difficult, as those surveyed often
feel that there is little incentive to do so. It can
also be difficult to determine who has filled out a survey and therefore how legitimate responses
are.

Figure 33: Customer survey

Interviews and focus groups

Interviews and focus groups can be used to obtain qualitative data on the use of the system and
attitudes to it. They have the advantage of providing richer data than that obtained through the use
of surveys alone. In fact, sometimes interviews and focus groups may shed light on responses that
have been retrieved through other means but were not fully understood. Smaller focus groups are
preferable to larger ones, as in a smaller group, those participating may be more inclined to
respond. Documentation of interviews or focus groups can be tedious, but technologies such as scribe
pens, mp3 recorders, web-cams, etc. can make this task easier.

Observations

Observations provide a method by which data can be gathered on how the system operates and how it is
used, in an unobtrusive manner. Observations can be conducted in a very structured way with specific
actions being recorded using a predetermined method. For example, data could be gathered on how a
system operates during a stock-take, with observers recording the interaction between operators and a
particular software package. Alternatively, observations may involve an observer simply recording
what is occurring in a system as a whole. Typically, a number of individuals are employed to conduct
the observations so they can be conducted simultaneously.

Observations can be time consuming and expensive to carry out. Their strength is that they provide a
view of the system that is unbiased by the opinions of those within it. It is important to recognise
that observations only provide a snapshot of the system within the time frame of the observation. For
this reason, the chosen time frame is very important. It is also important to acknowledge that people
being observed often behave differently, which may have an effect on the data that is gathered.

Solution requirements

The requirements of a software solution can be classified as either functional or non-functional.
Often the difference between functional and non-functional solution requirements is described as
being the difference between what a software solution is going to do as opposed to what qualities it
has.

Figure 34: User-friendly music software

Functional solution requirements

Functional requirements are directly related to what the software solution is required to do. What
inputs will it receive, what outputs will it generate and how will it behave? The functional
requirements of the solution are often determined after an in-depth analysis of what is currently in
place and what is required. Much of what we will discuss in the next few
chapters will be concerned with ways in which this analysis can be performed.

Non-functional solution requirements

Non-functional requirements are related to the characteristics of the software solution, such as
usability, reliability, portability, robustness and maintainability. Let’s look at each of these in
turn.

Usability

Usability is related to how easy it is for users to make use of a software solution. It isn’t a
measure of how happy or content users are with a solution, but rather is measuring how clear,
intuitive, logical and accessible a software solution is. A so-called user-friendly software solution
may work in one environment but fail in another – as the requirement of user-friendliness is strongly
dependent on the users themselves. This is an important consideration. It is paramount to have an
understanding of the users, their relative skill levels, expertise and needs in order to write a
software solution that they would consider user-friendly.

Reliability

Reliability is a measure of how long a software solution can perform its required functions in the
operating environment in which it was designed. Often this is expressed as a probability.

Portability

Portability is the ability for a software solution to be moved from one environment to another. In
Software Development, portability is generally achieved by separating the logic code from the
interface code. If such a separation is performed, a software solution can be migrated to a new
environment with changes only being necessary to the interface code (which will save money and time).
What’s more, the function of the software solution will not have changed from one system to another.
If a software solution is being developed for use at an organisation where it will be used across a
number of different platforms, dividing the solution up between the logic and the interface will also
save a great deal of money and time.

Robustness

Robustness is a measure of how well a software solution responds to poor use or inputs. When
designing any software solution, it is very important to validate all of the input to ensure that the
software works correctly, but it is equally important to ensure that the program continues to run and
will not fail with an incorrect input. A software solution that was not robust, for example, would
crash if a user entered some incorrectly typed data into one of its inputs.

Maintainability

Maintainability is mostly concerned with how easily a software solution can be fixed if and when
problems occur. In addition to this, the term maintainability is also used to describe how easy it is
to make modifications or additions to the software and how well the software copes with changes made
to its environment.

There are measures that are used in the field of software development to gauge the maintainability of
a software solution over time. The key reason why this is a very important measure is that there is a
definite correlation between the maintainability of a software solution and time. Maintainability is
generally excellent when a software solution is first implemented, but it decreases as time goes on.
As the process of maintaining a solution can be costly, there is often a time when the cost of
creating a new software solution is more viable than maintaining an old one. Predicting or detecting
when this point has been reached can save an organisation a great deal of money.
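
The idea of separating logic code from interface code, described under Portability, can be sketched
as follows. This is a minimal illustration under assumed names: `ticket_total`, `render_console` and
`render_html` are hypothetical, as are the figures used. The logic function performs no input or
output, so moving to a new environment means writing only a new interface function.

```python
def ticket_total(quantity, unit_price):
    """Logic code: no input or output, so it can be reused unchanged."""
    return quantity * unit_price


def render_console(total):
    """Interface code for a text console."""
    return f"Total: ${total:.2f}"


def render_html(total):
    """Interface code for a web page."""
    return f"<p>Total: <strong>${total:.2f}</strong></p>"


total = ticket_total(3, 42.5)
print(render_console(total))   # Total: $127.50
print(render_html(total))      # <p>Total: <strong>$127.50</strong></p>
```

Both interfaces display the same result because the calculation exists in only one place.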

Identifying the constraints on the solution

It is very important to understand the constraints that are placed upon a solution. Without an
understanding of the constraints, the success of the solution may be in jeopardy.

Economic factors

Probably the most common and obvious constraint is that of cost (economic). This will vary quite a
bit from situation to situation and will depend on the size of the organisation, the budget and the
scale of the project.

Often when reflecting on the cost of a project, software developers will focus on the resources
required, whereas the main cost is generally time. Development time is valuable and it may be that
the time frame is set by the production milestones. While this is a factor, the cost of having a
software developer or a project team working on a solution over a period of time may mean that the
overall cost of a project is much more expensive than an organisation realises. This is especially
true when organisations are using their own software developers who are on staff, as they aren’t
considering what it would cost to employ someone on a contract basis specifically to create the
solution. When employing a software developer on a contract basis, the cost of the software
developer’s time is transparent, and may cause an organisation to reconsider their options or their
budget.

Technical factors

Security features may be needed or may be part of the existing infrastructure and need to be catered
for. Compatibility with existing hardware and software may also be a factor as well as the speed of
processing required (or available) and the capacity of the existing system. The solution may need to
be developed for multiple platforms as well.

Let’s expand upon some of these briefly.

Technical factors: speed of processing

The speed of processing required of a system will vary depending on what the system has been built to
do – an air traffic routing system will certainly need to process a large amount of data quickly as
opposed to an online diamond purchasing website.

The speed of processing of a system is often exhibited in its response rate. The response rate is a
measure of how long it takes for certain actions within the system to be completed. If the response
rate is too long, users may get frustrated with the system and will view the software product
negatively. Similarly, a slow response rate could result in loss of customers and loss of money, as
completing a task may take longer than it should.

Speed of processing is often measured in IPS (instructions per second) which in turn becomes MIPS
(million instructions per second) and GIPS (giga instructions per second). The number of instructions
that a system can process is often not the best way to represent its speed, as the speed can be
significantly influenced by factors such as the CPU architecture and memory structure.

Technical factors: capacity

The capacity of the current information system is a significant constraint on any proposed software
development. Capacity can include simple factors such as the amount of available hard drive and/or
network storage or the amount of memory within the computer systems that will use the software
solution. The network infrastructure will place a limit on the amount of data that can be transferred
simultaneously. It may also place a functional or physical limit on the number of people that can use
the software solution at the same time.

Technical factors: availability of equipment

The availability of equipment may be a constraint in a number of ways. It might be the
case that the software solution is being developed for an information system that is being upgraded
and does not have the required hardware and software components as yet. It might also be the case
that the current hardware and software will not be upgraded at all to cater for the new solution, so
the software developer must work within the reasonable limits of their capacity. This leads into the
next technical constraint that needs to be considered – compatibility.

Technical factors: compatibility

As with the previous constraint, a new software solution will often need to work within the bounds of
the existing system. This includes being compatible with the existing components. On a base level, a
software developer needs to be aware of the operating system that the existing system is utilising.
On top of this, a software solution will need to send data to and from related systems and databases
and will need to be compatible with the file formats that these systems use. The specifications of
the devices within the system will need to be catered for so that the software solution runs in an
optimal fashion when being used by these devices.

Technical factors: security

An organisation may have specific requirements in terms of security for the new software solution.
Features such as encryption may need to be included and data may need to be stored and accessed from
secure network servers within the building or off-site. A software solution may need to be able to be
integrated with an SSO (single sign on), which essentially means that a user can access the system
once they have logged in to the main portal of the organisation.

Social factors

Constraints on the solution of the ‘social’ type can include the level of expertise of users,
availability of technical support staff, the time available to develop the solution and the
availability of equipment. There also may be users in the system who have special needs such as sight
or hearing impairment or a disability.

Legal requirements

Without discussing the specifics of the legislation concerning copyright and privacy, it is easy to
comprehend how legal requirements within organisations act as a constraint on a software solution.
For example, an organisation’s privacy requirements may dictate that a software developer needs to
protect the data within a software solution and include security features that prevent unauthorised
access. This will undoubtedly take time and add to the cost of the development. Chapter 9 explains
the relevant legislation in detail and discusses some of the issues surrounding it.

An organisation also needs to consider the ownership of key components. Now that we are in the era of
cloud computing, key software solutions may well be located off-site or indeed in another country. It
is important that an organisation has a clear agreement with the software solution provider that
ensures the ownership of their data remains with them.

Usability factors

Another constraint on the software solution is usefulness. This is not to be confused with the ease
of use of a solution.

Usability factors: usefulness

Usefulness is the measure of the ability of something to satisfy a need. As a constraint, this may be
the requirement from users that a new software solution performs in a particular way or offers a
number of core features. This may be a constraint as a direct result of an evaluation of the previous
system that may have identified some specific concerns that users have.


Usability factors: ease of use

The ease of use of a proposed software solution is a constraint as it may place some specific requirements on the development of the user interface. There may be aspects of the development that need to concentrate on the function of the new system when used on the types of devices that users have. There may have been difficulties in the past with the number of steps or complexity of tasks within the software. There may have also been issues with the ease with which data was entered and validated.

Identifying the constraints on the solution is the second activity listed under analysis in the PSM.

Determining the scope of the solution

The scope of the solution is similar in ways to the constraints. The factors that may have played a part in forming the constraints of the solution may likewise help to form the scope. In this sense, the scope of the solution may be directly determined by the constraints or the scope (most likely) will fall within the constraints.

The scope defines what the software solution will do and what it won’t do. It defines what the boundaries of the solution will be and what benefits there will be for users. In a way the scope identifies the responsibilities and parameters of the solution. Benefits are often stated in terms of efficiency and effectiveness measures.

For example, an organisation might be wishing to develop a software solution for their accounting department. The accounting department, however, might stipulate that the system needs to be ready for the start of the financial year. The original scope of the project may have been too large to fit into this constraint, so the scope of the solution is reduced to ensure that there is enough time to get it finished for the start of the financial year.

It is quite possible that a solution is being developed for just one part of an organisation. In a case such as this, the constraints would be focused just on that one part and would probably not affect the rest of the system (although the rest of the system may have imposed some constraints on the solution). The scope of the solution in this case would be limited to the part of the organisation being examined.

Determining the scope of the solution is the third activity listed under analysis in the PSM.

Project management

The field known as project management is one that can be undertaken by a person within a project as a full-time task. While this is true, having project management skills is essential to the successful delivery of software solutions on time and to specification. The level of project management required is influenced by factors such as the size and complexity of projects being undertaken. For example, there may be a date by which the project needs to be operational or there may be times when the system will be unavailable or the company closed. These would be examples of project constraints on the proposed software solution and they may also have an impact on the scope. One of the main tasks that a project manager performs is breaking the project down into a series of key tasks that need to be accomplished. This list of tasks is important for software developers, as it not only helps to structure how the software solution will be put together, but also helps to divide a project up in a team development environment.

Project management enables targets to be defined, resources to be managed, deadlines to be met and changes in circumstances to be dealt with more easily. It is extremely useful in determining budgets and is often used to present a project proposal for tender. Although there are a number of tools that can be used to make this task easier, we will discuss one of the main ones.

Gantt charts

A Gantt chart is a type of bar chart that is used to show how a project’s tasks progress from one to another. It was originally created by Henry Gantt, a mechanical engineer, who first published the chart in 1910. A Gantt chart is able to show which tasks can be done simultaneously, and which tasks depend on others to be completed before they can be commenced. These are referred to as dependencies. It also shows points in time when a certain stage must be reached. These are called milestones. An example of a Gantt chart is shown in Figure 35.

Figure 35: An example of a Gantt chart

In the Gantt chart shown in Figure 36, you can see milestones at points in the project timeline represented as diamonds. The separate tasks are drawn from their starting date to their finishing date and have arrows leading to tasks for which they are a predecessor (that is, a dependency). The long horizontal bars are used to represent stages of the project as opposed to separate tasks that need to be completed.

Identifying tasks and resources

Time allocation and sequencing are of prime importance in project management. What are the main tasks that need to be completed to take a particular project from start to finish? The identification of these tasks is vital as it will often give those involved in the project a good sense of the scope of what needs to be achieved as well as how easily the tasks can be distributed amongst those involved. Not only this, but some tasks will be dependent upon others being completed first (known as a dependency, as mentioned before).

It is important to remember that a project is not just about completing a number of tasks in a set amount of time. It is also about managing resources. Resources can include personnel, hardware and software, as well as a variety of other items such as hiring venues for consultations. These resources are only available in finite amounts and so the construction of a project management timeline must account for the shared usage of them.
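
The point about shared resources can be illustrated with a short sketch. It assumes each task has been recorded with a start day, an end day and the single resource it needs; the task names, days and resources are hypothetical and are not taken from any figure in this chapter. The function lists pairs of tasks that book the same resource on overlapping days, which is the kind of clash a project timeline needs to avoid.

```python
from itertools import combinations

# Hypothetical tasks: name -> (start_day, end_day, resource). Days are inclusive.
TASKS = {
    "Design interface": (1, 4, "Designer"),
    "Build prototype": (3, 8, "Developer"),
    "Write user guide": (4, 6, "Designer"),
    "Test prototype": (9, 10, "Test lab"),
}


def resource_clashes(tasks):
    """Return pairs of task names that need the same resource on the same day."""
    clashes = []
    for (name_a, a), (name_b, b) in combinations(tasks.items(), 2):
        same_resource = a[2] == b[2]
        # Two inclusive day ranges overlap when each starts before the other ends.
        overlap = a[0] <= b[1] and b[0] <= a[1]
        if same_resource and overlap:
            clashes.append((name_a, name_b))
    return clashes


print(resource_clashes(TASKS))
```

In this made-up plan the designer is booked for two tasks on day 4, so one of those tasks would need to move or a second designer would need to be found.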

Planning for things to go wrong is almost as important as planning itself. If sufficient slack is built into the project timeline ‘just in case’, then unforeseen problems are less likely to cause the project to become late. It is equally important to have contingency plans in case a major difficulty occurs. A contingency plan may be one in which the functioning of the old system can be returned for a period of time or a manual system can be put into place so that business operation is not lost.

Figure 36: Recording progress against the initial plan

Using a tool such as a Gantt chart allows all tasks to be visually represented, resources allocated to each task and their progress monitored. This is of prime importance, as project timelines will often change for a variety of reasons: the availability of resources can change, the completion of tasks can take longer than expected (and have a flow-on effect to subsequent tasks) or critical deadlines can be changed.

Recording progress

Recording the progress of a project is almost as important as making the initial plan. Projects almost never progress from start to finish without a variation in times, resources or schedules. While it is important to allow for this as the project plan is being constructed, it is equally important to monitor changes to the project plan as it is being implemented. By keeping a close eye on how the project is progressing compared to the initial plan, it is easier to respond in order to accommodate these changes.

As a project progresses, a good project manager will regularly examine the plan and adjust tasks for the times that they are completed or estimated to be finished. This will in turn have an effect on the overall timeframes within the project and may mean that some resource bookings need to be modified (outside contractors, system testing times).

A good project manager will also make annotations to tasks as changes are made or additions catered for. They will keep a comprehensive log of changes so that when they need to make reports to their superiors or the organisation, they are able to explain the variations to the initial plan that have taken place and the impact of these changes.

Critical Path

Once the initial draft of a project management plan has been created, a project manager can examine the tasks, resources, times allocated and dependencies and gain an overall view of the project. Various milestones will fix the timeline at points in the plan, but other than this, there may still be some flexibility in regards to how long tasks will take and the order in which they will be done.

Figure 37: A section of a Gantt chart, and then the same chart with the critical path highlighted.

Of specific interest to a project manager will be the critical path. The critical path in a project is defined as the longest path from the beginning to the end of the timeline. What this represents is the path along which any delay will affect the completion date of the project.
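
The definition of the critical path as the longest path through the project can be sketched in code. The example below is a minimal illustration, assuming each task is described only by a duration in days and a list of predecessor tasks; the task names and durations are hypothetical. It works out the earliest day each task can finish, then traces back from the last task to finish to recover the longest chain of dependent tasks.

```python
# Hypothetical project: task -> (duration_in_days, [predecessor tasks]).
TASKS = {
    "A": (2, []),
    "B": (3, ["A"]),
    "C": (1, ["A"]),
    "D": (4, ["B", "C"]),
    "E": (2, ["C"]),
}


def critical_path(tasks):
    """Return (total_duration, [task names]) for the longest dependency chain."""
    finish = {}      # earliest finish day of each task
    came_from = {}   # the predecessor that determines when each task can start

    def earliest_finish(name):
        if name not in finish:
            duration, predecessors = tasks[name]
            start = 0
            came_from[name] = None
            for pred in predecessors:
                # A task cannot start until its slowest predecessor has finished.
                if earliest_finish(pred) > start:
                    start = earliest_finish(pred)
                    came_from[name] = pred
            finish[name] = start + duration
        return finish[name]

    # The task that finishes last marks the end of the critical path.
    last = max(tasks, key=earliest_finish)
    total = finish[last]
    path = []
    while last is not None:
        path.append(last)
        last = came_from[last]
    return total, path[::-1]


print(critical_path(TASKS))
```

For this made-up project the longest chain is A, B, D, taking 9 days in total. Tasks C and E are not on the critical path, so they could slip a little without changing the completion date.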


We have just been discussing how the progress of a project will change over time and how important it is to reassess tasks and resources and make adjustments as required. The critical path of a project is important because any delay to a task on the critical path will delay the project. A project manager who is aware of this and makes frequent adjustments to their timeline will, on recording a delay to a task on the critical path, need to recognise this and adjust a later task on the critical path to compensate. Tasks that lie on the critical path should have dependencies attached to them and should have no ‘slack time’ (that is, time at the end of the task in which the task can be delayed without penalty or negative effect).

Development model approaches

While time is an important factor in determining milestones and the duration of tasks within a project management plan, another important consideration is the development model that will be used.

In the first chapter we introduced the PSM and discussed the four stages within it and their associated activities. The natural way to envision how this would be implemented is in a purely sequential way. That is, starting at the analysis stage, completing the activities in order, and then moving on to the design stage. This linear progression makes sense and is often referred to as the “waterfall model”.

In fact, the development models that follow can be applied to any methodology that is being used, and in some cases, will over-ride the stages and activities within it. The PSM lends itself to being implemented using the waterfall model, and this course is set up for it to be used in this way.

The waterfall model

The waterfall model is a development model that moves through the stages of the PSM in order. Each stage is started and the activities within each are completed in order, before moving to the next stage. On first reading of the PSM, it would seem that this is the most logical way for it to be implemented, though in practice, this is not the case. The reason why this model is called the waterfall model is that once stages or activities are completed, they cannot be revisited. This can be illustrated in this way.

Figure 38: Waterfall model

Figure 38 only shows the 4 stages and not the activities, but the concept should be clear. The waterfall model does not allow for any overlap of activities or stages. Once an activity or stage is completed, the model moves on to the next one. Likewise, stages or activities cannot be revisited. Once they have been concluded, they are considered finished.

Advantages:
• Simple, linear progression through all of the stages and activities of the PSM. Easy to understand and use.
• Specific resources and time frames are easy to allocate as the number of steps is known.
• As the stages do not overlap, it is easier to concentrate resources on each task and complete it in the optimal time.


• Easier to keep track of milestones and ensure they are met.
• Works best for projects of a small scale.

Disadvantages:
• Once an application is in the development or evaluation stage, changes cannot be made to the design (even if the need for them becomes apparent).
• Not suited to large scale projects or ones that will take a long time to complete (there is the potential that scope or constraints may change).
• A working prototype or beta version of the application will not be available till late in the development process.

The agile model

Figure 39: Agile model

Unlike the waterfall model that works methodically from start to finish, the agile model makes a quick start and aims to rapidly perform all of the stages of the PSM (or the methodology being used) over and over, gradually improving on the functionality and scope of the final product.

While agile processes are very useful for achieving outcomes that are favourable to all stakeholders, they are prone to causing projects to run over time. For this reason, strict project management is a must. The agile development model can also be susceptible to ‘scope creep’. A hallmark of this model is the client interaction. While this is a very positive thing, it does allow the client to add features to the original design or to change the scope once the process is started, which is an advantage of this model but one that needs to be monitored.

A popular way of implementing an agile development model is using the Scrum framework. In the Scrum framework, cycles are defined as ‘sprints’ and are each given a specific (non-moveable) time-frame (typically several weeks). At the start of the sprint, the client meets with the software development team to discuss their needs (which can be specific or general, large or small). All of these needs are listed and developers are assigned to them. Each day, the development team meets (in a scrum) for 15 minutes to discuss their progress. As a team they problem solve impediments, discuss constraints and scope and share their progress. This process is managed by one member of the development team who has been assigned the role of ‘scrum master’. The scrum master also ensures that everyone understands the process and the time left in the sprint. At the conclusion of the sprint, the client is shown the progress of the work so far and new goals are set for the next sprint.

Figure 40: Using a Scrum in the agile model


Figure 41: A Scrum board in action

Advantages:
• There is a quick turnaround from the initial data collection with the client to the delivery of a software product. This promotes discussion between the software developer and the client and leads to changes and enhancements being made to the design for the next iteration.
• Clients are much more included in the development process.
• Agile development is able to adapt to changing circumstances in relation to the design brief.

Disadvantages:
• The scope of the project may not be fully evident at the beginning.
• The project can quickly go off track, especially if the client keeps changing their mind or circumstances keep changing.

The spiral model

The spiral model of development can be seen in ways to be a merger between waterfall and agile. It maintains the iterative style of the agile model, while placing an increased emphasis on risk analysis.

Figure 42: Spiral model

The spiral model has four phases: planning, risk analysis, engineering and evaluation. A software development project repeatedly passes through these phases, and each pass is called a spiral in this model. In the baseline spiral, starting in the planning phase, requirements are gathered and risk is assessed. Each subsequent spiral builds on the baseline spiral.

Planning phase: Requirements are gathered during the planning phase (that is, an SRS is produced).

Risk analysis phase: In the risk analysis phase, a process is undertaken to identify risk and alternate solutions. A prototype is produced in response to this.

Engineering phase: In this phase software is developed and tested.

Evaluation phase: This phase allows the client to evaluate the output of the project to date before the project continues to the next spiral.

Advantages:
• As spiral is focused on risk analysis, the avoidance of risk is enhanced.


• Good for large projects that have critical elements.
• Additional functionality can be added
at a later date.
• Software is produced early in the
development cycle.

Disadvantages:
• The risk analysis component is a highly
specialised one and requires a
software developer that has
experience in this area.
• Doesn’t work well for small projects.
• Due to the complexity of the phases
and the model itself, it can be costly to
implement.


Context Questions

1. What should be the first step in planning a project?
2. What is the difference between qualitative and quantitative data?
3. Describe one advantage and one disadvantage of collecting data via observation.
4. Give three examples of constraints that can be placed on a solution.
5. What is the difference between functional and non-functional requirements?
6. What is meant by the term ‘robustness’ and what can be done to make a solution robust?
7. What is meant by the term ‘portability’?
8. What is a focus group and who would be included in such a group?
9. What is the difference between a goal and an objective?
10. What is the purpose of a mission statement?
11. What are the four components of an information system?
12. What does it mean to say that the PSM can be implemented using the agile method?

Applying the Concepts

• See if you can create a Gantt chart to describe the process of building an average house.
Divide the project up into tasks and try to visualise the dependencies between them as well
as what resources will be needed and when. What tasks would lie on the critical path?
• Try to write a mission statement for your school and then compare what you have written to
the actual one.
• Select a number of devices and list as many functional and non-functional requirements as
you can for each.
• Imagine you are making a film using the agile method. What would this process look like and
what difficulties could you foresee? How would this compare to using the spiral methodology?

Key Skills Checklist

At the conclusion of this chapter, you should be able to address the following key skills. Mark each off
as you can achieve them.

List and describe ways in which data can be collected and the benefits of each

Explain the difference between organisational goals and objectives

Explain the difference between organisational goals and objectives and information system goals and objectives

Create a project management plan for a software solution

Determine the critical path in a project management plan

Describe a number of functional and non-functional requirements for a proposed software solution

Discuss the agile, waterfall and spiral development models and their differences

For a proposed software solution, explain the constraints and the scope


Sample Examination Questions

The following sample examination questions can be attempted to test your knowledge of the content
of this chapter.

Question 1
Which of the following is not a valid organisational goal?
A. Be profitable
B. Ensure 99% network uptime
C. Manufacture quality products
D. Provide excellent customer service

Question 2
Which of the following is not a valid organisational objective?
A. Reduce customer complaints by 60%
B. Maintain an employee retention rate of 98%
C. Provide care and support for the staff within the organisation
D. Make a profit of $10K per calendar month

Question 3
An agile process is one that:
A. Begins at the start and continues through to the end
B. Can begin and end anywhere in the cycle
C. Is very cost effective
D. Can revisit previous stages to refine the solution further

Question 4
A milestone in a Gantt chart is used to show:
A. How many people will be working on a task
B. The resources needed for a task
C. How tasks flow from one to another
D. A point in time when a certain stage must be reached

Question 5
A mission statement is:
A. A lengthy document outlining all business practices
B. Only for public relations purposes
C. A blurb for the website stating name and location and ‘about us’
D. A statement of the ultimate goals of the organisation

The following Gantt chart is required for Questions 6, 7 and 8.


Question 6
The critical path in the Gantt chart shown above has a duration of:
A. 8 days
B. 9 days
C. 10 days
D. 7 days

Question 7
If Task G is delayed by 1 whole day, the overall effect on the project will be:
A. There will be no change in the completion date
B. The project will be delayed by a whole day
C. The project will be delayed by two whole days
D. The project will finish one day earlier

Question 8
Task H must wait for Task E to be completed before it can begin. In this context, Task E is known
as a:
A. Critical task
B. Predecessor
C. Dependent task
D. Resource

Question 9
Paul employs Maxine to create an App for his business. The first thing that Maxine sets about doing is
creating a project plan. When she presents this plan to Paul, the first thing that is obvious to him is
that she has left a gap between the end of one task and the start of the next one. Explain why she has
done this and the effect it may have on the finish date of the project.

__________________________________________________________________________________

__________________________________________________________________________________

__________________________________________________________________________________
2 marks

Question 10
Julie shows some of her proposed solution screen designs to Geoff and asks for his feedback. In doing
so, she mentions to him that he needs to ensure that his feedback is accurate as she does not believe
in agile development.

a. What is agile development?

___________________________________________________________________________

___________________________________________________________________________
1 mark


b. Discuss the differences between agile and waterfall development.

___________________________________________________________________________

___________________________________________________________________________

___________________________________________________________________________

___________________________________________________________________________
4 marks

Question 11
What is a mission statement?

__________________________________________________________________________________

__________________________________________________________________________________
1 mark

Question 12
Geoff is in charge of a large software implementation in his workplace, and has constructed a detailed
Gantt chart to manage the process. Two weeks into the implementation, Geoff’s colleague Regina has
noticed that he has not updated the chart. When she questions him about this, he states that he has
no intention of modifying the chart at all. List three problems that could arise as a result of this.

Problem 1: ________________________________________________________________________

__________________________________________________________________________________

Problem 2: ________________________________________________________________________

__________________________________________________________________________________

Problem 3: ________________________________________________________________________

__________________________________________________________________________________
3 marks

Question 13
WeCare is a support service for those that need urgent maintenance work to be done in and around
their home or residence, but cannot afford to pay for it. Many of the clients are either elderly or out
of work. Write two goals that would be suitable for the WeCare information system.

Goal 1: ___________________________________________________________________________

Goal 2: ___________________________________________________________________________
2 marks


Sample Examination Answers

Question 1
Answer: B

Organisational goals are, by definition, not quantitative in nature.

Question 2
Answer: C

Objectives always have measurables associated with them.

Question 3
Answer: D

Question 4
Answer: D

Question 5
Answer: D

Question 6
Answer: B

Longest path from beginning to end.

Question 7
Answer: B

Task G is on the critical path, therefore a delay of 1 day will result in a delay of 1 day for the entire
project.
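
The reasoning in this answer can be checked with a short calculation. The sketch below uses a hypothetical set of tasks rather than the chart from the questions, and assumes each task is described only by a duration in days and a list of predecessors. It compares the planned duration with the duration after a one-day delay to a task on the critical path and after a one-day delay to a task that has slack.

```python
def project_duration(tasks):
    """Days needed to finish every task, given durations and predecessors."""
    finish = {}

    def earliest_finish(name):
        if name not in finish:
            duration, predecessors = tasks[name]
            # A task starts once all of its predecessors have finished.
            start = max((earliest_finish(p) for p in predecessors), default=0)
            finish[name] = start + duration
        return finish[name]

    return max(earliest_finish(name) for name in tasks)


def delayed(tasks, name, extra_days):
    """Return a copy of tasks in which one task takes extra_days longer."""
    changed = dict(tasks)
    duration, predecessors = tasks[name]
    changed[name] = (duration + extra_days, predecessors)
    return changed


# Hypothetical tasks: name -> (duration_in_days, [predecessors]).
TASKS = {
    "P": (3, []),
    "Q": (4, ["P"]),        # on the critical path P -> Q -> S
    "R": (2, ["P"]),        # finishes 2 days before S can start, so it has slack
    "S": (2, ["Q", "R"]),
}

print(project_duration(TASKS))                    # planned duration
print(project_duration(delayed(TASKS, "Q", 1)))   # delay on the critical path
print(project_duration(delayed(TASKS, "R", 1)))   # delay absorbed by slack
```

With these made-up figures the plan takes 9 days. Delaying Q by a day pushes the finish to 10 days, while delaying R by a day leaves it at 9 because R has slack to absorb the delay.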

Question 8
Answer: B

Question 9
The reason Maxine has left a gap between the end of one task and the start of another is to allow for
any unforeseen events or delays in the project. This slack reduces the chance that a delay during one
stage of the project will also delay the finish date of the project.

Question 10
a. Agile development is the process by which stages in the PSM are cycled – beginning and
ending with consultation and feedback with the client.
b. The opposite is Waterfall development – in which the stages are done one after another
with no repeating or cycling back. While waterfall development is quicker than agile
development, it also does not respond to feedback (or give the client as many opportunities
to provide this). Agile development is more thorough, but takes longer (and therefore costs
more). Clients tend to be much happier with agile development as they are kept ‘in the loop’
much more.


The question stem ‘discuss’ means that you should explain the pros and cons of each of the things that
have been mentioned. Another clue to the response that the examiners are looking for is the number
of marks. As this is a 4 mark question, it is important to ensure that you list 4 distinct points.

Question 11
A mission statement is a statement about the goals of the organisation and why it exists.

Remember that a mission statement is often very broad and while it is important, it does not go into
specific detail about objectives.

Question 12
a. Milestones may not be met.
b. If the project is not running exactly to time, it may mean that resources that are booked for
key parts of the implementation will not be available (given that they will be required later
or earlier).
c. Parts of the implementation may begin (at their allotted time) before their prerequisite
tasks have been finished. This may lead to confusion and wasted time.

Question 13
Goal 1: Record the jobs that clients register with WeCare.
Goal 2: Keep track of the progress of jobs.

Other goals could be equally valid involving communicating with trades-people or maintaining the
register of clients.

Chapter 5
Analysis tools

This chapter covers Unit 3: Area of Study 2 key knowledge:

• Data and information
o KK2.6 Features and purposes of software requirement specifications
o KK2.7 Tools and techniques for depicting the interfaces between solutions, users and
networks, including use case diagrams created using UML
o KK2.8 Features of context diagrams and data flow diagrams

Key terms: Unified Modelling Language (UML), use case diagram, actor, use case, system boundary,
association / communication, includes, extends, Data Flow Diagram (DFD), entity, data flow, data store,
process, DFD level, context diagram, Software Requirements Specification (SRS).

Study roadmap (table): Chapters 1 to 10 mapped against Unit 3 Area of study 1 (Programming), Unit 3 Area of study 2 (Analysis and Design), Unit 4 Area of study 1 (Development and Evaluation) and Unit 4 Area of study 2 (Cybersecurity: software security).

Analysing an information problem

There are a large number of tools that can be used to analyse an information problem in an organisation. The use of these tools makes it easier to identify the interfaces between solutions, the users of the system and the network within which the solution needs to operate. The first of these tools is called a use case diagram (UCD) and these can be graphically represented using Unified Modelling Language (UML).

Unified Modelling Language (UML)

Unified Modelling Language (UML) is a standardised modelling language that encompasses a large variety of diagram types that can be used to represent many aspects of software development. Originally conceived in the mid 90s, UML had a major revision in 2005 and the current standard is ‘UML 2’. The important aspect of UML is that each diagram type is represented and utilised in a strictly defined way, allowing references to be easily created and diagrams easily compared.

Figure 43: A selection of diagram types from within UML

Use Case Diagrams (UCDs)

A use case diagram is essentially a structured ‘story’ depicting the functional aspects of a system within an organisation. It is used to depict the goals of the system and how people and organisations interact with the system to achieve these goals. It can be helpful to imagine a use case diagram as depicting workflow, but it is much more than this. It can be used at a number of stages in the PSM; for example, it can be used as a training aid once a software solution has already been created. In the context of this chapter, we will discuss how use case diagrams are used in analysis.

A use case diagram has a number of elements. Let’s look at what these are.

Use case elements

Actor

An actor is the term used to denote a role that is external to the system. Actors are usually stated as a noun. The actor is often a person or organisation. Stick figures are often used to denote actors, but in the case of organisations this can be misleading, so boxes can also be used. Actors connect to use cases.

Figure 44: An actor

Use Case

A use case is a representation of a function within the system, generally beginning with a verb. A use case is shown as an ellipse with the function written inside it. As use case diagrams are often the first tool that is used in coming to an understanding of how a system works, each use case is often quite broad.

Figure 45: Use case

System Boundary

A system boundary is a rectangle that is drawn around all of the use case symbols and represents the confines of the system. The actors are outside of the system boundary. Although there will typically be only one system boundary, it is possible to have sub-systems represented by additional system boundaries inside of each other.

Figure 46: System boundary

Associations / Communications

Associations are lines showing the links between a use case and one or more actors. A use case can be carried out by many actors and an actor may carry out many use cases.

Chapter 5: Analysis tools

Figure 47: Association

Includes

Dotted lines, with arrowheads and the text “<<includes>>” showing the links between use cases. Usually it indicates that the functionality of a use case is used in another use case.

Figure 48: Includes

Extends

Dotted lines, with arrowheads and the text “<<extends>>” showing that the functionality of a use case contributes to (or enhances) the functionality of another use case. This can be conditional on an actor being a member of a particular group. For example, “{is administrator}”.

Figure 49: Extends

Use case diagram examples

Student attendance system

The first step in the construction of a use case diagram is determining how many actors are interacting with the system. For the student attendance system being analysed in this example, there are four actors: administrators, teachers, students and parents. Figure 50 shows the completed UCD.

The next stage is to draw the system boundary. In this example, the system boundary is the student attendance system.

Next we draw the use cases themselves within the system boundary. These are the broad functional aspects of the system. Connections are drawn between the use cases and those actors that either supply data or retrieve information in some way. The description of each use case should explain what sort of transfer of data or information is occurring.

Notice that in some cases, the connections can be drawn as arrows instead of as plain lines (although the example shown doesn’t have any connections like this). Arrows indicate when an actor has initiated the use case and are an optional element. Not all use cases can feature a connection such as this, but arrows do convey extra information that is quite useful when interpreting the diagram.

A use case diagram provides a simple and easy to read representation of the functional aspects of a system. It is by definition quite broad. Use case diagrams are not intended to be the only tool used in analysing a system and the danger is that if they are used in this way, they can become bloated and difficult to interpret. They are best used as a starting point before using more detailed analysis tools like data flow diagrams.

One of the main criticisms of use case diagrams is that they can only be used to represent certain aspects of the way a system operates (or is designed to operate). They do not represent algorithmic processes well at all and instead provide an overview. In addition to this, there is no standard that is fully agreed upon as the use case diagram standard. For this reason, it is important to be consistent in the diagrams that you are creating.


Figure 50: Use Case Diagram examples
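The elements of a use case diagram can also be written down as a small data model, which can help when checking that every actor and use case has been connected. The sketch below is only one possible representation. The four actors come from the student attendance example, but the class names and the use case names (‘Mark roll’, ‘Report absence’ and so on) are assumptions made for illustration and are not read from Figure 50.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class UseCase:
    """A function of the system, usually named starting with a verb."""
    name: str
    includes: list[UseCase] = field(default_factory=list)  # <<includes>> links
    extends: list[UseCase] = field(default_factory=list)   # <<extends>> links


@dataclass
class UseCaseDiagram:
    system_boundary: str  # name of the system drawn as the rectangle
    actors: set[str] = field(default_factory=set)
    use_cases: dict[str, UseCase] = field(default_factory=dict)
    associations: set[tuple[str, str]] = field(default_factory=set)  # (actor, use case)

    def add_use_case(self, name: str) -> UseCase:
        """Return the named use case, creating it inside the boundary if needed."""
        return self.use_cases.setdefault(name, UseCase(name))

    def associate(self, actor: str, use_case: str) -> None:
        """Draw a line between an actor (outside the boundary) and a use case."""
        self.actors.add(actor)
        self.add_use_case(use_case)
        self.associations.add((actor, use_case))

    def use_cases_for(self, actor: str) -> list[str]:
        return sorted(uc for a, uc in self.associations if a == actor)

    def actors_for(self, use_case: str) -> list[str]:
        return sorted(a for a, uc in self.associations if uc == use_case)


# Hypothetical use cases for the student attendance system.
ucd = UseCaseDiagram("Student attendance system")
ucd.associate("Teacher", "Mark roll")
ucd.associate("Administrator", "Mark roll")
ucd.associate("Parent", "Report absence")
ucd.associate("Parent", "View attendance record")
ucd.associate("Student", "View attendance record")

# 'Mark roll' makes use of the functionality of 'Record absence'.
ucd.add_use_case("Mark roll").includes.append(ucd.add_use_case("Record absence"))
```

Because associations are stored as (actor, use case) pairs, the model allows a use case to be carried out by many actors and an actor to carry out many use cases, as described above.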


Martial arts training centre

The Martial Arts Training Centre example, also shown in Figure 50, consists of four actors: students, instructors, the secretary and the accountant. Students contact the centre to join and this includes being added to a training session. The accountant generates invoices and sends them to students who arrange for them to be paid. The secretary creates rolls of all of the students in each training session, and these are given to the instructors. Students often change their session times or notify the centre when they will not be attending. This is either done in person or by phone.

Data Flow Diagrams (DFDs)

Data flow diagrams (DFDs) provide a method by which the movement of data through a system can be visually represented. In particular, a DFD makes it easy to track the progress of data as it is input into a system, the ways in which it is processed, what information is produced, how it is stored and how the information is ultimately used.

The first step in learning and understanding how to create DFDs involves becoming familiar with the symbols that are used and the rules that govern their use. The following section describes the symbols that are used in DFDs and also describes what they contain and the way they should link to other symbols.

DFD Symbols

Entity

An entity is a person, agent or company outside the organisation being examined, that provides some data to the system or receives some information from it and is normally labelled with a noun. It is important to understand that an entity does not process information in any way. Note also that an entity symbol should only be used for those people or organisations that are outside of the organisation. In other words, anyone working for the organisation should not be represented as an entity.

Figure 51: Entity

Data flow

A data flow is represented by an arrow from one symbol to another and generally begins with a verb. There are some invalid combinations of symbols which can link the two ends of the data flow, and these are discussed in the next section. The data that is being transferred should be written on top of (or next to) the data flow. This data could be specific (e.g. ID number, surname) or it could be general (e.g. order details).

Figure 52: Data flow

Data store

A data store symbol represents a storage location within the organisation. It is labelled with the name of the file or the location. A data store only holds data and does not process that data. It is also possible to represent manual data stores such as filing systems or timesheets in this way.

Figure 53: Data store


Process

A process symbol is used to describe any activity that transforms data and generally begins with a verb. The activity could be performed by someone within the organisation, or it could be an automated one. Processes are numbered in the order in which they occur, and contain a description of the task in question.

Figure 54: Process

DFD Rules

There are a number of rules that need to be followed in constructing a DFD. Some of them are fairly logical while others have been formed to ensure that diagrams are neat and remain easy to interpret.

General format: left to right, top to bottom

As a general convention, creators of DFDs try to have the DFD flow from the top left corner of the page to the bottom right (in a clockwise direction). This is not always easy to accomplish, but aiming to have a DFD constructed in this way aids readability.

Data cannot flow between these symbols: Entity – Entity

An interaction between two entities is outside of the organisation and is therefore not of interest. The only time that it is going to be of interest to the organisation, is if the organisation is facilitating this transfer of data in some way. If this is the case, a process needs to be included between the two entities to describe what is occurring.

Figure 55: Entity-entity data flow

Data cannot flow between these symbols: Data Store – Data Store

If a data store is connected to another data store like this, there is no description of the process that is occurring. For this sort of transfer of data to exist, a process needs to be placed in between the two data stores to explain what is happening.

Figure 56: Data Store – Data Store data flow

Data cannot flow between these symbols: Data Store – Entity or Entity – Data Store

For the same reason that data cannot be passed from data store to data store, data cannot be passed directly between an entity and a data store. For this to be valid there needs to be a description of the process that is being carried out. This is especially true considering that an entity by definition exists outside of the organisation – and it would be highly unusual for them to be writing data or retrieving data directly from within the organisation.
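The three ‘data cannot flow’ rules can be read together as a single test: every data flow needs a process on at least one end. The short sketch below expresses that reading in code. The names `Symbol` and `is_valid_flow` are hypothetical, and treating a process-to-process flow as valid is an assumption, since the rules above do not prohibit it.

```python
from enum import Enum


class Symbol(Enum):
    ENTITY = "entity"
    PROCESS = "process"
    DATA_STORE = "data store"


def is_valid_flow(source: Symbol, target: Symbol) -> bool:
    """A data flow is allowed only if at least one of its ends is a process."""
    return Symbol.PROCESS in (source, target)


# The combinations ruled out in this section all fail the test.
print(is_valid_flow(Symbol.ENTITY, Symbol.ENTITY))          # entity to entity
print(is_valid_flow(Symbol.DATA_STORE, Symbol.DATA_STORE))  # data store to data store
print(is_valid_flow(Symbol.ENTITY, Symbol.DATA_STORE))      # entity to data store
# Placing a process in between makes each half of the transfer valid.
print(is_valid_flow(Symbol.ENTITY, Symbol.PROCESS))
print(is_valid_flow(Symbol.PROCESS, Symbol.DATA_STORE))
```

The first three calls print False and the last two print True, which mirrors the advice to place a process between the two symbols whenever one of these transfers needs to be shown.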


Figure 57: Data Store – Entity data flows

Avoid ‘black holes’

A black hole is a term that is used to describe the situation that occurs when data flows into a process but nothing is produced. If the process is doing something to the data and there is no data flow exiting the process, then why is the process necessary at all?

Figure 58: Avoid ‘black holes’

Avoid ‘magic processes’ or processes that do not transform data

Be very careful that the data flows that are going into a process are enough to produce the required information. The process should not be ‘creating’ information out of nothing.

Another situation to avoid is one in which a data flow enters and leaves a process unchanged. In such a case, there is no indication of what processing has been done (if any).

Figure 59: Avoid ‘magic processes’

Data that is rejected or discarded is represented by a data flow to nowhere

Data that is rejected or discarded from a process can be an important part of what is taking place, even though the data may not be used for anything else. Often data that does not pass validation will fall into this category, as will data that is redundant or not needed.

Figure 60: Rejecting data

Ideal DFD size is no more than eight processes

This is more of an opinion than a set convention, but it is generally understood that the size of a DFD is best kept to a manageable level of detail. If it is not possible to break a problem down into eight processes or fewer, then the way that the problem is being divided up needs to be re-examined. There is a mechanism by which DFDs can be further broken down and this will be discussed after the detailed DFD example that follows.
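Some of these rules can be checked mechanically once a DFD is written down as a list of data flows. The sketch below is a simplified illustration under stated assumptions: each flow is a (source, data label, target) tuple, a target of None stands for a data flow to nowhere, and the function name `check_processes` and the example process names are hypothetical. It can only detect a ‘magic process’ that has no inputs at all; judging whether the inputs are enough to produce the outputs still requires a person.

```python
from typing import Optional

# (source, data label, target); a target of None means the data is discarded
Flow = tuple[str, str, Optional[str]]


def check_processes(flows: list[Flow], processes: set[str],
                    max_processes: int = 8) -> list[tuple[str, str]]:
    """Return (process, problem) pairs for breaches of the rules above."""
    problems = []
    if len(processes) > max_processes:
        problems.append(("(diagram)", "too many processes"))
    for name in sorted(processes):
        inputs = {label for source, label, target in flows if target == name}
        outputs = {label for source, label, target in flows if source == name}
        if inputs and not outputs:
            problems.append((name, "black hole"))       # data in, nothing out
        elif outputs and not inputs:
            problems.append((name, "magic process"))    # data out of nothing
        elif inputs & outputs:
            problems.append((name, "unchanged data"))   # same data in and out
    return problems


example_flows = [
    ("Customer", "order details", "Enter order"),
    ("Enter order", "order details", "Saved orders"),  # leaves unchanged
    ("Customer", "payment", "Bank takings"),           # nothing comes out
    ("Make report", "sales report", "Manager"),        # nothing goes in
]
example_processes = {"Enter order", "Bank takings", "Make report"}
print(check_processes(example_flows, example_processes))
```

A process that discards its data is not reported as a black hole here, because the flow to nowhere still counts as an output of the process.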


Process descriptions should be meaningful

When constructing a DFD, it is important to ensure that the descriptions that are placed inside processes are meaningful. It can be a good practice to try to include a verb, as the nature of a process is that some task is being carried out. For example, using words like: ‘calculate’, ‘validate’, ‘add’, ‘remove’, ‘edit’, ‘archive’, ‘process’ and ‘compile’ is highly desirable.

A worked example: River’s Café

Let’s have a closer look at how DFDs work by constructing one step by step. Consider the description of the process of placing an order at a café.

The current system of ordering at River’s Café works as follows. When a patron arrives at the café, they place their order with one of the waiting staff who writes it down on a note pad. This order is entered into the computer system by the member of staff currently on the register. The order is printed out and taken to the kitchen where it is placed in the order queue. The kitchen prepares orders based on the orders that are in the order queue and discards them when done. The customer is presented with a bill which they pay before they leave. At the end of the day, all of the orders are tallied and stock ordered from the supplier as needed.

The first step in constructing a DFD should be working out what the separate processes are. Read through the case study and try to pick out the main processes that are taking place.

Processes – River’s Café

1. Customer places order
2. Order entered into system
3. Kitchen processes order
4. Bill produced
5. Order is placed with supplier

These are the main processes that are taking place. It is possible to pick out more processes than this or to phrase these in a different way. The important task is to represent the data flows within the system. As these are dependent on the number and types of processes that you identify, there is more than one way to do this correctly. We will firstly draw our processes in a roughly clockwise fashion on the page, as shown in Figure 61.

Figure 61: River’s Café DFD example

There are three data stores and two entities that are obvious to us on first reading, so let’s add them into our diagram. Our new DFD is shown in Figure 62 below.

Figure 62: River’s Café DFD example continued


Let’s now start working through each of the processes and turning them into data flows. Note that when drawing a DFD in this way, it is often necessary to move objects around to make space for data flows or to position them in more convenient locations. You will notice this occurring in this example. Rather than reformatting the diagram so that everything is in perfect position from the start, I have left the original diagrams the way they were laid out from the start. This way, the example also demonstrates how diagrams can evolve in this fashion.

1. Customer places order

When a customer places an order, they give their choices to one of the waiting staff who writes it down on an order pad. Any record of data or information constitutes a data store. In this case, the data store is a physical note pad. The member of the waiting staff would also take note of the table number while taking the order.

Figure 63: River’s Café DFD example continued

Note that there are many more exchanges of information taking place in this process. The customer would probably ask what specials are on offer for the day and the member of the waiting staff may or may not need to check this with the kitchen. It may also be the case that some choices are not available or the customer has specific questions about how dishes are prepared. At this ‘level’ of DFD, we are deliberately keeping the number of processes to a minimum and the data flows simple. More complex DFDs can be constructed using the process known as ‘levelling’ which is described at the end of this example.

It is important at this stage to only represent what is documented as taking place. It is very easy to guess that some data flows would occur and fill them in, but if this is done then it is hard to analyse the current processes for deficiencies.

2. Order entered into system

When the order is entered into the system, the data is processed in a number of ways. The fact that the data is entered implies some validation takes place although how much the data is validated and in what ways is not known (and could well be a problem with the current system). The order itself is printed out (in a specific format) so that it can be physically placed in the kitchen order queue. As well as this, the details of the order are saved so that later on the amount of food/produce that needs to be ordered can be calculated.

Figure 64: River’s Café DFD example continued

Note that the presence of the ‘saved orders’ data store was not obvious at the beginning, but has presented itself as being required by this process and subsequent ones.

3. Kitchen processes order

When the kitchen is ready to process an order, the next order is retrieved from the order queue. Once completed, the order is discarded.

These data flows are shown in Figure 65 below.

Figure 65: River’s Café DFD example continued

4. Bill produced

When the customer is ready to receive the bill, they notify the waiting staff who retrieve the order details from the computer and calculate what is owing. This is then formatted into a bill which is given to the customer.

Note that the order in which the data flows occur is not represented in a DFD. We know that the customer requests the bill and that the bill is produced and then given to the customer, but this order is not obvious by looking at the data flows themselves.

Figure 66: River’s Café DFD example continued

5. Order is placed with supplier

The orders for the day are retrieved and analysed to determine what needs to be ordered. This order is then sent to the supplier (or suppliers).

Now that the DFD is complete, you will be able to get a better picture of what is going on at River’s Café. In fact, as you worked through this example one step at a time, some questions may have occurred to you. For example, if the customer changed their mind after they had placed their order, what implications would this have for the order in the kitchen order queue as well as their bill? If the customer doesn’t request the bill at all, does this mean that the bill is not generated?

This is one of the advantages of creating a DFD to analyse the data flows within an organisation. Not only can the finished product give you a picture of how things work, but creating the DFD causes you to think about how processes are achieved in detail.

Figure 67: Completed River’s Café DFD example

Levels of DFDs

The DFDs that we have been examining are also called Level 1 DFDs. They form the starting point from which processes can be selected and broken down into more detail. By design, a Level 1 DFD is fairly broad. Once you have created this first DFD, you can select any process and create a whole new DFD that describes how that process is accomplished. This process (known as levelling) could be done almost indefinitely. Each subsequent DFD is named with a new level, so the first time a process is used as the basis for a whole new DFD, it is called a ‘Level 2 DFD’.

For example, if process number ‘3’ was chosen to expand into another level, that level would be called a Level 2 DFD. Likewise a process in the Level 2 DFD could be chosen to expand to another level, and so on.

By convention, the processes in a Level 1 DFD are numbered starting from ‘1’. If the first process was chosen to expand, the processes would be numbered ‘1.1’, ‘1.2’, ‘1.3’ and so on, as shown in Figure 68 below.

Figure 68: River’s Café DFD example continued
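The River’s Café example and the numbering convention used for levelling can both be sketched in a few lines of code. The flows below are a reading of the worked example rather than a transcription of Figure 67: the five processes and two entities come from the text, while the data flow labels, the data store names and the helper names `data_stores` and `sub_process_numbers` are assumptions made for illustration.

```python
# Level 1 processes, numbered in the order in which they occur.
PROCESSES = {
    "1": "Customer places order",
    "2": "Order entered into system",
    "3": "Kitchen processes order",
    "4": "Bill produced",
    "5": "Order is placed with supplier",
}
ENTITIES = {"Customer", "Supplier"}

# (source, data label, target); a target of None is a data flow to nowhere.
FLOWS = [
    ("Customer", "order", "1"),
    ("1", "written order", "Order pad"),
    ("Order pad", "written order", "2"),
    ("2", "printed order", "Kitchen order queue"),
    ("2", "order details", "Saved orders"),
    ("Kitchen order queue", "next order", "3"),
    ("3", "completed order", None),          # the order is discarded when done
    ("Customer", "bill request", "4"),
    ("Saved orders", "order details", "4"),
    ("4", "bill", "Customer"),
    ("Saved orders", "day's orders", "5"),
    ("5", "stock order", "Supplier"),
]


def data_stores(flows):
    """Anything in the flows that is neither a process nor an entity is a data store."""
    nodes = {end for source, _, target in flows for end in (source, target)
             if end is not None}
    return nodes - ENTITIES - set(PROCESSES)


def sub_process_numbers(parent: str, count: int) -> list[str]:
    """Number the processes of the DFD that expands process `parent`."""
    return [f"{parent}.{i}" for i in range(1, count + 1)]


print(sorted(data_stores(FLOWS)))
print(sub_process_numbers("1", 3))
```

Written this way, every flow has a process at one end, and the waiting staff do not appear at all, because only the processes they carry out are recorded.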


Context diagrams

A context diagram is a special type of DFD. It is in many ways a simpler version of the level 1 DFD in that it does not show any of the internal processes inside the organisation. A context diagram focuses on the interaction between the organisation’s information system and the external entities that supply data or receive information.

For example, the previous DFD example which looked at processes within River’s Café would have a context diagram that looked like the one in Figure 69 below.

Figure 69: River’s Café context diagram

Note that this context diagram does not contain any reference to the internal processes within River’s Café. In the same way that a process in a DFD can be expanded upon, taking the central part of a context diagram and expanding it gives a level 1 DFD. Because of this relationship to a DFD, context diagrams are often referred to as ‘Level 0 DFDs’.

You will also have noticed that the waiting staff are not included in the DFD or in the context diagram. This is because DFDs focus on the processes within an organisation and not on individuals. The waiting staff are involved in carrying out many of the processes shown, but the main purpose of creating a diagram like this is in the understanding of what is taking place. It may be that staffing will change in an organisation from time to time, but the processes will still be carried out. A better diagram to represent the role that the waiting staff play in the organisation would be a use case diagram.

Software Requirements Specifications (SRS)

All of the features of a software solution that have been discussed so far – timelines, resources, functional and non-functional requirements, constraints and scope – are documented in the form of a Software Requirements Specification (SRS).

An SRS is a complete description of what the software solution will need to do. It has the purpose of bringing together all of the data that has been gathered and all of the analysis that has been performed, so that software developers can refer to this information in one document. The content of an SRS can vary depending on the scope of the project and what is felt to be important in the particular situation. Often tools such as Use Case Diagrams and Data Flow Diagrams / Context Diagrams are included as they can show diagrammatically how users will interact with the system and the ways in which data will be transformed into information.

The key elements of an SRS are commonly:

1. An introduction.

The introduction outlines the purpose of the software solution, its constraints and its scope.

2. A description of the proposed software solution.

This includes what functions it will perform, what characteristics the user interface will have and any dependencies.

3. The specific requirements of the software solution.

This section can be the most detailed, often with separate sub-sections detailing the user interface as well as the functional and non-functional requirements.

4. A description of the environment within which the solution will operate.

SRS documents vary in size and content. Complex software development projects will require a significant investment in time to ensure that the SRS is completed and contains all of the necessary elements.

The goal of an SRS is to bring together all of the proposed elements of a software development into one document. This document is then able to be handed to the client, discussed, amended and then ultimately signed off, so that design and development can begin. The investment in time in producing an SRS ensures that the software developer and the client are on the same page and have a common understanding as to the scope of the project. It provides a software developer with a concrete roadmap that has been agreed upon by all parties.

Software requirements specification (SRS): The intended purpose and environment of a software solution. It documents the key activities associated with the analysis stage of the problem-solving methodology. Features of an SRS should include a description of the functional and non-functional requirements, system and technical requirements, constraints, scope and assumptions.

VCAA VCE IT Study Design 2020-2023 Glossary


Context Questions

1. What is UML?
2. What is a use case diagram?
3. Give two examples of possible actors in a use case diagram?
4. In a use case diagram, what does a system boundary represent?
5. In what ways is a DFD different from a use case diagram?
6. Give examples of three physical objects that could be data stores in a DFD?
7. Why is it convention not to represent data flows between entities in a DFD?
8. Why is it a problem when the data flows into a process are the same as the data flows out of
a process?
9. What is the process known as DFD levelling?
10. What are the main advantages of a context diagram over a DFD?
11. What is a Software Requirements Specification used for?

Applying the Concepts

● Draw a use case diagram to describe the process of approving an absence (once a student has
returned to school after being sick).

Key Skills Checklist

At the conclusion of this chapter, you should be able to address the following key skills. Mark each off
as you can achieve them.

Draw a use case diagram to represent the interactions in an information system

Read through a documented process and identify the entities, processes and data stores

Draw a Level 1 DFD to represent a documented process

Draw a context diagram from a supplied Level 1 DFD

Gather the required components to create a Software Requirements Specification


Sample Examination Questions

The following sample examination questions can be attempted to test your knowledge of the content
of this chapter.

Question 1
The term UML stands for:
A. Unified Modelling Link
B. Unified Modem Linkage
C. Unlinked Mode Linkage
D. Unified Modelling Language

Question 2
The term UCD stands for:
A. Use Case Diagram
B. Used Cases Diagrammatic
C. Use Case Drawing
D. Unlinked Carrier Diagram

Question 3
In a DFD, an entity represents:
A. A person inside the organisation that deals with data
B. A system within the organisation that processes data
C. A system outside the organisation that processes data
D. A person or company outside the organisation that receives or provides data

Question 4
Which of the following is not an element in a Use Case Diagram?
A. Actor
B. System Boundary
C. Includes
D. Data Flow

Question 5
An element of a DFD which transforms the data coming into it is known as a:
A. Data flow
B. Data store
C. Process
D. Entity

Question 6
Which of the following should not be used as an actor in a Use Case diagram?
A. Maxwell Smith
B. System Administrator
C. Accounting Department
D. Time sheet software


Question 7
Each of the items described below has been taken from a DFD. State whether the description
involves a process or a data store.

Item Description | Process or Data Store

Combine this year’s financial figures with those of previous years. |
Retrieve the number of items from the inventory database. |
Archive the month’s data for future reference. |
Discard the daily calculations. |
Calculate the profit margin for each stock item. |

5 marks

Question 8
Phillip is the manager of a collaborative work space called ‘My Space Is Your Space’ (MSIYS). He receives
many requests from clients in relation to catering. Phillip would like to set up the MSIYS Café to allow
clients to order via an App for table service or local Ooba Eats establishments. It would work in the
following way.

When placing an order, the user of the App would scan a QR code (which would be located on every
desk and meeting room within the office area), so that the café will know where to bring the order.
The App will have a list of café items that can be ordered (some of which may vary from day to day).
The App will also be able to display menu items from local cafes and restaurants, that the App will
then place orders with, via an Ooba Eats API (a code module that will be supplied by Ooba Eats that
will integrate with the App seamlessly). Phillip would like to create a context diagram to represent this
process. Label the entities and data flows in the diagram below:

[Context diagram to be labelled: MSIYS at the centre, with two entities (1 and 2) and four data flows (1 to 4)]

6 marks

Question 9
SportsCastOz is a small Internet radio station that delivers a number of sports based shows either live
or via podcast. There are up to 20 presenters involved with the station at any one time. Paul is the
director of SportsCastOz and works full-time managing the station and occasionally presenting on the
shows. Emily, Traci and Steven work part-time as producers (editing and publishing podcasts as well
as managing the live streams). Jay works part time and handles advertising and marketing. Some of
this advertising is placed into the edits of the podcasts and some is placed on the website.


Paul would also like to employ someone to develop an App for the radio station. The App would allow
the user to listen to all of the shows, read articles published by people at the station as well as provide
reviews and feedback.

Paul has started a sketch of a Use Case Diagram to help visually represent what people are doing and
how their roles can overlap. Some information is missing. On the diagram below, label the five actors
and two use cases.

7 marks

Question 10
Nancy wants to create a DFD to represent the maintenance process at her job (WeCare). She tries to
understand how the job booking system works by drawing a data flow diagram based on the following
information.

WeCare has a help line and (for convenience) requires that all maintenance jobs be called in via this
line. One of the staff members takes the call and starts a new job (appended to the file of the person
making the call). If the person making the call is a new client, then they must be registered with
WeCare – which requires a detailed application to be completed. At present, the job file is a MS Word
document placed into a folder with the client’s name on it. A time is booked (that is suitable for the
client) for a tradesperson to visit and attend to the problem. The tradesperson then advises WeCare
of the success (or otherwise) of the job and whether a follow up is needed. Though the tradespeople
work on a volunteer basis, WeCare does cover the expenses that the tradespeople incur, so this is
taken care of at the end of the job.

On the diagram on the next page, label the two entities, two data stores and four processes.


7 marks


Sample Examination Answers

Question 1
Answer: D

Question 2
Answer: A

Question 3
Answer: D

Question 4
Answer: D

Question 5
Answer: C

Question 6
Answer: A

Question 7
Item Description | Process or Data Store
Combine this year’s financial figures with those of previous years. | Data store
Retrieve the number of items from the inventory database. | Data store
Archive the month’s data for future reference. | Data store
Discard the daily calculations. | Process
Calculate the profit margin for each stock item. | Process

Question 8

Entities:
1. Client
2. Ooba Eats

Data flows (to and from MSIYS):
1. Order & location
2. Confirmation of order, billing
3. Current menus, prices
4. Order details

Question 9
1. Director (actor)
2. Presenters (actor)
3. Producers (actor)
4. Advertising / Marketing person (actor)
5. Organise advertising (use case)
6. Listen to programs (use case)
7. Listener (actor)


Question 10

Chapter 6
The art of design

The chapter covers Unit 3: Area of Study 2 key knowledge:


• Digital systems
o KK2.1 Security considerations influencing the design of solutions, including authentication
and data protection
o KK2.9 techniques for generating design ideas
o KK2.10 Criteria for evaluating the alternative design ideas and the efficiency and
effectiveness of solutions
o KK2.12 Factors influencing the design of solutions, including affordance, interoperability,
marketability, security and usability
o KK2.13 Characteristics of user experiences, including efficient and effective user interfaces

Key terms: Design, computational thinking, brainstorming, DeBono hats, mind-mapping, attribute
listing, SCAMPER, efficiency measures, speed, functionality, cost of file manipulation, affordability,
data protection, authentication, interoperability, marketability, processing efficiency, effectiveness
measures, completeness, readability/clarity, accuracy, timeliness, relevance, attractiveness,
accessibility, communication, usability, needs of users, user interface, clear, concise, responsive,
familiar, consistent, scalable, forgiving, development time.

Study roadmap

[Table: chapters 1 to 10 set against Unit 3 (Area of study 1 – Programming; Area of study 2 – Analysis and Design) and Unit 4 (Area of study 1 – Development and Evaluation; Area of study 2 – Cybersecurity: software security)]

The art of design

Design is a major stage of the PSM. In fact, the amount of work that needs to be done in the analysis and design stages is significantly greater in terms of tasks to be completed as compared to the development and evaluation stages. Both analysis and design could be shortened simply by performing less actual analysis and spending less time designing the software solution, but doing this is ultimately detrimental to the quality of the solution.

The art of design can be represented by the term ‘computational thinking’. As the glossary definition explains, computational thinking encompasses the ability to think in a logical, creative fashion while having an understanding of the ways in which software solutions and systems perform.

There is a wide range of tools that can be used to assist in the design of a software solution. Not all of the tools described in this text need to be used at once. A software developer will make use of those tools that are the most appropriate for the information problem as well as those tools that they are most familiar with.

Computational thinking: the process of recognising aspects of computation in the world and being able to think logically, algorithmically, recursively and inferentially. It typically involves inferential thinking, defining problems through decomposition, documenting steps and decisions through algorithms, the use of programming languages and software, and evaluating the resulting solutions.

VCAA VCE Computing Study Design 2020-2023 Glossary

Generating design ideas

At the core of design is the process of generating design ideas. It is often a very undervalued step, but producers of successful software development projects will likely espouse the value that the design stage had in the development process.

Brainstorming

Brainstorming can be a collective or individual process in which ideas are written down without critical evaluation or criticism. Some commentators believe that group brainstorming sessions are not very effective due to factors such as people within the group blocking others’ ideas or dominating the discussion. For this reason, it is very important to lay a foundation where each person’s contribution is valued. It is also important that suggestions are noted when they occur and are not debated or assessed on the spot. All potential ideas should be noted as they occur, otherwise they can be lost.

For effective brainstorming, the following ground rules should be established.

• There should be a relaxed environment in which everyone is encouraged to participate.
• No ideas should be judged or critiqued. In fact, quirky ideas should be welcomed and built upon.
• Ideas should only be evaluated at the end of the session.
A nice way to structure and plan a
There are many ways that design ideas can be
brainstorming session is to use an existing
generated, but we will cover a few key ones.
technique such as the DeBono thinking hats.

DeBono ‘Six Thinking Hats’

‘Six Thinking Hats’ is a book written by Edward


DeBono that describes a brainstorming tool
that can be used very effectively by groups. It
assists in the planning and organising of the
thinking within a group by ordering the
brainstorming process into a number of
defined stages (each designated by a different
coloured hat). Each of the ‘hats’ should be
(metaphorically) worn by the participants as
they work through the process. Though the six
hats are defined a certain way, their order is a
suggested one and it may be that the person
facilitating the brainstorming session wants to
focus the group on particular aspects and so
Figure 70: Some methods for generating design will concentrate on those ‘hats’ alone. It could
ideas also be the case that while the session is being

conducted, discussion moves quickly to a conclusion rendering some of the ‘hats’ redundant.

Figure 71: The DeBono thinking hats

White hat (information)

A brainstorming session using the DeBono thinking hats will typically begin with an extended
period with just the ‘white hat’. The white hat is all about information. What information is
available? What are the facts?

Blue hat (managing)

The ‘blue hat’ is also sometimes used as the first hat as it is concerned with the management of
goals and objectives. Using the blue hat, a facilitator can define with the group the way that the
brainstorming session will run. Once the brainstorming is underway, the blue hat is used to define
and list the goals and objectives of the project or problem.

Red hat (emotions)

While wearing the ‘red hat’, participants can give instinctive or gut reactions to the problem or
the suggestions that have been made. A group facilitator will often try to limit the time spent in
this phase as it is often not a very productive one. It can be revisited at different times in the
brainstorming process. There does not need to be any justification for responses in this phase and
the facilitator needs to ensure that the group understands that this is the expectation.

Black hat (discernment)

While it may seem that the ‘black hat’ would represent negativity, it in fact indicates a time for
the group to consider reasons to be cautious.

Green hat (creativity)

When wearing the ‘green hat’, group members generate ideas, new concepts and possible solutions.
No evaluation or criticism of these ideas takes place during this phase.

Yellow hat (optimistic response)

The ideas that have been presented are now assessed for their relative benefits. Given that the
‘yellow hat’ is an optimistic response to the ideas that have been presented so far, just as with
the green hat, no negative evaluation or criticism should take place during this phase.

One of the key benefits of using a technique such as the DeBono thinking hats is that it helps
define the rules and structure of the brainstorming session. Often brainstorming sessions will
break down or be less effective as some participants will be offering suggestions (green or
yellow) while others are discussing the goals and information (blue or white) or critically
evaluating the ideas being presented (black).

Oblique Strategies

The set of cards known as ‘oblique strategies’ is intended to promote lateral thinking and break
groups out of thinking in set ways. Developed in 1975, there are 55 cards in the
set and they have been used successfully in very creative fields such as music and art.

Some examples are:

• ‘What to increase? What to reduce?’
• ‘State the problem in words as clearly as possible’
• ‘Use an old idea’

Mind mapping and graphic organisers

Mapping tools and graphic organisers allow those involved in the brainstorming process to visually
represent ideas and the relationships between them. Drawing a mind map is an excellent method of
visually representing what is known about a topic or a problem. The process of creating a mind map
helps those creating it to understand what they know about a topic, what questions still exist and
how to group different facets together. Many software tools exist, though it is often the case
that people will prefer to use pen and paper (or whiteboards) as it is easier to represent ideas
quickly.

Figure 72: A person working on a mind map

A mind map is a type of diagram that has at its centre the main idea or concept. Related concepts
are then connected to the central one by a line and the diagram grows – branching out in a
circular pattern.

Mind maps can be especially useful as a reference tool. They can also be used to communicate the
scope and constraints of a proposed solution to others.

Researching

Researching ideas or concepts that are already in existence can also be a very valid method of
generating design ideas. It can also form the basis for other brainstorming methods to build on
the inspiration that these ideas may provide.

Attribute listing

The technique known as ‘attribute listing’ is also one that works best in concert with other
brainstorming techniques. In essence, it involves listing the different possible attributes
related to aspects of a problem and then mixing and matching these in ways that may create an
innovative solution.

For example, a company may wish to design a new software solution that customers will access to
order goods. The first step in utilising this technique is to list the attributes that such a
software solution might have.

For example:
• Storage type
• Platform
• Programming language
• Authentication method.

For each of these attributes, the options that are available are listed below each other inside a
table.

The attribute table can then be used to ‘mix and match’ combinations of the attributes to see if
one of the combinations is a viable design alternative to consider.

SCAMPER

SCAMPER is an acronym for a structured brainstorming technique that encourages users to think
outside of the box. The user first selects a design from which the alternative
designs will be generated. The letters of SCAMPER guide the process from this point onwards.

It is important to be aware that this technique is primarily used for the improvement of an
existing design and needs a starting point from which to proceed. It could be that the starting
point for the SCAMPER has been developed using one of the other methods for generating design
ideas. The letters of SCAMPER stand for the following:

S stands for Substitute
Substitute components of the design, materials, people/characters, approach, angle, colour, shape,
meaning, idea or words.
Questions that can be used to stimulate discussion:
• What can you substitute?
• What/who can be used instead?
• What else can be used instead?
• Are there other materials?
• Are there other processes?
• Are there other shapes?
• Can another place be used?
• Is there another approach?

C stands for Combine
Mix or blend the design with other things or ideas.
Questions that can be used to stimulate discussion:
• What can you combine or bring together?
• How about a blend, a mixture, an assortment or an ensemble?
• Can you combine the purposes of two or more designs?

A is for Adapt
In what ways can you alter, change the function or use a part of another element?
Questions that can be used to stimulate discussion:
• What can you adapt for use as a solution?
• What else is like this?
• What other ideas does this suggest?
• Do past designs offer a parallel?
• What existing designs could I copy?
• Who could I emulate?

M is for Modify (Magnify or Minify)
Increase or decrease the design in scale, change shape, change attributes such as colour or give
it a new angle.
Questions that can be used to stimulate discussion:
• Can it be duplicated, multiplied, exaggerated?
• Can anything be omitted, made shorter, smaller, lightened?
• Can you change the meaning, colour, motion, sound, smell, form, shape?
• Consider magnifying aspects of the design. What can you add or exaggerate?
• Consider minifying aspects of the design. What could you make smaller or condensed?

P is for Put to another use
Put the design or aspects to another use or idea.
Questions that can be used to stimulate discussion:
• How can you put the design to different or other uses?
• Are there new ways to use the design as is?
• Are there other uses if it is modified?

E is for Eliminate
Eliminate or remove elements in order to simplify the design.
Questions that can be used to stimulate discussion:
• What can you eliminate?
• Rather than thinking of specific elements of the design, what other things could be removed?
• What about if time, effort or costs were reduced?

R is for Reverse (or Rearrange)
The design could be turned inside out or upside down, inversed, components could be swapped or
transposed.
Questions that can be used to stimulate discussion:
• What can be rearranged in some way?
• Are there components in the design that can be interchanged?
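
The ‘mix and match’ step of attribute listing, described earlier in this chapter, can also be
sketched in code. The following is a minimal Python illustration rather than a prescribed method:
the four attributes come from the ordering-system example, but the options listed under each
attribute and the `design_alternatives` function are assumptions made purely for illustration.

```python
from itertools import product

# Attributes from the ordering-system example; the options under each
# attribute are hypothetical and would normally come from the design team.
attribute_table = {
    "Storage type": ["Local file", "Cloud database"],
    "Platform": ["Web", "Mobile app"],
    "Programming language": ["Python", "Swift"],
    "Authentication method": ["Password", "Single sign-on"],
}


def design_alternatives(table):
    """Return every combination that takes one option from each attribute."""
    names = list(table)
    return [dict(zip(names, combo)) for combo in product(*table.values())]


alternatives = design_alternatives(attribute_table)
print(len(alternatives), "candidate designs")
print(alternatives[0])
```

Each dictionary is one candidate design. Most would be discarded quickly, but listing them all
makes it harder to overlook an unusual combination that might be worth considering.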

Figure 73: SCAMPER in action

Efficiency and effectiveness in the design phase

Having an understanding of the different efficiency and effectiveness measures that are used to
assess a software solution is important to the design phase. Efficiency and effectiveness measures
can be used when designing the user interface of the software as well as serving as the criteria
by which designs will be contrasted and compared. Let’s examine some of these measures in detail.

Efficiency measures for a software solution

Speed of processing

Is data able to be entered quickly? Is the output from the software solution generated in a prompt
fashion? The layout of the user interface and the way that the software flows from one screen to
another can also have a profound effect on speed.

Functionality

How well are the functions of the user interface designed and how well do they address the tasks
for which they have been designed?

Cost of file manipulation

Each time a file needs to be manipulated in some way, the efficiency of the overall solution will
be affected. Processing the data within a file, sorting, searching or moving the data from one
file or location to another will all have a time cost associated with it. The challenge is to
minimise this cost in any way possible.

Effectiveness measures for a software solution

Completeness

How complete is the information that is being produced for the purposes that it is intended? Does
the user of the software need to source additional information from other software solutions or
sources? Completeness is a subjective measure based on the task that is being performed and the
user’s needs – so a detailed analysis of this should have been performed to prevent a lack of
completeness occurring.

Readability or clarity

How easy is it to read and understand the output that is produced? If the information is not easy
to read, the effectiveness of the software solution is lessened even if all of the requested
information has actually been produced.

Attractiveness

How attractive is the user interface? This can have a flow-on effect in terms of the usability of
the software solution. Attention to the colour scheme, font size, consistency of object sizes and
access to the program’s functions can all have a positive effect on the user’s
experience and as a consequence, make the program more effective.

Accuracy

Is the information produced always correct? Does it often need to be adjusted or queried? Have
difficulties occurred in the past due to the inaccuracy of information? A well-designed user
interface can help ensure the accuracy of the inputted data by limiting the type and range of the
data. Validation techniques can detect when errors occur and assist users in making corrections.

Accessibility

To what extent are the functions of the user interface easy to find and use? If users of a
software solution are confused by the function of particular elements, it will slow them down and
make the software not as effective. Functions, icons and navigation should clearly indicate their
function without confusion. Help files and on-screen help assist with accessibility.

Software developers need to be careful to design their solution with the users of that solution in
mind. The skill level of the users as well as the hardware and environment that the solution will
be used on can all be factors in limiting the accessibility. If a software solution is designed
for the latest computer with a large amount of RAM and a high screen resolution, but will be used
by an organisation in which the typical hardware is three years old and the screen resolution is
smaller, it will make the software very difficult to use.

The measure of accessibility is also concerned with ensuring that the software solution caters for
those with disabilities such as vision impairment, by adding features that account for these
disabilities – such as allowing text to be resized.

Timeliness

Is the information produced within a time frame that is relevant and useful? That is, by the time
the information has been produced is it ever too late to make practical use of it? Data such as
stock prices and weather conditions are affected by the attribute of timeliness and are often used
by software solutions to make predictions or strategic decisions in a very time-dependent way.
This data not only needs to be able to be entered into the software quickly and easily, but the
processing of the software needs to produce the required information in time for it to be used.

Communication of message

How clearly is the software solution presenting the information that it is producing? Is this
information hard to interpret or is it easy to understand? The information may be accurate, timely
and relevant, but simply presented in a way that makes it difficult to understand or not well
suited to the role of the user within the organisation. For example, a manager may wish to extract
information from a software solution that shows them how profitable the company has been in the
last month. The software solution may have all of the required information, but the manager may
simply want to be presented with a graph or bottom line figure rather than reading through all of
the company’s transactions for the month.

Relevance

Does the information contain the necessary detail or elements? When a user tries to extract
information from the software, is the information that is presented to them relevant to their
request? Relevance can also be improved by providing user interface features that target the
user’s level of access and role. Validation techniques that filter out incorrect or inaccurate
data can help to improve the relevance of results that are produced.

It is certainly true that information is very valuable and organisations will rarely discard
information (preferring instead to archive it away from the central storage location). As more and
more information is produced, there could be a tendency for a software solution to present so much
information to the user, that
the information that they were seeking is ‘lost’. Archiving can certainly prevent this sort of
occurrence, but software can also be designed to filter irrelevant information out so the user
does not have too much presented to them at once.

When discussing the relevance of the information produced by a software solution, it can also be
useful to consider redundancy. For example, a software solution might contain a function that
displays all invoices that have not been paid. If this information is presented as a list on the
screen with the heading ‘Invoices not paid’ and next to every item is the text ‘not paid’, then the
user is being presented with a lot of information that they already know.

Usability

Is the user interface intuitive and easy to use? Well-designed software should have a user
interface that acts as a conduit to the required processing but does not become so much of a focus
that it detracts from the functioning of the software itself. Sometimes the use of too many
functions on the screen at once or the use of animations can detract from the usability of the
solution.

There is a definite art to the creation of a user interface, but the software should be ‘about’
the task itself and not the interface.

Factors influencing the design of solutions

Usability

Following on from the point above which describes usability as an effectiveness measure, it is
also one of the key factors that needs to be taken into account in the design stage. When
designing a software solution, the users themselves are of prime importance. If a solution does
not cater for the specific needs of the users, then the solution cannot be considered a success,
even if it accomplishes the tasks demanded of it.

The amount of experience that the users of the proposed solution have can also have an effect on
the types of functions that are included in the design. For example, if the users of a proposed
software solution are highly capable IT users with a lot of experience, it may be appropriate to
build extra functions into the solution that allow them to accomplish quite complicated tasks.
They may be given the ability to write scripts or create macros within the solution. If the users
did not have the knowledge or expertise to do these sorts of things, it would not be a good use of
time to build this into a solution.

Affordability

Often the scope and features of a proposed project will ultimately be limited by the amount of
money that is available in the budget. With unlimited funds and time, a solution to almost any
problem could be conceived, but this is not realistic. A design that does not take the available
budget into account can doom a project to failure.

Security: data protection and authentication

Figure 74: Login screen

It goes without saying that any software solution should contain security features to protect the
data it stores and ensure that those accessing it have the necessary levels of authority.
Legislation requires organisations to enact good data protection practices. The most relevant
legislation is the Privacy Act, but there are many that have
bearing on the ways in which organisations use, store and display their data.

While this is a given, a software developer needs to ensure that these aspects are not an
after-thought from a design perspective. Designing a user interface that allows the user to sign
in efficiently (or perhaps integrates with an SSO – Single Sign On) is important. It is also
important to consider what the interface will look like and how the solution will operate with
different levels of access enabled.

Interoperability

Designing a software solution so that it works with the other software packages being used on a
system is also a key consideration. Software packages will often share data and so need to use
file formats that are standards based such as XML.

An API (Application Programming Interface) in simple terms is like a library of routines that can
be plugged into a software solution to allow it to share data with another one. In designing a new
software solution, the ability to use particular APIs may be critical to allow it to be properly
integrated with the existing system or indeed systems in any organisation in which the software
may be used.

Marketability

Marketability is the measure of how readily something can be bought or sold. In what ways could
the marketability of a software solution have an effect on the design? It all depends on what
context the software solution is being designed for. If, for example, a software solution is being
designed for a small company and is only being planned for use by their employees and on their
hardware, then perhaps the marketability of the final product is not important. At the other end
of the spectrum would be a software solution that is being created specifically to be sold.
Between these two scenarios is probably the situation in which what is being created has the
possibility of being marketed at some future point in time.

Processing efficiency

In what ways can processing efficiency influence the design of a software solution? A solution may
require information to be available within a certain period of time and the data that needs to be
gathered in order to produce this information may be varied in type and form. It may also be the
case that the amount of processing that needs to be done is quite large and will have an effect on
the time the solution needs to run. Many software solutions that have a variety of data sources
and are producing information that is time-dependent will present users with different options at
different times and prompt users based on what they require.

For example, Parent Teacher Online by Country Net Solutions (shown in Figure 75) presents each
user with a menu that is structured so that data is entered in the correct order. Particular
options become available only at certain times, preventing users from trying to access information
before it has been correctly processed.

Figure 75: Parent teacher online booking
PTOnline by Country Net Solutions, image courtesy of Country Net Solutions
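
The idea of offering menu options only when their data is ready can be sketched in a few lines of
Python. This is an illustrative sketch only and does not describe how Parent Teacher Online is
actually implemented: the option names, the dates and the `available_options` function are all
hypothetical.

```python
from datetime import date

# Hypothetical menu: each option is usable only inside its own date window,
# so a later step cannot be attempted before the earlier data exists.
MENU = [
    ("Enter teacher availability", date(2024, 3, 1), date(2024, 3, 7)),
    ("Book interviews", date(2024, 3, 8), date(2024, 3, 20)),
    ("Print interview timetable", date(2024, 3, 21), date(2024, 3, 28)),
]


def available_options(today):
    """Return the names of the options whose window includes today."""
    return [name for name, opens, closes in MENU if opens <= today <= closes]


print(available_options(date(2024, 3, 10)))
```

A design along these lines keeps the interface simple for the user, who only ever sees the tasks
that can sensibly be performed at that point in the process.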

User interfaces

There is a lot of information available on the creation of effective user interfaces. The creation
of a user interface is often such a major part of the development of a software solution that
there will be software developers dedicated to working on this aspect of the solution. When you
reflect on successful technological devices of the last few years, they generally will all have
one feature in common: a highly successful user interface. A good user interface allows users to
access the software solution with ease and perform the tasks required of it. It has been said that
if a user begins to ‘notice’ a user interface, then perhaps the user interface is not doing its
job.

Let’s look at some of the characteristics of effective user interfaces.

Clear

A clear user interface is one that is easy to read and understand. The interface should
communicate its function in a way that does not leave the user wondering what a particular
function does. In instances where the user can hover over a button or function or access a help
dialog, the explanation should be clear and easy to understand.

Concise

As well as being clear (and possibly in competition with it) is the desire to be concise. Users do
not want to be presented with long explanations of the functions within the software solution.
Functions and explanations should be as concise as possible, explaining the tasks performed
without being so lengthy that they dominate the user interface.

Responsive

When discussing a responsive user interface, we are not just talking about speed. A fast user
interface is definitely desirable but this is often influenced by the processing that is being
done. A responsive user interface is not only as fast as it can be, but it gives feedback to the
user in the form of progress bars, messages, icons that indicate that processing is taking place
(hour glass, spinning wheel) and confirmation that tasks have been undertaken or completed.

Figure 76: Progress bar

Familiar

A user interface that is similar in style and operation to other successful user interfaces will
make users comfortable and mean that they will be able to use the software much more intuitively.
The term ‘familiar’ is a synonym for intuitive in this context, but it can be difficult to
describe how to make a user interface intuitive. Familiarity does not need to come from other
successful user interfaces; it can also come from a comparison with real-life objects and the ways
that we interact with them. A user interface that is styled as a book, for example, could have
controls that turn the pages or add bookmarks that can be referred to later on.

Figure 77: Microsoft Office menus
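
The kind of feedback described under ‘Responsive’ above can be sketched as a simple text progress
bar followed by a confirmation message. This is a minimal Python illustration; the `progress_bar`
and `process_records` names and the layout of the bar are assumptions rather than part of any
particular toolkit.

```python
def progress_bar(done, total, width=20):
    """Build a text progress bar showing how much of the work is complete."""
    filled = width * done // total
    percent = 100 * done // total
    return "[" + "#" * filled + "-" * (width - filled) + f"] {percent:3d}%"


def process_records(records):
    """Process each record, reporting progress and confirming completion."""
    for count, _record in enumerate(records, start=1):
        # Real processing of the record would happen here.
        print(progress_bar(count, len(records)))
    print(f"Finished: {len(records)} records processed.")


process_records(["a", "b", "c", "d"])
```

Even when the processing itself cannot be made faster, feedback of this kind tells the user that
the software is still working and when the task has been completed.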

Consistent

User interfaces should strive for consistency across all of their screens so that the user can
easily apply what they have learnt in one part of a software solution to the whole product. For
example, consistent navigational controls are placed in the same position each time and operate
the same way.

The different software programs in the Microsoft Office package all share a very similar interface
that allows users to move from one to the next without having to relearn the menu system in its
entirety.

Efficient

An efficient user interface is one that allows the user to execute tasks with a minimum of effort.
The task itself may appear easy to complete, but there are some challenges in doing this and
simply placing all of the functions that are available to the user on the screen at once does not
make the user interface efficient. Microsoft Windows, for example, will present the user with a
list of options when they try to save an image from a website, as shown in Figure 78. These
options try to anticipate the possible tasks that the user may want to accomplish.

Figure 78: Microsoft Windows right click menu

Attractive

It is easy to underestimate the effect that an attractive user interface has on the user’s overall
experience. Certainly the combination of colours, fonts, screen size and layout all play a part in
the attractiveness of a user interface, but it is more than this. An attractive user interface is
enjoyable to use and is tailored to the users themselves. The users at an accounting firm may find
a very logical and ordered user interface to be attractive, whereas the users at a fashion house
might like a user interface that features lots of bright colours and revolving fashion designs in
the background of the screen.

Scalable

A user interface that has been designed to be scalable is one that has allowed for the future
addition of functions or features in such a way that their addition will not adversely affect the
overall layout of the software. Although this is probably not high on the list of desirable
features of an effective user interface, it is nonetheless one that warrants consideration.

Forgiving

User interfaces that allow the user to roll back any transactions that have been performed are
highly desirable. The ‘undo’ and ‘redo’ functions are almost considered to be core functions of a
software solution by many and the inclusion of them means that users gain a sense of security and
don’t have to be tentative or hesitate when using the software solution.

Figure 79: Undo and redo functions
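
One common way to provide ‘undo’ and ‘redo’ is to keep two stacks of earlier states. The sketch
below is a minimal Python illustration under that assumption; the `History` class and its method
names are hypothetical, and real applications often store the commands that were performed rather
than whole copies of the data.

```python
class History:
    """Track the states of a value so that changes can be undone and redone."""

    def __init__(self, initial):
        self.current = initial
        self._undo = []  # states that came before the current one
        self._redo = []  # states that were undone and can be restored

    def change(self, new_state):
        """Record a new state; a fresh change discards the redo history."""
        self._undo.append(self.current)
        self.current = new_state
        self._redo.clear()

    def undo(self):
        """Step back to the previous state, if there is one."""
        if self._undo:
            self._redo.append(self.current)
            self.current = self._undo.pop()

    def redo(self):
        """Restore the most recently undone state, if there is one."""
        if self._redo:
            self._undo.append(self.current)
            self.current = self._redo.pop()


text = History("")
text.change("Hello")
text.change("Hello world")
text.undo()
print(text.current)
text.redo()
print(text.current)
```

Calling `undo` or `redo` when there is nothing to roll back simply leaves the current state
unchanged, which is itself a forgiving design choice.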


Context Questions

1. A user interface that is described as ‘responsive’ is more than simply ‘fast’. What other
attributes does a ‘responsive’ interface have?
2. What does it mean if a user interface is described as efficient?
3. What are the benefits in determining the evaluation criteria for a software solution during the
design stage?
4. Considering brainstorming techniques such as DeBono’s hats, what are the key ground rules
that should be established before participants begin giving suggestions?
5. List and describe 5 measures of a user interface in relation to its effectiveness.
6. In what ways do security requirements influence design?
7. What is the difference between user interfaces being described as ‘familiar’ as opposed to
‘consistent’?
8. The term ‘computational thinking’ refers to being able to think recursively. What does this
mean in a design context?

Applying the Concepts

• Reflecting on your favourite Apps, what aspects of their user interface design are most
appealing to you?
• Pick a household object or appliance and perform a SCAMPER design process on it. Each
person in the class could do a SCAMPER for a different object and compare results at the end.

Key Skills Checklist

At the conclusion of this chapter, you should be able to address the following key skills. Mark each off
as you can achieve them.

Generate new design ideas using one of several methods

List and describe efficiency measures of a software solution

List and describe effectiveness measures of a software solution

Describe factors that influence the design of solutions

Discuss what makes efficient and effective user interfaces


Sample Examination Questions

The following sample examination questions can be attempted to test your knowledge of the content
of this chapter.

Question 1
At the beginning of a project, Danny decides that it would be good for the software development team
to brainstorm possible solutions. Which of the following would be a valid brainstorming technique?
A. Creating a mock-up
B. Writing a pseudo-code algorithm
C. Drawing a mind-map
D. Researching current trends and products

Question 2
In creating a mind map, which of the following is the most correct?
A. The central idea or concept is placed in the middle of the mind map
B. Each person thinks of their own ideas and these are combined at the end
C. More important concepts are represented with larger circles
D. A maximum of 6 circles should be added to ensure that the ideas are not too broad

Question 3
A user interface that is described as being ‘responsive’ is:
A. One that is fast above all else
B. One that gives the user good instructions
C. One that is not only fast, but gives good feedback to the user
D. One that is able to print to a number of different network devices

Question 4
An interface that is ‘forgiving’ is one that:
A. Works reliably all of the time
B. Is easy to understand
C. Allows the user to undo transactions
D. Works on a variety of platforms

Question 5
Phillip is to use a method for generating new design ideas for an App he is developing. Which of the
following is not a method he could use?
A. Brainstorm
B. Use Case Diagram
C. DeBono’s thinking hats
D. Oblique strategies

Question 6
There are a number of ways to generate different design ideas. Describe two methods that can be
used and explain which of the two you would recommend for use by a local council attempting to
solve the issue of crime in the area.

Method 1: _________________________________________________________________________

Method 2: _________________________________________________________________________


Recommended: _____________________________________________________________________

__________________________________________________________________________________

__________________________________________________________________________________
3 marks

Question 7
Jill’s Green Thumbs is a garden maintenance business developing an App that will allow customers to
submit a request for a quote. It will:
- allow them to enter their contact information
- use location services to pinpoint their exact map location
- take photos, annotate them and add them to the detail of the quote
- connect to a database of available plants and materials and their cost
- allow the user to create sketches as needed.
- email the finished quote request to Jill’s Green Thumbs
In the table below, suggest a feature of the user interface for the App in each category.

Clear

Responsive

Familiar

Efficient

Forgiving

5 marks

Question 8
The annotated screen design below is for a proposed App that will be used in junior athletics to convert
the results of throws from para-athletes using MDS tables.


a. Despite the efficient design, when the developers seek feedback from their peers, many tell
them that it is not ‘scalable’. What does this mean and how could the design be changed to
make it ‘scalable’?

__________________________________________________________________________________

__________________________________________________________________________________
2 marks

b. Name two elements of the design shown that are ‘familiar’.

__________________________________________________________________________________

__________________________________________________________________________________
2 marks

Sample Examination Answers


Question 1
Answer: C

A and B require some decision to have been made on the direction / purpose of the App. D is not a valid
choice either as it is researching the ideas of others rather than coming up with one of your own.

Question 2
Answer: A

Question 3
Answer: C

Question 4
Answer: C

Question 5
Answer: B

Each of the other three options is a tool for generating design ideas. A Use Case Diagram is a method
of representing the processes within an existing system.

Question 6
Method 1: Brainstorming – ideas are written down as they are offered and then the group goes
through them critically.
Method 2: DeBono’s six thinking hats – a group brainstorming session is structured by focusing
everyone on facts (white hat) firstly and then moving through a number of phases.
Recommend: A complex issue such as crime in a local area might be well served by having a discussion
using DeBono’s thinking hats. This way the discussion can remain focused at all times.

There are a number of possible answers here. The two methods could include mind mapping or some
other method of generating design ideas. The method that is recommended is also not important – the
focus is on explaining why the method chosen would be a good choice.


Question 7
Clear: All aspects of the user interface should be easy to find and understand. It should be easy to see
where to start with the quote and how to submit it.
Responsive: Upload photos and information quickly and give the user an indication of what processing
is taking place.
Familiar: Use standard icons for features such as location services and email.
Efficient: Allow the user to complete their quote in as few steps as possible.
Forgiving: Allow the user to undo or go back to modify steps in their quote.

Other options are possible. When answering questions such as this in the exam, it is important to put
your answer into the correct context. Make sure that you mention aspects from the question or case
study in your answer.

Question 8
a. There isn’t enough space on the screen to expand the functionality if required. It could be
fixed by having a greater amount of white space or introducing a menu or tabbed interface.

b. The exit button (X) and the drop down lists.

Chapter 7
Of input and output

This chapter covers Unit 4: Area of Study 1 key knowledge:


• Digital systems
o KK3.1 Procedures and techniques for handling and managing files and data, including
archiving, backing up, disposing of files and data and security
• Data and information
o KK3.2 Ways in which storage medium, transmission technologies and organisation of files
affect access to data
o KK3.3 Uses of data structures to organise and manipulate data

Key terms: Archiving, backup, full backup, differential backup, incremental backup, backup media,
RAID, disposal, cloud storage, tape backup, hard disk, optical storage, solid state drive (SSD), data
structure, array, associative arrays, classes, fields, files, hash tables, linked lists, queues, records,
stacks.

Study roadmap

[Table: Chapters 1 to 10 mapped against Unit 3 (Area of study 1 – Programming; Area of study 2 –
Analysis and Design) and Unit 4 (Area of study 1 – Development and Evaluation; Area of study 2 –
Cybersecurity: software security)]

Handling files and data

A number of procedures and techniques are often used when managing files and data. As files are easily
corrupted, it is often best to leave them ‘open’ for the shortest time possible. From a programming
perspective the best way to do this is to ‘open’ the file, read in its contents and then ‘close’ the
file. If the file is ‘open’ and the software solution experiences an unexpected crash, it is quite likely
that the file will be damaged. Opening files for prolonged periods can also mean that other users
cannot access the file.

Input files should be located in a folder that is specific to the software solution so that they are easy
to locate and are grouped with the solution when files are backed up. This can also be achieved by
including an import feature in the software that copies a file from any location to a secure folder
within the solution.

Naming of files is also important. A specific file extension may be chosen so that files are not
misinterpreted as belonging to another software package. Files should be named in ways that identify
them and distinguish them from previous versions.

Of specific importance when discussing the handling and management of files and data are issues of
security, archiving, backing up and disposal.
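The ‘open, read, close’ advice in this section can be sketched in Python, where a `with` block closes
the file as soon as its contents have been read, even if an error occurs part-way through. The file
name `scores.txt` and the helper `read_all_lines` are illustrative assumptions, not part of any
particular solution.

```python
from pathlib import Path


def read_all_lines(path):
    """Open the file, read every line into memory, then close it straight away."""
    with open(path, "r", encoding="utf-8") as handle:
        lines = [line.rstrip("\n") for line in handle]
    # The file is already closed here; all further work uses the in-memory copy.
    return lines


if __name__ == "__main__":
    data_file = Path("scores.txt")  # hypothetical input file
    data_file.write_text("20\n15\n10\n30\n", encoding="utf-8")
    print(read_all_lines(data_file))
```

Keeping the file open only for the duration of the read shortens the window in which a crash could
damage it or in which other users are locked out.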

Security of files

A common use of files is the storing of sensitive information such as usernames and passwords or logs
of user activity. It can be a dangerous practice to leave these files unprotected, as someone could
choose to access the file directly rather than through the software solution, potentially bypassing any
security features of the program. The easiest way to protect files of this sort is to encrypt them – a
process that is described in more detail in Chapter 10. Files can also be password protected or placed
into folders that are only accessible by those with administrator permissions.

Archiving

Archiving is a task performed by all organisations at some time. As the amount of storage space that is
available for software and data is finite, archiving is necessary to remove those files that are no
longer needed for immediate access. The size and number of files that are being used on the main
storage media also has a significant effect on the backup routine – as a large amount of data is much
harder to back up than a small amount (see below). Archiving is simply the process of moving those
files that no longer need to be accessed to another storage medium that can securely store them for an
indefinite period of time. Should the files in an archive need to be accessed, they should be able to
be opened via a process that makes them available to the software.

Backing up

Data and information within an organisation are exposed to a number of threats. These threats range
from the accidental damage or loss of storage devices to deliberate acts that destroy, delete or
corrupt data. The process of backing up the data and information within an organisation is vital to
ensure that the operation of the organisation continues as unaffected as possible in the case of a data
loss of this kind.

Version control is paramount to managing the backups that are being produced in an organisation.
Version control software (VCS) packages are applications that assist organisations to keep track of the
versions of their backups / files.

Performing a backup can be as simple as selecting a number of files and making a copy of them that can
be placed in a secure location. Organisations are generally much more organised than this and will
have a specific backup routine that they follow regularly. The extent and frequency of this routine is
often determined by how much data there is to back up, what facilities are available and how valuable
the data is.

The use of an Uninterruptible Power Supply (UPS) can keep a server running in the event of a power
outage for enough time for an up to date backup to be made and for the system to be shut down in a
safe manner that doesn’t damage any files.

There are three different types of backup: full, differential and incremental.

When safes are labelled ‘fireproof’, this often means that they are ‘flameproof’ and will protect their
contents for a certain period of time. Higher quality safes are also ‘waterproof’ so that the contents
are not damaged by the water that is used to put out a fire that is threatening the safe.

Full backup

A full backup is the simplest method of backing up data. When a full backup is performed, all of the
data is copied to the backup media. This can be time consuming if there is a large amount of data and
requires the use of a backup medium that can accommodate the combined size of all of the files (which
can be expensive). The advantage in using a full backup comes when the files need to be restored. As a
full backup is a complete set of all of the organisation’s data, it can be simply restored all at once
or aspects of it selected.

Differential backup

A differential backup is used in conjunction with a full backup. A full backup is made and then
subsequent backups contain only those files that have been changed (since the full backup). Note that
there are technically only two backups when using this method (although more copies may be made). The
full backup is the first and the second is the most recent differential backup. This makes the process
of completing daily backups much simpler and quicker, but does mean that the process of restoring the
backup is more complex than restoring a full backup. To fully restore the backup, the full backup would
need to be restored, followed by the most recent differential backup.

The size of the differential backup will grow as the number of files that have been modified from the
last full backup also grows. A point will be reached when it becomes practical to make another full
backup to reduce the size of the differential backup. To get around this problem, an incremental
backup can be used.

Incremental backup

An incremental backup works in a similar way to a differential backup. A full backup is made to start
the process. Subsequent backups then consist of any changes that have occurred since the last backup
(full or incremental). This makes the backup process very fast and efficient. An incremental backup is
more complicated to restore however, as the full backup needs to be restored followed by all of the
incremental backups in order. For this reason, a logical file naming convention is needed to ensure
that the order of backups is maintained.

When to perform a backup?

Backups are generally performed as automated processes at the end of the working day. It is important
that the users of the system have finished their work and closed all of their files, as open files can
interrupt or impede the backup procedure. A backup procedure can also take a long time to complete. For
these reasons, backups are often set to be performed during the night.

As they are often automated processes, the backup media have to be able to accommodate the size of the
backup without needing to be swapped or changed.

How often to back up can be a personal choice. There is a saying that you should only back up that
which you cannot afford to lose. At the same time, there has to be a balance between the time spent
backing up and the time spent working. Daily backups are commonly implemented by companies, but this
could be varied based on the size of the company and the amount of data being processed.

Backup media

There are a wide variety of hardware devices that can be used to store backup data. The choice of
device will depend on the capacity required, access speed, convenience and cost. Capacity is of
particular importance as the backup media should ideally be able to back up all of the data without the
media needing to be swapped. Swapping media while in the process of backing up will complicate the
process and make it less likely that the backup will be performed correctly each time. It is also
harder to automate a backup process that involves swapping the backup media.

Cloud storage

When it comes to backup media, many organisations will make use of cloud storage solutions. This may be
via a formal backup procedure that is moving data to the cloud storage site on a regular schedule or
simply manually moving data to one of the many cloud platforms such as Google Drive or Microsoft’s
OneDrive.

The advantage of using cloud storage as a backup medium is that it is independent of physical media and
relies only on the accessibility of the cloud server. There is little infrastructure that the
organisation needs to implement. Retrieving a backup is as simple as connecting to the cloud server and
transferring the required files. The main disadvantage of using this form of backup medium is that an
organisation will be unable to access the data in the event of a power or network outage.

Tape backup

Tape backup is still used today despite it being one of the oldest forms of backup media available.
Some organisations utilising tape backup systems will be doing so as it is a legacy system and all of
their previous backups are in this form. Its use is certainly declining, but it will probably be used
for a number of years still based on the advantages that it provides.

Tape is very good at holding onto data for really long periods of time and is still relatively cheap.
Many tape formats offer encryption features, which allow for easier compliance with various laws and
information security standards. In addition to this, a tape backup means that an organisation maintains
physical control of the data, which may cause organisations to opt for media like this over cloud
storage.

Tapes are portable, small and manageable. They are easy to store compared to hard drives, servers and
other physical media. They are also very durable. Many digital tapes can retain data for 30 years or
more without regular maintenance. Given this and the fact that the cost is relatively low, an
organisation can archive data without the need to reuse backup media.

The capacities of tape drives are increasing each year. The current standard (LTO-8) can store 12 TB,
or 30 TB if the data is compressed. Data can be accessed at a rate of 320 MB per second.

Figure 80: Backup tape LTO storage installed in a data centre server room

Hard disk

The price per capacity for hard disks has been making them more competitive as a backup medium in
recent years. The main advantage that hard disk storage has over tape storage is access time.
Performing a backup to a hard disk is relatively quick and retrieving data from the disk is equally so.
Capacities of hard disks are now in a similar range to that of tapes. In addition to this, hard disks
are highly portable, can be connected to an information system in a wide variety of ways and are a
physical form of media that, once again, allows an organisation to maintain control of its data.

The main disadvantage of hard drive storage is its fragility. Being a mechanical device, it is more
easily damaged (especially while being transported). Some hard drive manufacturers have included
technology to protect a drive’s contents in the event of it being dropped and others encase their
drive in a shock absorbing case or surround.

Despite these improvements, drives can be easily damaged and manufacturers do not make any guarantees
about the safety of the data stored on their drives in the event of a drop or physical event.

Hard drives also do not retain their data for long periods of time. Given their cost, they can be
replaced with newer drives, but organisations would need to be purchasing new hard drives every 3-5
years.

Figure 81: Inside a hard disk drive

Optical storage

Optical storage has been in rapid decline with the increased use of cloud storage and the low cost of
portable hard drives. Recordable CDs, DVDs, and Blu-ray discs were common forms of backup, especially
for consumers, due to their low cost. Each of the optical storage media listed above had limited
capacity and was also time consuming to write data to. Interestingly, tests conducted on standard
consumer grade optical media have shown that it has an archival period of up to 10 years. High end
optical media produced with a special gold-sputtered layer is projected to have a retention as high as
100 years.

Solid-state drive (SSD)

Solid-state drives (also known as USB flash memory, thumb drives etc.), are devices that are relatively
expensive for their low capacity in comparison to hard disk drives, but are highly portable and easy to
transport. Unlike a hard disk, a solid-state drive does not contain any movable parts making it less
susceptible to physical damage. Their speed of access is determined by the interface, but is generally
in the range of 500Mbit/s to 6Gbit/s. Capacities continue to grow.

While SSDs can be very convenient to use, their use as a backup medium is questionable. Given the short
amount of time that SSDs have been in mainstream use, information concerning their ability to retain
data is limited. Their cost per unit of storage makes them one of the most expensive forms of media
storage, and when an SSD fails, data recovery is difficult.

Backup Media

Backup media       Type            Capacity                              Cost              Max write speed
Optical media      CD-R            ~700 MB                               ~$0.30 per disc   ~7 MB/sec (52x)
Optical media      DVD             4 GB                                  ~$0.60 per disc   ~21 MB/sec (16x)
Optical media      Blu-ray         25 GB / 50 GB (single / dual layer)   ~$3-$8 per disc   ~72 MB/sec (16x)
Hard disk          Internal        1-14 TB                               $50 - $1000       ~200 MB/sec
Hard disk          USB             1-5 TB                                $50 - $400        ~130 MB/sec
Solid state drive  Internal disk   120 GB – 4 TB                         $30 - $1000       ~550 MB/sec
Solid state drive  USB             16 GB – 512 GB                        $10 - $400        ~300 MB/sec
Magnetic tape      Tape            Up to 1 TB                            $40 per tape      ~240 MB/sec

Figure 82: Backup media
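Whichever medium is chosen, the full, differential and incremental backup types described earlier in
this chapter differ only in which files they select. The following Python sketch illustrates that
selection rule under simplifying assumptions: the `FileInfo` record, the function `files_to_back_up`
and the use of plain day numbers in place of real timestamps are all hypothetical names and choices
made for illustration, not the behaviour of any particular backup product.

```python
from dataclasses import dataclass


@dataclass
class FileInfo:
    name: str
    modified: int  # day number on which the file was last changed (illustrative)


def files_to_back_up(files, kind, last_full, last_backup):
    """Return the names of the files that a backup of the given kind would copy.

    last_full   -- day of the most recent full backup
    last_backup -- day of the most recent backup of any kind
    """
    if kind == "full":
        # Everything is copied, changed or not.
        return [f.name for f in files]
    if kind == "differential":
        # Everything changed since the last FULL backup.
        return [f.name for f in files if f.modified > last_full]
    if kind == "incremental":
        # Only what has changed since the most recent backup of any kind.
        return [f.name for f in files if f.modified > last_backup]
    raise ValueError(f"unknown backup kind: {kind}")


if __name__ == "__main__":
    files = [FileInfo("a.doc", 0), FileInfo("b.doc", 2), FileInfo("c.doc", 4)]
    # Assume a full backup on day 1 and another backup on day 3.
    for kind in ("full", "differential", "incremental"):
        print(kind, files_to_back_up(files, kind, last_full=1, last_backup=3))
```

In this example the differential backup picks up both files changed since the full backup, while the
incremental backup picks up only the file changed since day 3, which is why an incremental restore
needs every backup in the chain.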

Figure 83: A hard disk next to an SSD

Other factors to consider when creating a backup procedure

Data may need to be recovered from a backup when it is lost as a result of an accident or from the
actions of a deliberate threat. Accidents can happen and employees may simply lose or delete data and
may not be able to recover it themselves. Deliberate threats such as malware and hacking can also
result in deleted or modified data and are discussed in more detail in Chapter 10. Backups should be
easily accessible so that the time between the detection of the loss and the recovery of the data is
minimal. No matter how good the backup procedure in an organisation, there will often be a version gap
between the data that has been lost and the data within the last good backup. For this reason, it is
often a good practice to encourage the users of the system to make their own backups, especially if
they are making frequent changes.

At least one set of backup media should be kept off-site in the event of a disaster (such as fire or
flood). The use of fireproof safes is a good practice, but even though such containers are billed as
being ‘fireproof’, the high temperatures that are reached in a fire can mean that backup media kept
within them can be damaged nonetheless.

Multiples of backup media should be produced in the event of the media itself failing. For example, it
would be quite dangerous to have one single backup that was being continuously rewritten. Not only
would this mean that the backup could potentially fail sooner (due to its high usage), but in the
event of failure, there would not be any other backups to use instead. Organisations will often use a
set of backup media that they will cycle through (one for each day of the week, for example).

Lastly, a backup procedure needs to include a detailed and well documented restore procedure.

RAID – an alternative to backup?

A Redundant Array of Inexpensive Disks (RAID) can be used to increase the reliability of data and the
speed at which it can be accessed. It is not really an alternative to backing up data, but a RAID can
be configured so that a real-time backup of data is kept. Different RAID configurations are known as
levels and several exist. A RAID 1 configuration utilises at least 2 HDDs and simply copies the
contents of one to the others. In the event of one HDD failing, the others have an exact copy. Other
RAID configurations spread the data across the HDDs in such a way that if one HDD fails, it can be
removed and a new one installed in its place without any loss of data. The data on the other HDDs can
be used to ‘build’ the data that was stored on the failed HDD.

Figure 84: A RAID configuration
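The idea of ‘building’ a failed disk’s data from the surviving disks can be illustrated with XOR
parity, the approach commonly associated with RAID 5. The text above does not name a mechanism, so
this is an assumption chosen for illustration; the helper `xor_blocks` and the sample byte strings are
hypothetical, and a real RAID controller works on stripes across many disks rather than two short
blocks.

```python
def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    result = bytearray(len(blocks[0]))
    for block in blocks:
        for i, byte in enumerate(block):
            result[i] ^= byte
    return bytes(result)


if __name__ == "__main__":
    disk_a = b"BACKUP01"  # data block on the first disk (illustrative)
    disk_b = b"DATA4321"  # data block on the second disk (illustrative)

    # The parity block is stored on a third disk.
    parity = xor_blocks([disk_a, disk_b])

    # Suppose the second disk fails: its block is rebuilt from what survives.
    rebuilt_b = xor_blocks([disk_a, parity])
    print(rebuilt_b == disk_b)
```

Because XOR-ing a value with itself cancels it out, combining the surviving data block with the parity
block leaves exactly the missing block.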

Disposal of files

The saying “information is power” has quite a lot of relevance in today’s information age. Rarely is
information simply disposed of or deleted. Information is more commonly archived in preference to
disposing of it as it is often hard to know what information is going to be useful later on. Having
said this, there are often data files that are created as temporary files storing various calculations
on the way to the production of a final result. It is useful to be able to identify files such as these
as their storage can add to the size of a system backup and can make it harder to find important
information.

Attributes of files that affect access of data

The size of files, the way files are organised and the storage medium onto which they have been placed
all affect the ways in which files can be accessed. This is not a one-way relationship. The need to
have files accessed in a certain way will in turn influence the choices that are made about their
structure. Files that are used frequently by an organisation need to be placed in a location that is
easily accessible and on a storage medium that is both fast and reliable. The size of important and
frequently used files can often be large which puts pressure on the network infrastructure as well as
the storage device. Less frequently or widely used files can be distributed around a network in
locations that are convenient for those using them.

The choice of storage media also has a significant effect on the way that files are accessed. Storage
media are often selected based on their capacity, but other attributes that are important are speed of
access (latency), reliability, cost and ease of use.

Data structures and how they can be used to store and manipulate data

Data structures provide a way of storing and organising data so that it is easier to access or more
efficient to use. There are a wide variety of data structures available to software developers and the
choice of which ones to use will be made based on the problem that needs to be solved and the ways in
which the data needs to be used.

Data structures: The way data is stored to enable efficient algorithms to be used to optimise program
execution time and memory usage. Types of data structures include: arrays, associative arrays,
classes, fields, files, hash tables, linked lists, queues, records and stacks.
VCAA VCE Computing Study Design 2020-2023 Glossary

It is worth noting that we are often not concerned with how data structures are represented in memory,
though some structures use more memory than others.

Many of the data structures that are in this course were discussed in Chapter 3. In the section below,
we will discuss these data structures and others in the context of how they can be used to store and
manipulate data.

Arrays

When a set of data elements is collected under the same name, it is known as an array. Arrays can be
created to hold any type of data, the only constraint being that all of the array elements need to be
of the same type. Each array element can be separately accessed using an index (or in the case of
multi-dimensional arrays, a number of indices).

Number
Index    0    1    2    3
Value   20   15   10   30

Figure 85: An array of integers called ‘Number’

Array declaration

Arrays can be declared statically or dynamically.

Static arrays have a defined number of elements while dynamic arrays expand as elements are added to
them. Arrays are declared to be of one type only. The benefit of declaring a static array is that its
size is known. A software developer can calculate what storage space is available or required, and can
declare an array to maximise the use of this space. Loops can then be constructed that iterate through
all of the elements within the array easily.

Arrays can also be declared as multi-dimensional arrays. A two-dimensional array (for example), has two
indices (which can be thought of as an ‘x’ and ‘y’ index).

Number
Index    0    1    2    3
0       20   15   10   30
1       65   23    0  -43
2       10   14    5   27

Figure 86: A 2-dimensional array of integers called ‘Number’

Array access

Elements in an array are accessed by using the index which points to each individual value. The index
of the first element is ‘0’, and increases by increments of 1. This may seem strange, but has evolved
from the way that arrays are stored in memory. The first element in the array would be accessed from
the first memory location (the one pointed to by the variable). Subsequent elements are accessed via an
‘offset’ (that is, if the offset is ‘1’, this means that the next element is located at the memory
location pointed to by the variable ‘+1’).

For example,

Number[0]

points to the first element in the array. To access the second element of the array, an index of ‘1’
would be used as shown below.

Number[1]

To cycle through all of the elements of an array, a loop can be used that runs from index 0 to 1 less
than the size of the array.

Begin
    Length ← Len(Number)
    For Loop ← 0 to Length - 1
        Display Number[Loop]
    Next Loop
End

Figure 87: Cycling through all of the elements of an array

Benefits or shortcomings

Arrays are one of the simplest data structures that are available. They are easy to use and almost
every programming language supports them as a core structure. Many more complex data structures use
arrays as their base, so an understanding of how to declare and use arrays is essential for a software
developer to be able to make use of these.

The size of an array can make it time consuming to navigate, though there are algorithms (some of which
we have already discussed) to make this easier. The fact that only data of one type can be stored is
definitely a shortcoming, but the use of ‘records’ solves this issue.

Knowing the ‘bounds’ of an array is essential as stepping outside them is a common source of error in
code. It is very easy to ‘step’ 1 index outside of the bounds of an array (especially at the end).
Thorough testing usually picks up errors such as these, but it is also one worth highlighting as it
occurs frequently.
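The array operations described in this section can be tried out in Python. This is a sketch rather
than a direct translation: Python lists are dynamic and can hold mixed types, so they only approximate
the statically declared, single-type arrays discussed here. The helper `safe_get` is a hypothetical
name introduced to show a bounds check.

```python
number = [20, 15, 10, 30]  # the one-dimensional array 'Number'

print(number[0])  # first element (index 0)
print(number[1])  # second element (index 1)

# Cycle through every element: indices run from 0 to len(number) - 1.
for index in range(len(number)):
    print(index, number[index])

# A two-dimensional array is indexed with a row and a column.
grid = [
    [20, 15, 10, 30],
    [65, 23, 0, -43],
    [10, 14, 5, 27],
]
print(grid[1][3])  # row 1, column 3


def safe_get(values, index):
    """Return values[index], or None when index is outside the array bounds."""
    if 0 <= index < len(values):
        return values[index]
    return None


# Index 4 is one step past the end of a four-element array.
print(safe_get(number, 4))
```

Without a check such as `safe_get`, reading `number[4]` raises an `IndexError` in Python, which is
the ‘one index past the end’ mistake that the text warns about.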

Associative arrays

One of the main difficulties with using an array comes when trying to find an element within it.
Associative arrays solve this by using a pair of values (one that acts as a unique key and one that
stores the value).

Let’s say that we wanted to record the calories of particular fruits. We could represent this in an
associative array like the one below.

“Apple” : “95”
“Banana” : “111”
“Apricot” : “17”
“Kiwi” : “112”
“Lemon” : “17”
“Mango” : “202”
“Orange” : “62”
“Lime” : “20”
“Passionfruit” : “17”

Figure 88: Storing information in an associative array

Array access

Elements in an associative array can be cycled through just as they can in a normal array. A loop that
starts at index 0 and proceeds to the length of the array – 1 can be used to move from the first to the
last element. At each index position, the key or the value can be accessed. Doing this doesn’t harness
the power of an associative array however.

Associative arrays allow a user to access a value via the key. The exact syntax will be programming
language dependent, but something like the following is typical.

Name[“ID5243”]

Cycling through all of the array elements could be done using code like that shown in Figure 89.

Begin
    Length ← Len(Number)
    For Loop ← 0 to Length - 1
        Display Number_Key(Loop)
        Display Number_Value(Loop)
    Next Loop
End

Figure 89: Cycling through all of the elements of an associative array

Benefits or shortcomings

In the base form of an associative array, only one value may be stored per key. If a number of values
were needing to be stored, this could be accomplished by using duplicate arrays. Some programming
languages will allow data structures to be used in the place of the ‘value’. In this case, the key can
reference an array of values.

Classes

A class is a template for objects that have a number of variables, methods and events associated with
them. It can be instantiated as many times as required and each instance used independently of the
others. Just as a variable can be declared to be of a particular type, an object can be created that is
of a particular class and named in a way that allows it to be referenced on its own.

Let’s consider an example in which we want to declare a class to store information related to
sprinklers in a home sprinkler system. We could represent this as a class like the one below.

Class declaration

Classes consist of variables (within the instance), methods and events. In this example, let’s define
the class as shown in Figure 90.

Using this example, a Sprinkler could be declared using a line similar to the one below:

Sprinkler front_lawn = New Sprinkler( )
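In a concrete language, the Sprinkler class used in this section might look like the following Python
sketch. The snake_case names, the default values and the attribute name `is_spraying` are assumptions
made for illustration: the attribute is given a different name from the `spray_on()` method because in
Python the two cannot share one name. The final lines also hold the instance in an associative array
(a Python `dict`) keyed by a hypothetical zone name, tying back to the previous section.

```python
class Sprinkler:
    """A Python rendering of the Sprinkler class used in this section."""

    def __init__(self, name="", location=0, power=0):
        self.name = name
        self.location = location
        self.power = power
        self.is_spraying = False  # named so it does not clash with spray_on()

    def get_location(self):
        return self.location

    def spray_on(self):
        self.is_spraying = True

    def spray_off(self):
        self.is_spraying = False


if __name__ == "__main__":
    front_lawn = Sprinkler()
    front_lawn.name = "Front Sprinkler 1"
    front_lawn.location = 1
    front_lawn.power = 5

    front_lawn.spray_on()
    print(front_lawn.get_location(), front_lawn.is_spraying)

    # Instances can themselves be the values of an associative array.
    sprinklers = {"front_lawn": front_lawn}
    sprinklers["front_lawn"].spray_off()
    print(front_lawn.is_spraying)
```

Each call to `Sprinkler()` creates a separate instance with its own copies of the variables, so
turning one sprinkler on has no effect on any other.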

121 | P a g e
Software Development: Core Techniques and Principles 4th edition

class Sprinkler

    string Name
    integer Location
    integer Power
    Boolean SprayOn

    public int GetLocation( )
        return Location

    void SprayOn( )
        SprayOn ← True

    void SprayOff( )
        SprayOn ← False

end class

Figure 90: Example class definition

The variables within this instance could then be set using lines such as the ones below:

front_lawn.Name = “Front Sprinkler 1”
front_lawn.Location = 1
front_lawn.Power = 5

You will also be able to see some methods that have been defined. These could be called in the
following ways:

front_lawn.GetLocation( )
front_lawn.SprayOn( )
front_lawn.SprayOff( )

Benefits or shortcomings

For each problem being solved, there are data structures and techniques that can be used to best
suit the data being stored and the processing required. The use of classes can be very beneficial
when a large number of complex objects needs to be created and standard variables and functions
would be difficult to use. There are many other aspects of using classes that are beyond the scope
of this course and text, but in implementing classes, they can be combined and expanded upon in ways
that make them extremely powerful. If you wish to investigate these further, read up on the
concepts of encapsulation, inheritance, polymorphism and abstraction.

On the negative side, classes can be tricky to define initially.

Records and fields

Easier to implement than classes, but with similar functionality on a base level, are records and
fields.

A record is a structure that can be used to group together variables for a particular purpose.
Records are similar to arrays except that where an array usually contains elements all of the same
variable type, the fields within a record may be of different types and sizes.

Accessing a record or field

Let’s consider an example where we are storing information related to the items being sold in a
store.

item(index).id ← “0011”
item(index).name ← “Screwdriver”
item(index).brand ← “Champion”
item(index).model ← “MX-52”
item(index).stock ← 12
item(index).location ← “3B”
item(index).price ← 10.99

Figure 91: An example record data structure

Benefits or shortcomings

Records have some specific advantages over using ‘plain’ one dimensional arrays. If data like that
shown in the example were to be represented in arrays (one for each field), each one would be
disconnected from the others and would not make a lot of sense on its own. For example, if the price
were required for the item with the ‘id’ of ‘0012’, the index of this item would need to be found in
the ‘id’ array and then this index would need to be used in the ‘price’ array. The main difficulty
with such an arrangement comes when the information
needs to be sorted in some way. Sorting a number of parallel arrays (and keeping them in sync with
each other) is a difficult task.

Files

While we have been dealing with ways in which data can be represented in memory, it is also
important to consider how data will be stored. Storing data on secondary storage devices (such as
hard drives) requires the use of a file. A file is a way to organize stored data for access at a
later time.

In Chapter 3, we discussed the use of TXT, CSV and XML file formats. Each file format has its own
benefits, but files will all generally be accessed in similar ways in a software solution. An
example from an XML file is shown below:

<stock_item>
    <id>0002</id>
    <type>Fish</type>
    <name>Comet Goldfish</name>
    <quantity>15</quantity>
    <cost>$12.23</cost>
</stock_item>
<stock_item>
    <id>0003</id>
    <type>Plant</type>
    <name>Kelp</name>
    <quantity>20</quantity>
    <cost>$0.43</cost>
</stock_item>

Figure 92: An example section of an XML file

Benefits or shortcomings

When accessing a file within a program, it is important to control its access carefully. Files need
to be opened for use and then closed after the program has finished reading from them and writing to
them. Files should be kept open only for the amount of time that is required to read or write to
them. There is a danger that if the program crashes while it has files open, these files will be
corrupted. As data is such an important commodity, protecting it should be at the forefront of any
solution design. In addition to this, multiple users will often access files simultaneously and will
have different permissions to access the file.

Choosing between TXT, CSV and XML files

When choosing to use a file to store data, the format will be largely dictated by the amount of data
that will be stored and the ways in which it will be accessed. For example, if a software solution
needs to store some default values to be used in the software solution, a TXT file will suffice.

When a software solution needs to store array values or fields associated with records, a CSV file
can be a good solution. The thing to be aware of when choosing a CSV file format over an XML format
is that the number of fields needs to be constant for each record. In addition, when writing to or
reading from a CSV file, the order of the fields needs to be hard coded into the software solution.

In contrast to this, an XML file adds flexibility of format and structure. The tags within an XML
file indicate the name of the field. Not only this, but they can be in a different order in each
record, and records can contain different fields. An XML file is very easy to read and understand in
isolation, and a software developer could easily utilise it without having to read any documentation
to know what it contains.

Hash tables

A hash table is essentially a way of implementing an associative array. While an associative array
of unlimited size can be used to store the occurrences or the values associated with key variables,
a hash table uses a hashing algorithm to store these values more efficiently.

The elements within a hash table data structure consist of a key and a value (just as they do in an
associative array). A hash function will work out the index in an array where the key:value pair
will be stored. In the
case of the array index being used already, the key:value pair is attached in a process known as
‘chaining’. In effect, each element of the hash table should be an array or list in its own right.

An example of a hash table is shown below:

Array
B: 102
G: 305 → GA: 452 → GF: 872
J: 725
L: 273 → LM: 526
M: 662
P: 012 → PR: 526 → PS: 210
S: 901 → SD: 431
Z: 032

Figure 93: An example of a hash table

Benefits or shortcomings

Hash tables (like associative arrays) make it quick and easy to locate specific elements within the
array. Searching is most efficient when the element being searched for exists, but it is also easy
enough to establish that an element does not exist, by hashing to the index where it should be
located and following the chain to the end.

If a poor hashing algorithm is used, a hash table could potentially have a large number of elements
in a small number of chains and, in effect, become just like a one-dimensional array. In this case,
searching for elements would become very inefficient. However, a hashing algorithm could be assessed
to ensure that it is distributing elements evenly throughout the array, and if it is performing in
this way, the benefits of using a hash table will be easy to see.

Stacks

A stack is a data structure that works on the principle of First In - Last Out (FILO) (or Last In -
First Out, depending on your perspective!). A stack is a one-dimensional array of values that are
accessed in a particular way.

There are only two ways in which a stack data structure can be accessed. Data can be placed on ‘top’
of the stack. This function is called the ‘push’ function. Data that is taken off the stack can only
be taken off the ‘top’ and this function is called the ‘pop’ function. Diagrammatically, a stack
could be represented as shown in Figure 94.

Stack
Top: -1

Figure 94: A diagrammatic representation of a stack

Note however that a stack is just a one-dimensional array. Drawing it vertically does help to
visualize what is going on within the stack, as the process of placing items on top of the stack and
taking items off is the same as what would happen with a physical stack of objects.

In order to operate a stack, two things about the stack must be known. We must keep track of the
position (or index) of the ‘top’ of the stack. We must also know how large the stack is, just in
case we fill it to capacity.

Consider the stack from Figure 94. If we were to perform the following ‘push’ and ‘pop’ commands,
this would add items to the stack and remove them. Note that as items are added to or removed from
the stack, the ‘top’ variable is changed to the index of the top element.

Push(10)
Pop
Push(3)
Push(22)
Push(74)
Pop
Push(17)
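
The push and pop operations described for a stack can be sketched in Python as shown below. This is
a minimal illustration rather than a definitive implementation: the class name ArrayStack, its
method names and the capacity of 5 are assumptions made for the example. It stores the stack in a
fixed-size array and tracks the ‘top’ index, starting from -1 for an empty stack.

```python
class ArrayStack:
    """Fixed-size stack stored in a one-dimensional array (a Python list)."""

    def __init__(self, capacity):
        self.items = [None] * capacity
        self.capacity = capacity
        self.top = -1  # -1 means the stack is empty

    def push(self, value):
        if self.top + 1 >= self.capacity:
            raise OverflowError("stack is full")
        self.top += 1
        self.items[self.top] = value

    def pop(self):
        if self.top == -1:
            raise IndexError("stack is empty")
        value = self.items[self.top]
        self.items[self.top] = None
        self.top -= 1
        return value

    def contents(self):
        """Return the stored values from bottom to top."""
        return self.items[: self.top + 1]


stack = ArrayStack(capacity=5)  # the capacity is an assumption
stack.push(10)
stack.pop()
stack.push(3)
stack.push(22)
stack.push(74)
stack.pop()
stack.push(17)
print(stack.contents(), stack.top)  # [3, 22, 17] 2
```

After this sequence the stack holds 3, 22 and 17 (bottom to top) and ‘top’ is 2.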


Stack
17
22
3
Top: 2

Figure 95: The stack after some elements have been added

Benefits or shortcomings

Stacks are useful data structures as they operate in a First In - Last Out way. This is particularly
useful in executing code and backtracking through calls to modules and functions once they have been
completed. In fact, you may have encountered a ‘stack overflow’ error if you have had a circular
function call in some code that you have written.

While easy to implement, stacks are limited by their size. A software developer will aim to make the
size of the array to be used for the stack big enough so that it will cater for the requirements of
the problem, but not so big that it is using up a large amount of memory unnecessarily.

Many languages have stack data structures available natively, so that software developers don’t have
to code their own ‘push’ and ‘pop’ routines. In these cases, it is only necessary to declare a new
stack variable and then use the available ‘push’ and ‘pop’ methods to place items on to the stack or
remove them.

Queues

A queue data structure can be implemented in a very similar way to a stack except that it works as a
First In - First Out (FIFO) structure. Items in a queue are added to the back and removed from the
front. For this reason, it is necessary to keep track of where both the front and the back of the
queue are positioned in the array. Adding an item to the back of the queue is referred to by the
term ‘enqueue’ while removing an item from a queue is called ‘dequeue’.

Let’s work through an example so that you can see how a queue works. The following commands will be
executed on the queue shown in Figure 96 below.

Queue
Index          0    1    2    3    4
Queue(Index)

Front: 0
Back: -1

Figure 96: Queue example

enQueue(20)
enQueue(40)
deQueue
enQueue(35)
deQueue
enQueue(15)
enQueue(10)

The ‘front’ variable points to the front of the queue while ‘back’ points to the back of the queue
(in this case ‘back’ is set to ‘-1’ to indicate that the queue is empty).

As items are added, they are added to the back of the queue. As items are removed, they are removed
from the front of the queue.

Queue
Index          0    1    2    3    4
Queue(Index)             35   15   10

Front: 2
Back: 4

Figure 97: Queue example continued
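
The queue example can be traced with a short Python sketch. The class name ArrayQueue and its method
names are assumptions made for this illustration. It uses a simple (non-circular) array with ‘front’
and ‘back’ indexes, following the convention of starting ‘back’ at -1.

```python
class ArrayQueue:
    """Simple (non-circular) queue stored in a fixed-size array."""

    def __init__(self, capacity):
        self.items = [None] * capacity
        self.capacity = capacity
        self.front = 0
        self.back = -1  # -1 means nothing has been added yet

    def enqueue(self, value):
        if self.back + 1 >= self.capacity:
            raise OverflowError("back of the queue has reached the end of the array")
        self.back += 1
        self.items[self.back] = value

    def dequeue(self):
        if self.front > self.back:
            raise IndexError("queue is empty")
        value = self.items[self.front]
        self.items[self.front] = None
        self.front += 1
        return value


queue = ArrayQueue(capacity=5)
queue.enqueue(20)
queue.enqueue(40)
queue.dequeue()
queue.enqueue(35)
queue.dequeue()
queue.enqueue(15)
queue.enqueue(10)
print(queue.items, queue.front, queue.back)  # [None, None, 35, 15, 10] 2 4
```

One more enqueue at this point would raise the overflow error, even though two array elements are
unused, because the queue has moved forwards through the array.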


Benefits or shortcomings

Queues can be useful data structures to implement as many real world situations operate in a First
In - First Out fashion. A simple printer queue is a classic example of a queue data structure in
action.

In practice, the process of adding elements to the queue and removing them has the side effect of
moving the queue forwards through the array. As you can see in Figure 97, the queue as it stands is
located in array elements 2 to 4. If another element were to be added to the queue as it stands, the
algorithm would fail, as the back of the queue has already reached the last element of the array.

There are two ways that this can be prevented in practice. The first way is to move the queue back
down the array when it gets close to the end. This isn’t very efficient as it adds to the processing
time. The second way is to implement what is known as a ‘circular queue’. While beyond the scope of
this course and text, a circular queue connects up the beginning and end of the array logically
speaking. Items added to the end of the queue may be located at the beginning of the array if the
queue has moved beyond the end.

Like a stack, a queue still needs to have enough space allocated to it to cater for the maximum
amount of data that is likely to be stored within it.

Linked lists

A linked list is a data structure that maintains itself in sorted order at all times. One of the
main disadvantages of using arrays to store data is that it is very difficult to maintain them in
sorted order. There are sorting algorithms that can be used to rearrange the data within the array
(two of which we examined in Chapter 3). However, sorting algorithms take time to execute. By using
a linked list, items are placed in their correct order straight away.

To create a linked list requires two parallel arrays: one to store the required information and one
to store the links which will point to the next array element. It is also necessary to store the
index of the first item in the list.

A small linked list might look like the example below:

Linked list
Index          0    1    2    3    4
List(Index)    30   40   35   20   10
Link(Index)    2    -1   1    0    3

First: 4

Figure 98: Linked list example

In the example above, you can read the list of numbers in order by beginning at the index pointed to
by ‘First’ and proceeding to follow the indexes associated with each array element. If you do this,
the order of indexes visited will be:

4, 3, 0, 2, 1, -1

The ‘-1’ indicates that the end of the list has been reached. The values read in this order are 10,
20, 30, 35 and 40.

Benefits or shortcomings

The example shown here is obviously a very simple one and has been used to illustrate how a linked
list works. In practice, linked lists are much larger than this and also utilise a variable pointer
called ‘free’ which indicates the next free array element. While there is some overhead in relation
to maintaining the links, this is a small price to pay compared to sorting an array from scratch.
One of the main advantages in using a linked list data structure is that elements are added in
order. This does require that the algorithm locate where in the order the item belongs. In contrast
to this, removing elements is a simple matter of changing the links so that they skip over the
removed element. Over time, this can lead to unused ‘holes’ in the arrays, so an advanced method of
preventing this is to also maintain a linked list of the free array elements.
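
Following the links in Figure 98 can be sketched in Python as shown below. The function name
traverse and the constant NO_LINK are assumptions made for this illustration. The data is taken
directly from the two parallel arrays and the ‘First’ index in the figure.

```python
NO_LINK = -1  # marks the end of the list


def traverse(values, links, first):
    """Follow the links from 'first' and return (indexes visited, values read)."""
    order = []
    result = []
    index = first
    while index != NO_LINK:
        order.append(index)
        result.append(values[index])
        index = links[index]
    return order, result


values = [30, 40, 35, 20, 10]  # List(Index)
links = [2, -1, 1, 0, 3]       # Link(Index)
first = 4                      # First

order, result = traverse(values, links, first)
print(order)   # [4, 3, 0, 2, 1]
print(result)  # [10, 20, 30, 35, 40]
```

The values come out in sorted order even though they are not stored in sorted order in the array,
which is the point of maintaining the links.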


Context Questions

1. Why is it important to use a file naming convention?


2. Why is it important for two software packages that are going to be exchanging data via text
files, to use the same delimiter?
3. List the three types of backup procedure.
4. Company A is using a differential backup procedure and company B is using an incremental
backup procedure. Both companies have the need to restore their backups due to a
catastrophic loss of data. Which company will have their backup restored quicker and why?
5. Why is it important to have at least one backup stored off-site?
6. How often should a backup be performed?

Applying the Concepts

• Investigate and then design a backup system for your home network and present this solution to
your class for feedback.

Key Skills Checklist

At the conclusion of this chapter, you should be able to address the following key skills. Mark each off
as you can achieve them.

Generate new design ideas using one of several methods

List and describe efficiency measures of a software solution

List and describe effectiveness measures of a software solution

Describe factors that influence the design of solutions

Discuss what makes efficient and effective user interfaces


Sample Examination Questions

The following sample examination questions can be attempted to test your knowledge of the content
of this chapter.

Question 1
Which of the following would be a good example of the use of a file naming convention?
A. Document1
B. File050613
C. Smith_AuditData_Expenses_070416
D. Smithfamilyauditexpenses56

Question 2
Turtle Enterprises performs a full backup of its data each Friday after work hours (on an automated
process) at 11pm. For Monday to Thursday, an incremental backup is done each day at the same time
in the evening. On Thursday morning, there is a hard drive failure and all data is lost. Which backups
will need to be restored to reinstate this data?
A. Wednesday
B. Friday and Wednesday
C. Friday, Monday, Tuesday, Wednesday
D. Friday, Monday, Tuesday, Wednesday, Thursday

Question 3
The data structure Length[Index] is most likely to be:
A. A one-dimensional array
B. A two-dimensional array
C. A reference to a hash table
D. An integer

Question 4
What is the difference between full backup, differential backup and incremental backup?

__________________________________________________________________________________

__________________________________________________________________________________

__________________________________________________________________________________
3 marks

Question 5
A company has decided to change from performing a daily incremental backup to performing a single
full backup once per week. List two disadvantages of such a move.

Disadvantage 1: _____________________________________________________________________

Disadvantage 2: _____________________________________________________________________
2 marks


Question 6
Khalil is considering different data structures that he can use in the creation of a software solution he
is writing, that will store all of the information related to a client.
a. What is a data structure?

__________________________________________________________________________

__________________________________________________________________________
1 mark

b. Khalil is deciding between using a record data structure and a simple 1D array. Which of these
would be the better choice? Give two reasons to support your choice.

Choice: _____________________________________________________________________

Reason 1: ___________________________________________________________________

Reason 2: ___________________________________________________________________
2 marks

Question 7
The diagram below shows how a hash table works conceptually.

a. Explain what a hash function does.

__________________________________________________________________________
1 mark

b. What is the main advantage that a hash table has over a one-dimensional array?

___________________________________________________________________________

___________________________________________________________________________
1 mark

Sample Examination Answers

Question 1
Answer: C

Questions such as this are always ‘best possible answer’. Option C in this case is quite good as it
contains a person’s name, a reference to what the file actually is as well as a date.


Question 2
Answer: C

They will need to first reinstate the full backup, followed by every incremental backup in turn.

Question 3
Answer: A

Question 4
Full backup: backs up all of the data every time.
Differential backup: only records the changes to the data since the last full backup.
Incremental backup: only records the changes since the last backup of any type (full or incremental).

Question 5
Disadvantage 1: A full backup takes longer to perform than an incremental backup (discounting the
first day).
Disadvantage 2: If the server fails the day before the full backup is due to take place, then potentially
6 days of data could be lost (as opposed to a maximum of 1 day for an incremental backup).

Question 6
A data structure is a way of organizing data in a computer so that it can be used effectively.
Choice: record
Reason 1: Each of the fields associated with the client’s record can be kept together.
Reason 2: A 1D array would not be able to store more than one data type.
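
The two reasons can be illustrated with a short Python sketch. The Client record and all of its
field names and values are hypothetical, chosen only to show fields of different types being kept
together in one record.

```python
from dataclasses import dataclass


@dataclass
class Client:
    """A record: fields of different types grouped for one purpose."""
    client_id: str
    name: str
    visits: int
    balance: float


# Hypothetical client data.
clients = [
    Client("C002", "Mei", 3, 120.50),
    Client("C001", "Ahmed", 7, 89.95),
]

# Sorting the records moves every field of a client together, so there
# are no parallel arrays to keep in sync.
clients.sort(key=lambda client: client.name)
print([client.name for client in clients])  # ['Ahmed', 'Mei']
print(clients[0].balance)                   # 89.95
```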

Question 7
A hash function maps a value to an index based on a rule.
It is quicker to locate values or determine if they are present than checking all of the values in a one-
dimensional array.
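
As an illustration of both answers, a hash table with chaining can be sketched in Python as shown
below. The class name ChainedHashTable and the hash rule (the position of the key’s first letter in
the alphabet) are assumptions made for this example, chosen so that keys beginning with the same
letter share a chain, as in Figure 93. Real hashing algorithms aim to spread keys far more evenly.

```python
class ChainedHashTable:
    """Hash table in which each array element holds a chain (list) of pairs."""

    def __init__(self, size=26):
        self.buckets = [[] for _ in range(size)]

    def _index(self, key):
        # Hypothetical hash function: position of the first letter in the alphabet.
        return (ord(key[0].upper()) - ord("A")) % len(self.buckets)

    def put(self, key, value):
        chain = self.buckets[self._index(key)]
        for i, (existing_key, _) in enumerate(chain):
            if existing_key == key:
                chain[i] = (key, value)  # replace the value for an existing key
                return
        chain.append((key, value))

    def get(self, key):
        # Only one chain is searched, not the whole table.
        for existing_key, value in self.buckets[self._index(key)]:
            if existing_key == key:
                return value
        return None  # reached the end of the chain: key is not present


table = ChainedHashTable()
for key, value in [("B", 102), ("G", 305), ("GA", 452), ("GF", 872)]:
    table.put(key, value)

print(table.get("GF"))  # 872
print(table.get("GZ"))  # None
```

Looking up "GZ" only examines the three pairs chained at index 6, rather than every value in the
table.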

Chapter 8
Testing and evaluating

The chapter covers Unit 4: Area of Study 1 key knowledge:


• Digital systems
o KK3.5 Characteristics of efficient and effective solutions
o KK3.8 Techniques for testing the usability of solutions and forms of documenting test
results
o KK3.9 Techniques for recording the progress of projects, including adjustments to tasks and
timeframes, annotations and logs
o KK3.10 Factors that influence the effectiveness of development models
o KK3.11 Strategies for evaluating the efficiency and effectiveness of software solutions and
assessing project plans

Key terms: Efficiency, effectiveness, testing, test data, testing table, project plan, usability,
large file sizes, benchmarking, live data, documenting testing, evaluation, strategies, acceptance
testing, quality assurance.

Study roadmap

Chapters 1 to 10 are mapped against the following areas of study:
Unit 3: Area of Study 1 - Programming; Area of Study 2 - Analysis and Design
Unit 4: Area of Study 1 - Development and Evaluation; Area of Study 2 - Cybersecurity: software
security

Characteristics of efficient and effective solutions

In Chapter 6, we discussed various measures of efficiency and effectiveness and how attention to
these can affect the design stage. We will now revisit these measures and examine them in light of
development and testing.

Efficiency measures

Speed of processing

When examining the processing speed of a software solution, there are a number of factors that play
a role. Entering data into a software solution is certainly part of this. There may not be much
processing taking place at this time, but interfaces may be working to dynamically populate fields
and delays in doing so will be noticeable. When output is produced, is the software solution able to
generate this output in a timely manner? Switching from one view in the interface to another can
also be dependent on processing speed.

How can we improve speed of processing?

Speed of processing can be improved in a number of ways. The most obvious one may be to improve the
hardware (processor, available
memory), but this may be out of the control of a software developer. Let’s examine some other ways
that speed of processing can be improved.

• Some data structures will be quicker to use than others. An assessment of the data structures
  that have been chosen, in the context of how well they are responding, is a worthwhile exercise.
• Examine the solution’s use of memory. Are data structures sized reasonably? Are there variables
  that have been declared that are not being used or ones that could be reused?
• Examine how well the key sections of code perform their functions. For example, there are a large
  number of sorting algorithms available and each has its strengths and weaknesses. Implementing a
  quick sort might seem like a good idea, but arrays that are in reverse order or have many
  duplicates can cause a quick sort algorithm to perform poorly. Likewise, the placement of the
  pivot can affect how well a quick sort works for a set of data.
• Screen operations are slow in comparison to those taking place in memory. In cases where a
  software solution is making a large number of changes to the GUI, it can be quicker to write
  changes to screen RAM directly and have the whole interface displayed at once with the next
  refresh of the display.
• The biggest delay in speed of processing can occur when reading or writing to storage media. As
  before, this may be out of the control of the software developer, but the way that data is read or
  written could be changed and the number of times this is done, minimised.

Functionality

The functionality of a software solution can have a profound effect on the efficiency. How well are
the functions within the software solution designed? Do they address the tasks for which they have
been designed with the minimal amount of interaction from the user?

How can we improve functionality?

Methods of testing a software solution were discussed in Chapter 3, but improving functionality is
different from testing in that the main focus is on the ease of operation.

• A usability test is a useful tool that can be implemented to improve functionality. In
  constructing a usability test, a software developer will write out a list of tasks and steps
  typical of the normal operation of the software solution. They will then ask others to use the
  software solution following these steps and give them feedback on how well they were able to
  execute these. When writing a software solution, it is sometimes quite easy to become so familiar
  with its operation that aspects that are not logical to a first-time user will be difficult to
  spot. This process helps to highlight these. You can read more about usability testing later in
  this chapter.
• Are there variables or data structures that can be automatically populated either completely or
  partially? Constructing code to do this may take extra work but may remove steps that the user
  needs to complete.

Cost of file manipulation

As mentioned in the speed of processing section, accessing files can have a significant effect on
the efficiency of a software solution.

How can we improve the cost of file manipulation?

Any file operations will be relatively slow. The challenge can be to structure these in a way that
allows them to have minimal impact. As discussed previously, changing the storage media can be
beneficial but may be out of the control of the software developer.

• File reads and writes can be held till there is an opportunity to do them all at the same time.
  One way this can be accomplished is to construct files in RAM first and then write to the storage
  media
  when the software solution is being closed. The downside of this is that there is increased risk
  of losing data.

Effectiveness measures

Completeness

Completeness is the measure of how well the information that is being produced meets the purposes
for which it is intended. While it is in some ways a subjective measure based on the task that is
being performed and the user’s needs, there are ways that a software developer can work to improve
this area.

How can we improve the completeness of the information?

• Throughout the design and development stages, a software developer can show the client what
  information is being produced, gain feedback on the appropriateness of it and make adjustments as
  required.
• In designing a GUI, it is often tempting to present all of the available information to the user
  at once. Doing so, however, can lead to issues with attractiveness and clarity. Instead of
  limiting the information, a software developer could consider having some of the information
  accessible on different screens or hidden from the default view.

Readability or clarity

Readability or clarity is simply the measure of how easy it is to read and understand the output
that is produced. If a user struggles to read or understand the information that is being presented,
the effectiveness with which they are able to use the software solution rapidly deteriorates.

How can we improve the readability or clarity?

• As a general rule, clarity can be improved by using simpler language, removing the use of jargon
  (where possible) and using shorter sentences. Online tools such as the Hemingway App will take
  text input and give feedback based on the education level required to read the text and feedback
  on what changes can be made.
• Allowing the user to change the font size and interface colours can aid in improving the
  readability of text.
• The choice of font is important as some fonts are well suited to headings while others are better
  suited to descriptive passages. In addition to this, numbers in some fonts are easier to read than
  others.

Attractiveness

It can be difficult to relate the attractiveness of a software solution to efficiency. Colour
scheme, fonts, consistency of objects in the GUI and how the software’s functions are accessed all
have an influence on how effective the solution is and can all be considered to belong in this area.

How can we improve the attractiveness with a view to increased effectiveness?

• Just as with the area of clarity, by improving the look and feel of the software solution, it will
  become easier to read, use and understand.
• An attractive interface will make it more likely that a user will be inclined to use it.
• An attractive interface may also lead the user through the process rather than challenge them to
  work out what to do next.

Accuracy

Accuracy would seem to be an area that is easy to assess and improve upon. When assessing accuracy,
a software developer will be testing the solution to see that the information being produced is
correct. In addition to this, they will be checking to ensure that inputs are designed in such a way
as to prevent inaccuracies being introduced into the system.


How can we improve accuracy? Timeliness

• Testing tables can be used to show how Does the software solution produce
accurate the information being produced information within a time frame that is
is and compare this to known results. In relevant and useful? A software solution that
cases where the differences are larger than doesn’t do this, risks being completely
what is acceptable, the calculations can be ineffective.
adjusted to ensure that this is improved.
• In some cases, the use of particular How can we improve timeliness?
variable data types will have an effect on
the accuracy of a software solution. • The first step in improving timeliness is
Ensuring that the data types being used coming to an understanding of the time
match the data that is being stored should frames involved in each particular
eliminate this. instance. When is data available and what
• The use of strong validation techniques are the expectations from users in regard
can prevent incorrect or inaccurate data to when information will be required?
being introduced into the software Once this is understood, the software
solution. solution needs to be designed to work
within this time frame.
Accessibility • In cases where timeliness is still an issue, a
software developer should examine the
By definition, the accessibility of a software ways that data is obtained to see if there
solution is the practice of allowing it to be used are methods that can automate or
by the greatest range of users without placing improve the process.
barriers in their way. A software solution that • Processing times may have a lesser effect
is not accessible for some, will mean that it is on timeliness but may be something that
being used ineffectively (or not at all) by those needs to be considered. If the information
users and this will have an overall negative that is being produced needs to be
effect on the effectiveness of the software as a available within seconds, then the
whole. algorithms, data structures and
calculations that are being implemented
How can we improve accessibility? should be examined for ways in which they
can be refined. Just as with ‘cost of file
• Functions and navigation should be clearly manipulation’ discussion in the efficiency
designed in a way that indicates their section, delaying or streamlining reads or
function. On screen help and tips that writes to storage media can have a
appear when the user hovers over objects significant effect on the processing time
can aid in this. required.
• Software solutions can be designed with
the specific client in mind. The client’s Communication of message
experience, age and their role can all be
factors worth considering. In considering how effectively the software
• Accessibility is also concerned with solution is presenting the information that it is
ensuring that those with disabilities are producing, there are a number of measures
catered for. This can mean designing that can be used. For example, is the
interfaces that accept input from a variety information easy to read and interpret or is it
of sources, including text to speech being presented in a way that is acting as a
functionality, catering for those with barrier to this? The software solution may have
colour blindness and allowing the user to all of the correct information on hand, but just
change the colour scheme, fonts and font may not be presenting it in a way that is very
sizes. useful or accessible.

Chapter 8: Testing and evaluating

How can we improve the communication of message?

• In performing usability tests, feedback can be gained from users in regards to how easily they
were able to locate the required information and ways in which they feel this acquisition process
can be improved.
• The use of familiar icons, symbols and interface features will make it easier for users to
determine how to use and locate the functions they need.

Relevance

Is the information that is being produced by the software solution relevant to its purpose? Are
there times when the solution gives information that is not strictly targeted to the person using
the software?

How can we improve relevance?

• By including specific levels of user access that restrict the amount and type of information
that is displayed, users can be presented with information that is targeted to their needs.
• Provide options to users that allow them to select extra information if they require it.

Usability

Well-designed software solutions should have interfaces that allow the user to perform the tasks
they need to accomplish without getting in the way. There is a fine balance between all of these
measures and sometimes factors such as attractiveness and accessibility can be at odds with
usability. The important thing is that the processing that is required should remain the central
focus as opposed to how the solution looks.

How can we improve usability?

• One of the best ways to improve usability is to conduct a usability test. A usability test asks
users to perform typical tasks and then asks them for their feedback on the aspects of the solution
that they felt performed well or could otherwise be improved.

Testing

Testing a coded solution for errors

When a program is first being tested by the creators of the program, it is said to be in ‘alpha’
testing. ‘Alpha’ testing is also known as ‘white-box’ testing. Once a program has passed ‘alpha’
testing, it is released to select members of the public for further testing known as ‘beta’
testing. ‘Beta’ testing is also known as ‘black-box’ testing. The main difference between
‘white-box’ testing and ‘black-box’ testing, is that those who are carrying out the ‘black-box’
testing do not have access to the code and are concentrating on features like function, usability
and appearance. Once a specified time of ‘beta’ testing has been completed (and the bugs
corrected), the software package can be released officially.

The process of testing a program is one in which it is determined whether the program will produce
the expected solutions under a variety of conditions. This not only implies that it will produce
the correct output for a set of valid inputs, but that it will reject input that is not in the
correct form (that is, validate it as completely as possible).

To perform a thorough test of a software package, a table like the one below is usually produced.

Table of tests

Item Tested | Test Data | Expected Result | Actual Result

Figure 99: Testing table
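
As a rough illustration of how a testing table like Figure 99 might be filled in, the Python sketch below runs a few items of test data through a validation routine and records the expected and actual result for each. The `validate_mark` function, the 0 to 100 range and the test data are all assumptions made for this example; they are not taken from the text.

```python
def validate_mark(raw):
    """Hypothetical validation rule: accept whole-number marks from 0 to 100."""
    try:
        mark = int(raw)
    except (TypeError, ValueError):
        return "rejected"  # input is not in the correct form
    return "accepted" if 0 <= mark <= 100 else "rejected"


# Each row: (item tested, test data, expected result)
tests = [
    ("Mark input", "55", "accepted"),   # typical valid value
    ("Mark input", "0", "accepted"),    # lowest allowed value
    ("Mark input", "100", "accepted"),  # highest allowed value
    ("Mark input", "-1", "rejected"),   # just below the allowed range
    ("Mark input", "101", "rejected"),  # just above the allowed range
    ("Mark input", "abc", "rejected"),  # input that is not a number at all
]


def run_tests(rows):
    """Return the completed table: each row gains an actual result and a pass flag."""
    table = []
    for item, data, expected in rows:
        actual = validate_mark(data)
        table.append((item, data, expected, actual, actual == expected))
    return table


results = run_tests(tests)
for item, data, expected, actual, passed in results:
    status = "PASS" if passed else "FAIL"
    print(f"{item:<11} {data:<5} {expected:<9} {actual:<9} {status}")
```

Any row marked FAIL shows where the actual result differs from the expected result, which is the point at which a change or bug fix is needed.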

Software Development: Core Techniques and Principles 4th edition

Tests are then methodically carried out on all aspects of the program’s operation. Where the
program does not perform as expected, notes are made in the ‘actual result’ column so that changes
can be made or bugs fixed.

Selecting test data

When testing a program using a testing table, it is very important to select a range of test data
that ensures that the program operates as expected in all situations. Firstly, pick typical values
for which you can easily calculate or predict the results they will produce. You should also pick
values at the extremes of the allowable data input range to ensure that validation rules are
operating correctly. Next, select values that are outside of the allowable data range to ensure
that the program does not allow this data to be accepted and displays an appropriate message to
that effect. Lastly, it is always good to include nonsense test data that will ensure that the
validation procedures are robust and the program is stable.

‘Boundary value analysis’ is the name given to the testing technique involved in testing values
both inside and outside of the allowable inputs for a software solution. As the bulk of logic
errors occur at the boundary of valid inputs, this technique involves checking values at the
boundary, just inside the boundary and just outside the boundary.

Testing the usability of software solutions

The original design specifications need to be checked against the completed software solution to
ensure that all components are in place and operating correctly. Factors of interest at this stage
will be response times, how the system copes with transactions involving very large file sizes,
coping with transactions which have a mixture of data types and how modules interface with each
other.

Usability testing

While ‘Alpha’ testing is carried out by the software developers and is concerned with debugging the
code and ensuring that it is operating correctly, usability testing is carried out by some of the
intended users of the software solution.

Usability testing could be considered to be a type of ‘Beta’ testing, though it is more focused.

In carrying out usability testing, a select group of users are given access to the current version
of the code and asked to perform standard tasks (usually following a specific script). In carrying
out this testing, users give feedback on how easy it was to perform tasks, how easy the interface
was to understand, any errors that take place and potential improvement ideas they may have. This
feedback is given back to the developers who will compile them and create an action plan to address
them.

Large file sizes

Testing how the system copes with large file sizes is important as it may be something the system
needs to deal with on a regular basis. In addition to this, it is important to discover what the
limits of the system are, even if the files used at this stage are far in excess of what is
expected. Over the course of a few years, the volume of data may change dramatically and knowledge
concerning the limits of the system will be valuable in planning for the next system upgrade.

Benchmarking

The process known as benchmarking examines and measures the tasks within an organisation and
compares them to best practice. It is a large field in which much has been written, but we will
only make a small mention of it in this context. It is more commonly performed in the analysis
stage of the PSM as a means of determining problems within the system. At this stage, benchmarking
is a means of documenting the capabilities of the new
software solution and comparing these with the design specifications.

Live Data

‘Live data’ is a term used to describe data that flows through the system in real time or while the
system is in normal operation. Testing a system using live data is a very important step as it has
possibly not been exposed to the rigors of its intended environment prior to this. The software
developers have certainly thoroughly tested the system using simulated data, but this is not the
same as live data and does not place the same demands on the system in terms of required
performance.

How will the system cope when a large number of users are simultaneously executing different
processes or entering data of different types? How well does the system validate user input and
prevent problems from occurring in a live environment? Are there unacceptable delays in response
times when employees are using the system at the same time or at certain times of day?

Documenting test results

The process of documenting test results is an important one. While it is easy to see that
conducting thorough testing is a vital step, it can be hard to see why it would be important to
finish off the testing stage by documenting all that has been carried out.

Documenting testing creates a record of what has been carried out, how successful it was and what
measures were taken to correct any errors that were found. This information is useful for
management in reporting and being accountable to clients or the legal team.

Some organisations use a database system, documenting the test results on a report template. One
alternative is to purchase a commercial product that tracks and produces reports on software
testing. With these approaches, you can automate reports to monitor test results and their
progress. Whatever method is chosen, it is important that members of the project team can access
the documentation system easily.

However an organisation decides to track and document testing, it is important that software
developers and testers are familiar with the required format of this documentation and that it is
consistent.

Items typically included would be:

• Tester's name and their department
• Software package and version (or module name)
• Date and time the tests were performed
• Full description of the results
• Resolution of any problems

Evaluating the solution

It may seem strange to be considering how the software solution will be evaluated at this stage in
our journey, but in fact, an appreciation of when and how the solution will be evaluated is a very
important consideration. An awareness of these criteria prior to the completion of the software
development process will help to keep the development on track. When deciding on criteria for
evaluating how successful a software solution is, first and foremost should be the consideration of
whether the solution meets the Software Requirements Specification (SRS).

Data can be gathered using some of the following methods: surveys and questionnaires, interviews
and focus groups, as well as direct observation. As these have been discussed in a previous
chapter, they will not be described again here.

So what form do assessment criteria take when used to evaluate the software solution? Often they
can be simply expressed as a series of questions. For example, does it cost less to perform the
same task in the new system compared to the old? Are customers satisfied with the service that they
are receiving? All criteria created for an evaluation should relate
to the identified system goals, aims and objectives.

An example evaluation strategy

Time Frame: Prior to signing off on the finished software solution
Description: Acceptance testing by the users of the solution (more detail on the acceptance testing
process can be found on the next page)
Efficiency and Effectiveness Measures:
Efficiency
• Is the software solution easy to use and understand?
• Is the software solution able to be used to produce results in a reasonable amount of time?
Effectiveness
• Are the correct results produced when known values are entered?

Time Frame: Immediately after the software solution has been installed
Description: Network or technical staff ensure that the software solution is operational and ready
to be used
Efficiency and Effectiveness Measures:
Effectiveness
• Is the software solution fully operational?

Time Frame: 3–6 months after the introduction of the software solution
Description: Feedback on the software solution gathered from the following groups and using the
methods below:
Employees via:
• A survey
• Interviews with select staff in each department
Management via:
• Interviews
System support via:
• System error logs
• Reports on help desk requests
Customers and clients via:
• Feedback forms
Efficiency and Effectiveness Measures:
Efficiency
• Is the software solution as easy to use and understand as it was initially?
• Does the software solution continue to produce results in a reasonable amount of time now that
all users are using the solution at once?
• How much time has the software solution been down due to maintenance or errors?
Effectiveness
• Is the correct information being produced by the software solution?

Figure 100: An example evaluation strategy

When evaluating a solution, it can be useful to frame the criteria around measures of effectiveness
and efficiency. As stated previously, effectiveness can be measured by examining whether the goals
of the system have been met. Effectiveness is a measure of how well the solution performs its
assigned tasks. Often the best way to frame criteria is as a question that can be posed in relation
to the solution. Some examples are listed below.

Efficiency criteria

• Is the software solution easy to use?
• Does the solution complete all processing tasks in an acceptable amount of time?
• Is the software solution easy to use and functional?

Effectiveness criteria

• Is the information that is being produced complete, relevant and accurate?
• Is the output readable and the message well communicated?
• Is the user interface and output easy to understand and attractive?
• Are the on-screen instructions clear?
• Is the information produced within a time frame that ensures that it is still relevant and
useful?
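
One of the efficiency criteria above asks whether processing is completed in an acceptable amount of time. A criterion like this can be checked with a measurement rather than an impression. The Python sketch below is only an illustration: `total_sales` is a stand-in for a real processing task, and the two-second limit is an assumed figure that would normally come from the SRS.

```python
import time


def time_task(task, *args):
    """Run task(*args) once and return (result, elapsed time in seconds)."""
    start = time.perf_counter()
    result = task(*args)
    elapsed = time.perf_counter() - start
    return result, elapsed


def total_sales(amounts):
    """Stand-in for a processing task carried out by the software solution."""
    return sum(amounts)


ACCEPTABLE_SECONDS = 2.0  # assumed limit for this example only

result, elapsed = time_task(total_sales, range(1_000_000))
meets_criterion = elapsed <= ACCEPTABLE_SECONDS
print(f"Result: {result}, time taken: {elapsed:.4f} s, acceptable: {meets_criterion}")
```

Repeating the measurement several times, and under the load expected in normal operation, gives a fairer picture than a single run.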


Strategies for evaluating the quality of solutions

Evaluation of a software solution usually takes place after it has been in operation for a while.
Evaluating a software solution is a complex task and strategies need to be formulated to ensure
this task is a manageable one.

In Chapter 6, we discussed some of the efficiency and effectiveness measures that are used to
define solutions. The techniques known as acceptance testing and quality assurance are ways to
compare these measures with the initial goals of the software solution as outlined in the SRS.
However, before these techniques can be used, strategies must be devised to acquire the necessary
data.

It is no coincidence that the first activity listed under evaluation in the PSM is determining a
strategy. A strategy is a plan of action designed to achieve a particular goal. In this context,
the goal is the evaluation of the solution.

A strategy to evaluate the quality of a solution is a more complex activity than surveying the
users of a system or checking system logs. An evaluation strategy should contain an explanation of
the timeframes that are involved, the criteria that will be used to conduct the evaluation, the
users that will be affected and the techniques that will be used to gather data. Gathering
evaluation data can be done in the same ways as data was gathered during the analysis phase.

Acceptance testing and quality assurance

Quality assurance is the process by which software developers ensure a new software solution is
operating correctly. They need to ensure that its function meets or exceeds the expectations of the
designers and the organisation. This is shortly followed by acceptance testing.

Criteria and techniques for acceptance testing

Criterion: Does the software solution perform the tasks set out in the SRS?
Technique: Identify the main tasks required of the software solution as identified in the SRS and
place these in a testing table – where the actual result can be compared to the expected result.

Criterion: Is the software solution effective (producing the correct / expected results)?
Techniques: Compare information produced using the previous information system (if this is
available) to that produced using the software solution.
Manually verify the information that is produced using the software solution.

Criterion: Is the software solution efficient in terms of its use (time to produce results, ease
of accessing functions within it)?
Technique: Compare the time taken to perform tasks using the software solution to the time taken
using the previous system (or what is expected of the software solution).

Criterion: Is the software solution user interface intuitive?
Technique: Select a number of different users within the organisation and ask them to provide
feedback on the location of functions, ease of understanding, consistency of navigation, etc.

Figure 101: Criteria and techniques for acceptance testing

Acceptance testing is so named because the employees and/or management of the organisation are
involved in ensuring everything is operating as it should. Representatives of the organisation are
given a chance to test the system for themselves and give feedback to the development team on any
problems or faulty processes. Specification creep (also known as scope creep) must be dealt with at
this point because the testing is intended to show that the original design goals have been met.
Where the organisation has been adding to the goals during development, the test data must be
extended to show that the extra specifications have been met. This of
course has a significant effect on the effectiveness of the project plan that was created at the
beginning of the software development process.

Factors that influence the effectiveness of project plans

Lots of factors can have an influence on the effectiveness of a project plan. While a project is
in progress, it is not possible to fully anticipate all of the problems or interruptions that will
occur.

The following can all be factors in limiting the effectiveness of a project plan.

Clear scope

Has the original scope of the project been clear enough and has it been correctly translated into
the specifications and goals for the solution?

Specification creep

Specification creep occurs when the clients of a software solution add features to the
requirements even though the development has already begun. Creep is more likely in projects that
have long timelines and is often caused by changes in personnel, the marketplace or to compete with
the latest innovations of competitors.

Changes in staff

Changes in staffing while the project is underway can have a significant effect, especially if
those staff are the project leads or key members of the developer’s team. Staff taking or returning
from leave can also have an effect as there needs to be time allocated to handover in both cases.

Communication issues

Too many meetings can slow a project down while not having enough meetings can mean that team
members can potentially go off on tangents or be idle.

Inadequate time for testing

If inadequate time is allocated for testing modules at different stages in the development,
problems could develop in later stages that require these modules to be fixed or modified.

Budget constraints

If modules take longer to code than expected, the most obvious effect on the project plan is that
milestones can be pushed out. However, a hidden effect is on the budget of the project as a whole.
Longer development time could mean that more money is spent on coding than was allocated and this
may in turn lead to other aspects of the project having to be cut down.

Dependent software

No coded software solution works in isolation. There are always other software packages that it
needs to integrate with on some level. Sometimes a constraint like this is hard to quantify in
terms of the time and effort that will be needed. Once the process is started, unforeseen problems
can crop up. There is also the unknown factor of dealing with the developers of these packages.
They may be very willing to work with you or may be busy or disorganised.

Technology changes

Technology changes can happen while a software development is in progress. This could be new
hardware releases, newer versions of operating systems or software drivers or newer versions of
software packages that the coded solution needs to integrate with. Once again, changes like this
are very hard to predict and can lead to lengthy delays in a project as the implications are
assessed and changes made.


Strategies for evaluating the efficiency and effectiveness of project plans

At the beginning of a software development process, a project plan is created to support an
organised approach to the problem-solving methodology. This project plan (as described in Chapter
4) consists of a list of all of the tasks that need to be completed, the team members that will be
involved in the project and what their responsibilities will be as well as the schedules that the
project needs to adhere to.

As the project progresses, this project plan is examined frequently to ensure that timelines are
being kept to. A useful strategy to employ is to make annotations to the project plan as changes
happen.

Adjustments to timeframes and tasks

Projects almost never finish in the manner that was expected of them and recorded in the initial
plan. There will be variations in times, tasks, schedules and available resources. A good project
manager will adjust these aspects of the project as they occur, so that potential problems or
future changes can be catered for.

A project plan that is stored in the cloud and available to key personnel, allows all involved in
the project to see these changes and to keep an eye on upcoming tasks and milestones. While
maintaining a dynamic plan such as this has obvious benefits, it is also beneficial to save
versions of the plan and to annotate changes.

As a project progresses, a good project manager will regularly examine the plan and adjust tasks
for the times that they are completed or estimated to be finished. This will in turn have an effect
on the overall timeframes within the project and may mean that some resource bookings need to be
modified (outside contractors, system testing times).

A good project manager will also make annotations to tasks as changes are made or additions catered
for. They will keep a comprehensive log of changes so that when they need to make reports to their
superiors or the organisation, they are able to explain the variations to the initial plan that
have taken place and the impact of these changes.

Annotations should record changes such as the following:

• The actual duration of tasks compared to their expected duration.
• The achievement of milestones.
• What modifications to the project plan were performed and notes explaining why this was the case.

If this is done in a methodical manner, at the conclusion of the project, the original project plan
can be examined and compared to what transpired over the course of the development. This will help
to inform future projects as well as explain delays or changes that have occurred to the client.

Evaluating the effectiveness of development models

One of the very first decisions that software developers face when beginning the development of a
software solution is, which development model will be most suited to the project. In Chapter 4, we
discussed three of these development models: waterfall, agile and spiral. Each has its advantages
and disadvantages as well as situations to which they are well suited. However, software
development is not an exact science and it is not always easy to see which is the best choice or
indeed foresee circumstances that might make one model favored over another. The ability to
evaluate the effectiveness of these development models and the factors that have influenced this,
gives valuable insight into each.


Factors influencing the effectiveness of the waterfall model

As discussed previously, in implementing a true waterfall development model, each stage of the PSM
generally finishes before the next one can begin. This transition is not seamless; there are
typically ‘gates’ between the stages. For example, at the conclusion of the analysis stage, the
client would need to sign off on the SRS. Once this has happened, and only then, would the design
stage be initiated.

There are certainly both good and bad aspects to using the waterfall development model and we have
discussed these already. At this point in our discussion, let’s consider the factors that can occur
in a project utilising the waterfall model.

• While developers and the client agree on what will be delivered early in the PSM, there is some
argument to be made that this is too early. Clients are sometimes intimidated by having to provide
such specific details, so early in the project. They are not always able to visualise the finished
product or possibly, what their exact needs are. A client unsure of these aspects of the purpose
and scope, can quickly derail a waterfall development that has moved past the analysis stage.
• A client may also not be happy with the delivered software product. As all the deliverables are
based on the SRS and the analysis, the client may not see what will be delivered until it is almost
finished.

Factors influencing the effectiveness of the agile model

Just as with the waterfall model, there are a number of factors that can occur in an agile
development that can influence the effectiveness of the project.

• As discussed previously, the client in an agile development has frequent and early opportunities
to see the work being delivered and to make decisions and changes throughout the project. They are
required to work closely with the project team throughout the project. This high degree of client
involvement, while an advantage, may present problems for some clients who do not have the time or
interest for this type of participation. A client that does not have the commitment for this
development model, will curtail its effectiveness.
• With the intensity involved in an agile development (especially when it is implemented using a
method such as Scrum), team members need to be focused and dedicated to their work on the project.
While it is not necessary that all of the tasks need to be completed in a particular sprint, if the
completion of tasks drags on, and additional sprints need to be added to the allotted project
timeline, this can add to the overall cost of the solution. It may also put pressure on developers
that causes the quality of the solution to decline.
• A close working relationship within a software development team using the agile model is vital.
If circumstances cause members of the team to work remotely or to miss work for a period of time,
the effectiveness of the process will be put at risk. Webcams or online collaboration tools will
alleviate this to some extent, but the daily scrum and the ability to actively problem solve or
advise others will be limited.

Factors influencing the effectiveness of the spiral model

The spiral model is a complex one that combines aspects of both the waterfall and agile models.
While its primary focus is on mitigating risk, its complexity leaves it susceptible to a number of
factors that can limit its effectiveness.

• The complexity of a spiral development places a significant pressure on the software developer
managing the project. If the person doing so is not able to navigate and guide their team through
the various iterations of the development or is not able to evaluate and assess the risk at
each point, the development could be compromised.
• Ironically, as the spiral model is focused on
risk management, risks that arise during
the development may severely affect the
project and in extreme cases, may cause
the project to be ended. In many ways, it
could be argued that the spiral model is
doing its job if such an eventuality occurs,
but if either of the agile or waterfall
models were being used instead, the
development may have continued on to a
positive outcome.


Context Questions

1. Why is it beneficial to have an awareness of the evaluation criteria for a software solution
prior to its completion?
2. In what ways can data be gathered to evaluate a software solution?
3. Why is testing such an important step in developing a software solution?
4. What is the difference between alpha testing and beta testing?
5. What test data should be selected when designing a testing table?
6. Under what circumstances is testing with live data most beneficial?
7. Of all of the factors that can have an effect on a project plan, which do you think would have
the greatest potential impact and why?
8. What is specification creep and in what ways could it be prevented?

Applying the Concepts

• Compile a list of infamous bugs or logic errors that have occurred and what the consequences
were in each case.

Key Skills Checklist

At the conclusion of this chapter, you should be able to address the following key skills. Mark each off
as you can achieve them.

List and describe efficiency measures of a software solution

List and describe effectiveness measures of a software solution

Describe methods that can be used to test the usability of solutions

Describe ways in which testing can be documented and what actions should take place in each case

List factors that influence the effectiveness of development models

Describe ways in which the progress of projects can be recorded and plans amended

Discuss strategies that can be used to evaluate software solutions and project plans


Sample Examination Questions

The following sample examination questions can be attempted to test your knowledge of the content
of this chapter.

Question 1
What tool can be used to test to see if an algorithm is producing the correct set of outputs for a set of
inputs?
A. Storyboard
B. Mock up
C. IPO chart
D. Trace table

Question 2
What does the process known as benchmarking achieve?
A. Tasks within the organisation are compared to best practice
B. Results of tests within the organisation are published online
C. Processes within the organisation are timed and compared to their main competitor
D. The main bench within the organisation is ‘marked up’ to show when tests were carried out

Question 3
A software solution is struggling to be effective with the main problem being the timeliness of data.
This means:
A. Data is too large to be processed
B. Once it has been processed, the timeframe of its usefulness has passed
C. The data is not being communicated clearly
D. The data does not have date or time fields attached to it

Question 4
When using test data to test the boundaries of the age input for a student between 12 and 18 years
old, which combination of test data would do this appropriately?
A. 12, 13, 17, 18
B. 12, 18
C. 11, 12, 18, 19
D. 11, 12, 13, 17, 18, 19

Question 5
An App is being developed that will allow a landscape gardener to take measurements of a garden and
then develop a quote for the customer in real time. Describe the scope of the proposed software
solution by listing two measures of efficiency and two measures of effectiveness.

Efficiency Measure 1:

Measure 2:

Effectiveness Measure 1:

Measure 2:

4 marks


Question 6
The beta version of a new software solution has just been completed by Killabrew Solutions for their
client #NoLimits Real Estate. Ameer has been managing the project and is now trying to determine
what criteria should be used for the acceptance testing with the client. List two criteria that must be
included.

Criteria 1: _________________________________________________________________________

Criteria 2: _________________________________________________________________________
2 marks

Sample Examination Answers

Question 1
Answer: D

Question 2
Answer: A

Question 3
Answer: B

Question 4
Answer: C
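
The test data in this answer can be derived mechanically. The Python sketch below is an illustration only: `boundary_test_values` and `is_valid_age` are hypothetical helpers written for this example, and whole-number ages are assumed. It takes each limit of the valid range together with the value just outside it.

```python
def boundary_test_values(lowest, highest):
    """Return each limit of a whole-number range plus the value just outside it."""
    return [lowest - 1, lowest, highest, highest + 1]


def is_valid_age(age):
    """Assumed validation rule for Question 4: ages from 12 to 18 inclusive."""
    return 12 <= age <= 18


values = boundary_test_values(12, 18)
print(values)                             # [11, 12, 18, 19]
print([is_valid_age(v) for v in values])  # [False, True, True, False]
```

The values just inside each limit (13 and 17) can be added in the same way when a fuller set of boundary tests is wanted.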

Question 5
An App is being developed that will allow a landscape gardener to take measurements of a garden and
then develop a quote for the customer in real time.

Describe the scope of the proposed software solution by listing two measures of efficiency and two
measures of effectiveness.

Efficiency Measure 1: It is easy for the landscape gardener to enter the values into the App.
Measure 2: The quote can be easily delivered electronically to the customer.
Effectiveness Measure 1: Quote is calculated in real time for the customer.
Measure 2: Quote is correct (comparing it to previous methods of producing quotes).

Question 6
a. The software performs the tasks required of it.
b. The software is easy to use.

Questions like this provide very little information and it is important to remember that simple answers
are often the best. The question also says ‘list two criteria that MUST’ – which means that the
focus is on the most important things as opposed to things that are down the list.
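
The boundary testing in Question 4 can be illustrated with a short sketch. It is written in Python purely for illustration; the function name is_valid_age and the constants are hypothetical, and the range is assumed to be inclusive of both 12 and 18, which is what answer C implies.

```python
MIN_AGE = 12  # lower limit stated in Question 4
MAX_AGE = 18  # upper limit stated in Question 4


def is_valid_age(age: int) -> bool:
    """Return True when age falls inside the accepted range (inclusive)."""
    return MIN_AGE <= age <= MAX_AGE


# Boundary test data from option C: one value either side of each limit.
boundary_cases = {11: False, 12: True, 18: True, 19: False}

for age, expected in boundary_cases.items():
    actual = is_valid_age(age)
    print(f"age={age} expected={expected} actual={actual}")
```

Values such as 13 and 17 sit well inside the range, so they add nothing to a test of the boundaries themselves.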

Chapter 9
The law in a software development context

The chapter covers Unit 4: Area of Study 2 key knowledge:


• Interactions and impact
o KK4.7 Reasons why individuals and organisations develop software, including meeting the
goals and objectives of the organisation
o KK4.8 Key legislation that affects how organisations control the collection, storage
(including cloud storage) and communication of data: the Copyright Act 1968, the Health
Records Act 2001, the Privacy Act 1988 and the Privacy and Data Protection Act 2014
o KK4.9 Ethical issues arising during the software development process and the use of a
software solution

Key terms: Privacy Act 1988, The Privacy and Data Protection Act 2014, Health Records Act 2001,
Copyright Act 1968, cloud storage, open source software, creative commons

Study roadmap

Unit 3: Area of Study 1 – Programming; Area of Study 2 – Analysis and Design
Unit 4: Area of Study 1 – Development and Evaluation; Area of Study 2 – Cybersecurity: software security
(Roadmap diagram: Chapters 1 to 10 mapped against the Areas of Study above.)

Legal issues

There are a number of laws that have direct bearing on software developers. The laws that are listed in the VCE Computing Study Design are the Privacy Act 1988, the Privacy and Data Protection Act 2014, the Copyright Act 1968 and the Health Records Act 2001. While it is important for students to have an awareness of the contents of each of these laws or acts and be able to discuss their implications for software developers, it is not necessary to memorise the specific detail or clauses within them.

The challenges faced by the legal community in keeping up with computer-related crime and privacy have been huge. The rapidly changing landscape, especially with the advent of the Internet, has meant that laws that have served the Australian community well for many years have suddenly been rendered almost irrelevant. The difficulties encountered because of the borderless nature of the Internet have been many; extending from the powerlessness of our Australian laws to counter outside threats to the daunting task of law enforcement.

Figure 102: Australian High Court

Let’s examine each of the relevant laws in turn.

The Privacy Act

Incorporating:
The Privacy Act 1988,
Privacy and Data Protection Act 2014

The Privacy Act 1988 (incorporating the Privacy and Data Protection Act 2014) is a law that sets out the ways in which organisations can collect, use or distribute personal data. The Act also outlines what must be done to protect the data and the rights of individuals with respect to their own data.

The Privacy Act applies to government organisations, local councils or any organisation that is contracted to a government organisation. The Privacy Act also applies to the private sector (more on this later).

The Privacy Act 1988 initially set out 11 Information Privacy Principles (IPPs) and 10 National Privacy Principles (NPPs). The IPPs applied to the government sector and the NPPs to the private sector. The Privacy Amendment (Enhancing Privacy Protection) Act 2012 and the Privacy and Data Protection Act 2014 combined these two sets of principles into 13 Australian Privacy Principles (APPs) which serve as the base line for the privacy legislation. Each is described in the following section.

APP 1: Open and transparent management of personal information

Data needs to be managed in an open and transparent manner. This includes having an easy to follow, accessible (free of charge and easily available) and current privacy policy. An organisation must have a privacy policy that contains information on the kinds of personal information it collects, how an individual may complain about a breach of the privacy policy and whether the organisation is likely to disclose information to overseas recipients.

APP 2: Anonymity and pseudonymity

In making enquiries or complaints, individuals have the option of not identifying themselves, or of using a pseudonym. There will of course be times when this is not appropriate or possible.

APP 3: Collection of solicited personal information

An organisation must not collect personal information (other than sensitive information) unless the information is required for one or more of the organisation’s functions or activities. Sensitive information must only be collected with an individual’s consent. The collection of sensitive information must also be required for the organisation’s functions or activities.

APP 4: Dealing with unsolicited personal information

This may seem like an odd inclusion in the Australian Privacy Principles, but APP 4 outlines what organisations must do with unsolicited personal information (that is, personal information that they receive that was not asked for). Where an organisation receives unsolicited personal information, it must first determine if it would have been permitted to collect the information under APP 3. If so, then APP 5-13 apply as they would normally.

If the information could not have been collected under APP 3, the organisation must destroy or de-identify that information as soon as they can (as long as this is both lawful and reasonable).
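
The idea of de-identifying personal information, which APP 4 requires for unsolicited information that could not lawfully have been collected, can be sketched in code. The example below is only an illustration in Python: the field names, the SECRET_KEY value and the helper functions are hypothetical, and whether removing these particular fields is enough to count as de-identification under the Act is a legal question that the code does not settle.

```python
import hashlib
import hmac

# Hypothetical secret; in practice it would be stored separately from the data.
SECRET_KEY = b"replace-with-a-securely-stored-key"

# Fields treated as directly identifying in this illustration (an assumption).
DIRECT_IDENTIFIERS = {"name", "email", "phone"}


def pseudonym(value: str) -> str:
    """Derive a stable keyed hash so records can be linked without the raw value."""
    digest = hmac.new(SECRET_KEY, value.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:16]


def de_identify(record: dict) -> dict:
    """Return a copy of record with the direct identifiers removed."""
    cleaned = {k: v for k, v in record.items() if k not in DIRECT_IDENTIFIERS}
    # Keep a pseudonymous reference in place of the email address.
    if "email" in record:
        cleaned["subject_ref"] = pseudonym(record["email"])
    return cleaned


record = {
    "name": "Sam Lee",
    "email": "sam@example.com",
    "phone": "0400 000 000",
    "postcode": "3000",
}
print(de_identify(record))
```

Fields that remain, such as a postcode, can sometimes be combined with other data to re-identify a person, so a real solution would need to consider the whole data set and not just individual fields.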


APP 5: Notification of the collection of personal information

APP 5 is similar to APP 1, but differs in its application. APP 5 states that an organisation must make an individual aware that it is collecting their personal information, either at the time of collection or as soon as possible afterwards.

They are also required to notify individuals about the access, correction and complaints processes in their privacy policies, as well as the location of any overseas recipients of individuals’ information.

APP 6: Use and disclosure of personal information

As was the case with the previous Privacy Act, an organisation must only use or disclose information for the purpose for which it was originally collected. The new privacy legislation introduces a small number of exceptions when an organisation can use or disclose personal information for a different purpose. These exceptions include to assist in locating a missing person, to defend or establish a legal claim or for the purposes of resolving a confidential dispute.

APP 7: Direct marketing

For an organisation to use personal information for direct marketing purposes, the individual must have either given their consent or they would have a reasonable expectation that their information was going to be used for this purpose. Organisations must always allow individuals to opt-out if they wish.

APP 8: Cross-border disclosures

Before an organisation decides to disclose personal information to an overseas recipient, the organisation must take reasonable steps to ensure that the overseas recipient does not breach the APPs (excluding APP 1) in relation to that information. If the overseas recipient breaches the APPs in any way, the organisation could be held accountable as if they themselves had breached the privacy laws!

APP 9: Adoption, use or disclosure of government related identifiers

Organisations are prohibited from adopting, using or disclosing a government related identifier (think of identifiers such as Medicare numbers and Tax File Numbers).

APP 10: Quality of personal information

An organisation must take reasonable steps to ensure the personal information it collects is accurate, up-to-date and complete.

APP 11: Security of personal information

An organisation must take reasonable steps to protect the personal information it holds from interference, loss, damage, unauthorised access, modification and disclosure. This also includes destroying or de-identifying personal information that the organisation no longer needs (unless it is required to keep it for legal reasons).

APP 12: Access to personal information

An organisation must give an individual access to the personal information that it holds about that individual, unless an exception applies. In addition to giving an individual access, an organisation must respond to this request in a reasonable amount of time. If an organisation decides not to give an individual access, it must provide reasons why this is the case and make available means for the individual to complain if they wish to do so. An organisation that charges a fee to give an individual access to their personal information must ensure that the charge is not excessive. The charge must also not apply simply to the making of the request.

APP 13: Correction of personal information

Organisations must take reasonable steps to ensure that personal information is accurate, up-to-date, complete and relevant. This could be initiated by the organisation or could be in response to a request from an individual. In either case, it is then the organisation’s responsibility to ensure that these changes are also provided to outside entities that they share data with.

The Privacy Act 1988 was originally written for the government sector and is mandatory for government organisations and for organisations that are working under a government contract. In relation to the private sector, the Privacy Act is mandatory for the following organisations:

• Organisations working under a government contract.
• Organisations with a turnover of more than three million dollars a year (including not for profit organisations)*.
• Organisations of any size that store medical information.
• Organisations of any size that distribute or sell personal information.

* Note that the great majority of businesses in Australia do not have a turnover of greater than 3 million dollars per year. Businesses that fall into this category are not mandated to follow the Privacy Act (unless of course they store medical information or sell information to other companies). Organisations can choose to opt-in if they wish.

Figure 103: Medical data

Health Records Act 2001

The Health Records Act 2001 is an additional Act to the Privacy Act (although not a part of it) that gives individuals protection of their own medical information regardless of the size or type of the organisation that is collecting this data. The creation of the Act was seen as necessary as health records are considered amongst the most sensitive of all personal data. There also needed to be uniformity across both the public and private sectors. The Act establishes 11 Health Privacy Principles (HPPs) that are similar in style to the APPs. The important aspect of the Act to emphasise is that any organisation in Victoria that deals with medical information is bound by the Act.

There are very few exemptions under the Act. Even organisations such as gymnasiums, child-care centres, educational institutions, counsellors, insurers and superannuation providers are bound to follow the Act as they all collect personal health information of one kind or another.

The legislation is not all in favour of the individual. Under the Health Records Act, organisations collecting medical information can share information with other organisations for the purposes of research and planning as long as the information is de-identified. In addition, the Act allows for the mandatory reporting of certain diseases without an individual’s consent (such as STDs and specific infectious diseases).

Copyright Act 1968

In Australia, copyright laws are governed by the Copyright Act 1968. In 2004 there were significant changes made to this act in Parliament, particularly as it relates to computer technology.

Copyright in Australia is free and automatic. That is, if you are the creator of a work, you do not have to place a “(c)” symbol on the work itself or register that you are the copyright owner with any organisation or body. If you are the creator of a work, you own the copyright automatically.


An owner of the copyright over a piece of work has a number of rights. They have the right to choose when and how the work will be distributed, published or otherwise communicated. They also have the right to incorporate any sort of technological protection devices to protect the work.

Figure 104: The accepted symbol to denote copyright

When dealing with a copyrighted work, it is illegal to copy or share the work without the permission of the author of the work. It is also illegal to change the format of the work (digitise it by scanning it into a digital form, change the file format or use a compression or ripping technology). If a copyrighted work is protected in some way, it is illegal to circumvent this protection or use a tool to do so.

Individuals do have some ways that they can use a copyrighted work without breaching the Act. They can copy 10% or 1 chapter of a reference book (whichever is greater) without permission. A copyrighted work can be used under the banner of “fair dealing” for parody, review, research, reporting or by the legal system.

The Copyright Amendment Act 2006 was an amendment to the Copyright Act brought about by the US and Australia Free Trade Agreement. In particular:

• The recording of TV or radio programs for family or personal purposes was allowed as an exception under the Act.
• Format shifting of music for personal use was also allowed as an exception under the Act. Without this exception, ripping legally purchased CDs to an iPod would be a breach of the Act.
• Prior to this amendment, a copyrighted work entered the public domain 50 years following the death of the last author or creator of the work. As the legislation in the US is 70 years instead of 50 years, the Australian Copyright Act was changed to 70 years to bring it in line.
• The concept of “fair dealing” (which in the US is called “fair use”) was introduced in this amendment.

Open Source Software (OSS) / Creative Commons (CC)

As a software developer, it is not always necessary (or desirable) to “reinvent the wheel”. Fortunately, there are code libraries that are freely available on the Internet that enable software developers to quickly code routines that would otherwise be time consuming, expensive and tedious to create. This is a very different concept to simply looking on the Internet for some code and “taking it”. Doing that could very well be a breach of copyright. Software that has been created (and tagged) as OSS has been created with this very purpose in mind, but there are some catches.

OSS is often protected by a licence of some type and a number of these exist. The organisation known as Creative Commons has created a number of licences that owners of any sort of work can attribute to the work to waive some or all of their copyright rights. These licences have been translated for many countries and a set of these licences exist for Australians to use specific to Australian copyright law.

The following licences exist:

Figure 105: Attribution


This licence allows others to use, modify and distribute the work (including commercially), as long as the original author is credited. All of the content for the description of these licences and the icons themselves are under an attribution licence – and have been sourced from the Creative Commons website.

Figure 106: Attribution, Share Alike

Once again, this licence allows others to use, modify (including commercially) and distribute the work as long as the original author is credited. In addition to this, the new product must be shared under the same terms. This is the licence that is closest to a true OSS model.

Figure 107: Attribution, No Derivatives

This licence allows others to use and distribute (including commercially) as long as the original author is credited. However, no derivative works are allowed. That is, the work must be unchanged.

Figure 108: Attribution, Non-Commercial

This licence allows others to use, modify and distribute the work; however it cannot be distributed commercially. The author must of course be credited, but interestingly, derivative works do not need to use this licence.

Figure 109: Attribution, Non-Commercial, Share Alike

This licence allows others to use, modify and distribute the work non-commercially as long as the original author is credited. Any derivative works must also be shared in this way with the same licence.

Figure 110: Attribution, Non-Commercial, No Derivatives

This is the most restrictive of the licences as it allows others to use and distribute the work non-commercially but not modify the work in any way. The author must be credited.

The legal implications of cloud storage

The use of cloud storage has many benefits for organisations, not the least of which are portability of data and the security of off-site backup. There are however legal issues that must also be taken into account. In response to the Privacy Act, organisations must ensure that they take reasonable steps to keep the data secure (APP 11). If the data is of a medical nature, adherence to this APP is mandatory regardless of the size of the organisation.

The difficulty with the use of cloud services is that an organisation is effectively handing over the control of their data to an external vendor. Personal data of any kind has a “street value” to hackers as it can be used to perform identity theft.

Small to medium sized businesses that are not able to invest a great deal of capital into security infrastructure may find cloud services attractive and affordable. Doing so may increase the level of security over what they would be able to provide. Under the Privacy Act, this would seem to be a reasonable step.

As many cloud services are not based in Australia, an organisation planning on storing personal data in this way needs to ensure that the provider adheres to the Australian Privacy Principles (per APP 8). The use of an Australian cloud provider will certainly make the process simpler, as they will already be aware of the Privacy Principles. In the case of cloud servers located in other countries, different Privacy Laws will be in effect. In this case, organisations are responsible for ensuring that the provider is adhering to all of the applicable APPs in the Australian Privacy Act. This will usually take the form of a contract in which the organisation will define their requirements in terms of how the data is handled and accessed. If an organisation has taken “reasonable steps” to ensure that the cloud storage provider is keeping the data secure in accordance with Australian law, then they will not be liable for any data breach.

An organisation would also be wise to do the following:

• Ensure that the cloud provider is a reputable vendor (some cloud providers abide by the International Standard for Cloud Privacy – ISO/IEC 27018)
• Encrypt data sent to the cloud
• Use multi-factor authentication to access the data
• Maintain an encrypted backup with a different cloud provider
• Ensure that the ownership of the data remains with the organisation

Why do organisations or individuals develop software solutions?

In Chapter 4, we discussed at length how the mission statements of organisations inform their goals and objectives. In putting these goals and objectives into action, organisations will create information systems that will, in turn, have their own set of goals and objectives.

Individuals will create software solutions for similar reasons. They will have a problem they wish to create a solution for, and this solution will be in service of some goal or objective that they have.

Software development is generally driven by specific need, whether this be the need of individuals or organisations. The reasons why software solutions are developed probably fall into one or more of the following categories.

• There is a need in the market and doing so will be profitable.
• To remain competitive.
• It will be beneficial to society.
• To perform a task that is difficult or not possible otherwise.
• To assist with the transition between information systems.
• To save time.
• To save money.

The motivations of individuals or organisations involved in developing software will sometimes put them at odds with the law or other stakeholders (both internal and external). It is also possible for individuals or organisations to conduct themselves in ways that are legally sound, but still considered to be unethical.

As a software developer, ensuring that the law is followed, specifically in relation to the software solution that you are creating, is very important. This can, however, bring a software developer into conflict with other stakeholders involved in the project – either through a misunderstanding of a software developer’s intentions or by insisting on the inclusion (or exclusion) of features that would contravene the law. There are also times when a situation may challenge a software developer in an ethical way and there is often quite a difference between what is unethical and what is illegal.

Some examples of this sort of conflict follow, although this list is by no means exhaustive.


Ethical issues arising in the software development process

The Privacy Act requires that a reasonable level of security be placed on personal information that is being stored by a software solution or in a networked environment. For this reason, it is often important to include password protection, levels of access and encryption of some sort. Organisations often have a password policy, which they may require a software developer to enforce in a software solution. Typical password policies now require a mix of upper and lower case characters, numbers and punctuation, and often a set minimum length. These password policies often cause concern amongst users as they are hard to remember and seem to be “over the top”. A software developer may encounter resistance from staff when implementing a policy such as this or may encounter problems if their software solution uses a different policy or none at all.

What if a software developer were asked not to implement security features in a program due to time or budget constraints?

What if a software developer, in the course of the creation of the software solution, found that the security of information on the networked environment was inadequate?

What if a software developer were asked by the client to include features within the program that could potentially breach the privacy of the user – such as key logging software or the use of an in-built camera or microphone without the user’s knowledge?

A software developer may be asked to forego rigorous testing in favour of a reduced timeline or a reduction in costs. It may even be tempting for a software developer to not thoroughly test a software solution as this part of the software development process is often not documented for the client.

Documentation for the client catering for different groups within the organisation is very important but may be a part of the project that either the client or the software developer wants to omit.

Code of ethics

Most professions are also guided by a code of ethics that lays out guidelines specific to the profession. A code of ethics is not legally binding but sets out to establish a level of ethical conduct and to raise the awareness of issues related to the profession.

The Australian Computer Society, which represents a large proportion of the IT professionals in Australia, has a published code of ethics that can be viewed via their website.

The introduction of the ACS code of ethics states “An essential characteristic of a profession is the need for its members to abide by a Code of Ethics. The Society requires its members to subscribe to a set of values and ideals which uphold and advance the honour, dignity and effectiveness of the profession of information technology.” Within the code of ethics are statements about honesty, competence, values, ideals, professional conduct and social implications.
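
A password policy of the kind described in this chapter (upper and lower case characters, numbers, punctuation and a minimum length) might be enforced with a check along the following lines. This is a minimal sketch in Python, not a recommended policy: the function name policy_failures and the MIN_LENGTH value of 10 are assumptions made for illustration, and an organisation’s own policy would set the real rules.

```python
import string

MIN_LENGTH = 10  # hypothetical minimum; an organisation's policy sets the real value


def policy_failures(password: str) -> list[str]:
    """Return the list of policy requirements that the password does not meet."""
    failures = []
    if len(password) < MIN_LENGTH:
        failures.append(f"at least {MIN_LENGTH} characters")
    if not any(c.islower() for c in password):
        failures.append("a lower case character")
    if not any(c.isupper() for c in password):
        failures.append("an upper case character")
    if not any(c.isdigit() for c in password):
        failures.append("a number")
    if not any(c in string.punctuation for c in password):
        failures.append("a punctuation character")
    return failures


print(policy_failures("password"))
print(policy_failures("Str0ng!Passphrase"))
```

Returning the list of unmet requirements, and not just True or False, lets the software tell users exactly what to change, which may reduce some of the frustration that strict policies cause.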


Context Questions

1. Do all private companies or organisations need to follow the Privacy Act?


2. Under what circumstances is a private company exempt from adhering to the Privacy Act?
3. What changes were made with the introduction of the amendments to the Privacy Act 1988?
4. If an organisation shares your personal information with a company overseas (with your
permission) and the company breaches one of the APPs, is the original organisation liable?
Explain.
5. What rights does the author of a work have?
6. Under what circumstances can medical providers share your personal information without
your permission?
7. How much of a copyrighted work can be copied without the permission of the author?
8. What is the relationship between Open Source Software and Creative Commons?

Applying the Concepts

• Find examples of OSS or creative commons content and note the conditions under which the
authors have said it can be used.

Key Skills Checklist

At the conclusion of this chapter, you should be able to address the following key skills. Mark each off
as you can achieve them.

List and describe efficiency measures of a software solution

List and describe effectiveness measures of a software solution

Describe methods that can be used to test the usability of solutions

Describe ways in which testing can be documented and what actions should take place in each case

List factors that influence the effectiveness of development models

Describe ways in which the progress of projects can be recorded and plans amended

Discuss strategies that can be used to evaluate software solutions and project plans


Sample Examination Questions

The following sample examination questions can be attempted to test your knowledge of the content
of this chapter.

“Even though I created this module for my last job, it’s mine to sell and do what I wish with”.

Question 1
This person would be breaking the law if they were to do this. Which law would this person be
infringing?
A. Privacy Act
B. Code of Ethics
C. Copyright Act
D. Health Records Act

Question 2
The winners of a tennis tournament are to have their names and photos published on the club website.
This is not a breach of privacy legislation as long as:
A. Other tennis clubs are doing it as well
B. The players are members of the club
C. The players have a social media account
D. The players have given their permission for this to happen

Question 3
Open Office, VLC and Audacity are examples of Open Source software packages. This means:
A. The software and source code are free to use and access
B. A small licence fee is required to use the software
C. The software can only be used by developers
D. The software has to be programmed for it to be used

Question 4
A new App asks users to create an account and supply an address for security verification. This is in
breach of the Privacy Act because:
A. it is a unique identifier
B. another user may have supplied the same address
C. the owners of the App would be unable to check the accuracy of the address
D. the owners of the App have no valid reason to collect this information

Question 5
What rights does an author of a work not have?
A. The right to choose when and how the work will be distributed
B. The right to incorporate protection methods to protect their work
C. The right to control how their work is communicated to others
D. The right to prevent anyone quoting from the work


Question 6
Paul is in the process of developing an App that will be free to download. One of the ways that the
App will generate revenue is via advertising. Paul wants to make the advertising as targeted as possible by
accessing the browsing history of those using the App, as well as their location details. Paul’s colleague
Maxine advises Paul that this is technically very easy to code, but that some things need to be done in
order to ‘make it legal’.

a. What would be the best way to ensure that this data is able to be obtained legally?

___________________________________________________________________________

___________________________________________________________________________
1 mark

b. List two potential negatives that could arise from implementing such a feature.

Negative 1: __________________________________________________________________

___________________________________________________________________________

Negative 2: __________________________________________________________________

___________________________________________________________________________
2 marks

Question 7
AFLellofAGame is an AFL based podcast that is recorded weekly. The show is broadcast live and then
a recording is placed on a server for access. Both the live show and archived shows can be accessed
via an App that can be downloaded free from the App store. The hosts have taken to starting the show
with a pop song to represent what has happened during the week. Two weeks ago, they played most
of “Take it to the limit” by The Eagles. Last week the song they selected was “Firework” by Katy Perry.
Paul (one of the hosts) was initially concerned that this might be a breach of copyright, but Dave (the
other host) reassured him saying that by stopping the song just before the end meant they weren’t
playing the whole song. He also said that as they weren’t selling the song, that it was OK.

Discuss any problems with Dave’s reasoning, with reference to any relevant legislation.

___________________________________________________________________________

___________________________________________________________________________

___________________________________________________________________________
2 marks

Question 8
The web developer for a company is approached by another company with a proposal. They would
like to have their company advertised on the website and in exchange will share their client data. The
web developer feels that the proposal has merit, but also knows that there are some things to
be careful of legally.


a. What piece of legislation is most relevant to this issue?

__________________________________________________________________________
1 mark

b. Describe what the web developer needs to do in order to ensure that this proposal is
implemented legally.

__________________________________________________________________________

__________________________________________________________________________
2 marks

Sample Examination Answers

Question 1
Answer: C

Question 2
Answer: D

Question 3
Answer: A

Question 4
Answer: D

Question 5
Answer: D

Question 6
a. The best way to obtain this data legally is to advise users downloading the App that this
data will be collected and ask for them to agree to this.
b. Negative 1: the App might not be as responsive as it will be sending additional data back and
will also be loading ads to display.
Negative 2: users might get annoyed at the presence of the ads and may also feel that
their privacy is being infringed upon.

Question 7
It is against the Copyright legislation to play the songs that Dave is playing at the front of his show.
Stopping the song just before the end makes no difference. The station may not be selling the song,
but they are making revenue from advertising used in the show, so the song is contributing to that (so
indirectly – they in fact are selling the song).

Question 8
a. Privacy legislation.
b. Clients need to be informed what their data will be used for and asked whether they will allow
it to be shared. If they say no, then a record of this must be kept and the data
cannot be shared.

Chapter 10
Cybersecurity

The chapter covers Unit 4: Area of Study 2 key knowledge:


• Digital systems
o KK4.1 Physical and software controls used to protect software development practices and
to protect software and data, including version control, user authentication, encryption
and software updates
o KK4.2 Software auditing and testing strategies to identify and minimise potential risks
o KK4.3 Types of software security and data security vulnerabilities, including data breaches,
man-in-the-middle attacks and social engineering, and the strategies to protect against
these
o KK4.4 Types of web application risks, including cross-site scripting and SQL injections
o KK4.5 Managing risks posed by software acquired from third parties
• Data and information
o KK4.6 Characteristics of data that has integrity, including accuracy, authenticity,
correctness, reasonableness, relevance and timeliness
o KK4.10 Criteria for evaluating the effectiveness of software development security
strategies
o KK4.11 The impact of ineffective security strategies on data integrity
o KK4.12 Risk management strategies to minimise security vulnerabilities to software
development practices

Key terms: security threats, accidental, event-based and deliberate threats, risk management, user
authentication, dictionary attack, rainbow table, firewalls, malware, SQL injection, man in the middle
attack, social engineering, encryption, logical and physical security, barrier techniques, biometrics.

Study roadmap

[Table: Chapters 1 to 10 mapped against Unit 3 (Area of study 1 – Programming; Area of study 2 –
Analysis and Design) and Unit 4 (Area of study 1 – Development and Evaluation; Area of study 2 –
Cybersecurity: software security).]

The nature of threats

When considering the nature of threats to an information system, they can be placed into three broad
categories: accidental, deliberate and event-based. Let’s have a brief look at accidental and
event-based threats first, as these are the simplest ones to describe.

Accidental threats

Data can be lost in many different ways. Of all the threats to data that will be discussed in this
chapter, the most common way that data is lost is through the deletion of files or parts of them
without having any backups available. Updating and deleting files is an activity that people do
almost every day and so it is easy to make a mistake or not think twice about it.

Security threats: The actions, devices and events that threaten the integrity and security of data
and information stored within, and communicated between, information systems. The threats can be
accidental, such as losing a portable storage device containing files; deliberate, such as malware,
phishing; and events-based such as a power surge.
VCAA VCE IT Study Design 2020-2023 Glossary

Files can be saved in the wrong file formats, causing others to think they are corrupt. Administrative
errors or processes can also cause files to be lost or corrupted.

Data can also be lost accidentally through the physical loss of hardware. Notebooks, smartphones
and flash drives are all easily lost or misplaced and sometimes this can mean that sensitive data is
exposed to risk.

Event-based threats

Event-based threats are ones that are characterised by particular events occurring that are outside
of the control of users. These could be on a small scale, such as the failure of a hard drive, a power
failure or software freezing, any of which can cause a file to become corrupt (or not able to be
saved). With the widespread use of cloud services, a business providing cloud storage or software as
a service could experience a loss of their own or simply go out of business.

Event-based threats can also be on a much larger scale. Natural disasters such as fire, earthquake,
flood or tornado could all lead to quite significant losses of data.

Losses from accidental and event-based damage are best prevented with good backup and operational
procedures (as discussed in Chapter 8). However, deliberate threats require the use of security
measures to minimise their risk.

Deliberate threats and security

What is security? No matter what your definition and how complex it may be, security is ultimately
about prevention. Computer security seeks to prevent the damage that could be caused if unauthorised
people were to gain access to an organisation’s information (known as a deliberate threat).

When discussing the topic of security and the nature of deliberate threats, there is not one absolute
way to protect the integrity of data and information. This is due mainly to the fact that there are so
many different types of deliberate threats. Deliberate threats are present from both inside and
outside of an organisation. They can be both physical and logical in nature. There are ways to protect
against certain types of threats, but a ‘solution’ to this problem is complex and one that needs to be
constantly attended to – otherwise the threats can advance beyond the capability of the defences.

When discussing this topic, it is natural to think of the threats from the ‘outside’ (or the Internet)
being of the most concern. While these threats are definitely a concern, it is more common to
encounter threats from within an organisation, but more on this later. Let’s start by looking at the
threats from the Internet.

Why do Internet threats exist?

The Internet was built for a trusted world. The beginnings of the Internet (ARPANET) were
established by the US military using packet switching to enable a network to operate if nodes within
the network went offline at any time. The technology was essentially built to be ‘nuclear-war’ proof –
so that in the event of a nuclear strike with key US military installations


being destroyed, the network would remain in operation and the US would be able to launch a
counter-strike. For a time, the network was made available for select universities to use and was
opened up to commercial interests in the late 1980s. In the early 1990s CERN publicised the ‘world
wide web’ project and the Internet began to gain in popularity and make its way into the world culture.

The underlying engineering behind the Internet was not built with security concerns in mind. There
was no e-commerce, no crime and no financial transactions. HTTP is not a secure protocol and contains
no encryption of any kind. Internet security, as a result, has not been a coordinated effort.

The ease with which Internet sites can be cloned makes trust a difficult issue. It is not enough to
trust a web-site based on the reputation of the ‘brand’. Prime examples are banking sites, which are
regularly cloned using very similar URLs and by using the images captured from the official web-site.

Internet security risk management

Internet security, for many organisations, falls into the area known as risk management. That is
defined as the application of security measures based on a cost/benefit analysis. For example, there
may be many security measures that can be applied to a company’s network, but at some point in time,
the cost of adding the security measure will be greater than the potential losses that can result. In
many ways, risk management is about probability. Ultimately, no network or organisation is 100 per
cent secure and it can cost an organisation a very large amount of money if attempting to attain this
ideal. At some point, an organisation needs to decide that what it has in place is ‘good enough’ and
that they have sufficient data recovery methods in place to ensure that they can recover from a worst
case scenario.

With any security measure that is applied to a system, there is a performance cost as well as a
monetary cost. This also needs to be factored into the cost/benefit analysis.

Figure 111: Example of a captcha.

Protection and detection

Security is primarily concerned with protection but it also includes detection. Protection is what we
naturally assume when we raise the topic of security; that is, what can be done to prevent malicious
attacks on the data within an organisation. Protection is often at the ‘front door’ – the problem being
that once an intruder has access, there are no mechanisms to detect their presence. This is where
detection is important.

One of the first barriers that can be effective in protecting an organisation and its data is user
authentication.

User Authentication

User authentication is the process by which a person wishing to gain access to an information system
provides satisfactory credentials to allow them to be confirmed as being who they say they are. User
authentication falls into three categories: ownership factors (based on something the user has),
knowledge factors (based on something the user knows) and inherence factors (based on something the
user is).

Some examples of these are as follows:

Ownership factors: ID card, security token, mobile phone, implanted device.


Knowledge factors: a password, a pass phrase, a Personal Identification Number (PIN), answer to a
security question, response to a question (such as a ‘captcha’).

Inherence factors: fingerprint, retinal pattern, DNA sequence, signature, face, voice.

complex and hard to remember. Best practice for password construction generally involves using
something memorable to the user but not easily guessed by anyone else.
While each of these categories of user authentication have their strengths and weaknesses, many
organisations are moving to the use of multi-factor user authentication.

This involves the use of 2 or more forms of authentication, ideally from different categories. For
example, a common (and easy to implement) multi factor authentication is to ask the user to enter
their username and password, and follow this with a pass code sent to their mobile phone.

Password entropy is a measure of how easy a password will be to ‘crack’. A known password will have
an entropy of 0 bits. One that can be guessed 50% of the time on the first guess will have an entropy
of 1 bit. A password’s entropy can be calculated by finding the entropy per character (which is log
base 2 of the number of characters in the character set), multiplied by the number of characters in
the password.
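
The entropy calculation described above can be tried out with a few lines of Python. This is only a sketch: the function name and the character-set sizes (26 lowercase letters; 94 printable symbols for a 'complex' policy) are assumptions chosen for illustration, and the formula only holds if every character is picked at random.

```python
import math


def password_entropy_bits(length: int, charset_size: int) -> float:
    """Entropy in bits: (log base 2 of the character-set size) x (number of characters).

    Assumes each character is chosen independently and uniformly at random.
    """
    return length * math.log2(charset_size)


# Illustrative character-set sizes (assumptions for this sketch)
LOWERCASE_ONLY = 26
UPPER_LOWER_DIGITS_SYMBOLS = 26 + 26 + 10 + 32  # 94 printable symbols

short_complex = password_entropy_bits(8, UPPER_LOWER_DIGITS_SYMBOLS)
long_simple = password_entropy_bits(16, LOWERCASE_ONLY)

print(f"8 characters from 94 symbols: {short_complex:.1f} bits")
print(f"16 lowercase letters:         {long_simple:.1f} bits")

# Each extra bit doubles the number of guesses a brute-force search may need.
print(f"Guesses to try every 16-letter password: {2 ** long_simple:.2e}")
```

Under these assumptions the 8-character 'complex' password comes out at roughly 52 bits and the 16-letter lowercase one at roughly 75 bits, which is why length tends to matter more than the size of the character set.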

Password construction

Many organisations enact password policies that force users to create complex passwords that follow a
number of rules. Typically this involves using at least 8 characters and a mix of numbers, letters and
special characters, as well as upper and lower case characters. Policies vary, but passwords of this
sort are considered ‘strong’ by organisations and they will often have an interface that will give a
user feedback on the relative strength of their selected password against this measure. The formation
of such policies is done by considering the ‘entropy’ of passwords constructed using the set of rules
laid out.

As the entropy of each character is multiplied by the number of characters, password length is one of
the most important factors in creating a secure password. There is a fine balance, as passwords can
quickly become too long,

Figure 112: password strength.
source: xkcd.com used with permission

Cracking passwords

One of the simplest ways to ‘crack’ a password is by using ‘brute force’. Brute force is a method by
which every combination is tested. While this would be an arduous task for a human being to perform,
it is very easy to write code to try out all of the combinations possible. The level of difficulty
rises with the different combinations that are allowed in the


password. This is the reason why ‘complex’ password policies add additional character sets (such as
upper and lower case, numbers and special characters). However, we have already discussed how adding
these characters doesn’t increase the entropy per character by very much. It is really the length of
the password that makes ‘brute forcing’ a password difficult (especially if the length is unknown).

Many systems will include a mechanism that will suspend an account after a small number of incorrect
attempts have been made, and this is usually enough to make ‘brute forcing’ ineffective.

Dictionary attack

A more common method of attempting to crack a password is a dictionary attack. A dictionary attack is
based on trying all the strings in a pre-arranged list. Dictionary attacks often succeed because
people have a tendency to choose short passwords that are ordinary or common words, or simple
variations that include a digit or punctuation character on the end. Dictionary attacks can easily be
thwarted by choosing a password that is not a simple variant of a word found in any dictionary or
listing of commonly used passwords.

stored in the system. A solution to this is hashing.

Encrypting data using hashing

We will discuss how encryption works shortly, but for the purposes of storing passwords securely,
hashing is the best method. Hash tables were discussed in Chapter 3, and the method for hashing
passwords is the same.

Hashing works in the following way. When a user enters their username and password, both are hashed
using a key to produce a hashed value of each that is significantly different from the original. This
hashed value can be compared to the hashed value that is already stored in the password database. The
main benefit of using this method is that the database only contains the hashed values. As the
hashing process is a one-way conversion, gaining access to the hashed password database is not very
useful if the hash key is not known. Of course, once the database is obtained by a hacker, they can
take their time and brute force different hash keys, so while this is far better than storing
usernames and passwords in plain text, it is still not very secure. Once a hashed database is
obtained, rather than brute force, a hacker will make use of a rainbow table.
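
The store-and-compare idea behind hashing can be sketched with Python's standard `hashlib` module. The sketch makes several assumptions: it uses plain SHA-256 with no key and no salt, it hashes only the password rather than the username as well, and the `stored_hashes` dictionary is a hypothetical stand-in for a password database. It is meant to show the one-way comparison, not a production design.

```python
import hashlib
import hmac


def hash_password(password: str) -> str:
    """Return the SHA-256 digest of the password as 64 hex characters (unsalted, illustration only)."""
    return hashlib.sha256(password.encode("utf-8")).hexdigest()


# Hypothetical password "database": only hashed values are stored, never the originals.
stored_hashes = {"alice": hash_password("correct horse battery staple")}


def check_login(username: str, attempt: str) -> bool:
    """Hash the attempted password and compare it with the stored hash for that user."""
    stored = stored_hashes.get(username)
    if stored is None:
        return False
    # compare_digest avoids leaking information through comparison timing
    return hmac.compare_digest(stored, hash_password(attempt))


print(check_login("alice", "correct horse battery staple"))
print(check_login("alice", "password123"))
```

Because the same input always produces the same hash, the login check never needs the original password. That same property is what lets an attacker test guesses against a stolen database, so real systems generally prefer deliberately slow password-hashing functions (Python's `hashlib.pbkdf2_hmac` and `hashlib.scrypt` are examples) together with the salting described later in this chapter.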
Storing passwords securely

While we have been discussing methods by which users can create passwords to protect themselves, the
best password in the world will be of no protection if the password database and where it is stored
in the information system are not secure.

In the early days of software development, usernames and passwords were stored in databases and text
files within the information system. While these files were often secured in some way, administrator
accounts could be targeted for brute force or dictionary attacks in order to gain access. Instead of
targeting individual user accounts, by targeting an administrator account, a hacker could gain access
to all of the usernames and passwords

Rainbow tables

A rainbow table is a listing of all possible permutations of hashed passwords specific to a given
hash algorithm. Once a hacker gains access to an information system’s password database, they compare
the rainbow table’s precompiled list of potential hashes to the hashed passwords that are stored in
the database. The rainbow table is effectively a reverse engineering of the hash that has been applied
to each username and password. It is an offline-only attack, so while much more powerful than brute
force or a dictionary attack, a hacker must first have been able to gain access to the password
database.

Rainbow tables can be used to crack 14-character alphanumeric passwords in about


160 seconds. Unfortunately for those designing the security of information systems, improvements in
processor speed and data access times have meant that large rainbow tables are easily viable. Large
rainbow tables can approach 700 GB in size and vary depending on the hash algorithm that is being
targeted. While the hacker will have to guess which has been used, they may simply try many rainbow
tables.

To protect against the use of rainbow tables, system administrators use ‘salt’.

Salt

‘Salt’ is the process of adding random data to each username and password before it is hashed. This
has the effect of making passwords and usernames longer (which adds to their complexity), which
results in a completely different hashed value. Rainbow tables have great difficulty with salted
hashes as they need to not only be able to work out what was used, but also what salt has been added.

The best ‘salts’ are uniquely random ones that are applied to each username or password (as opposed
to the same ‘salt’ being used for everything).

cases they may be automated to push updates out to users.

While we have been discussing techniques to protect information systems and data, one of the most
difficult aspects of information system security is detecting when an intruder gains (or has gained)
access. Often intrusions go undetected and even when detected, it is not always easy to determine
what data has been accessed.

Detecting intruders

Intrusion detection can take place if particular files or assets within a network are tracked or
‘watched’. Audit logs can be generated that track all network traffic and in this way an unauthorised
entry into a network can be detected. The downside of this is that it can take quite a bit of effort
to ‘watch’ files on a network. Audit logs will track all network activity, but the task of sorting
through legitimate traffic and unauthorised traffic is not easy or remotely enjoyable! In this sense,
it is easy to trace the activity of an intruder after a breach has been detected, but by this stage,
the damage may be done. The result of detection will hopefully be the removal of the intrusion
mechanism.
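
To make the discussion of rainbow tables and salt more concrete, the sketch below shows why a precomputed list of hashes stops working once a random salt is added. Several things here are assumptions for illustration: a small dictionary of common passwords stands in for a real rainbow table (which uses a more compact chained structure), SHA-256 is used as the hash, and the function names are hypothetical.

```python
import hashlib
import secrets


def sha256_hex(data: bytes) -> str:
    """Hex digest of the given bytes using SHA-256."""
    return hashlib.sha256(data).hexdigest()


# A tiny precomputed lookup (hash -> password), standing in for a rainbow table
common_passwords = ["123456", "password", "qwerty", "letmein"]
precomputed = {sha256_hex(p.encode()): p for p in common_passwords}


def store_unsalted(password: str) -> str:
    """Hash the password on its own."""
    return sha256_hex(password.encode())


def store_salted(password: str) -> tuple[bytes, str]:
    """Hash the password together with a fresh random salt; both values are stored."""
    salt = secrets.token_bytes(16)  # unique random salt for each password
    return salt, sha256_hex(salt + password.encode())


def verify_salted(password: str, salt: bytes, stored_hash: str) -> bool:
    """Re-hash the attempt with the stored salt and compare."""
    return sha256_hex(salt + password.encode()) == stored_hash


unsalted_hash = store_unsalted("password")
salt, salted_hash = store_salted("password")

print(precomputed.get(unsalted_hash))  # the unsalted hash is found in the table
print(precomputed.get(salted_hash))    # the salted hash is not in the table
print(verify_salted("password", salt, salted_hash))
```

The salt does not need to be secret. Its job is to make each stored hash unique, so that a table computed in advance for one hash algorithm is of no use and two users with the same password do not end up with the same stored value.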

Version control and software updates

By maintaining a regular system of version control and ensuring that key software packages are
updated where possible, the possibility of vulnerabilities existing and being exploited is minimised.

CASE tools

Keeping track of all of the software packages within an organisation can be a difficult task. A
category of software tool known as CASE tools can be used by organisations to do this in an efficient
manner. While the breadth of tools available under the banner of CASE tools covers all aspects of the
PSM, they do have a particular application in keeping track of the versions of software packages that
are being used and updates that are available. In some

A honeypot is a form of trap that is used to detect hacking attacks or collect information on malware
that can be used to protect the information system against future attacks.

Honeypot intrusion detection

The ‘honeypot’ intrusion detection method can be used to steer intruders away from the heart of the
network, gather information that will help prevent further attacks and possibly gather evidence that
can be used in any potential prosecution. A ‘honeypot’, as the name suggests, is a server on the
network that is set up as generically as possible, and often


without much protection, as a lure to an intruder to the network. It is often populated with old or
useless data but in such a way that this data seems to be legitimate and valuable.

Figure 113: Diagrammatic representation of a firewall

The placement of the ‘honeypot’ server is often inside the firewall of the network but removed from
the real servers by other barriers. The ‘honeypot’ will have active logs and trace software in
operation and as it will not be in use by the legitimate users of the network, traffic on it can
immediately alert the administrators of a network to an intrusion.

The role of firewalls

One of the first tools that organisations employ to protect themselves from outside threats is a
firewall. A firewall can be software or hardware (usually as part of a router) that filters traffic
based on a set of rules that can be changed to suit the environment. A firewall is good at preventing
outside intrusion via ports that are not open for any legitimate traffic to or from the organisation.
All traffic into or out of an organisation has to pass through a firewall. Firewalls are very good at
stopping the spread of ‘worms’, but when it comes to other types of ‘malware’, they are basically
ineffective.

A network firewall is installed on the boundary between two networks (usually between the Internet and
the organisation’s information system). A client firewall is software that runs on an end user’s
computer. Whereas a network firewall protects a whole organisation, a client firewall only protects
the computer on which it is running. Client firewalls will often work by asking the user if a
particular program that is seeking a connection into or out of their computer is to be allowed access
and will quickly build a profile for that user and computer.

When traffic passes through a firewall, it is inspected to see if it meets certain criteria. If it
does not meet these criteria, the traffic is blocked; otherwise it is allowed to proceed through.
Firewalls can filter traffic based on the type of network traffic (known as protocol filtering), the
source or destination address (known as address filtering) or the attributes of the packets being
sent.

Malware

‘Malware’, or malicious software, encompasses a wide range of software that has been designed for a
purpose contrary to the wishes of the user who is ‘infected’. The term ‘computer virus’ is often used
incorrectly to describe software of this type – however a computer virus is a particular type of
program and malware describes the broader category of which a computer virus is a member.

So what types of software categorise malware? Malware includes computer viruses, worms, Trojans,
spyware and other types such as rootkits, but in this text, we will just discuss these main types.
Some of these types include what is known as a ‘payload’, which is the part of the malware that
damages the computer, provides access from outside or performs some other specific function.

There is a misconception that there are no viruses for the Mac or Linux platforms. While it is true
that there are certainly far fewer viruses for these platforms, it is not as a result of the Linux or
Mac platforms being more secure than Windows. As around 90 per cent of the world’s computers are
Windows-based, it makes more sense to write malware for Windows if the creator wants it to have the
best chance of propagation. Linux and Mac platforms are far less attractive targets to the creators of
malware who are looking to spread their software creations and ultimately steal money


or cause damage. With the popularity of Apple hardware and software continuing to rise, however, we
may begin to see more and more instances of Mac-related viruses.

Computer viruses

A computer virus is a program that is designed to replicate itself and is often transmitted via
removable secondary storage devices such as USB thumb drives or portable hard drives. As a virus needs
to be executed to perform its function (which is usually not just replication), viruses often attach
themselves to legitimate programs or place themselves in locations where they will be executed (such
as the boot sector of primary or secondary storage devices).

Computer viruses can operate in a number of ways. Often when the file is executed, they will seek out
other files to infect before passing control to the legitimate executable file to which they are
attached. Some viruses will load themselves into RAM so that any files that happen to be loaded will
be infected.

Computer viruses have a number of methods by which they try to avoid detection. Some have been known
to include code within them that encrypts the virus so that the ‘signature’ of the virus is different.
Other more sophisticated viruses have been known to rewrite parts of their own code for this purpose.
Another way that a virus can remain undetected is if it has code that specifically tries to counter
the type of scan that a virus checker does. In this case, the virus tries to intercept the scan and
send an ‘OK’ signal back to the virus checker.

Most computer viruses exist for some purpose; that is, they have a payload. The payload, when
triggered, can be as simple as deleting or corrupting files, causing a prank effect or displaying a
message of some type. Viruses are often triggered some time after the infection and this trigger can
be a specific date or time or can be completely random.

Worms

A worm, as opposed to a virus, does not (usually) have a purpose other than the consumption of
bandwidth. A worm is a program that rapidly replicates itself through a network without the need of
user assistance. Worms have been known to consume the bandwidth of organisations so much that the
organisations have effectively had to shut down their network.

Some worms carry a payload and in these cases their purpose is usually related to opening a ‘back
door’ to the computer with the controller of the worm able to take control of the infected computer
and use it as a ‘zombie’ (see ‘zombies’ and ‘botnets’ later in this chapter).

The Conficker worm

The Conficker worm is notable as it has been one of the most widely spread infections of a worm in
history. Estimates are that over 7 million computers were infected in more than 200 countries. Many of
the functions of the Conficker worm are typical of worms in general, so it is a useful example to
discuss.

Conficker was first detected in 2008 and it propagated via a security vulnerability in Windows XP.
Conficker would not only propagate, but would also make contact with a host server to receive updated
code. This was one of the aspects of Conficker that made it very difficult to eradicate. Conficker was
constantly being improved via updates to counter what was being done to detect and remove it (updates
were named ‘Conficker.B’, ‘Conficker.C’, etc.). The vulnerabilities that it was exploiting were added
to in each version update, as were the ways in which it made contact with the update server.
Technically, having an update server creates a central point connecting all of the worms and should
make the eradication of the worm easier – but only if the update server can be identified and shut
down. However, Conficker would seek out its update server by using a randomly generated domain name
which the ‘owner’ of Conficker could access when they wished by registering


it and placing the update on it. As all of the Confickers were accessing random servers, the ‘owner’
of Conficker was only able to update a percentage of them at a time in this way, but as each update
server was detected and shut down, more could be set up to continue communication between the ‘owner’
and the worms.

‘Conficker.E’ was the first to download a payload – which it did from an update server located in
Ukraine. Many believe Ukraine to be the origin of the Conficker worm for this reason and also because
the worm was programmed to not affect computers with IP addresses in Ukraine or with Ukrainian
keyboard layouts!

The Conficker payload consisted of a ‘scareware’ anti-virus program (described later in this chapter)
as well as the installation of a ‘spambot’ which would scan the computer for email addresses for use
by spammers (called harvesting) as well as use the computer itself to send spam (which protects the
spammer themselves from direct detection).

Trojans

A Trojan is malware that is disguised as another software package that performs a different (although
not always legitimate) purpose. It is named after the Trojan horse in Greek mythology, which was used
to hide Greek soldiers during the Trojan War.

A Trojan may perform a number of different functions based on its payload. A Trojan can install a
‘spambot’, it can install software that allows the owner of the Trojan to access the infected
computer, it may install pop-up advertisements, it can turn the computer into a ‘zombie’ or it can
delete files in the same way a virus can. One thing that a Trojan does not do well is replicate;
instead it relies on the users to download and run the program or distribute it. Despite this, makers
of anti-virus software report that the majority of malware is in fact of the Trojan variety.

Figure 114: This Trojan horse sculpture was made for the 2004 movie Troy starring Brad Pitt. It was
donated to the Turkish city of Canakkale where it is a tourist attraction.

Most, if not all, of the ‘cracking’ programs, key generators, etc. designed to give users access to
pirated software contain Trojans of some description – as those who download these types of packages
are quite trusting of the product, as it is by its nature written by an unidentified person and is
usually very unpolished.

Spyware

Spyware is a type of malware that is concerned primarily with the collection of information. This
information could be personal details (which could be used for identity theft), email addresses,
browsing activities (for the purposes of marketing) or simple keystrokes. Spyware is often delivered
as the payload via a worm or a Trojan and is not normally self-replicating. Spyware can also be
installed when visiting a compromised web-site via specific vulnerabilities in Internet Explorer. In a
case like this, the user could be infected and not be aware that the infection has occurred.

Adware

Adware is a subset of spyware in that it performs some of the same functions. The function that
classifies adware is that it will display pop-up ads to the user and do so frequently. These pop-up
ads can be targeted to the browsing habits of the user or they may


be completely unrelated. In most cases, the advertising being displayed is generating income for the
owner of the adware in terms of the number of ‘clicks’ the advertisement is receiving.

A DDOS attack does not aim to steal or corrupt data, but instead tries to overload web servers and
cause them to be unusable. This loss of service can be very costly to a company.

Zombies and botnets

One of the main purposes for malware is to establish ‘zombies’ which can be used to send spam (or act
as a ‘spambot’). This makes it harder to establish the sender of the spam – thus protecting the
spammer from prosecution. It is estimated that well over half of the spam in the world originates
from ‘zombies’.

‘Zombies’ can also be used as part of a ‘botnet’ to carry out distributed denial of service (or DDOS)
attacks. A DDOS attack is carried out to flood a targeted web-site with so much traffic that it slows
down significantly or possibly crashes. ‘Zombies’ in a ‘botnet’ are all instructed to send IP
(Internet protocol) requests to a targeted web server at the same time, and the sheer number of
requests is designed to be in excess of the number that the server can handle. There is very little
that any company can do in response to a coordinated DDOS attack of sufficient scale aside from
changing an IP address – a common tactic that unfortunately is not instant, as it can take up to 72
hours to propagate a new IP. There is also no guarantee that this will prevent a DDOS attack, as the
attackers can simply redirect their attack to the new IP.

As it is possible to detect when a DDOS attack is occurring and take a web-site off-line (as a
prevention mechanism), ‘zombies’ can also perform a distributed degradation of service attack instead.
In this sort of attack, the ‘zombies’ pulse their IP requests so that instead of flooding the
web-server and inevitably shutting it down, the Internet traffic is slowed considerably. This sort of
attack is much harder to detect or even do anything about.

Other types of attack

Of course, other types of attack exist where the intruder is playing a much more active role. Some
examples of these follow.

SQL injection

Often in order to authenticate a user, their username and password will be compared to what is stored
in a database. A query language such as SQL is used to extract the relevant data. SQL, or Structured
Query Language, is a database query language specifically designed to extract, add, delete or edit
records in a database. The simplest command is one that extracts all of the records from a database
table.

SELECT * FROM movies;

This command would extract the set of all of the database records from a table called ‘movies’. The
‘*’ character is a wildcard used to select all the fields. Instead of the wildcard, a command such as
the following can be used, which lists all of the fields that you want to extract.

SELECT title, genre, director FROM movies;

When a command such as this is executed, the set of results can be iterated through or displayed,
depending on the purpose of the application.

When managing a login, an application may take the username and password that has been entered and
execute an SQL query such as the one below:

SELECT * FROM users WHERE
name = ' & txt_username & ' AND
password = ' & txt_password & ';

Chapter 10: Cybersecurity

Figure 115: Password entry screen

This command will return the set of records that fit this query, which will consist of either one or zero records. Note that the text input from the user needs to be placed between inverted commas in the SQL command as the username is in the form of a string. If zero records are returned, the user either is not in the database or their password doesn’t match. If one record is returned, the user has been authenticated.

One of the difficulties with this concerns the process of taking the direct input from the user and placing it into an SQL command such as the one above. If code is written in this way, the user has the ability to ‘inject’ their own code into the query. This is known as an SQL injection.

How does an SQL injection work?

A simple SQL injection involves adding an ‘or’ condition to the statement. Let’s consider the example above again. The user could enter the following as their password:

password' or 1=1;

If this were to happen, the SQL command in the code would now look like:

SELECT * FROM users WHERE
name = 'Bob' AND password = 'password'
or 1=1;

The addition of the ‘or 1=1’ makes the statement always true. Not only will a statement like this potentially give the user access, it may give them access to the entire database, as the ‘1=1’ will cause all of the usernames and passwords to be returned (as opposed to just one). A more insidious version of an SQL injection can be performed by placing additional SQL commands into the string. An example of this is shown below.

password'; DROP TABLE users;

The ‘DROP TABLE’ command deletes the entire table of that name. The user may not know the name of the database table they are trying to delete, but may attempt variations of common names until they hit upon one that works. In this instance, the user’s intention is not to gain access, but rather to cause intentional damage.

Figure 116: ‘Exploits of a mom’


source: xkcd.com used with permission
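
The ‘or’ injection described above can be reproduced in a few lines. What follows is a minimal sketch in Python using the standard sqlite3 module. The table contents, the function names login_vulnerable and login_parameterised, and the attack string (a slight variant of the one in the text, chosen so that the inverted commas stay balanced) are all assumptions made for illustration and do not come from any real system. The second function shows one common defence, a parameterised query, in which the user's input is passed to the database separately from the SQL text.

```python
import sqlite3

def make_db():
    """Build a throwaway in-memory database with a hypothetical 'users' table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (name TEXT, password TEXT)")
    conn.executemany(
        "INSERT INTO users VALUES (?, ?)",
        [("Bob", "s3cret"), ("Alice", "hunter2")],
    )
    return conn

def login_vulnerable(conn, username, password):
    # UNSAFE: user input is pasted straight into the SQL text
    query = (
        "SELECT * FROM users WHERE name = '" + username
        + "' AND password = '" + password + "'"
    )
    return conn.execute(query).fetchall()

def login_parameterised(conn, username, password):
    # The ? placeholders keep the input as data, never as SQL code
    query = "SELECT * FROM users WHERE name = ? AND password = ?"
    return conn.execute(query, (username, password)).fetchall()

conn = make_db()
attack = "password' or '1'='1"
print(login_vulnerable(conn, "Bob", attack))     # every row in the table comes back
print(login_parameterised(conn, "Bob", attack))  # no rows: the input is just a wrong password
```

With the vulnerable version the attacker's text becomes part of the WHERE clause, so the condition is true for every record. With the parameterised version the same text is only ever compared against the stored password, so no record matches.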


Protecting from an SQL injection

Protecting from an SQL injection is relatively straightforward. By validating the inputs from the user prior to inserting them into an SQL string as parameters, a software developer can prevent giving the user the ability to enter code directly into the input. Despite the ease of this solution, a surprising number of information systems were vulnerable to SQL injections in recent years, and badly coded software solutions may still be.

A man in the middle attack

Another way an intruder can gain access to a user’s data is to insert themselves in the middle of the communication that the user is having with the information system they are connected with. This generally involves eavesdropping on the initial communication so that the intruder can get the information that they need to make the connection, and then pretending to be the information system in future transactions with the user.

Figure 117: Alice sends a message to Bob, but it is intercepted by Eve (an eavesdropper).

Unsecured wireless networks or wired networks in which users are connected are vulnerable to ‘packet sniffing’. Any device on a network can ‘listen’ to the packets that are being transferred between network nodes. ‘Packet sniffers’ are not illegal by themselves as they can be important diagnostic tools. The use of a ‘packet sniffer’ can show how efficiently a network is operating and what bottlenecks are occurring.

The number of unsecured wireless connections that are offered to the public is growing. Coffee shops, shopping centres, hotels, airports and fast food restaurants have been offering connections of some sort for years, but many are now opening these up as the number of enabled devices grows. The attractions of using wireless hotspots of this kind are obvious and, although the connection will probably be sourced via a router with a firewall that protects you from some of the dangers inherent in connecting to the Internet, you are not protected from those that are closest to you. Ironically, the biggest threat to your data in this type of situation is not an unscrupulous hacker on the other side of the world but rather the (equally unscrupulous) one that is standing metres from you!

An unsecured connection is not only easy to connect to, it also means that the data being sent over it is unencrypted. The data that you send and receive is able to be ‘sniffed’.

There are a number of settings within your operating system that can protect you, however. The first of these is network discovery. Both Mac OS and Windows allow the user to turn network discovery off, which basically means that your computer will not be seen on the network. In addition to this, it is important to disable all means of sharing files and resources – as often these are ‘open’ by default. When accessing any sites for which you will be entering sensitive information, you should use HTTPS rather than HTTP. HTTPS is an option that is available for most sites that
require passwords or sensitive personal information and when accessing them an encrypted connection is established to protect your data (HTTPS relies on the TLS protocol discussed earlier). It is also a good idea to run your own software firewall.

A great way to establish a secure connection when connecting to an unsecure wireless access point is to connect to a Virtual Private Network (VPN). Once connected to a VPN (which could simply be your home computer), the sensitive information will be coming into the VPN (which is trusted) instead of straight into the unsecured wireless network. The connection to the VPN will, of course, be encrypted.

While this doesn’t completely protect you from a ‘man in the middle’ attack, it does make things more difficult for the person attempting to do this.

Software security controls: The software and procedures used to assist in the protection of information systems and the files created, communicated and stored by individuals and organisations. These include usernames and passwords, access logs and audit trails, access restrictions, encryption, firewalls and system protection, and security protocols such as Transport Layer Security (TLS).
VCAA VCE IT Study Design 2020-2023 Glossary

Social engineering

In this chapter we have already discussed a wide range of physical and logical security measures that can be put in place to protect an information system. While an organisation can take a number of measures to protect their data, the phrase “a chain is as strong as its weakest link” is a relevant one to consider. What many in the security industry also understand is that often this “weakest link” is “us”.

Security systems can be set up and processes established so that these systems are used effectively, but if employees do not follow these procedures, security can be compromised.

Strictly speaking, the “art” of social engineering includes reading body language, understanding social cues, applying psychology and taking advantage of situations to leverage others to grant access to the system.

Figure 118: Some of the facets of social engineering

Security measures for the Internet

A number of security measures can be put into place to protect against many of these threats; there is not one magic solution and a multi-layered approach is required. Anti-virus software is a must. Anti-spyware software is also a necessary protection. Scripting is often an option that is available via web-browsers and other Internet enabled software packages and is a feature that can be disabled when ‘surfing’ in un-trusted zones. The use of a software firewall can be useful as an addition to a hardware one. Many security gurus recommend the use of most or all of these
protection methods as they complement each other.

On Windows computers, many users operate in administrator mode when they are working on day to day tasks. As the administrator account is only really needed for the installation of software, it should only be used when this needs to be done. The problem with working in administrator mode is that malware that infiltrates a computer that is logged on in this way will also have administrator privileges. If malware tries to infiltrate a computer that is running in a ‘user’ mode, it will generally be unable to install or replicate itself and the attack will largely be nullified. Many experts advocate only using administrator login when disconnected from the Internet entirely.

Keeping anti-virus and anti-spyware software up to date is essential as new malware threats are being created daily. There is some argument that keeping the operating system up to date is also a good practice, although some would also argue that it is good to wait before installing the latest operating system as sometimes security holes can present themselves and it can take a while for these to be detected and fixed.

Sending sensitive data via the Internet

Sending sensitive data over the Internet can only really be done safely by using encryption. Methods such as email, standard HTML web forms and the like are highly susceptible to ‘sniffing’, which is discussed in the context of Intranets later in this chapter.

Encryption

Encryption is the process of encoding information so that it is unreadable to anyone who does not hold the key. This requires the use of an algorithm known as a cipher. In encryption, the original information is known as ‘plaintext’ and the encoded information is known as ‘ciphertext’.

There are two main types of encryption: symmetric key encryption and asymmetric key encryption.

Symmetric key encryption

Symmetric key encryption can be used to send large amounts of information across the Internet from one person to another. The ‘plaintext’ version is encrypted using what is referred to as a ‘secret key’ and then sent to the recipient who then needs to decrypt it using the same key. This is a very secure way of sending information. The problem that arises is how does the sender let the recipient know what the secret key is? Obviously the key would not be included with the message carrying the ‘ciphertext’ as this would be too easy to intercept. It would also be as risky sending the key via the same transmission method either before or after the ‘ciphertext’. If someone was indeed intercepting the messages, then it would be an easy matter for them to also intercept the key.

In describing various cryptography and security examples, the characters of ‘Alice’ and ‘Bob’ have become the default for two users communicating with each other (A and B). Some other notable characters are:
Craig: A password cracker
Eve: An eavesdropper
Mallory: A malicious attacker
Trudy: An intruder
Sybil: An attacker that uses a number of different identities

It may be that the secret key is exchanged successfully via another method that is secure. There is then the issue of how secure that key is over time. If the secret key is shared with other people, the security of the whole process diminishes as there are then many people who could potentially share the key or not keep it securely. A solution to this might be to create more secret keys and have keys that are used for communications with certain people,
however, this makes the process complicated and potentially confusing.

Asymmetric key encryption

In asymmetric key encryption, there are two keys instead of one – a public key and a private key. Both are related mathematically, but the public key can only be used to encrypt information and not decrypt it. The private key can be used to decrypt the information that has been encrypted by the public key.

This works in the following way. If someone wants to send information to someone else in a secure manner using asymmetric key encryption, they first request a copy of the public key. This enables them to encrypt the information and send it. When the information has been received, the owner of both the public and private keys is able to decrypt the information using the private key.

Public keys can be sent out to anyone who may wish to send encrypted data to you, but only you should have access to your private key.

Asymmetric key encryption works well but is not good for large amounts of information and is relatively slow compared to symmetric key encryption.

Secure Sockets Layer (or SSL) and Transport Layer Security (or TLS)

A common application of asymmetric key encryption is SSL (or Secure Sockets Layer) encryption, which establishes a secure ‘handshake’ between a web browser and a web-server. SSL has since been superseded by a security protocol called TLS (or Transport Layer Security).

TLS is commonly used when transferring sensitive information to a web-site (credit card information or personal details) and can also be used with email, VoIP, instant messaging and a variety of other applications. When accessing a secure web-site using this method, the user is sent the public key and the data is then encrypted and sent to the web-server. This link is usually indicated by a small padlock icon in the web-browser’s address bar.

Hybrid encryption – a better solution

Symmetric and asymmetric key encryption have their pros and cons. While symmetric key encryption is very good at transferring large amounts of data quickly, it has problems with the exchange of the secret key. Asymmetric key encryption is slow, is one way only and has difficulties with large amounts of data. Both of these methods can be combined for a good solution that solves these problems.

Let’s say two users (Alice and Bob) want to exchange information securely between them. Alice begins by making a transfer request of Bob, who sends his public key to her. Alice then uses Bob’s public key to encrypt a secret key that she has generated for this transfer of information and sends it to Bob. Bob is able to use his private key to decrypt the information that has been sent – which in this case is a secret key. Now that both Alice and Bob have access to the secret key, they can send information to each other securely and quickly.

Figure 119: Security cameras

Logical security

Logical security consists of software measures to protect access to the organisation’s data and information. The best way to apply logical security measures is by establishing usernames, passwords and access restrictions (levels of access). Many organisations employ password policies that force their employees to enter complex passwords and change them frequently. Too often users will enter passwords that are easy for them to remember – but also easy to guess. The use of common
words as well as words or numbers that have meaning to them – such as family or pet names and birthdays – means that a person that wants to gain unauthorised access will often have some success by doing some research into the employees themselves. The prevalence of this sort of information on the Internet, especially via social networking sites such as Facebook, makes passwords that are built on these types of facts extremely weak. Many organisations now use password policies that force users to enter passwords of a minimum length, with a combination of upper and lower case characters, including at least one number and one special character (such as an exclamation mark, comma, hash or dollar sign).

Access logs and audit trails provide a useful layer of protection in that security breaches can, to some extent, be traced. In addition, if the users of a system know that access logs and audit trails are being produced, they might be less likely to attempt to access areas of the network to which they do not have access.

Physical security

Physical security measures prevent unauthorised access to physical hardware and software by using physical barriers and authentication techniques. The most obvious physical security measures (known as barrier techniques) consist of locks, alarms, fences and gates. While the use of these may seem to be quite obvious, their application within organisations can be inconsistent. Locks can be organised so that there are levels of physical access within an organisation (for example, some keys will access some areas). This is known as a zoned security strategy. Locks can also be digital locks that accept PINs or swipe cards. The advantage of this sort of system is that it can be installed in such a way that it maintains a log of when employees unlock or lock a particular door and the codes can easily be changed remotely. Alarm systems and video surveillance are both useful (and relatively cost effective), but their presence should be obvious so that they work both as a deterrent and as a tool for detection and response.

Large organisations that can afford it will often employ guards or have 24-hour surveillance, but this is not an option that is always available and it comes down to risk management (cost versus benefit). For many organisations, however, the use of biometric security is quite cost effective and easy to implement.

Barrier techniques

Physical security controls: The equipment and procedures used to assist in the protection of information systems and the files created, communicated and stored by individuals and organisations. Equipment controls include zoned security strategies, barrier techniques and biometrics. Physical procedures include backing up, shredding confidential documents and checking authorisation credentials.
VCAA VCE IT Study Design 2020-2023 Glossary

The term ‘barrier techniques’ describes the control of access to an organisation through the use of barriers. Barriers are typically arranged to form concentric layers. The area of greatest sensitivity is at the centre of the layers and someone wishing to access this area must pass successfully through all of the layers in turn. Barrier techniques do not exclusively consist of physical security measures, although the first ‘layers’ can typically consist of fences, locks, guards, security cameras and gates.
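
The ideas of zoned security, layered barriers and a digital lock that keeps a log can be combined in a short sketch. The Python below is only an illustration under stated assumptions: the layer names, the card identifiers and the function try_to_reach are hypothetical and are not taken from any real access control product. A card holder is walked inwards one layer at a time, every attempt is written to an access log, and the walk stops at the first layer the card is not cleared for.

```python
from datetime import datetime

# Layers ordered from the outside in (hypothetical zone names)
LAYERS = ["perimeter gate", "building entrance", "office floor", "server room"]

access_log = []  # audit trail of every attempt: (time, card, layer, allowed)

def try_to_reach(card_id, cleared_layers, target):
    """Walk inwards one layer at a time; stop at the first refusal."""
    for layer in LAYERS:
        allowed = layer in cleared_layers
        access_log.append((datetime.now(), card_id, layer, allowed))
        if not allowed:
            return False
        if layer == target:
            return True
    return False  # target is not a known layer

visitor = {"perimeter gate", "building entrance"}
admin = set(LAYERS)
print(try_to_reach("card-017", visitor, "server room"))  # False
print(try_to_reach("card-001", admin, "server room"))    # True
print(len(access_log))                                   # 7
```

The visitor is logged at three layers (two passed, one refused) and the administrator at four, which is why the log ends up with seven entries. A log of this kind is what allows a breach, or an attempted one, to be traced afterwards.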


Biometrics

The field of biometrics is concerned with identifying individuals based on unique identifiers of a physical or behavioural nature. Perhaps the best known biometric measure is a fingerprint. Fingerprint scanners can be placed on door locks or to provide access to computers or resources. Fingerprint scanners, however, can be fooled easily with prosthetics or even photocopies of a fingerprint – provided that the fingerprint of an authorised person can be obtained. For this reason, they are considered a fairly low level biometric device.

Common physical biometric measures that are used are face recognition, palm prints, hand geometry and iris recognition. Examples of behavioural biometric measures that are used are gait, voice and typing rhythm. These measures can all be used to good effect, however, one of the best (and simplest) biometric measures is actually a photograph ID. For a photographic ID to be used, someone needs to physically check that the person in the ID photo is the person that wants to gain physical access to the organisation.

Figure 120: Fingerprint scanner

Tools and techniques for tracing transactions between users of information systems

We have already spoken about access logs and audit trails and how these provide a very useful method by which transactions between users of information systems can be traced. While it may not be desirable (or indeed ethical) to record every action that each user performs, a record of a transaction should be made when certain key events are triggered.

Transactions such as the following could be examples of events that should be recorded:
• Accessing a file
• Logging in and logging out of a system
• Printing a file
• Accessing a web-site

Threats to data integrity

Before we discuss the threats to data integrity, we must first define what it means for data to be labelled as ‘having integrity’. For this to be true, a number of criteria need to be satisfied.

Accurate

Accurate data is data that is true to the source. It is captured (or entered) once and fully validated. Judging the accuracy of data can be a subjective thing, but there are a number of criteria that can be applied to assist in doing this.

• Does the data make sense in relation to the data and information that you have already gathered?
• Is the source of the information reputable and trustworthy?
• Has the data been checked?
• Is the data up to date?

To say that data is accurate is not the same as saying that it is correct. Accurate data is simply a true representation of the source material and it may or may not actually be ‘correct’.

Timely

All data has a ‘use-by’ date. Data that is considered timely for one purpose may not be useful for another. For example, data on the technology purchasing preferences of students will only be relevant for a short time as
technology is rapidly changing. The amount of time that can pass before data is no longer considered to be ‘timely’ really depends on what sort of data it is.

Reasonable

Data may be perfectly valid, but not reasonable in the context in which it is being used. For example, while it is possible for a person to live to the age of 100 (and beyond), it is not likely that a person of this age would be amongst a list of students taking part in an athletics carnival.

Authentic

The term ‘authentic’ relates to the source of the data. Is the data from a trusted source? The best way to ensure that data is authentic is to gather it yourself. If this is not possible, then to ensure authenticity, the person or the device gathering the data must be trusted and proven to work as expected. If the data is not being gathered first hand but is instead being collected from another source, then that source must be one that has a proven track record and transparent processes.

Correct

There are lots of terms to describe data and some of them are very similar. We have already spoken about validation, which is a process by which the data is made as error free as possible. Accuracy is a measure of how well data can be measured and recorded. We have also spoken about data that is authentic, however with data that is correct, the emphasis is on whether the data is right. Although it is unlikely, data could potentially be described as being compliant with all of the criteria mentioned previously, but still not be correct. To have a definitive answer as to whether data is correct or not, it needs to be compared or scrutinised by someone that has the knowledge or experience (or data from other sources) to be able to make that determination.

Poor data management, especially in relation to unstructured (or poorly structured) data, costs organisations a lot of money. The storage, control and transfer of data that is not in a structured format creates many challenges. Information systems within an organisation can easily lose or misplace data files or additional time may be needed to change the format of files to make them compatible with different systems. This can be as a direct result of a lack of standards within the organisation or potentially resulting from a lack of compliance in adhering to the standards that have been set.

The following are some other threats to data integrity.

Initial data conversion

Certainly once a database is established, well designed procedures and validation can keep the number of errors to a minimum. However, databases need to be created and populated with data and this process alone is one that can potentially introduce the most errors and inconsistencies. Some data may be entered that is completely new or represented in a different form. Other data may be converted from existing information systems.

There will certainly be lots of differences between these systems and the new databases being established: different field types, data ranges and data that is missing. What will be done with data that exists in the existing information systems but not in the new system? What about gaps in the data or errors in the existing data?

Manual data entry

No matter how well designed an information system is, it will almost always require some manual data entry. With the intelligent design of user interfaces, many potential entry errors can be correctly validated and corrected. Despite this, there are some aspects of data entry that will always be prone to error no matter how well designed the entry process is. Data items such as names are easy to misspell. It can be common for items to be miscategorised (for example, ‘male’ as ‘female’). Data items can be left out altogether or an element of one record entered into another.

Data mining

Data mining, by definition, is the process by which as much useful information as possible is gleaned from a set (or sets) of data so that some deeper conclusions can be drawn. This can involve extracting and processing data from a variety of sources. The key question becomes one of trust. The more data that is included in a data set (especially data sets from a variety of sources), the more potential sources of error. When data sets are combined and processed to form new data, even though the new data set may be technically valid, the correctness of the original data then becomes hard to examine.

Incompatibility between systems

In any information system, databases or software packages will exchange data in a number of ways. These transactions can be one way or can be a true two-way exchange. In rare cases, an information system may have been built in an organisation and evolved so that it is the only source of data and it communicates with modules that are built following the same design. Given the time that it takes to build an information system and the changing needs of organisations, it is more likely to encounter a situation where many systems interact with each other.

Often a new software package will become available in the marketplace that will meet a need in an organisation. When this happens, one of the considerations is how the package will integrate with the existing systems. The integration may be a strong one or it may well be that data is exported from the main systems at regular intervals and then imported into the new software package using a batch process. While there is nothing wrong with doing this, the more transactions that are introduced, the more points of failure there will be in the process. What’s more, there will likely need to be some translation between the existing systems and the new system.

It can be very easy in situations like this for the vendors of one software solution to ‘blame’ the inefficiencies on the legacy packages being used. Different data types could be used to represent the same data, which could create handling issues (a good example might be a postcode field being a numeric type in one system and a text type in another). Other items of data may be labelled differently (for example, ‘Surname’ versus ‘Last Name’) or there may be gaps in data sets (one software solution uses a ‘Salutation’ data field while another leaves this out).

User permissions

Another data management practice that can cause conflict between information systems could simply be that of user permissions. Systems administrators are often very careful with the permissions that they assign to those in the organisation, and rightly so. However, incorrectly set permissions could mean that a user is able to extract data from one system but is not then able to import the data into another to perform the task assigned to them. While this might seem like an easy issue to resolve, it could be the case that the permissions required by the user are not available as they would give the user access to many functions beyond their skill level or level of access.
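
Returning to the postcode, field label and missing field mismatches described under ‘Incompatibility between systems’, the translation step between two systems might look something like the Python sketch below. Both systems, the FIELD_MAP table and the translate function are hypothetical and are used here only to make the three problems concrete. The padding to four characters assumes Australian postcodes, which are four digits long and can begin with a zero.

```python
# A record exported from a hypothetical legacy system
legacy_record = {"Surname": "Nguyen", "Postcode": 800}  # 0800 stored as a number

# Field names used by the (equally hypothetical) new system
FIELD_MAP = {"Surname": "Last Name", "Postcode": "Postcode"}

def translate(record):
    """Rename fields, restore the postcode as 4-character text, fill gaps."""
    new_record = {FIELD_MAP[key]: value for key, value in record.items()}
    # A numeric postcode loses its leading zero; pad it back out as text
    new_record["Postcode"] = str(new_record["Postcode"]).zfill(4)
    # The legacy system has no 'Salutation' field, so mark the gap explicitly
    new_record.setdefault("Salutation", None)
    return new_record

print(translate(legacy_record))
```

Run as written, the translated record holds the surname under ‘Last Name’, the postcode as the text ‘0800’ and an empty ‘Salutation’. Every rule of this kind is a further place where an error can creep in, which is the point the text makes about each extra transaction adding a point of failure.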


Context Questions

1. What is the difference between a worm and a Trojan?
2. What is a DDOS attack and how can it be prevented?
3. What is the process known as ‘packet sniffing’?
4. What is encryption and what does the simplest form of this involve?
5. What is the difference between logical security and physical security?
6. What does a complex password typically consist of?
7. When discussing physical security measures, what are barrier techniques?
8. What do biometric devices measure?
9. When an attack occurs on an organisation, what information can be gained and how?
10. What is penetration testing and who is it carried out by?

Applying the Concepts

• Interview the network manager at your school and ask them to describe the physical and software
controls that they use to secure their data.
• Investigate cases where a loss of data integrity has led to a significant (and possibly detrimental)
effect for customers of a business. Be sure to focus on cases that are specifically due to a loss of
data integrity as opposed to the effects of viruses or errors in code.
• Discuss ‘best practice’ for creating passwords. Make a list of guidelines that can be followed by
users to ensure they do everything they can to protect their data.
• Discuss times in which you or others that you know have changed the outcomes of situations by
appealing to the nature of the other parties or by “stretching the truth”. This needs to be a ‘safe’
discussion, but can serve to highlight how social engineering can be used and how it can be
guarded against.

Key Skills Checklist

At the conclusion of this chapter, you should be able to address the following key skills. Mark each off
as you can achieve them.

Understand the difference between physical and software controls to protect data

List and describe strategies that can be used to identify and minimise risks

Describe types of risks and the ways that they can be protected against

List and describe the characteristics of data that has integrity

Describe how ineffective security strategies can have an impact on integrity of data

Understand how to formulate criteria to assess the effectiveness of security strategies

Discuss risk management strategies that can be used to minimise security vulnerabilities

Chapter 10: Cybersecurity

Sample Examination Questions

The following sample examination questions can be attempted to test your knowledge of the content
of this chapter.

“Even though I created this module for my last job, it’s mine to sell and do what I wish with”.

Question 1
Which of the following would you classify as an event-based threat?
A. Incorrect file formats leading to files being deleted
B. Loss of hardware devices
C. Water damage due to flood
D. Viruses damaging data

Question 2
One of the functions of a firewall is:
A. To prevent unauthorised users from accessing the network
B. To ensure that the Internet is protected from viruses on the internal network
C. To provide a method of logging people in
D. To run network software

Question 3
Which of the following is an example of a logical security measure?
A. Biometrics
B. Security cameras
C. Passwords and levels of access
D. Doors with swipe card locks installed

Question 4
An example of an accidental threat to data could be:
A. Flood in the basement of the building
B. Unauthorised access to the information system
C. Users not familiar with how to use the system properly
D. Power failure

Question 5
Employees working for a loan financing company have been given tablets that have WiFi and 4G
network access and connect to the company’s servers to access data while quotes are being given to
customers in their own homes.

a. List three threats to these devices.

1: _________________________________________________________________________

2: _________________________________________________________________________

3: _________________________________________________________________________
3 marks


b. Jill (Client Services Manager) is in favour of locking the devices with a common password (such
as ‘ABC123’) while Kalen (IT Support) wants to use the built-in fingerprint scanner that the
devices have. Discuss the merits of each approach.

__________________________________________________________________________________

__________________________________________________________________________________

__________________________________________________________________________________
2 marks

Question 6
One of the most common sources of accidental threats is the employees of a company. Describe
two ways that this threat can be minimised.

1: ________________________________________________________________________________

__________________________________________________________________________________

__________________________________________________________________________________

2: ________________________________________________________________________________

__________________________________________________________________________________

__________________________________________________________________________________
2 marks

Question 7
Beans Meanz is a small suburban café that offers free WiFi to anyone in the area. For convenience, it
has been left “open” without any security or encryption on it at all. Traci, who has only recently been
hired to work at the café, has some network security experience and has been critical of this
arrangement.

a. What is the difference between a network password and encryption?

__________________________________________________________________________

__________________________________________________________________________
2 marks

b. List two reasons why Traci might be critical of the setup at Beans Meanz.

Reason 1: _________________________________________________________________

__________________________________________________________________________

Reason 2: _________________________________________________________________

__________________________________________________________________________
2 marks


Question 8
Most organisations employ both logical and physical security controls to protect their information
systems. Describe two common logical security measures and two common physical security
measures that can be used.

Logical measure 1: ________________________________________________________________

________________________________________________________________________________

Logical measure 2: ________________________________________________________________

________________________________________________________________________________

Physical measure 1: _______________________________________________________________

________________________________________________________________________________

Physical measure 2: _______________________________________________________________

________________________________________________________________________________
4 marks

Sample Examination Answers

Question 1
Answer: C

Question 2
Answer: A

Question 3
Answer: C

Question 4
Answer: C

Question 5
a. 1: theft of the device.
2: damage to the device (dropping, impact, etc)
3: breach of the company server / database via unauthorised access
b. Using a common password will mean that the employees on the road will be easily able to
remember how to access the device and will be able to ask each other if they forget. The
password that has been chosen, however, is a very simple one to crack or for someone to
work out if they are watching the employee enter it.
A fingerprint scanner is much more secure and will ensure that the device is only being
accessed by an authorised employee. There may be some circumstances in which it may not
be practical (injury, wet or dirty hands).
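To put a rough number on why ‘ABC123’ is simple to crack, the sketch below estimates how many candidate passwords an attacker would need to try in the worst case. It assumes a plain brute-force attack by someone who knows the password length and which kinds of characters it uses; the function name `keyspace_size` and the longer example password are made up for illustration.

```python
import string


def keyspace_size(password: str) -> int:
    """Estimate the number of candidates a brute-force attack must cover.

    Assumption: the attacker knows the length and which character
    classes (lower case, upper case, digits, punctuation) are in use.
    """
    alphabet = 0
    if any(c in string.ascii_lowercase for c in password):
        alphabet += 26
    if any(c in string.ascii_uppercase for c in password):
        alphabet += 26
    if any(c in string.digits for c in password):
        alphabet += 10
    if any(c in string.punctuation for c in password):
        alphabet += len(string.punctuation)
    return alphabet ** len(password)


# 'ABC123' uses upper case letters and digits only: 36 symbols, 6 characters.
weak = keyspace_size("ABC123")

# Hypothetical longer password that mixes all four character classes.
stronger = keyspace_size("tR7#qLw9!xZp")

print(weak)
print(stronger > weak)
```

In practice ‘ABC123’ is likely to fall far sooner than this estimate suggests, because it appears on lists of commonly used passwords that attackers try first.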


Question 6
1: Backup data frequently.
2: Train staff in the correct procedures and use of software packages.

Another acceptable answer is to maintain levels of access and audit trails, so that changes can only be
made by those with access and, even then, can be traced and rolled back.
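The idea of levels of access combined with an audit trail can be sketched in a few lines. This is a toy illustration only: the class `AuditedStore`, the rule that level 2 or above is needed to make changes, and the sample names and values are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class AuditedStore:
    """Toy data store that logs every change so it can be traced and undone."""

    data: dict = field(default_factory=dict)
    # Each log entry is (user, key, old_value, new_value).
    log: list = field(default_factory=list)

    def update(self, user: str, level: int, key: str, value: str) -> bool:
        # Hypothetical rule: only access level 2 or above may change data.
        if level < 2:
            return False
        self.log.append((user, key, self.data.get(key), value))
        self.data[key] = value
        return True

    def rollback(self) -> None:
        """Undo the most recent change using the audit trail."""
        if not self.log:
            return
        _user, key, old_value, _new_value = self.log.pop()
        if old_value is None:
            # The key did not exist before the change, so remove it.
            self.data.pop(key, None)
        else:
            self.data[key] = old_value


store = AuditedStore()
store.update("kalen", 2, "interest_rate", "5%")
store.update("kalen", 2, "interest_rate", "50%")  # an accidental change
store.rollback()                                   # restores "5%"
print(store.data)
```

A production audit trail would normally be append-only, recording the rollback as a new entry instead of deleting the old one, so that the full history stays available.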

Question 7
a. A network password will gain you access to a network. Encryption is a method by which the
data that is transferred is encoded so that it can’t be read (without the correct key).
b. Reason 1: Anyone could access resources on the network and implant viruses or copy (or
delete) data.
Reason 2: Data transmitted over a WiFi network can be easily read by others if it is not
encrypted. This could include banking details and passwords, for example.
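The role of the key in part a can be illustrated with a very simple substitution cipher, in which each letter is shifted along the alphabet by a fixed number of places. This is a teaching sketch only, with the function name `caesar` and the sample message chosen for illustration. A cipher this simple is trivial to break and is not what a WiFi network actually uses.

```python
def caesar(text: str, key: int) -> str:
    """Shift each English letter by `key` places; leave other characters alone."""
    out = []
    for ch in text:
        if ch.isascii() and ch.isalpha():
            base = ord("A") if ch.isupper() else ord("a")
            out.append(chr((ord(ch) - base + key) % 26 + base))
        else:
            out.append(ch)
    return "".join(out)


message = "Card number on file"
ciphertext = caesar(message, 3)      # encode with the key 3
recovered = caesar(ciphertext, -3)   # decode by reversing the shift

print(ciphertext)
print(recovered == message)
```

Anyone who intercepts the encoded text without knowing the key sees only scrambled letters, while the intended receiver reverses the shift to recover the message. On an open network with no encryption, the original message itself is what travels through the air.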

Question 8
Logical measures
• Usernames and passwords so that only those that are authorised can gain access to the
information system.
• Levels of access which only allow those with the correct clearance in an organisation to access
important / critical files.

Physical measures
• Security cameras which monitor critical areas such as the server room.
• Swipe cards on doors so that only those with clearance can gain physical access to certain
areas.

Other possible answers include the use of logs / audit trails and biometric devices such as fingerprint
scanners / locks.
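The two logical measures above can be combined in code. The following is a minimal sketch, assuming an in-memory dictionary of users and numeric access levels; the function names, the level numbers and the sample accounts are hypothetical. It uses Python's standard `hashlib`, `hmac` and `secrets` modules so that passwords are stored as salted hashes instead of plain text.

```python
import hashlib
import hmac
import secrets


def hash_password(password: str, salt: bytes) -> bytes:
    # Derive a hash from the password and salt; 100,000 rounds is an
    # illustrative figure, not a recommendation.
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, 100_000)


def register(users: dict, username: str, password: str, level: int) -> None:
    """Store a salted hash and an access level for the user."""
    salt = secrets.token_bytes(16)
    users[username] = (salt, hash_password(password, salt), level)


def can_access(users: dict, username: str, password: str, required_level: int) -> bool:
    """Allow access only if the password matches and the level is high enough."""
    record = users.get(username)
    if record is None:
        return False
    salt, stored_hash, level = record
    password_ok = hmac.compare_digest(stored_hash, hash_password(password, salt))
    return password_ok and level >= required_level


users = {}
register(users, "jill", "correct horse battery", level=1)
register(users, "kalen", "another long passphrase", level=3)

print(can_access(users, "jill", "correct horse battery", required_level=1))
print(can_access(users, "jill", "correct horse battery", required_level=3))
print(can_access(users, "kalen", "wrong password", required_level=1))
```

The first check succeeds; the second fails because the access level is too low, and the third fails because the password does not match.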
