Data Flow Diagram Level 2
Description
A Data Flow Diagram (DFD) tracks processes and their data paths within
the business or system boundary under investigation. A DFD defines each
domain boundary and illustrates the logical movement and transformation
of data within the defined boundary. The diagram shows 'what' input data
enters the domain, 'what' logical processes the domain applies to that data,
and 'what' output data leaves the domain. Essentially, a DFD is one of the
oldest tools for process modelling.
Uses
A Data Flow Diagram is useful for establishing the boundary of the
business or system domain (the sphere of analysis activity) under
investigation. It identifies any external entities along with their data
interfaces that interact with the processes of interest. A DFD can be a
useful tool (particularly when used as a top DFD - refer to Context diagram)
for helping secure stakeholder agreement (sign-off) on the project scope. It
is also a useful tool for breaking down a process into its sub-processes for
closer analysis.
Application
A Data Flow Diagram can be modelled early in the requirements elicitation
process of the Analysis phase within the System Development Life Cycle
(SDLC) to define the project scope. A DFD can also be created throughout
the SDLC to investigate an aspect of the system. If necessary, each
process under study within a DFD can be broken down into its sub-
processes on a new DFD to show more details. A sub-process in turn can
be broken down further to reveal its sub-processes on a new DFD, and so
on until sufficient analysis is reached. The activity of drilling down the DFD
levels is called 'functional decomposition' with the resulting new DFD
referred to as a 'levelled DFD'. For example, the top level DFD (also known
as a Context diagram) is a level 0 DFD, level 1 DFD refers to the initial
decomposition, level 2 DFD to a second level decomposition, and so on.
Audience
The primary audience for a DFD is the stakeholders, such as project
sponsors, managers, and subject matter experts, who provide the
information for a DFD and who should also approve each DFD. Project
managers and requirements teams also use DFDs to plan the project
work.
Components
A DFD can be assembled from the following four components:
Process: A process is a logical activity that transforms or manipulates
incoming data within the domain under investigation. A process can be
regarded as a 'black box'; it receives input, processes it, and produces
output. A rounded rectangle (or circle) represents a process under study.
Each process is labelled inside its rectangle to describe its function or
purpose. It is common to use a verb-noun phrase for naming a process,
e.g. 'Check stock' and 'Reserve product'. If several Data Flow Diagrams
reference each other (e.g. as in a decomposed levelled DFD structure)
each process should be tagged with a numbering scheme or identifier to
show the hierarchical relationships between them. For example, 'process
1.0' in level 0 DFD can decompose into 'process 1.1', 'process 1.2',
'process 1.3', etc, in level 1 DFD. Similarly, 'process 1.1' in level 1 DFD can
decompose further into 'process 1.1.1', 'process 1.1.2' etc, in level 2 DFD,
and so on. It is usual to place the level identifier above the process name
with a horizontal line separating them.
External entity: An external entity sits outside the domain of interest and
supplies data to or receives data from the domain. An external entity is
referred to as an external source or sink (destination) for data flowing in
and out of the domain. A rectangle defines an external entity and is labelled
with a noun phrase inside its rectangle to describe an organisation,
process, machine or person (i.e. a thing) that is outside the domain under
analysis. Examples of naming an external entity are 'Payment company',
'Store locator', 'Mainframe server' and 'Customer'. Note that an external entity
in a DFD is not permitted to transform data; only a process can.
Data flow: A data flow represents the path of data moving through the
domain under analysis. A data flow shows the movement of data between a
process and an external entity, a process and another process, and a
process and a data store (data repository). An arrow is the symbol used to
connect a process with other DFD components. Each arrow should be
labelled appropriately to describe the data being passed, e.g. 'Customer
details', 'Rejected order' and 'Stock level lookup'. A data flow can move
the same type of data in both directions, in which case the line should
show an arrowhead at both ends. Data flows are also useful for identifying interfaces
which will need closer data analysis (e.g. ER data modelling). Note that a
data flow is considered to be within the domain of the process under study
whereas an external entity is not.
Data store: A data store represents a logical data repository accessible
within the domain under study. A data store can be a place where data is
created, read, changed and stored temporarily or permanently by a
process. A thin rectangle with the right side open (or two horizontal parallel
lines) shows a data store and is labelled with a noun phrase inside its
rectangle to describe the data stored, e.g. 'Order records' and 'Online
catalogue'. Physically, a data store can represent a file or a database
system. Note a data store is not permitted to process data.
Note:
There are two main styles of diagrammatic notation for Data Flow
Diagrams: the Gane & Sarson notation (e.g. a rounded square symbol for a
process, a thin rectangle open on the right side for a data store), and
Yourdon's notation (e.g. a circle symbol for a process, parallel horizontal
lines for a data store).
While there are guiding principles and rules for using Data Flow Diagrams,
they are not always followed in practice. Some DFD
practitioners add new symbols or adapt the rules to suit their needs.
This can sometimes be useful, but the important point is that whatever
principles or rules are adopted, they should be applied consistently throughout the
project. The following are some of the principles and rules for using Data
Flow Diagrams:
Principles:
Within a DFD, the external environment (i.e. external entities) sends data
into the domain under analysis, where it is transformed as it moves from
one process to another inside the domain. The processed data finally
returns to the external environment as output data.
A DFD is designed to be broken down or decomposed into a hierarchical
tree structure of Data Flow Diagrams (DFD levels) with each child DFD
revealing more details than its parent.
Rules:
A process and a data store must each have incoming and outgoing data
flows.
A process and a data store can each have one or more input data flows
and one or more output data flows.
A process can connect to any other component, including to another
process.
An external entity must send its outgoing data flow only to a process.
An external entity must receive an input data flow only from a process.
A data store receives an incoming data flow only from a process.
A data store sends an output data flow only to a process.
Only a process is permitted to transform or change data.
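The connection rules above reduce to one check: every data flow must have a process at one end or the other. The following is a minimal Python sketch of that check, not part of any DFD tool. The names Flow, illegal_flows and the kind labels are assumptions made for illustration; the component and flow names reuse the examples from the Components section.

```python
from typing import NamedTuple

# Component kinds named in this section.
PROCESS, ENTITY, STORE = "process", "external entity", "data store"


class Flow(NamedTuple):
    source: str  # name of the component the data leaves
    target: str  # name of the component the data enters
    label: str   # description of the data being passed


def illegal_flows(kinds: dict[str, str], flows: list[Flow]) -> list[Flow]:
    """Return the flows that have no process at either end."""
    return [
        f for f in flows
        if PROCESS not in (kinds[f.source], kinds[f.target])
    ]


kinds = {
    "Customer": ENTITY,
    "Check stock": PROCESS,
    "Order records": STORE,
}
flows = [
    Flow("Customer", "Check stock", "Customer details"),         # entity to process: allowed
    Flow("Check stock", "Order records", "Stock level lookup"),  # process to store: allowed
    Flow("Customer", "Order records", "Customer details"),       # entity to store: not allowed
]

for bad in illegal_flows(kinds, flows):
    print(f"Illegal flow: {bad.source} -> {bad.target} ({bad.label})")
```

Only the third flow is reported, because neither of its ends is a process. A process-to-process flow passes the check, in line with the rule that a process can connect to any other component.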
Development
The following steps present guidelines for developing Data Flow Diagrams:
Define the process: Start a DFD by identifying each process under study
and naming it appropriately. If the process needs to be broken down into
sub-processes for further analysis, add a numbering scheme to each
process to show the hierarchical relationships between the parent and child
processes (see 'Process' under the Components section). Also, label each DFD
with a descriptive name for referencing purposes, appending its levelled
DFD sequence identifier appropriately.
Identify external entities: Identify each external entity that interacts with or
impacts the process of interest and label each one with a descriptive name.
Connect data flows: Connect each external entity to the process under
investigation by an arrow to show its data flow direction. Name each data
flow clearly and as close to it as possible so as to avoid confusing the
name with other data flows close by.
Identify data stores: Add any identified data store on the diagram, label it
and connect its data flow to its associated process.
Repeat the steps for each levelled DFD: Repeat the above steps for
each levelled DFD that is discovered or needed for further analysis. Finally,
name each DFD consistently for referencing back to it throughout the
project.
Tips:
Keep in mind the following pointers when developing Data Flow
Diagrams:
Decide which style of notation (Gane & Sarson or Yourdon) to use for the
DFD components throughout the project.
Begin with the Context diagram to show the target domain's top process
(think of it as the top 'black box') with its major external entities and data
flows. This will illustrate the overall business or system under investigation,
including its domain boundary from the outside environment. Sometimes it
may be useful to leave out any data store details in a Context diagram in
order to focus on the big picture; there is an assumption that the data
stores are included within the top process.
If you wish to avoid data flow lines crossing over each other, try
repositioning the components to see if this helps. If it does not, it is
acceptable to duplicate components on the diagram for this purpose.
If possible, limit the number of decomposition DFD levels to three; it will be
easier for stakeholders to follow. If further DFD levels are necessary,
provide a separate diagram to show the hierarchical tree structure of all the
levels involved.
It is common to attach a Context diagram in a high-level business
requirements document to define the project scope. More detailed Data
Flow Diagrams can be developed later on during the analysis phase and
attached to the requirements specification or functional documentation as
needed.
Engage project stakeholders when developing Data Flow Diagrams and
secure their approval (sign-off) as soon as possible.
Decomposing diagrams into Level 2 and lower hierarchical levels
What is a Level 2 (or lower) DFD
We have already seen how a level 0 Context Diagram can be decomposed (exploded)
into a level 1 DFD. In DFD modeling terms we talk of the Context Diagram as the
parent and the level 1 diagram as the child.
This same process can be applied to each process appearing within a level 1 DFD. A
DFD that represents a decomposed level 1 DFD process is called a level 2 DFD.
There can be a level 2 DFD for each process that appears in the level 1 DFD.
A possible level 2 DFD for process 2: Loan of video of the level 1 DFD is as
follows:
Note that every data flow into and out of the parent process must appear as part of the
child DFD. The numbering of processes in the child DFD is derived from the number
of the parent process, so all processes in the child DFD of process 2 will be called
2.X (where X is the arbitrary number of the process on the level 2 DFD). Also, there
are no new data flows into or out of this diagram. This kind of data flow validation is
called balancing.
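The numbering convention in this passage can be sketched in a few lines of Python. The helper names child_ids, parent_id and level_of are hypothetical. The sketch assumes the convention used here, where a process numbered 2 sits on the level 1 DFD and its children are numbered 2.X on the level 2 DFD.

```python
from typing import Optional


def child_ids(parent: str, count: int) -> list[str]:
    """Identifiers for `count` sub-processes of the given parent process."""
    return [f"{parent}.{n}" for n in range(1, count + 1)]


def parent_id(process_id: str) -> Optional[str]:
    """Identifier of the parent process, or None for a level 1 process."""
    head, dot, _ = process_id.rpartition(".")
    return head if dot else None


def level_of(process_id: str) -> int:
    """DFD level on which the process appears under this convention."""
    return process_id.count(".") + 1


print(child_ids("2", 3))   # sub-processes of process 2
print(parent_id("2.3"))    # parent of sub-process 2.3
print(level_of("2.3"))     # level on which 2.3 is drawn
```

The numbers are labels only. They identify a process and its parent but say nothing about the order in which processes run.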
Look at the rectangular boundary for this level 2 DFD. Outside the boundary is the
external entity Customer. Also outside the boundary are the two data stores:
although these data stores are inside the system (see the level 1 DFD), they are outside
the scope of this level 2 DFD.
A process box that cannot be decomposed further is marked with an asterisk in the
bottom right hand corner. A brief narrative description of each bottom-level process
should be provided with the Data Flow Diagrams to complete the documentation of
the Data Flow Model.
Each process on the Level 1 diagram is investigated in more detail, to give a greater
understanding of the activities and data flows. Normally processes are decomposed
where:
There are more than six data flows around the process
The following steps are suggested to aid the decomposition of a process from one
DFD to a lower level DFD. As you can see they are very similar to the steps for
creating a Level 1 DFD from a context diagram:
1. Make the process box on the Level 1 diagram the system boundary on the
Level 2 diagram that decomposes it. This Level 2 diagram must balance with
its parent process box, i.e. the data flows to and from the process on the
Level 1 diagram will all become data flows across the system boundary on the
Level 2 diagram. The sources and recipients of data flows across the Level 2
system boundary are drawn outside the boundary and labeled exactly as they
are on the Level 1 diagram. Note that these sources and recipients may be data
stores as well as external entities or other processes. This is because a data store
in a Level 1 diagram will be outside the boundary of a Level 2 process that
sends or receives data flows to/from the data store.
2. Identify the processes inside the Level 2 system boundary and draw these
processes and their data flows. Remember, each data flow into and out of the
Level 2 system boundary should be to/from a process. Using the results of the
more detailed investigation, filter out and draw the processes at the lower level
that send and receive information both across and within the Level 2 system
boundary. Use the level numbering system to number sub-processes so that, for
example, Process 4 on the Level 1 diagram is decomposed to Sub-processes
4.1, 4.2, 4.3 on the Level 2 diagram.
3. Identify any data stores that exist entirely within the Level 2 boundary, and
draw these data stores.
4. Identify data flows between the processes and data stores that are entirely
within the Level 2 system boundary. Remember, every data store inside this
boundary should have at least one input and one output data flow.
5. Check the diagram. Ensure that the Level 2 Data Flow Diagram does not
violate the rules for Data Flow Diagram constructs.
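Step 1 above can be sketched as a small Python routine: the Level 1 flows that touch the chosen process become the flows across the Level 2 boundary, and the components at their other ends are drawn outside it. This is an illustrative sketch only. The function names are hypothetical, and the data store names ('Videos', 'Loans'), the second process and all flow labels are invented, because the Video Rental Level 1 diagram is not reproduced in this section.

```python
from typing import NamedTuple


class Flow(NamedTuple):
    source: str  # component the data leaves
    target: str  # component the data enters
    label: str   # description of the data


def boundary_flows(level1: list[Flow], process: str) -> list[Flow]:
    """Level 1 flows that touch `process`; each must cross the Level 2 boundary."""
    return [f for f in level1 if process in (f.source, f.target)]


def outside_components(level1: list[Flow], process: str) -> set[str]:
    """Sources and recipients to draw outside the Level 2 boundary."""
    names: set[str] = set()
    for f in boundary_flows(level1, process):
        names.update((f.source, f.target))
    names.discard(process)
    return names


# Hypothetical Level 1 flows for the Video Rental example.
level1 = [
    Flow("Customer", "2 Loan of video", "Loan request"),
    Flow("Videos", "2 Loan of video", "Video details"),
    Flow("2 Loan of video", "Loans", "Loan details"),
    Flow("Customer", "1 Register customer", "Customer details"),
]

for f in boundary_flows(level1, "2 Loan of video"):
    print(f"{f.source} -> {f.target}: {f.label}")
print(sorted(outside_components(level1, "2 Loan of video")))
```

In this sketch the data stores appear among the outside components alongside the external entity, which matches the note in step 1 that a Level 1 data store lies outside the boundary of the Level 2 diagram.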
Under what conditions would you decompose a process on a Data Flow Diagram?
Exercise 14
Decompose the Video Rental Level 1 DFD process loan of video into a Level 2
DFD.
Review Question 2
Create a Level 1 DFD for the Estate Agency case study based on the context diagram
from the previous Review Question and the case study text.
A discussion on this review question can be found at the end of this chapter.
Review Question 3
Create a Level 2 DFD for the invoice client process of the Estate Agency case study
based on the Level 1 DFD from the previous Review Question and the case study text.
A discussion on this review question can be found at the end of this chapter.
The analyst must establish a clear overview of the system under investigation, and
identify the activities and information that are necessary for the business to meet its
objectives. This may be done in consultation with a single user, with a broad but
adequately detailed knowledge of all areas included within the scope, or in
consultation with a group of users. Alternatively, partial views obtained individually
from a number of users may be combined to give the complete picture.
The Context Diagram, sometimes called Level 0 Data Flow Diagram, is drawn using a
single process, appropriately labelled, to represent the entire system. All data flows
into and out of the system, to and from external sources and recipients, are shown
around the edge of the process.
The types of information flowing within the system will be affected by the nature of
the system being investigated. The current system may be entirely manual or it may be
partly computerised but requiring major enhancements. An existing computer system
may no longer meet users' requirements, or may no longer be supported (either by
external suppliers or internal IT resources). In some cases there may be no existing
system in place to support, for example, a new area of the business or new legislation,
in which case it may be necessary to investigate how other organisations have
addressed the issue.
All systems have both formal components, supported by set procedures and
structured information such as forms, records and files, and informal aspects,
which operate through intuition and judgement and use conversation and other
unstructured information. The unstructured as well as the structured information flows
need to be considered during investigation of the current system.
Levels
Most practical systems will have many hundreds of processes. Clearly, if these were
all shown on one data-flow diagram then we would hardly be any better off than if we
had used a single text document to describe the system. However, we do need to think
about all these processes and describe them. To resolve this problem, data-flow
diagrams can have a number of levels and the whole system is represented by a set of
levelled data-flow diagrams.
The top-most level is called the context diagram or level 0 diagram. On it there is a
single process represented by a plain box (with no divisions) representing the system
as a whole. The remainder of the diagram is all of the external entities and the data-
flows between them and the system. Thus, the diagram gives the context for the
system: it shows where the boundaries lie between the system and the rest of the world.
The next level which is simply called the level one diagram shows the major
processes of the system. All of the external entities are still shown but now the
processes which receive inputs and generate outputs are also shown and also the data-
flows between these processes. The level one diagram is said to expand the process in
the context diagram.
Just as the level one diagram expands the context diagram so a second level diagram
expands a process in the level one diagram. The only difference is that the second
level diagram has a box around it which looks like the original process box but the
lower section is big enough to draw a diagram in. The inputs and outputs to the
process become inputs and outputs to the new diagram. You show the processes and
data stores supplying the inputs or receiving the outputs outside of the box for the
process.
Other than that, a level two diagram looks just like a level-one data-flow diagram with
data-stores, processes and data-flows between them. Also like a level one, data-flows
cannot flow directly to or from a data store to outside of the process.
Because each process in the level one diagram may be expanded to a level two
diagram, level two in fact consists of a set of diagrams.
The remaining levels work in the same way. Each level is a set of diagrams expanding
the processes in a diagram on the previous level - a third level diagram expands a
process on a second level diagram, a fourth level diagram expands a process on a third
level diagram and so on. The levels stop when the analyst feels that all processes have
been described in sufficient detail. Note: different processes can be expanded to
different levels. On the level one diagram, there may be one process which does not
need any expansion whilst another one may be very complex and is expanded through
several levels of diagrams.
Potential buyers complete a similar type of card which is filed by buyer name in an A4
binder.
Weekly, the estate agent matches the potential buyer requirements with the available
properties and sends them the details of selected properties.
When a sale is completed, the buyer confirms that the contracts have been exchanged,
client details are removed from the property file, and an invoice is sent to the client.
The client receives the top copy of a three part set, with the other two copies being
filed.
On receipt of the payment the invoice copies are stamped and archived. Invoices are
checked on a monthly basis and for those accounts not settled within two months a
reminder (the third copy of the invoice) is sent to the client.
External entities must provide inputs or receive outputs. There are usually one or two
which stand out as obviously interacting with the system but not being part of the
system. In the Estate Agent system, Client and Buyer stand out as good candidates for
external entities. Others may be harder to spot but once again consider nouns in the
case study and add them to a list of possible external entities.
From the list of candidates for external entities, determine what inputs they provide
and what outputs they receive. If a candidate entity does not seem to provide data into
the system or receive data from the system then it is not an external entity and can be
discounted (for now).
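The filtering of candidates described above can be sketched as follows. This is a rough Python illustration; the function name split_candidates is hypothetical, and the flow labels are paraphrased from the Estate Agency case study text rather than taken from a finished diagram.

```python
def split_candidates(candidates, inputs, outputs):
    """Separate candidates that exchange data with the system from those that do not."""
    kept, discounted = [], []
    for name in candidates:
        if inputs.get(name) or outputs.get(name):
            kept.append(name)
        else:
            discounted.append(name)
    return kept, discounted


# Nouns taken from the case study text.
candidates = ["Client", "Buyer", "A4 binder"]
# Data each candidate provides to the system (illustrative labels).
inputs = {
    "Client": ["Property Details"],
    "Buyer": ["Exchange of contracts confirmation"],
}
# Data each candidate receives from the system (illustrative labels).
outputs = {
    "Client": ["Invoice"],
    "Buyer": ["Details of selected properties"],
}

kept, discounted = split_candidates(candidates, inputs, outputs)
print("External entities:", kept)
print("Discounted for now:", discounted)
```

A discounted candidate is not necessarily irrelevant. The A4 binder, for instance, neither sends data in nor takes data out, so it is not an external entity, but it may still belong on the diagram as another kind of component.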
Just as for the logical data structure, an external entity stands for the type of thing
interacting with the system so all clients and all buyers are represented by the Client
and Buyer external entities.
Having identified the external entities there are two ways of progressing from here.
Both are equally sensible approaches and are covered in the next two subsections.
Follow inputs
Each input to the system must be received by a process. This gives us a natural way to
start building up the model.
First, take one of the more significant external entities and one of the main inputs it
provides. In our case a Client providing Property Details is a good place to start. Draw
a bubble for the entity, a data flow for the input and a process which receives the
input. From the case study, there should be something which suggests what happens when
this data comes in and this will be the name of the process. For Property Details, the
case study says that the estate agent enters the details on a card and files them. So the
process name should be either Record Details or Receive Details.
Every process must have at least one output so for the process in hand, consider what
the outputs must be and put labeled arrows on the diagram for the outputs. The data
must be changed by a process and so should have a different name from the input
data. Property Details are taken and recorded as a property on the file so the output
could be just something like Recorded Property or more simply Property.
Now start again only using this output as a new input. It must either go to another
process, to a data store or to an external entity. It should be clear from the case study
what happens.
If a new process is needed then do the same again. Find a sensible name for the
process using the case study, determine and label the outputs and then follow the
outputs.
If the data is stored then add a data store to the diagram, name it sensibly from the
case study and draw the output arrow going into the data store. This is what happens
to Property and so we add the data store Properties.
If the output is an output from the system then simply add the external entity which
receives the output.
When the data is finally output or comes to rest in a data store, go back and follow
any of the other outputs which may have been defined on the way. When they are
exhausted, choose a new input and follow that through in exactly the same way.
Follow events
Another way to approach building up a data-flow model is to consider what happens
in the system. The case study will outline a number of events, happenings. There must
be processes in the system which respond to these events or even make them happen.
Identify these processes and then add the data inputs which are used by the process
and determine the outputs.
For example, in the estate agent example, there is the phrase 'When a sale is
completed...'. This is an event - a sale is completed. From the case study, we see that
lots of things then happen: the buyer confirms exchange of contracts so this is an input
to some process; the client details are removed from the file and an invoice is sent out.
This is the process. A sensible name might be Record Sale or possibly Receive Sale
Confirmation. The data needed is the input from the Buyer and client details which are
on file. This must mean there is a data store somewhere on the diagram holding this
information. If there is not one there already then add it. And the output must be an
invoice to the Client.
From here on, the approach is the same as following inputs. For any new outputs,
work out where those outputs must go and if it is to a process follow them as if they
were inputs to the new process.
Most processes can be found in the case study using either technique of following
inputs or following events. However, some processes are related to temporal events
and so can only be found by following events.
As the name suggests, temporal events are events which occur at specific times. They
are not prompted to happen by the arrival of new data but rather because a certain
time has been reached. These events often appear in case studies beginning with
phrases such as 'Once a month...' or 'At the end of every day...'. However, once
these have been identified, producing the model by following this event is exactly the
same as for any other event.
In the estate agent system, there are two temporal events: there is a weekly matching
of potential buyers with properties; invoices and reminders are sent out on a monthly
basis.
Though time is the trigger for the processes carrying out temporal events, time is
generally not shown on the data-flow diagram. This is because the time aspect is often
just a practical implementation rather than a rigid necessity. For example, the matching
of buyers and properties at the estate agents need not be weekly. It is probably done
weekly so that it always gets done and also so that it does not interrupt the other daily
business. With an automated system, it may be possible to match buyers with
properties as soon as any new details on either arrive.
Where time is crucial to a process, say accounting done at the end of a financial year,
then this can be reflected in the name of the process. For example, 'Calculate end of
year profits'.
Fill in gaps
After building a model which handles each input or each event, it is worth going over
the processes defined so far.
For each process, ask the question 'Does this process have all the information it needs
to perform its task?'. For instance, if a process sends out invoices, does it have all the
details of the invoice and the address of where the invoice should go? If the answer is
'No' then add a data-flow into the process which consists of the data needed by the
process. If there are several, clearly distinct items of data needed then you may need
an arrow for each item. Now try to identify the source of the data.
First, see if the data can be found already inside the system either on a data store or as
a result of a process. If not, it may be that the data can be obtained by processing some
of the existing data in which case add a new process which takes the existing data and
makes the data you require. Or, the data may be available but from the case study it is
clear that there is a time-lag between the process that produces the data and the
process which uses it. Simply add a data store where the data can reside till it is
needed.
If there is still no source for the data then it could be from an external entity. In which
case, this is a new input to the system. It may not be explicitly mentioned in the case
study but if it is necessary then it should be added. Having added the new input from
the appropriate entity, go back and correct the context diagram.
This is an important task. If there is not enough data to support a task then the system
will not function properly. Of course, to be on the safe side you could have all the data
going to all the processes! But this is not really a solution because with a large system
this would not be practical.
Having checked over all the processes, check that all the outputs have been generated.
All of the inputs should have been covered already but this does not mean that all the
outputs have been produced. If there is still an output which does not appear on the
diagram, try to see if there is a process where it could come from. If there is no
sensible candidate, add a process and begin to work backwards. What inputs does the
new process need? Where do these inputs come from? This task is almost the same as
the one just described.
Any left over outputs must have come from a process. Outputs cannot come from data
stores or external entities. If there is no sensible way to fit the output into the diagram
then it may be that it is not a sensible output for the system you are currently
considering. Use the case study to confirm this.
Finally, check the data stores. Data must get onto a data store somehow and generally
data on a data store is read. For each data store, identify when the store is either
written to or read by considering the processes which may use the data. Also, use the
case study to see that you have not missed any arrows to or from a data store.
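Part of this gap check is mechanical: look for processes and data stores that lack an incoming or an outgoing data flow. The Python sketch below covers only that part. The name find_gaps is hypothetical, and the example flows are an assumed first draft of the Estate Agent model in which the Properties data store has been written to but is not yet read.

```python
from collections import defaultdict
from typing import NamedTuple


class Flow(NamedTuple):
    source: str  # component the data leaves
    target: str  # component the data enters
    label: str   # description of the data


def find_gaps(kinds: dict[str, str], flows: list[Flow]) -> list[str]:
    """List processes and data stores lacking an incoming or outgoing flow."""
    incoming: dict[str, int] = defaultdict(int)
    outgoing: dict[str, int] = defaultdict(int)
    for f in flows:
        outgoing[f.source] += 1
        incoming[f.target] += 1

    gaps = []
    for name, kind in kinds.items():
        if kind == "external entity":
            continue  # an entity may be only a source or only a sink
        if incoming[name] == 0:
            gaps.append(f"{kind} '{name}' has no incoming data flow")
        if outgoing[name] == 0:
            gaps.append(f"{kind} '{name}' has no outgoing data flow")
    return gaps


kinds = {
    "Client": "external entity",
    "Record Details": "process",
    "Properties": "data store",
}
flows = [
    Flow("Client", "Record Details", "Property Details"),
    Flow("Record Details", "Properties", "Property"),
]

for gap in find_gaps(kinds, flows):
    print(gap)
```

Each reported gap is a prompt to reread the case study, not an automatic error. Here the missing read of Properties points to a process that has not been modelled yet, such as the weekly matching of buyers with properties.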
Repeat
By this stage, you will have considered all the inputs, all the outputs and produced a
first draft of the data-flow model of the system.
Review the case study, looking for functionality described which is not performed by
the model. In particular, look for temporal events as these are sometimes hidden
implicitly in the text.
Where necessary, add new processes that perform the omitted functions and use the
method of following events to work out their inputs and outputs. Fill in the gaps of the
model in exactly the same way as was done to produce the first attempt.
The model can be declared finished when you have considered every word in the case
study and decided that it is not relevant or that it is incorporated in some way into the
model!
Making levels
For all systems, it is useful to make at least two levels: the context diagram and the
level one diagram. In fact, when in the earlier description of how to create DFDs you
were told to start by identifying the external entities and then to identify the inputs and
outputs of the system, you were learning how to produce the context diagram. The rest
of the description was how to produce the level one diagram.
Whenever you perform data flow modeling start in exactly this way, producing a
context diagram and then a level one diagram. Of course, in producing the level one
diagram you may realize you need more inputs and outputs and possibly even more
external entities. In this case, simply add the new data-flows and the new entities to
the level one diagram and then go back and add them to the context diagram so that
both diagrams still balance. Conversely, you may realize that some of the inputs and
outputs you originally identified are not relevant to the system. Remove them from the
level one diagram and then go back to the context diagram and make it balance by
removing the same inputs and outputs.
This constant balancing between diagrams is very common when doing leveling.
What about making more levels? There are two reasons for making more levels. The
first is the obvious one - you, as the analyst, have not fully described a process to your
satisfaction so you expand that process into a next level diagram. The new diagram is
built in just the same way that a level one is built from a context diagram, except that its
inputs and outputs are precisely the data flows to and from the process you are
expanding.
The second reason is that you realize the diagram you are working on is becoming
cluttered and unclear. To simplify the diagram, collect together a few of the processes.
Ideally, these processes should be related in some way. Replace them with a single
process and treat the original collection of processes as a lower level expanding the
new process. The inputs and outputs to the new process are whatever inputs and
outputs are needed to make the diagrams balance. Remember to re-number the
old processes to show that they have been moved down a level.
When doing this, if there is a data-store which interacts with these processes and only
these processes then this too can be put on the lower level diagram.
Do not group random processes together to make a lower level diagram. This will
only end up in a tangle of arrows and unrelated processes. A good guide as to whether
or not you have chosen a sensible collection is to try and come up with a new name
for the replacement process. If you cannot do this then you have probably made too
general a grouping. Perhaps leave out one or two processes or try a different grouping.
Always bear in mind that levelling is meant to simplify and clarify the diagrams and if
this cannot be done then it may be best to leave the diagram as it is.
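The grouping step can be sketched in Python. This is a minimal illustration only: the `Flow` tuple, the `boundary_flows` helper and the process names are hypothetical and do not come from any CASE tool. It shows that the inputs and outputs of the replacement process are exactly the flows that cross the boundary of the group, while flows between the grouped processes move down to the lower level diagram.

```python
from typing import NamedTuple


class Flow(NamedTuple):
    """A named data flow between two diagram objects (hypothetical model)."""
    name: str
    source: str
    target: str


def boundary_flows(flows, group):
    """Split flows into inputs to, outputs from, and internal flows of a group."""
    inputs, outputs, internal = [], [], []
    for flow in flows:
        src_in, dst_in = flow.source in group, flow.target in group
        if src_in and dst_in:
            internal.append(flow)   # stays on the lower level diagram
        elif dst_in:
            inputs.append(flow)     # input to the replacement process
        elif src_in:
            outputs.append(flow)    # output from the replacement process
    return inputs, outputs, internal


# Illustrative flows only; the names are made up for this sketch.
flows = [
    Flow("order", "Customer", "Check order"),
    Flow("checked order", "Check order", "Price order"),
    Flow("invoice", "Price order", "Customer"),
    Flow("stock request", "Price order", "Update stock"),
]
inputs, outputs, internal = boundary_flows(flows, {"Check order", "Price order"})
print([f.name for f in inputs])    # flows entering the group
print([f.name for f in outputs])   # flows leaving the group
print([f.name for f in internal])  # flows hidden inside the group
```

If the list of boundary flows is long and unrelated, that is a hint that the grouping is too general, which matches the naming test described above.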
Balancing
The key to successful leveling is to make the diagrams balance. For example, if a
second level diagram expands a first level process then all the inputs to the process
must be inputs to the second level diagram and all the outputs from the process must
be outputs on the second level diagram. Moreover, there must be no other inputs and
outputs. In particular, all the inputs and outputs of the system which appear on the
context diagram must appear on the level one diagram and there should be no other
inputs and outputs on the level one diagram.
This does not mean there can be no changes to the higher levels of a set of diagrams.
When producing a lower level diagram, the analyst may realize that a new input is
needed for the process to be able to carry out its task. In that case, the analyst
should add this data-flow as an input and then add the input as a data-flow to the
original process. If need be, this input may be added at several levels higher up. The
analyst may add new outputs in the same way.
As long as the diagrams always balance, inputs and outputs can be added and removed
wherever necessary.
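The balancing rule can be expressed as a small check. The following Python sketch assumes each diagram is reduced to the names of its input and output data flows; the function name `check_balance` and the example flow names are hypothetical, chosen only to illustrate the rule.

```python
def check_balance(parent_inputs, parent_outputs, child_inputs, child_outputs):
    """Return a list of imbalance messages; an empty list means balanced."""
    problems = []
    for label, parent, child in (
        ("input", set(parent_inputs), set(child_inputs)),
        ("output", set(parent_outputs), set(child_outputs)),
    ):
        # Every flow on the parent process must appear on the child diagram.
        for name in sorted(parent - child):
            problems.append(f"{label} '{name}' is missing from the child diagram")
        # The child diagram must not introduce flows the parent lacks.
        for name in sorted(child - parent):
            problems.append(f"{label} '{name}' does not appear on the parent diagram")
    return problems


# The analyst adds a new input on the lower level diagram (illustrative names).
before = check_balance(["order"], ["confirmation"],
                       ["order", "credit rating"], ["confirmation"])
# The same input is then added to the original process, restoring balance.
after = check_balance(["order", "credit rating"], ["confirmation"],
                      ["order", "credit rating"], ["confirmation"])
print(before)
print(after)
```

Running the check after every change to either diagram is one way to keep a leveled set balanced as it evolves.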
Numbering
Numbering in a leveled set of diagrams is important as the numbers help you to find
your way around the levels. It is most easily described by example. Suppose
Receive Order is the process numbered 3 on the level one diagram (remember,
numbers do not indicate any order; they are simply labels) and this is expanded to a
level two diagram. The process numbers on the level two diagram will be 3.1, 3.2, 3.3
and so on. Suppose now that process 3.4 on the level two diagram is Register New
Customer and needs further expansion to a level three diagram. The process numbers
on this diagram will be 3.4.1, 3.4.2, 3.4.3 and so on. Basically, the rule is: if X is the
number of the process you wish to expand, then the numbers on the next level are X.1,
X.2, X.3 and so on.
The same applies for data stores. Data stores that appear in a level two diagram
expanding a process labeled 4 in the level one diagram will be numbered D4.1, D4.2,
D4.3 and so on. Deeper levels will be D4.1.1, D4.1.2, the numbering scheme being
just the same as for processes.
Note though, it is not the data stores that are expanded. They may simply appear in the
expansion of a process.
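The numbering rule is mechanical enough to write down as code. The Python sketch below is illustrative only; the helper names `child_process_ids`, `child_store_ids` and `level_of` are hypothetical.

```python
def child_process_ids(parent_id, count):
    """Numbers for the processes on the diagram that expands process parent_id."""
    return [f"{parent_id}.{i}" for i in range(1, count + 1)]


def child_store_ids(parent_id, count):
    """Numbers for data stores that appear in the expansion of process parent_id."""
    return [f"D{parent_id}.{i}" for i in range(1, count + 1)]


def level_of(process_id):
    """Diagram level on which a process number appears (3 -> 1, 3.4 -> 2)."""
    return process_id.count(".") + 1


print(child_process_ids("3", 3))    # expansion of process 3
print(child_process_ids("3.4", 3))  # expansion of process 3.4
print(child_store_ids("4", 3))      # data stores in the expansion of process 4
print(level_of("3.4.1"))
```

Because a number encodes the whole path from the level one diagram, the parent of any process can be found by dropping the last component of its number.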
Process Descriptions
An analyst may define a process where no further expansion is appropriate because
there are no separate sub-processes which may make up the original process.
However, the analyst may still wish to describe the process in more detail as it is a
particularly difficult or tricky process. In this case, the analyst writes down a process
description for the process. This can take any form which the analyst thinks
appropriate. Traditional flowcharts could be used or plain English. More common is
what is called structured English. This looks like English, only it is written more like a
computer language. It is used to avoid the problem that different people reading the
same piece of plain English may understand different things.
We will not cover writing process descriptions in great detail but you should at least
know what one is.
Validation
It should be clear that producing data-flow diagrams can be complicated. It is very
easy in all of this to make simple mistakes. A routine check using the following
questions should make sure that you find these. The first set of questions refers to a
single diagram so if you have a set of leveled data-flow diagrams then you need to
make these checks for each diagram.
1. Is every data-flow attached to a process at either the beginning or the end of the
arrow?
3. Does every process have at least one input and at least one output?
4. Is every process named sensibly (no uses of words like process or handle)
with an action and what is acted upon? (The template is Do something to
something)
5. Is every data store named with the type of thing it stores in the plural?
6. Where data stores and external entities have been shown several times on one
diagram, do all instances have a diagonal line?
7. Are there any data-flows which cross? If so, try adding further instances of
external entities or data stores to avoid the crossing.
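Some of these questions can be checked automatically. The Python sketch below covers only the questions about data-flows being attached to a process and about every process having an input and an output. It assumes a diagram is given as a set of process names and a list of (name, source, target) flows; `validate_diagram` and the example names are hypothetical.

```python
def validate_diagram(processes, flows):
    """Check the attachment and input/output rules for one diagram.

    processes: set of process names on the diagram
    flows: list of (flow name, source object, target object) tuples
    """
    problems = []
    for name, source, target in flows:
        # A data-flow must have a process at one end or the other.
        if source not in processes and target not in processes:
            problems.append(f"data flow '{name}' is not attached to a process")
    for process in sorted(processes):
        if not any(target == process for _, _, target in flows):
            problems.append(f"process '{process}' has no input")
        if not any(source == process for _, source, _ in flows):
            problems.append(f"process '{process}' has no output")
    return problems


# Illustrative diagram with two deliberate mistakes.
processes = {"Validate customer"}
flows = [
    ("membership card", "Customer", "Validate customer"),
    ("customer copy", "Customer", "Customer file"),
]
problems = validate_diagram(processes, flows)
for problem in problems:
    print(problem)
```

The naming and layout questions still need a human reader; a sketch like this only catches the structural mistakes.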
This second set of questions is specifically about leveling and so should be asked
about the set of diagrams as a whole.
Are all external entities shown on both the context diagram and level one
diagram?
All data-flow diagrams are an aid to communication between the analyst and those who read the diagrams. Although
they may be correct and accurate, a messy or tangled data-flow model will reduce
communication as surely as a long-winded text description. To avoid this, as the
diagrams evolve, re-draw them whenever they begin to get cluttered or have several
corrections on them. A simple re-arrangement of the components may be sufficient to
greatly improve a diagram.
All symbols for DFDs & text boxes in the right places
automated levelling of diagrams (so, for example, any process from a level 1
DFD can be exploded into a level 2 DFD with all required data flows and
shared data stores)
You will need to provide a name for the title of the project (here we have entered DFD
01), and you need to state the directory in which the project files will be stored (here
we have stated the directory c:\temp\dfd01).
If you are creating the project in a new directory (recommended) you'll probably get a
dialog box asking you to explicitly create the directory:
An INFO dialog will then be presented to confirm that the new project files have been
successfully created in your chosen directory:
You will then be presented with a choice of the kind of DFD you wish to create:
At this point we are creating a context diagram, so choose the top option Context
Diagram.
SELECT SSADM will create a new context diagram for you, automatically creating a
single process on your new diagram labelled System context:
Right button click to get the popup menu: Edit... Add > Links > Child
Diagram > Usage... Delete
The Context Node Editor dialog will then be displayed, in which you can
change the name from System Context to Video Rental LTD:
You should now have a context diagram with the process (context node) Video
Rental LTD:
Adding an external entity to a context diagram DFD
To add an external entity to a context diagram, you first need to have created a new
context diagram (or opened an existing one). You need to right-button click on the
white background of the diagram (i.e. don't click over a process box or other diagram
object) to get the following popup menu: External entity > Free format text > Free
format box. Choose the first of these items, External entity. You will now be
offered a dialog in which to enter the name of the external entity:
In this dialog enter an appropriate entity name (we entered Customer at this
point). You should now have a context diagram with this new external entity:
Choose ID from this menu, and enter an ID letter (or letters) (here we entered the ID
c):
The entity should now appear with this ID in the upper part of the ellipse on the
diagram:
If you answer YES the item will be deleted from the diagram.
Adding a data flow to a context diagram DFD
Before adding a data flow, ensure the system process box and the external entity have
been created on the diagram first:
Select the source of the data flow, so we shall select the Customer external entity:
Now right button click over this entity to get a popup menu: Edit... Add > ID, Links
> Usage... Delete. From the Add > option choose Data Flow. You will now have
a data flow from the entity rubber banding (shown by the dotted line) wherever you
move the mouse.
To make the data flow go into the system, click the system process box. You will then
be prompted to enter the name for this data flow (here we entered payment):
To create a dashed box just right button click on the white background of the diagram,
to get the following popup menu: External entity, Free format text, Free format
box. From this menu choose Free Format Box.
When the box is selected you will see sizing and editing points displayed at each
corner, and in the middle of each side:
You can now use the mouse to move each corner or side into an appropriate position
to show the boundary:
Select and right click on the system process box to get the following popup
menu: Edit... Add > Links > Child diagram > Usage... Delete. From this menu
choose Child diagram > DFD > Go to. If no child diagram (level 1 DFD) has been
created previously for this process, the following QUESTION dialog will be
displayed:
Since we are creating a new diagram, choose New from this dialog.
You will then be asked to state a file name (we chose dfd_02) and the type of
diagram. The diagram type should be DFD:
You will then be asked to choose a title for the new diagram (here we chose Level 1
DFD):
Select SSADM may then take a few seconds to add all the data flows and external
entities onto the lower level diagram.
Once it has finished Select SSADM will present to you the new child diagram, which
will have each data flow and external entity already added to it:
Right button click somewhere on the white background of the diagram (i.e. not on a
diagram object), to get the following popup menu: Process, Data Store, Manual
Store, Transient Data, Transient Manual, Resource Store, External Entity, Free
Format Text, Free Format Box. From this menu choose Process. You will then be
asked for the name of this new process (we entered create new customer):
The new process box should appear on your diagram. The ID of the process will have
been automatically numbered, so if it is the first process to be added, it will be numbered
1:
Note that the ID of the process can be changed by right clicking the item and choosing
ID, and editing the number (in the same way the IDs of external entities can be
changed):
Select the data flow you wish to connect to the process. When the data flow is selected
you will see sizing and editing points displayed at each corner, and in the middle of
each side:
To make one end of a data flow become connected to a process box, use the mouse to
drag and draw the end of the data flow somewhere over the process box. The data
flow arrow will rubber band as a dotted line following your mouse:
Select SSADM should then redraw the diagram with the data flow connected to the
process box:
This technique works both with data flows into a process (where the arrow end is
dropped into the process box), and with data flows out of a process (where the non-
arrow end of the data flow is dropped into the process box):
In many cases this can lead to a more complicated (busy) diagram than necessary
although there are times when having an entity notated in more than one place on a
diagram can reduce crossed data flows and aid clarity.
In the same way that a loose end of a data flow can be attached to a process (see
earlier section), so too can the end of a data flow attached to an entity be moved, and
attached to another diagram object, such as another instance of the same entity.
First the data flow in question must be connected to its appropriate diagram object. In
this example there are two data flows from the Customer external entity, both flowing
into the process create new customer. After each data flow has been connected to the
process the diagram looks as follows:
We shall now make the payment data flow leave from the same instance of the
Customer external entity as personal details.
The payment data flow is selected, and the sizing and editing points are shown:
Now the source end of the data flow (where it is connected to the Customer external
entity) needs to be dragged and dropped into the desired instance of the Customer
entity (the one above). The data flow will rubber band as a dotted line when being
moved in this fashion:
Once dropped both data flows should now be attached to the same instance of the
external entity:
The other instance of the Customer external entity can now be deleted from the
diagram:
Right click the background of the diagram (i.e. not over an item) to get the following
popup menu: Process, Data Store, Manual Store, Transient Data, Transient
Manual, Resource Store, External Entity, Free Format Text, Free Format Box.
From this menu choose Data Store. You will then be asked the name of the data store
to create (here we have entered customer file):
Data flows are created and linked to data stores in just the same way as to processes
and external entities.
Note that the ID of the data store is created automatically, but can be changed by right
clicking the item and choosing ID.
First select the process you wish to decompose (here we have selected process 2 loan
of video):
Then right button click to get the following popup menu: Edit... Add > Links > Child
diagram > Usage... Delete. From this menu choose Child diagram > DFD > Go to.
You will then be asked to create a new diagram (unless the process has previously
been decomposed):
Choosing NEW will result in a dialog where you have to choose a file name (we
chose p2_lev2) and the type of diagram (choose DFD):
Select SSADM will then take a few seconds to process all data flows into and out of
the process, and will create a diagram with all those data flows already created for
you:
Processes, data stores and new data flows can be created in this diagram in the normal
way:
As you may have noticed, these diagrams are not balanced for the following reasons:
An extra data flow has been added in the level 2 DFD of loan item recorded
into data store D2: Stock File
An error has been introduced in the name of a copy of the data flow overdue
items as overdueitem from data store D2: Stock File to process 2.1: process
loan request.
To check the consistency of child-diagrams, choose from the menus: File | Check
or click on the shortcut tick button. The result of running the consistency checker is an
error log as follows:
The full contents of the error log for the level 1 and level 2 DFDs is:
Project: C:\TEMP\DFD02\
Title : DFDs
Date: 14-Feb-00 Time: 18:44
Checking P2_LEV2.DAT
Error 8412 IO imbalance.
returned item details [Data flow] to
stock file [Data store]
does not appear on parent diagram.
Error 8412 IO imbalance.
overdueitem [Data flow] from
stock file [Data store]
does not appear on parent diagram.
ERROR(s) DETECTED, No Warnings given.
---- End of report ----
We can add a new returned item details data flow on the parent level 1 DFD to
correct the first error.
We wish to replace the data flow overdueitem with one labelled overdue
items for the second error.
A problem occurs if you attempt to change the name of a data flow to the name of
some other data flow. Editing the overdueitem dataflow and simply changing its
name to overdue items results in an error:
To get around this problem, we need to delete the incorrectly spelt data flow
overdueitem, and to create a new data flow overdue items on this diagram.
If we now run the consistency checker again, we get a report showing that no errors
have been identified:
Project: C:\TEMP\DFD02\
Title : DFDs
Date: 14-Feb-00 Time: 18:53
Checking P2_LEV2.DAT
No Errors detected, No Warnings given.
---- End of report ----
You should always run the consistency checker for every diagram when you believe
the diagram is completed, and for all parent and child diagrams whenever changes are
made to diagrams.
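A misspelt flow name such as overdueitem shows up in the checker only as an imbalance. As a hedged illustration of how such near misses might be spotted, the Python sketch below compares a child flow name against the parent's flow names using the standard library's `difflib`. This is not how Select SSADM works; the helper `suggest_parent_name`, the 0.8 cutoff and the list of parent flows are assumptions made for the example.

```python
import difflib


def suggest_parent_name(child_flow, parent_flows, cutoff=0.8):
    """Return the parent flow name closest to child_flow, or None if none is close."""
    matches = difflib.get_close_matches(child_flow, parent_flows, n=1, cutoff=cutoff)
    return matches[0] if matches else None


# Illustrative parent flow names; only 'overdue items' comes from the example above.
parent_flows = ["overdue items", "loan request"]

likely_typo = suggest_parent_name("overdueitem", parent_flows)
new_flow = suggest_parent_name("returned item details", parent_flows)
print(likely_typo)  # a close match suggests a naming error
print(new_flow)     # no close match suggests a genuinely new data flow
```

A close match points to a naming error that should be fixed on the child diagram, whereas no match suggests a genuinely new flow that must be added to the parent.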
A discussion on this review question can be found at the end of this chapter.
Review Question 5
Describe each of the main elements of Data Flow Diagrams.
A discussion on this review question can be found at the end of this chapter.
Review Question 6
Describe 2 of the points at which Data Flow Diagrams are used during systems
analysis.
A discussion on this review question can be found at the end of this chapter.
Discussion Topic 1
The details of any level 2 or lower DFD could be displayed in a level 1 DFD, so really
there is no reason not to model the entire system in a single level 1 DFD and avoid all
the problems of balancing and hierarchical process numbering and so on.
Suggested contribution for this discussion can be found at the end of this chapter.
Discussion Topic 2
There is no facility in the Data Flow Modelling technique to model the order in which
processes occur and data flows. When creating an information system such time-based
aspects of a system are just as important as the processes and data themselves.
Why do you think that such a feature has not been created as part of Data Flow Diagrams,
and how can system designers get around this omission?
Suggested contribution for this discussion can be found at the end of this chapter.
Answers and Discussions
Discussion of Exercise 1
Decomposition divides complex information into manageable chunks using a
hierarchical tree structure. An overview of the problem is presented at the top level of
the structure, while lower levels provide increasing depth of detail for narrower areas
of the problem.
Discussion of Exercise 2
Current System Physical model: the physical processes and data flows and data
stores of the current system may be modelled with DFDs (e.g. forms, pieces of paper,
physical files and filing systems etc.)
Current System Logical model: the logical processes and data flows and data stores
of the current system may be modelled with DFDs (e.g. logical actions, logical
collections of data, logical packages of information flowing etc.)
Required System Logical model: the logical processes and data flows and data
stores of the required system may be modelled with DFDs as part of the specification
of the required system
Required System Physical model: the physical processes and data flows and data
stores of the required system may be modelled with DFDs as part of the design for the
required system
Discussion of Exercise 3
There are 2 external entities shown in the above diagram (as ovals):
Discussion of Exercise 4
There are 3 data flows shown in the above diagram (as named arrows):
Discussion of Exercise 5
There is just one process in the above diagram (a rectangle with three parts) - Video-
Rental LTD
Discussion of Exercise 6
There are no data stores in the above diagram (rectangles with two parts)
Discussion of Exercise 7
The top left part of a process rectangle is the process number. For context diagrams, if
any number at all is used, it is usually zero. The zero indicates that this is the whole
system, whereas in lower level DFDs numbers like 1 and 3 indicate sub-processes of
the whole system. This will become more clear when you have progressed to
understanding and creating hierarchical, leveled diagrams.
Discussion of Exercise 8
A Context diagram is the first DFD to be created for a system. It represents a model of
the system as a whole (i.e. as a single process) and this system's interactions with
external entities that are outside the boundaries of the system, but which provide
inputs to, and receive the outputs of the system being modeled.
Context diagrams show all external entities with which the system exchanges data flows.
Discussion of Exercise 9
Functional decomposition is the breaking down of higher level processes into their
component sub-processes, data flows and data stores as lower level DFDs.
A process should be decomposed whenever there is some detailed aspect of the
system that is not modeled by the process description alone,
i.e. when a lower level DFD provides something more to the systems analyst, such as
sub-processes, additional data stores, and data flows that are used only for the process
and which have not been modeled at the higher level DFD.
Discussion of Exercise 10
Identify data flows by listing the major documents and information flows associated
with the system.
You may find the following kind of table useful:
From the case study we can underline all potential data flows INTO AND OUT OF
THE SYSTEM. At this point look for any possible data flows; we can change our
minds at any time in the process of creating a context diagram. We are not worried
about data flows that seem to be within the system at present, so the sender and
receiver should always be either an external entity, or the system itself.
Video-Rental LTD is a small video rental store. The store lends videos to customers
for a fee, and purchases its videos from a local supplier.
A customer wishing to borrow a video provides the empty box of the video they
desire, their membership card, and payment; payment is always with the credit card
used to open the customer account. The customer then returns the video to the store
after watching it.
If a loaned video is overdue by a day the customer's credit card is charged, and a
reminder letter is sent to them. Each day after that a further charge is made, and each
week a reminder letter is sent. This continues until either the customer returns the
video, or the charges are equal to the cost of replacing the video.
New customers fill out a form with their personal details and credit card details, and
the counter staff give the new customer a membership card. Each new customer's
form is added to the customer file.
The local video supplier sends a list of available titles to Video-Rental LTD, who
decide whether to send them an order and payment. If an order is sent then the
supplier sends the requested videos to the store. For each new video a new stock form
is completed and placed in the stock file.
video by customer when joining the store: this is a strong candidate data
flow, though we might name it 'video loan' or 'details of loaned video'
credit card charge by system: this is a strong candidate data flow, but in
fact we have already identified a payment by the customer (when renting a
video) and we could just consider this to be another example of customer
payment (for simplicity, although alternatively we could consider this a
separate data flow; the decision could be influenced by the sophistication of the
system's processing of payments, and might be delayed until more detailed
DFDs are produced later in the analysis procedure)
overdue reminder letter from system: this is a strong candidate data flow
list of available titles from supplier: this is a strong candidate data flow
the requested videos from supplier: this is a strong candidate data flow,
although might be called something like 'videos purchased'
stock form: this last data flow is within the system, so this will not be used
in the context diagram but will probably appear in a more detailed DFD later
o customer
Draw and label the external entities around the outside of the process box.
We just need to add external entity symbols for 'customer' and 'supplier'.
Add the data flows between the external entities and the system box
We can do a quick check when we have created the diagram by counting the
number of flows out of, and into each entity.
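The quick check of counting flows per entity can be sketched in Python. The table below is only an illustrative subset of the data flows named in the case study, not the complete answer, and the `flow_counts` helper and the `SYSTEM` marker are hypothetical names for this example.

```python
from collections import Counter

SYSTEM = "system"

# (data flow, sender, receiver): an illustrative subset of the case study flows.
flows = [
    ("payment", "customer", SYSTEM),
    ("overdue reminder letter", SYSTEM, "customer"),
    ("list of available titles", "supplier", SYSTEM),
    ("order and payment", SYSTEM, "supplier"),
    ("videos purchased", "supplier", SYSTEM),
]


def flow_counts(flows):
    """Map each external entity to (flows out of it, flows into it)."""
    sent, received = Counter(), Counter()
    for _name, sender, receiver in flows:
        if sender != SYSTEM:
            sent[sender] += 1
        if receiver != SYSTEM:
            received[receiver] += 1
    entities = sorted(set(sent) | set(received))
    return {entity: (sent[entity], received[entity]) for entity in entities}


counts = flow_counts(flows)
for entity, (out_of, into) in counts.items():
    print(f"{entity}: {out_of} flow(s) out of the entity, {into} flow(s) into it")
```

Comparing these counts with the arrows drawn at each entity on the diagram shows quickly whether a flow from the table has been left off.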
Discussion of Exercise 11
There are 3 data flows shown in the above diagram (as named arrows):
Loan of video
Stock control
Discussion of Exercise 12
There are 2 data stores:
Stock file
Stock file
Discussion of Exercise 13
First we start with the context diagram, since all external entities and data flows on
this diagram must appear on our Level 1 DFD:
We can now create an 'empty' Level 1 DFD with these entities and data flows:
Identify processes. Each data flow into the system must be received by a
process. Each process must have at least one output data flow. Each output data
flow of the system must have been sent by a process.
Now we need to identify the recipient and sending processes of the system for
each data flow. We need to replace with a system process each occurrence of
'system' as the sender or recipient in the table of data flows created previously.
Draw the data flows between the external entities and processes. After
creating process boxes and drawing the data flows the diagram looks as
follows:
Add data flows flowing between processes and data stores within the
system. Each data store must have at least one input data flow and one output
data flow.
We can create a table to indicate which processes send and receive data from
each data store:
Data store Data flow IN FROM data flow OUT TO
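A table of this shape can also be built from a list of data flows, which makes the rule that every data store needs at least one input and one output easy to check. The Python sketch below is illustrative: the helpers `store_table` and `unbalanced_stores` are hypothetical, and the flow names are invented for the example, although the process and data store names are taken from the case study.

```python
def store_table(stores, flows):
    """Build, for each data store, the lists of senders (IN FROM) and receivers (OUT TO)."""
    table = {store: {"in_from": [], "out_to": []} for store in stores}
    for _name, source, target in flows:
        if target in table:
            table[target]["in_from"].append(source)
        if source in table:
            table[source]["out_to"].append(target)
    return table


def unbalanced_stores(table):
    """Data stores that lack an input data flow or an output data flow."""
    return sorted(
        store for store, row in table.items()
        if not row["in_from"] or not row["out_to"]
    )


# Illustrative flows; the flow names are assumptions made for this sketch.
stores = ["customer file", "stock file"]
flows = [
    ("new customer details", "create new customer", "customer file"),
    ("customer details", "customer file", "loan of video"),
    ("new stock details", "stock control", "stock file"),
]
table = store_table(stores, flows)
missing = unbalanced_stores(table)
print(table["customer file"])
print(missing)  # stores that still need an input or an output data flow
```

Any store the check reports is a prompt to look for a missing data flow, in the same way that the extra data flows were found in this exercise.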
Apart from these extra two data flows the diagram appears to be correct.
So our Level 1 DFD for the Video Rental case study is now:
Discussion of Exercise 14
Make the process box on the Level 1 diagram the system boundary on the Level 2
diagram that decomposes it.
Identify the processes inside the Level 2 system boundary and draw these processes
and their data flows.
For each data flow into and out of the process for which this Level 2 diagram is being
created we need to identify an appropriate sub-process to receive and send the data
flows. The following table lists each data flow and suggests a suitable sub-process to
receive/send the data flow:
Data flow Sender Receiver
Adding these processes and data flows to the diagram we get the following:
Identify any data stores that exist entirely within the Level 2 boundary, and draw these
data stores: For this example there don't appear to be any local data stores.
Identify data flows between the processes and data stores that are entirely within the
Level 2 system boundary: Since there are no local data stores, there are no data flows
between processes and data stores to be added.
Check the diagram: Upon checking the diagram, we find that the process validate
customer has no output data flows. Looking more closely we see that a plausible data
flow out of validate customer would be something like loan permission.
Upon adding this new data flow the diagram looks as follows:
Discussion of Review Question 1
Identify data flows by listing the major documents and information flows
associated with the system.
You may find the following kind of table useful:
From the case study we can underline all potential data flows INTO and OUT
OF THE SYSTEM. At this point look for any possible data flows; we can
change our minds at any time in the process of creating a context diagram. We
are not worried about data flows that seem to be within the system at present,
so the sender and receiver should always be either an external entity, or the
system itself.
Clients wishing to put their property on the market visit the estate agent, who
will take details of their house, flat or bungalow and enter them on a card which
is filed according to the area, price range and type of property.
Note
Potential buyers complete a similar type of card which is filed by buyer name
in an A4 binder.
Weekly, the estate agent matches the potential buyer requirements with the
available properties and sends them the details of selected properties.
When a sale is completed, the buyer confirms that the contracts have been
exchanged, client details are removed from the property file, and an invoice is
sent to the client. The client receives the top copy of a three part set, with the
other two copies being filed.
On receipt of the payment the invoice copies are stamped and archived.
Invoices are checked on a monthly basis and for those accounts not settled
within two months a reminder (the third copy of the invoice) is sent to the
client.
We can build a table of these data flows, and the senders and receivers of these
flows.
o the internal copies of the invoice - these data flows do not go outside the
system boundary so will not be part of this context diagram (but may
feature on a more detailed DFD later)
o the client details card is filed IN the system, so this internal data flow
will not feature on the context diagram
It is worth noting that the exchange of contracts between client and buyer is not
a data flow into or out of the system, but this data flow between external
entities is relevant so ought to be notated on the context diagram.
This step is easy if we have created a table like the above, since we can just
create a list of all the different entities: client, buyer.
Add the data flows between the external entities and the system box. We now
need to add the data flows identified earlier. Our context diagram looks as follows:
Draw the data flows between the external entities and processes. We can
now add these processes to the diagram, and connect the appropriate data
flows:
Add data flows flowing between processes and data stores within the system.
Each data store must have at least one input data flow and one output data flow
(otherwise data may be stored, and never used, or a store of data must have
come from nowhere!). Ensure every data store has input and output data flows
to system processes. Most processes are normally associated with at least one
data store.
We can create a table to indicate which processes send and receive data from
each data store:
Check diagram. We now can check the diagram for correctness, and find a
process that has no output data flow: 'archive sale'. An appropriate data flow,
into data store 'invoices' would be something like 'record of payment'. The
consistent and balanced Level 1 DFD now looks as follows:
However, there is another problem with the diagram: what causes the process
'invoice client' to send an invoice or reminder to the client? The only input to
the process 'invoice client' is a 'reminder' from the 'invoices' data store. The
answer is that there are two things that trigger this process to send a data flow
to the client:
Note
When a sale is completed, the buyer confirms that the contracts have been
exchanged, client details are removed from the property file, and an invoice is
sent to the client.
This must mean that the buyer informs the system that the sale is complete, so
we must create a new data flow from 'buyer' to 'invoice client' called something
like 'confirmation of sale'. (NOTE: Since we are adding a new data flow
between the system and the external entities, we shall have to update the parent
diagram; if we forget, we will be reminded by any CASE tool consistency
checker).
We also notice there should be a data flow of 'client to delete' from process
'invoice client' to the data store 'property file'.
We should start with the Level 1 DFD, and create an 'empty' Level 2 DFD with
all the same external entities and data flows as the invoice client process.
Identify the processes inside the Level 2 system boundary and draw these
processes and their data flows.
For each data flow into and out of the process for which this Level 2 diagram is
being created we need to identify an appropriate sub-process to receive and
send the data flows. The following table lists each data flow and suggests a
suitable sub-process to receive/send the data flow:
confirmation of sale    buyer    invoice client - raise invoice
The last row in the table above is interesting: there doesn't appear to be a
sub-process inside the 'invoice client' process that creates the data flow 'client
to delete'. Looking carefully at the Level 1 DFD we can see that the 'archive
sale' process is probably most appropriate to be sending the property file the
details of which client to delete, since it is this process that receives the
payment from the client. Therefore we need to delete this 'client to delete' data
flow from the Level 2 DFD, and change the Level 1 DFD to have this data flow
from 'archive sale' to the property file.
Adding these processes and data flows to the diagram we get the following:
Identify any data stores that exist entirely within the Level 2 boundary, and
draw these data stores. For this example there don't appear to be any local
data stores.
Identify data flows between the processes and data stores that are entirely
within the Level 2 system boundary. Since there are no local data stores, there
are no data flows between processes and data stores to be added.