MODULE 7
DATA INPUT METHODS
LEARNING UNITS
7.1 Data Dictionary : its development and use
7.2 Data input methods : Batch and Interactive
7.3 Coding technique for unique data representation.
Systems Analysis And Design © V. Rajaraman
LEARNING GOALS
1. The need for a data dictionary for an application
2. How to develop a data dictionary for an application
3. Design of forms and screens for data input
4. Need for and methods of coding data elements
5. Coding schemes for automatic error detection while inputting data
6. Need for and design of input data validation methods
MOTIVATION
During systems analysis it is essential for the analyst to decide what data is necessary and sufficient for designing an application.
A data flow diagram (DFD) gives the data flows and data stores of a system.
The individual data elements of the data flows and stores can be catalogued.
Such a catalogue, with a description of each element and its type, is an invaluable aid while designing a system.
A catalogue will also bring out whether any data is duplicated or missing.
A catalogue is also invaluable documentation of a system.
Such a catalogue is called a data dictionary. It is actually metadata, i.e., data about data.
After the data dictionary is designed, one needs to determine how the data is to be input.
Data input methods depend on whether the data is filled in manually on forms by customers and later keyed in by data entry operators, or is input directly by users on PCs.
We thus need to understand both these methods.
Unless data input is correct, results will be unreliable.
Information systems normally have a large volume of data.
Because of the large volume, special controls are needed to ensure correctness of data input; otherwise it is difficult to find which data is incorrect.
Thus it is important to design appropriate data input methods to prevent errors while entering data.
Key data elements are important for identifying records. They need to be unique, concise and understandable by users. Thus we need to study methods of coding key data elements.
WHAT IS A DATA DICTIONARY?
A data dictionary is a catalogue of all data used in an application, with their names, types and origins.
In other words it is data about data, which is called metadata.
A data dictionary gives a single point of reference to the data repository of an organization.
It is thus important documentation that is useful in maintaining a system.
HOW IS DATA DICTIONARY DEVELOPED?
Starting point is DFD
Example :
Consider the Receiving office process described in Module 3.
DFD is reproduced below.
[Figure: DFD of the receiving office. Elements shown: Vendor, delivery note, Receiving process, Orders data store, items received note to the Inspection office, discrepancy note to the Purchase office.]
We will recall the word statement from requirement specification now.
WORD STATEMENT OF REQUIREMENTS
A vendor fulfilling an order sends the items, together with a delivery note, to a receiving office.
The receiving office compares the delivery note against the order placed. If there is a discrepancy, a discrepancy note is sent to the purchase office.
An items received note is sent to the inspection office along with the items received.
DATA ELEMENTS IN DATA FLOW
From the word statement we derive the data elements in each data flow.
1. Delivery note
Order no, vendor name, vendor address, item name, delivery date, quantity supplied, units.
Item name and vendor name may not be unique. To ensure uniqueness we assign unique codes for them. The names are still kept to aid people.
Thus the delivery note is:
Delivery note = Order no + Vendor code + Vendor name + Vendor address + item code + item name + delivery date + quantity supplied + units.
2. Discrepancy note
Discrepancy note = Order no + Vendor code + Vendor name + Vendor address + item code + item name + delivery date + quantity supplied + units + excess/deficiency + no of days late/early.
3. Items received note
Items received note = Delivery note
Data in data store
Order records = order no + vendor code + vendor name + vendor address + item code + item name + order date + qty ordered + units + delivery period.
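The compositions above can be written down as plain field lists, which makes it easy to build the catalogue of distinct data elements. This is only a sketch: the Python identifiers (`DELIVERY_NOTE`, `catalogue`, and the snake_case field names) are illustrative and do not come from the course material.

```python
# Field lists copied from the data flows and data store listed above.
# The identifier spellings are assumptions made for this sketch.
DELIVERY_NOTE = (
    "order_no", "vendor_code", "vendor_name", "vendor_address",
    "item_code", "item_name", "delivery_date", "quantity_supplied", "units",
)
ITEMS_RECEIVED_NOTE = DELIVERY_NOTE  # Items received note = Delivery note
DISCREPANCY_NOTE = DELIVERY_NOTE + ("excess_or_deficiency", "days_late_or_early")
ORDER_RECORD = (
    "order_no", "vendor_code", "vendor_name", "vendor_address",
    "item_code", "item_name", "order_date", "qty_ordered", "units",
    "delivery_period",
)


def catalogue(*flows):
    """Return the sorted list of distinct data elements across flows and stores."""
    return sorted({name for flow in flows for name in flow})


elements = catalogue(DELIVERY_NOTE, DISCREPANCY_NOTE, ITEMS_RECEIVED_NOTE, ORDER_RECORD)
```

Each name in `elements` would then get one data dictionary entry, however many flows it appears in.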
DATA DICTIONARY FORMAT
A data dictionary gives in detail the characteristics of each data element.
Typical characteristics are:
Data name : Should be descriptive and self-explanatory. This helps in documentation and maintenance.
Data description : What it represents.
Origin : Where the data originates, e.g. input from forms, comes from receiving office, keyed in by user, etc.
Destination : Where the data will flow and be used (if any).
Data type : numeric, alphanumeric, letters (or text), binary (0 or 1; True or False), integer, decimal fixed point, real (floating point), currency unit, date.
Length : number of columns (characters) needed.
Limits on value : (if relevant) e.g. upper and lower bounds of value (age > 0, < 100).
Remarks : (if any).
EXAMPLE OF DATA DICTIONARY ENTRY
Name : Order number
Description : Used to identify order given to vendor
Origin : Part of delivery note from vendor
Destination : Receiving process
Data type : Numeric Integer
Length : 8 digits
Limits on value : > 0, <= 99999999. Actual value not relevant; used only as a unique identifier.
Remarks : It is a key field.
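One possible way to hold such an entry in a program is a small record type with one attribute per characteristic. The class name `DictionaryEntry` and its attribute names are assumptions for this sketch; only the values come from the example entry above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DictionaryEntry:
    """One data dictionary entry; attributes mirror the characteristics listed earlier."""
    name: str
    description: str
    origin: str
    destination: str
    data_type: str
    length: int
    limits: str
    remarks: str = ""


ORDER_NUMBER = DictionaryEntry(
    name="Order number",
    description="Used to identify order given to vendor",
    origin="Part of delivery note from vendor",
    destination="Receiving process",
    data_type="Numeric Integer",
    length=8,
    limits="> 0, <= 99999999",
    remarks="It is a key field.",
)

# A data dictionary is then a lookup from element name to its entry.
data_dictionary = {ORDER_NUMBER.name: ORDER_NUMBER}
```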
EXAMPLE OF DATA DICTIONARY ENTRY (CONTD.)
Name : Delivery date
Description : Date item is to be delivered
Origin : Part of delivery note from vendor. It is also in the orders data store, which is input to the receiving process
Destination : Receiving process
Data type : Numeric Integer
Length : 8 digits
Limits on value : Date field in the form DDMMYYYY. Should satisfy the constraints of a calendar date, e.g. 05082004 is OK but 582004 is not
Remarks : Blank fields not allowed.
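The limits stated for the delivery date (exactly 8 digits, DDMMYYYY, a real calendar date) translate directly into a data checking routine. The function name `is_valid_ddmmyyyy` is hypothetical; the sketch leans on the standard library's `datetime.strptime` for the calendar rules.

```python
from datetime import datetime


def is_valid_ddmmyyyy(text: str) -> bool:
    """True if text is exactly 8 digits forming a real date in DDMMYYYY order."""
    # Length and digit check first: 582004 must be rejected, not padded.
    if len(text) != 8 or not (text.isascii() and text.isdigit()):
        return False
    try:
        datetime.strptime(text, "%d%m%Y")  # raises ValueError for impossible dates
    except ValueError:
        return False
    return True
```

With this check, `05082004` is accepted, while `582004` (too short) and `31042004` (no 31st of April) are rejected.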
DATA DICTIONARY USES
A data dictionary can be enormous in size and requires careful development. However, it is a centralized reference document.
It is an invaluable resource for designing
•Input forms and screens
•Data checking programs
•Process specifications
•Databases
It is very useful in understanding and maintaining a system.
DATA INPUT METHODS
ON-LINE - The user enters data directly using screen prompts.
OFF-LINE - Forms are filled in by users; for example, candidates for admission to a college fill in forms. Data from the forms is keyed in by a data entry operator.
ERROR SOURCES
Errors in on-line data entry are due to poor screen design. The system should inform the user immediately when wrong data is input.
Errors in off-line data entry are due to bad form design and human errors by users and data entry operators.
The system should prevent users from making mistakes by
•Good form design that leaves enough space for writing legibly
•Clear instructions for filling in the form
The system should prevent data entry operators from making mistakes by
•Good form design
•Reducing keystrokes
•Immediate error feedback
OFF LINE DATA ENTRY - PROBLEMS
Forms are batched.
It is desirable for the machine to give a message when input is wrong, but this is not always possible.
Errors are found only after some time has elapsed.
Good controls are needed to detect errors automatically and, if possible, correct them.
BATCH DATA ENTRY
[Figure: batch data entry. Elements shown: data entered in forms, keyboard data entry, input file, data validation program, input batch, error batch, error report, update program, data store, data processing program, output report.]
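Example 1: a form with the fields "Name" and "Address".
Bad design : The tendency will be to fill in the name on the top line. There is not enough space for the letters of the address.
Example 2: a form that says "Tick as applicable" followed by the choices Individual, Hindu undivided family, Parent/Guardian of minor.
Bad design : The choices are not codified. The data entry operator will be confused.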
[Figure: two layouts each for "Enter date" and "Enter time". Good design: separate labelled boxes for Day, Month, Year and for Hr, Min, Sec. Bad design: the same prompt without the labelled boxes.]
[Figure: example of a well designed form]
Enter name and address using capital letters. Use one box for each letter.
Tick any of the following: Shri (1), Smt (2), Kum (3)
Name
Only address (do not repeat name)
Pin
I am applying as: (tick one of the boxes below)
Individual / Hindu undivided family / Parent or guardian of minor
Clear instructions. Enough space for manual entry. The data entry operator can be given simple instructions for data entry.
COMPUTER READABLE FORMS
As manual data input from forms is slow and expensive, attempts have been made to automate form reading using scanners.
This needs handwriting recognition and correct form alignment, and has not been very successful.
However, if forms require just darkening some pre-defined areas, they can be machine read and interpreted.
Example: multiple choice questions in exams, where specific boxes are darkened based on the choice.
INTERACTIVE DATA INPUT
With the advent of PCs and the client/server model in computer networks, interactive data input is now widely used.
Advantages are instant response when data is input, so that errors are corrected immediately.
Flexibility in screen design minimizes manual effort.
Use of a mouse and icons simplifies selection from pre-determined choices of data.
Three main models of interactive data input :
Menus
Templates
Commands
MODELS OF DATA INPUT
MENUS
The user is presented with several alternatives and asked to type his/her choice.
EXAMPLE
SELECT ALTERNATIVE
Type 1 For entering new student record
Type 2 For deleting student record
Type 3 For changing student record
Your choice
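A menu of this kind might be programmed roughly as follows. The names `MENU`, `render_menu` and `choose` are made up for this sketch, and the `read` parameter (defaulting to the built-in `input`) is an assumption added so the loop can be exercised without a keyboard.

```python
# Menu text taken from the example above; the dict layout is an assumption.
MENU = {
    "1": "entering new student record",
    "2": "deleting student record",
    "3": "changing student record",
}


def render_menu() -> str:
    """Build the prompt shown to the user."""
    lines = ["SELECT ALTERNATIVE"]
    lines += [f"Type {key} for {action}" for key, action in MENU.items()]
    lines.append("Your choice: ")
    return "\n".join(lines)


def choose(read=input) -> str:
    """Prompt until one of the listed keys is typed, then return the chosen action."""
    while True:
        choice = read(render_menu()).strip()
        if choice in MENU:
            return MENU[choice]
        # Immediate feedback on a wrong entry, as interactive input allows.
        print(f"'{choice}' is not a valid choice; please type 1, 2 or 3.")
```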
TEMPLATE
A template is analogous to a form.
It has features to reject incorrect data input using a built-in program.
User-friendly visual presentation.
Example
Roll no :
Name : FIRST NAME/INITIALS, LAST NAME
Dept code : one of the codes CE, CS, ME, EE, IT
Year :
Hostel code : one of A, B, C, D
The template is pre-programmed to reject an incorrect Roll no, Dept code, Year or Hostel code.
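The built-in rejection of incorrect fields could look something like the sketch below. The department and hostel codes are the ones shown in the template; the rules for roll number (5 digits) and year (1 to 4) are assumptions, since the slide does not state them, and the function name `validate_template` is hypothetical.

```python
DEPT_CODES = {"CE", "CS", "ME", "EE", "IT"}  # codes shown on the template
HOSTEL_CODES = {"A", "B", "C", "D"}          # codes shown on the template
VALID_YEARS = {1, 2, 3, 4}                   # assumed: not given in the text
ROLL_NO_LENGTH = 5                           # assumed: not given in the text


def validate_template(record: dict) -> dict:
    """Return {field: message} for each rejected field; an empty dict means accepted."""
    errors = {}
    roll_no = str(record.get("roll_no", ""))
    if not (roll_no.isdigit() and len(roll_no) == ROLL_NO_LENGTH):
        errors["roll_no"] = f"roll no must be {ROLL_NO_LENGTH} digits"
    if record.get("dept_code") not in DEPT_CODES:
        errors["dept_code"] = "unknown department code"
    if record.get("year") not in VALID_YEARS:
        errors["year"] = "year out of range"
    if record.get("hostel_code") not in HOSTEL_CODES:
        errors["hostel_code"] = "unknown hostel code"
    return errors
```

Returning every rejected field at once, rather than stopping at the first, lets the screen highlight all mistakes in one pass.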
COMMANDS
Interactive commands guide the user through alternatives.
Example
Computer : Did you request deletion of record?
Type Y or N
User : Y
Computer : Give student roll no
User : 56743
Computer : Is name of the student A.K.Jain?
Type Y or N
User : Y
Computer : Is he a 1st year student?
Type Y or N
User : Y
Computer : Shall I delete name?
User : Y
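The dialogue above can be sketched as a sequence of confirmations, each of which can abort the deletion. Everything here is illustrative: the `STUDENTS` store, the function names, and the injectable `read` parameter are assumptions, not part of the course text.

```python
# Hypothetical in-memory store holding the student from the example dialogue.
STUDENTS = {"56743": {"name": "A.K.Jain", "year": 1}}


def confirm(question: str, read=input) -> bool:
    """Ask a Y/N question, repeating it until Y or N is typed."""
    while True:
        answer = read(f"{question}\nType Y or N: ").strip().upper()
        if answer in ("Y", "N"):
            return answer == "Y"


def delete_dialogue(students: dict, read=input) -> bool:
    """Guide the user through deletion; return True only if a record was deleted."""
    if not confirm("Did you request deletion of record?", read):
        return False
    roll_no = read("Give student roll no: ").strip()
    student = students.get(roll_no)
    if student is None:
        print("No record with that roll no.")
        return False
    if not confirm(f"Is name of the student {student['name']}?", read):
        return False
    if not confirm(f"Is the student in year {student['year']}?", read):
        return False
    if not confirm("Shall I delete the record?", read):
        return False
    del students[roll_no]
    return True
```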
Normally all three models occur together in an application. In other words, menus, templates (forms) and commands are not mutually exclusive.
Graphical user interface design is a very rich area. Languages such as Visual Basic simplify the design of user interfaces.
We have given only a flavor of the topic.
WHY DO WE NEED CODES?
NEED FOR CODING
UNIQUE IDENTIFIER
- Example: roll no instead of name
CROSS REFERENCING BETWEEN APPLICATIONS
- A unique roll no may be used in examination records, accounts and the health centre
EFFICIENT STORAGE AND RETRIEVAL
- Codes are concise: a long name is replaced by a shorter roll no
WHAT ARE THE REQUIREMENTS OF A GOOD
CODE?
CONCISE - Smallest length to reduce storage and data input
effort
EXPANDABLE - Add new members easily
MEANINGFUL- Code must convey some information about
item being coded
COMPREHENSIVE - Include all relevant characteristics of
item being coded
PRECISE - Unique, unambiguous code
WHAT METHODS DO WE USE TO CODE?
1) SERIAL NO: Assign a serial number to each item.
2) BLOCK CODES: Blocks of serial numbers are assigned to different categories.
3) GROUP CLASSIFICATION CODE: Groups of digits/characters are assigned for different characteristics.
Example roll no: 87 1 05 2 465
Part                  Value   With meaningful characters
Year admitted         87      87
Term admitted         1       1
Dept                  05      CS
Status (UG/PG)        2       UG
Serial no in dept     465     465
4) SIGNIFICANT CODES: Some or all parts of the code are given values that describe the item.
Example item code: BA 1 95 C B R
Part   Meaning
BA     Banian
1      Male
95     Chest size in cms
C      Cotton
B      Color (blue)
R      Style (round neck)
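Because each group in a group classification code has a fixed width, the code can be split back into its characteristics mechanically. The sketch below does this for the roll number example 87 1 05 2 465; the names `RollNo`, `WIDTHS` and `parse_roll_no` are assumptions for illustration.

```python
from typing import NamedTuple


class RollNo(NamedTuple):
    year_admitted: str
    term_admitted: str
    dept: str
    status: str      # UG/PG indicator
    serial_no: str   # serial number within the department


# Group widths follow the example 87 1 05 2 465.
WIDTHS = (2, 1, 2, 1, 3)


def parse_roll_no(code: str) -> RollNo:
    """Split a 9-digit roll number into its groups; raise ValueError if malformed."""
    if len(code) != sum(WIDTHS) or not code.isdigit():
        raise ValueError(f"roll no must be {sum(WIDTHS)} digits: {code!r}")
    parts, start = [], 0
    for width in WIDTHS:
        parts.append(code[start:start + width])
        start += width
    return RollNo(*parts)
```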
CHARACTERISTICS OF CODES
Code                        Concise    Expandable   Meaningful   Comprehensive   Precise
SERIAL NO                   Yes        Yes          No           No              Yes
BLOCK CODES                 Moderate   Yes          No           No              Yes
GROUP CLASSIFICATION CODE   No         Yes          Yes          Yes             Yes
SIGNIFICANT CODE            No         Yes          Yes          Yes             Yes
ERROR DETECTION CODE
Incorrect data entry can lead to chaos.
Mistakes occur because the volume of data processed is large.
We need to detect and, if possible, correct errors in data entry.
Errors are detected by introducing controlled redundancy in codes.
Error control digits are added based on statistics of the types of errors normally committed during data entry.
MODULUS 11 CHECK DIGIT SYSTEM
An error detection digit is added at the end of a numeric code.
The code is designed to detect all single transcription and single transposition errors, which account for 95% of all errors.
Single transcription error: 49687 -> 48687
Single transposition error: 45687 -> 48657
Given the code 49687, the modulus 11 check digit is obtained as follows.
Multiply each digit by weights 2, 3, 4, etc., starting with the least significant digit:
7*2 + 8*3 + 6*4 + 9*5 + 4*6 = 131
131/11 = 11, remainder 10; i.e. 131 mod 11 = 10
Check digit = 11 - 10 = 1; append it to the code.
The code with check digit = 496871
If the remainder is 1, the check digit is 11 - 1 = 10, which is coded as X. If the remainder is 0, the check digit is 0.
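The procedure above can be sketched as a short function. The names `mod11_check_digit` and `append_check_digit` are hypothetical; the weights and the X convention follow the worked example, and a remainder of 0 is taken to give check digit 0.

```python
def mod11_check_digit(code: str) -> str:
    """Return the modulus 11 check character for a numeric code."""
    if not (code.isascii() and code.isdigit()):
        raise ValueError(f"code must be numeric: {code!r}")
    # Weights 2, 3, 4, ... are applied starting at the least significant digit.
    total = sum(int(digit) * weight
                for weight, digit in enumerate(reversed(code), start=2))
    check = (11 - total % 11) % 11   # remainder 0 gives check digit 0
    return "X" if check == 10 else str(check)


def append_check_digit(code: str) -> str:
    """Return the code with its check character appended."""
    return code + mod11_check_digit(code)
```

For the worked example, `append_check_digit("49687")` gives `"496871"`.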
ERROR DETECTION
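Correct code: 496871. Code as entered: 486871 (single transcription error).
Error detection: 1*1 + 7*2 + 8*3 + 6*4 + 8*5 + 4*6 = 127; 127 mod 11 = 6, remainder != 0 => Error
Correct code: 496871. Code as entered: 416879 (single transposition error).
Error detection: 9*1 + 7*2 + 8*3 + 6*4 + 1*5 + 4*6 = 100; 100 mod 11 = 1, remainder != 0 => Error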
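A validation routine for codes that already carry their check digit follows the same weighted sum, now with weight 1 on the check digit. This is a minimal sketch; the function name `is_valid_mod11` is an assumption, and X is accepted only in the check position.

```python
def is_valid_mod11(code: str) -> bool:
    """True if the weighted sum of the code (check digit included) is divisible by 11."""
    body, check = code[:-1], code[-1:]
    if not (body.isascii() and body.isdigit()):
        return False
    if not (check == "X" or (check.isascii() and check.isdigit())):
        return False
    # Rightmost character (the check digit) has weight 1, the next weight 2, and so on.
    values = [10 if check == "X" else int(check)]
    values += [int(digit) for digit in reversed(body)]
    total = sum(value * weight for weight, value in enumerate(values, start=1))
    return total % 11 == 0
```

Applied to the two erroneous entries above, the function returns `False` for both `486871` and `416879`, and `True` for the correct code `496871`.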
WHY DOES MODULUS 11 CHECK DIGIT WORK
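Given a code $d_n, d_{n-1}, \ldots, d_1$, where $d_1$ is the check digit,
$$\left(\sum_{i=1}^{n} W_i d_i\right) \bmod N = 0 \quad \text{by design.}$$
What should the values of $N$ and the weights $W_i$ be?
Single transcription error: $d_k$ becomes $t$. The weighted sum of the code as entered is
$$\sum_{i=1}^{n} W_i d_i + t W_k - W_k d_k .$$
As $\left(\sum_{i=1}^{n} W_i d_i\right) \bmod N = 0$, the error goes undetected only if $(t - d_k) W_k \bmod N = 0$, i.e.
$$(t - d_k) W_k = p N \quad \text{where } p \text{ is an integer.}$$
Conditions that make this impossible for $t \neq d_k$:
1. $0 < W_k < N$
2. As $|t - d_k| < 10$ and $W_k < N$, choose $N > 10$
3. A product of two non-zero integers, each smaller in magnitude than a prime, cannot be a multiple of that prime => choose $N$ prime
4. Smallest prime $> 10$ is $11$ => $N = 11$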
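Single transposition error
Let $d_k$ and $d_m$ get interchanged. The error goes undetected only if
$$\left[\sum_{i=1}^{n} W_i d_i + (d_k W_m + d_m W_k - d_k W_k - d_m W_m)\right] \bmod N = 0$$
or, as the original sum is a multiple of $N$, $(d_k - d_m)(W_m - W_k) = p N$.
1. $(W_m - W_k) \neq 0$ => weights must be distinct
2. $|d_k - d_m| < 10$ and $|W_m - W_k| < N$, so with $N > 10$ neither factor can be a multiple of $N$
3. If $N$ is prime, the product of two such factors cannot be a multiple of $N$
Therefore $N = 11$ satisfies the conditions.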
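The argument can be checked by brute force: an error slips through only when a non-zero digit difference times a non-zero weight (or weight difference) is a multiple of N. The sketch below enumerates such products; the function name `undetected` and its default bounds (digit differences up to 9, weights up to 10) are assumptions chosen to match decimal codes.

```python
def undetected(modulus: int, max_digit_diff: int = 9, max_weight: int = 10):
    """List (difference, weight) pairs whose product is a multiple of the modulus.

    For a transcription error, `weight` stands for W_k; for a transposition
    error it stands for |W_m - W_k|. Only weights below the modulus are tried.
    """
    return [
        (diff, weight)
        for diff in range(1, max_digit_diff + 1)
        for weight in range(1, max_weight + 1)
        if weight < modulus and (diff * weight) % modulus == 0
    ]
```

With `modulus=11` the list is empty, so no single error of either kind escapes. With a non-prime such as `modulus=10` it is not: for instance a digit off by 5 in a position of weight 2 changes the sum by 10 and goes unnoticed.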
OTHER CHECKING SYSTEMS
Use a modulo N check with N a prime greater than the largest code character value.
For hexadecimal codes: 16 symbols, so N = 17.
For alphanumeric codes: 26 letters + 10 digits = 36 symbols, so N = 37.
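A modulo 37 scheme for alphanumeric codes might be sketched as below, reusing the weighting used for modulus 11. Two things are assumptions of this sketch rather than of the text: digits take values 0 to 9 and letters A to Z take values 10 to 35, and a 37th symbol `*` stands for a check value of 36 (playing the role that X plays in the modulus 11 scheme).

```python
import string

# 36 code symbols plus "*" as an assumed extra symbol for check value 36.
SYMBOLS = string.digits + string.ascii_uppercase + "*"


def mod37_check_char(code: str) -> str:
    """Return the check character; weights 2, 3, ... start at the rightmost symbol."""
    total = sum(SYMBOLS.index(ch) * weight
                for weight, ch in enumerate(reversed(code), start=2))
    return SYMBOLS[(37 - total % 37) % 37]


def is_valid_mod37(code: str) -> bool:
    """True if the weighted sum, check character included (weight 1), is divisible by 37."""
    total = sum(SYMBOLS.index(ch) * weight
                for weight, ch in enumerate(reversed(code), start=1))
    return total % 37 == 0
```

`SYMBOLS.index` raises `ValueError` for characters outside the symbol set, so lower-case input would need to be upper-cased first. The weights stay below 37 only for codes shorter than 37 characters.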
VALIDATING INPUT DATA
When a large volume of data is input, special precautions are needed to validate the data.
VALIDATION CHECKS :
SEQUENCE NUMBERING - detects missing records
BATCH CONTROL - use batch totals
DATA ENTRY AND VERIFICATION - dual input
RECORD TOTALS - add individual values for checking
MODULUS 11 CHECK DIGIT
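Two of these checks, sequence numbering and batch totals, are simple enough to sketch. The function names and the record layout (dicts with hypothetical `seq` and `amount` fields) are assumptions for illustration only.

```python
def missing_sequence_numbers(seq_numbers):
    """Sequence numbering: list the gaps between the lowest and highest number seen."""
    seen = set(seq_numbers)
    if not seen:
        return []
    return [n for n in range(min(seen), max(seen) + 1) if n not in seen]


def batch_total_matches(records, field, control_total):
    """Batch control: compare the total of keyed-in values with the control total
    that was computed by hand before data entry."""
    return sum(record[field] for record in records) == control_total


# Hypothetical batch in which record 3 was lost during data entry.
batch = [
    {"seq": 1, "amount": 250},
    {"seq": 2, "amount": 100},
    {"seq": 4, "amount": 75},
]
gaps = missing_sequence_numbers(record["seq"] for record in batch)
totals_agree = batch_total_matches(batch, "amount", control_total=525)
```

Here the lost record shows up twice: as a gap in the sequence numbers and as a mismatch between the keyed-in total and the control total.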
CHECKS ON INDIVIDUAL FIELDS
Radix errors - For example, the seconds field cannot exceed 59 and the month field cannot exceed 12
Range check - Fields should be within a specified range
Reasonableness check - For example, a telephone bill cannot be more than 10 times the average bill of the last few months
Inconsistent data - For example: 31-04-99 (April has only 30 days)
Incorrect data - Batch totals check this
Missing data - Batch control data checks this
Inter-field relationship check - For example, a student of 8th class cannot have age > 25
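Several of these field checks can be sketched as small predicates. All function names are hypothetical, and the thresholds are the ones quoted in the examples above (factor 10 for bills, age 25 for 8th class); the hour limit of 23 is an added assumption.

```python
from datetime import date
from statistics import mean


def radix_ok(hours: int, minutes: int, seconds: int) -> bool:
    """Radix check: each time field must stay below its radix."""
    return 0 <= hours <= 23 and 0 <= minutes <= 59 and 0 <= seconds <= 59


def range_ok(value, low, high) -> bool:
    """Range check: value must lie within the specified bounds."""
    return low <= value <= high


def bill_is_reasonable(amount, recent_bills, factor=10) -> bool:
    """Reasonableness check: at most `factor` times the recent average."""
    return amount <= factor * mean(recent_bills)


def date_is_consistent(day: int, month: int, year: int) -> bool:
    """Consistency check: day, month and year must form a real calendar date."""
    try:
        date(year, month, day)
    except ValueError:
        return False
    return True


def class_age_ok(school_class: int, age: int) -> bool:
    """Inter-field check: an 8th class student cannot be older than 25."""
    return not (school_class == 8 and age > 25)
```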