Data Processing
www.arena.co.ke 0713779527
Data that have not yet been processed are described as Raw data, i.e. they do not yet convey
any particular meaning for a given activity within a given environment.
Data are therefore unprocessed facts consisting of details relating to
business transactions. For example, in a Payroll system, the data are employees’ names, basic
salary, department number, marital status, etc.
DATA PROCESSING:
The collection, manipulation & distribution of data (i.e., letters, numbers & graphic symbols)
to achieve certain objectives.
The processing may involve calculations, comparisons, decision-making and/or any other
logic to produce the required result.
The activity of manipulating the raw facts to generate a set of meaningful data (described as
Information), which is able to convey some meaning.
Those activities, which are concerned with the systematic recording, arranging, filing,
processing, and dissemination of facts relating to the physical events occurring in a business.
Data processing is a very important activity in any organization of any size or nature because it
generates information for decision-making.
If data processing is carried out using electronic tools or aids, e.g. the computer, it is described
as Electronic Data Processing (EDP).
INFORMATION.
Information is data that has been summarized and processed in the way you want it, so that it is
useful in your work.
Information is an assembly of meaningful data items.
The information in a Payroll activity includes: Net pay, Total Tax deductions, etc. In Stock
Control, the information generated includes: Closing stock, Total cost of the items, Purchases,
Sales, etc.
The information is obtained by applying some processing procedures to the raw data being
input. For example, to get the Net pay in a Payroll activity, the procedure would be:
Net pay = (Basic salary + Allowances + Overtime, if any) – Taxes.
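The Net pay procedure above can be sketched in Python as follows. This is only an illustration: the function name, the field names and the figures for the employee are assumptions made for the example, not part of the notes.

```python
def net_pay(basic_salary, allowances=0, overtime=0, taxes=0):
    """Net pay = (Basic salary + Allowances + Overtime, if any) - Taxes."""
    gross = basic_salary + allowances + overtime  # overtime is 0 when there is none
    return gross - taxes

# Raw data (input) for one hypothetical employee
employee = {"name": "A. Otieno", "basic_salary": 50000,
            "allowances": 8000, "overtime": 2000, "taxes": 12000}

# Processing the raw data produces information (the Net pay)
pay = net_pay(employee["basic_salary"], employee["allowances"],
              employee["overtime"], employee["taxes"])
print(employee["name"], "net pay:", pay)  # A. Otieno net pay: 48000
```

The dictionary holds the raw data; the single figure printed at the end is the information generated from it.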
Information is the end product of data processing available at the right place, the right time and
in the right form.
The information generated by the data processing activities is very important in the working
strategies of any organization, because it is used by the organization to make decisions.
Characteristics/ Features of good Information.
Exercise.
1. Define the terms:
(i). Data.
(ii). Information.
(iii). Data processing.
2. Distinguish between the following terms:
(i). Data.
(ii). Data processing.
(iii). Information
3. Using examples, explain the difference between ‘Data’ and ‘Information’.
ORIGINATION OF DATA
Data originates from Source documents, e.g. Time cards, Sales orders, Purchase orders,
Invoices, etc.
OUTPUT OF INFORMATION
Output consists of printed or typewritten forms, etc. Summaries, Reports, & documents are
prepared.
Notes.
Exercise I.
1. (a). What is a Data Processing cycle?
(b). State and describe the stages involved in the data processing cycle.
2. Draw and label a clear flow diagram of the stages involved in a data processing cycle.
3. List the various steps in the data processing cycle and briefly describe what happens at each
stage.
DATA COLLECTION.
Data Collection is the process involved in getting the data from the point of its origin to the
computer in a form suitable for processing.
Note. Data collection starts at the source of the raw data & ends when valid data is within the
computer in a form ready for processing.
DATA INTEGRITY.
(i). Accuracy:
(ii). Timeliness:
(iii). Relevance:
DATA CONTROL.
The quality of Input data is important to the accuracy of output. Control must be instituted as
early as possible in the system & everything possible must be done to ensure that data is
complete and accurate before being input to the computer.
Objectives of Data Control.
The objectives of Control are:
(i). To detect, correct and re-process all errors.
(ii). To ensure that all data is processed.
(iii). To preserve the integrity/reliability of maintained data.
(iv). To prevent and detect fraud/deception.
Note. Control must be designed into the system & thoroughly tested. Failure to build in
adequate control may cause expensive systems to fail. In addition, all users must be fully
consulted to ensure that adequate controls are implemented.
VALIDATION CHECKS.
A Computer cannot notice errors in the data being processed in the way that a Clerk or Machine
operator does.
Data validation is the process of preventing wrong data from being processed. It involves
checking whether the results generated by the computer are valid or applicable. During input or
data preparation, the data must be checked for transcription errors, through a process known as
Verification.
Once the data is brought into the computer memory directly from an input device, immediately
before processing, it is again subjected to checks built into the program, known as
validation checks, which test the integrity of the data and its conformity to the processing
requirements.
Data validation includes testing for the following:
(a). Test for reasonableness.
The computer program checks whether the data is reasonable, e.g., the number of people should
not be given as a fraction or decimal, such as 9½ children.
(b). Test for numbers.
E.g., numbers should not be given as letters.
(c). Test for alphabets.
E.g., alphabetic data should not be given as numbers.
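The three tests above could be built into a program roughly as shown below. This is a minimal sketch only; the record layout and the field names (children, department_number, name) are hypothetical and chosen to match the examples in these notes.

```python
def validate_record(record):
    """Return a list of error messages; an empty list means the record passed."""
    errors = []

    # (a) Test for reasonableness: a count of people must be a whole,
    #     non-negative number (so 9.5 children is rejected).
    children = record.get("children")
    if isinstance(children, bool) or not isinstance(children, int) or children < 0:
        errors.append("children must be a whole non-negative number")

    # (b) Test for numbers: a numeric field must contain digits only.
    if not str(record.get("department_number", "")).isdigit():
        errors.append("department_number must be numeric")

    # (c) Test for alphabets: a name must not be empty or contain digits.
    name = str(record.get("name", ""))
    if not name or any(ch.isdigit() for ch in name):
        errors.append("name must be alphabetic")

    return errors

good = {"name": "Mary", "department_number": "12", "children": 2}
bad = {"name": "M4ry", "department_number": "1A", "children": 9.5}
print(validate_record(good))       # []
print(len(validate_record(bad)))   # 3
```

Records that return a non-empty list would be rejected or sent back for correction instead of going forward for processing.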
These checks can be made at 2 stages:
(1). Input stage: When data is first input to the computer, different checks can be applied to
prevent errors going forward for processing. For this reason, the first computer run is often
referred to as Validation or Data vet.
Exercise I.
1. Distinguish between Data verification and Validation as used in the context of data
collection.
Exercise II.
1. Write short notes on the following:
(i). Manual systems.
(ii). Mechanized systems.
(iii). Electronic systems.
2. By use of a clear table and brief explanation, show the differences between manual,
mechanized and electronic systems touching on the following functions: input, process,
output, storage and control. (20 marks).
COMPUTER FILES
A File is a collection of related records (i.e. several records put together) that give a complete set
of information about a certain item or a particular business entity.
DATA HIERARCHY.
In data processing, data is organized from the smallest element to the most comprehensive.
Bits
Characters
Bit:
A Bit is the smallest item that can be stored in a physical file.
The bit can either be a ‘0’ or a ‘1’; the two states that define the storage cells of a computer
memory & a storage medium.
Bits combine to form a Byte (which is the unit for measuring computer storage). A
Byte is a collection of several bits (usually 8) that represents a Character.
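The relationship between bits, bytes and characters can be illustrated with a short Python sketch. It assumes an 8-bit byte and the ASCII character code, which the notes do not specify; the function names are chosen for the example.

```python
def char_to_bits(ch):
    """Return the 8-bit pattern that stores a single ASCII character."""
    return format(ord(ch), "08b")

def bits_to_char(bits):
    """Rebuild the character from its bit pattern."""
    return chr(int(bits, 2))

print(char_to_bits("A"))         # 01000001
print(bits_to_char("01000001"))  # A
```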
Reference files.
A Reference file is used for reference or look-up purposes.
Lookup information is that information which is stored in a separate file, but is required during
processing. E.g., the item code entered either manually or using a bar-code reader in a point-of-
sale terminal is used to look-up the item description & price from a reference file stored on a
storage device.
Reference files contain records that are fairly permanent or semi-permanent, such as tax
deductions, Wage rates, Customer addresses, etc., and therefore need to be revised only
occasionally.
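The point-of-sale look-up described above can be sketched as follows. The item codes, descriptions and prices are invented for the example, and a Python dictionary stands in for the reference file held on the storage device.

```python
# Hypothetical reference file: item code -> (description, unit price)
REFERENCE_FILE = {
    "1001": ("Sugar 1kg", 150),
    "1002": ("Milk 500ml", 60),
}

def look_up(item_code):
    """Return (description, price) for an item code, or None if it is not on file."""
    return REFERENCE_FILE.get(item_code)

def bill_total(scanned_items):
    """Total a sale from (item_code, quantity) pairs using the reference file."""
    total = 0
    for code, quantity in scanned_items:
        entry = look_up(code)
        if entry is None:
            raise KeyError("unknown item code: " + code)
        _description, price = entry
        total += price * quantity
    return total

print(look_up("1001"))                        # ('Sugar 1kg', 150)
print(bill_total([("1001", 2), ("1002", 3)])) # 480
```

Only the item code is entered at the terminal; the description and price come from the reference file, so a price change needs to be made in one place only.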
Backup files.
A Backup file is used to hold duplicate copies (backups) of data or information from the
computer’s fixed storage (hard disk). These files are kept for security purposes.
Sort files.
Sort files are created from existing files, such as Master or Transaction files, and are used mainly
for sorting data (i.e., they are used to alter the sequence of the existing files).
A sort file is mainly used where data is to be processed sequentially. In sequential processing,
data or records are first sorted and held on a magnetic tape before updating the master file.
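As a rough illustration of a sort file, the sketch below builds a sorted copy of a transaction file without altering the original. The record layout and the field name emp_no are assumptions made for the example.

```python
# Hypothetical transaction file, in the order the records arrived
transactions = [
    {"emp_no": 103, "hours": 8},
    {"emp_no": 101, "hours": 6},
    {"emp_no": 102, "hours": 7},
]

def make_sort_file(records, key_field):
    """Return a new list of the records arranged in key-field order."""
    return sorted(records, key=lambda record: record[key_field])

sort_file = make_sort_file(transactions, "emp_no")
print([record["emp_no"] for record in sort_file])  # [101, 102, 103]
```

The sorted copy can then be read in the same key sequence as the master file during sequential processing.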
Report files.
A Report file contains a set of relatively permanent records extracted from the data in a Master
file or generated after processing.
Report files are used to prepare reports, which can be printed at a later date.
Examples of Report files:
Report on Overtime, report on Taxes, report on students’ class performance in the term, etc.
Scratch file.
A Scratch file is a temporary file used to hold data during processing. It contains temporary data,
which can be erased when the task is finished.
Key field.
A Key field is one or more fields in a record that uniquely identify the record or a group of
records.
E.g., an employee’s Serial number may be used to identify the employee records in a Payroll file.
Note. Any field in the record can be used as the key field, provided that it identifies the
record uniquely.
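Whether a field is suitable as a key field can be checked with a short sketch like the one below. The payroll records and field names are hypothetical.

```python
def is_unique_key(records, field):
    """True if no two records share the same value in the given field."""
    values = [record[field] for record in records]
    return len(values) == len(set(values))

payroll_file = [
    {"serial_no": "E001", "name": "Wanjiru", "department": 10},
    {"serial_no": "E002", "name": "Otieno", "department": 10},
    {"serial_no": "E003", "name": "Mutua", "department": 20},
]

print(is_unique_key(payroll_file, "serial_no"))   # True
print(is_unique_key(payroll_file, "department"))  # False
```

Here the serial number qualifies as a key field, while the department number does not, because two employees share it.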
Review questions
1. Define a computer file.
2. State four advantages of storing data in computer files over the manual filing system.
3. Differentiate between Logical file structure and physical file structure.
4. With the help of a figure, illustrate the information system Data hierarchy.
5. Define the following terms:
(i). Character.
(ii). Field.
(iii). Record.
(iv). Key field.
6. List 5 types of files used in data processing and their purposes.
FILE ORGANIZATION
File organization refers to the way records are arranged (laid out) within a particular file.
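[Figure: a serial file, showing records 1, 2, 3 separated by inter-record gaps (IRG), running from the File ‘head’ to the File ‘tail’.]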
Serial files can be accessed serially. This involves searching through the entire file record by
record starting from the ‘head’ of the file towards the ‘tail’ of the file.
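Serial access can be sketched as a simple loop that reads one record after another from the head of the file. The records and the field name item_code below are made up for the example.

```python
def serial_search(records, key_field, wanted):
    """Read records from the head of the file until the wanted key is found.

    Returns (record, records_read); record is None if the tail is reached
    without a match.
    """
    records_read = 0
    for record in records:
        records_read += 1
        if record[key_field] == wanted:
            return record, records_read
    return None, records_read

# Hypothetical serial file: records held in the order they were written
serial_file = [
    {"item_code": 207, "qty": 4},
    {"item_code": 105, "qty": 9},
    {"item_code": 311, "qty": 1},
]

print(serial_search(serial_file, "item_code", 311))  # ({'item_code': 311, 'qty': 1}, 3)
print(serial_search(serial_file, "item_code", 999))  # (None, 3)
```

A record near the tail, or one that is not on the file at all, is the slowest case because every record before it has to be read first.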
Review Questions.
1. What do you mean by File Organization?
2. State and explain four types of file organization.
3. Distinguish between:
(a). Sequential and serial file organization methods.
(b). Random and indexed-sequential file organization methods.
4. (a). Describe how files are organized and accessed on tape.
(b). What are the disadvantages of storing files on tape?
5. Differentiate between Sequential and Indexed Sequential methods of file organization on
disk.
6. (a). What is random file organization? State its advantages.
(b). How are Random files accessed on disk?
7. Identify four file processing methods.
8. Discuss four considerations for choosing a file organization method.
Review questions
1. Define the term “Data processing modes”.
2. Mention five types of electronic data processing modes.
Batch processing
In batch processing, data or transactions are collected & accumulated together over a specified
period of time, e.g., daily, weekly, or monthly. The data is then input & processed at once (or as
a single unit) to produce a batch of output.
For example:
In a payroll processing system, details of employees such as the number of hours worked and the
rate of pay may be collected for a period of 1 month, after which they are used to process the
payment for the duration worked.
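The payroll example can be sketched as below: records are accumulated over the period and nothing is calculated until the whole batch is processed in a single run. The employee numbers, hours and rates are invented for the illustration.

```python
from collections import defaultdict

def process_batch(batch, rates):
    """Process all accumulated (employee number, hours) records in one run."""
    hours = defaultdict(int)
    for emp_no, hours_worked in batch:
        hours[emp_no] += hours_worked
    # One batch of output: the payment due to each employee
    return {emp_no: total * rates[emp_no] for emp_no, total in hours.items()}

# Records collected during the period (hypothetical data)
batch = [(101, 8), (102, 6), (101, 7), (102, 8)]
rates = {101: 200, 102: 250}  # rate of pay per hour

payments = process_batch(batch, rates)
print(payments)  # {101: 3000, 102: 3500}
```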
Data collection is usually done off-line (i.e. away from the CPU) on special machines known as
Data entry terminals. The data is entered & stored on a disk in a batch queue for a while. The
jobs are then input & processed one or more at a time under the control of the Batch operating
system, and the results are output.
Batches of transactions are scheduled for processing by assigning them priorities. The priorities
are assigned as percentages, e.g. 95%, 60%, etc. The highest-priority jobs are
processed first, while the lower-priority jobs are processed once the computer resources (i.e., CPU
time, Memory & I/O devices) are released by the higher-priority jobs.
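Priority scheduling of batch jobs can be sketched with a priority queue. The job names and percentages are hypothetical, and the use of Python's heapq module is simply one convenient way to pick the highest priority first; it is not a description of how any particular Batch operating system works.

```python
import heapq

def run_order(jobs):
    """Return job names in the order a priority-based scheduler would run them.

    jobs is a list of (name, priority_percent) pairs.
    """
    # heapq is a min-heap, so the priority is negated to pop the highest first;
    # the index keeps jobs of equal priority in their arrival order.
    heap = [(-priority, index, name) for index, (name, priority) in enumerate(jobs)]
    heapq.heapify(heap)
    order = []
    while heap:
        _priority, _index, name = heapq.heappop(heap)
        order.append(name)
    return order

jobs = [("stock report", 60), ("payroll", 95), ("mailing list", 30)]
print(run_order(jobs))  # ['payroll', 'stock report', 'mailing list']
```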
Once the processing of a given batch starts, there is no interaction between the operator & the
CPU. Therefore, the user cannot intervene to perform amendments to the program.
A job is not processed until it is fully input. In addition, a program must wait its turn before
processing the data. This means that there will be a delay in obtaining results. For instance, a
job may wait in the batch queue for minutes or hours depending on the workload. Hence, Batch
processing cannot be used when the results are needed immediately.
Review questions
1. Briefly explain Batch processing.
2. Describe the application, advantages and disadvantages of batch processing.
Online processing
In online processing, data or the input transactions are processed as soon as they are received
to produce the information required.
Online processing occurs when the transactions are processed to update (or make any change in)
a computer file immediately after the transactions occur.
In online processing, all the Input/Output facilities and communication equipment are under the
direct control of the central Processor.
In online processing, the operator communicates directly with the computer’s operating system
using commands, which are then interpreted by the supervisor. This means that the operator can
interact with the system at any point of processing using the Input/Output facilities.
Note. In online processing, the data input units (terminals) are connected directly to the central
computer using communication links.
In such a configuration, the data (input transactions) are communicated from the workstations to
the central computer for processing, & the results communicated back to the workstations
through the telecommunication links.
Characteristics of Online processing system.
√ The input device is connected directly to the computer.
√ The input data is processed immediately. Processing is completed within a short time (usually
1 or 2 minutes), depending on the speed of the system.
Application areas for online processing systems.
1. Banking:
A bank customer can make an inquiry using an online terminal. The system would then
respond immediately by accessing the relevant file, and inform the customer on the status of
his/her account.
2. Stock exchanges:
3. Stock control:
Terminals located in warehouses enable stock to be re-ordered automatically, reservations to be
made, outstanding orders to be followed up, & picking lists to be printed.
4. Manufacturing plants: - to control the progress of work.
5. Inventory status: - i.e., ordering & reporting of geographically dispersed distributors.
Advantages of online processing.
1. Files are held online; therefore the information generated can be used to update the master
files directly.
2. The Information is readily available for immediate decision-making.
3. File enquiries are possible at any given time through the terminals (workstations).
Review questions
1. (a). Discuss Online processing.
(b). Mention and explain the Application, Advantages and disadvantages of Online
processing mode.
Real-time systems.
A Real-time system is capable of processing data so quickly that the results (output)
produced are able to influence, control, or affect the outcome of the activity or process currently
taking place.
In a Real-time data processing system, the computer receives & processes the incoming data as
soon as it occurs, updates the transaction file, and gives an immediate response that would affect
the events as they happen.
The input-originating workstations may be connected directly to the central processor by
appropriate communication equipment. In this case, a transaction is processed & completed
immediately, or at the same time as it occurs. This also ensures a quick update of the affected
files (records).
The main purpose of real-time processing is to provide accurate, up-to-date information, and
hence better services based on the real situation.
Requirements of a real-time processing system.
1). There must be a direct connection between Input/Output devices & the central Processor.
2). The Response time should be fairly fast, to allow a 2-way communication (interaction)
between the user & central processor.
Review questions
1. (a). Name two industries that extensively use Real-time processing.
(b). Name 3 advantages of a Real-time system.
Time-sharing systems
Multi-programming systems
Multi-programming (also referred to as Multi-tasking) refers to a type of processing in which
more than one program residing in the computer memory is executed concurrently by a single
Processor.
A multi-programming system allows the user to run 2 or more programs, all of which are in the
computer’s Main memory, at the same time.
The jobs are scheduled to run automatically by the Processor under the control of a
Multi-programming (or Multi-tasking) operating system.
The schedule is such that the Processor bound jobs (i.e., jobs that require much of the C.P.U
time as compared to the peripheral time) are assigned low priorities so that they do not tie up
the C.P.U time. The Peripheral or Print bound jobs (i.e., jobs that require much of the peripheral
time as compared to the C.P.U time) are allocated the C.P.U time whenever it is available.
The OS allocates each program a time-slice, and decides the order in which they will be
executed. In this case, the programs take turns at short intervals of processing time. The
programs to be run are loaded into the memory and the CPU begins execution of the first one.
When its request is satisfied or its time-slice ends, execution of the second program
starts, and so on.
Note. A Multi-programming system is able to work on several programs at the same time. It
works on the programs one after the other, and at any given time it executes instructions from
one program only. However, the computer works so quickly that it appears to be executing the
programs at the same time.
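The way programs take turns on a single Processor can be simulated with a short sketch. It assumes a simple round-robin order with equal time-slices, which is only one possible scheduling policy and ignores the priorities described earlier; the program names and CPU times are invented.

```python
from collections import deque

def round_robin(programs, time_slice):
    """Simulate a single CPU sharing its time among several programs.

    programs maps each program name to the units of CPU time it needs.
    Returns the sequence of (program, units_run) turns.
    """
    queue = deque(programs.items())
    turns = []
    while queue:
        name, remaining = queue.popleft()
        used = min(time_slice, remaining)
        turns.append((name, used))        # only one program runs at any instant
        if remaining > used:
            queue.append((name, remaining - used))  # unfinished: rejoin the queue
    return turns

turns = round_robin({"A": 3, "B": 5, "C": 2}, time_slice=2)
print(turns)
# [('A', 2), ('B', 2), ('C', 2), ('A', 1), ('B', 2), ('B', 1)]
```

Each entry in the result is one turn on the CPU; because the turns are short, all three programs appear to the user to be running together.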
Advantages of multi-programming.
1. Increases productivity of a computer.
2. Reduces the CPU’s idle time.
3. Reduces the incidence of peripheral bound operation.
Disadvantages of multi-programming.
1. Requires more expensive CPU.
2. A Multi-tasking operating system is complex & difficult to operate.
3. Requires more expensive Input/Output facilities.
Review questions
1. (a). Define Multi-programming.
(b). What are the factors which make multiprogramming possible?
(c). State the benefits to be derived from Multi-programming.
(d). Discuss the hardware and software facilities necessary to facilitate Multi-programming.
Distributed processing.
Review questions
Interactive processing
Interactive processing occurs if the computer & the terminal user can communicate with each
other. It allows a 2-way communication between the user & the computer.
As the program executes, it keeps on prompting the user to provide input or respond to prompts
displayed on the screen. In other words, the user makes the requests and the computer gives the
responses.
In Interactive processing, the data is processed individually and continuously as transactions take
place and output is generated instantly.
Interactive processing is mostly applied in Ticket reservation systems.
Multi-processing systems
Multiprocessing refers to the processing of more than one task at the same time on different
processors of the same computer.
In a multi-processing system, a computer may have 2 or more independent processors, which
work together in a coordinated manner and share the same computer memory.
This means that, at any given time, the processors could execute instructions from two or more
different programs, or different parts of one program, simultaneously. In such systems, each CPU
may be dedicated to one type of application, e.g., one CPU may handle all terminal users, while
another may process only the batch jobs.
The activities of the system are coordinated by the Multi-processing operating system.
Advantage: - if one CPU fails, the other(s) can take over the workload until repairs are made.
Review questions
1. Explain the difference between Multi-Programming and Multi-Processing.
Conversational mode.
This is interactive computer operation where the response to the user’s message is immediate.
Review questions
1. Mention 5 factors to be considered when selecting the data processing mode suitable for use
in an organization.