Data Representation and Processing G10
MARIO@2022
Data representation
▪ Data is a collection of raw or unprocessed facts that does
not have any meaning on its own.
Examples of data
• Text
• Numbers
• Audio
• Video
• Images
▪ Information is processed data that conveys meaning and is
useful to the user or organization in making decisions or
solving problems.
The concept of data representation
▪ Data representation refers to the form in which
data is stored, processed and transmitted.
▪ Data can be represented using digital or analogue
methods.
Analogue signal
▪ An analogue signal is one whose value varies smoothly
and continuously.
▪ Examples of analogue signals
• Human voice
• Dimmer light switch
Digital signal
▪ A digital signal is represented in one of two distinct states,
either 0 or 1.
▪ In a computer, text, music, images, numbers, video and audio
data are all held in digital form.
▪ The computer uses millions of electronic switches to
represent data in only one of two distinct states: off
(0) or on (1).
▪ Representing data as 1s and 0s is known as the binary
system.
▪ Data is also stored in digital form on CDs, DVDs and flash
memory.
▪ Computers use a sequence of 0s and 1s to represent
various characters as defined by a coding scheme.
▪ Example: the number 4 is represented in eight bits as
0 0 0 0 0 1 0 0
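The bit pattern above can be reproduced with a short Python sketch. The function name to_8_bits is only an illustrative choice, and the sketch assumes the value is a whole number that fits in one byte (0 to 255).

```python
def to_8_bits(n: int) -> str:
    """Return the 8-bit binary pattern for a whole number from 0 to 255."""
    if not 0 <= n <= 255:
        raise ValueError("value does not fit in 8 bits")
    # "08b" means: binary digits, padded with leading zeros to width 8
    return format(n, "08b")


print(to_8_bits(4))  # 00000100
```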
Coding scheme
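▪ ASCII (American Standard Code for Information
Interchange)
• It is the most widely used coding scheme for data
representation. Standard ASCII uses 7 bits and can represent
128 codes; 8-bit extended versions can represent 256.
▪ Unicode
• A 16-bit Unicode encoding can represent 65 536 characters
and symbols; the full standard has room for over a million.
• It is capable of representing all the world's current
languages.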
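As a small illustration of a coding scheme, the sketch below looks up the numeric code of a character and shows it as a bit pattern. It relies on Python's built-in ord function; the helper name char_to_bits is an assumption made for this example only.

```python
def char_to_bits(ch: str) -> str:
    """Return the 8-bit pattern for a single standard ASCII character."""
    code = ord(ch)  # numeric code assigned to the character
    if code > 127:
        raise ValueError("not a standard ASCII character")
    return format(code, "08b")


for ch in "A4":
    print(ch, ord(ch), char_to_bits(ch))
# A 65 01000001
# 4 52 00110100
```

Note that the character "4" (code 52) has a different bit pattern from the number 4, because the coding scheme assigns codes to characters rather than to numeric values.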
Reasons for data representation
▪ Makes it possible for humans to communicate with a digital
computer that uses 1s and 0s for processing.
▪ When data is entered in human-readable form via input devices, it is
converted into machine-readable (binary) form, which the
computer can process and store in its memory.
▪ Makes it possible for components of the computer to communicate
with each other successfully.
▪ Standards also enable manufacturers to produce components
and be confident that they will operate correctly in a computer.
Exercise
Data collection and preparation
Methods of data collection
1. Data from digital text repositories.
2. Interviews
Data from digital text repositories.
Interviews
Face-to-face interviews
Advantages of using face-to-face interviews
▪ Accurate screening.
▪ Capture verbal and non-verbal cues.
▪ Keep focus
▪ Capture emotions and behaviors
Disadvantages of using face-to-face interviews
▪ Cost
▪ Quality of data depends on the interviewer
▪ Manual data entry
▪ Limit in sample size
Telephone interview
Disadvantages of telephone interview
Computer-Aided Personal Interviewing (CAPI)
Advantages of CAPI
Disadvantages of CAPI
Questionnaires
▪ Researchers develop questionnaires to obtain
information.
1. Paper-pencil questionnaires: these can be sent to a
large number of people.
Advantages
1. Saves the researcher time and money.
2. Participants are more truthful.
Disadvantages
1. Many people who receive questionnaires may not
return them.
2. Web-based questionnaire:
▪ Participants receive an e-mail with a link to a secure
website where they fill in a questionnaire.
▪ It is quicker, with fewer details to handle.
▪ However, it excludes people without access to a
computer.
Why questionnaires are important
Other sources of data
▪ Barcodes
▪ Optical Mark Recognition (OMR)
▪ Optical Character Recognition (OCR)
▪ Magnetic Ink Character Recognition (MICR)
▪ Smart cards
▪ Magnetic stripes
[Figure: barcode reader, optical mark reader, Quick Response (QR) code]
[Figure: magnetic stripe cards and magnetic stripe card reader]
Methods of data preparation and data errors
Data preparation
Data processing errors
1. Routing errors:
Asking questions in the wrong order
2. Consistency errors:
Contradictory responses
3. Range errors:
Responses outside the range of reasonable answers
4. Copying errors
5. Coding errors
Data verification
▪ Data must always be validated and verified to catch human errors.
Data validation
▪ Data validation is the process of comparing data against a set of
rules to determine whether the data is valid.
▪ Before processing, data validation techniques such as format, range
and data type checks are applied to the input to increase the
accuracy and consistency of data and to prevent invalid data from
being entered.
Methods for checking validity
1. Use of validation rules
2. Required fields
3. Specific field size
4. Input masks
5. Default values
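The checks named above (required field, data type, range and format) can be sketched in Python as follows. The record layout, the field names name, age and date, and the allowed age range are hypothetical choices made only for this illustration.

```python
import re


def validate_record(record: dict) -> list:
    """Return a list of validation errors; an empty list means the record is valid."""
    errors = []

    # Required field check: the name must be present and not empty
    if not record.get("name"):
        errors.append("name is required")

    # Data type check, followed by a range check on the same field
    age = record.get("age")
    if not isinstance(age, int):
        errors.append("age must be a whole number")
    elif not 5 <= age <= 25:  # hypothetical range for school pupils
        errors.append("age is outside the expected range")

    # Format check: the date must look like DD/MM/YYYY
    if not re.fullmatch(r"\d{2}/\d{2}/\d{4}", str(record.get("date", ""))):
        errors.append("date must look like DD/MM/YYYY")

    return errors


print(validate_record({"name": "Pupil A", "age": 16, "date": "03/05/2022"}))  # []
print(validate_record({"name": "", "age": 40, "date": "2022-05-03"}))
```

The second call returns three error messages, so the record would be rejected before processing.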
Exercise
1. A group of Grade 10 G pupils at St Clements wanted to collect
information about a malaria outbreak at school by interviewing
fellow pupils.
(a) State two types of interviews that would be appropriate for
collecting data at school.
(i) ………………………
(ii) ………………………
(b) Explain the reason why a questionnaire is used to collect data.
(c) The information captured must be verified. Give the reason for this.
Analogue-to-digital converters
▪ An analogue-to-digital converter is abbreviated as ADC, A/D or A to D.
▪ It is a device that converts a smooth, continuous signal into a digital signal.
▪ For example, when you plug a microphone into a computer, you
actually connect it to the input jack of the sound card.
▪ A sound card in a computer system is a circuit board that converts
sound from analogue to digital.
▪ ADCs are also used in devices that capture video
output.
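The idea behind an ADC can be sketched in Python: measure the smooth signal at regular intervals (sampling) and round each measurement to one of a fixed number of levels (quantisation). This is a simplified model under stated assumptions, not a description of how any particular sound card works; the function name, the 1 Hz test tone and the 3-bit resolution are all chosen only for illustration.

```python
import math


def sample_and_quantise(signal, duration_s, sample_rate_hz, bits):
    """Sample an analogue signal (values from -1.0 to 1.0) and return integer codes."""
    levels = 2 ** bits                       # number of distinct digital values
    count = int(duration_s * sample_rate_hz)
    codes = []
    for i in range(count):
        t = i / sample_rate_hz               # time of this sample in seconds
        value = max(-1.0, min(1.0, signal(t)))
        # Map the range -1.0..1.0 onto the whole-number codes 0..levels-1
        codes.append(round((value + 1.0) / 2.0 * (levels - 1)))
    return codes


def tone(t):
    """A smooth 1 Hz sine wave standing in for an analogue input."""
    return math.sin(2 * math.pi * t)


codes = sample_and_quantise(tone, duration_s=1.0, sample_rate_hz=8, bits=3)
print([format(c, "03b") for c in codes])
```

With 3 bits there are only 8 possible codes, so the smooth wave becomes a short list of binary numbers. A higher sample rate and more bits give a closer copy of the original signal.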
Digital to Analogue converters
▪ A digital-to-analogue converter (DAC) converts a digital signal into an analogue signal.
▪ When you connect a loudspeaker or headphones to your
computer, you require a DAC. The sound card serves as the DAC:
it converts the digital data from the computer into
analogue signals, which the loudspeaker then converts into sound.
▪ CD players, audio players and old TVs have DACs.
▪ A DAC uses a technique called interpolation to ensure that the
analogue signal is as smooth and continuous as possible.
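Interpolation means filling in values between the known samples. The sketch below uses straight-line (linear) interpolation, which is only one simple form of the idea and is assumed here for illustration; real DACs may use other methods. The function name and the sample values are hypothetical.

```python
def interpolate(samples, factor):
    """Insert evenly spaced in-between values so the output changes more smoothly."""
    if not samples:
        return []
    smooth = []
    for current, following in zip(samples, samples[1:]):
        for step in range(factor):
            # Move a fraction of the way from the current sample to the next one
            smooth.append(current + (following - current) * step / factor)
    smooth.append(float(samples[-1]))
    return smooth


print(interpolate([0, 4, 2], 2))  # [0.0, 2.0, 4.0, 3.0, 2.0]
```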
▪ In a dial-up connection, you need a modem or router
that acts as both an ADC and a DAC.
▪ A modem receives analogue signals and converts them
into digital signals for processing.
Exercise
Types of data representation
Data storage/ capacity
Bit position  7 6 5 4 3 2 1 0
Bit value     0 1 1 0 1 1 1 0
A byte (8 bits)
Bit position  3 2 1 0
Bit value     0 1 1 0
A nibble (4 bits)
Measure of storage capacity
Number systems
Decimal number system
▪ Base 10
▪ Number of elements 10
▪ Elements 0 1 2 3 4 5 6 7 8 9
Place value system
Example: 5432.9
Position        3      2      1      0      -1
Place value     10^3   10^2   10^1   10^0   10^-1
Decimal value   1000   100    10     1      0.1
Digit           5      4      3      2      9
Expanded notation
Example 1
5432.9
Solution
= 5 x 10^3 + 4 x 10^2 + 3 x 10^1 + 2 x 10^0 + 9 x 10^-1
= 5 x 1000 + 4 x 100 + 3 x 10 + 2 x 1 + 9 x 0.1
= 5000 + 400 + 30 + 2 + 0.9
= 5432.9
Binary number system
▪ It is used by computers
▪ Base 2
▪ Number of elements 2
▪ Elements 0 1
Position        3     2     1     0     -1
Place value     2^3   2^2   2^1   2^0   2^-1
Decimal value   8     4     2     1     0.5
Digit           1     0     1     1     1
Convert binary to decimal
Example
1011.1_2
= 1 x 2^3 + 0 x 2^2 + 1 x 2^1 + 1 x 2^0 + 1 x 2^-1
= 1 x 8 + 0 x 4 + 1 x 2 + 1 x 1 + 1 x 0.5
= 8 + 0 + 2 + 1 + 0.5
= 11.5
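The place-value method used in this example can be written as a short Python sketch: each digit is multiplied by the base raised to the power of its position, and the products are added. The function name to_decimal and the DIGITS lookup string are assumptions made for this illustration.

```python
DIGITS = "0123456789ABCDEF"


def to_decimal(number: str, base: int) -> float:
    """Convert a number written in the given base (2 to 16) to its decimal value."""
    whole, _, fraction = number.upper().partition(".")
    total = 0.0
    # Integer part: positions 0, 1, 2, ... counted from the right
    for position, digit in enumerate(reversed(whole)):
        value = DIGITS.index(digit)
        if value >= base:
            raise ValueError(f"digit {digit} is not valid in base {base}")
        total += value * base ** position
    # Fractional part: positions -1, -2, ... counted from the point
    for position, digit in enumerate(fraction, start=1):
        value = DIGITS.index(digit)
        if value >= base:
            raise ValueError(f"digit {digit} is not valid in base {base}")
        total += value * base ** -position
    return total


print(to_decimal("1011.1", 2))  # 11.5
```

Because the base is a parameter, the same function can also be used to check conversions from other bases up to 16.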
The octal number system
▪ It is also used in computers
▪ Base 8
▪ Number of elements 8
▪ Elements 0 1 2 3 4 5 6 7
Example: 215.22_8 can be represented as follows
Position        2     1     0     -1      -2
Place value     8^2   8^1   8^0   8^-1    8^-2
Decimal value   64    8     1     0.125   0.015625
Digit           2     1     5     2       2
Convert octal to decimal
215.22_8
= 2 x 8^2 + 1 x 8^1 + 5 x 8^0 + 2 x 8^-1 + 2 x 8^-2
= 2 x 64 + 1 x 8 + 5 x 1 + 2 x 0.125 + 2 x 0.015625
= 128 + 8 + 5 + 0.25 + 0.03125
= 141.28125
The hexadecimal number system
▪ Base 16
▪ Number of elements 16
▪ Elements 0 1 2 3 4 5 6 7 8 9 A B C D E F
Example: 2AF.B_16
Position        2      1      0      -1
Place value     16^2   16^1   16^0   16^-1
Decimal value   256    16     1      0.0625
Digit           2      A      F      B
Solution 1
(a) 2AF.B_16
= 2 x 16^2 + A x 16^1 + F x 16^0 + B x 16^-1
= 2 x 256 + 10 x 16 + 15 x 1 + 11 x 0.0625
= 512 + 160 + 15 + 0.6875
= 687.6875
Solution 2
(b) 8E_16
= 8 x 16^1 + E x 16^0
= 8 x 16 + 14 x 1
= 128 + 14
= 142
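A hand-worked whole-number conversion like this one can be checked with Python's built-in int function, which accepts the base as a second argument. The small wrapper name check_conversion is a hypothetical helper for this example; note that int only handles whole numbers, not fractional parts.

```python
def check_conversion(text: str, base: int) -> int:
    """Return the decimal value of a whole number written in the given base."""
    return int(text, base)


print(check_conversion("8E", 16))   # 142
print(check_conversion("1011", 2))  # 11
print(check_conversion("705", 8))   # 453
```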
EXERCISE
Convert the following hexadecimal numbers to decimal
(a) 4B
(b) 2AF
(c) AFB
(d) 8D
Convert from decimal to binary, octal and hexadecimal
▪ A decimal number has two parts
(i) Integer part
(ii) Fractional part
Example 4.15
▪ 4 is the integer part and .15 is the fractional part.
▪ The integer part of a decimal number is converted to
any base by repeatedly dividing by that base and recording the remainders.
Example 1: Convert (a) 25_10 to binary, (b) 453_10 to octal and (c) 3456_10 to hexadecimal.
Solution (a)
Base   Quotient   Remainder
2      25
2      12         1
2      6          0
2      3          0
2      1          1
       0          1
25_10 = 11001_2 (remainders read from bottom to top)
Solution (b)
Base   Quotient   Remainder
8      453
8      56         5
8      7          0
       0          7
453_10 = 705_8
Solution (c)
Base   Quotient   Remainder
16     3456
16     216        0
16     13         8
       0          13 (D)
3456_10 = D80_16
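The repeated-division tables in these solutions follow one routine: divide by the base, record the remainder, and repeat with the quotient until it reaches 0. A minimal Python sketch of that routine is shown below; the function name from_decimal and the DIGITS lookup string are assumptions for this example.

```python
DIGITS = "0123456789ABCDEF"


def from_decimal(number: int, base: int) -> str:
    """Convert a non-negative whole decimal number to the given base (2 to 16)."""
    if number < 0 or not 2 <= base <= 16:
        raise ValueError("expects a non-negative number and a base from 2 to 16")
    if number == 0:
        return "0"
    remainders = []
    while number > 0:
        number, remainder = divmod(number, base)  # quotient and remainder in one step
        remainders.append(DIGITS[remainder])
    # The answer is the remainders read from last to first
    return "".join(reversed(remainders))


print(from_decimal(25, 2))     # 11001
print(from_decimal(453, 8))    # 705
print(from_decimal(3456, 16))  # D80
```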
Converting from decimal to hexadecimal
Example: Convert 3456_10 to hexadecimal
The Data Processing Cycle, Errors and Data Integrity
DATA PROCESSING
▪ Data is a single item or fact that has no meaning
or news value.
▪ Information is processed data that does have
meaning or news value to the users.
▪ The data processing cycle consists of a series of steps
in which raw data (input) is fed into a process
(CPU) to produce output.
Input -> Processing -> Output
INPUT
▪ Acquiring or gathering data and entering it into the computer
system.
▪ The data is validated by checking for completeness and
accuracy.
▪ Data refers to unprocessed text, images, video or audio.
▪ Examples
• Capturing the prices of items in a supermarket using a bar
code reader.
• Gathering information using a form on a website.
• Typing using a word processing program.
Input devices
▪ Keyboard
▪ Mouse
▪ Scanners
▪ Barcode readers
▪ Touchpads
▪ Digital cameras
▪ Video cameras
▪ Microphones
▪ Voice recognition
▪ Biometric devices
Processing
▪ The operation performed on the data to produce
information.
▪ The CPU is responsible for data processing.
▪ All data that is currently being processed by the CPU is
stored temporarily in random access memory (RAM).
Storage
▪ Saving data for future use.
▪ Data and information stored in secondary storage are
not lost when the computer is switched off.
Storage devices
▪ CDs
▪ DVDs
▪ Blu-ray
▪ External hard drive
▪ USB drives
Output
▪ Presenting the information in the required format for the user.
▪ Output can be in the form of hardcopy, softcopy, audio or
video.
Output Devices
▪ Printers
▪ Monitor
▪ Fax
▪ Headset
▪ Speakers
▪ Multifunction devices
▪ Data projectors
Communication
▪ Computers are able to communicate with other
computers and mobile devices.
Exercise
1. Explain the meaning of the term data.
2. State two input devices which automatically
capture data.
3. Give an application that uses the devices you
have mentioned in question 2.
4. State two examples of output devices.
5. What is data integrity?
Data integrity
▪ Data integrity refers to the accuracy and
consistency of data stored in a database.
▪ If data is inaccurate, then the results will be
inaccurate. This is called GIGO (Garbage In,
Garbage Out).
▪ Data integrity is vital because people make
decisions and take actions based on processed
data.
Types of computer processing files
Master and transaction files
▪ A file is a collection of organized data.
▪ The master file contains all the permanent or
semi-permanent data relating to a particular
application.
▪ A transaction file contains all the transactions
that are captured as they occur over a period
of time, and it is used to update the master file.
Report files
▪ Are derived from records within a master file or
transaction file.
Backup files
▪ Backups are duplicate copies of important files that are
stored in a safe place and used to recreate the master file in
case of loss.
▪ Backups can take place in different ways:
1. In batch processing, three generations of the master and
transaction files are usually kept. This is sometimes referred to as
the grandfather-father-son method of backup.
2. Cloud storage/online update: data is stored on an external
server on the internet.
Reference files
▪ These files contain referential data, such as tables and lists
which are necessary to support the data processing of an
organization when performing calculations or checking
the accuracy of input data. e.g. charts, tables of
inventory codes
Electronic data processing modes
1. Online processing
2. Interactive processing
3. Distributed processing
4. Time-sharing processing
5. Batch processing
6. Multiprocessing
7. Multiprogramming
8. Multi-tasking
9. Real-time processing
Online processing
▪ This is an automated way to enter and process data
as long as the document is available.
▪ Each transaction is processed as soon as it is
entered.
▪ It is a real-time method
▪ It is referred to as an online transaction processing
(OLTP) system.
▪ An example of an application that uses online processing
is bar code scanning.
Interactive processing
▪ The user provides the computer with instructions/input
during processing and observes the results.
Distributed processing
▪ Geographically distributed resources such as storage
devices, data sources, software and multiple
independent computers are connected in a single network to
perform a particular task.
Time-sharing processing
▪ A computer system is shared among multiple users,
giving each user the illusion that they have exclusive
control of the system.
Batch processing
▪ Transactions are accumulated over a period of time
in a transaction file.
▪ The transaction file is used to update the master file
in one processing run without human intervention.
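A batch run of this kind can be sketched in Python. The account numbers, balances and the function name run_batch are hypothetical; the sketch assumes the master file holds one balance per account and each transaction is an amount to add to (or subtract from) that balance.

```python
def run_batch(master: dict, transactions: list) -> dict:
    """Apply every accumulated transaction to a copy of the master file in one run."""
    new_master = dict(master)  # the old master is left unchanged
    for account, amount in transactions:
        new_master[account] = new_master.get(account, 0) + amount
    return new_master


# Hypothetical master file and a transaction file collected over a period of time
master = {"A001": 500, "A002": 120}
transactions = [("A001", -200), ("A002", 80), ("A001", 50)]

new_master = run_batch(master, transactions)
print(new_master)  # {'A001': 350, 'A002': 200}
```

Because the old master file is not modified, it can be kept alongside the transaction file as an earlier generation for backup purposes.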
Multiprocessing
▪ The operating system supports the processing of programs
by more than one processor, for example on a multi-core
processor.
Multiprogramming
▪ It was designed to optimize CPU usage.
Multi-tasking
▪ The operating system allows a single user to work on two
or more programs at the same time.
▪ Users can run two or more programs concurrently.
▪ The program in the foreground, which is currently in use,
is called the active program.
▪ The other programs that are running, but not in use are
in the background.
Real-time processing
▪ Transactions are processed as soon as they take place
and the relevant master files are updated.
▪ Examples: online banking and ATM transactions.
Exercise
1. Explain the meaning of real time.
2. List two basic actions that are performed on a current file to
create a new master file.
3. Name a file used to update a master file.
4. State one application which uses
(a)Real time processing
(b)On-line processing
Mario Chongo Makhanga
BICTed