Madhusanka Liyanage: Lecture 3: Data Representation in Computer Systems

Lecture 3 of COMP 30660 focuses on data representation in computer systems, covering numerical data, character codes, and error detection techniques. It explains the basic units of data, such as bits, bytes, and words, and delves into integer and floating-point representations, including the IEEE-754 standard. The lecture also discusses character encoding schemes like ASCII and Unicode, as well as methods for data recording and transmission, highlighting the importance of error detection and correction in data integrity.

COMP 30660: Computer Architecture and Organization (CONV)

Lecture 3: Data Representation in Computer Systems
http://www.flickr.com/photos/sarahseverson/

Madhusanka Liyanage
School of Computer Science
University College Dublin, Ireland
[email protected]
Learning Objectives

• Understand the fundamentals of numerical data
representation in digital computers.
• Gain familiarity with the most popular character codes.
• Become aware of the differences between how data is
stored in computer memory and how it is transmitted
over networks.
• Understand the concepts of error detecting and
correcting codes.

Data and Information

• Data can be defined as a representation of facts,
concepts, or instructions in a formalized manner that is
suitable for communication, interpretation, or
processing by humans or electronic machines.
• Information is organized or classified data, which has
some meaningful values for the receiver.
• Information is the processed data on which decisions
and actions are based.

Basic Unit of Data

• Units of data are used to indicate the capacity of a
data storage system or communication channel.
• Units covered here:
– Bit
– Byte
– Nibble
– Crumb
– Word

Bit

• A bit is the most basic unit of data in a computer.

– It is a state of “on” or “off” in a digital circuit.
– Sometimes these states are “high” or “low”
voltage instead of “on” or “off”

Byte

• A byte is a group of eight bits.

– A byte is the smallest
possible addressable unit
of computer storage.
– The term, “addressable,”
means that a particular
byte can be retrieved
according to its location in
memory.

Nibble
• A group of four bits is called a nibble (or nybble).
– Half a byte
– Bytes, therefore, consist of two nibbles: a
“high-order” (upper) nibble and a “low-order”
(lower) nibble.
– Nibble is most often used in the context of
hexadecimal number representations, since a
nibble has the same amount of information as
one hexadecimal digit.
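The correspondence between a nibble and one hexadecimal digit can be sketched as follows. This is a minimal illustration only; the function name `split_nibbles` is an assumption made for this example and does not come from the lecture.

```python
def split_nibbles(byte: int) -> tuple[int, int]:
    """Return (high nibble, low nibble) of an 8-bit value."""
    if not 0 <= byte <= 0xFF:
        raise ValueError("expected a value in the range 0..255")
    high = (byte >> 4) & 0xF  # upper four bits
    low = byte & 0xF          # lower four bits
    return high, low


high, low = split_nibbles(0xA7)
# Each nibble prints as exactly one hexadecimal digit.
print(f"{high:X} {low:X}")  # A 7
```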

Crumb
• A pair of two bits or a quarter byte was called a
crumb.
– Quarter of a byte
– Often used in early 8-bit computing.

Word

• A word is a contiguous group of bytes.
– Words can be any number of
bits or bytes.
– Word sizes of 16, 32, or 64
bits are most common.
– In a word-addressable
system, a word is the
smallest addressable unit of
storage.
– The number of bits in a word
is usually defined by the size
of the registers in the
computer's CPU

Data Representation

• Computers work with binary numbers.
• Therefore, numbers, letters, and other
symbols must be converted into their binary
equivalents.
Integers

Integer Representation (Recap)

• The representation of a positive integer
is quite straightforward,
– but we want to represent negative numbers as well
as positive ones.
• Add a sign bit to the representation.
• For a positive number the sign bit is set to 0, and
for a negative number the sign bit is set to 1.
Integer Representation (Recap)

▪ An integer can be represented in fixed-point
representation.
▪ The leftmost bit is treated as the sign bit.
▪ The magnitude of the number is represented by the
remaining bits.

Integer Representation (Recap)

▪ Signed integers can be
represented in the following three ways:
1. Signed magnitude representation.
2. Signed 1’s complement representation.
3. Signed 2’s complement representation.
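The three signed representations listed above can be compared side by side with a small sketch. The function names and the 8-bit default width are assumptions chosen for illustration, and the code assumes the value fits in the chosen width.

```python
def signed_magnitude(value: int, bits: int = 8) -> str:
    """Sign bit followed by the magnitude in plain binary."""
    sign = "1" if value < 0 else "0"
    return sign + format(abs(value), f"0{bits - 1}b")


def ones_complement(value: int, bits: int = 8) -> str:
    """A negative number is the bitwise inverse of its positive pattern."""
    if value >= 0:
        return format(value, f"0{bits}b")
    mask = (1 << bits) - 1
    return format(~abs(value) & mask, f"0{bits}b")


def twos_complement(value: int, bits: int = 8) -> str:
    """A negative number is its one's complement pattern plus one."""
    mask = (1 << bits) - 1
    return format(value & mask, f"0{bits}b")


for name, encode in [("signed magnitude", signed_magnitude),
                     ("one's complement", ones_complement),
                     ("two's complement", twos_complement)]:
    print(f"{name:>17}: +5 = {encode(5)}, -5 = {encode(-5)}")
```

All three agree on the pattern for +5 (00000101) and differ only in how -5 is written.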
But how do we represent floating-point
numbers?

Floating-Point Representation

• The signed-magnitude, one’s
complement, and two’s
complement representations that
we have just presented deal with
integer values only.
• Without modification, these
formats are not useful in
scientific or business applications
that deal with real number
values.
• Floating-point representation
solves this problem.

Floating-Point: Scientific Notation
• Scientific notation is a way of expressing numbers
that are too large or too small to be conveniently
written in decimal form.
– For example:
$0.125 = 1.25 \times 10^{-1}$
$5{,}000{,}000 = 5.0 \times 10^{6}$

Scientific Notation
• A number in scientific notation has a single nonzero digit to the left of the decimal point.
• Numbers written in scientific notation have three components:

Floating-Point Representation
• Computers use a form of scientific notation for
floating-point representation.
• The computer representation of a floating-point number
consists of three fixed-size fields: a sign bit, an exponent,
and a significand.
• This is the standard arrangement of these fields.

Floating-Point Representation

• The one-bit sign field is the sign of the stored value.

• The size of the exponent field determines the range
of values that can be represented.
• The size of the significand (mantissa) determines the
precision of the representation.

Example:
For illustrative purposes, we use a 14-bit model with a 1-bit sign, a 5-bit
exponent, and an 8-bit significand.
– Express $32_{10}$ in the simplified 14-bit floating-point
model.
• We know that 32 is $2^5$. So in (binary) scientific
notation $32 = 1.0 \times 2^5$.
• Using this information, we put $101_2$ ($= 5_{10}$) in the
exponent field and 1 in the significand.

Example: synonymous forms
$32 = 1.0 \times 2^5 = 0.1 \times 2^6 = 0.01 \times 2^7 = 0.001 \times 2^8 = 0.0001 \times 2^9$

• All of these forms are equivalent
representations of 32
in our simplified model.
• Not only do these synonymous
representations waste space,
but they can also
cause confusion.
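As a quick check that the synonymous forms really denote the same number, the following sketch evaluates each one exactly. The helper `binary_value` is a hypothetical name introduced for this example.

```python
from fractions import Fraction


def binary_value(significand: str, exponent: int) -> Fraction:
    """Evaluate a binary significand such as '0.01' times 2**exponent."""
    whole, _, frac = significand.partition(".")
    value = Fraction(int(whole or "0", 2))
    for position, digit in enumerate(frac, start=1):
        if digit == "1":
            value += Fraction(1, 2 ** position)
    return value * Fraction(2) ** exponent


forms = [("1.0", 5), ("0.1", 6), ("0.01", 7), ("0.001", 8), ("0.0001", 9)]
for significand, exponent in forms:
    print(f"{significand} x 2^{exponent} = {binary_value(significand, exponent)}")
```

Every line prints 32, which is why a normalization rule is needed to pick one form.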

Floating-Point Representation: Negative exponents

• Another problem with our system is that we have made
no allowance for negative exponents.
• E.g. there is no way to express $0.25 = 1/4 = 1.0 \times 2^{-2} = 0.1 \times 2^{-1}$.
– Notice that there is no sign in the exponent field!

IEEE-754 Representation
• A technical standard for floating-point arithmetic by
the Institute of Electrical and Electronics Engineers
(IEEE).
• The standard defines several interchange formats,

IEEE-754 Representation: How to Solve the
Synonymous-Forms Issue
• To resolve the problem of synonymous forms, the
significand is normalized so that its first digit
is always 1.
• In our simplified model the integer part is zero and the 1
comes right after the binary point,
e.g. $32 = 0.1 \times 2^6$ rather than $1.0 \times 2^5$.
• This results in a unique pattern for each floating-point
number.
– In the IEEE-754 standard the significand is normalized to the
form 1.f and the leading 1 is implied: it sits to the left of
the binary point and is not stored.

IEEE-754 Representation: How to
Solve Negative Exponents
• To provide for negative exponents, IEEE-754 uses a
biased exponent.
• A bias is a number that is approximately midway in
the range of values expressible by the exponent.
• The exponent field in IEEE-754 is filled by adding the
bias to the true exponent value.
– So we need to subtract the bias from the value in the
exponent field to determine its true value.
• Stored exponent values less than the bias represent negative
exponents, which are used for small (fractional) magnitudes.
IEEE-754 Representation
• The IEEE-754 single-precision floating-point
format uses a bias of 127 over its 8-bit exponent.
• The double-precision format has a bias of 1023
over its 11-bit exponent.
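The biased exponent can be observed directly by unpacking a single-precision value with Python's standard `struct` module. This is a sketch for illustration; the helper name `float32_fields` is an assumption. Note that IEEE-754 normalizes to the form 1.f, so 32 is stored with a true exponent of 5.

```python
import struct


def float32_fields(x: float) -> tuple[int, int, int]:
    """Return (sign, stored exponent, fraction bits) of x in IEEE-754 single precision."""
    (raw,) = struct.unpack(">I", struct.pack(">f", x))
    sign = raw >> 31
    stored_exponent = (raw >> 23) & 0xFF  # 8-bit biased exponent
    fraction = raw & 0x7FFFFF             # 23 stored significand bits
    return sign, stored_exponent, fraction


BIAS = 127
for x in (32.0, 0.0625, -26.625):
    sign, stored, fraction = float32_fields(x)
    print(f"{x:>8}: sign={sign} stored={stored} "
          f"true={stored - BIAS} fraction={fraction:023b}")
```

For 32.0 the stored exponent is 132, and subtracting the bias of 127 recovers the true exponent 5. For 0.0625 the stored exponent is 123, which is below the bias and therefore denotes the negative exponent -4.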

Example 1:
– Express $32_{10}$ in the revised 14-bit
floating-point model with a 5-bit
exponent and an 8-bit significand. Use
16 as the bias.
• We know that $32 = 1.0 \times 2^5 = 0.1 \times 2^6$.
• To use our excess-16 biased exponent, we add 16 to
6, giving $22_{10}$ ($= 10110_2$).
• The resulting fields are: sign 0, exponent 10110, significand 10000000.

Example 2: Representation
– Express $0.0625_{10}$ in the revised 14-bit
floating-point model with a 5-bit
exponent and an 8-bit significand. Use
16 as the bias.
• We know that 0.0625 is $2^{-4}$. So, in (binary) scientific
notation $0.0625 = 1.0 \times 2^{-4} = 0.1 \times 2^{-3}$.
• To use our excess-16 biased exponent, we add
16 to -3, giving $13_{10}$ ($= 01101_2$).

Example 3 (To Do): Representation
– Express $-26.625_{10}$ in the revised 14-bit
floating-point model with a 5-bit
exponent and an 8-bit significand. Use 16
as the bias.
• We find $26.625_{10} = 11010.101_2$. Normalizing, we have
$26.625_{10} = 0.11010101 \times 2^5$.
• To use our excess-16 biased exponent, we add 16 to 5,
giving $21_{10}$ ($= 10101_2$).
• We also need a 1 in the sign bit (for a negative
number).
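The three worked examples can be reproduced with a short encoder for the 14-bit teaching model. This is a sketch under stated assumptions: the function name `encode14` is hypothetical, zero and out-of-range exponents are rejected rather than handled, and extra significand bits are truncated instead of rounded.

```python
import math


def encode14(x: float, bias: int = 16) -> tuple[str, str, str]:
    """Encode x as (sign, 5-bit exponent, 8-bit significand) bit strings."""
    if x == 0:
        raise ValueError("zero is not handled in this sketch")
    sign = "1" if x < 0 else "0"
    # frexp returns m, e with abs(x) == m * 2**e and 0.5 <= m < 1,
    # which is the 0.1xxxxxxx normalization used in the examples.
    mantissa, exponent = math.frexp(abs(x))
    biased = exponent + bias
    if not 0 <= biased < 32:
        raise ValueError("exponent does not fit in 5 bits")
    significand = int(mantissa * 256)  # keep the first 8 fraction bits
    return sign, format(biased, "05b"), format(significand, "08b")


for value in (32, 0.0625, -26.625):
    print(value, encode14(value))
```

The outputs match the hand calculations: exponent fields 10110, 01101, and 10101, with significand 11010101 and sign bit 1 for -26.625.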

What about Characters?

Character Codes

• Calculations are not useful until their results can
be displayed in a manner that is meaningful to
people.
• We also need to store the results of calculations and
provide a means for data input.
• Thus, human-understandable characters must be
converted to computer-understandable bit patterns
(and vice versa) using some sort of character
encoding scheme.
• Character codes are used for this purpose.
Character Codes :
Binary-coded decimal (BCD)
• The earliest computer coding systems used six bits.
• Binary-coded decimal (BCD) was one of these early
codes.
• In BCD, each digit is represented by a fixed number
of bits, usually four or eight.
• It was used by IBM mainframes in the 1950s and
1960s.
• As computers have evolved, character codes have
evolved.
• Larger computer memories and storage devices
permit richer character codes.
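The digit-by-digit idea behind BCD can be sketched as follows, using the four-bit form mentioned above. The function name `to_bcd` is an assumption for this example, and the sketch covers only the numeric digits, not the full six-bit character code.

```python
def to_bcd(number: int) -> str:
    """Encode each decimal digit of a non-negative integer as a 4-bit group."""
    if number < 0:
        raise ValueError("this sketch handles non-negative integers only")
    return " ".join(format(int(digit), "04b") for digit in str(number))


print(to_bcd(1964))  # 0001 1001 0110 0100
```

Unlike plain binary, each decimal digit keeps its own group of bits, so 1964 occupies four groups rather than the 11 bits of its binary value.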

Character Codes : EBCDIC

• In 1964, BCD was extended to an 8-bit code,
Extended Binary-Coded Decimal Interchange
Code (EBCDIC).
• EBCDIC was one of the first widely-used computer
codes that supported upper and lowercase
alphabetic characters, in addition to special
characters, such as punctuation and control
characters.
• EBCDIC and BCD are still in use by IBM
mainframes today.
ASCII (American Standard Code for
Information Interchange)
• Other computer manufacturers chose the 7-bit
ASCII (American Standard Code for Information
Interchange) as a replacement for 6-bit codes.
• Until recently, ASCII was the dominant character
code outside the IBM mainframe world.
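To see the 7-bit codes concretely, the following sketch prints the ASCII bit pattern of each character in a string. The helper name `ascii_bits` is an assumption for this example.

```python
def ascii_bits(text: str) -> list[str]:
    """Return the 7-bit ASCII code of each character as a bit string."""
    codes = text.encode("ascii")  # raises UnicodeEncodeError for non-ASCII input
    return [format(code, "07b") for code in codes]


print(ascii_bits("OK"))  # ['1001111', '1001011']
```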

The ASCII Code

Unicode

• Many of today’s systems embrace Unicode, which was
originally designed as a 16-bit system and now has a
code space of 1,114,112 code points (U+0000 to U+10FFFF),
enough to encode the characters of every
language in the world.
• As of version 14.0 it defines 144,697 characters covering 159 modern and
historic scripts, as well as symbols, emoji, and
non-visual control and formatting codes.
• Maintained by the Unicode Consortium
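A short sketch shows how a code point relates to its byte encodings. The helper name `describe` and the sample characters are assumptions chosen for illustration; UTF-8 and UTF-16 are the standard Unicode encoding forms.

```python
def describe(char: str) -> tuple[str, str, str]:
    """Return (code point, UTF-8 bytes, UTF-16 bytes) as printable hex strings."""
    return (f"U+{ord(char):04X}",
            char.encode("utf-8").hex(" "),
            char.encode("utf-16-be").hex(" "))


# "A", the euro sign, and an emoji outside the first 65,536 code points.
for char in ("A", "\u20ac", "\U0001F600"):
    print(describe(char))
```

The last character needs four bytes in both encodings, which shows that a single 16-bit unit is not enough for every Unicode character.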

Unicode

• The Unicode codespace
allocation is
shown at the right.
• The lowest-numbered
Unicode characters
comprise the ASCII
code.
• The highest provide for
user-defined codes.

Data Recording and Transmission

Codes for Data Recording and
Transmission
• When character codes or numeric values are stored in
computer memory, their values are unambiguous (Fixed).
• However, this is not always the case when data is stored
on magnetic disk or transmitted over a distance of more
than a few feet.
– Owing to the physical irregularities of data
storage and transmission media, bytes can
become distorted or garbled.
• Data errors are reduced by use of suitable coding
methods as well as through the use of various error-
detection techniques.
Codes for Data Recording
and Transmission
• To transmit data, pulses of “high” and “low” voltage
are sent across communications media.
• To store data, changes are induced in the magnetic
polarity of the recording medium.
• The period of time during which a bit is transmitted,
or the area of magnetic storage within which a bit is
stored is called a bit cell.

Non-Return-to-Zero (NRZ)

• The simplest data recording and transmission code
is the non-return-to-zero (NRZ) code.
• NRZ encodes 1 as “high” and 0 as “low.”
• [Figure: NRZ coding of OK in ASCII]
• The problem with NRZ code is that long strings of
zeros or ones contain no transitions, so the receiver can lose synchronization.
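NRZ can be sketched in a few lines by writing one level per bit cell, with H for high and L for low. The function names are assumptions for this example, and the framing of each ASCII character as an 8-bit byte with a leading 0 is also an assumption.

```python
def to_bits(text: str) -> str:
    """ASCII text as a bit string, one 8-bit byte per character (assumed framing)."""
    return "".join(format(byte, "08b") for byte in text.encode("ascii"))


def nrz(bits: str) -> str:
    """NRZ: 1 is sent as high (H), 0 as low (L), one level per bit cell."""
    return "".join("H" if bit == "1" else "L" for bit in bits)


bits = to_bits("OK")
print(bits)       # 0100111101001011
print(nrz(bits))  # LHLLHHHHLHLLHLHH
```

The run of four H levels in the middle has no transition, which is the situation that makes clock recovery difficult.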
Non-return-to-zero-invert (NRZI)

• Non-Return-to-Zero-Invert (NRZI) reduces this
synchronization loss by providing a transition (either
low-to-high or high-to-low) for each binary 1 and no
transition for each binary 0.
• Although it prevents loss of synchronization over long
strings of binary ones, NRZI coding does nothing to
prevent synchronization loss within long strings of zeros.
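The NRZI rule can be sketched as a level that toggles on every 1 and holds on every 0. The function name `nrzi` and the initial low level are assumptions for this example.

```python
def nrzi(bits: str, start_level: str = "L") -> str:
    """NRZI: toggle the line level for each 1, hold it for each 0."""
    level = start_level
    levels = []
    for bit in bits:
        if bit == "1":
            level = "H" if level == "L" else "L"
        levels.append(level)
    return "".join(levels)


print(nrzi("1111"))  # HLHL, a transition in every bit cell
print(nrzi("0000"))  # LLLL, no transitions at all
```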
Manchester coding

• Manchester coding (also known as phase modulation)
prevents this problem by encoding a binary one with an
“up” transition and a binary zero with a “down” transition,
so every bit cell contains a transition.
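Manchester coding can be sketched by splitting each bit cell into two halves. The function name `manchester` is an assumption for this example; the mapping follows the convention stated above, where a one is an “up” (low-to-high) transition.

```python
def manchester(bits: str) -> str:
    """Manchester code, two half-cells per bit: 1 -> low-to-high, 0 -> high-to-low."""
    return "".join("LH" if bit == "1" else "HL" for bit in bits)


print(manchester("0000"))  # HLHLHLHL
print(manchester("1111"))  # LHLHLHLH
```

Even long runs of identical bits produce a transition in every cell, at the cost of two signal levels per data bit.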

Error Detection and Correction

2.8 Error Detection and Correction

• It is physically impossible for any data recording or
transmission medium to be 100% perfect 100% of the
time over its entire expected useful life.
• As more bits are packed onto a square centimeter of
disk storage, and as communications transmission speeds
increase, the likelihood of error increases.
• Thus, error detection and correction are critical to
accurate data transmission, storage, and retrieval.

Types of Error

• Single-bit error
– Only one bit in the
data unit has
changed.
• Burst error
– Two or more bits
in the data unit
have changed.

Error detection/correction

• Error detection
– Checks whether any error has occurred
– Does not need the number of errors
– Does not need the positions of errors

• Error correction
– Needs to know the number of errors
– Needs to know the positions of errors
– More difficult

Error Detection

• An error-detecting code includes
only enough redundancy to allow
the receiver to deduce that an error
occurred, but not which error, and
to request a retransmission.
• Error detection uses the concept of
redundancy, which means adding
extra bits for detecting errors at the
destination.
Redundancy

• For error detection, a
shorter group of bits may
be appended to the end
of each unit.
• This technique is called
Redundancy because the
extra bits are redundant
to the information.
• They are discarded as
soon as the accuracy of
the transmission has
been determined.

Error Detection Techniques

• Some popular techniques for error detection are:

– Parity check
– Checksum
– Cyclic redundancy check
– Cryptographic hash function

Parity check

• A check bit, or parity bit, is added to the data.
• Two methods
– Even parity checking
– Odd parity checking
• Even parity checking
– 1 is added to the block if the data
contains an odd number of 1’s,
– 0 is added if the data contains an even
number of 1’s.
– Adding the parity bit makes the total
number of 1’s in the data even, which is
why it is called even parity checking.
• Odd parity checking
– 0 is added to the block if the data
contains an odd number of 1’s,
– 1 is added if the data contains an even
number of 1’s.
– Adding the parity bit makes the total
number of 1’s in the data odd, which is
why it is called odd parity checking.
• Can detect only an odd number of bit errors
• Only useful for detecting errors, not for correcting them
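The parity rules above can be sketched as two small functions, one that computes the parity bit and one that checks a received codeword. The names `parity_bit` and `parity_ok`, and the choice to append the parity bit at the end, are assumptions for this example.

```python
def parity_bit(data: str, even: bool = True) -> str:
    """Return the bit that makes the total number of 1s even (or odd)."""
    bit = data.count("1") % 2  # 1 if the data has an odd number of 1s
    if not even:
        bit ^= 1               # odd parity uses the opposite bit
    return str(bit)


def parity_ok(codeword: str, even: bool = True) -> bool:
    """Check a received codeword (data plus parity bit)."""
    ones_are_even = codeword.count("1") % 2 == 0
    return ones_are_even if even else not ones_are_even


data = "1001111"                      # 7-bit ASCII code for 'O', five 1s
codeword = data + parity_bit(data)    # "10011111"
print(parity_ok(codeword))            # True
print(parity_ok("00011111"))          # False: one flipped bit is detected
print(parity_ok("01011111"))          # True: two flipped bits go unnoticed
```

The last line illustrates why parity detects only an odd number of bit errors.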
Checksum
• A small data block derived
from transmitted/stored digital
data for the purpose of
detecting errors that may have
been introduced during its
transmission or storage.
• The procedure which
generates this checksum is
called a checksum function
or checksum algorithm.
• E.g. a checksum of a message
can be a modular arithmetic
sum of message code words of
a fixed word length
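A modular-sum checksum of the kind described above can be sketched as follows. The function name `checksum8` and the choice of 8-bit code words with a modulus of 256 are assumptions for this example.

```python
def checksum8(data: bytes) -> int:
    """Sum of all bytes modulo 256 (8-bit code words assumed)."""
    return sum(data) % 256


message = b"OK"
sent = checksum8(message)              # 79 + 75 = 154
received = b"OJ"                       # last byte corrupted in transit
print(sent)                            # 154
print(checksum8(received) == sent)     # False, so the error is detected
```

The receiver recomputes the checksum over the data it received and compares it with the transmitted value. A mismatch signals an error, although some error combinations (for example, two bytes swapped) leave a simple sum unchanged.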

Homework

• Find out what the following are:
– Cyclic redundancy check
– Cryptographic hash function

Summary

• Understand the fundamentals of numerical data
representation in digital computers.
• Gain familiarity with the most popular character
codes.
• Become aware of the differences between how
data is stored in computer memory and how it is
transmitted over telecommunication lines.
• Understand the concepts of error detecting and
correcting codes.

Thank You
