01 Data Representation

The document covers data representation in programming, focusing on user-defined data types, file organization, and floating-point number representation. It explains the necessity of user-defined types, composite and non-composite data types, various file organization methods, and the implications of floating-point approximations and rounding errors. Additionally, it discusses normalization, precision versus range, and potential issues like overflow and underflow in floating-point arithmetic.

Uploaded by

nmirza


Candidates should be able to:

Show understanding of why user-defined types are necessary

Define and use non-composite types

Define and use composite data types

Choose and design an appropriate user-defined data type for a given problem

Show understanding of the methods of file organization and select an appropriate method of file organization and file access for a given problem

Show understanding of methods of file access

Show understanding of hashing algorithms

Describe the format of binary floating-point real numbers

Convert binary floating-point real numbers into denary and vice versa

Normalize floating-point numbers

Show understanding of the consequences of a binary representation only being an approximation to the real number it represents (in certain cases)

Show understanding that binary representations can give rise to rounding errors

Definitions

Built-in data type: A data type for which the programming language defines the range of possible values that can be assigned to a variable and the operations that can be applied to it

User-defined data type: A data type for which the programmer has included the definition in the program

Composite data type: A data type that is derived from other data types

Non-composite data type: A data type defined without referencing another data type

Enumerated data type: A data type which provides an ordered list of values that a variable of this type can take on

Pointer data type: Used to reference a memory location

Set: A data type that allows storing a finite number of different values that have no order

Class: A data type that includes variables of given data types and methods

Serial file organization: Records are stored in the order in which they were added

Sequential file organization: Records are physically stored in order of their key field value

Random file organization: Records are stored in any available position in the file

Sequential access: Each record in the file is read, one by one, until the desired record is found

Random (direct) access: Jumps to a specific record in the file without accessing other records

Hashing algorithm: A mathematical formula used to perform a calculation on the key field of the record

Normalized two's complement binary number: The first and second bits of the mantissa must be different for a floating-point number to be normalized

Overflow: The number following a calculation is too big to be represented in the given format

Underflow: The number is too small to be represented in the given format

1.1 User-defined data types

1.1.1 Data Types
Built-in data types

The programming language defines the range of possible values that can be
assigned to and the operations that can be applied to a variable

User-defined data types

A data type for which the programmer has included the definition in the
program

It is used to create a new data type

Allows the programmer to extend the flexibility of the programming language

Non-Composite data types

A data type defined without referencing another data type

Example: Enumerated data type, Pointer

Enumerated data type

A data type which provides an ordered list of values that a variable of this type can take on

Example:
TYPE SchoolDay = (Monday, Tuesday, Wednesday, Thursday, Friday)

Note: The values belong to the enumerated type, not to STRING, so they are written without quotation marks
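As an illustration only, the same idea can be sketched in Python with the standard `enum` module. The member names and the numeric values 1 to 5 are assumptions made for this sketch; the pseudocode above does not assign numbers to the days.

```python
from enum import Enum

class SchoolDay(Enum):
    # The numeric values are an assumption; they only record the order.
    MONDAY = 1
    TUESDAY = 2
    WEDNESDAY = 3
    THURSDAY = 4
    FRIDAY = 5

today = SchoolDay.WEDNESDAY

# A member is a value of the SchoolDay type, not a string.
is_string = isinstance(today, str)

# Iterating over the type yields the members in definition order.
ordered_names = [day.name for day in SchoolDay]

# Because the list is ordered, two members can be compared by position.
comes_before = SchoolDay.MONDAY.value < SchoolDay.FRIDAY.value
```

Writing `SchoolDay.WEDNESDAY` rather than `"Wednesday"` mirrors the note above: the value is an identifier of the new type, so no quotation marks are involved.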

Pointer data type

Used to reference a memory location

Example:

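DECLARE IntPointer : ^INTEGER   // a pointer to an INTEGER
DECLARE MyVar : INTEGER
DECLARE MyVar2 : INTEGER
MyVar ← 57
IntPointer ← @MyVar             // IntPointer now holds the address of MyVar
MyVar2 ← IntPointer^            // MyVar2 is assigned 57
IntPointer^ ← 100               // MyVar now holds 100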

The @ symbol before a variable identifier gives the address of the variable

^ is put before the data type to define a pointer data type

^ placed after a pointer variable is an operator that dereferences the pointer, giving access to the value stored at the address it holds
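Python has no pointer syntax of its own, so the sketch below uses the standard `ctypes` module to imitate the pseudocode example. It is an analogy under that assumption, not how Python programs normally share data; the variable names are simply Python-style versions of the pseudocode identifiers.

```python
import ctypes

my_var = ctypes.c_int(57)                # MyVar ← 57
int_pointer = ctypes.pointer(my_var)     # IntPointer ← @MyVar

# Dereferencing for reading: corresponds to MyVar2 ← IntPointer^
my_var2 = int_pointer.contents.value

# Dereferencing for writing: corresponds to IntPointer^ ← 100
int_pointer.contents.value = 100

# my_var was changed through the pointer; my_var2 kept the copy it was given.
final_my_var = my_var.value
```

After the last assignment `final_my_var` is 100 while `my_var2` is still 57, which shows the difference between copying a value and writing through a pointer.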

Composite data types

A data type that is derived from other data types

They are used to extend the functionality of the programming language

Example: Records, Sets, Classes

Sets

A data type that allows storing a finite number of different values that have no order

Example:
TYPE Sletter = SET OF CHAR
DEFINE vowel = ('a', 'e', 'i', 'o', 'u') : Sletter
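For comparison, a rough Python counterpart of the set above uses the built-in `set` type. The variable name `vowel` is carried over from the pseudocode; the checks are illustrative.

```python
# Hypothetical Python counterpart of the Sletter example.
vowel = {"a", "e", "i", "o", "u"}

# Membership test.
is_vowel = "e" in vowel

# Order does not matter: the same values in another order form the same set.
same_set = vowel == {"u", "o", "i", "e", "a"}

# Values are distinct: adding an existing value changes nothing.
vowel.add("a")
size = len(vowel)
```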

Classes

Includes variables of given data types and methods

Used in Object Oriented Programming, discussed in later chapters

Skill Check 1

1. Describe the purpose of a user-defined data type. [2]

2. Define, using pseudocode, the following enumerated data types:

a. SchoolDay to hold data about the days students are usually in school. [1]

b. WeekEnd to hold data about the days that are not school days. [1]

3. Define, using pseudocode, the composite data type ClubMeet. This will hold data about club members that includes:

First name and last name

The two days they attend:

One on a school day

One not on a school day

Use the enumerated types you created in question 2. [4]

Solution

1. To create a new data type that extends the flexibility of the programming language

2. a. TYPE SchoolDay = (Monday, Tuesday, Wednesday, Thursday, Friday)

b. TYPE WeekEnd = (Saturday, Sunday)

3. TYPE ClubMeet
    DECLARE FirstName : STRING
    DECLARE LastName : STRING
    DECLARE Schoolday : SchoolDay
    DECLARE Weekend : WeekEnd
ENDTYPE
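The record type in solution 3 could be sketched in Python as a dataclass whose fields use two enumerated types. This is only one possible mapping; the numeric enum values and the sample member "Asha Khan" are invented for illustration.

```python
from dataclasses import dataclass
from enum import Enum

class SchoolDay(Enum):
    MONDAY = 1
    TUESDAY = 2
    WEDNESDAY = 3
    THURSDAY = 4
    FRIDAY = 5

class WeekEnd(Enum):
    SATURDAY = 1
    SUNDAY = 2

@dataclass
class ClubMeet:
    # Each field has a declared type, as in the TYPE ... ENDTYPE block.
    first_name: str
    last_name: str
    school_day: SchoolDay
    weekend: WeekEnd

# Hypothetical sample record.
member = ClubMeet("Asha", "Khan", SchoolDay.TUESDAY, WeekEnd.SATURDAY)
```

The composite type is built from other types (two strings and two enumerated types), which is what distinguishes it from the non-composite types it contains.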

1.2 File organization and access
1.2.1 File Organization
Serial File Organization
Records stored in the order they were added in

New records appended to end of file

Accessed only sequentially

Reorganization not needed when a new record is added

Sequential File Organization

Records are physically stored in order of their key field value

Can be accessed sequentially, or by direct access with the help of an index file

Suits applications with a high hit rate, where most records are processed in one run (for example, producing a statement for every customer)

Justification for the statement example:

Suitable for batch processing

All customers need a statement

There is a unique key field

The file is organized by that unique key field
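The difference between serial and sequential organization can be sketched with in-memory lists standing in for files. The function names and the sample records are assumptions made for this illustration; a real file would be rewritten or merged rather than edited in place.

```python
import bisect

def add_serial(records, record):
    """Serial organization: a new record is appended to the end."""
    records.append(record)

def add_sequential(records, record):
    """Sequential organization: a new record is inserted in key order."""
    keys = [existing[0] for existing in records]
    position = bisect.bisect_left(keys, record[0])
    records.insert(position, record)

# Hypothetical records as (key field, data), in order of arrival.
arrivals = [(30, "Cara"), (10, "Ali"), (20, "Ben")]

serial_file = []
sequential_file = []
for record in arrivals:
    add_serial(serial_file, record)
    add_sequential(sequential_file, record)

serial_keys = [key for key, _ in serial_file]
sequential_keys = [key for key, _ in sequential_file]
```

The serial list keeps arrival order (30, 10, 20), while the sequential list ends up in key order (10, 20, 30) at the cost of finding the right position on every insertion.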

Random File Organization

Stores records of data in a file in any available position

The location of any record in the file is found by using a hashing algorithm on
the key field of a record

Low waiting time

Suits applications with a low hit rate, where only one record is needed at a time (for example, only one record matches a given account number or username)

Justification:

Real-time access

No need to search through other records

1.2.2 File Access

Sequential Access
Each record in the file is read, one by one, until the desired record is found

Efficient when every record in the file needs to be processed

Records are read in turn until the desired record is found or the whole file has been read without a match. In a sequential file, the search can also stop as soon as the key field of the current record is greater than the key being searched for, because the record cannot appear later in the file

Direct Access
Jumps to a specific record in the file without accessing other records

Required when only an individual record from a file needs to be processed

For a sequential file, an index of all the key fields is kept and used to look up the address of the file location where a given record is stored

For random access files, a hashing algorithm is used on the key field to
calculate the address of the file location where a given record is stored
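The two access methods can be contrasted in a short sketch. A list of records in key order stands in for a sequential file and a dictionary stands in for its index; these structures, the function names, and the sample data are assumptions for illustration.

```python
def sequential_search(ordered_records, wanted_key):
    """Read records one by one; return (data, number of records read)."""
    reads = 0
    for key, data in ordered_records:
        reads += 1
        if key == wanted_key:
            return data, reads
        if key > wanted_key:
            # Keys are in order, so the record cannot appear later.
            break
    return None, reads

# Hypothetical sequential file: records held in key field order.
records = [(10, "Ali"), (20, "Ben"), (30, "Cara"), (40, "Dev")]

# Index for direct access: key field -> position of the record.
index = {key: position for position, (key, _) in enumerate(records)}

def direct_lookup(wanted_key):
    """Jump straight to the record using the index."""
    position = index.get(wanted_key)
    return None if position is None else records[position][1]

found = sequential_search(records, 30)
missing = sequential_search(records, 25)
direct = direct_lookup(40)
```

Searching for key 25 stops after three reads, once key 30 is seen, rather than reading the whole file. The direct lookup reads no other records at all.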

1.2.3 Hashing Algorithms

A mathematical formula used to perform a calculation on the key field of the
record

The result of the calculation gives the address where the record should be
found

To write a record:

Key field is hashed to produce a location address

If location is free, add the data there

Otherwise, use an overflow method to find a free location

If no free location, data cannot be stored

To read a record:

Key field is hashed to produce a location address

The record at the address is checked to see if it matches the desired record

If it does not match, then the following records need to be read until a
match is found (if open hash is used), or the overflow area needs to be
searched for a match (if closed hash is used)
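The write and read steps above can be sketched as follows. The notes do not fix a particular hashing formula or overflow method, so this sketch assumes a modulo hash and the strategy of trying the following locations in turn; the table size of 7 and the sample keys are also assumptions. Deleting records is not handled.

```python
TABLE_SIZE = 7                      # assumed number of locations in the file
table = [None] * TABLE_SIZE         # None marks a free location

def hash_key(key):
    """Assumed hashing algorithm: key field modulo the table size."""
    return key % TABLE_SIZE

def write_record(key, data):
    """Store a record; return its address, or None if no location is free."""
    start = hash_key(key)
    for step in range(TABLE_SIZE):
        address = (start + step) % TABLE_SIZE
        if table[address] is None:
            table[address] = (key, data)
            return address
    return None

def read_record(key):
    """Return the data for key, or None if the record is not stored."""
    start = hash_key(key)
    for step in range(TABLE_SIZE):
        address = (start + step) % TABLE_SIZE
        slot = table[address]
        if slot is None:
            return None             # reached a free location: no match exists
        if slot[0] == key:
            return slot[1]
    return None

first_address = write_record(15, "record A")    # 15 mod 7 = 1
second_address = write_record(22, "record B")   # 22 mod 7 = 1, so a collision
```

Both keys hash to address 1, so the second record is placed at the next free location, address 2. Reading key 22 repeats the same path: address 1 does not match, so the following location is checked.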

Skill Check 2

1. Compare sequential and serial methods of file organization. [4]

2. State the most suitable method of file access when a record is referenced by a unique address on a disk-type storage medium. [1]

3. State the most suitable method of file access when a bank stores its
data records in ascending order of account number. [1]

Solution

1. In both serial and sequential files, records are stored one after the other and need to be accessed one after the other. Serial files are stored in chronological order, whereas sequential files are stored in the order of the key field. In serial files, new records are added in the next available space. In sequential files, new records are inserted in the correct position.

2. Direct Access

3. Sequential Access

1.3 Floating point numbers, representation and manipulation
Allows for representation of fractional values in the binary number system

The number is stored in the form $M \times 2^{E}$

M is the mantissa and E is the exponent

1.3.1 Converting binary floating-point numbers into denary
For the mantissa (in two's complement), the first bit has a place value of −1, the second bit 1/2, and each following bit half of the previous one

Add up the place values where a 1 bit appears to get the value of M

For the exponent, treat the bits as a normal 8-bit two's complement integer, with the rightmost bit representing 1 and the leftmost bit representing −128

Add up the place values where a 1 bit appears to get the value of E

Use $M \times 2^{E}$ to get the denary value

For example:

0.1011010 00000100

Mantissa: $0.1011010 = \frac{1}{2} + \frac{1}{8} + \frac{1}{16} + \frac{1}{64} = \frac{45}{64}$

Exponent: $00000100 = 4$

Hence, the number is $\frac{45}{64} \times 2^{4} = 11.25$
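The conversion can be checked with a short sketch. It assumes both fields are given as two's complement bit strings with the binary point after the first mantissa bit; the function names are invented for this illustration, and `Fraction` is used so the result is exact.

```python
from fractions import Fraction

def twos_complement_value(bits):
    """Integer value of a two's complement bit string."""
    value = int(bits, 2)
    if bits[0] == "1":
        value -= 1 << len(bits)     # the leading bit carries a negative weight
    return value

def float_to_denary(mantissa_bits, exponent_bits):
    """Exact denary value of mantissa x 2^exponent."""
    # The binary point follows the first bit, so scale by 2^(width - 1).
    scale = 1 << (len(mantissa_bits) - 1)
    mantissa = Fraction(twos_complement_value(mantissa_bits), scale)
    exponent = twos_complement_value(exponent_bits)
    return mantissa * Fraction(2) ** exponent

worked_example = float_to_denary("01011010", "00000100")
negative_example = float_to_denary("10111000", "00000011")
```

For the worked example the mantissa is 45/64 and the exponent is 4, so `worked_example` equals 45/4, that is 11.25. The second call shows a negative mantissa: 1.0111000 is −9/16, which gives −4.5.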

1.3.2 Converting denary numbers into binary floating-point numbers
Convert the denary number to two's complement binary with a binary point, and fill any missing bits to the right with 0s

Move the binary point so that it sits after the first digit

The exponent is the number of places the binary point was moved to the left

For example:
4.5 → 0100.1000 → 0.1001000 = Mantissa
00000011 = Exponent (the binary point moved 3 places to the left)
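A minimal sketch of the reverse conversion is given below, assuming an 8-bit mantissa and 8-bit exponent in two's complement as in the example. The function names are invented, bits that do not fit in the mantissa are simply dropped, and the exponent is not checked against its range.

```python
import math
from fractions import Fraction

def to_twos_complement(value, width):
    """Two's complement bit string of an integer, width bits long."""
    return format(value & ((1 << width) - 1), f"0{width}b")

def denary_to_float(value, mantissa_width=8, exponent_width=8):
    """Return (mantissa bits, exponent bits) in normalized form."""
    mantissa = Fraction(value)
    if mantissa == 0:
        raise ValueError("zero has no normalized representation")
    exponent = 0
    # Move the binary point left while the mantissa is too large.
    while mantissa >= 1 or mantissa < -1:
        mantissa /= 2
        exponent += 1
    # Move it right while the first two bits would be the same.
    while -Fraction(1, 2) <= mantissa < Fraction(1, 2):
        mantissa *= 2
        exponent -= 1
    # Keep only the bits that fit after the binary point.
    scaled = math.floor(mantissa * (1 << (mantissa_width - 1)))
    return (to_twos_complement(scaled, mantissa_width),
            to_twos_complement(exponent, exponent_width))

positive = denary_to_float(4.5)
negative = denary_to_float(-4.5)
```

For 4.5 this gives the mantissa 01001000 (written 0.1001000 above) and the exponent 00000011, matching the worked example. For −4.5 the mantissa becomes 10111000.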

1.3.3 Potential rounding errors and approximations

Most numbers cannot be represented exactly in a fixed number of binary digits, so an approximate value is stored instead

This approximation becomes more accurate as more bits are allowed for the number

Repeated calculations that reuse previously rounded values cause the rounding errors to accumulate

The accumulated inaccuracy may become large enough to be noticeable in the result
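Both effects can be seen in a small sketch. The first part approximates 0.1 with a limited number of bits after the binary point, dropping the bits that do not fit; the helper name and the bit counts 7 and 15 are assumptions for illustration. The second part uses Python's own floating-point type, which has a different layout from the format in these notes but is subject to the same kind of error.

```python
import math
from fractions import Fraction

def truncate_to_bits(value, fraction_bits):
    """Closest value not above `value` using the given bits after the point."""
    scale = 1 << fraction_bits
    return Fraction(math.floor(Fraction(value) * scale), scale)

tenth = Fraction(1, 10)

# 0.1 has no exact binary form, so an error remains however many bits are used.
error_7_bits = tenth - truncate_to_bits(tenth, 7)
error_15_bits = tenth - truncate_to_bits(tenth, 15)

# Repeatedly adding an already rounded value lets the error build up.
total = 0.0
for _ in range(10):
    total += 0.1
drifted = total != 1.0
```

With 7 bits the error is 1/160; with 15 bits it falls to 1/40960, so more bits give a closer approximation but never an exact one. Adding 0.1 ten times gives a total that is not exactly 1.0.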

Normalization
First and second bits of mantissa must be different for a floating point number
to be normalized

Reasons:

To store maximum range of numbers with the smallest number of bits

To maximize the precision of the number for the given number of bits

To prevent multiple representations of the same number

To minimize the number of leading zeroes/ones

To normalize, shift the mantissa bits left until the mantissa starts with 0.1 (for a positive number) or 1.0 (for a negative number), and reduce the exponent by 1 for each shift

For example:

0.0011100 00000101
Shift bits left twice to get 0.1110000

Each left shift doubles the value of the mantissa, so the exponent is reduced by the number of left shifts (2 in this case) to keep the overall value unchanged

Reducing the exponent by 2 gives 00000011

Hence, the normalized form is: 0.1110000 00000011
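The shifting procedure can be sketched on bit strings. The function name is invented, both fields are assumed to be in two's complement, and the sketch does not check whether the reduced exponent still fits in its field.

```python
def normalize(mantissa_bits, exponent_bits):
    """Shift the mantissa left until its first two bits differ."""
    if "1" not in mantissa_bits:
        raise ValueError("zero cannot be normalized")
    width = len(exponent_bits)
    exponent = int(exponent_bits, 2)
    if exponent_bits[0] == "1":
        exponent -= 1 << width      # two's complement: negative exponent
    shifts = 0
    while mantissa_bits[0] == mantissa_bits[1]:
        # Drop the repeated leading bit and append a 0 on the right.
        mantissa_bits = mantissa_bits[1:] + "0"
        shifts += 1
    exponent -= shifts              # each left shift doubles the mantissa
    exponent_out = format(exponent & ((1 << width) - 1), f"0{width}b")
    return mantissa_bits, exponent_out

worked_example = normalize("00011100", "00000101")
negative_example = normalize("11100000", "00000000")
```

The worked example needs two shifts, giving the mantissa 01110000 and the exponent 00000011. The negative example also needs two shifts: 1.1100000 becomes 1.0000000 and the exponent becomes −2, stored as 11111110.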

1.3.4 Precision versus range

There is a trade-off between range and precision

More bits in mantissa means greater precision and vice versa

More bits in exponent means a greater range and vice versa

For a fixed number of bits, increasing one results in the decrease of other
and hence the trade-off

For a binary number with an 8-bit mantissa and an 8-bit exponent (both in two's complement), in normalized form:

The maximum positive number which can be stored is:

$0.1111111\ 01111111 = \frac{127}{128} \times 2^{127}$

The smallest positive number which can be stored is:

$0.1000000\ 10000000 = \frac{1}{2} \times 2^{-128}$

The smallest magnitude negative number which can be stored is:

$1.0111111\ 10000000 = -\frac{65}{128} \times 2^{-128}$

The largest magnitude negative number which can be stored is:

$1.0000000\ 01111111 = -1 \times 2^{127}$
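The four limits follow a pattern that can be computed for any split of bits. The sketch below assumes normalized two's complement fields; the function name and the second split (12-bit mantissa, 4-bit exponent) are chosen only to illustrate the trade-off.

```python
from fractions import Fraction

def format_limits(mantissa_width=8, exponent_width=8):
    """Return (mantissa, exponent) pairs for the extreme normalized values."""
    fraction_bits = mantissa_width - 1              # bits after the binary point
    unit = Fraction(1, 1 << fraction_bits)          # value of the last mantissa bit
    max_exponent = (1 << (exponent_width - 1)) - 1  # 0111...1
    min_exponent = -(1 << (exponent_width - 1))     # 1000...0
    return {
        "largest positive": (1 - unit, max_exponent),                    # 0.111...1
        "smallest positive": (Fraction(1, 2), min_exponent),             # 0.100...0
        "smallest magnitude negative": (-Fraction(1, 2) - unit, min_exponent),  # 1.011...1
        "largest magnitude negative": (Fraction(-1), max_exponent),      # 1.000...0
    }

limits_8_8 = format_limits(8, 8)
limits_12_4 = format_limits(12, 4)
```

With 8 and 8 bits the largest positive value is 127/128 × 2^127. Moving four bits from the exponent to the mantissa gives 2047/2048 × 2^7: the mantissa is finer (more precision) but the largest exponent drops from 127 to 7 (less range).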

1.3.5 Floating-point problems

Overflow: The result of a calculation is too large in magnitude to be represented in the given format

Underflow: The result of a calculation is too small in magnitude (too close to zero) to be represented in the given format

The number zero cannot be stored in normalized form, because the first two bits of its mantissa are the same
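Overflow and underflow can be observed with Python's built-in `float`. Note the assumption: this type uses a 64-bit format with a different layout from the mantissa and exponent format in these notes, so the limits differ, but the behaviour at the limits illustrates the same two problems.

```python
import math
import sys

# Overflow: the result is too large for the format.
largest = sys.float_info.max
overflowed = largest * 10           # stored as infinity rather than a number

# Underflow: the result is too close to zero for the format.
tiny = 5e-324                       # smallest positive value this type can hold
underflowed = tiny / 2              # stored as zero

became_infinite = math.isinf(overflowed)
became_zero = underflowed == 0.0
```

In both cases the stored result no longer reflects the true value of the calculation, which is why a format's range has to be chosen with the expected data in mind.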

Skill Check 3

1. Numbers are stored in a computer using floating-point representation with:

12 bits for the mantissa

4 bits for the exponent

two's complement form for both the mantissa and exponent

a) Write the normalized floating-point representation of the following unsigned binary number using this system. [2]

1011100.011001

b) State the consequence of storing the binary number in part (a) as a floating-point number in this system. Justify your answer. [2]

2. Explain the reason why binary numbers are stored in normalized form. [3]

Solution

1. a. Mantissa: 010111000110
Exponent: 0111

b. The accuracy of the number would be reduced because the two least significant bits of the original number do not fit in the 12-bit mantissa and are lost.

2. To store the maximum range of numbers in the minimum number of bits. Normalization minimizes the number of leading zeros or ones, which maximizes the number of significant bits and so enables very large or very small numbers to be stored with accuracy. It also avoids the possibility of a number having multiple representations.

Points To Note

Examples of non-composite user-defined data types include enumerated and pointer data types

Record, set and class are examples of composite user-defined data types

File organization can be serial, sequential or random; file access can be sequential or direct

Floating-point representation for a real number allows a wider range of values to be represented

A normalized floating-point representation achieves the best precision for the value stored

Stored floating-point values rarely give an exact representation of the denary equivalent
