Data Representation

The document discusses user-defined data types in programming, including composite types like records and sets, and non-composite types like enumerated and pointer data types. It also covers file organization and access methods, including serial, sequential, and direct access files, along with considerations for floating-point representation of real numbers and the challenges associated with it. The importance of normalization and precision in floating-point representation is emphasized, highlighting potential issues such as rounding errors and overflow/underflow conditions.

Uploaded by

waseem sabri


MUHAMMAD WASEEM SABRI

Data representation

User-defined data types

• When object-oriented programming is not being used, a programmer may choose not to use any user-defined data types. However, for a large program, their use makes the program less error-prone and easier to understand. User-defined types are also less restrictive, because the programmer can define a type to fit the problem. Built-in data types are the same for every program, so there cannot be a built-in record type: each problem needs its own definition of a record.

i. Composite data types:

Composite user-defined data types have a definition with a reference to at least one other type.

• ==Record Data type:== a data type that contains a fixed number of components, which can be of different types. It allows the programmer to collect together values with different data types when these form a coherent whole. It can be used to implement a data structure in which one or more of the defined variables are pointer variables.

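    TYPE <main identifier>
        DECLARE <subidentifier1> : <built-in data type>
        DECLARE <subidentifier2> : <built-in data type>
    ENDTYPE

A variable of the record type is then declared, and its fields are accessed using dot notation:

    DECLARE <variable identifier> : <main identifier>
    <variable identifier>.<subidentifier(x)> ← <value>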
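As a rough illustration, the record pattern above can be sketched in Python with a dataclass. The type name `Student` and its fields are hypothetical, chosen only for this example.

```python
from dataclasses import dataclass


@dataclass
class Student:          # plays the role of <main identifier>
    name: str           # a field holding a string
    age: int            # a field holding an integer


# Declare a variable of the record type, then assign to one field.
pupil = Student(name="Amina", age=17)
pupil.age = 18          # dot notation, as in the pseudocode
```

The two fields have different types but travel together as one value, which is the point of a record.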

• ==Set Data type:== allows a program to create sets and to apply the mathematical operations defined in set theory. Operations include:

• Union

• Difference

• Intersection

• Include an element in the set

• Exclude an element from the set

• Check whether an element is in a set
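The operations listed above map directly onto Python's built-in `set` type. The sample values here are made up for illustration.

```python
vowels = {"a", "e", "i", "o", "u"}
word_letters = set("queue")              # the distinct letters q, u, e

union = vowels | word_letters            # everything in either set
difference = word_letters - vowels       # in word_letters but not in vowels
intersection = vowels & word_letters     # in both sets

word_letters.add("x")                    # include an element
word_letters.discard("q")                # exclude an element
has_x = "x" in word_letters              # membership check
```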

• ==Objects and Classes:== in object-oriented programming, a program defines the classes to be used; these are all user-defined data types. Then, for each class, the objects must be defined.

ii. Non-Composite data types:

Non-composite user-defined data types don’t involve a reference to another type. When a programmer uses a simple built-in type, the only requirement is for an identifier to be declared with that type. A user-defined type, by contrast, has to be explicitly defined before an identifier of that type can be created, unlike built-in data types, which include string, integer, real…

• ==Enumerated Data type:== a list of possible data values. The values have an implied order, so comparisons can be made: value2 is greater than value1. The values are identifiers, not strings, so they are not written in quotes. The list is also countable, so the set of values is finite.

    TYPE
        <datatype> = (<value1>, <value2>, <value3>…)
    ENDTYPE
    DECLARE <identifier> : <datatype>
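A minimal Python sketch of an enumerated type with an implied order, using `IntEnum` from the standard library. The `Season` type and its members are hypothetical examples.

```python
from enum import IntEnum


class Season(IntEnum):
    # The numbers fix the implied order of the values.
    SPRING = 1
    SUMMER = 2
    AUTUMN = 3
    WINTER = 4


current = Season.SUMMER
later = Season.AUTUMN > current   # comparison uses the implied order
count = len(Season)               # the list of values is finite
```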

• ==Pointer Data type:== used to reference a memory location. It may be used to construct dynamically varying data structures. The pointer definition has to relate to the type of the variable being pointed to (the pointer doesn’t hold a value but a reference/address to data).

    TYPE
        <datatype> = ^<type name>
    ENDTYPE
    DECLARE <identifier> : <datatype>
    <assignment value> ← <identifier>^

A special use of a pointer variable is to access the value stored at the address pointed to. The pointer variable is then said to be dereferenced.
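Python has no pointer type of its own, so this sketch borrows the standard `ctypes` module to show the idea of holding an address and dereferencing it. It is an illustration only, not how everyday Python code is written.

```python
import ctypes

number = ctypes.c_int(42)        # a typed value stored in memory
ptr = ctypes.pointer(number)     # ptr holds the address of number

value = ptr.contents.value       # dereference to read the value
ptr[0] = 7                       # dereference to write through the pointer
```

After the last line, `number.value` is 7: the pointer and the variable refer to the same memory location.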

File organization and access

The contents of a file of any type are stored using a defined binary code that allows the file to be used in the way intended. However, for storing data to be used by a computer program, there are only two defined file types: a text file or a binary file.

• A text file contains data stored according to a defined character code such as ASCII or Unicode. A text file can be created using a text editor.

• A binary file is a file designed for storing data to be used by a computer program. It stores data in its internal representation (for example, an integer might be stored in 2 bytes in 2's complement representation so that negative numbers can be represented), and the file is created using a specific program. Its organization is based on records (a record is a collection of fields containing data values): file → records → fields → values
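To make "internal representation" concrete, this sketch packs one record into raw bytes with Python's `struct` module. The record layout (a 2-byte signed integer followed by a 4-byte float, little-endian) is an assumption made up for the example.

```python
import struct

# Hypothetical record layout: "<" little-endian, "h" 2-byte signed int, "f" 4-byte float.
RECORD_FORMAT = "<hf"

packed = struct.pack(RECORD_FORMAT, -5, 1.5)
id_bytes = packed[:2]            # the integer field in 2's complement

record_id, reading = struct.unpack(RECORD_FORMAT, packed)
```

Here `id_bytes` is `b'\xfb\xff'`, the 16-bit 2's complement pattern for -5 stored low byte first. Writing `packed` to a file opened in `"wb"` mode would produce a binary file of such records.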

Methods of file organization

• ==Serial files:== contain records that have no defined order. A text file may be a serial file, in which case it consists of repeating lines, each terminated by end-of-line character(s); there is no end-of-record character. A record in a serial file must have a defined format so that data can be input and output correctly. To access a specific record, the program has to read through every record until it is found.

File access: records are read successively, one by one, until the required data is found, so access is very slow.

Uses:

• Batch processing

• Backing up data on magnetic tape

• Banks recording transactions on customer accounts each time a transaction takes place
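Serial access can be sketched as a plain line-by-line search. The comma-separated record format and the transaction IDs below are assumptions for illustration, and `io.StringIO` stands in for a real text file.

```python
import io


def find_record(serial_file, wanted_id):
    """Read records one by one until wanted_id is found.

    Returns the record's fields and how many records were read.
    """
    records_read = 0
    for line in serial_file:
        records_read += 1
        fields = line.rstrip("\n").split(",")
        if fields[0] == wanted_id:
            return fields, records_read
    return None, records_read


# Records appear in the order they arrived, not in key order.
log = io.StringIO("T3,deposit,50\nT1,withdrawal,20\nT2,deposit,75\n")
match, reads = find_record(log, "T2")
```

Finding `T2` takes three reads here because it happens to be the last record; in general the cost grows with the length of the file.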
• ==Sequential files:== contain records that are ordered. They are suited to long-term storage of data and so can be considered an alternative to a database. To be ordered, a sequential file needs a key field whose values are unique and sequential, which makes records easier to find. Compared with file-based storage, a database offers better data integrity, privacy and less data redundancy, and a change made in one place updates any other data affected. A primary key in a database management system (DBMS) needs to be unique but not ordered, whereas the key field of a sequential file needs to be both unique and ordered. A particular record is found by sequentially reading the value of the key field until the required value is found.

File access:

Successively read the value in the key field until the required key is found.

To edit/delete data:

Create a new version of the file. Data is copied from the old file to the new file until the record

is reached which needs editing or deleting. For deleting, reading and copying of the old file

continue from the next record. If a record has been edited, the new version is written to the

new file and the remaining records are copied to the new file.
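The copy-to-a-new-file procedure can be sketched as follows. The function name `rewrite`, the `key,data` line format and the sample records are hypothetical, and in-memory `io.StringIO` objects stand in for the old and new files.

```python
import io


def rewrite(old_file, new_file, key, replacement=None):
    """Copy old_file to new_file record by record.

    The record whose key field equals `key` is dropped (delete) or,
    if `replacement` is given, written in its new version (edit).
    """
    for line in old_file:
        if line.split(",", 1)[0] == key:
            if replacement is not None:
                new_file.write(replacement + "\n")
            continue                      # the old version is not copied
        new_file.write(line)


old = io.StringIO("101,Ali\n102,Bea\n103,Cy\n")

edited = io.StringIO()
rewrite(old, edited, "102", replacement="102,Beatrice")

old.seek(0)                               # reread the old file from the start
deleted = io.StringIO()
rewrite(old, deleted, "102")
```

Because every other record is copied in its original order, the new file stays sorted by key field.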

• ==Direct access/random access files:== access is not defined by a sequential reading of the file (it is random). This is well suited to large files, which would take a long time to read sequentially. Data is stored in identifiable records; a record is found by initial direct access to a nearby position, followed if necessary by a limited serial search. The position chosen for a record must be calculated from data in the record, so that the same calculation can be carried out later when searching for it. One method is a hashing algorithm, which takes the key field as input and outputs the position of the record relative to the start of the file. To access a record, its key is hashed to give that location. The algorithm also takes into account the potential maximum length of the file, that is, the number of records the file will store.

• eg: If the key field is numeric, divide it by a suitably large number and use the remainder as the position. The positions will not always be unique: if a calculated hash position duplicates one already produced by a different key, the next position in the file is used. This is why a search involves direct access, possibly followed by a limited serial search, and why the method is considered partly direct and partly serial.

File access:

The value in the key field is submitted to the hashing algorithm, which returns the same position in the file as it did when the record was stored. The program goes to that hashed position and, because of possible collisions, may then carry out a short linear search. This is the fastest method of access.
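A minimal sketch of the remainder-based hashing and next-position collision rule described above. A Python list stands in for the record positions in the file; `FILE_SIZE`, the function names and the wrap-around at the end of the file are assumptions for illustration.

```python
FILE_SIZE = 7                       # assumed maximum number of records
slots = [None] * FILE_SIZE          # stands in for positions in the file


def hash_position(key):
    """Remainder after division gives a position from the start of the file."""
    return key % FILE_SIZE


def store(key, data):
    pos = hash_position(key)
    for _ in range(FILE_SIZE):
        if slots[pos] is None:
            slots[pos] = (key, data)
            return pos
        pos = (pos + 1) % FILE_SIZE  # collision: try the next position
    raise RuntimeError("file is full")


def fetch(key):
    pos = hash_position(key)         # direct access to the hashed position
    for _ in range(FILE_SIZE):       # then a limited serial search
        if slots[pos] is None:
            return None
        if slots[pos][0] == key:
            return slots[pos][1]
        pos = (pos + 1) % FILE_SIZE
    return None


first = store(15, "Ali")    # 15 mod 7 = 1
second = store(22, "Bea")   # 22 mod 7 = 1 as well, so position 2 is used
```

Deleted-record flags are left out of this sketch; with them, `fetch` would have to skip flagged positions rather than stop at them.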

To edit/delete data:

Only create a new file if the current file is full. A deleted record can have a flag set so that in a

subsequent reading process the record is skipped over. This allows it to be overwritten.

Uses:

Most suited for when a program needs a file in which individual data items might be read,

updated or deleted.
Factors that determine the file organization to use:

• How often do transactions take place, and how often does data need to be added?

• How often does the data need to be accessed, edited, or deleted?

Real numbers and normalized floating-point representation

• Real number: a number that contains a fractional part.

• Floating-point representation: the approximate representation of a real number using binary digits.

• Format: Number = ±Mantissa × Base^Exponent

• Mantissa: the non-zero part of the number.

• Exponent: the power to which the base is raised in order to represent the number.

• Base: the number of values the number system allows a digit to take; 2 in the case of floating-point representation.

The floating-point representation stores a value for the mantissa and a value for the exponent. A defined number of bits is used for what is called the significand or mantissa, ±M. The remaining bits are for the exponent, E. The radix, R, is not stored in the representation because it has an implied value of 2 (the digits being 0 and 1). For example, a real number could be stored using 8 bits: four bits for the mantissa and four bits for the exponent, each using two's complement representation. The exponent is stored as a signed integer. The mantissa has to be stored as a fixed-point real value. The binary point can be placed at the beginning, immediately after the sign bit, or before the last bit. The former produces smaller spacing between the values that can be represented and is preferred. Floating-point representation also gives a greater range than fixed-point representation.
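The 8-bit layout described above (4-bit mantissa, 4-bit exponent, both in two's complement, binary point immediately after the mantissa's sign bit) can be decoded with a short sketch. The function names are hypothetical.

```python
def twos_complement(bits):
    """Interpret a bit string as a two's complement integer."""
    value = int(bits, 2)
    if bits[0] == "1":               # sign bit set: subtract 2^n
        value -= 1 << len(bits)
    return value


def decode(mantissa_bits, exponent_bits):
    """Return the real value of a mantissa/exponent pair of bit strings."""
    # Dividing by 2^(n-1) places the binary point just after the sign bit.
    mantissa = twos_complement(mantissa_bits) / (1 << (len(mantissa_bits) - 1))
    exponent = twos_complement(exponent_bits)
    return mantissa * 2 ** exponent


two = decode("0100", "0010")         # 0.5 * 2^2
minus_four = decode("1000", "0010")  # -1.0 * 2^2
largest = decode("0111", "0111")     # 0.875 * 2^7
```

The last line gives 112.0, the largest value this particular 8-bit format can hold.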

Converting a denary value expressed as a real number into a floating-point binary representation: most fractional parts do not convert to a precise representation, because binary fractional places represent a half, a quarter, an eighth… Only fractions that are sums of these values (such as .5, .25 or .75) convert exactly. To convert, repeatedly multiply the fractional part by two and record the whole-number part each time.

For example, for 8.63: 0.63 * 2 = 1.26, giving .1; then 0.26 * 2 = 0.52, giving .10; then 0.52 * 2 = 1.04, giving .101; and so on until the required number of bits is reached.
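The multiply-by-two method can be sketched as a small function. The name `fraction_to_binary` is hypothetical, and the function truncates after the requested number of bits rather than rounding.

```python
def fraction_to_binary(fraction, bits):
    """Convert a fractional part (0 <= fraction < 1) to `bits` binary digits."""
    digits = ""
    for _ in range(bits):
        fraction *= 2
        whole = int(fraction)        # the whole-number part is the next bit
        digits += str(whole)
        fraction -= whole            # carry on with what is left
    return digits


approx = fraction_to_binary(0.63, 3)   # the three steps worked above
exact = fraction_to_binary(0.75, 4)    # a half plus a quarter
```

`approx` is `"101"`, which is only 0.625, while `exact` is `"1100"` and represents 0.75 precisely.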

The method for converting a positive value is:

1. Convert the whole number part

2. Add the sign bit 0

3. Convert the fractional part. Combining the two parts gives a value whose exponent is zero. Then move the binary point to the beginning (immediately after the sign bit), increasing the exponent by one for each place moved. Depending on the number of bits available, add extra 0s at the end of the mantissa and at the beginning of the exponent.

4. Adjust the position of the binary point and change the exponent accordingly to achieve a normalized form.

Therefore: 8.75 -> 1000 -> 01000 -> .11 -> 01000.11 -> 0.100011 (mantissa) -> 0100011000 0100 (10 bits for M, and 4 for E).
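The steps above can be sketched for positive values as follows. The function name and parameters are hypothetical; the sketch truncates extra mantissa bits and does not check that the exponent fits in its field.

```python
def encode_positive(value, mantissa_bits, exponent_bits):
    """Return (mantissa, exponent) bit strings for a positive value."""
    if value <= 0:
        raise ValueError("this sketch handles positive values only")
    exponent = 0
    while value >= 1:                # move the binary point to the left
        value /= 2
        exponent += 1
    while value < 0.5:               # move the binary point to the right
        value *= 2
        exponent -= 1
    mantissa = "0"                   # sign bit for a positive number
    for _ in range(mantissa_bits - 1):
        value *= 2
        bit = int(value)
        mantissa += str(bit)
        value -= bit
    # Masking gives the two's complement pattern for negative exponents too.
    mask = (1 << exponent_bits) - 1
    return mantissa, format(exponent & mask, f"0{exponent_bits}b")


mantissa, exponent = encode_positive(8.75, 10, 4)
```

For 8.75 this gives mantissa `0100011000` and exponent `0100`, matching the worked example.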

• For negatives, use 2's complement.

• When implementing the floating-point representation, a decision has to be made about the total number of bits to be used and how many of them go to the mantissa and to the exponent.

• Usually, the total number of bits is available as an option when the program is written. The split between the two parts, however, will have been determined by the floating-point processor.

• If there were a choice, note that increasing the number of bits for the mantissa gives better precision but leaves fewer bits for the exponent, reducing the range of possible values, and vice versa. For maximum precision, it is necessary to normalize a floating-point number.

• Optimum precision is only achieved when full use is made of the bits in the mantissa, that is, by using the largest possible magnitude for the value represented by the mantissa.

• Also, the two most significant bits must be different: 01 for positives and 10 for negatives.

• Both of the following equal 2, but the second is the more precise form, because the significant bits occupy the higher positions in the mantissa:

    0.125 * 2^4 = 2     0 001 0100

    0.5 * 2^2 = 2       0 100 0010

• For negatives (both equal -4):

    -0.25 * 2^4 = -4    1 110 0100

    -1.0 * 2^2 = -4     1 000 0010

When the number is represented with the highest magnitude for the mantissa, the two most significant bits are different, and the number is said to be in a normalized representation. How a number is normalized: for a positive number, the bits in the mantissa are shifted left until the most significant bits are 0 followed by 1. For each shift left, the value of the exponent is reduced by 1. The same process of shifting is used for a negative number until the most significant bits are 1 followed by 0. In this case, no attention is paid to the fact that bits are falling off the most significant end of the mantissa. Thus normalization is shifting bits to the left until the 2 most significant bits are different.
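The shifting rule can be sketched directly on bit strings. The function name is hypothetical; the mantissa is a two's complement bit string and the exponent is kept as an ordinary integer for simplicity.

```python
def normalize(mantissa, exponent):
    """Shift the mantissa left until its two most significant bits differ."""
    if "1" not in mantissa:
        return mantissa, exponent          # zero cannot be normalized
    while mantissa[0] == mantissa[1]:
        mantissa = mantissa[1:] + "0"      # shift left; the top bit falls off
        exponent -= 1                      # each shift reduces the exponent by 1
    return mantissa, exponent


positive = normalize("0001", 4)   # 0.125 * 2^4
negative = normalize("1110", 4)   # -0.25 * 2^4
```

`positive` becomes `("0100", 2)` and `negative` becomes `("1000", 2)`, so the values 2 and -4 are unchanged but now use the largest possible mantissa magnitude.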

Problems with using floating point numbers:

1. The conversion of real denary values to binary mostly needs a degree of approximation, and the result is further restricted by the number of bits used to store the mantissa. These rounding errors can become significant after multiple calculations. The only way of preventing a serious problem is to increase the precision by using more bits for the mantissa. Programming languages therefore offer options to work in double/quadruple precision.

2. The range is limited. With four bits each for the mantissa and the exponent, for example, the highest value that can be represented is 0.875 * 2^7 = 112. A result larger than the highest value produces an overflow condition. If a result is smaller than the smallest value that can be stored, there is an underflow error condition. Such a very small number can be turned into zero, but this carries risks, for example if the value is later used in a multiplication or division.
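Both problems can be observed with Python's built-in `float`, which is a double-precision type on common platforms. This is only a demonstration; the specific numbers are chosen for illustration.

```python
import math

# Rounding error: 0.1 has no exact binary representation,
# so the error accumulates over repeated additions.
total = 0.0
for _ in range(10):
    total += 0.1

is_exactly_one = total == 1.0             # False
is_close_to_one = math.isclose(total, 1.0)

# Overflow: the result is too large for the exponent and becomes infinity.
overflowed = 1e200 * 1e200

# Underflow: the result is too small to store and is turned into zero.
underflowed = 1e-200 * 1e-200
```

Dividing by `underflowed` would now raise `ZeroDivisionError`, which is the kind of risk described in point 2.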

eg: One use of floating-point numbers is in extended mathematical procedures involving repeated calculations, such as weather forecasting, which uses a mathematical model of the atmosphere.
