Data Organisation and File Allocation


Data Organisation

• A binary digit (bit) is either 0 or 1.
• A character of data is represented by a string of 8 bits,
which is called a byte.
• In a computer system, data is organised in the following
ways:
• Meaningful data such as a name or an address is called a
data field.
• Related data fields are organised into a data record, which is
used to describe a certain object. E.g. a customer's record that
may contain an ID, name, address, etc.
• Related data records with the same structure are grouped
together as a data file (data table). E.g. a file that contains all
customer records.
• In business data storage, all the related data files are linked to
form a database. E.g. the database for an e-commerce website.
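The hierarchy above (field, record, file, database) can be sketched in Python. This is only an illustration: the class name, field names, and sample values below are made up and are not taken from the slides.

```python
from dataclasses import dataclass


@dataclass
class CustomerRecord:
    """One data record: a group of related data fields (hypothetical names)."""
    customer_id: int  # data field
    name: str         # data field
    address: str      # data field


# A data file (data table): records that all share the same structure.
# The sample values are invented for illustration.
customer_file = [
    CustomerRecord(1, "Alice Tan", "12 High Street"),
    CustomerRecord(2, "Bob Lee", "34 Park Road"),
]

# A database: related data files linked together, keyed here by file name.
database = {"customers": customer_file}

# One ASCII character is stored in one byte (8 bits).
name_bytes = customer_file[0].name.encode("ascii")
name_bits = len(name_bytes) * 8
```

Here `name_bytes` holds one byte per character of the first customer's name, which ties the record structure back to the bit and byte definitions.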
An Example of Data Organisation
• The following shows the contents of a data file
• Each row in the file is a data record
• Each data item in a record is a data field
Data Measurement
• Byte is the basic unit for measuring the size of data.
• Commonly used units for data size are
Kilobyte (KB) = 1024 bytes = $2^{10}$ bytes
Megabyte (MB) = 1024 KB = $2^{20}$ bytes, approximately $10^{6}$ bytes
Gigabyte (GB) = 1024 MB = $2^{30}$ bytes, approximately $10^{9}$ bytes
Terabyte (TB) = 1024 GB = $2^{40}$ bytes, approximately $10^{12}$ bytes
Petabyte (PB) = 1024 TB = $2^{50}$ bytes, approximately $10^{15}$ bytes.
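As a rough sketch, the units above can be applied by dividing by 1024 repeatedly. The function name `human_size` and the two-decimal output format are assumptions made for this example.

```python
# Units from the list above, each 1024 times larger than the previous one.
UNITS = ["bytes", "KB", "MB", "GB", "TB", "PB"]


def human_size(num_bytes: int) -> str:
    """Express a byte count in the largest unit that keeps the value below 1024."""
    size = float(num_bytes)
    for unit in UNITS:
        # Stop at the first unit that fits, or at PB, the largest unit listed.
        if size < 1024 or unit == UNITS[-1]:
            return f"{size:.2f} {unit}"
        size /= 1024


# The binary unit and its decimal approximation are close but not equal.
megabyte_exact = 2 ** 20    # 1048576 bytes
megabyte_approx = 10 ** 6   # 1000000 bytes
```

For example, `human_size(5 * 2**30)` gives `"5.00 GB"`.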
Disk Structure
• A hard disk contains a number of
metal platters coated with magnetic
material on surfaces to store data.
• The surface of a magnetic disk is
organized into:
• Tracks: Concentric rings on the
platter.
• Cylinders: The set of tracks that lie at
the same position (radius) on every
platter.
• Sectors: Divisions of each track,
marked off by radial lines.
• The disk read/write heads move
together, carried by the disk arm.
The disk itself rotates to provide
access to every sector.
Hard Disk Structure
Physical Disk Structure
• Physical disk structure is the layout that physically
exists on the disk surfaces.
• Formatting is the operation which creates the physical
disk structure.
• Formatting organizes and marks the surface of a disk
into tracks, sectors, and cylinders.
• The term is also sometimes (incorrectly) used for
the action of writing a file system to a disk (especially
in the MS Windows/MS DOS world).
Logical Disk Structure
• Logical disk structure describes how the disk
surface is organised to store data. It is only a conceptual
definition and may not exist physically.
• Partition: A disk can be coarsely divided into partitions –
each of which functions as an independent storage
device.
• Drive: A partition of a disk.
• Block: The operating system views all the disk space as
arranged into an array of fixed-size logical blocks. A
logical block is the smallest unit of data transfer.
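A minimal sketch of the "array of fixed-size logical blocks" view, assuming a block size of 4096 bytes (the slides do not state one). The function names are hypothetical.

```python
BLOCK_SIZE = 4096  # assumed block size in bytes; not given in the slides


def locate(byte_offset: int) -> tuple[int, int]:
    """Return (logical block number, offset inside that block) for a byte offset."""
    return divmod(byte_offset, BLOCK_SIZE)


def blocks_needed(file_size: int) -> int:
    """Number of whole blocks needed to hold file_size bytes (rounded up)."""
    return -(-file_size // BLOCK_SIZE)
```

Because the block is the smallest unit of transfer, a file of 10000 bytes occupies three whole blocks under this assumption, even though the last block is only partly used.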
File Allocation
• File allocation describes how a file is stored in secondary
storage and how it can be accessed by computer
programs
• There are three types of file allocation:
o Contiguous allocation
o Chained or linked allocation
o Indexed allocation (e.g. inode)
• Reference:
File allocation
http://www.geeksforgeeks.org/file-allocation-methods/
Contiguous Allocation
• A single contiguous set of blocks is allocated to a file at the
time of file creation.
• The operating system only stores the location of the
starting block and the total size of the file.
• Block number B of the file resides at location
(starting block + B).
• Supports random access:
The operating system knows exactly where every block is
after the starting block by a single calculation.
• Fragmentation of unused space (external fragmentation)
will occur, so compaction (defragmentation) is needed.
• Often used in optical discs rather than magnetic disks
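The single calculation behind contiguous allocation can be sketched as follows. The class and attribute names are hypothetical, and the file length is assumed to be measured in blocks.

```python
from dataclasses import dataclass


@dataclass
class ContiguousFile:
    """What the operating system stores for a contiguously allocated file."""
    start_block: int  # disk address of the file's first block
    length: int       # total size of the file, assumed to be in blocks

    def block_address(self, b: int) -> int:
        """Disk address of block number b: starting block + b."""
        if not 0 <= b < self.length:
            raise IndexError(f"block {b} is outside the file")
        return self.start_block + b


# Hypothetical file occupying disk blocks 14, 15 and 16.
example_file = ContiguousFile(start_block=14, length=3)
last_address = example_file.block_address(2)
```

No other block has to be read to find `last_address`, which is why this scheme supports random access.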
Contiguous Allocation (cont.)
Chained (Linked) Allocation

• A file is saved as a collection of non-contiguous blocks.
• The file is implemented as a linked list of blocks.
• Each block contains a pointer to the address of the next
block. The last block contains an invalid (negative) number as
its pointer (the End-Of-File marker).
• The directory entry contains the head (starting) block
number and the length of the file.
• Chained allocation is good for sequential access but bad
for random access.
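A small sketch of why chained allocation is slow for random access: reaching block b means following b pointers from the head. The disk is modelled here as a dictionary, and the block addresses and contents are invented for the example.

```python
EOF = -1  # invalid (negative) pointer marking the last block

# Hypothetical disk: address -> (data, address of the next block in the file).
disk = {
    9: ("A", 16),
    16: ("B", 1),
    1: ("C", 10),
    10: ("D", 25),
    25: ("E", EOF),
}


def find_block(disk: dict, head: int, b: int) -> tuple[int, int]:
    """Return (disk address of block b, number of pointers followed)."""
    address = head
    hops = 0
    for _ in range(b):
        _, address = disk[address]  # read the block to learn the next address
        hops += 1
        if address == EOF:
            raise IndexError(f"block {b} is past the end of the file")
    return address, hops
```

With the head at address 9, `find_block(disk, 9, 3)` has to read three earlier blocks before it learns that block 3 lives at address 10.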
Chained (Linked) Allocation (cont.)
Indexed (inode) Allocation
• The inode system is used in Unix.
• Each file has an inode, and all inodes are numbered.
• In addition to the inode, a file has a specialised block
known as the index.
• The blocks in a file are indexed.
• The index points to the addresses (locations) of the
blocks.
• A binary tree is usually used to search the index for a
particular block.
• Reference:
https://www.youtube.com/watch?v=lzAY_mxFDAI
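The idea of an index that points to block addresses can be sketched with a flat list per inode. This is a deliberate simplification: it does not model the binary tree search mentioned above, and the inode number and addresses are invented.

```python
# Hypothetical inode table: inode number -> index (list of disk block addresses).
# Position b in the list holds the disk address of the file's block number b.
inodes = {
    7: [9, 16, 1, 10, 25],
}


def block_address(inode_number: int, b: int) -> int:
    """Look up the disk address of block b through the file's index."""
    index = inodes[inode_number]
    if not 0 <= b < len(index):
        raise IndexError(f"block {b} is outside the file")
    return index[b]
```

Unlike the chained scheme, the data blocks themselves hold no pointers, so finding block 3 of inode 7 only requires consulting the index.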
The Binary Tree and Inode
• An inode uses the binary tree
method to search for
block locations.
• Each index node
contains 2 links to other
nodes.
• The distance from the
top of the tree to the data
blocks is d.
• Example:
• k = 2 (for a binary tree)
• n = the number of
data blocks in a file
The Binary Tree and Inodes (cont.)
• The general formula:
$2^d = n$, therefore $d = \log_2(n)$
• In the case of a binary tree, k = 2 (binary)
• For example:
If there are 16 blocks in a file, n = 16
The maximum number of accesses needed to find the location
of a particular block is:
$d = \log_2(16) = 4$
This means that it takes at most 4 accesses to find the
location of any block in a file of 16 blocks.
• Quiz:
What is the maximum number of accesses needed to find a block if
there are 30 blocks in a file?
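The depth formula can be checked with a short loop. As an assumption beyond the slides, the sketch takes d to be the smallest whole number with k^d >= n, so that it also covers block counts that are not exact powers of k. The function name is hypothetical.

```python
def tree_depth(n: int, k: int = 2) -> int:
    """Smallest d such that a tree with k links per node reaches n data blocks."""
    if n < 1:
        raise ValueError("a file must have at least one block")
    depth = 0
    reachable = 1  # number of blocks reachable at the current depth
    while reachable < n:
        reachable *= k  # each extra level multiplies the reach by k
        depth += 1
    return depth
```

For the worked example, `tree_depth(16)` returns 4, matching $\log_2(16) = 4$.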
The Big O Notation
• Big O is a mathematical notation, indicating “on the order
of…”
• Used in file access calculations, it means “the maximum
number of accesses”
• Let n be the number of blocks in a file. How many disk
accesses does it take to find a particular block?
• Contiguous: $O(1)$, “about 1”
• Chained/Linked: $O(n)$, “roughly n blocks”
The average is n/2 blocks
• Indexed/Inode: $O(\log_2 n)$ blocks
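To compare the three schemes for a concrete file size, the counts above can be tabulated. This sketch assumes the indexed figure is rounded up to a whole number of accesses; the function and key names are made up for illustration.

```python
import math


def access_counts(n: int) -> dict:
    """Approximate disk accesses needed to find one block in a file of n blocks."""
    if n < 1:
        raise ValueError("a file must have at least one block")
    return {
        "contiguous": 1,                       # O(1): one address calculation
        "chained_worst": n,                    # O(n): follow the whole chain
        "chained_average": n / 2,              # half the chain on average
        "indexed": math.ceil(math.log2(n)),    # O(log2 n), rounded up (assumption)
    }
```

For a file of 1024 blocks this gives 1, 1024, 512.0 and 10 accesses respectively, which shows how quickly the chained scheme falls behind as files grow.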
