Data Organisation
• A binary digit (bit) is either 0 or 1.
• A character of data is represented by a string of 8 bits, which is called a byte.
• In a computer system, data is organised in the following ways:
o Meaningful data such as a name or an address is called a data field.
o Related data fields are organised into a data record, which describes a certain object. E.g. a customer's record may contain an ID, name, address, etc.
o Related data records with the same structure are grouped together as a data file (data table). E.g. a file that contains all customer records.
o In business data storage, all the related data files are linked to form a database. E.g. the database for an e-commerce website.

An Example of Data Organisation
• The figure on this slide shows the contents of a data file.
• Each row in the file is a data record.
• Each data item in a row is a data field.

Data Measurement
• The byte is the basic unit for measuring the size of data.
• Commonly used units for data size are:
o Kilobyte (KB) = 1024 bytes = 2^10 bytes
o Megabyte (MB) = 1024 KB = 2^20 bytes, approximately 10^6 bytes
o Gigabyte (GB) = 1024 MB = 2^30 bytes, approximately 10^9 bytes
o Terabyte (TB) = 1024 GB = 2^40 bytes, approximately 10^12 bytes
o Petabyte (PB) = 1024 TB = 2^50 bytes, approximately 10^15 bytes

Disk Structure
• A hard disk contains a number of metal platters whose surfaces are coated with magnetic material to store data.
• The surface of a magnetic disk is organised into:
o Tracks: Concentric rings on the platter.
o Cylinders: The set of tracks at the same radial position on all platters.
o Sectors: Sections of each track, divided by radial lines.
• The read/write heads are mounted on the disk arm and move together. The disk itself rotates to provide access to every sector.

Hard Disk Structure (figure)

Physical Disk Structure
• Physical disk structure is the layout that physically exists on the disk surfaces.
• Formatting is the operation which creates the physical disk structure.
• Formatting organises and marks the surface of a disk into tracks, sectors, and cylinders.
• The term "formatting" is also sometimes used (incorrectly) for the action of writing a file system to a disk, especially in the MS Windows/MS-DOS world.

Logical Disk Structure
• Logical disk structure describes how the disk surface is organised to store data. It is only a conceptual definition and may not exist physically.
• Partition: A disk can be coarsely divided into partitions, each of which functions as an independent storage device.
• Drive: A partition of a disk.
• Block: The operating system views all the disk space as an array of fixed-size logical blocks. A logical block is the smallest unit of data that can be transferred.

File Allocation
• File allocation describes how a file is stored in secondary storage and how it can be accessed by computer programs.
• There are three types of file allocation:
o Contiguous allocation
o Chained or linked allocation
o Indexed allocation (e.g. inode)
• Reference: File allocation http://www.geeksforgeeks.org/file-allocation-methods/

Contiguous Allocation
• A single contiguous set of blocks is allocated to a file at the time of file creation.
• The operating system only stores the location of the starting block and the total size of the file.
• Block number B of the file resides at location (starting block + B).
• Supports random access: the operating system can find any block with a single calculation from the starting block.
• Fragmentation of unused space (external fragmentation) will occur, so compaction (defragmentation) is needed.
• Often used on optical discs rather than magnetic disks.

Chained (Linked) Allocation
• File is saved as a collection of non-contiguous blocks
• The file is implemented as a linked list of blocks.
• Each block contains a pointer to the address of the next block. The last block contains an invalid (negative) number as its pointer (the End-Of-File marker).
• The directory entry contains the head (starting) block number and the length of the file.
• Chained allocation is good for sequential access but bad for random access.

Indexed (inode) Allocation
• The inode system is used in Unix.
• Each file has an inode, and all inodes are numbered.
• In addition to the inode, there is a specialised block in a file known as the index.
• The blocks in a file are indexed.
• The index points to the addresses (locations) of the blocks.
• A binary tree is usually used to search the index for a particular block.
• Reference: https://www.youtube.com/watch?v=lzAY_mxFDAI

The Binary Tree and Inode
• The inode uses the binary tree method to search for block locations.
• Each index node contains 2 links to other nodes.
[Figure: binary index tree; surviving node labels: 8-11, 12-15]
• The distance from the top of the tree to the data blocks is d.
• Example:
o k = 2, the number of links per node (binary tree)
o n = the number of data blocks in a file
• The general formula: 2^d = n, therefore d = log2(n).
• When n is not a power of 2, d is rounded up to the next whole number.
• For example, if there are 16 blocks in a file, n = 16. The maximum number of accesses needed to find the location of a particular block is d = log2(16) = 4.
• This means that it takes at most 4 accesses to find the location of any block in a file of 16 blocks.
• Quiz: What is the maximum number of accesses needed to find a block if there are 30 blocks in a file?

The Big O Notation
• It is a mathematical notation meaning "on the order of…". Used in file access calculations, it describes how the maximum number of accesses grows with the size of the file.
• Let n be the number of blocks in a file. How many disk accesses does it take to find a particular block?
• Contiguous: O(1), "about 1" access
• Chained/Linked: O(n), "roughly n" accesses in the worst case; the average is n/2
• Indexed/Inode: O(log2(n)) accesses
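
The access counts for the three allocation methods can be sketched in a few lines of Python. This is only an illustration of the formulas on the slides, not how a real file system is implemented: the function names, the dictionary used as a chain of pointers, and the disk block numbers are all made up for the example. The indexed case assumes the binary tree model above, with d rounded up when n is not a power of 2.

```python
def contiguous_lookup(start_block: int, b: int) -> int:
    """Contiguous allocation: block b is found by one calculation, O(1)."""
    return start_block + b


def chained_lookup(next_block: dict, head: int, b: int):
    """Chained allocation: follow pointers from the head block, O(n).

    next_block maps a disk block number to the next one in the file;
    -1 is the End-Of-File marker. Returns (disk block, blocks read).
    """
    current = head
    reads = 1  # reading the head block counts as one access
    for _ in range(b):
        current = next_block[current]
        if current == -1:
            raise IndexError("block number is past the end of the file")
        reads += 1
    return current, reads


def worst_case_accesses(n_blocks: int) -> dict:
    """Maximum accesses to find one block in a file of n_blocks blocks."""
    if n_blocks < 1:
        raise ValueError("a file needs at least one block")
    return {
        "contiguous": 1,
        "chained": n_blocks,
        # (n - 1).bit_length() equals log2(n) rounded up, using only
        # integer arithmetic (no floating-point rounding problems).
        "indexed": (n_blocks - 1).bit_length(),
    }


if __name__ == "__main__":
    # A 4-block file stored at (hypothetical) disk blocks 7 -> 2 -> 9 -> 4.
    chain = {7: 2, 2: 9, 9: 4, 4: -1}
    print(contiguous_lookup(100, 3))   # 103
    print(chained_lookup(chain, 7, 2))  # (9, 3)
    print(worst_case_accesses(16))
    # {'contiguous': 1, 'chained': 16, 'indexed': 4}
```

Calling worst_case_accesses with a different block count is one way to check an answer to the quiz.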