Internal File Structure: Methods and Design Paradigm

The document discusses various methods of organizing computer files, including their internal and external structures. It describes sequential, line-sequential, indexed-sequential, inverted list, and direct/hashed access methods of file organization. Key considerations for file organization include rapid access to related records, efficiency of storage and retrieval, and ensuring data integrity. The document also discusses file extensions and how file naming conventions differ between FAT and NTFS file systems.

Uploaded by

dhiraj100

Introduction

File organization is the method used to structure computer files. Files contain records,
which can be documents or other information stored in a defined way for later retrieval.
File organization refers primarily to the logical arrangement of data in a file system
(the data can itself be organized as a system of records with related fields or columns).
It should not be confused with the physical storage of the file on a particular storage
medium. The basic types of computer file include files stored as blocks of data and files
stored as streams of data, where the information streams out of the file as it is read
until the end of the file is encountered.

We will look at two components of file organization here:

1. The way the internal file structure is arranged and

2. The external file as it is presented to the O/S or program that calls it. Here we will
also examine the concept of file extensions.

We will examine various ways that files can be stored and organized. Files are presented
to the application as a stream of bytes and then an EOF (end of file) condition.

A program that uses a file needs to know the structure of the file and needs to interpret its
contents.
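
As a minimal sketch of this idea, the following Python example reads a byte stream until the EOF condition and interprets it using a record layout that the program must already know. The layout (a 4-byte id followed by a 12-byte name), the `RECORD` name, and the `read_records` helper are assumptions made for illustration only.

```python
import io
import struct

# Hypothetical record layout: 4-byte unsigned id, then a 12-byte name field.
RECORD = struct.Struct("<I12s")

def read_records(stream):
    """Yield (id, name) tuples until the stream reports end of file."""
    while True:
        chunk = stream.read(RECORD.size)
        if not chunk:                      # an empty read is the EOF condition
            break
        if len(chunk) < RECORD.size:
            raise ValueError("truncated record at end of file")
        rec_id, raw_name = RECORD.unpack(chunk)
        yield rec_id, raw_name.rstrip(b"\x00").decode("ascii")

# An in-memory stream stands in for a file opened in binary mode.
data = RECORD.pack(1, b"alpha") + RECORD.pack(2, b"beta")
records = list(read_records(io.BytesIO(data)))
```

The same bytes would be meaningless to a program that assumed a different layout, which is why the structure has to be agreed on outside the file itself.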

Internal File Structure

Methods and Design Paradigm

Specifying a system of file organization for a software program, or for a computer system
designed for a particular purpose, is a high-level design decision. Performance is high on
the list of priorities in this design process, and how much it matters depends on how the
file is used. The design of the file organization usually depends mainly on the system
environment: for instance, whether the file will be used for transaction-oriented
processing such as OLTP or for data warehousing, and whether the file is shared among
several processes, as in a typical distributed system, or used standalone. It must also be
asked whether the file is on a network and used by a number of users, whether it is
accessed locally or remotely, and how often it is accessed.

However, all things considered, the most important considerations might be:

1. Rapid access to a record or to a number of related records.
2. The addition, modification, or deletion of records.
3. Efficiency of storage and retrieval of records.
4. Redundancy, as a means of ensuring data integrity.

A file should be organized in such a way that the records are always available for
processing with no delay. This should be done in line with the activity and volatility of
the information.

Types of File Organization

How a file is organized depends on what kind of file it is. In its simplest form a file
can be a text file, that is, a file composed of ASCII (American Standard Code for
Information Interchange) text. Files can also be created as binary or executable types,
which contain elements other than plain text. Files are also tagged with attributes that
help determine their use by the host operating system.

Techniques of File Organization

The three techniques of file organization are:

1. Heap (unordered)
2. Sorted
   1. Sequential (SAM)
   2. Line Sequential (LSAM)
   3. Indexed Sequential (ISAM)
3. Hashed or Direct

In addition to the three techniques, there are five methods of organizing files:
sequential, line-sequential, indexed-sequential, inverted list, and direct or hashed
access organization.

Sequential Organization

A sequential file contains records organized in the order they were entered. The order of
the records is fixed. The records are stored in physically contiguous blocks, and within
each block the records are in sequence.

Records in these files can only be read or written sequentially.

Once stored in the file, a record cannot be made shorter or longer, or be deleted. However,
a record can be updated if its length does not change. (Other changes are made by
creating a new file in which the records are replaced.) New records always appear at the
end of the file.

If the order of the records in a file is not important, sequential organization will
suffice, no matter how many records you may have. Sequential output is also useful for
report printing or sequential reads which some programs prefer to do.
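
The rules above can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than a definitive implementation: the 16-byte record length and the helper names `append_record`, `update_record`, and `read_all` are hypothetical.

```python
import os
import tempfile

RECORD_LEN = 16  # hypothetical fixed record length in bytes

def encode(text):
    """Pad text with spaces so that every record is exactly RECORD_LEN bytes."""
    raw = text.encode("ascii")
    if len(raw) > RECORD_LEN:
        raise ValueError("record too long")
    return raw.ljust(RECORD_LEN)

def append_record(path, text):
    with open(path, "ab") as f:        # new records always go at the end
        f.write(encode(text))

def update_record(path, index, text):
    with open(path, "r+b") as f:       # overwrite in place, length unchanged
        f.seek(index * RECORD_LEN)
        f.write(encode(text))

def read_all(path):
    out = []
    with open(path, "rb") as f:        # read front to back until EOF
        while chunk := f.read(RECORD_LEN):
            out.append(chunk.decode("ascii").rstrip())
    return out

path = os.path.join(tempfile.mkdtemp(), "seq.dat")
for name in ("first", "second", "third"):
    append_record(path, name)
update_record(path, 1, "SECOND")
contents = read_all(path)
```

Because every record occupies the same number of bytes, the update can overwrite one record without disturbing its neighbours. Deleting a record or changing its length would shift everything after it, which is why those operations need a new file.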

Line-Sequential Organization

Line-sequential files are like sequential files, except that the records can contain only
characters as data. Line-sequential files are maintained by the native byte stream files of
the operating system.
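
A short Python sketch of a line-sequential file follows. It assumes that each record is one line of ASCII characters ended by a newline; the comma-separated field layout and the file name are hypothetical.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "lines.txt")

# Each record is one line of characters; the newline is the record delimiter.
with open(path, "w", encoding="ascii", newline="\n") as f:
    for record in ("1001,widget,4", "1002,gadget,12"):
        f.write(record + "\n")

# Reading walks the byte stream line by line, in the order written.
with open(path, "r", encoding="ascii") as f:
    rows = [line.rstrip("\n").split(",") for line in f]
```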

In the COBOL environment, line-sequential files that are created with WRITE statements
with the ADVANCING phrase can be directed to a printer as well as to a disk.

Indexed-Sequential Organization

Indexed-sequential organization improves key searches. The single-level indexing
structure is the simplest one: an index file whose records are (key, pointer) pairs, where
the pointer is the position in the data file of the record with the given key. Only a
subset of the records, evenly spaced along the data file, is indexed, so that the index
entries mark intervals of data records.

This is how a key search is performed: the search key is compared with the index keys to
find the highest index key that does not exceed the search key, and then a linear search
is performed from the record that this index key points to, until the search key is
matched or until the record pointed to by the next index entry is reached. Despite the
double file access (index + data) required by this sort of search, the reduction in
access time is significant compared with sequential file searches.

Let's examine, for the sake of example, a simple linear search on a 1,000-record
sequentially organized file. An average of 500 key comparisons is needed (assuming the
search keys are uniformly distributed among the data keys). However, using an evenly
spaced index with 100 entries, each index entry covers an interval of 10 records, so the
average number of comparisons falls to about 50 in the index file plus about 5 in the
data file: roughly a nine to one reduction in the operations count.
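
The comparison counts can be checked with a small Python sketch. It assumes an in-memory list of sorted keys standing in for the data file and a list of (key, position) pairs standing in for the index file; the function names are hypothetical and both searches are plain linear scans, as in the description above.

```python
def sequential_search(keys, target):
    """Linear scan; returns (position or None, number of key comparisons)."""
    comparisons = 0
    for pos, key in enumerate(keys):
        comparisons += 1
        if key == target:
            return pos, comparisons
    return None, comparisons

def build_index(keys, spacing):
    """Index every `spacing`-th record as a (key, position) pair."""
    return [(keys[pos], pos) for pos in range(0, len(keys), spacing)]

def indexed_search(keys, index, target):
    comparisons = 0
    start, end = 0, len(keys)
    # Step 1: find the highest index key that does not exceed the target.
    for key, pos in index:
        comparisons += 1
        if key > target:
            end = pos                  # the next index entry bounds the scan
            break
        start = pos
    # Step 2: linear scan of the interval that the index entry points to.
    for pos in range(start, end):
        comparisons += 1
        if keys[pos] == target:
            return pos, comparisons
    return None, comparisons

keys = list(range(0, 2000, 2))         # 1,000 sorted keys
index = build_index(keys, spacing=10)  # 100 evenly spaced index entries
seq_avg = sum(sequential_search(keys, k)[1] for k in keys) / len(keys)
idx_avg = sum(indexed_search(keys, index, k)[1] for k in keys) / len(keys)
```

Comparing `seq_avg` with `idx_avg` shows the saving for this particular spacing. Changing `spacing` trades index size against the length of the final scan.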

This scheme can be extended hierarchically, since an index is itself a sequential file
and can in turn be indexed by a second-level index, and so on. Exploiting this
hierarchical decomposition of the search further reduces access time and therefore
processing time. There is, however, a point at which the advantage is offset by the
increased cost of storage, which in turn increases the index access time.

Hardware for indexed-sequential organization is usually disk-based, rather than tape.

Records are physically ordered by primary key, and the index gives the physical location
of each record. Records can be accessed sequentially or directly, via the index. The index
is stored in a file and read into memory when the file is opened. Indexes must also be
maintained.

As in sequential organization, the data is stored in physically contiguous blocks. The
difference is in the use of indexes. There are three areas in the disk storage:

• Primary Area: contains file records stored by key or ID number.
• Overflow Area: contains records that cannot be placed in the primary area.
• Index Area: contains the keys of the records and their locations on the disk.

Inverted List

In file organization, this is a file that is indexed on many of the attributes of the data
itself. The inverted list method has a single index for each key type. The records are not
necessarily stored in sequence. They are placed in the data storage area, and the indexes
are updated with the record keys and locations.

For example, in a company file, one index could be maintained for all products and
another for product types. It is then faster to search the indexes than to search every
record. Files of this type are also known as "inverted indexes." However, inverted list
files use more media space, so storage devices fill up more quickly with this type of
organization. The benefit is immediately apparent because searching is fast, but
updating is much slower.
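
The company-file example can be sketched in Python as follows. The sample records, the attribute names, and the helpers `build_inverted_index` and `add_record` are hypothetical, and in-memory dictionaries stand in for the index files.

```python
from collections import defaultdict

# Hypothetical company file: records stored in arrival order, not by key.
records = [
    {"product": "hammer", "type": "tool"},
    {"product": "apple", "type": "food"},
    {"product": "saw", "type": "tool"},
]

def build_inverted_index(rows, attribute):
    """Map each attribute value to the positions of the records that hold it."""
    index = defaultdict(list)
    for position, row in enumerate(rows):
        index[row[attribute]].append(position)
    return index

# One index per key type.
by_product = build_inverted_index(records, "product")
by_type = build_inverted_index(records, "type")

def add_record(row):
    """Adding a record means updating every index, hence the slower updates."""
    records.append(row)
    position = len(records) - 1
    by_product[row["product"]].append(position)
    by_type[row["type"]].append(position)

add_record({"product": "pear", "type": "food"})
tools = [records[p]["product"] for p in by_type["tool"]]
foods = [records[p]["product"] for p in by_type["food"]]
```

A lookup by type touches only the matching positions instead of every record, while each insertion pays once per index. That is the trade-off between fast searches and slower updates described above.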

Content-based queries in text retrieval systems use inverted indexes as their preferred
mechanism. Data items in these systems are usually stored compressed, which would
normally slow the retrieval process, so the compression algorithm is chosen to
support this technique.

In certain circumstances a query is designed to be modal, which means that rules are set
that require different information to be held in the index. Here's an example of this
modality: when phrase querying is undertaken, the particular algorithm requires that
offsets to word classifications are held in addition to document numbers.
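
A minimal Python sketch of a phrase query over such an index is shown below. It assumes that the offsets are word positions within each document, which is one common reading of the description above; the sample documents and the `phrase_query` function are hypothetical.

```python
from collections import defaultdict

docs = {
    1: "direct file access is fast",
    2: "file access by index",
    3: "access the file directly",
}

# word -> {document number -> [word offsets within that document]}
index = defaultdict(dict)
for doc_id, text in docs.items():
    for offset, word in enumerate(text.split()):
        index[word].setdefault(doc_id, []).append(offset)

def phrase_query(phrase):
    """Return the documents in which the words of `phrase` appear consecutively."""
    words = phrase.split()
    first, rest = words[0], words[1:]
    hits = []
    for doc_id, offsets in index.get(first, {}).items():
        for start in offsets:
            # Every later word must sit at the next offset in the same document.
            if all(start + i + 1 in index.get(w, {}).get(doc_id, [])
                   for i, w in enumerate(rest)):
                hits.append(doc_id)
                break
    return sorted(hits)

result = phrase_query("file access")
```

Document numbers alone would match document 3 as well, because it contains both words. The stored offsets are what allow the query to reject it.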

Direct or Hashed Access

With direct or hashed access, a portion of disk space is reserved and a “hashing”
algorithm computes the record address, so this kind of file requires additional space in
the store. Records are placed randomly throughout the file and are accessed by addresses
that specify their disk location. This type of file organization also requires disk
storage rather than tape. It has excellent search and retrieval performance, but care
must be taken to maintain the indexes. If the indexes become corrupt, what is left may as
well go to the bit bucket, so it is as well to have regular backups of this kind of file,
just as for all valuable stored data.
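
The following Python sketch illustrates the idea under several assumptions that are not in the text above: a file of eight fixed-size slots, CRC-32 as the hashing algorithm, and linear probing when two keys hash to the same slot. An in-memory buffer stands in for the reserved disk space, and all names are hypothetical.

```python
import io
import zlib

SLOTS = 8                      # hypothetical number of reserved record slots
KEY_LEN, VAL_LEN = 8, 24
SLOT_LEN = KEY_LEN + VAL_LEN
EMPTY = b"\x00" * KEY_LEN      # an untouched slot is all zero bytes

def new_file():
    """Reserve all the space up front, before any record is stored."""
    return io.BytesIO(b"\x00" * (SLOTS * SLOT_LEN))

def _pack(text, length):
    raw = text.encode("ascii")
    if not raw or len(raw) > length:
        raise ValueError("field empty or too long")
    return raw.ljust(length)   # pad with spaces to the fixed field width

def _slot_for(key_bytes):
    """The hashing step: turn a key into a slot number."""
    return zlib.crc32(key_bytes) % SLOTS

def put(f, key, value):
    k = _pack(key, KEY_LEN)
    start = _slot_for(k)
    for probe in range(SLOTS):             # linear probing on collision
        slot = (start + probe) % SLOTS
        f.seek(slot * SLOT_LEN)
        current = f.read(KEY_LEN)
        if current in (EMPTY, k):
            f.seek(slot * SLOT_LEN)
            f.write(k + _pack(value, VAL_LEN))
            return slot
    raise RuntimeError("file is full")

def get(f, key):
    k = _pack(key, KEY_LEN)
    start = _slot_for(k)
    for probe in range(SLOTS):
        slot = (start + probe) % SLOTS
        f.seek(slot * SLOT_LEN)
        current = f.read(KEY_LEN)
        if current == EMPTY:
            return None                    # an empty slot ends the search
        if current == k:
            return f.read(VAL_LEN).decode("ascii").rstrip()
    return None

store = new_file()
put(store, "A100", "widget")
put(store, "B200", "gadget")
found = get(store, "B200")
```

The record address is computed from the key, so a lookup seeks almost directly to the right slot instead of scanning the file. The price is the space reserved in advance and the need for random access, which tape cannot provide efficiently.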

External File Structure and File Extensions

Microsoft Windows and MS-DOS File Systems

The external structure of a file depends on whether it is being created on a FAT or NTFS
partition. The maximum filename length on an NTFS partition is 255 characters; on FAT it
is 11 characters (an 8-character name and a 3-character extension, written with a "."
between them). NTFS filenames keep their case, whereas FAT filenames have no concept of
case (although case is ignored when performing a search under NTFS). There is also
the newer VFAT, which permits filenames of up to 255 characters.

UNIX and Apple Macintosh File Systems

The concept of directories and files is fundamental to the UNIX operating system. On
Microsoft Windows-based operating systems, directories are depicted as folders and
moving about is accomplished by clicking on the different icons. In UNIX, the directories
are arranged as a hierarchy with the root directory being at the top of the tree. The root
directory is always depicted as /. Within the / directory, there are subdirectories (e.g.: etc
and sys). Files can be written to any directory depending on the permissions. Files can be
readable, writable and/or executable.
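
On a POSIX system these permissions can be inspected from Python, as in the sketch below. It assumes a UNIX-like host; the temporary file and the chosen mode `0o640` are arbitrary illustrations.

```python
import os
import stat
import tempfile

fd, path = tempfile.mkstemp()
os.close(fd)
os.chmod(path, 0o640)            # owner: read/write, group: read, others: none

mode = os.stat(path).st_mode
listing = stat.filemode(mode)    # the same form that `ls -l` prints
owner_can_write = bool(mode & stat.S_IWUSR)
others_can_read = bool(mode & stat.S_IROTH)
os.remove(path)
```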

Organizing files using Libraries

With Microsoft Windows 7, file organization and management gained a powerful tool called
Libraries. A Library is a file organization system that brings together related files and
folders stored in different locations, on the local computer as well as on networked
computers, so that they can be accessed through a single access point. For instance,
images stored in different folders on the local computer and/or across a computer network
can be gathered in an Image Library. The aggregated files can then be manipulated, sorted,
or accessed conveniently, as and when required, through a single access point on the
desktop. This feature is particularly useful for accessing similar or related content,
and also for managing projects that use related and common data.
