0% found this document useful (0 votes)
1K views17 pages

Fundamental File Structure Concepts

This document summarizes fundamental file structure concepts. It discusses files as a stream of bytes with no structure or as a collection of records with fields. It describes key file concepts like records, fields, keys, and different record and field structures. It also covers different types of file organization like sequential, indexed, indexed sequential, and hashed files as well as common file operations.

Uploaded by

Junaid khan
Copyright
© Attribution Non-Commercial (BY-NC)
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PPT, PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
1K views17 pages

Fundamental File Structure Concepts

This document summarizes fundamental file structure concepts. It discusses files as a stream of bytes with no structure or as a collection of records with fields. It describes key file concepts like records, fields, keys, and different record and field structures. It also covers different types of file organization like sequential, indexed, indexed sequential, and hashed files as well as common file operations.

Uploaded by

Junaid khan
Copyright
© Attribution Non-Commercial (BY-NC)
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PPT, PDF, TXT or read online on Scribd
You are on page 1/ 17

Fundamental File

Structure Concepts

By

Junaid Ali
Siddiqui
1
Files
A file can be seen as
2. A stream of bytes (no structure), or

3. A collection of records with fields

2
A Stream File
 File is viewed as a sequence of bytes:
87359CarrollAlice in wonderland38180FolkFile Structures ...

 Data semantics is lost: there is no


way to get it apart again.

3
Field and Record Organization
Definitions
Record: a collection of related fields.
Field: the smallest logically meaningful unit
of information in a file.
Key: a subset of the fields in a record used
to identify (uniquely) the record.

e.g. In the example file of books:


 Each line corresponds to a record.
 Fields in each record: ISBN, Author, Title

4
Record Keys
 Primary key: a key that uniquely
identifies a record.
 Secondary key: other keys that may be
used for search
 Author name
 Book title
 Author name + book title
 Note that in general not every field is a
key (keys correspond to fields, or a
combination of fields, that may be used in
a search). 5
Field Structures
 Fixed-length fields
87359Carroll Alice in wonderland
38180Folk File Structures
 Begin each field with a length
indicator
058735907Carroll19Alice in wonderland
053818004Folk15File Structures
 Place a delimiter at the end of
each field
87359|Carroll|Alice in wonderland|
38180|Folk|File Structures|
 Store field as keyword = value
ISBN=87359|AU=Carroll|TI=Alice in wonderland|
ISBN=38180|AU=Folk|TI=File Structures 6
Record Structures
1. Fixed-length records.
2. Fixed number of fields.
3. Begin each record with a length
indicator.
4. Use an index to keep track of
addresses.
5. Place a delimiter at the end of the
record.
7
Fixed-length records
Two ways of making fixed-length
records:
2. Fixed-length records with fixed-
length fields.
87359 Carroll Alice in wonderland
03818 Folk File Structures

3. Fixed-length records with


variable-length fields.
87359|Carroll|Alice in wonderland| unused
38180|Folk|File Structures| unused

8
Variable-length records
 Fixed number of fields:
87359|Carroll|Alice in wonderland|38180|Folk|File Structures| ...

 Record beginning with length indicator:


3387359|Carroll|Alice in wonderland|2638180|Folk|File Structures| ..

 Use an index file to keep track of record


addresses:
 The index file keeps the byte offset for each record;
this allows us to search the index (which have fixed
length records) in order to discover the beginning of
the record.
 Placing a delimiter: e.g. end-of-line char
9
File Organization
 Four basic types of organization:
• Sequential
today

• Indexed

• Indexed Sequential

• Hashed

 In all cases we view a file as a sequence


of records.
 A record is a list of fields. Each field has
a data type.
10
File Operations
 Typical Operations:
 Retrieve a record
 Insert a record
 Delete a record
 Modify a field of a record
 In direct files:
 Get a record with a given field value
 In sequential files:
 Get the next record

11
Sequential files
 Records are stored contiguously on the
storage device.
 Sequential files are read from beginning
to end.
 Some operations are very efficient on
sequential files (e.g. finding averages)
 Organization of records:
– Unordered sequential files (pile files)

– Sorted sequential files (records are

ordered by some field)


12
Pile Files
 A pile file is a succession of records,
simply placed one after another with
no additional structure.
 Records may vary in length.
 Typical Request:
 Print all records with a given field value
 e.g. print all books by Folk.
 We must examine each record in the
file, in order, starting from the first
record. 13
Searching Sequential Files
 To look-up a record, given the value of one
or more of its fields, we must search the
whole file.
 In general, (b is the total number of blocks
in file):
 At least 1 block is accessed
 At most b blocks are accessed.
 On average 1/b * b (b + 1) / 2 => b/2
 Thus, time to find and read a record in a
pile file is approximately
Time to: fetch one record
TF = (b/2) * btt 14
Exhaustive Reading of the File
 Read and process all records
(reading order is not important)
TX = b * btt
(approximately twice the time to fetch one
record)
 e.g. Finding averages, min or max, or
sum.
 Pile file is the best organization for this
kind of operations.
 They can be calculated using double 15
Sorted Sequential Files
 Sorted files are usually read sequentially
to produce lists, such as mailing lists,
invoices.etc.
 A sorted file cannot stay in order after
additions (usually it is used as a temporary
file).
 A sorted file will have an overflow area of
added records. Overflow area is not
sorted.
 To find a record:
 First look at sorted area
 Then search overflow area
16

Searching for a record
 We can do binary search (assuming
fixed-length records) in the sorted part.
Sorted part overflow

x blocks y blocks (x + y = b)

17

You might also like