
TOPIC THREE-File System

The document outlines various filing systems and their definitions, including files, records, fields, and methods of organization and access. It describes different file organization methods such as serial, sequential, indexed sequential, and random, detailing their characteristics, uses, and processes for reading and updating records. Additionally, it covers concepts like data buffers and parity checks for error detection in data transmission.

Uploaded by

Vokez Hitch

FILING SYSTEMS

DEFINITIONS
File = An organised collection of RELATED records. e.g. ALL the records of students in a college.
Record = ALL the details about a particular person or thing. e.g. details of one student.
Field = One indivisible item within a record. e.g. Date of Birth
Organisation = The way in which records are held on the file. This will dictate HOW the data can
be accessed.
Access = The method of retrieving a record from file. This is limited by the method of organisation.
In trying to locate a record on file, either the record is found and read into memory, or a
positive statement that the record is not present is output. A record may be missing because the
user keyed in the wrong access details or because another clerk has not yet entered the relevant data.
Key field = The field on a file normally used to order that file. Student number might be used to
ensure the records on the student file are in student number order.
Block = Records on all media are grouped in blocks (sometimes called buckets). A single record
is NOT read from the file into memory but the whole block is read. This speeds up file access
because the slowest part of disc reading is locating where a record is. By reading a block, the next
record is also in memory and this is often the next needed.
Hit-rate = This is the proportion of records needing changing compared with the total number of
records on the file. If this is high, serial or sequential organisation and access could be used. If
low, then perhaps indexed-sequential or random organisation is needed.
Master file = This is a main/semi-permanent file holding all the details about a particular business
area. e.g. All employee records would be held on an employee file and separate from the customer
records on the customer file. Much of the data stays the same for a long time. Changes might be
added to the master file from the transaction file.
Transaction file = This is a temporary file used to collect changes until they are updated onto the
master file. This type of file is only used in batch processing. The file would first have to be sorted
so that a new file is produced which is in the same key order as the master file.
Batch Processing = Transaction data is collected as above over a period of time and then, perhaps
once per week or month, the master file is updated. Apart from keeping the transaction file for a
short while for security purposes, its data is no longer needed. (See below)
File Update = There are two methods.
By Copying – All changes to the master file are achieved by creating a new file, changing those
records that need changing but copying the others across unchanged (Serial/Sequential organisation).
By Overlaying – Changes are implemented immediately by overwriting existing records
(Indexed-Sequential/Random organisation).
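The hit-rate definition above is a simple ratio, and a short sketch may make it concrete. The function name and the figures used here are illustrative assumptions, not values from the notes.

```python
def hit_rate(records_changed: int, total_records: int) -> float:
    """Proportion of records on the file that need changing in one run."""
    if total_records <= 0:
        raise ValueError("the file must contain at least one record")
    return records_changed / total_records

# Hypothetical figures: a payroll run touches most employee records,
# while one address change touches a single customer record.
payroll_run = hit_rate(950, 1000)
address_change = hit_rate(1, 20000)
```

A high value such as the payroll figure points towards serial or sequential processing, while a very low value points towards indexed-sequential or random organisation.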

WHY FILES?
Data cannot be retained in main memory of a computer because RAM is only temporary and too
small. Using ROM would prevent the data being changed. Hence, secondary storage systems were
developed to store data not currently needed in the computer. If a computer is currently running a
retail system, there would be no need for employee details to be available until a payroll run was
required. Data on file has to be organized.

FILE ORGANISATION
SERIAL (SER)
Records stored one after another in the order that they are entered into the computer. There are
no gaps between the records. The only order is therefore chronological - Customers do not
conveniently place orders in customer number order nor do they order in product number order.
Reading the file – Records can only be read in the order they are on file starting with the first.
To find a particular record, it could be necessary to read the whole file if the required record is
the last one.
Adding a new record to/Amending a record/Deleting a record from an existing file –
• Generally these processes require a new file to be created which is an exact copy of the
original but with the one change. Records can be added to the end of a disc file but with
tape, a new file is created.
Uses – Transaction data as it is received (processed LATER) – it may have to be sorted before
processing. SMALL reference files which can be read quickly from the beginning every time they are
accessed. e.g. VAT rates, payment levels for staff.
Media – ALL types of storage media permit Serial file processing.
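Reading a serial file as described above can be sketched as a plain scan from the first record. This is a minimal sketch that assumes the file has been loaded as a list of dictionaries; the field names and order data are made up for illustration.

```python
from typing import Iterable, Optional

def serial_find(records: Iterable[dict], field: str, wanted) -> Optional[dict]:
    """Read records in stored order, starting with the first, until a match is found."""
    for record in records:
        if record[field] == wanted:
            return record
    return None  # the whole file was read without finding the record

# Orders held in arrival (chronological) order, not in customer number order.
orders = [
    {"customer": 207, "product": "P15"},
    {"customer": 103, "product": "P02"},
    {"customer": 311, "product": "P09"},
]

last_order = serial_find(orders, "customer", 311)   # needs every record to be read
missing = serial_find(orders, "customer", 999)      # also reads the whole file
```

Because there is no key order, a record that is absent can only be reported as missing after the whole file has been read.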
SEQUENTIAL (SEQ)
Records are stored one after another with no gaps and in KEY FIELD order.
Reading the file – The key field value is entered into the computer. The file is read from the
beginning until the required record is located or until either a key-field value on file is higher than
the required one OR end of file is reached. In either of these last two cases, the record is not
present and this should be reported.
Adding a new record to/Amending a record/Deleting a record from an existing file –
• Generally these processes require a new file to be created which is an exact copy of the
original but with the one change and the record placed in its correct key field order.
Uses – Transaction data once it has been sorted into key order for later processing. SMALL master
files which can be read quickly from the beginning every time they are accessed. Reference/Master
files where instant access is not needed and batch processing is suitable.
Media – ALL types of storage media permit Sequential file processing.
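The sequential read described above differs from a serial read in that the search can stop early. The sketch below assumes records are dictionaries with a hypothetical "key" entry and returns the number of reads so the saving is visible.

```python
def sequential_find(records, wanted_key):
    """Search a file held in key order; stop as soon as a higher key is met."""
    reads = 0
    for record in records:
        reads += 1
        if record["key"] == wanted_key:
            return record, reads
        if record["key"] > wanted_key:
            break  # key order guarantees the record cannot appear later
    return None, reads  # not present: a higher key or end of file was reached

# Student file held in student number order (made-up data).
students = [
    {"key": 101, "name": "Ada"},
    {"key": 104, "name": "Ben"},
    {"key": 109, "name": "Cy"},
    {"key": 115, "name": "Dee"},
]

found, reads_for_found = sequential_find(students, 104)
absent, reads_for_absent = sequential_find(students, 105)
```

Searching for key 105 stops at record 109 rather than reading to the end of the file.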

INDEXED SEQUENTIAL (IS)


This organization stores records effectively in sequential order but in blocks, with a range of key
values associated with each block. There are therefore some physical gaps in the file if records for
some key values are not present. A small index file is associated with the main file, showing the
HIGHEST key field value permitted in each block. To keep the file small and avoid large physical
gaps, the range of key values allocated to a block may be wider than the block can hold, so that IF
all records in that range were present, they would not fit in. In practice, this is rare. To handle
it, an overflow area is allocated. Where the record SHOULD have been is placed
a small tag indicating the block number where the record actually is. Such a record would therefore
need TWO reads instead of the normal one.

Reading the whole file – The file is treated as sequential and follows the same process as above.
Reading the one record - The key field value is entered into the computer. The small INDEX file
is read sequentially until a key value is found which is equal to or greater than the required record.
This then identifies the block where the record is stored. The whole block is read into memory and
searched to find the required record. With very large files, there could be several levels of index
so that the first indicates the cylinder in which the record should appear and the second the block
within that cylinder. See hard disk (below).
Adding a new record to/Amending a record/Deleting a record from an existing file – The
single record is located as for reading above, the change is made in memory and then written back
to the file overwriting the original. Deleted records can be removed from the block to free up space
for other records that should be there, including those moved into the overflow area.
Uses – Reference/Master files where instant access is needed but also where the whole file might
be needed for certain processes. i.e. find a customer record but also report on the whole file for
customers who have not paid their bills or are to be targeted for a sales promotion.
This is ideal for online access, perhaps in a telephone ordering system.
Media – Only disc-based media because of the need to overwrite existing records.
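The single-record lookup described above (index first, then one block) can be sketched as follows. The index values, block contents and function name are illustrative assumptions, and the overflow area and multi-level indexes are left out to keep the sketch short.

```python
# Index file: the HIGHEST key value permitted in each block, in block order.
INDEX = [199, 399, 599]

# Main file: one list of records per block, each block in key order (made-up data).
BLOCKS = [
    [{"key": 120, "name": "Ali"}, {"key": 150, "name": "Bea"}],
    [{"key": 210, "name": "Cal"}, {"key": 380, "name": "Dot"}],
    [{"key": 450, "name": "Eve"}],
]

def find_record(key):
    """Read the index sequentially, then search only the block it identifies."""
    for block_no, highest_key in enumerate(INDEX):
        if highest_key >= key:
            # The whole block is read into memory and searched.
            for record in BLOCKS[block_no]:
                if record["key"] == key:
                    return record
            return None  # the key belongs in this block but is not present
    return None  # the key is higher than any value the index allows

customer = find_record(380)
```

Only one block is searched for any key, which is why a single-record lookup is much faster than reading the file sequentially from the start.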

RANDOM (R)
Random organization does not mean records are stored randomly - implying they could be
anywhere on the file. The file “appears” to store records randomly. The key field is used to identify
where the record is stored. The process is:
Key field -> algorithm -> block number

So, a mathematical process is applied to the key and the resulting number that comes out gives the
block number. The size of the available storage space usually dictates the formula. If 1000 blocks
of disc space are available, a simple algorithm could be to DIVIDE the key value by 1000 and use
the remainder as the block address. This remainder would therefore be in the range 0 to 999.
It follows that consecutive key values would be located in different blocks, which helps to spread the
records over the file. An overflow area is again needed where some bunching occurs and this
would be totally unpredictable. Tags are used as with Indexed Sequential files.
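The divide-by-1000 algorithm above can be sketched directly. The block capacity and sample keys are assumptions chosen to force some bunching, and this sketch simply searches a shared overflow list rather than following tags, which is a simplification of the scheme in the notes.

```python
NUM_BLOCKS = 1000      # blocks of disc space available, as in the text
BLOCK_CAPACITY = 2     # assumed records per block, kept tiny for illustration

blocks = {}            # block address -> list of records held in that block
overflow = []          # overflow area for records whose home block is full

def block_address(key: int) -> int:
    """Divide the key by the number of blocks and use the remainder."""
    return key % NUM_BLOCKS

def store(record: dict) -> None:
    home = blocks.setdefault(block_address(record["key"]), [])
    if len(home) < BLOCK_CAPACITY:
        home.append(record)
    else:
        overflow.append(record)  # bunching: the home block is already full

def fetch(key: int):
    for record in blocks.get(block_address(key), []):
        if record["key"] == key:
            return record
    for record in overflow:      # second read, needed only after bunching
        if record["key"] == key:
            return record
    return None

# Keys 1456, 2456 and 3456 all give remainder 456, so the third one overflows.
for key in (1456, 2456, 3456, 3457):
    store({"key": key})
```

Consecutive keys such as 3456 and 3457 land in different blocks, while keys that differ by a multiple of 1000 bunch in the same block.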

Reading the whole file – This organization is not suited to applications where the whole file
would need to be read, because the records are spread across the file and would not be accessed in
any useful order.
Reading one record - The key field value is entered into the algorithm, the block address
calculated and the record then read as for an indexed sequential block read.
Adding a new record to/Amending a record/Deleting a record from an existing file – The
single record is located using the algorithm as above, the change is made in memory and then
written back to the file overwriting the original.
Deleted records can be removed from the block to free up space for other records that should be
there but are located in the overflow area.
Uses – Reference/Master files where instant access is needed but where reading the whole file is
NOT appropriate. This is also ideal for online access, perhaps in a telephone ordering system.
Media – Only disc-based media because of the need to overwrite existing records.

WHICH FILE ORGANISATION IS NEEDED FOR AN APPLICATION?


The designer will make the decision after considering the following questions.
• Will individual records need to be located or is the whole file always used? Are both methods
needed? How quickly must the data be accessed?
• How big is the data file? SERIAL gives very quick access if the file is small because
there are no indexes to consult or algorithms to use.
• Record content and size?
• Is data permanent or temporary?
• Who is the data for?
• How frequent is the data collection?
• Will the data need to be amended/deleted/appended?
• How long will the data be kept and what happens to it at the end of that period?

Data buffer
A data buffer (or just buffer) is a region of physical memory used to temporarily
store data while it is being moved from one place to another. Typically, the data is stored in a
buffer as it is retrieved from an input device (such as a microphone) or just before it is sent to an
output device (such as speakers). However, a buffer may also be used when moving data
between processes within a computer. This is comparable to buffers in telecommunication. Buffers
can be implemented in a fixed memory location in hardware, or by using a virtual data buffer in
software that points at a location in physical memory. In all cases, the data stored in a data buffer
is stored on a physical storage medium. The majority of buffers are implemented in software, which
typically uses RAM to store temporary data because its access time is much faster than that of
hard disk drives. Buffers are typically used when there is a difference between the rate at
which data is received and the rate at which it can be processed, or when these rates are
variable, for example in a printer spooler or in online video streaming. In a distributed
computing environment, a data buffer is often implemented in the form of a burst buffer that
provides a distributed buffering service.
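The rate mismatch that a buffer absorbs can be shown with a small first-in, first-out queue. This is a sketch under assumed names and timings (a sender that delivers two pages per tick to a printer that prints one page per tick), not a model of any particular spooler.

```python
from collections import deque

class BoundedBuffer:
    """A fixed-capacity first-in, first-out buffer held in memory."""

    def __init__(self, capacity: int):
        self._items = deque()
        self._capacity = capacity

    def put(self, item) -> bool:
        """Store an item; return False if the buffer is full."""
        if len(self._items) >= self._capacity:
            return False
        self._items.append(item)
        return True

    def get(self):
        """Remove and return the oldest item, or None if the buffer is empty."""
        return self._items.popleft() if self._items else None

buffer = BoundedBuffer(capacity=4)
printed = []
page = 0
for tick in range(8):
    if tick < 3:                      # the sender is active for three ticks only
        for _ in range(2):            # and delivers two pages per tick
            page += 1
            buffer.put(f"page {page}")
    item = buffer.get()               # the printer handles one page per tick
    if item is not None:
        printed.append(item)
```

The sender finishes after three ticks, and the printer keeps draining the buffer afterwards, so every page is printed in its original order.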

Parity Check
A parity bit, or check bit, is a bit added to a string of binary code. Parity bits are used as the
simplest form of error detecting code. Parity bits are generally applied to the smallest units of a
communication protocol, typically 8-bit octets (bytes), although they can also be applied
separately to an entire message string of bits.

The parity bit ensures that the total number of 1-bits in the string is even or odd.[1] Accordingly,
there are two variants of parity bits: even parity bit and odd parity bit. In the case of even parity,
for a given set of bits, the occurrences of bits whose value is 1 are counted. If that count is odd,
the parity bit value is set to 1, making the total count of occurrences of 1s in the whole set
(including the parity bit) an even number. If the count of 1s in a given set of bits is already even,
the parity bit's value is 0. In the case of odd parity, the coding is reversed. For a given set of bits,
if the count of bits with a value of 1 is even, the parity bit value is set to 1 making the total count
of 1s in the whole set (including the parity bit) an odd number. If the count of bits with a value of
1 is odd, the count is already odd so the parity bit's value is 0. Even parity is a special case of
a cyclic redundancy check (CRC), where the 1-bit CRC is generated by the polynomial x+1.
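The even and odd parity rules above can be sketched in a few lines. The bit string is held as text for readability, and the function names are illustrative assumptions.

```python
def parity_bit(bits: str, odd: bool = False) -> str:
    """Return the parity bit for a string of '0' and '1' characters."""
    ones = bits.count("1")
    bit = ones % 2        # even parity: 1 only when the count of 1s is odd
    if odd:
        bit ^= 1          # odd parity reverses the coding
    return str(bit)

def add_parity(bits: str, odd: bool = False) -> str:
    """Append the parity bit to the data bits."""
    return bits + parity_bit(bits, odd)

even_word = add_parity("1001")            # two 1s already, so the bit is 0
odd_word = add_parity("1001", odd=True)   # a 1 is needed to make the count odd
```

For the 4-bit message 1001 this gives 10010 under even parity and 10011 under odd parity.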

If a bit is present at a point otherwise dedicated to a parity bit but is not used for parity, it may be
referred to as a mark parity bit if the parity bit is always 1, or a space parity bit if the bit is always
0. In such cases where the value of the bit is constant, it may be called a stick parity bit even though
its function has nothing to do with parity. The function of such bits varies with the system design,
but examples of functions for such bits include timing management or identification of a packet as
being of data or address significance. If its actual bit value is irrelevant to its function, the bit
amounts to a don't-care term.

Error detection

If an odd number of bits (including the parity bit) are transmitted incorrectly, the parity bit will be
incorrect, thus indicating that a parity error occurred in the transmission. The parity bit is only
suitable for detecting errors; it cannot correct any errors, as there is no way to determine which
particular bit is corrupted. The data must be discarded entirely, and re-transmitted from scratch.
On a noisy transmission medium, successful transmission can therefore take a long time, or even
never occur. However, parity has the advantage that it uses only a single bit and requires only a
number of XOR gates to generate. See Hamming code for an example of an error-correcting code.

Parity bit checking is used occasionally for transmitting ASCII characters, which have 7 bits,
leaving the 8th bit as a parity bit.

For example, the parity bit can be computed as follows. Assume Alice and Bob are communicating
and Alice wants to send Bob the simple 4-bit message 1001.

Successful transmission scenarios

Even parity
   Alice wants to transmit: 1001
   Alice computes parity bit value: 1+0+0+1 (mod 2) = 0
   Alice adds parity bit and sends: 10010
   Bob receives: 10010
   Bob computes parity: 1+0+0+1+0 (mod 2) = 0
   Bob reports correct transmission after observing expected even result.

Odd parity
   Alice wants to transmit: 1001
   Alice computes parity bit value: 1+0+0+1+1 (mod 2) = 1
   Alice adds parity bit and sends: 10011
   Bob receives: 10011
   Bob computes overall parity: 1+0+0+1+1 (mod 2) = 1
   Bob reports correct transmission after observing expected odd result.

This mechanism enables the detection of single bit errors, because if one bit gets flipped due to
line noise, there will be an incorrect number of ones in the received data. In the two examples
above, Bob's calculated parity value matches the parity bit in its received value, indicating there
are no single bit errors. Consider the following example with a transmission error in the second bit
using XOR:

Failed transmission scenarios

Even parity, error in the second bit
   Alice wants to transmit: 1001
   Alice computes parity bit value: 1^0^0^1 = 0
   Alice adds parity bit and sends: 10010
   ...TRANSMISSION ERROR...
   Bob receives: 11010
   Bob computes overall parity: 1^1^0^1^0 = 1
   Bob reports incorrect transmission after observing unexpected odd result.

Even parity, error in the parity bit
   Alice wants to transmit: 1001
   Alice computes even parity value: 1^0^0^1 = 0
   Alice sends: 10010
   ...TRANSMISSION ERROR...
   Bob receives: 10011
   Bob computes overall parity: 1^0^0^1^1 = 1
   Bob reports incorrect transmission after observing unexpected odd result.

There is a limitation to parity schemes. A parity bit is only guaranteed to detect an odd number of
bit errors. If an even number of bits have errors, the parity bit records the correct number of ones,
even though the data is corrupt. (See also error detection and correction.) Consider the same
example as before with an even number of corrupted bits:

Failed transmission scenario

Even parity, two corrupted bits
   Alice wants to transmit: 1001
   Alice computes even parity value: 1^0^0^1 = 0
   Alice sends: 10010
   ...TRANSMISSION ERROR...
   Bob receives: 11011
   Bob computes overall parity: 1^1^0^1^1 = 0
   Bob reports correct transmission though actually incorrect.

Bob observes even parity, as expected, thereby failing to catch the two bit errors.

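Bob's side of the scenarios above can be sketched as a single check on the received word, including its parity bit. The function name is an assumption; the received values are the ones used in the scenarios.

```python
def passes_parity(word: str, odd: bool = False) -> bool:
    """Check a received word (data bits plus parity bit) against the agreed parity."""
    ones = word.count("1")
    expected_remainder = 1 if odd else 0
    return ones % 2 == expected_remainder

clean = passes_parity("10010")             # sent and received unchanged
second_bit_hit = passes_parity("11010")    # error in the second bit
parity_bit_hit = passes_parity("10011")    # error in the parity bit itself
two_bits_hit = passes_parity("11011")      # two corrupted bits
```

The check rejects both single-bit errors but accepts the word with two corrupted bits, which is the limitation described above.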
Concept of master and transaction file

Master File:

• Purpose: Contains comprehensive, long-term data that is relatively static.
• Examples: Customer information, product details, employee records.
• Characteristics:
  o Holds detailed and persistent records.
  o Updated periodically, often by processing transaction files.
  o Essential for ongoing operations and decision-making.

Transaction File:

• Purpose: Contains temporary, dynamic data that represents activities or events.
• Examples: Sales transactions, inventory changes, daily financial transactions.
• Characteristics:
  o Holds records of individual transactions.
  o Used to update the master file.
  o Often created and processed on a regular basis (e.g., daily, weekly).

Relationship Between Master and Transaction Files:

• Integration: Transaction files are used to update master files. Each record in the
transaction file corresponds to a change in the master file.
• Processing: An update program reads the transaction file and applies its changes to the
master file, ensuring data accuracy and integrity.
• Purpose: Master files maintain stable data, while transaction files track ongoing activities
and changes.

Example:

Imagine a retail store:

• Master File: Contains a list of all products, including item names, descriptions, prices,
and stock levels.
• Transaction File: Records each sale transaction, including item ID, quantity sold, date,
and time.

Updating Master file from transaction


A master file is updated from a transaction file using batch processing. Batch processing
involves collecting all data in a transaction file and then processing it.

How it works

1. A transaction file is created that records all transactions, such as sales, over a period of
time.

2. The transaction file is used to update the master file.

3. The master file is a permanent record of products and inventory.

4. The transaction file is cleared once it has been used to update the master file.

Types of transactions

• Add: A new record is added to the master file.
• Change: An existing record in the master file is updated.
• Delete: An existing record in the master file is removed.


Updating the master file

1. The transaction file key is compared to the master file key.

2. If the transaction file key is less than the master file key, the transaction is added to the
new master file.

3. If the transaction file key is equal to the master file key, the master file record is changed
or deleted.

4. If the transaction file key is greater than the master file key, the old master file record is
written to the new master file.
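The four comparison steps above can be sketched as a merge of two files that are both in key order. The record layout (tuples), the action names and the sample data are assumptions; the sketch also assumes at most one transaction per key and silently skips transactions that do not apply, such as a change for a key that is not on the master file.

```python
def update_master(old_master, transactions):
    """Build a new master file from the old master and a sorted transaction file.

    old_master:   list of (key, data) tuples in key order
    transactions: list of (key, action, data) tuples in key order,
                  where action is "add", "change" or "delete"
    """
    new_master = []
    m = t = 0
    while m < len(old_master) or t < len(transactions):
        if t == len(transactions):
            new_master.append(old_master[m])      # no transactions left: copy across
            m += 1
        elif m == len(old_master) or transactions[t][0] < old_master[m][0]:
            key, action, data = transactions[t]   # transaction key is lower: add
            if action == "add":
                new_master.append((key, data))
            t += 1
        elif transactions[t][0] == old_master[m][0]:
            key, action, data = transactions[t]   # keys equal: change or delete
            if action == "change":
                new_master.append((key, data))
            elif action != "delete":
                new_master.append(old_master[m])  # invalid action: keep the old record
            m += 1
            t += 1
        else:
            new_master.append(old_master[m])      # transaction key is higher: copy across
            m += 1
    return new_master

old_master = [(1, "pens"), (3, "ink"), (5, "paper")]
transactions = [(2, "add", "glue"), (3, "change", "toner"),
                (5, "delete", None), (7, "add", "tape")]
new_master = update_master(old_master, transactions)
```

The old master file is never modified, which matches the update-by-copying method used with serial and sequential files.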
Diagram of a master file update

Contents of A Particular File


To understand the contents of a specific file, you generally need to know:

1. Fields

• Fields are the individual pieces of data within a file. These could be:
  o Text (e.g., names, addresses)
  o Numbers (e.g., account balances, ages)
  o Dates (e.g., birthdates, timestamps)
  o Other data types specific to the file’s purpose

2. Data Types

• Data Types define the nature of the data. Common data types include:
  o String/Text
  o Integer
  o Floating-point (decimal numbers)
  o Date/Time
  o Boolean (true/false)

3. Sizes

• Sizes refer to the amount of space each field occupies. This can vary based on:
  o The type of data (e.g., text fields can vary in length)
  o File format and system specifications

4. Purpose of the File

• Purpose refers to why the file exists. It might be used for:
  o Storing contact information (e.g., an address book)
  o Maintaining financial records (e.g., a ledger)
  o Logging events (e.g., a system log)
  o Any other specific function depending on the context
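Fields, data types and sizes come together when a record is given a fixed layout. The sketch below uses Python's struct module with a hypothetical student record; the field choices and sizes are assumptions made for illustration.

```python
import struct

# Hypothetical layout for one student record (little-endian, no padding):
#   i    4-byte integer    student number
#   20s  20-byte text      name, padded with zero bytes
#   8s   8-byte text       date of birth as YYYYMMDD
#   d    8-byte float      fees balance
#   ?    1-byte boolean    enrolled flag
RECORD = struct.Struct("<i20s8sd?")

def pack_student(number, name, dob, balance, enrolled) -> bytes:
    """Convert the field values into one fixed-length record."""
    return RECORD.pack(number, name.encode("ascii"), dob.encode("ascii"),
                       balance, enrolled)

def unpack_student(raw: bytes):
    """Convert a fixed-length record back into its field values."""
    number, name, dob, balance, enrolled = RECORD.unpack(raw)
    return (number, name.rstrip(b"\x00").decode("ascii"),
            dob.decode("ascii"), balance, enrolled)

raw = pack_student(1042, "Ada Lovelace", "20050314", 250.75, True)
record_size = RECORD.size   # 4 + 20 + 8 + 8 + 1 bytes
```

Because every record has the same size, the position of record n in a file can be calculated as n multiplied by the record size.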

Security of Data Files

Ensuring the security of your data files is essential in our digital age. Here are some key practices
to help keep your files secure:

1. Use Strong Passwords

• Ensure your passwords are complex and unique. Use a combination of letters, numbers,
and special characters.
• Avoid using easily guessable passwords like "123456" or "password."

2. Encrypt Your Data

• Encryption protects your data by making it unreadable to unauthorized users.
• Use tools like BitLocker (for Windows) or FileVault (for macOS) to encrypt your files.

3. Regular Backups

• Regularly back up your data to an external hard drive or a secure cloud service.
• This ensures you have a copy of your data in case of loss or theft.

4. Use Antivirus Software

• Install reputable antivirus software to protect your system from malware and viruses.
• Keep your antivirus software up to date.

5. Keep Software Updated

• Regularly update your operating system and software to patch any security
vulnerabilities.
• Enable automatic updates to ensure you always have the latest security patches.

6. Be Cautious with Public Wi-Fi

• Avoid accessing sensitive data over public Wi-Fi networks, as they are often insecure.
• Use a Virtual Private Network (VPN) to encrypt your internet connection if you must use
public Wi-Fi.

7. Monitor for Unauthorized Access

• Regularly check for any unusual activity or unauthorized access to your data.
• Set up alerts for suspicious activities if your system supports it.

Distinction between different types of files

Different types of files serve various purposes and have distinct characteristics:

1. Program Files

• Purpose: Contain code or executable instructions for software applications.
• Examples: .exe (Windows executable), .app (macOS application), .bat (batch file),
.sh (shell script).
• Characteristics: Executed directly by the operating system (compiled programs) or run by
an interpreter or shell (scripts) to perform tasks.

2. Data Files

• Purpose: Store information used by programs or users.
• Examples: .csv (Comma-Separated Values), .json (JavaScript Object Notation),
.xml (eXtensible Markup Language), .db (Database files like SQLite).
• Characteristics: Often structured and formatted to be read and written by applications.

3. Text Files

• Purpose: Contain plain text or human-readable content.
• Examples: .txt (plain text), .log (log file), .md (Markdown).
• Characteristics: Simple format, can be opened and edited with basic text editors, no
special formatting like fonts or images.

4. Parameter Files

• Purpose: Store configuration settings or parameters for programs.
• Examples: .ini (Initialization file), .conf (Configuration file), .cfg (Configuration
file).
• Characteristics: Often structured in key-value pairs, used to set up or customize
software behavior.
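A parameter file in the .ini style can be read with Python's standard configparser module. The section names and settings below are made up for illustration, and the text is held in a string so the sketch runs without a file on disc.

```python
import configparser

# Hypothetical contents of a settings.ini parameter file.
SETTINGS = """
[display]
theme = dark
font_size = 12

[backup]
enabled = yes
"""

config = configparser.ConfigParser()
config.read_string(SETTINGS)

theme = config["display"]["theme"]                        # values are stored as text
font_size = config.getint("display", "font_size")         # converted to an integer
backup_enabled = config.getboolean("backup", "enabled")   # "yes" is read as True
```

Keeping such settings in a parameter file lets the behaviour of a program be changed without altering the program file itself.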
