2 - Fundamental File Processing Operations


File Structures

(CIS 256)
Yarmouk University
Department of Computer Information Systems

Note:
- Most of these slides have been prepared by Dr. Binnur Kurt from Istanbul Technical University, Computer Engineering Department, and adopted for our course with his permission.
- Additional slides have been added from the references mentioned in the syllabus.

© Department of Computer Information Systems, Yarmouk University (2016)


2 Fundamental File Processing Operations
Content

►Sample programs for file manipulation


►Physical files and logical files
►Opening and closing files
►Reading from files and writing into files
►How these operations are done in C and C++
►Standard input/output and redirection

CIS 256 (File Structures) 3


What is a FILE?

I wonder...

A file is...
►A collection of data placed under permanent or
non-volatile storage
►Examples: anything that you can store on a disk,
hard drive, tape, optical media, or any other
medium which doesn’t lose the information when
the power is turned off.
►Notice that this is only an informal definition!

Where do File Structures fit in CS?

Application

DBMS

File system

Operating System

Hardware

Physical Files & Logical Files

► Physical file: physically exists on secondary storage;
known by the operating system; appears in its file
directory
► Logical file: what your program actually uses; a ‘pipe’
through which information can be extracted or sent.
► Operating system: gets instructions from the program or
the command line; links the logical file with a physical file or device
► Why is the distinction useful? Why not allow our
programs to deal directly with physical files?

Basic File Operations

►Opening a file - basically, links a logical file to a physical file.
– On open, the O/S performs a series of operations that end
with the program that is trying to open the file being
assigned a file descriptor.
– Additionally, the O/S will perform particular operations
on the file at the request of the calling program; these
operations are intended to ‘initialize’ the file for use by
the program.
– What happens when the O/S detects an error?
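One possible answer to the last question, as a minimal C++ sketch: when the O/S cannot open the file, the stream is left in a failed state and the program is expected to check for that. The helper name open_for_reading is hypothetical and used only for illustration.

```cpp
#include <fstream>
#include <iostream>
#include <string>

// Hypothetical helper: tries to link the logical file `in` to the physical
// file named `path`. Returns true if the stream is ready for reading.
bool open_for_reading(std::ifstream& in, const std::string& path) {
    in.open(path);
    if (!in.is_open()) {
        // The O/S refused the request (missing file, no permission, ...).
        std::cerr << "cannot open " << path << '\n';
        return false;
    }
    return true;
}
```

A caller would declare std::ifstream infile; and continue only if open_for_reading(infile, "account.txt") returns true.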

infile: Logical File, “account.txt”: Physical File

#include <fstream>
#include <iostream>
using namespace std;

int main() {
    char c;
    fstream infile;                       // infile is the logical file
    infile.open("account.txt", ios::in);  // link it to the physical file
    infile.unsetf(ios::skipws);           // do not skip whitespace when reading
    infile >> c;                          // try to read the first character
    while (!infile.fail()) {              // stop at end of file or on a read error
        cout << c;
        infile >> c;
    }
    infile.close();
    return 0;
}

Physical Files & Logical Files ─ Revisited #1

► The OS is responsible for associating a logical file in a program with a
physical file on disk or tape. Writing to or reading from a file in a
program is done through the OS.
► Note that from the program's point of view, input devices (keyboard)
and output devices (console, printer, etc.) are treated as files ─
places where bytes come from or are sent to.
► There may be thousands of physical files on a disk, but a program
can have only a limited number of logical files open at the same time.
► The physical file has a name, for instance “account.txt”.
► The logical file has a logical name used for referring to the file
inside the program. The logical name is a variable inside the
program, for instance “infile”.

Physical Files & Logical Files ─ Revisited #2

►In C++, the logical name is the name of an object of
the class fstream:
fstream infile ;
►In both C and C++, the logical name infile will be
associated with the physical file “account.txt” at the time of
opening the file.

More on Opening Files

►Two options for opening a file:


– Open an existing file
– Create a new file

How to do it in C++

fstream outfile;
outfile.open("account.txt", ios::out);
►The 1st argument indicates the physical name of the file
►The 2nd argument indicates the mode; the mode values are defined
in the class ios (in standard C++ they are bit flags of type
ios::openmode rather than a plain integer)

The Mode

►ios::in open for reading
►ios::out open for writing
►ios::app seek to the end of file before each write
►ios::trunc always create a new file (any existing contents are discarded)
►ios::nocreate fail if file does not exist (pre-standard iostream libraries only; not part of standard C++)
►ios::binary open in binary mode
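A small sketch of how two of these modes differ in practice, assuming a standard C++ library. The helper names overwrite_file, append_to_file and read_whole_file are hypothetical.

```cpp
#include <fstream>
#include <string>

// Opens with ios::trunc: whatever the file held before is discarded.
void overwrite_file(const std::string& path, const std::string& text) {
    std::ofstream out(path, std::ios::out | std::ios::trunc);
    out << text;
}

// Opens with ios::app: every write goes to the current end of the file.
void append_to_file(const std::string& path, const std::string& text) {
    std::ofstream out(path, std::ios::out | std::ios::app);
    out << text;
}

// Reads the file byte by byte (ios::binary avoids newline translation).
std::string read_whole_file(const std::string& path) {
    std::ifstream in(path, std::ios::in | std::ios::binary);
    std::string contents;
    char c;
    while (in.get(c)) {
        contents += c;
    }
    return contents;
}
```

Modes are combined with the | operator because they are bit flags. Each stream above is closed automatically when its function returns.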

Basic File Operations

►Closing a file - cuts the link between the physical and logical files
– Upon closing, the OS takes care of ‘synchronizing’ the
contents of the file; e.g., a buffer is often used, and its
contents must be written to the file.
– In general, files are automatically closed when the
program ends.
– So, why do we need to worry about closing files?
– In C++: outfile.close()
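One reason to close explicitly can be sketched as follows: data written through a stream may still be sitting in a buffer, and closing is what guarantees it reaches the physical file before the same file is opened again. The function name write_close_reread is hypothetical.

```cpp
#include <fstream>
#include <string>

// Writes one line, closes the file, then reopens the same physical file
// through the same logical file and reads the line back.
// Returns the line that was read, or an empty string on failure.
std::string write_close_reread(const std::string& path, const std::string& line) {
    std::fstream file;
    file.open(path, std::ios::out | std::ios::trunc);
    if (!file.is_open()) {
        return "";
    }
    file << line << '\n';   // may still be held in the stream's buffer
    file.close();           // flushes the buffer and cuts the link

    file.open(path, std::ios::in);   // link the logical file again, for reading
    std::string read_back;
    std::getline(file, read_back);
    return read_back;
}
```

Closing also frees one of the limited number of logical files a program may have open at the same time.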

Basic File Operations

►Reading and Writing – basic I/O operations.


– Usually require three parameters: a logical file, an
address, and the amount of data that is to be read or
written.
– What is the use of the address parameter?

Reading in C++

char c ; // a character
char a[100] ; // an array with 100 characters
fstream infile ;
infile.open("myfile.txt", ios::in) ;
infile >> c; // reads one character (skipping leading whitespace by default)
infile.read(&c,1) ; // reads one byte, whitespace included
infile.read(a,10); // reads 10 bytes
►Note that thanks to operator overloading in C++,
operator >> gets the same info at a higher level
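The three parameters of a read (logical file, address, amount) show up directly in read(). The sketch below, with a hypothetical helper read_block, also checks how many bytes actually arrived, since the file may hold fewer than were requested.

```cpp
#include <cstddef>
#include <fstream>
#include <string>

// Reads up to `count` bytes from the start of the file `path` into `buffer`.
// `buffer` must have room for at least `count` bytes.
// Returns the number of bytes actually read.
std::size_t read_block(const std::string& path, char* buffer, std::size_t count) {
    std::ifstream in(path, std::ios::in | std::ios::binary);
    if (!in) {
        return 0;   // the file could not be opened
    }
    in.read(buffer, static_cast<std::streamsize>(count));
    return static_cast<std::size_t>(in.gcount());   // bytes the last read delivered
}
```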

Writing in C++

char c = 'x' ; // a character
char a[100] = "0123456789" ; // an array with 100 characters
fstream outfile ;
outfile.open("myfile.txt", ios::out) ;
outfile << c; // writes one character
outfile.write(&c,1) ; // writes one byte
outfile.write(a,10); // writes 10 bytes
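For a single char, << and write() put the same byte in the file. For other types they differ, which the following sketch tries to show: << formats an int as decimal text, while write() copies its bytes as they sit in memory. All three helper names are hypothetical.

```cpp
#include <fstream>
#include <string>

// Writes `value` as decimal digits (formatted, higher-level output).
void write_as_text(const std::string& path, int value) {
    std::ofstream out(path, std::ios::out | std::ios::trunc);
    out << value;
}

// Writes the bytes of `value` exactly as they are stored in memory.
void write_as_bytes(const std::string& path, int value) {
    std::ofstream out(path, std::ios::out | std::ios::trunc | std::ios::binary);
    out.write(reinterpret_cast<const char*>(&value), sizeof value);
}

// Returns the size of the file in bytes, or -1 if it cannot be opened.
std::streamoff file_size(const std::string& path) {
    std::ifstream in(path, std::ios::in | std::ios::binary | std::ios::ate);
    return in.tellg();   // ios::ate positioned the stream at the end
}
```

Writing 12345 as text produces a 5-byte file, while writing it as bytes produces a file of sizeof(int) bytes whose layout depends on the machine.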

Additional File Operations

►Seeking: source file, offset.


►Detecting the end of a file
►Detecting I/O error

Seeking with C++ Stream Classes

An fstream has 2 file pointers: the get pointer (for input) and the put pointer (for output)
file1.seekg ( byte_offset, origin); //moves get pointer
file1.seekp ( byte_offset, origin); //moves put pointer

origin can be ios::beg (beginning of file)
              ios::cur (current position)
              ios::end (end of file)

file1.seekg ( 373, ios::beg); // moves get pointer 373 bytes from
                              // the beginning of file
Note: for file streams the two pointers share one underlying file position, so moving one also moves the other.
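A short sketch of seeking relative to two different origins. The helper names byte_at and last_byte are hypothetical; both return -1 when no byte can be read at the requested position.

```cpp
#include <fstream>
#include <string>

// Returns the byte located `offset` bytes from the beginning of the file.
int byte_at(const std::string& path, std::streamoff offset) {
    std::ifstream in(path, std::ios::in | std::ios::binary);
    in.seekg(offset, std::ios::beg);   // move the get pointer
    char c;
    if (!in.get(c)) {
        return -1;
    }
    return static_cast<unsigned char>(c);
}

// Returns the last byte of the file by seeking backwards from the end.
int last_byte(const std::string& path) {
    std::ifstream in(path, std::ios::in | std::ios::binary);
    in.seekg(-1, std::ios::end);       // one byte before the end of file
    char c;
    if (!in.get(c)) {
        return -1;
    }
    return static_cast<unsigned char>(c);
}
```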

Detecting End of File

►In C++: Check whether infile.fail() returns true
infile >> c ;
if (infile.fail()) // true if the read did not succeed, e.g. because the file has ended
►Alternatively, use the function infile.eof() (true only after a read
has tried to go past the end of the file)
►Also note that fail() indicates that an operation was
unsuccessful, so it is more general than just checking for
end of file
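The difference between fail() and eof() can be sketched with a loop that reads until a read fails and then asks why it failed. The function name count_bytes and its reached_eof parameter are hypothetical.

```cpp
#include <fstream>
#include <string>

// Counts the bytes in a file by reading until a read does not succeed.
// On return, `reached_eof` tells whether the loop stopped because the data
// ran out (true) or because of some other failure, such as a file that
// could not be opened (false).
long count_bytes(const std::string& path, bool& reached_eof) {
    std::ifstream in(path, std::ios::in | std::ios::binary);
    long count = 0;
    char c;
    while (in.get(c)) {   // the loop ends as soon as a read fails
        ++count;
    }
    reached_eof = in.eof();
    return count;
}
```

Testing the result of the read itself, rather than calling eof() before reading, avoids processing a stale character after the last successful read.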

Metadata

►Data About Data


–Usually in the form of a file header
–Example in text
•Astronomy image storage format
•HTML format (name = value)
•But look on page 177: coding style makes a BIG difference
–Parsing this kind of data
•Read field name; read field value
•Convert ASCII value to type required for storage & use
•Store converted value into right variable
–Why use this type of header?
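The three parsing steps listed above can be sketched as follows. This is a simplified illustration, not the astronomy format itself: it assumes one "name = value" pair per line and a line reading END to finish the header, and the helper names trim, read_header and field_as_int are hypothetical.

```cpp
#include <cstddef>
#include <istream>
#include <map>
#include <sstream>
#include <string>

// Removes leading and trailing blanks from a string.
std::string trim(const std::string& s) {
    const char* blanks = " \t\r";
    std::size_t first = s.find_first_not_of(blanks);
    if (first == std::string::npos) {
        return "";
    }
    std::size_t last = s.find_last_not_of(blanks);
    return s.substr(first, last - first + 1);
}

// Reads field name and field value from each header line, up to "END".
std::map<std::string, std::string> read_header(std::istream& in) {
    std::map<std::string, std::string> fields;
    std::string line;
    while (std::getline(in, line) && trim(line) != "END") {
        std::size_t eq = line.find('=');
        if (eq == std::string::npos) {
            continue;   // not a name = value line; skip it
        }
        fields[trim(line.substr(0, eq))] = trim(line.substr(eq + 1));
    }
    return fields;
}

// Converts the ASCII value of a field to the type needed for use.
// Throws if the field is missing or is not a number.
int field_as_int(const std::map<std::string, std::string>& fields,
                 const std::string& name) {
    return std::stoi(fields.at(name));
}
```

After read_header returns, the stream is positioned just past the header, so the data that the header describes can be read next.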

More Metadata

►Graphics Storage Formats


–Data
•Color values for each pixel in image
•Data compression often used (GIF, JPG)
•Different color “depth” possibilities
–Metadata
•Height & width of image
•Number of bits per pixel (color depth)
•If not true color (24 bits / pixel)
–Color look-up table
»Normally 256 entries
»Indexed by values stored for each pixel (normally 1 byte)
»Contains R/G/B values for color combination
–Often formatted to be loaded directly into graphics RAM

File Portability and Standardization

Portability and Standardization

► Want to be able to share files


– Must be accessible on different computers
– Must be compatible with different programs that will access them
– Several factors affect portability
• Operating systems
• Languages
• Machine architectures



– Differences among operating systems
• In Chapter 2:
– Saw DOS adds extra line-feed character when it sees CR
– Not the case on most other file systems
• Ultimate physical format of the same logical file can vary depending
on the OS
– Differences among languages
• Talked about C++ versus Pascal
– C++ can have header and data records of different sizes
– Pascal cannot
• Physical layout of files may be constrained by the way languages
allow file structure definitions



– Differences in machine architectures
• Saw problem of “endianness”

– Multi-byte integers:
» Store high-order byte first or low-order byte first?
• Word size may affect file layout
– For a struct item, may allocate:
» 8-bytes (64-bit word)
» 4-bytes (32-bit word)
» 3-bytes (24-bit word)
• Different encodings for text
– ASCII
– EBCDIC
– Maybe other problems with international languages
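The byte-order question can be observed directly, as in the sketch below. It looks at how the host machine lays out a 4-byte integer in memory, which is exactly what ends up in a file when the integer is written with write(). The helper names are hypothetical.

```cpp
#include <array>
#include <cstdint>
#include <cstring>

// Returns the four bytes of `value` in the order the host stores them.
std::array<unsigned char, 4> bytes_in_memory(std::uint32_t value) {
    std::array<unsigned char, 4> bytes;
    std::memcpy(bytes.data(), &value, 4);
    return bytes;
}

// Returns true if the host stores the low-order byte first.
bool host_is_little_endian() {
    return bytes_in_memory(1u)[0] == 1;
}
```

For the value 0x01020304, a machine that stores the low-order byte first yields 04 03 02 01, and one that stores the high-order byte first yields 01 02 03 04.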



► Achieving portability
– Must determine how to deal with differences among languages, OSs, and
hardware
• It is not a trivial matter
• Text offers some guidelines
– Agree on standard physical record format
• FITS is a good example
– Specifies physical format, keywords, order of keywords, bit
pattern for binary numbers
• Once you have a standard, stay with it
– Make the standard extensible
– Make it simple enough for wide range of machines, languages,
and OSs



– Agree on a standard binary encoding
• ASCII vs EBCDIC for text
• Binary numbers have more options
– IEEE standard
» Specifies format for 32, 64, & 128-bit floating point
» Specifies format for 8, 16, & 32-bit integers
» Most computers follow it
– XDR
» External Data Representation
» Specifies IEEE formats
» Also provides routines to convert to/from XDR format
and host machine format
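What a conversion routine between host format and a standard format does can be sketched for one case: a 32-bit unsigned integer stored high-order byte first, which is the byte order XDR uses for integers. These are not the XDR library routines; encode_be32 and decode_be32 are hypothetical names for an illustration of the idea.

```cpp
#include <array>
#include <cstdint>

// Converts a host integer to four bytes, high-order byte first.
// Shifting works on values, so the result is the same on every machine.
std::array<unsigned char, 4> encode_be32(std::uint32_t value) {
    return {{ static_cast<unsigned char>((value >> 24) & 0xFF),
              static_cast<unsigned char>((value >> 16) & 0xFF),
              static_cast<unsigned char>((value >> 8) & 0xFF),
              static_cast<unsigned char>(value & 0xFF) }};
}

// Converts four bytes, high-order byte first, back to a host integer.
std::uint32_t decode_be32(const std::array<unsigned char, 4>& bytes) {
    return (static_cast<std::uint32_t>(bytes[0]) << 24) |
           (static_cast<std::uint32_t>(bytes[1]) << 16) |
           (static_cast<std::uint32_t>(bytes[2]) << 8) |
            static_cast<std::uint32_t>(bytes[3]);
}
```

A program would write the encoded bytes to the file and decode them after reading, so the file looks the same no matter which machine produced it.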

The Conversion Problem

►Only a Few Environments – do it directly:
IBM → Sun
Sun → IBM
►Many Env’ts. – need an intermediate form:
IBM, Sun, ..., IA-32, IA-64 → XML (or some other standard format) → IBM, Sun, ..., IA-32, IA-64
