
unix-UNIT 1 & 2


UNIX & SHELL PROGRAMMING (502) [1]

UNIX
Unix (officially trademarked as UNIX, sometimes also written as Unix with small caps) is a computer operating
system originally developed in 1969 by a group of AT&T employees at Bell Labs, including Ken Thompson, Dennis Ritchie,
Brian Kernighan, Douglas McIlroy, and Joe Ossanna.

History of UNIX
In the late 1960s, developers at AT&T Bell Laboratories were working on a project known as MULTICS
(Multiplexed Information and Computing Service). MULTICS was an operating system designed for GE-645 mainframe
computers. The purpose of the project was to produce an operating system that was portable and could
support multiple users. Although it was developed with multiuser capability, it was still considered inadequate, and the project was
dropped. The researchers who had been working on the system, Ken Thompson, Dennis Ritchie, M. D. McIlroy, and J. F. Ossanna, decided
to redo the work on a much smaller scale. Ken Thompson still had access to the Multics environment; he wrote simulations
for the new file and paging system on it. A team of Bell Labs researchers led by Thompson and Ritchie, including Rudd
Canaday, developed a hierarchical file system, the notions of computer processes and device files, a command-line
interpreter, and some small utility programs.

In 1970 Brian Kernighan coined the project name Unics (Uniplexed Information and Computing Service) as a
play on Multics. Unics could eventually support multiple simultaneous users, and it was renamed UNIX. The UNIX code was
written in the assembly language of the PDP-7 and was, consequently, machine dependent. Ritchie and Thompson worked
quietly on UNIX for several years. Ken Thompson then developed a new programming language, “B”, which was subsequently
modified by Dennis Ritchie and renamed the “C” language. The whole UNIX system was rewritten in the “C” language by 1973,
which made it portable.

AT&T distributed UNIX to academic and research institutions for a nominal fee. The University
of California, Berkeley (UCB), created a UNIX of its own, called BSD UNIX (Berkeley Software Distribution). These
versions became quite popular worldwide, especially in university and engineering circles. Berkeley rewrote the whole
operating system in the way they wanted. They created the standard editor of the UNIX system (vi) and a popular shell (the C
shell). They created a better file system, and their standard distribution also included networking protocol
software (TCP/IP), which made the Internet possible.

Other organizations also developed their own versions of UNIX, as given in the table.

Organization        Version of UNIX
Sun Microsystems    Solaris
IBM                 AIX
HP                  HP-UX
DEC                 Digital UNIX (now Tru64)
MIT                 X Window System (a windowing system for UNIX rather than a version of UNIX)
etc.

Features of UNIX
UNIX provides many features as given below:

1. Multiuser Capability:
a. UNIX is a multiuser system. In a multiuser system the same computer's resources (hard disk, memory, etc.)
are accessible to many users. Each user is given a separate terminal. All the terminals are connected
to the main computer, whose resources are shared by all users. So a user at any of the terminals can use
not only the computer but also any peripherals that may be attached, e.g. a printer. The following figure
depicts the multiuser system.

[Figure: a host machine at the centre with four terminals connected to it]

b. The number of terminals connected to the host machine depends on the number of ports that are present
in its controller card. There are several types of terminals that can be connected to the host machine.
c. Dumb Terminals: These terminals consist of a keyboard and a display unit with no memory or disk of
their own. They can never act as independent machines. If they are to be used, they have to be connected to
the host machine.
d. Terminal Emulation/Smart Terminal: A PC has its own microprocessor, memory and storage device. By
attaching the PC to the host machine through a cable and running terminal-emulation software on it, we can make it
work as if it were a dumb terminal. While it does so, its own memory and disk are not in use and the machine does not carry out any
processing on its own.
e. Dial-In Terminals: These terminals use telephone lines to connect to the host machine. To communicate over
telephone lines it is necessary to attach a unit called a modem to the terminal as well as to the host
machine.

2. Multitasking Capability:
a. UNIX is capable of carrying out multiple jobs at the same time, i.e. a single user can type a program in an
editor while the system simultaneously executes some other command given earlier. The latter
job is performed in the background while the user works with the editor in the foreground. This is managed by dividing
the CPU time intelligently between all the processes being carried out. Depending upon their priority, the
operating system allots small time slots to all the processes running in the foreground and background. MS-DOS
offers only a limited form of multitasking, known as serial multitasking: only one job runs
at a time, and the rest of the jobs are stopped temporarily. In UNIX, by contrast, processes are given
priorities, and processes that have the same priority are scheduled on a
round-robin basis.
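
As a rough illustration of foreground and background work, the sketch below starts one job in the background with &, carries on in the foreground, and then waits for the background job. The temporary file and the one-second sleep are assumptions made for this example only.

```shell
#!/bin/sh
# Minimal sketch: one background job and one foreground job.
# The temporary file and the sleep duration are illustrative assumptions.
out=$(mktemp)

# The trailing & runs this command list in the background.
( sleep 1; echo "background job done" > "$out" ) &
bg_pid=$!                      # PID of the background job

# The foreground is free for other work in the meantime.
fg_msg="foreground work finished first"
echo "$fg_msg"

wait "$bg_pid"                 # block until the background job exits
bg_msg=$(cat "$out")
echo "$bg_msg"
rm -f "$out"
```

The foreground message is printed first because the shell does not wait for a job started with & unless it is asked to.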

3. Communication:
a. UNIX provides excellent features that allow users to communicate with fellow users. The communication
can be within the network of a single main computer or between two or more networks. Users can
easily exchange mail, data and programs through such networks.

4. Security:
a. UNIX provides outstanding facilities for security. It provides security at three levels.
b. The first level of security is provided by passwords (also called Login Level Security). It can be
considered System Level Security. A username and password are assigned to each user to ensure
that no other user can access his work.
c. The second level of security is at the file level. Each file has read, write and execute permissions
that decide who can access a particular file, who can modify it and who can execute it.
d. The third level of security is given by file encryption. This utility encodes a user’s file into an unreadable format, so
that even if another user succeeds in opening the file, he will not be able to read it. When the owner wants to see the contents
of the file, he can decrypt it.

5. System Portability:
a. UNIX is written in the high-level language “C”; hence it can be moved to a new machine with few or no
changes. The code can be changed and compiled on the new machine.

6. Open System:
a. UNIX has an open architecture: one can add to the toolkit by simply writing a program and storing the
executable in a separate area of the file system. A new device can also be added by creating a file for it.
Modification of the system is easy because the source code is always available.

7. Programming Facility:
a. The UNIX shell is also a programming language; it is designed for a programmer, not a casual end user. It has
all the necessary ingredients, like control structures, loops and variables, that establish it as a powerful
programming language in its own right. These features are used to write shell scripts.

8. Documentation:
a. UNIX provides an excellent online help feature. The principal online help facility is the man
command, which is the most important reference for commands and their configuration. It gives detailed
information about all the commands.

9. Pattern Matching:


a. UNIX offers very sophisticated pattern matching. Some special characters (like
*, ?, [..]) are used by UNIX to match patterns. There is also a facility for regular
expressions, which are framed with characters from this set.
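
The special characters above can be tried out with a small sketch like the following. The directory and file names are made up for the example; only the wildcard characters themselves come from the text.

```shell
#!/bin/sh
# Minimal sketch of shell wildcards; all file names are made up for the example.
dir=$(mktemp -d)
cd "$dir" || exit 1
touch sample1 sample2 sample10 notes data.txt

all_s=$(echo s*)            # * matches any run of characters
one_char=$(echo sample?)    # ? matches exactly one character
class=$(echo sample[12])    # [..] matches one character from the set

echo "s*         -> $all_s"
echo "sample?    -> $one_char"
echo "sample[12] -> $class"

cd / && rm -rf "$dir"
```

Note that sample? does not match sample10, because ? stands for exactly one character.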

System Structure

Kernel:
The kernel is the core of the operating system. The kernel is mostly written in C. It is loaded into memory when the system
is booted and communicates directly with the hardware. User programs that need to access the hardware use the
services of the kernel, which performs the job on the user’s behalf. Programs access the kernel through a set of
functions called system calls. The kernel also manages the system memory, schedules processes, decides their
priorities and performs other tasks.

Shell:
The shell is the command interpreter. It acts as an interface between the user and the kernel. When the user types a
command, it is interpreted by the shell and then passed to the kernel. There can be several copies of the shell running on the
system: a separate shell is created for each user who logs in to the system.

System Calls:
UNIX has about a thousand commands, and all of them use a handful of functions called system calls to
communicate with the kernel. The core system calls are the same across the versions of UNIX.
UNIX has many system calls to perform different tasks, e.g. a typical UNIX command writes a file with the write
system call and opens a file with the open system call. A system call in UNIX works the same way for a file and a device (as
a device is also considered a file in UNIX). The system calls are built into the kernel, and interaction through them
is an efficient means of communication with the system. Because the system calls are the same, software developed on one
UNIX system can easily be ported to another UNIX machine.
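
The point that a file and a device are handled the same way can be seen from the shell with a sketch like the one below. It assumes that redirection and printf rely on the open and write system calls underneath, as is usual; the temporary file is made up for the example, while /dev/null is a standard device file.

```shell
#!/bin/sh
# Minimal sketch: the same command writes to a regular file and to a device file.
# The temporary file is an assumption of this example.
file=$(mktemp)

printf 'hello\n' > "$file"       # destination is a regular file on disk
file_status=$?
printf 'hello\n' > /dev/null     # destination is a device file
dev_status=$?

echo "file: $file_status, device: $dev_status"
saved=$(cat "$file")
rm -f "$file"
```

Both commands are written identically; only the destination name differs.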

The Files and Processes:


Two simple entities support the UNIX system: the file and the process. A file is just an array of bytes and can contain
virtually anything. It is also related to other files by being part of a single hierarchical structure. UNIX does not care
to know the type of file you are using. Directories and devices are also considered members of the file system. UNIX provides
a vast array of text manipulation tools that can edit files without using an editor.
A process is a program in execution. Processes also belong to a hierarchical tree structure. Processes are treated as living
organisms in UNIX: they have parents, children and grandchildren, and they are born and die. UNIX provides tools
that allow us to control processes, move them between foreground and background, and even kill them.
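
A minimal sketch of the parent/child relationship and of killing a process is given below. The temporary file and the 30-second sleep are assumptions chosen for the example; when the script is run directly, the two PIDs printed on the first line should be the same.

```shell
#!/bin/sh
# Minimal sketch of the process family tree and of killing a process.
# The temporary file and the 30-second sleep are illustrative assumptions.
tmp=$(mktemp)

parent_pid=$$                          # PID of this shell
sh -c 'echo "$PPID"' > "$tmp"          # the child shell reports its parent's PID
child_saw=$(cat "$tmp")
echo "parent PID: $parent_pid, child saw parent: $child_saw"

sleep 30 &                             # start a child process in the background
job_pid=$!
kill "$job_pid"                        # send it the default TERM signal
job_status=0
wait "$job_pid" 2>/dev/null || job_status=$?   # collect the dead child's status
echo "background job ended with status $job_status"
rm -f "$tmp"
```

A status above 128 reported by wait indicates that the child was ended by a signal rather than finishing on its own.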

Tools & Application:


The outermost layer of the UNIX operating system is its tools and applications. Tools vary from one implementation of
UNIX to another. Some versions of UNIX are decked with more than 400 tools and applications. These tools can be
invoked from the command line itself and help perform the day-to-day as well as complex tasks of the system. These
are placed one level above shell and can be expanded and patched as required by the user. The tools are not
mandatory, hence different implementations of UNIX have varying number of tools and applications available.

Shell & Its Features


Shell: The shell is the command interpreter in UNIX. It accepts commands from the user, then analyzes and interprets these
commands. The shell requests the kernel to carry out the actual transfer of data, which finally leads to the output that is
displayed on the terminal. The shell hence acts as an interface between the user and the kernel. When a user logs on to the
system, a separate copy of the shell is created. This means that at any particular instant there may be several copies of the shell
running on the system, with a minimum of one shell per user. The shell program is stored in a file called ‘sh’.

There are several different shells available for UNIX:


You can use any one of these shells if it is available on your system, and you can switch between the different
shells once you have found out which ones are available.
• Bourne Shell (sh)
• C Shell (csh)
• TC Shell (tcsh)
• Korn Shell (ksh)
• Bourne Again Shell (bash)

Bourne Shell (sh):


This is the original UNIX shell, written by Steve Bourne of Bell Labs in the late 1970s. It is available on all UNIX systems. The
ubiquitous “dollar prompt” on UNIX installations is the trademark of the Bourne shell.
This shell does not have the interactive facilities provided by modern shells such as the C shell and Korn shell. You are
advised to use another shell which has these features.
The Bourne shell does provide an easy-to-use language with which you can write shell scripts.

C Shell (csh):
This shell was written at the University of California, Berkeley, by Bill Joy, who was a graduate student there. It is the
default shell in the Berkeley versions of UNIX and is very popular with UNIX programmers and university researchers. It
provides a C-like language with which to write shell scripts, hence its name.

It also provides some principal advantages over the Bourne shell, such as:
• A history mechanism: The shell remembers the commands that the user types and allows him to recall them without typing them
again. This avoids many errors, as UNIX commands can be long enough that the user may misspell them.
• Aliasing: The C shell permits you to call frequently used commands by abbreviations of your own. This too
proves very useful at the command line. It is a type of “macro” facility that is available at the command line.

TC Shell (tcsh)
This shell is freely available. It provides all the features of the C shell together with Emacs-style editing of
the command line.

Korn Shell (ksh)


This shell was written by David Korn of Bell Labs. It is now provided as a standard shell on many UNIX systems. It provides
the features of the C and TC shells together with a shell programming language similar to that of the original Bourne shell.
It is regarded as a very efficient shell.

Bourne Again Shell (bash)


This is a freely available shell written for the Free Software Foundation under their GNU initiative. Ultimately it is intended
to be a full implementation of the IEEE POSIX Shell and Tools specification. This shell is widely used within the academic
community.
bash provides all the interactive features of the C shell (csh) and the Korn shell (ksh). Its programming language is
compatible with the Bourne shell (sh).

Features of Shells:

• Interactive Environment: The shell allows the user to create a dialog (communication channel) between the user
and the host UNIX system. This dialog lasts until the user ends the system session.
• Shell Scripts: The shell can be programmed; it contains commands that can be used
by the user. Shell scripts are groups of UNIX commands strung together and executed as individual files. The shell is
itself a program, written in the “C” language.
• Input/Output Redirection: Input/output redirection is a function of the shell that redirects the output from a program to
a destination other than the screen. This way you can save the output from a command into a file or redirect it to a
printer, another terminal or even another program. Similarly, a program can accept input
from a source other than the keyboard by redirecting its input from another source.
• Piping Mechanism: The pipe facility allows the output of one command to be used as the input to another UNIX command,
e.g. who | wc. Programs that perform simple functions can easily be connected to perform more complex
functions, minimizing the need to develop new programs.
• Meta Character Facility: The shell recognizes “*”, “?” and “[..]” as special characters when reading the arguments from
the command line. The shell then performs file name expansion on this list before executing the requested program. E.g.
ls s* (enter) displays the files or directories starting with the character s.
• Background Processing: The multitasking facility allows the user to run commands in the background. This allows a
command to be processed while the user proceeds with other tasks. When a background task is completed, the
user is notified.
• Customized Environment: The shell is your working environment. Facilities are available by which the shell can be
customized for your personal needs.
• Programming Language: The shell includes features that allow it to be used as a programming language. These
features can be used to build shell scripts that perform complex operations.
• Shell Variables: The user can control the behavior of the shell as well as other programs and utilities by storing data
in variables.
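
Several of the features above (redirection, pipes and shell variables) can be seen together in a short sketch such as the following. The file contents and variable names are made up for the example.

```shell
#!/bin/sh
# Minimal sketch of redirection, pipes and a shell variable.
# The temporary file and its contents are made up for the example.
list=$(mktemp)

# Output redirection: send the output to a file instead of the screen.
printf 'banana\napple\ncherry\n' > "$list"

# Input redirection: sort reads the file instead of the keyboard.
# Pipe: the output of sort becomes the input of head.
first=$(sort < "$list" | head -n 1)

# Another pipe: count the lines, then strip any padding from the number.
count=$(wc -l < "$list" | tr -d ' ')

# A shell variable holding data for later commands.
summary="sorted first: $first, lines: $count"
echo "$summary"
rm -f "$list"
```

Each command does one small job; the shell connects them, which is the idea behind the piping mechanism described above.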

KERNEL
The kernel is the core of the operating system. The kernel is mostly written in C. It is loaded into memory when the system
is booted and communicates directly with the hardware. User programs that need to access the hardware use the
services of the kernel, which performs the job on the user’s behalf. Programs access the kernel through a set of
functions called system calls. The kernel program is usually stored in a file called “unix”.
The kernel manages many functions, as given below:
• Manages files
• Carries out all the data transfer between the file system and the hardware
• Manages memory
• Schedules the various programs running in memory
• Handles interrupts, etc.

Architecture of UNIX (Kernel Architecture) * (in exam write few sentences of kernel, shell also)
The UNIX architecture can be divided into three levels: user level, kernel level and hardware level.
The system call and library interface represents the border between user programs and the kernel, as shown in the
figure. System calls look like ordinary function calls in C programs, and libraries map these function calls to the primitives needed
to enter the operating system. Programs frequently use other libraries, such as the standard I/O library, to provide more
sophisticated use of the system calls. The libraries are linked with the programs at compile time.
The system calls are partitioned into those that interact with the file subsystem and those that
interact with the process control subsystem.
The file subsystem manages files: it allocates file space, administers free space, controls access to
files and retrieves data for users. Processes interact with the file subsystem through system calls, e.g. open, close, read,
write, etc. The file subsystem accesses file data using a buffering mechanism that regulates data flow between the kernel
and secondary storage devices. The buffering mechanism interacts with block I/O device drivers to initiate data transfer
to and from the kernel.
Device drivers are the kernel modules that control the operation of peripheral devices. Block I/O devices
appear as random access storage devices to the rest of the system.
The file subsystem also interacts directly with the raw (character device) I/O device drivers without the
intervention of a buffering mechanism.
The process control subsystem is responsible for process synchronization, inter-process communication, memory
management and process scheduling. The system calls used with the process control subsystem are fork (create a new process),
exec (overlay the image of a program onto the running process), exit (finish executing a process), wait (synchronize process
execution with the exit of a previously forked process), brk (control the size of memory allocated to a process) and signal
(control process response to extraordinary events).
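
The shell does not expose these system calls directly, but its own facilities are built on them, so their effect can be observed from a script. The sketch below is only an illustration under that assumption: a child is created and exits with a status that the parent collects with wait, and a second shell handles a signal with trap. The status value 7 and the choice of the USR1 signal are arbitrary.

```shell
#!/bin/sh
# Minimal sketch: an exit status collected by wait, and a signal handled with trap.
# The status value 7 and the USR1 signal are arbitrary assumptions.

( exit 7 ) &                   # the shell creates a child, which exits with status 7
child=$!
status=0
wait "$child" || status=$?     # wait synchronises with the child's exit
echo "child exit status: $status"

# A separate shell registers a handler for USR1 and then signals itself.
caught=$(sh -c 'trap "echo yes" USR1; kill -USR1 $$; :')
echo "signal caught: $caught"
```

Without the trap, the default response to USR1 would be to end the receiving process; the handler changes that response, which is the role the text gives to signal.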
The memory management module controls the allocation of memory. If at any time the system doesn’t have
enough physical memory for all processes, the kernel moves them between main memory and secondary memory so that
all processes get a fair chance to execute.
The scheduler module allocates the CPU to processes. It schedules them to run in turn until they voluntarily
relinquish the CPU while awaiting a resource or until the kernel preempts them when their recent run time exceeds a time
quantum. The scheduler then chooses the highest-priority eligible process to run; the original process will run again when it is
the highest-priority eligible process available.
Inter-process communication provides message passing between processes, i.e. it facilitates
communication between processes.
The hardware control module is responsible for handling interrupts and for communicating with the machine.
Devices such as disks or terminals may interrupt the CPU while a process is executing. The kernel services the
interrupt and then resumes the previously executing process. In this way it provides access to hardware devices.

Kernel Data Structure


Most kernel data structures occupy fixed-size tables rather than dynamically allocated space. The advantage of
this approach is that the kernel code is simple; the disadvantage is that it limits the number of entries for a data structure to the
number that was originally configured when the system was generated.
If, during operation, the kernel runs out of entries for a data structure, it cannot allocate space for new
entries dynamically and hence must report an error. If, on the other hand, the kernel is configured so that it is unlikely
to run out of table space, the extra space is wasted, as it cannot be used for other purposes. The simplicity of the kernel
algorithms has been considered more important, and hence the data structures are kept simple. Simple loops can be used
to find free table entries, which is easier to understand and sometimes more efficient than a complex allocation scheme.
Examples of kernel data structures are the user file descriptor table (allocated per process) and the file table
(a global kernel data structure).

Logging in and Logging out


There are two types of connection that a user of UNIX seeks to make with the system.

1. Physical connection:
The physical connection implies that the terminal is properly connected to the host using the proper cables, cards
and other necessary hardware. When the user gets the login prompt, it signifies that there is a physical connection.
2. Logical connection:
The logical connection is made when the user types the correct login name and password; the host machine recognizes the user
and allows him to enter the system.

Logging in
UNIX is a multiuser system with very strong security. A user of a UNIX system generally works
at a terminal. After the boot procedure is over and the kernel is loaded into memory, the following
message appears at each user’s terminal:
Login:
The user has to identify himself/herself by entering the user name and password. The user name is also
known as the login ID. When the password is entered, it is shown as ******. (On some systems, to provide more
security, even ****** is not displayed.) If the password is correct, the system allows you to work with it. If
the password is wrong, it displays an error message and gives the login prompt again.

UNIX keeps track of all user names and passwords in a special file named /etc/passwd.

We can summarize this process in following steps:


Step-1: Make contact with the system: Connect to the system.
Step-2: Wait for the system login prompt: Once we are connected to the system, we need to wait for
the login prompt to be displayed on the user terminal.
Step-3: Type user-id: Enter the user name.
Step-4: Type password: Enter the password to enter the system.
Once everything is correct, the user’s session starts and the user can start entering commands.

Logging Out
Once the user logs in, the session continues until the user terminates it. It is very important that you log
out when you are through with the system. The reasons for this are:
• It frees the resources allocated to the user.
• Another important reason is security: if the terminal is left logged in, anyone can start working under that
login and may tamper with important data.

To log out of the system, the user should use one of the following commands, depending upon the version
of UNIX he/she is using and upon the shell.
• logout: Once this command is typed, the user’s session ends and the terminal shows
the login prompt again.
• exit: Type exit at the prompt.
• Ctrl-d: Press the Control key and d together on the keyboard.

UNIX Files
In UNIX a file is any source from which data can be read or any destination to which data can be written. Therefore
the keyboard, monitor, printer and documents stored on disk are all considered to be files.

There are very few restrictions on how to make up file names in UNIX.


• Some implementations limit the length of a file name to 14 characters; others allow up to 255
characters.
• A file name can be any sequence of ASCII characters other than the slash (/). Some characters should be avoided in file names,
e.g. “>” and “<”, as they are used for redirection.
• A period (.) can be used anywhere in a file name, as it has no special meaning in UNIX, unlike in other operating systems,
where the period is used to separate the file name from its extension.

The following rules help in making meaningful file names in UNIX.
• Start the name with an alphabetic character.
• Use dividers to separate parts of the name. A good divider is the underscore.
• Use an extension at the end of the file name, even though UNIX doesn’t recognize extensions. Some applications,
such as compilers, require file extensions. In addition, file extensions help the user to classify files so that they can
be easily identified.
• Never start a file name with a period. File names that start with a period are hidden files in UNIX. Generally
hidden files are created and used by the system, e.g. .profile, .mailrc and .cshrc are some of the hidden
files of UNIX.
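
The effect of a leading period can be checked with a sketch like the one below. The directory and the two file names are made up for the example; the -a option of ls is the usual way to include hidden names in a listing.

```shell
#!/bin/sh
# Minimal sketch: file names beginning with a period are hidden from plain ls.
# The directory and file names are made up for the example.
dir=$(mktemp -d)
touch "$dir/report_final.txt" "$dir/.settings"

plain=$(ls "$dir")            # hidden files are left out
with_hidden=$(ls -a "$dir")   # -a also lists names that start with a period

echo "ls:    $plain"
echo "ls -a:" $with_hidden
rm -rf "$dir"
```

The second listing also contains the entries . and .., which stand for the directory itself and its parent.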

File Types
UNIX provides seven types of files as given in the figure.

[Figure: the seven file types: Regular, Directory, Character Special, Block Special, FIFO, Symbolic Link, Socket]

Regular Files (Ordinary Files): Regular files contain user data that needs to be available for future processing.
Sometimes called ordinary files, regular files are the most common files found in a system. Regular files are divided
by the physical format used to store the data into text and binary files. The physical format is controlled by the application
program or utility that processes it.
• Text Files:
A text file is a file of characters drawn from the computer’s character set. UNIX computers use the ASCII
character set. Because the UNIX shells treat data almost universally as strings of characters, the text file
is the most common UNIX file.
• Binary Files:
A binary file is a collection of data stored in the internal format of the computer. In general, there are two
types of binary files: data files and program files.
o Data files contain application data.
o Program files contain instructions that make a program work.
If one tries to process a binary file with a text-processing utility, the output will look very strange because
it is not in a format that can be read by people.

Directory Files: A directory is a file that contains the names and locations of the files stored on a physical device. The
file system begins with the root directory. The root directory is denoted by a forward slash (/). There are several other
directories called bin, lib, usr, tmp, etc.

Other file types (Special File Types):

• Device files:
In UNIX, devices are also considered to be files and are kept under the /dev directory. The advantage of treating
devices as files is that some of the commands used to access regular files also work with device files.
The i-nodes of such files do not reference data; instead they contain two numbers known as the major and
minor numbers. The major number indicates a device type, such as terminal, and the minor number
indicates the unit number (instance) of the device.
The device files can be categorized into two types, as given below.
o Character Special Files: A character special file represents a physical device, such as a terminal or
network media, that reads or writes one character at a time. An example is /dev/mem.
o Block Special Files: A block special file represents a physical device, such as a disk or tape, that
reads or writes data a block at a time. The names of these files under /dev vary from system to system.
• Symbolic Link Files:
A symbolic link is a logical file that defines the location of another file somewhere else in the system.
• FIFO Files:
A FIFO (first-in, first-out) file, also known as a named pipe, is a file that is used for inter-process communication.
• Socket:
A socket is a special file that is used for network communication.
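
The type of a file can be checked from the shell with the test operators, as in the sketch below. The helper function kind and all names under the temporary directory are hypothetical and exist only for this example; /dev/null is used as a device file that is present on UNIX systems.

```shell
#!/bin/sh
# Minimal sketch: asking the shell what type a file is.
# The function kind and the names under the temporary directory are made up.
dir=$(mktemp -d)
touch "$dir/plain.txt"                 # regular file
mkdir "$dir/sub"                       # directory
ln -s "$dir/plain.txt" "$dir/link"     # symbolic link
mkfifo "$dir/pipe"                     # FIFO (named pipe)

kind() {
    # Check for a symbolic link first: the other tests follow links.
    if   [ -L "$1" ]; then echo "symbolic link"
    elif [ -f "$1" ]; then echo "regular"
    elif [ -d "$1" ]; then echo "directory"
    elif [ -p "$1" ]; then echo "FIFO"
    elif [ -c "$1" ]; then echo "character special"
    elif [ -b "$1" ]; then echo "block special"
    elif [ -S "$1" ]; then echo "socket"
    else echo "unknown"
    fi
}

for f in "$dir/plain.txt" "$dir/sub" "$dir/link" "$dir/pipe" /dev/null; do
    echo "$f: $(kind "$f")"
done
null_kind=$(kind /dev/null)
fifo_kind=$(kind "$dir/pipe")
link_kind=$(kind "$dir/link")
rm -rf "$dir"
```

The first character of each line printed by ls -l gives the same information (for example d for a directory and l for a symbolic link).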

The UNIX File System Tree


The whole file system has a hierarchical structure. The file system begins with the root directory. The root
directory is denoted by a forward slash (/). It contains several other directories called bin, lib, usr, tmp, dev, etc.
The root directory also contains a file called unix, which is the UNIX kernel itself. These directories are called subdirectories,
their parent being the root directory. The following shows the basic structure of the UNIX file system.

/ (root)
    bin
    lib
    home
    tmp
    usr
        bin
        adm
    dev
    etc
    var
    sbin

• /bin directory: It contains essential UNIX utilities that all users will have to use at some time. The word bin
stands for binary; hence this directory contains binary files. Binary files are executable files. The
majority of UNIX utilities are binary files, although some utilities are shell scripts. Shell scripts are also
executable, but they are not binary files. E.g. commands like cat, who and date are stored in the bin directory.
• /dev directory: It contains files for all the devices that are connected to the UNIX server. UNIX treats devices (such
as printers, disk storage, terminals and even areas of computer memory) as files. The files in the dev directory are
termed special files; they are neither directories nor ordinary files. The i-nodes of such files do not reference data;
instead they contain two numbers known as the major and minor numbers. The major number indicates a device
type, such as terminal, and the minor number indicates the unit number (instance) of the device. *(can write
about device file type also, see topic Device Files under TYPES OF FILES)
• /etc directory: The etc directory contains various administration utilities together with other special system files
that allow the host UNIX system (server) to start up properly at bootstrap time.
The passwd file can also be found in the etc directory, as can the message-of-the-day file (motd), whose contents are
displayed at login time and are useful for displaying important messages to users.
It also contains several system files which store relevant information about the users of the system and the
terminals and devices connected to the system.
• /tmp directory: The tmp directory contains temporary files created by UNIX. These files are automatically
deleted when the system is shut down or restarted.
UNIX & SHELL PROGRAMMING (502) [12]
 /usr directory: In the usr directory the user work area is provided. There are several directories each
associated with a particular user. The system administrator creates these directories when he creates account
for different users. Each user is allowed to work with this directory known as home directory.
o /usr/bin directory: Within the usr directory there is another directory which contains additional UNIX
command files that are more important to end users.
o /usr/adm directory: adm stands for administrator; hence adm directory contains administrative utilities
used by system administrator. Although most users will not be able to access them.
 /home directory: On many systems are housed here. If any user wants to create his own home directory he
create user directory in this directory.
 /lib directory: The lib directory contains all the library functions provided by UNIX for programmers. When
programs written under UNIX need to make system calls, they can use libraries provided in this directory.
 /sbin directory: Some commands can be executed only by the system administrator and not by ordinary users;
such commands are stored under this directory.
 /var: The variable part of the file system. It contains print jobs and outgoing and incoming mail.
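
Not every UNIX system has every directory listed above (/usr/adm, for instance, is missing on many modern systems). A minimal sketch for checking a given machine follows; the helper name dir_status is made up for this illustration.

```shell
# dir_status is a hypothetical helper: report whether directory $1 exists.
dir_status() {
    if [ -d "$1" ]; then
        echo "$1: present"
    else
        echo "$1: not present on this system"
    fi
}

# Check the directories discussed above; results differ between systems.
for d in /etc /tmp /usr /usr/bin /usr/adm /home /lib /sbin /var; do
    dir_status "$d"
done
```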

Features of UNIX File System


 Hierarchical file system:
The UNIX system organizes its files using an upside-down hierarchical tree structure. Every file has a parent
directory, apart from the root directory (/), which is the ancestor of all files on the system. This
hierarchical organization also adds to the flexibility of the file system.
 Access permissions:
Files are protected using a file ownership mechanism, which allows only a specific class of users to access certain
files. Files have read, write & execute permissions. These permissions can be given to the owner (or user), the group and
others.
 All devices are implemented as files:
All hardware devices, such as printers, terminals and disk drives, are treated as files in the UNIX operating
system.
 Files can grow dynamically:
The file system structure is dynamic, i.e. the size of a file is not fixed in advance and is limited only by the amount
of disk storage available on the system. A file can be changed by the user at any time.
 Structureless files: (a file contains a sequence of bytes)
A file has no specific internal structure. All files are treated as a stream of bytes.
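
Three of these features can be observed from the shell. The sketch below assumes the mktemp command is available and uses a made-up file name (demo.txt) in a scratch directory: a file is just a count of bytes, it grows when more bytes are appended, and a device such as /dev/null is written to in the same way as an ordinary file.

```shell
# Work in a scratch directory so nothing else is touched.
dir=$(mktemp -d)
f="$dir/demo.txt"

printf 'abc' > "$f"            # the file now holds exactly 3 bytes
size1=$(wc -c < "$f")

printf 'def' >> "$f"           # appending grows the file to 6 bytes
size2=$(wc -c < "$f")

echo "before: $((size1)) bytes, after: $((size2)) bytes"

# A device is used through the same file interface: /dev/null is a
# character device file that discards whatever is written to it.
printf 'abc' > /dev/null
if [ -c /dev/null ]; then
    echo "/dev/null is a character device file"
fi

rm -r "$dir"
```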

Physical File Structure of UNIX (UNIX File system Structure)


A file system is a group of files and relevant information regarding them. The disk space allotted to a UNIX file system
is made up of blocks, each of 512 bytes. Some file systems may have blocks of 1024 or 2048 bytes as well.
All the blocks belonging to the file system are logically divided into four parts as given below:

Boot Block | Super Block | I-node Block | Data Block

BOOT BLOCK: The boot block (or Block-0) is the first block of the file system. This block is normally reserved for the booting
process. It is also known as the MBR (Master Boot Record). The boot block contains a small bootstrapping program.
When the system is booted, the system BIOS checks for the existence of a hard disk and loads the entire code of the boot
block into memory. It then hands over control to the bootstrapping program, which loads the kernel
into memory. Though every file system has a boot block, only the bootstrap program in the boot block of the
root file system is loaded into memory. For other file systems, this block is simply kept empty.
SUPER BLOCK: The super block (or Block-1) is the balance sheet of every UNIX file system. It contains
information about disk usage and the availability of data blocks and I-nodes. It records the total size (length) of the
file system in logical blocks, the time of the last update, the number of free data blocks available and a partial list
of immediately allocable free data blocks, the number of free I-nodes available and a partial list of immediately usable
I-nodes, and the state of the file system (clean or dirty). It also stores information about bad blocks.

I-NODE BLOCK (Index/Information Node block): In UNIX all entities are considered to be files. I-nodes are
used to describe the characteristics of a file & the location of the data blocks which store its contents. The
information related to all these files (not the contents) is stored in an I-node table on the disk. For each file there
is an I-node entry in the table. Each entry is 64 bytes long and contains the following details:
 File type (ordinary, directory, device file)
 Number of links (no. of aliases)
 The numeric user-id of the owner.
 The numeric group-id (GID) of the owner.
 File mode (access permissions)
 Size of the file.
 Date and time of last modification of the file data.
 Date and time of last access of the file data.
 Date and time of last change of the I-node.
 Addresses of the blocks where the file is physically present.
The I-node is accessed by a number known as the I-node number. This number is unique for every file in the file system.

DATA BLOCK: The remaining space of the file system is taken up by data blocks or storage blocks. These blocks store
the contents of the file in the case of a regular file.
For a directory file, each data block contains 16-byte entries. Each such entry holds a file or subdirectory
name of up to 14 bytes, and 2 bytes are taken by the I-node number. Thus a directory is a table of file names & their
I-node numbers.
E.g. Directory name is /usr
File name (14 bytes)    I-node number (2 bytes)
File name 1             I-node 1
File name 2             I-node 2
...                     ...

I-Node
In UNIX all entities are considered to be files. I-nodes are used to describe the characteristics of a file & the location
of the data blocks which store its contents. The information related to all these files (not the contents) is stored in
an I-node table on the disk.

Owner (02254)
Group (xyz)
Type (regular file)
Permission (rwxr--r-x)
Time Accessed (Oct 23 1984 1:45 P.M.)
Time Modified (Oct 23 1984 9:10 A.M.)
Time Changed (Oct 23 1984 1:30 P.M.), i.e. when the I-node was last modified
I-Node (37585)
Size (6030 bytes)
Disk addresses (004526...)

Sample Disk I-node


I-nodes exist in a static form on disk, and the kernel reads them into in-core I-nodes to manipulate them.
The content of the I-node changes when the file is modified (for example, when its contents, owner,
group or permissions change). For information only: the in-core I-node (the copy the kernel keeps in memory, built from
the disk I-node) holds some extra information, such as the status of the I-node, the logical device number of the file
system that contains the file, the I-node number, and pointers to other in-core I-nodes.
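
The I-node number and the link count can be seen with ls. The sketch below, which assumes mktemp is available and uses made-up file names, creates a second name (a hard link) for a file and shows that both names refer to the same I-node and that the link count stored in the I-node becomes 2.

```shell
dir=$(mktemp -d)
printf 'hello\n' > "$dir/original"
ln "$dir/original" "$dir/alias"      # a second name (hard link) for the same file

# ls -i prints the I-node number in front of the name.
ino1=$(ls -i "$dir/original" | awk '{print $1}')
ino2=$(ls -i "$dir/alias" | awk '{print $1}')

# The second field of ls -l is the number of links recorded in the I-node.
links=$(ls -l "$dir/original" | awk '{print $2}')

echo "original: I-node $ino1, alias: I-node $ino2, links: $links"

rm -r "$dir"
```

The actual I-node number differs from system to system; what matters is that the two names report the same number.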

How an I-node accesses data (block addressing scheme):

As per the Figure,…

 I-nodes have an array of 13 disk block addresses. An I-node uses these 13 pointers to access the data blocks of
its file. The first 10 pointers are direct addresses, i.e. they point to the first 10 blocks of the file that the
I-node represents.
 If each logical block of UNIX has 1024 bytes (1 KB), these 10 pointers allow access to files of up to
10,240 bytes (10 KB). So a file of 10 KB or less needs only the first 10 pointers of its I-node.
 When a file is larger than 10 KB, the 11th pointer points to a block that holds the addresses of 256 data blocks
(a 1 KB block holds 256 addresses of 4 bytes each), each of which is 1 KB. Thus, the maximum amount of file data that
can be addressed through the 11th pointer is 256 KB. This is called single indirection.
 Similarly, the 12th pointer points to a block which contains the addresses of 256 blocks, each of which holds the
addresses of another 256 data blocks. Thus the amount that can be addressed through the 12th pointer is
256 * 256 * 1 KB (= 65,536 KB or 64 MB). This is called double indirection.
 The 13th pointer contains the address of a block that contains the addresses of 256 blocks, each of which contains
the addresses of 256 blocks, each of which in turn contains the addresses of 256 data blocks, i.e. 256 * 256 * 256 * 1 KB
(= 16 GB). This is called triple indirection.
 Thus, the maximum file size that can be addressed by the UNIX system is 10 KB + 256 KB + 64 MB + 16 GB = (approx.)
16 GB.
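
The arithmetic can be checked with shell integer expressions. This is only a sketch under stated assumptions: a logical block of 1 KB and 256 block addresses per indirect block (1024 bytes divided by 4 bytes per address); the variable names are chosen for this illustration.

```shell
block_kb=1          # assumed logical block size in KB
addrs=256           # assumed addresses per indirect block (1024 bytes / 4 bytes each)

direct_kb=$((10 * block_kb))                       # pointers 1 to 10
single_kb=$((addrs * block_kb))                    # pointer 11
double_kb=$((addrs * addrs * block_kb))            # pointer 12
triple_kb=$((addrs * addrs * addrs * block_kb))    # pointer 13
total_kb=$((direct_kb + single_kb + double_kb + triple_kb))

echo "direct:          $direct_kb KB"
echo "single indirect: $single_kb KB"
echo "double indirect: $double_kb KB ($((double_kb / 1024)) MB)"
echo "triple indirect: $triple_kb KB ($((triple_kb / 1024 / 1024)) GB)"
echo "total:           $total_kb KB"
```

Under these assumptions the double-indirect range is 64 MB, the triple-indirect range is 16 GB, and the total is 16,843,018 KB, i.e. slightly more than 16 GB.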

Booting Sequence (or Stages of System Startup or Booting Process)


 Step 1: ROM diagnostics are run: When the power comes on or the system is reset, the BIOS starts the master
boot program, located in the first 512 bytes of the system disk. This program then typically loads the boot program
located in the first 512 bytes of the active partition on that disk. The firmware program is basically just smart
enough to figure out if the hardware devices it needs are accessible (e.g., can it find the system disk or the
network) and to load and initiate the boot program. This first-stage boot program often performs additional
hardware status verification, checking for the presence of expected system memory and major peripheral devices.
Some systems do much more elaborate hardware checks, verifying the status of virtually every device and
detecting new ones added since the last boot.
 Step 2: Boot loader is loaded: The boot loader is loaded into RAM by the ROM program, i.e. after the ROM has checked
the hardware and memory in the earlier step, it loads the boot loader program into main memory (RAM).
 Step 3: Kernel is loaded into main memory: Once the boot loader is in memory, it loads the kernel into
memory.
The kernel is the part of the UNIX operating system that remains running at all times when the system is up. The kernel
is itself an executable image, conventionally named unix (System V-based systems), vmunix (BSD-based systems), or
something similar. It is traditionally stored in or linked to the root directory.
Once control passes to the kernel, it prepares itself to run the system by initializing its internal tables, creating the in-
memory data structures at sizes appropriate to current system resources and kernel parameter values. The kernel may
also complete the hardware diagnostics that are part of the boot process, as well as installing loadable drivers for the
various hardware devices present on the system.
When these preparatory activities have been completed, the kernel creates another process that will run the init program
as the process with PID 1.
Init is the ancestor of all subsequent UNIX processes and the direct parent of user login shells. During the remainder of
the boot process, init does the work needed to prepare the system for users.
One of init's first activities is to verify the integrity of the local file systems, beginning with the root file system and other
essential file systems, such as /usr. Since the kernel and the init program itself reside in the root file system (or sometimes
the /usr file system in the case of init), you might wonder how either one can be running before the corresponding file
system has been checked. There are several ways around this chicken-and-egg problem. Sometimes, there is a copy of the
kernel in the boot partition of the root disk as well as in the root file system. Alternatively, if the executable from the root
file system successfully begins executing, it is probably safe to assume that the file is OK.

In the case of init, there are several possibilities. Under System V, the root file system is mounted read-only until after it
has been checked, and init remounts it read-write. Alternatively, in the traditional BSD approach, the kernel handles
checking and mounting the root file system itself.
Still another method, used when booting from tape or CD-ROM (for example, during an operating system installation or
upgrade), and on some systems for normal boots, involves the use of an in-memory (RAM) file system containing just the
limited set of commands needed to access the system and its disks, including a version of init. Once control passes from
the RAM file system to the disk-based file system, the init process exits and restarts, this time from the "real" executable
on disk, a result that somewhat resembles a magician's sleight-of-hand trick.
Other activities performed by init include the following:
 Checking the integrity of the file systems, traditionally using the fsck utility
 Mounting local disks
 Designating and initializing paging areas
 Performing file system cleanup activities: checking disk quotas, preserving editor recovery files, and deleting
temporary files in /tmp and elsewhere
 Starting system server processes (daemons) for subsystems like printing, electronic mail, accounting, error
logging, and cron.
 Starting networking daemons and mounting remote disks
 Enabling user logins, usually by starting getty processes and/or the graphical login interface on the system
console (e.g., xdm), and removing the file /etc/nologin, if present
These activities are specified and carried out by means of the system initialization scripts, shell programs traditionally
stored in /etc or /sbin or their subdirectories and executed by init at boot time. These files are organized very differently
under System V and BSD, but they accomplish the same purposes.
Once these activities are complete, users may log in to the system. At this point, the boot process is complete, and the
system is said to be in multiuser mode.
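
On a running system the process started at boot can be inspected with ps. The following is a minimal sketch; the helper name init_name is made up here, and the name reported for PID 1 depends on the system (init on traditional systems, something else on others).

```shell
# init_name is a hypothetical helper: print the command name of process 1,
# or "unknown" if ps on this system cannot report it.
init_name() {
    ps -p 1 -o comm= 2>/dev/null || echo "unknown"
}

echo "PID 1 is: $(init_name)"

# Every other process has a parent; $$ is this shell and $PPID is its parent.
echo "this shell is PID $$, started by PID $PPID"
```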

Internal & External Commands


 A command built into the shell is known as an internal command.
The shell doesn't create a separate process to run internal commands. E.g.: cd, echo, pwd, etc.
cd and echo do not generate a new process; they are executed directly by the shell.

 A command stored as a separate program file, for example in the /bin directory, is known as an external command.
External commands require the shell to create a new process. E.g.: cat, date, ls, etc.
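
One way to tell the two kinds apart is the standard command -v utility, which prints a path name for a program file on disk and just the name for something the shell handles itself. The helper command_kind below is a made-up name for this sketch; note that it also reports shell functions and keywords as "internal", and that some commands (echo, for example) exist in both forms on many systems.

```shell
# command_kind is a hypothetical helper: classify the command named in $1.
command_kind() {
    location=$(command -v "$1" 2>/dev/null)
    case "$location" in
        "")  echo "not found" ;;
        /*)  echo "external" ;;     # a path name: a program file on disk
        *)   echo "internal" ;;     # no path: handled by the shell itself
    esac
}

for c in cd pwd cat ls; do
    echo "$c: $(command_kind "$c")"
done
```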

Sticky Bits
The most common use of the sticky bit today is on directories. When the sticky bit is set on a directory, only a file's
owner, the directory's owner, or the super-user can rename or delete files in it. Without the sticky bit set, any user with
write and execute permissions for the directory can rename or delete the files it contains, regardless of owner.
The sticky bit can be set using the chmod command, either with its octal mode 1000 or with its symbol t (s
is already used by the setuid bit).
For example, to add the bit on the directory /usr/local/tmp,
one would type chmod +t /usr/local/tmp.
Or, to make sure that the directory has standard tmp permissions,
one could also type chmod 1777 /usr/local/tmp.

 In UNIX symbolic file system permission notation, the sticky bit is represented by the letter t in the final
character-place.

The /tmp directory, which by default has the sticky-bit set, shows up as:

$ ls -ld /tmp
drwxrwxrwt 4 root sys 485 Nov 10 06:01 /tmp
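
The effect on the permission string can be tried safely on a scratch directory. This sketch assumes mktemp is available; the directory name shared is made up for the illustration.

```shell
dir=$(mktemp -d)
mkdir "$dir/shared"

chmod 1777 "$dir/shared"       # rwx for everyone, plus the sticky bit
mode=$(ls -ld "$dir/shared" | cut -c1-10)
echo "$mode"                   # drwxrwxrwt

chmod -t "$dir/shared"         # remove only the sticky bit
mode_plain=$(ls -ld "$dir/shared" | cut -c1-10)
echo "$mode_plain"             # drwxrwxrwx

rm -r "$dir"
```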
Boot Process Steps in Detail
An OS is low-level software that manages resources, provides basic services to other software,
and controls peripherals. Each stage of the boot process is explained in detail below:

o BIOS
BIOS is an acronym for Basic Input/Output System. The job of the BIOS is to load and
run the MBR (Master Boot Record) boot loader. When we first turn on our system, the BIOS
first performs a few integrity checks of the SSD or HDD.
After that, the BIOS finds, loads, and runs the boot loader program, which is located
in the MBR. Sometimes, the MBR is on a CD-ROM or USB stick, as with a live Linux
installation. Once the boot loader is found, it is loaded into memory, and the BIOS hands
control of the system to it.
o MBR
MBR is an acronym for Master Boot Record; it is responsible for loading and running the GRUB boot
loader. The MBR is located in the first sector of the bootable disk, which is generally /dev/sda, depending
on our hardware. The MBR also holds information about GRUB, or about LILO on older systems.
o GRUB
GRUB is sometimes known as GNU GRUB, which stands for GNU GRand Unified
Bootloader. It is the usual bootloader for almost all recent Linux systems. The splash
screen of GRUB is often the first thing we see when we boot our system. It contains a
simple menu where we can choose some options.
If we have multiple kernel images installed, we can use the keyboard to choose the one
we wish our system to boot. The latest kernel image is chosen by default. The splash
screen waits a few seconds for us to choose an option; if we don't, it loads the default
kernel image. On several systems, the GRUB configuration file can be found
at /etc/grub.conf or /boot/grub/grub.conf.
o Kernel
Often, the kernel is called the core of an operating system. It has full control over
everything in our system. In this stage of the boot process, the kernel mounts the root file system
specified in the grub.conf file. Then, it runs the /sbin/init program,
which is always the first program to be run. We can confirm this from its PID (process id),
which should always be 1. The kernel uses a temporary root file system provided by
initrd (Initial RAM Disk) until the actual file system is mounted.
o Init
At this stage, our system runs the runlevel programs. It looks for an init configuration file, generally
found at /etc/inittab, to determine the run level of Linux. Modern Linux systems use
systemd to select the run level instead. Then, systemd starts running the runlevel programs.
There are seven run levels in the Linux operating system:
o 0- halt
o 1- single-user mode
o 2- multiuser, without NFS
o 3- Full multiuser mode
o 4- unused
o 5- X11
o 6- reboot
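
In the traditional /etc/inittab format each entry has four colon-separated fields (id:runlevels:action:process), and the default run level is given by the entry whose action is initdefault. The sketch below parses a sample line rather than a real file; the helper names default_runlevel and runlevel_name and the sample value 3 are assumptions made for this illustration.

```shell
# default_runlevel is a hypothetical helper: read inittab-style lines on stdin
# and print the run level from the "initdefault" entry.
default_runlevel() {
    awk -F: '$3 == "initdefault" { print $2 }'
}

# runlevel_name is a hypothetical helper: map a run level number to its
# meaning, using the list above.
runlevel_name() {
    case "$1" in
        0) echo "halt" ;;
        1) echo "single-user mode" ;;
        2) echo "multiuser, without NFS" ;;
        3) echo "Full multiuser mode" ;;
        4) echo "unused" ;;
        5) echo "X11" ;;
        6) echo "reboot" ;;
        *) echo "unknown" ;;
    esac
}

# Sample line in the traditional inittab format; the value 3 is only an example.
level=$(printf 'id:3:initdefault:\n' | default_runlevel)
echo "default run level: $level ($(runlevel_name "$level"))"
```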
File Permissions
All three classes of owner (user, group, others) in the Linux system have three types of
permissions defined. Nine characters denote these permissions, three for each class.

1. Read (r): The read permission allows you to open and read the content of a file. But you
can't do any editing or modification in the file.
2. Write (w): The write permission allows you to edit the content of a file. Removing or renaming
a file depends on write permission on the directory that contains it. For instance,
if a file is present in a directory, and write permission is set on the file but not on the
directory, then you can edit the content of the file but can't remove or rename it.
3. Execute (x): In Unix-type systems, you can't run or execute a program unless execute
permission is set. In Windows, there is no such permission available.

Permissions are listed below:

permission     on a file                   on a directory
r (read)       read file content (cat)     read directory content (ls)
w (write)      change file content (vi)    create file in directory (touch)
x (execute)    execute the file            enter the directory (cd)
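
The nine permission characters can be set with chmod and read back with ls -l. A minimal sketch follows, assuming mktemp is available and using a made-up file name (report.txt); it shows both the octal and the symbolic form of chmod.

```shell
dir=$(mktemp -d)
f="$dir/report.txt"
printf 'data\n' > "$f"

chmod 640 "$f"                         # owner: rw-, group: r--, others: ---
perms=$(ls -l "$f" | cut -c1-10)
echo "$perms"                          # -rw-r-----

chmod u+x,o+r "$f"                     # symbolic form: add x for the owner, r for others
perms2=$(ls -l "$f" | cut -c1-10)
echo "$perms2"                         # -rwxr--r--

rm -r "$dir"
```

The first character of the string is the file type (- for an ordinary file, d for a directory); the remaining nine are the permissions for the user, the group and others, in that order.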
